utf8: fix overflow when returning string width
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.
This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:
=================================================================
==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
WRITE of size
2147483649 at 0x603000001168 thread T0
#0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
#2 0x5612bbb1087a in format_commit_item pretty.c:1801
#3 0x5612bbc33bab in strbuf_expand strbuf.c:429
#4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
#5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
#6 0x5612bba0a4d5 in show_log log-tree.c:781
#7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
#8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
#9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
#10 0x5612bb6951a2 in cmd_log builtin/log.c:883
#11 0x5612bb56c993 in run_builtin git.c:466
#12 0x5612bb56d397 in handle_builtin git.c:721
#13 0x5612bb56db07 in run_argv git.c:788
#14 0x5612bb56e8a7 in cmd_main git.c:923
#15 0x5612bb803682 in main common-main.c:57
#16 0x7f95c4c3c28f (/usr/lib/libc.so.6+0x2328f)
#17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115
0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
allocated by thread T0 here:
#0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x5612bbcdd556 in xrealloc wrapper.c:136
#2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
#3 0x5612bbc32acd in strbuf_add strbuf.c:298
#4 0x5612bbc33aec in strbuf_expand strbuf.c:418
#5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
#6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
#7 0x5612bba0a4d5 in show_log log-tree.c:781
#8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
#9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
#10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
#11 0x5612bb6951a2 in cmd_log builtin/log.c:883
#12 0x5612bb56c993 in run_builtin git.c:466
#13 0x5612bb56d397 in handle_builtin git.c:721
#14 0x5612bb56db07 in run_argv git.c:788
#15 0x5612bb56e8a7 in cmd_main git.c:923
#16 0x5612bb803682 in main common-main.c:57
#17 0x7f95c4c3c28f (/usr/lib/libc.so.6+0x2328f)
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
Shadow bytes around the buggy address:
0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==26009==ABORTING
Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.
Add a test that would have previously caused us to crash.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>