ldb:utf8: ldb_ascii_toupper() avoids real toupper()
commit c49c48afe09a1a78989628bbffd49dd3efc154dd
author Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Fri, 19 Apr 2024 21:57:15 +0000 (20 09:57 +1200)
committer Andrew Bartlett <abartlet@samba.org>
Tue, 23 Apr 2024 02:37:25 +0000 (23 02:37 +0000)
tree 9b24e0cb1e01bf68d9444381cdfb1818f605ae44
parent dca6b2d25529288eaf7b31baf37ca4f6de4f4b9d

If a character other than a lowercase ASCII letter has an uppercase
counterpart in some locale, toupper() will convert it to an int
codepoint. That codepoint is probably too big to fit in our char
return type, so we would truncate it to 8 bits, and the result becomes
an arbitrary mapping.
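
The locale-independent alternative can be sketched as follows. This is
a minimal illustration, not the actual ldb code: the name
ascii_toupper_sketch is hypothetical, and it assumes an ASCII-compatible
execution character set where 'a'..'z' are contiguous.

```c
/* Hypothetical helper: uppercase only the 26 lowercase ASCII letters.
 * No locale lookup is involved, so the result always fits in a char. */
static char ascii_toupper_sketch(char c)
{
	if (c >= 'a' && c <= 'z') {
		/* In ASCII the two cases differ by 'a' - 'A' (0x20). */
		return (char)(c - ('a' - 'A'));
	}
	/* Everything else, including bytes with the top bit set,
	 * is returned unchanged. */
	return c;
}
```

Because the mapping never consults the locale, the same input byte
always produces the same output byte.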

It would also behave strangely with a byte that has the top bit set,
say 0xE2. If char is unsigned on this system, that is 'â', which
uppercases to 'Â', with the codepoint 0xC2. That seems fine in
isolation, but remember this is ldb_utf8.c, and that byte was not a
codepoint but one byte of a longer UTF-8 encoding. In the more likely
case where char is signed, toupper() is being passed a negative
number, and the behaviour of that call is undefined.
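
To illustrate why bytes of a multi-byte sequence must pass through
untouched, here is a hedged sketch that uppercases a buffer byte by
byte. The function name ascii_upcase_bytes_sketch is hypothetical and
not part of ldb. It works on unsigned char so that no negative value
ever arises, and it never calls toupper().

```c
#include <stddef.h>

/* Hypothetical helper: uppercase ASCII letters in place.
 * Bytes >= 0x80 (parts of multi-byte UTF-8 sequences) never match
 * the 'a'..'z' range, so they are left exactly as they were. */
static void ascii_upcase_bytes_sketch(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] >= 'a' && buf[i] <= 'z') {
			buf[i] = (unsigned char)(buf[i] - ('a' - 'A'));
		}
	}
}
```

For example, given the bytes 'a', 0xE2, 0x82, 0xAC, 'z' (the euro sign
U+20AC between two letters), only the first and last bytes change; the
0xE2 lead byte is not rewritten to 0xC2, which would have corrupted the
sequence.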

Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Autobuild-User(master): Andrew Bartlett <abartlet@samba.org>
Autobuild-Date(master): Tue Apr 23 02:37:25 UTC 2024 on atb-devel-224
lib/ldb/common/ldb_utf8.c