Public Git Hosting - tinycc.git/commit

commit	ffb95c2e0ced9c6f9ca9109e37ea039a67797118
author	Petr Skocik <pskocik@gmail.com>
	Sun, 17 Jan 2021 21:21:07 +0000 (17 22:21 +0100)
committer	Petr Skocik <pskocik@gmail.com>
	Sun, 17 Jan 2021 23:49:24 +0000 (18 00:49 +0100)
tree	2292a8fa22dee91a96c162eed49773f87c83ecad
parent	6b614c4debb82931644e9c5cfaf93ee8b6840e2d

Better handling of UCNs in strings

As the standard requires, take exactly 4 hex digits after the \u opener of a
Universal Character Name, or exactly 8 hex digits after \U. Reject shorter
sequences and do not consume any further hex digits
(https://port70.net/~nsz/c/c11/n1570.html#6.4.3,
https://port70.net/~nsz/c/c99/n1256.html#6.4.3).
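
The digit-count rule can be illustrated with a minimal sketch. This is not the
code from tccpp.c: the names hex_value and parse_ucn_digits are hypothetical,
and the sketch assumes the input is a NUL-terminated string positioned just
past the 'u' or 'U'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of one hex digit, or -1 if not hex. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical parser for the digits of a Universal Character Name.
 * `p` points just past the 'u' or 'U'; `kind` is that letter.
 * Reads exactly 4 digits for \u or exactly 8 for \U and never more.
 * On success stores the code point in *out and returns the number of
 * characters consumed (4 or 8). Returns 0 if too few hex digits follow.
 */
static size_t parse_ucn_digits(const char *p, int kind, unsigned long *out)
{
    size_t need, i;
    unsigned long cp = 0;

    assert(kind == 'u' || kind == 'U');
    need = (kind == 'u') ? 4 : 8;

    for (i = 0; i < need; i++) {
        /* A NUL terminator is not a hex digit, so the loop stops there. */
        int v = hex_value((unsigned char)p[i]);
        if (v < 0)
            return 0;               /* shorter than required: reject */
        cp = (cp << 4) | (unsigned long)v;
    }
    *out = cp;
    return need;                    /* any following hex digits stay unread */
}
```

With this sketch, the input "010dau" after \u would yield code point 0x10d and
consume 4 characters, leaving "au" as ordinary string content.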

The Unicode code point used to get truncated to 1 byte. Now it gets encoded as UTF-8,
matching gcc and clang behavior on Linux.
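
The expansion of a code point into UTF-8 can be sketched as follows. This is a
generic encoder, not the code from tccpp.c: the name utf8_encode is
hypothetical, and rejecting surrogates and values above U+10FFFF is an
assumption of the sketch rather than documented tcc behavior.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical encoder: writes the UTF-8 form of `cp` into `out` and
 * returns the number of bytes written (1 to 4). Returns 0 for values
 * that have no UTF-8 encoding (surrogates, or anything above U+10FFFF).
 */
static size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                        /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)       /* surrogates: rejected here */
        return 0;
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {                    /* 11110xxx then three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode range */
}
```

For example, \u010d (č) becomes the two bytes C4 8D under this scheme, where
truncation to one byte would have kept only 0x0D.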

TODO: Universal character names should also be supported in identifiers,
as in, e.g., char \u010dau_sv\u011bte[]="čau_světe";

tccpp.c
tests/tests2/97_utf8_string_literal.c