gitweb: workaround surrogate code point problem
When gitweb attempts to automatically treat repository data as
UTF-8, it uses utf8::decode to activate Perl's UTF-8 flag.
Unfortunately, surrogate pairs (codepoints 0xD800-0xDFFF) are
also converted to UTF-8 if present in the input. However those
codepoints are only valid in UTF-16. Attempting to do any kind
of pattern match substitution on the strings that contain these
UTF-8 surrogate pair code points will result in a fatal
'Malformed UTF-8 character' error.
The substitution in question is attempting to replace control
characters with nice-looking escapes sequences. It only needs
to detect character values 0x00-0x1f, so switch into bytes mode
for the substitution to avoid the fatal error.
This results in the surrogates actually being sent back to the
browser for display which typically results in them being
rendered as a replacement character (0xfffd).
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>