Fix thinko in `decodeStringUtf8`
commit ed3290740fd116318edec77b3ab3b628bac4460a
author Herbert Valerio Riedel <hvr@gnu.org>
Tue, 4 Oct 2016 06:15:38 +0000 (4 08:15 +0200)
committer Herbert Valerio Riedel <hvr@gnu.org>
Tue, 4 Oct 2016 06:15:38 +0000 (4 08:15 +0200)
tree 5ef85f860ebd543b15e9d78d4aa30614b23c1f64
parent a87fcd103c61dc170b5affd724f75c7991ccf260
Fix thinko in `decodeStringUtf8`

The thinko caused some two-byte UTF-8 encodings (e.g. that of U+0142)
to be unintentionally decoded into U+FFFD.

With this fix, the property

    [ c | c <- [minBound..maxBound]
        , c < '\xD800' || c >= '\xE000' -- surrogate pair codes
        , (decodeStringUtf8 . encodeStringUtf8) [c] /= [c]
        ] == ['\xfffe','\xffff']

holds. It's not clear to me why U+FFFE and U+FFFF ought to be singled
out. Needs more investigation.
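
For illustration only, here is a minimal sketch of the two-byte case
using nothing but `base`. The helpers `encode2` and `decode2` are
hypothetical names invented for this example; they are not the code in
Cabal/Distribution/Utils/String.hs and do not reproduce the original
thinko. They only show how a code point such as U+0142 maps to the
bytes 0xC5 0x82 and back, and how the round-trip check above looks when
restricted to U+0080..U+07FF.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- Hypothetical helper (not Cabal's code): encode a code point in the
-- two-byte range U+0080..U+07FF as 110xxxxx 10xxxxxx.
encode2 :: Char -> (Word8, Word8)
encode2 c = ( fromIntegral (0xC0 .|. (n `shiftR` 6))
            , fromIntegral (0x80 .|. (n .&. 0x3F)) )
  where
    n = ord c

-- Hypothetical helper: decode a two-byte sequence, yielding U+FFFD for
-- a bad lead byte, a bad continuation byte, or an overlong encoding.
decode2 :: (Word8, Word8) -> Char
decode2 (b0, b1)
  | b0 .&. 0xE0 /= 0xC0 = '\xFFFD'   -- lead byte is not 110xxxxx
  | b1 .&. 0xC0 /= 0x80 = '\xFFFD'   -- continuation is not 10xxxxxx
  | n < 0x80            = '\xFFFD'   -- overlong: fits in one byte
  | otherwise           = chr n
  where
    n = ((fromIntegral b0 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b1 .&. 0x3F)

main :: IO ()
main = do
  -- U+0142 is encoded as 0xC5 0x82, printed in decimal as (197,130).
  print (encode2 '\x0142')
  -- Round-trip failures in the two-byte range; expected to be empty.
  print [ c | c <- ['\x80' .. '\x7FF'], decode2 (encode2 c) /= c ]
```

A check of this shape, run against the real `decodeStringUtf8` and
`encodeStringUtf8`, might be one starting point for the testsuite
coverage mentioned below.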

TODO: testsuite coverage
Cabal/Distribution/Utils/String.hs