Public Git Hosting - elinks.git/commit

commit	450f227ea14e4ce5afa023f2d2c2d4dde4485e76
author	Kalle Olavi Niemitalo <kon@iki.fi>
	Sun, 17 Apr 2011 15:09:29 +0000 (17 18:09 +0300)
committer	Kalle Olavi Niemitalo <Kalle@Pulska.kon.iki.fi>
	Sun, 1 May 2011 19:14:55 +0000 (1 22:14 +0300)
tree	2e7a8c9c441c001f3faac8c608dfafcf19cf7022
parent	17712f9cf3287eb2059d4b593525acd3c65a48c9

I18N bug 1112: Use strange_chars[] for UTF-8 output too

Make u2cp_() map code points U+0080 to U+009F via strange_chars[] even
if the target codepage is UTF-8.  This helps with buggy web pages that
use a numeric character reference for U+0092 (such as &#146;) when they
mean ’ (U+2019).  This change does not affect how
ELinks decodes raw bytes 0x80 to 0x9F in HTML.
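
The lookup order described above can be sketched roughly as follows.
This is a minimal illustration, not the code in src/intl/charsets.c:
demo_u2cp(), demo_encode_utf8() and demo_strange_chars[] are
hypothetical names, and the table entries are illustrative placeholders
rather than the real contents of strange_chars[].  The point is only
that the U+0080...U+009F check now comes before the UTF-8 encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strange_chars[]: one replacement string per
 * code point in U+0080..U+009F.  Entries are illustrative assumptions;
 * unlisted slots are NULL and fall back to a space below. */
static const char *const demo_strange_chars[32] = {
	[0x11] = "'",	/* U+0091 */
	[0x12] = "'",	/* U+0092 */
	[0x13] = "\"",	/* U+0093 */
	[0x14] = "\"",	/* U+0094 */
	[0x16] = "-",	/* U+0096 */
	[0x17] = "-",	/* U+0097 */
};

/* Encode one code point (up to U+10FFFF) as UTF-8 into buf, which must
 * hold at least 5 bytes.  Returns buf. */
static const char *
demo_encode_utf8(unsigned int u, char buf[5])
{
	assert(buf != NULL);
	if (u < 0x80) {
		buf[0] = (char) u;
		buf[1] = '\0';
	} else if (u < 0x800) {
		buf[0] = (char) (0xC0 | (u >> 6));
		buf[1] = (char) (0x80 | (u & 0x3F));
		buf[2] = '\0';
	} else if (u < 0x10000) {
		buf[0] = (char) (0xE0 | (u >> 12));
		buf[1] = (char) (0x80 | ((u >> 6) & 0x3F));
		buf[2] = (char) (0x80 | (u & 0x3F));
		buf[3] = '\0';
	} else {
		buf[0] = (char) (0xF0 | (u >> 18));
		buf[1] = (char) (0x80 | ((u >> 12) & 0x3F));
		buf[2] = (char) (0x80 | ((u >> 6) & 0x3F));
		buf[3] = (char) (0x80 | (u & 0x3F));
		buf[4] = '\0';
	}
	return buf;
}

/* Simplified sketch of the conversion order: the C1 range is handled
 * via the replacement table even when the target is UTF-8. */
static const char *
demo_u2cp(unsigned int u, int target_is_utf8, char buf[5])
{
	if (u >= 0x80 && u <= 0x9F) {
		const char *s = demo_strange_chars[u - 0x80];

		return s ? s : " ";
	}
	if (u < 0x80 || target_is_utf8)
		return demo_encode_utf8(u, buf);
	return "*";	/* placeholder for a real codepage table lookup */
}
```

With this sketch, demo_u2cp(0x92, 1, buf) yields the table entry
instead of the UTF-8 encoding of a C1 control character, while
demo_u2cp(0x2019, 1, buf) still goes through the UTF-8 encoder.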

u2cp_() is used only via the u2cp and u2cp_no_nbsp macros.
Possible side effects of this change at each use of these macros:

* get_translation_table(): Not affected because it does not call u2cp
  if the target codepage is UTF-8.
* get_entity_string(): Numeric character references are affected, as intended.
  Character entity references are not affected because entities[]
  does not define any entities in the U+0080...U+009F range.
* kbd_field(), term_send_ucs(), field_op(): Affected.  It is no longer
  possible to enter code points U+0080...U+009F from the terminal.
  This should not be a problem in practice because those would be
  control characters anyway and should therefore be filtered by the
  slave process (which doesn't yet recognize them; bug 777).
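
To illustrate which inputs land in the affected range, here is a small
sketch that parses a numeric character reference and tests whether the
result falls in U+0080...U+009F.  demo_parse_ncr() and demo_is_c1() are
hypothetical helpers written for this note; they are not part of ELinks
and do not reflect how get_entity_string() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parse "&#NNN;" or "&#xHHH;" and return the code
 * point, or -1 if the string is not a well-formed numeric character
 * reference or the value exceeds U+10FFFF. */
static long
demo_parse_ncr(const char *s)
{
	unsigned long v = 0;
	unsigned int base = 10;
	int ndigits = 0;

	assert(s != NULL);
	if (strncmp(s, "&#", 2) != 0)
		return -1;
	s += 2;
	if (*s == 'x' || *s == 'X') {
		base = 16;
		s++;
	}
	for (; *s != ';'; s++) {
		unsigned int d;

		if (*s >= '0' && *s <= '9')
			d = (unsigned int) (*s - '0');
		else if (base == 16 && *s >= 'a' && *s <= 'f')
			d = (unsigned int) (*s - 'a') + 10;
		else if (base == 16 && *s >= 'A' && *s <= 'F')
			d = (unsigned int) (*s - 'A') + 10;
		else
			return -1;	/* also catches a missing ';' */
		v = v * base + d;
		if (v > 0x10FFFF)
			return -1;
		ndigits++;
	}
	return ndigits ? (long) v : -1;
}

/* True if the code point is in the range this change affects. */
static int
demo_is_c1(long u)
{
	return u >= 0x80 && u <= 0x9F;
}
```

Under these assumptions, "&#146;" and "&#x92;" both parse to U+0092 and
are affected, whereas "&#8217;" parses to U+2019 and is not.  The same
range test applies to code points arriving from the terminal.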

NEWS
src/intl/charsets.c