Bug
1840519 - Make typing surrogate pair behavior switchable with prefs r=m_kato,smaug
A lone surrogate should not appear in `DOMString` at least when the attribute
values of events because of ill-formed UTF-16 string.
`TextEventDispatcher` does not handle surrogate pairs correctly. It should not
split surrogate pairs when it sets `KeyboardEvent.key` value for avoiding the
problem in some DOM API wrappers, e.g., Rust-running-as-wasm.
On the other hand, `.charCode` is an unsigned long attribute and web apps
may use `String.fromCharCode(event.charCode)` to convert the input to string,
and unfortunately, `fromCharCode` does not support Unicode character code
points over `0xFFFF`. Therefore, we may need to keep dispatching 2 `keypress`
events per surrogate pair for the backward compatibility.
Therefore, this patch creates 2 prefs. One is for using single-keypress
event model and double-keypress event model. The other is for the latter,
whether `.key` value never has ill-formed UTF-16 or it's allowed.
If using the single-keypress event model --this is compatible with Safari and
Chrome in non-Windows platforms--, one `keypress` event is dispatched for
typing a surrogate pair. Then, its `.charCode` is over `0xFFFF` which can
work with `String.fromCodePoint()` instead of `String.fromCharCode()` and
`.key` value is set to simply the surrogate pair (i.e., its length is 2).
If using the double-keypress event model and disallowing ill-formed UTF-16
--this is the new default behavior for both avoiding ill-formed UTF-16 string
creation and keeping backward compatibility with not-maintained web apps using
`String.fromCharCode`--, 2 `keypress` events are dispatched. `.charCode` for
first one is the code of the high-surrogate, but `.key` is the surrogate pair.
Then, `.charCode` for second one is the low-surrogate and `.key` is empty
string. In this mode, `TextEditor` and `HTMLEditor` ignores the second
`keypress`. Therefore, web apps can cancel it only with the first `keypress`,
but it indicates the `keypress` introduces a surrogate pair with `.key`
attribute.
Otherwise, if using the double-keypress event model and allowing ill-formed
UTF-16 --this is the traditional our behavior and compatible with Chrome in
Windows--, 2 `keypress` events are dispatched with same `.charCode` values as
the previous mode, but first `.key` is the high-surrogate and the other's is
the low surrogate. Therefore, web apps can cancel either one of them or
both of them.
Finally, this patch makes `TextEditor` and `HTMLEditor` handle text input
with `keypress` events properly. Except in the last mode, `beforeinput` and
`input` events are fired once and their `data` values are the surrogate pair.
On the other hand, in the last mode, 2 sets of `beforeinput` and `input` are
fired and their `.data` values has only the surrogate so that ill-formed
UTF-16 values.
Note that this patch also fixes an issue on Windows. Windows may send a high
surrogate and a low surrogate with 2 sets of `WM_KEYDOWN` and `WM_KEYUP` whose
virtual keycode is `VK_PACKET` (typically, this occurs with `SendInput` API).
For handling this correctly, this patch changes `NativeKey` class to make it
just store the high surrogate for the first `WM_KEYDOWN` and `WM_KEYUP` and use
it when it'll receive another `WM_KEYDOWN` for a low surrogate.
Differential Revision: https://phabricator.services.mozilla.com/
D182142