Fix content-length mismatches
Summary:
This diff tweaks a few things that were manifest in a failing test, arising from a .php file which included multibyte characters.
(1) The Content-length header that our tests emit must be the exact number of bytes. We were using `Content-length: ${#CMD}` but this is wrong because it counts the number of chars, not bytes. I switched to `$(printf "%s" "$CMD" | wc -c)` which counts bytes.
(2) If we got a malformed json packet which lacked the trailing }, for example because the Content-length header was short, then Hh_json would raise a "Index out of bounds" error. That's not right. Hh_json is supposed to raise a "Hh_json.Syntax_error" in this case. I fixed it.
(3) There's an unfixable vulnerability in Jsonrpc. I documented it:
```
CARE!!! Jsonrpc is vulnerable to incomplete requests and malformed
Content-length headers...
The way it works around lack of threading in ocaml is with the
assumption that if any data is available on stdin then a complete
jsonrpc request can be read from stdin. If this is violated e.g.
if the Content-length header is one byte short, then Jsonrpc will
read a json string that lacks the final }, and will report this as
a recoverable error (malformed json). Next, Jsonrpc will see that
there is more data available on stdin, namely that final }, and so
will block until a header+body has been read on stdin -- but nothing
further will come beyond that }, so it blocks indefinitely.
The only solution is to take care that Content-length is exact!
```
(4) I haven't been able to think of scenarios for it, but the way clientLsp was checking "is there a message available" seemed incorrect in the case where a message was available in the Jsonrpc queue (i.e. had already been received by the daemon). I changed the API:
```
(* BEFORE *)
val get_read_fd : t -> Unix.file_descr (* can be used for 'select' *)
(* AFTER *)
(** similar to [has_message], but can be used to power a 'select' syscall
which will fire when a message is available. *)
val await_until_message :
t -> [> `Already_has_message | `Wait_for_data_here of Unix.file_descr ]
```
Really we should be returning an Lwt.t, but I'm not doing that because we're in a module that can be used without Lwt.
Reviewed By: Wilfred
Differential Revision:
D34909642
fbshipit-source-id:
249832ce4afe8965fd55567cd3aeb3601d1d9148