vcl PDF tokenizer: fix EOF position when \r is not followed by \n
commit 6b1d5bafdc722d07d3dc4980764275a6caa707ba
author Miklos Vajna <vmiklos@collabora.com>
Wed, 12 May 2021 08:51:09 +0000 (12 10:51 +0200)
committer Miklos Vajna <vmiklos@collabora.com>
Wed, 12 May 2021 12:58:51 +0000 (12 14:58 +0200)
tree b6445e3a24d5a5372a6c3c77b06a6ea706a57970
parent 21f92ce72ada2ea92ef4997a7a0fae986f023b6c

Otherwise this would break partial tokenize when we only read a trailer
in the middle of the file: m_aEOFs.back() is one byte larger than
rStream.Tell(), so we read past the end of the trailer, resulting in a
tokenize failure.

What's special about the bugdoc:

- it has 2 xrefs, the first is incomplete, and refers to a second which
is later in the file
- the object length is an indirect object, triggering an xref lookup
- the first EOF is followed by a \r, but that \r is not followed by a \n

This results in reading past the end of the first trailer and then
triggering a lookup failure.
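
For illustration only, the intended end-of-line handling after "%%EOF" can be
sketched as below. This is not the code from pdfdocument.cxx: endOfEofLine()
is a hypothetical helper, it works on a std::string rather than on an
SvStream, and it assumes the recorded EOF offset should point just past the
line ending that was actually consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not LibreOffice API): nPos is the offset just past
// the "%%EOF" marker in rData. Returns the offset just past the line ending
// that follows it, without assuming that '\r' is always paired with '\n'.
std::size_t endOfEofLine(const std::string& rData, std::size_t nPos)
{
    if (nPos < rData.size() && rData[nPos] == '\r')
    {
        ++nPos;
        // Only count the '\n' if it is really there; a lone '\r' is a
        // one-byte line ending.
        if (nPos < rData.size() && rData[nPos] == '\n')
            ++nPos;
    }
    else if (nPos < rData.size() && rData[nPos] == '\n')
    {
        ++nPos;
    }
    return nPos;
}
```

With a lone \r, the returned offset is one byte past the marker, not two, so
a reader bounded by that offset stops at the end of the first trailer
instead of running into the bytes that follow it.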

FWIW, pdfium does the same in
<https://pdfium.googlesource.com/pdfium/+/59d107323f6727bbd5f8a4d0843081790638a1dd/core/fpdfapi/parser/cpdf_syntax_parser.cpp#446>,
so we're now in sync with it.

Change-Id: Ia556a25e333b5e4f1418d92a98d74358862120e2
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/115466
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
vcl/qa/cppunit/filter/ipdf/data/comment-end.pdf [new file with mode: 0644]
vcl/qa/cppunit/filter/ipdf/ipdf.cxx
vcl/source/filter/ipdf/pdfdocument.cxx