Rewrite FixNesting implementation to be tree-based.
commit 0767bbc12dbbc8a9c31cc235f055443257ffa51e
author Edward Z. Yang <ezyang@mit.edu>
Mon, 21 Oct 2013 05:18:59 +0000 (20 22:18 -0700)
committer Edward Z. Yang <ezyang@mit.edu>
Mon, 21 Oct 2013 05:37:01 +0000 (20 22:37 -0700)
tree 05b7a03b33bb1f59b6c770609aa381ed2942147a
parent b3640e1af6cef99dbfe1a8bdfd9acefc11fc8549
Rewrite FixNesting implementation to be tree-based.

This mega-patch rips out the FixNesting implementation and the related
ChildDef components.  The primary algorithmic change is to convert from
use of tokens to tree nodes, which are far more amenable to the style
of processing that FixNesting uses.  Additionally, FixNesting has been
changed to go bottom-up rather than top-down, in order to avoid needing
to implement backtracking.
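
As a rough illustration of the token-to-tree conversion described above, the following sketch builds a tree from a flat, already-balanced token list using a stack of open elements. It is written in Python rather than the project's PHP, and the `Node` class, the `tokens_to_tree` function, and the `(kind, value)` token tuples are all hypothetical names chosen for this example; they are not the classes added by this patch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: an element, or a text node named '#text'."""
    name: str
    data: str = ""                      # character data, used by text nodes only
    children: list = field(default_factory=list)


def tokens_to_tree(tokens):
    """Build a tree from a balanced list of (kind, value) tokens.

    Assumes every 'start' token has a matching 'end' token.
    Returns a synthetic '#root' node holding the top-level nodes.
    """
    root = Node("#root")
    stack = [root]                      # stack[-1] is the innermost open element
    for kind, value in tokens:
        if kind == "start":
            node = Node(value)
            stack[-1].children.append(node)
            stack.append(node)          # subsequent tokens become its children
        elif kind == "end":
            stack.pop()                 # close the innermost open element
        elif kind == "empty":
            stack[-1].children.append(Node(value))
        elif kind == "text":
            stack[-1].children.append(Node("#text", value))
        else:
            raise ValueError(f"unknown token kind: {kind!r}")
    return root


tokens = [
    ("start", "b"), ("text", "hi"), ("start", "i"), ("end", "i"), ("end", "b"),
    ("empty", "br"),
]
root = tokens_to_tree(tokens)
```

Once the nesting is explicit in the tree, a node's children can be inspected directly instead of being rediscovered by scanning for matching start and end tokens.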

This patch simplifies a good deal of the relevant logic, since we no
longer need to continually recalculate the nesting structure when
processing things.  However, the conversion to the alternate format
incurs some overhead, so for small inputs these changes are not a win.
One way to greatly reduce the constant factors here is to switch
entirely to libxml's representation and never serialize tokens;
this would require rewriting the injectors, however.

The iterative post-order traversal in FixNesting is a bit subtle: we
have essentially reified the stack and continuations of the recursive
version.
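
To make the idea of a reified stack concrete, here is a minimal sketch of an iterative post-order traversal. It is not the code in FixNesting.php: it is Python rather than PHP, and the `postorder` function and the `(name, children)` tuple layout are assumptions made for illustration. Each stack frame records a node together with the index of the next child to descend into, which plays the role of the continuation a recursive version would keep on the call stack.

```python
def postorder(root):
    """Yield nodes children-first (bottom-up) without recursion.

    A node is a hypothetical (name, children) tuple.
    """
    stack = [(root, 0)]                 # frame: (node, index of next child)
    while stack:
        node, i = stack.pop()
        children = node[1]
        if i < len(children):
            stack.append((node, i + 1))      # resume this node after the child
            stack.append((children[i], 0))   # descend into the child first
        else:
            yield node                       # all children done: process node


tree = ("div", [("p", [("b", []), ("i", [])]), ("span", [])])
order = [node[0] for node in postorder(tree)]
```

Because a node is yielded only after all of its children, any per-node validation runs bottom-up, matching the direction the commit message describes.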

We've removed support for %Core.EscapeInvalidChildren.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
22 files changed:
NEWS
configdoc/usage.xml
library/HTMLPurifier/ChildDef.php
library/HTMLPurifier/ChildDef/Chameleon.php
library/HTMLPurifier/ChildDef/Custom.php
library/HTMLPurifier/ChildDef/Empty.php
library/HTMLPurifier/ChildDef/List.php
library/HTMLPurifier/ChildDef/Optional.php
library/HTMLPurifier/ChildDef/Required.php
library/HTMLPurifier/ChildDef/StrictBlockquote.php
library/HTMLPurifier/ChildDef/Table.php
library/HTMLPurifier/ConfigSchema/schema/Core.EscapeInvalidChildren.txt
library/HTMLPurifier/Node.php
library/HTMLPurifier/Node/Text.php
library/HTMLPurifier/Strategy/FixNesting.php
tests/HTMLPurifier/ChildDef/CustomTest.php
tests/HTMLPurifier/ChildDef/ListTest.php
tests/HTMLPurifier/ChildDef/RequiredTest.php
tests/HTMLPurifier/ChildDef/TableTest.php
tests/HTMLPurifier/ChildDefHarness.php
tests/HTMLPurifier/ComplexHarness.php
tests/HTMLPurifier/Strategy/FixNestingTest.php