<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
  xmlns:xi="http://www.w3.org/2001/XInclude"
  xmlns:xc="urn:xhtml-compiler"
  xmlns:svn="urn:xhtml-compiler:Subversion"
  svn:head-url="$HeadURL: svn+ssh://ezyang@htmlpurifier.org/svnroot/htmlpurifier-web/trunk/comparison.xhtml $"
  svn:revision="$Revision: 1389 $"
<title>HTML Purifier Sucks - HTML Purifier</title>
<xi:include href="common-meta.xml" xpointer="xpointer(/*/node())" />
<meta name="keywords" content="HTMLPurifier, HTML Purifier, HTML, filter, sucks, devils advocate, evil, bad" />
<xi:include href="common-header.xml" xpointer="xpointer(/*/node())" />
<h1 id="title">HTML Purifier Sucks</h1>
<blockquote class="fancy">
...needless to say, I don't think I'll bother investigating further!
— Stormrider on <a href="http://www.sitepoint.com/forums/showpost.php?p=3621314&postcount=119">SitePoint Forums</a>
Contrary to what <a href="comparison.html">this comparison page</a>
suggests, HTML Purifier sucks. It swallows oceans, it drinks blood,
and it is more effective than your dust-busting Hoover 3000. Why does it
suck? How can we make it un-sucky?
This document is currently under construction.
As of version 2.1.3, HTML Purifier's library folder contains
<strong>164 files</strong> in <strong>30 folders</strong>, weighing
in at about 696 kilobytes. For comparison, the CodeIgniter
web application framework contains 147 files, 29 folders and weighs
These back-of-a-napkin statistics are very telling about HTML Purifier's
internal architecture: object-oriented, one class per file and small
components, to the extreme. That architecture also works against HTML
Purifier in the performance department. For most input strings, the
memory footprint of the library's source code is higher than the
memory actually used to process the HTML (four megabytes,
<a href="http://forums.devnetwork.net/viewtopic.php?p=405175#405175">last I checked</a>.)
HTML Purifier is extremely slow. Various benchmarks have shown HTML
Purifier to be an order of magnitude slower than comparable solutions.
The <a href="http://www.sitepoint.com/forums/showpost.php?p=3621314&postcount=119">Stormrider
quote</a> at the very beginning of this document concerns one very
specific problem: whitespace.
It is trivially easy to nuke the contents of a document by inserting
a <code>&lt;/div&gt;</code> tag near the beginning, when DOMLex