<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xc="urn:xhtml-compiler"
xmlns:svn="urn:xhtml-compiler:Subversion"
svn:head-url="$HeadURL$"
svn:revision="$Revision$"
<title>HTML Purifier - Filter your HTML the standards-compliant way!</title>
<xi:include href="common-meta.xml" xpointer="xpointer(/*/node())" />
<meta name="description"
  content="HTML filter that guards against XSS and ensures standards-compliant output." />
content="HTMLPurifier, HTML Purifier, HTML, filter, filtering, standards, compliant, w3c, XSS, PHP, security, library, open source, LGPL, whitelist" />
<!-- See news.xhtml for definition -->
<link rel="alternate" type="application/rss+xml" title="News - HTML Purifier" href="news.rss" />
<script defer="defer" type="text/javascript" src="del.icio.us.js" xc:absolute="src"></script>
<!-- OpenID for Edward Z. Yang -->
<link rel="openid.server" href="https://pip.verisignlabs.com/server" />
<link rel="openid.delegate" href="http://edwardzyang.pip.verisignlabs.com/" />
<!-- Google OpenSearch -->
<link rel="search" href="opensearchdescription.xml"
  type="application/opensearchdescription+xml"
  title="HTML Purifier" />
<span class="html">HTML</span>
<span class="purifier">Purifier</span>
Standards-Compliant HTML Filtering
<xi:include href="common-navigation.xml" xpointer="xpointer(/*/node())" />
<div id="summary-safe">
HTML Purifier defeats XSS with an audited whitelist
<div id="summary-clean">
HTML Purifier ensures standards-compliant output
<div id="summary-open">
HTML Purifier is open-source and highly customizable
<div class="warning" style="margin-left:0; margin-right:0;">
<strong>The most recent release is a security update.</strong> Please upgrade
to HTML Purifier 3.1.1 or 2.1.5 as soon as possible.
<p><strong>HTML Purifier</strong> is a standards-compliant
<abbr>HTML</abbr> filter library written in
<abbr>PHP</abbr>. HTML Purifier will not only remove all malicious
code (better known as <abbr>XSS</abbr>) with a thoroughly audited,
secure <em>yet</em> permissive <strong><a
href="live/smoketests/printDefinition.php">whitelist</a></strong>,
it will also make sure your documents are
<strong>standards compliant</strong>, something only achievable with a
comprehensive knowledge of the <abbr>W3C</abbr>'s specifications.
Tired of using BBCode due to the current landscape of deficient or
insecure <abbr>HTML</abbr> filters? Have a
<strong><acronym>WYSIWYG</acronym></strong> editor but have never been able to use it? Looking
for high-quality, standards-compliant, open-source components for that
application you're building? HTML Purifier is for you!</p>
<blockquote class="fancy">
I'd just like to say we use HTML Purifier in <a href="http://www.iris.ac/">IRIS</a> for
filtering emails against XSS attacks and we've been more than impressed.
<div class="origin">&#8212; Chris Corbyn, <em>Senior IRIS Developer</em></div>
<xi:include href="download-box.xml" xpointer="xpointer(/*/node())" />
<h2 id="Background" class="clear">Background</h2>
<p>There are a number of open-source <abbr>HTML</abbr> filtering solutions out
there on the web already
(e.g. <acronym>PEAR</acronym>'s
<a href="http://pear.php.net/package/HTML_Safe">HTML_Safe</a>,
<a href="http://sourceforge.net/projects/kses">kses</a>,
<a href="http://simon.incutio.com/archive/2003/02/23/safeHtmlChecker">SafeHtmlChecker.class.php</a>). What sets HTML Purifier apart from them?
Aren't all of these choices <q>secure</q>?</p>
<p>When it comes to <abbr>HTML</abbr>, <strong>attention to
detail</strong> is key. Does the library demonstrate an in-depth
knowledge of the <abbr>DTD</abbr> that defines
<abbr>HTML</abbr>? Does it perform its filtering off a robust
whitelist rather than a usually outdated blacklist? Does it take
the care to check every single attribute in the document for validity?
Does it actually understand tag markup, or pay lip service with a series
of deficient regexes and str_replace calls?</p>
<p>Somewhere along the way, all of HTML Purifier's predecessors fall
flat. HTML_Safe dooms itself to attacks of the future by using a
blacklist. Configurable filters like kses and PHP Input Filter still
cannot validate the contents inside attributes. With all these gaps in
coverage, none of the usual libraries comes close to achieving
<strong>standards compliance</strong>. There is a user-unfriendly,
draconian <abbr>XML</abbr>-based filter called Safe HTML Checker,
but even it forgets that <code>&lt;a&gt;</code> tags cannot be nested
within each other!</p>
<p><strong>Know thy enemy.</strong> Wily hackers have a huge arsenal of
<abbr>XSS</abbr> vectors hidden within the depths of the
<abbr>HTML</abbr> specification. HTML Purifier takes its
effectiveness from the fact that it decomposes the whole document
into tokens and rigorously processes them: removing
non-whitelisted elements, transforming bad-practice tags like font into
span, checking the nesting of tags and their children, and
validating all attributes according to their <abbr>RFC</abbr>s.
HTML Purifier's comprehensive algorithms are complemented by a
<strong>breadth of knowledge</strong>, ensuring that richly formatted
documents pass through unstripped.</p>
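<p>The token-by-token processing described above can be illustrated with a
rough sketch. This is not HTML Purifier's code or API (the library itself is
written in PHP); it is a heavily simplified Python analogue built on the
standard library's html.parser module. The names ALLOWED, RENAME,
SAFE_SCHEMES, WhitelistFilter and purify are hypothetical and exist only for
this example, and the tiny whitelist stands in for a real, audited one.</p>

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: element name -> allowed attribute names.
# A real definition is far larger and derived from the HTML DTD.
ALLOWED = {"p": set(), "b": set(), "em": set(), "strong": set(),
           "span": set(), "a": {"href"}}
RENAME = {"font": "span"}            # transform a bad-practice tag
DROP_CONTENT = {"script", "style"}   # discard these elements and their text
# Simplification: only absolute URLs with these prefixes are kept.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistFilter(HTMLParser):
    """Rebuilds markup from parser events, keeping only whitelisted parts."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments
        self.open_tags = []  # stack of tags we have emitted and not closed
        self.skip = 0        # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        tag = RENAME.get(tag, tag)
        if tag not in ALLOWED:
            return  # non-whitelisted element: drop the tag, keep its text
        if tag == "a" and "a" in self.open_tags:
            return  # <a> may not be nested inside <a>
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # attribute not whitelisted for this element
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # reject javascript: and other unknown schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        tag = RENAME.get(tag, tag)
        if tag not in self.open_tags:
            return  # stray close tag
        # Close everything opened inside this element, then the element.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush buffered text
        while self.open_tags:  # close anything left open
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def purify(html):
    """Return a filtered copy of html (illustrative only, not production-safe)."""
    f = WhitelistFilter()
    f.feed(html)
    return f.result()
```

<p>Calling purify on a paragraph that carries an onclick attribute, a font
element and a trailing script element yields the paragraph with the attribute
removed, font rewritten as span, and the script dropped entirely. A whitelist
is used rather than a blacklist so that anything the filter does not recognise
is rejected by default.</p>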
<p>To my knowledge, there is nothing else in the wild that offers
protection from <abbr>XSS</abbr>, standards compliance, and the
corrective processing of poorly formed <abbr>HTML</abbr>
simultaneously. Don't take my word for it, though:
do your research. Investigate the other libraries, and decide for
yourself who you would prefer to be the <strong>gatekeeper</strong> to
<p>To find out more, you can read the
<a href="comparison.html"><strong>Comparison</strong></a>
for a play-by-play analysis of the major filter libraries currently
<blockquote class="fancy">
[Y]ou save my day by allowing me not to write another damned HTML parser.
&#8212; Joseph Halter, <em>Technical Director at Akira Web</em>
<h2 id="Plugins">Plugins</h2>
<p>HTML Purifier is a great library to integrate with existing
<abbr>CMS</abbr>es and other applications or <acronym>WYSIWYG</acronym>
editors. Currently, we have plugins for these applications:</p>
<li><a href="http://www.phorum.org/phorum5/read.php?62,127035">Phorum</a> (in use at our very own forums!)</li>
<li><a href="http://htmlpurifier.org/svnroot/htmlpurifier/trunk/plugins/modx.txt">MODx</a></li>
<li><a href="http://bart.motd.be/projects/html-purifier-drupal-module">Drupal</a> by Bart Jansens</li>
<li><a href="http://urbangiraffe.com/plugins/html-purified/">WordPress</a> by John Godley</li>
<li><a href="http://extensions.joomla.org/component/option,com_mtree/task,viewlink/link_id,4094/Itemid,35/">Joomla</a> by Double D</li>
<li><a href="http://www.mindloop.be/nieuws/nieuwe-ontwikkelingen/htmlpurifier-and-the-codeigniter-framework">CodeIgniter</a> by Andy Mathijs</li>
<strong>Notice:</strong>
Plugins provided by third parties have not been vetted by us: use
them at your own risk. If you are having a problem with a plugin,
please consult the plugin author before asking for help here (we'll
be more than happy to help, but it might be a problem with the
plugin rather than HTML Purifier).
<blockquote class="fancy">
This plugin is on top of my favorite list[.] I am going to heavily
depend on it since my clients insist on having <acronym>WYSIWYG</acronym> and I insist on
having pages that validate and are semantically sound.
&#8212; David Molliere, <em>MODx Marketing &amp; Design Team</em>
<p>Plugins for other major applications are gladly accepted!</p>
<h2 id="Users">Users</h2>
<p>Here are some open-source applications that use HTML Purifier:</p>
<tr><td><a href="http://www.aliro.org/">Aliro</a></td><td><a href="http://aliro-svn.cvsdude.com/aliro/trunk/extclasses/HTMLPurifier.php">3.1.0</a></td></tr>
<tr><td><a href="http://code.google.com/p/jibberbook/">Jibberbook</a></td><td><a href="http://jibberbook.googlecode.com/svn/trunk/source/htmlpurifier/HTMLPurifier.standalone.php">3.1.0</a></td></tr>
<tr><td><a href="http://brilaps.com/index.php?content=mia">Mia</a></td><td><a href="http://code.google.com/p/mia-chat/source/browse/trunk/mia_0_8_x/includes/htmlpurifier/HTMLPurifier.php">3.1.0</a></td></tr>
<tr><td><a href="http://kohanaphp.com/home.html">Kohana</a></td><td><a href="http://trac.kohanaphp.com/browser/trunk/system/vendor">3.1.0</a></td></tr>
<tr><td><a href="http://www.midgard-project.org/">Midgard</a></td><td>via PEAR</td></tr>
<tr><td><a href="http://www.bitweaver.org/">BitWeaver</a></td><td><a href="http://www.bitweaver.org/wiki/HTMLPurifier">via PEAR</a>, see <a href="http://bitweaver.cvs.sourceforge.net/bitweaver/_bit_install/install_checks.php?view=markup">install_checks.php</a></td></tr>
<tr><td><a href="http://code.google.com/p/project-babel/issues/entry">Project Babel</a></td><td>via PEAR and Midgard</td></tr>
<tr><td><a href="http://code.google.com/p/php-atompub-server/">PHP Atompub Server</a></td><td><a href="http://code.google.com/p/php-atompub-server/wiki/SanitizingInput">via download</a></td></tr>
<p>If I've forgotten anyone, drop me a line with a link to both
your application and the use of HTML Purifier in your code repository,
and I'll add your application to this list.</p>
<h3>Hall of Limbo: PHP4</h3>
<p>The following applications are using HTML Purifier 2.1, for PHP4 compatibility.
While this is fine, I would much rather they go PHP5!</p>
<tr><td>There are currently no applications using an up-to-date version of HTML Purifier 2.1.</td></tr>
<h3>Hall of the Past</h3>
<p>The following projects package HTML Purifier with their software, but are
not up-to-date. They are putting their user base at risk of attack
by not keeping HTML Purifier updated. If you're a user or developer for these projects, please
raise your voice and help to get them fixed!</p>
<tr><td><!--<a href="http://code.google.com/p/wpids/">-->WPIDS<!--</a>--></td><td><a href="http://code.google.com/p/wpids/source/browse/trunk/htmlpurifier/HTMLPurifier.php">3.0.0</a></td></tr>
<tr><td><!--<a href="http://noserub.com/">-->NoseRub<!--</a>--></td><td><a href="http://code.google.com/p/noserub/source/browse/trunk/vendors/htmlpurifier/HTMLPurifier.php">3.0.0</a></td></tr>
<tr><td><!--<a href="http://getlilina.org/">-->Lilina News Aggregator<!--</a>--></td><td><a href="http://lilina.googlecode.com/svn/trunk/lilina/inc/contrib/HTMLPurifier.standalone.php">2.1.3</a></td></tr>
<tr><td><!--<a href="http://info.tikiwiki.org/tiki-index.php">-->TikiWiki<!--</a>--></td><td><a href="http://tikiwiki.svn.sourceforge.net/viewvc/tikiwiki/branches/1.10/lib/HTMLPurifier.php?view=markup">2.1.3</a></td></tr>
<tr><td><!--<a href="http://code.google.com/p/xoopsbrasil/">-->XOOPS Cube BRASIL<!--</a>--></td><td><a href="http://code.google.com/p/xoopsbrasil/source/browse/xoops_trust_path/PEAR/HTMLPurifier.php">2.1.3</a></td></tr>
<tr><td>Lichen Webmail</td><td><a href="http://trac.lichen-mail.org/browser/trunk/libs/HTMLPurifier.php">2.0.1</a>, see <a href="https://trac.lichen-mail.org/ticket/79">ticket #79</a></td></tr>
<tr><td>PHProjekt</td><td><a href="http://thinkforge.org/plugins/scmcvs/cvsweb.php/phprojekt50/lib/html/library/HTMLPurifier.php?rev=HEAD;content-type=text%2Fplain;cvsroot=phprojekt5">1.6.0</a></td></tr>
<tr><td>XDForum</td><td><a href="http://xdforum.svn.sourceforge.net/viewvc/xdforum/trunk/xdforum/includes/htmlpurifier/library/HTMLPurifier.php?view=markup">1.3.2</a></td></tr>
<h2 id="Propaganda">Spread the Word!</h2>
<p>Help spread awareness about HTML Purifier by:</p>
href="http://del.icio.us/post?v=4&noui&url=http://htmlpurifier.org/&title=HTML%20Purifier%20-%20Filter%20your%20HTML%20the%20standards-compliant%20way!"
id="delicious">Bookmarking this website</a> on your <strong>del.icio.us</strong> account, and/or</li>
<div>Including this little <strong>label</strong> on your website:
<a href="http://htmlpurifier.org/"><img
  src="live/art/powered.png"
  alt="Powered by HTML Purifier" border="0" /></a>, with this code:
<pre class="long">&lt;a href="http://htmlpurifier.org/"&gt;&lt;img
  src="http://htmlpurifier.org/live/art/powered.png"
  alt="Powered by HTML Purifier" border="0" /&gt;&lt;/a&gt;</pre>