Relink index page to use demo, fix demo's library inclusion paths, de-absolute-ify...
[htmlpurifier-web.git] / index.xhtml
blob 0fc8d3f4e9b03c1298c1f9e3b63a202f4d47245a
1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml"
5 xmlns:xi="http://www.w3.org/2001/XInclude"
6 xmlns:xc="urn:xhtml-compiler"
7 xmlns:rss="urn:xhtml-compiler:RSSGenerator"
8 xmlns:svn="urn:xhtml-compiler:Subversion"
9 svn:head-url="$HeadURL$"
10 svn:revision="$Revision$"
11 xc:rss-from-svn="yes"
12 xml:lang="en" lang="en">
13 <head>
14 <title>HTML Purifier - Filter your HTML the standards-compliant way!</title>
15 <xi:include href="common-meta.xml" xpointer="xpointer(/*/node())" />
16 <link rel="stylesheet" href="index.css" type="text/css" />
17 <meta name="description"
18 content="HTML filter that guards against XSS and ensures standards-compliant output." />
19 <meta name="keywords"
20 content="HTMLPurifier, HTML Purifier, HTML, filter, filtering, standards, compliant, w3c, XSS, PHP, security, library, open source, LGPL, whitelist" />
21 <link rel="alternate" type="application/rss+xml"
22 title="News - HTML Purifier" href="news.rss"
23 rss:for="news-container"
24 rss:description="Recent news and updates on HTML Purifier" />
25 </head>
26 <body>
28 <img src="logo.png" id="logo" alt="HTML Purifier" />
29 <h1 id="header"><span class="html">HTML</span>
30 <span class="purifier">Purifier</span></h1>
32 <div id="navigation">
33 <h2>Navigation</h2>
34 <ol>
35 <li><a href="#News">News</a></li>
36 <li><a href="#Plugins">Plugins</a></li>
37 <li><a href="#Demo">Demo</a></li>
38 <li><strong><a href="#Download">Download</a></strong></li>
39 <li><a href="#Resources">Resources</a></li>
40 <li><a href="http://hp.jpsband.org/phorum/">Forum</a></li>
41 <li><a href="#Contact">Contact</a></li>
42 </ol>
43 </div>
45 <div id="content">
47 <a href="#Download"><img src="download.png" class="download-button" alt="Download HTML Purifier" /></a>
49 <p><strong>HTML Purifier</strong> is a standards-compliant
50 <abbr>HTML</abbr> filter library written in
51 <abbr>PHP</abbr>. HTML Purifier will not only remove all malicious
52 code (better known as <abbr>XSS</abbr>) with a thoroughly audited,
53 secure <em>yet</em> permissive <strong><a
54 href="live/smoketests/printDefinition.php">whitelist</a></strong>,
55 it will also make sure your documents are
56 <strong>standards compliant</strong>, something only achievable with a
57 comprehensive knowledge of <abbr>W3C</abbr>'s specifications.
58 Tired of using BBCode due to the current landscape of deficient or
59 insecure <abbr>HTML</abbr> filters? Have a
60 <strong><acronym>WYSIWYG</acronym></strong> editor but never been able to use it? Looking
61 for high-quality, standards-compliant, open-source components for that
62 application you're building? HTML Purifier is for you!</p>
64 <blockquote class="fancy">
65 <div class="quote">
66 I'd just like to say we use HTML Purifier in <a href="http://www.iris.ac/">IRIS</a> for
67 filtering emails against XSS attacks and we've been more than impressed.
68 </div>
69 <div class="origin">&mdash; Chris Corbyn, <em>Senior IRIS Developer</em></div>
70 </blockquote>
72 <h2 id="Background">Background</h2>
74 <p>There are a number of open-source <abbr>HTML</abbr> filtering solutions out
75 there on the web already
76 (e.g. <acronym>PEAR</acronym>'s
77 <a href="http://pear.php.net/package/HTML_Safe">HTML_Safe</a>,
78 <a href="http://sourceforge.net/projects/kses">kses</a>
79 and
80 <a href="http://simon.incutio.com/archive/2003/02/23/safeHtmlChecker">
81 SafeHtmlChecker.class.php</a>). What sets HTML Purifier apart from them?
82 Aren't all of these choices <q>secure</q>?</p>
84 <p>When it comes to <abbr>HTML</abbr>, <strong>attention to
85 detail</strong> is key. Does the library demonstrate an in-depth
86 knowledge of the <abbr>DTD</abbr> that defines
87 <abbr>HTML</abbr>? Does it perform its filtering off a robust
88 whitelist rather than a usually outdated blacklist? Does it take
89 the care to check every single attribute in the document for validity?
90 Does it actually understand tag markup, or merely pay lip service to it with a series
91 of deficient regexes and <code>str_replace</code> calls?</p>
93 <p>Somewhere along the way, all of HTML Purifier's predecessors fall
94 flat. HTML_Safe dooms itself to attacks of the future by using a
95 blacklist. Configurable filters like kses and PHP Input Filter still
96 cannot validate the contents inside attributes. With all these gaps in
97 coverage, none of the usual libraries come close to achieving
98 <strong>standards-compliance</strong>. There is a user-unfriendly,
99 draconian <abbr>XML</abbr>-based filter called Safe HTML Checker,
100 but even it forgets that <code>&lt;a&gt;</code> tags cannot be nested
101 within each other!</p>
103 <p><strong>Know thy enemy.</strong> Wily hackers have a huge arsenal of
104 <abbr>XSS</abbr> hidden within the depths of the
105 <abbr>HTML</abbr> specification. HTML Purifier takes its
106 effectiveness from the fact that it will decompose the whole document
107 into tokens, and rigorously process the tokens by removing
108 non-whitelisted elements, transforming bad-practice tags like <code>font</code> into
109 <code>span</code>, properly checking the nesting of tags and their children, and
110 validating all attributes according to their <abbr>RFC</abbr>s.
111 HTML Purifier's comprehensive algorithms are complemented by a
112 <strong>breadth of knowledge</strong>, ensuring that richly formatted
113 documents pass through unstripped.</p>
115 <p><a href="comparison.html"><img src="compare.png" class="compare-button" alt="Compare HTML Purifier with other filters" /></a></p>
117 <p>To my knowledge, there is nothing else in the wild that offers
118 protection from <abbr>XSS</abbr>, standards-compliance, and the
119 corrective processing of poorly formed <abbr>HTML</abbr>
120 simultaneously. Don't take my word for it though:
121 do your research. Investigate the other libraries, and decide for
122 yourself who you would prefer to be the <strong>gatekeeper</strong> to
123 your system.</p>
125 <p>To find out more, you can read the
126 <a href="comparison.html"><strong>Comparison</strong></a>
127 for a play-by-play analysis of the major filter libraries currently
128 out there.</p>
130 <blockquote class="fancy">
131 <div class="quote">
132 [Y]ou save my day by allowing me not to write another damned HTML parser.
133 </div>
134 <div class="origin">
135 &mdash; Joseph Halter, <em>Technical Director at Akira Web</em>
136 </div>
137 </blockquote>
140 <h2 id="News">News</h2>
142 <div id="news-container" class="news">
144 <div class="item" id="news-svn-and-misc">
145 <h3 class="title"><abbr>SVN</abbr> viewer and migration</h3>
146 <div class="date">Tue, 17 April 2007 20:08:11 EDT</div>
148 <div class="body">
149 <p><a href="http://htmlpurifier.org/viewvc.cgi/htmlpurifier/trunk/">ViewVC</a>
150 for viewing our <abbr>SVN</abbr> repository and <abbr>RSS</abbr> changelog feeds for most of our
151 <abbr>HTML</abbr> pages (for example, the changelog for this page
152 is at <a href="index.rss">index.rss</a>) were rolled out a few weeks
153 ago. Feel free to check them out.</p>
154 <p>Also, I've purchased the <code>htmlpurifier.org</code> domain
155 so this website will be migrating to that address soon. I'm not in any
156 particular hurry to get the migration done, but I hope to see some
157 other changes in the website as well when the move is made. <code>;-)</code></p>
158 </div>
159 </div>
161 <div class="item" id="news-pro-php-podcast">
162 <h3 class="title">Pro::PHP Podcast mention</h3>
163 <div class="date">Mon, 09 April 2007 23:23:44 EDT</div>
165 <div class="body">
166 <p>I'd like to thank
167 <a href="http://podcast.phparch.com/main/index.php/main">Pro::PHP</a>
168 podcast for mentioning HTML Purifier on their
169 <a href="http://podcast.phparch.com/main/index.php/episodes:20070405">April
170 5, 2007 show</a>. I've always been a fan of their informative
171 podcasts, and was delighted to discover that
172 they had decided to include HTML Purifier on the program list
173 (even though it was at the very end).
174 </p>
176 <p>Against my better judgment, I have a few clarifications I'd like
177 to make about the podcast:</p>
179 <ul>
180 <li>While HTML Purifier can use Tidy, it's completely optional.
181 Tidy is used to pretty-print <abbr>HTML</abbr>.</li>
182 <li>We do use the <abbr>XSS</abbr> cheatsheet for
183 <a href="live/smoketests/xssAttacks.php">testing
184 the library</a>, but I actually did not know about the cheatsheet
185 until the library was well under development.</li>
186 <li>Yes, the top domain is actually a school band website that
187 I'm borrowing hosting from. I'm playing around with getting
188 a dedicated domain at htmlpurifier.org.</li>
189 </ul>
191 <p>Once again, thanks for mentioning the library. Perhaps someday
192 I'll do a screencast going through some of HTML Purifier's major
193 features.</p>
194 </div>
195 </div>
197 <div class="item" id="news-1.6.0-released">
198 <h3 class="title">HTML Purifier 1.6.0 released</h3>
199 <div class="date">Sun, 01 April 2007 23:40:59 EDT</div>
201 <div class="body">
202 <p>Sorry, no April Fool's joke this year. To compensate, we have
203 the 1.6.0 <q>Long Overdue</q> release. This version contains support
204 for a number of deprecated attributes HTML Purifier should have
205 had from the very beginning, including the name, bgcolor, border,
206 width and height attributes. The <abbr>CSS</abbr> property 'height',
207 rel and rev attributes and ID blacklist regexps are also available.
208 In addition, HTML Purifier will give a friendly error message
209 when you try to enable an element or attribute that doesn't exist.</p>
211 <p>All in all, this is a fairly compact release, but it does
212 address some common requests brought up in the Forums, so I suggest
213 you upgrade anyway. You can check <a
214 href="http://htmlpurifier.org/svnroot/htmlpurifier/tags/1.6.0/NEWS">News</a>
215 for a complete changelog, but there's not much else.</p>
216 </div>
217 </div>
219 <div class="item" id="news-keep-me-updated">
220 <h3 class="title">A note to you distributors</h3>
221 <div class="date">Wed, 28 March 2007 21:05:12 EDT</div>
223 <div class="body">
224 <p>Yes, <strong>TikiWiki</strong> and <strong>PHProjekt</strong>,
225 I'm looking at you. I am absolutely delighted that these two fairly
226 popular and robust open-source projects are using my library.
227 However, I am not at all pleased at the fact that you have not
228 been keeping up to date with HTML Purifier releases.</p>
229 <ul>
230 <li>TikiWiki: <a href="http://tikiwiki.cvs.sourceforge.net/tikiwiki/tiki/lib/HTMLPurifier.php?view=log">1.3.0</a></li>
231 <li>PHProjekt: <del><a href="http://thinkforge.org/plugins/scmcvs/cvsweb.php/phprojekt50/lib/html/library/HTMLPurifier.php?cvsroot=phprojekt5">1.3.2</a></del> <ins>1.6.0</ins></li>
232 </ul>
233 <p>I entreat ye, please sign up for the announcement list and
234 keep my library up-to-date! It's not difficult, I keep backwards
235 compatibility, and it makes your users happy! Especially that
236 <acronym>DOM</acronym> <abbr>XML</abbr> bug, which it seems was
237 far more serious than I originally thought it was. That is all.</p>
238 <p><strong>Update</strong>: I'm happy to say that PHProjekt has updated the library
239 to 1.6.0. Still waiting on a response from TikiWiki though.</p>
240 </div>
241 </div>
243 <div class="item" id="news-pear-channel">
244 <h3 class="title"><acronym>PEAR</acronym> channel available</h3>
245 <div class="date">Sat, 24 March 2007 20:27:42 EDT</div>
247 <div class="body">
248 <p>At the prompting of Lars Olesen, HTML Purifier now
249 has its very own <acronym>PEAR</acronym> channel. This means that
250 installing HTML Purifier is as simple as:</p>
251 <pre class="command">pear channel-discover htmlpurifier.org
252 pear install hp/HTMLPurifier</pre>
253 </div>
254 </div>
256 </div> <!-- end news-container -->
258 <h2 id="Plugins">Plugins</h2>
260 <p>HTML Purifier is a great library to integrate with existing
261 <abbr>CMS</abbr>es and other applications or <acronym>WYSIWYG</acronym>
262 editors. Currently, we have plugins for:</p>
264 <ul>
265 <li><a href="http://bart.motd.be/projects/html-purifier-drupal-module">Drupal HTML Purifier Module</a> (beta) by Bart Jansens</li>
266 <li><a href="http://htmlpurifier.org/svnroot/htmlpurifier/trunk/plugins/modx.txt">MODx Content Management System</a></li>
267 </ul>
269 <blockquote class="fancy">
270 <div class="quote">
271 This plugin is on top of my favorite list[.] I am going to heavily
272 depend on it since my clients insist on having <acronym>WYSIWYG</acronym> and I insist on
273 having pages that validate and are semantically sound.
274 </div>
275 <div class="origin">
276 &mdash; David Molliere, <em>MODx Marketing &amp; Design Team</em>
277 </div>
278 </blockquote>
280 <p>Plugins for other major applications gladly accepted!</p>
283 <h2 id="Demo">Demo</h2>
285 <p>Enter your <abbr>HTML</abbr> and see how it will be filtered!</p>
286 <form id="filter" action="demo.php?post" method="post">
287 <fieldset>
288 <legend>HTML Purifier Input</legend>
289 <textarea name="html" cols="50" rows="10" id="html"></textarea>
290 <div><abbr>XHTML</abbr> 1.0 Strict output? <input type="checkbox" value="1" name="strict" /></div>
291 <div>
292 <input type="submit" value="Submit" name="submit" class="button" />
293 </div>
294 </fieldset>
295 </form>
297 <p>...or try these sample inputs:</p>
299 <ul>
300 <li><a href="demo.php?get&amp;html=%3Cimg+src%3D%22javascript%3Aevil%28%29%3B%22+onload%3D%22evil%28%29%3B%22+%2F%3E">Malicious code removed</a></li>
301 <li><a href="demo.php?html=%3Cb%3EBold&amp;submit=Submit">Missing end tags fixed</a></li>
302 <li><a href="demo.php?html=%3Cb%3EInline+%3Cdel%3Econtext+%3Cdiv%3ENo+block+allowed%3C%2Fdiv%3E%3C%2Fdel%3E%3C%2Fb%3E&amp;submit=Submit">Illegal nesting fixed</a></li>
303 <li><a href="demo.php?html=%3Ccenter%3ECentered%3C%2Fcenter%3E&amp;strict=1&amp;submit=Submit">Deprecated tags converted</a></li>
304 <li><a href="demo.php?html=%3Cspan+style%3D%22color%3A%23COW%3Bfloat%3Aaround%3Btext-decoration%3Ablink%3B%22%3EText%3C%2Fspan%3E&amp;submit=Submit"><abbr>CSS</abbr> validated</a></li>
305 <li><a href="demo.php?html=%3Ctable%3E%0D%0A++%3Ccaption%3E%0D%0A++++Cool+table%0D%0A++%3C%2Fcaption%3E%0D%0A++%3Ctfoot%3E%0D%0A++++%3Ctr%3E%0D%0A++++++%3Cth%3EI+can+do+so+much%21%3C%2Fth%3E%0D%0A++++%3C%2Ftr%3E%0D%0A++%3C%2Ftfoot%3E%0D%0A++%3Ctr%3E%0D%0A++++%3Ctd+style%3D%22font-size%3A16pt%3B%0D%0A++++++color%3A%23F00%3Bfont-family%3Asans-serif%3B%0D%0A++++++text-align%3Acenter%3B%22%3EWow%3C%2Ftd%3E%0D%0A++%3C%2Ftr%3E%0D%0A%3C%2Ftable%3E&amp;submit=Submit">Rich formatting preserved</a></li>
306 </ul>
308 <h2 id="Download">Download</h2>
310 <p>The current version is
311 <strong>1.6.0</strong>. Pick your distribution:</p>
313 <ul>
314 <li><a class="download" href="releases/htmlpurifier-1.6.0.tar.gz">HTML Purifier 1.6.0 (.tar.gz)</a> [<a href="releases/htmlpurifier-1.6.0.tar.gz.sig">sig</a>]</li>
315 <li><a class="download" href="releases/htmlpurifier-1.6.0.zip">HTML Purifier 1.6.0 (.zip)</a> [<a href="releases/htmlpurifier-1.6.0.zip.sig">sig</a>]</li>
316 <li><a class="download" href="releases/htmlpurifier-1.6.0-strict.tar.gz">HTML Purifier 1.6.0 PHP5-strict (.tar.gz)</a> [<a href="releases/htmlpurifier-1.6.0-strict.tar.gz.sig">sig</a>]</li>
317 <li><a class="download" href="releases/htmlpurifier-1.6.0-strict.zip">HTML Purifier 1.6.0 PHP5-strict (.zip)</a> [<a href="releases/htmlpurifier-1.6.0-strict.zip.sig">sig</a>]</li>
318 </ul>
320 <p>The <abbr>PHP</abbr>5-strict version is exactly the same
321 as the regular version with a few tweaks
322 to prevent it from complaining with
323 <a href="http://php.net/manual/en/ref.errorfunc.php#e-strict">E_STRICT</a>
324 warnings. This library is open-source, licensed under the
325 <a href="http://www.gnu.org/licenses/lgpl.html"><abbr>LGPL</abbr> v2.1+</a>.</p>
327 <p>HTML Purifier is also available as a <acronym>PEAR</acronym> package.
328 You can install it by executing:</p>
330 <pre class="command">pear channel-discover htmlpurifier.org
331 pear install hp/HTMLPurifier</pre>
333 <p>You can also grab the latest developmental code from our Subversion
334 repository. Simply execute this command:</p>
336 <pre class="command">svn co http://htmlpurifier.org/svnroot/htmlpurifier/trunk ./</pre>
338 <p>...or <a href="http://htmlpurifier.org/svnroot/htmlpurifier/trunk/">browse
339 anonymously</a> at that address. Previous releases can be obtained by browsing
340 the <a href="releases/">release directory</a>
341 or checking code out of the
342 <a href="http://htmlpurifier.org/svnroot/htmlpurifier/tags/">tags/
343 directory</a>. You can also use
344 <a href="http://htmlpurifier.org/viewvc.cgi/htmlpurifier/trunk/">ViewVC to view the repository</a>.</p>
346 <p><acronym>SHA-1</acronym> checksums:</p>
348 <pre>
349 088569ae55d99bdbbee6031215ecc26f60489b70 htmlpurifier-1.6.0-strict.tar.gz
350 3deb033d6b20c22e7883cf2f7f719605fe6dd161 htmlpurifier-1.6.0-strict.zip
351 b4eed7787b84b7a86b24beaa5394616600780ceb htmlpurifier-1.6.0.tar.gz
352 3e375e83bc782e031362ce49c559e0d4f2511b6f htmlpurifier-1.6.0.zip
353 </pre>
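<p>As a rough sketch, assuming the GNU coreutils <code>sha1sum</code> tool is
installed (other systems may ship a differently named utility), you can
compute the checksum of a downloaded archive and compare it by eye against
the list above:</p>

<pre class="command">sha1sum <strong>$filename</strong></pre>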
355 <p>There are also <tt>.sig</tt> files which you can use to cryptographically verify
356 that the release is from me, Edward Z. Yang. You can find
357 my <a href="http://www.thewritingpot.com/gpgpubkey.asc">public key
358 here (0x869C48DA)</a>. My key's fingerprint is:
359 <tt>3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA</tt>.</p>
361 <p>Verify a release with this command:</p>
363 <pre class="command">gpg --verify <strong>$filename</strong>.sig</pre>
365 <p>You can be notified of new releases by a low-traffic announce list. Subscribe
366 here:</p>
368 <form method="post" action="http://scripts.dreamhost.com/add_list.cgi">
369 <input type="hidden" name="list" value="htmlpurifier@jpsband.org" />
370 <input type="hidden" name="domain" value="jpsband.org" />
371 <input type="hidden" name="emailit" value="1" />
372 <div>Name: <input name="name" /> E-mail: <input name="email" /></div>
373 <div><input type="submit" name="submit" value="Subscribe to Announcement List" />
374 <input type="submit" name="unsub" value="Unsubscribe" /></div>
375 </form>
377 <h2 id="Resources">Resources</h2>
378 <ul>
379 <li><strong><a href="docs/">End-User
380 Documentation</a></strong> &mdash; In-depth documents on how to get
381 the most out of HTML Purifier.</li>
382 <li><a href="http://hp.jpsband.org/mantis/">Mantis Bugtracker</a> &mdash; Found a bug? Report
383 it here!</li>
384 <li><a href="http://hp.jpsband.org/phorum/">Support Forum</a> &mdash; Talk about all things
385 HTML Purifier.</li>
386 <li><a href="live/smoketests/printDefinition.php">Print
387 Definition</a> &mdash; If you want to actually see what HTML Purifier's
388 filtering rules are, look no further than to this page. You can even
389 experiment with the configuration to see how things respond to different
390 directives.</li>
391 <li><a href="live/smoketests/xssAttacks.php"><abbr>XSS</abbr>
392 Attacks Smoketest</a> &mdash; Tests how well HTML Purifier fares
393 against RSnake's famous cheatsheet of <abbr>XSS</abbr> attacks.</li>
394 <li><a href="live/TODO">Roadmap</a>
395 &mdash; Subject to lots of delays, but it's a glimpse of the future.</li>
396 <li><a href="live/art/">Artwork</a>
397 &mdash; Extra media goodies.</li>
398 <li><a href="live/configdoc/plain.html">Configuration
399 documentation</a> &mdash; See the <code>INSTALL</code> document on how to
400 configure your HTML Purifier installation.</li>
401 <li><a href="http://htmlpurifier.org/doxygen/html/">Doxygen-generated
402 Documentation</a> &mdash; No class left undocumented! Cross-referenced
403 code! A must-read for any prospective HTML Purifier hacker.
404 (See also the <a href="http://htmlpurifier.org/phpdoc/">PHPDoc-generated
405 Documentation</a>.)</li>
406 </ul>
408 <h2 id="Propaganda">Spread the Word!</h2>
410 <p>Help spread awareness about HTML Purifier by:</p>
412 <ul>
413 <li><a
414 href="http://del.icio.us/post?v=4&amp;noui&amp;url=http://htmlpurifier.org/&amp;title=HTML%20Purifier%20-%20Filter%20your%20HTML%20the%20standards-compliant%20way!"
415 id="delicious">Bookmarking this website</a> on your <strong>del.icio.us</strong> account, and/or</li>
416 <li>
417 <div>Including this little <strong>label</strong> on your website:
418 <a href="http://htmlpurifier.org/"><img
419 src="live/art/powered.png"
420 alt="Powered by HTML Purifier" border="0" /></a>, with this code:
421 </div>
422 <pre>&lt;a href=&quot;http://htmlpurifier.org/&quot;&gt;&lt;img
423 src=&quot;http://htmlpurifier.org/live/art/powered.png&quot;
424 alt=&quot;Powered by HTML Purifier&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;</pre>
425 </li>
426 </ul>
428 <h2 id="Contact">Contact</h2>
430 <p>You can send me an email at
431 <a href="mailto:admin@htmlpurifier.org">htmlpurifier@jpsband.org</a>.
432 However, I prefer that you use the forums for asking general support
433 questions (response time will be the same, I promise!).
434 Any emails I receive will be considered public: if I think a
435 solution I thought up to help you would be particularly useful to others,
436 expect it to show up on the website.</p>
438 </div>
440 </body>
441 </html>