1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml"
5 xmlns:xi="http://www.w3.org/2001/XInclude"
6 xmlns:xc="urn:xhtml-compiler"
7 xmlns:rss="urn:xhtml-compiler:RSSGenerator"
8 xml:lang="en" lang="en">
9 <head>
10 <title>HTML Purifier - Filter your HTML the standards-compliant way!</title>
11 <xi:include href="common-meta.xml" xpointer="xpointer(/*/node())" />
12 <link rel="stylesheet" href="index.css" type="text/css" />
13 <meta name="description"
14 content="HTML filter that guards against XSS and ensures standards-compliant output." />
15 <meta name="keywords"
16 content="HTMLPurifier, HTML Purifier, HTML, filter, filtering, standards, compliant, w3c, XSS, PHP, security, library, open source, LGPL, whitelist" />
17 <link rel="alternate" type="application/rss+xml"
18 title="News - HTML Purifier" href="news.rss"
19 rss:for="news-container"
20 rss:description="Recent news and updates on HTML Purifier" />
21 </head>
22 <body>
24 <img src="logo.png" id="logo" alt="HTML Purifier" />
25 <h1 id="header"><span class="html">HTML</span>
26 <span class="purifier">Purifier</span></h1>
28 <div id="navigation">
29 <h2>Navigation</h2>
30 <ol>
31 <li><a href="#News">News</a></li>
32 <li><a href="#Plugins">Plugins</a></li>
33 <li><a href="#Demo">Demo</a></li>
34 <li><strong><a href="#Download">Download</a></strong></li>
35 <li><a href="#Resources">Resources</a></li>
36 <li><a href="phorum/">Forum</a></li>
37 <li><a href="#Contact">Contact</a></li>
38 </ol>
39 </div>
41 <div id="content">
43 <a href="#Download"><img src="download.png" class="download-button" alt="Download HTML Purifier" /></a>
45 <p><strong>HTML Purifier</strong> is a standards-compliant
46 <abbr>HTML</abbr> filter library written in
47 <abbr>PHP</abbr>. HTML Purifier will not only remove all malicious
48 code (better known as <abbr>XSS</abbr>) with a thoroughly audited,
49 secure <em>yet</em> permissive <strong><a
50 href="http://hp.jpsband.org/live/smoketests/printDefinition.php">whitelist</a></strong>,
51 it will also make sure your documents are
52 <strong>standards compliant</strong>, something only achievable with a
53 comprehensive knowledge of <abbr>W3C</abbr>'s specifications.
54 Tired of using BBCode due to the current landscape of deficient or
55 insecure <abbr>HTML</abbr> filters? Have a
56 <strong><acronym>WYSIWYG</acronym></strong> editor but never been able to use it? Looking
57 for high-quality, standards-compliant, open-source components for that
58 application you're building? HTML Purifier is for you!</p>
60 <blockquote class="fancy">
61 <div class="quote">
62 I'd just like to say we use HTML Purifier in <a href="http://www.iris.ac/">IRIS</a> for
63 filtering emails against XSS attacks and we've been more than impressed.
64 </div>
65 <div class="origin">&mdash; Chris Corbyn, <em>Senior IRIS Developer</em></div>
66 </blockquote>
68 <h2 id="Background">Background</h2>
70 <p>There are a number of open-source <abbr>HTML</abbr> filtering solutions out
71 there on the web already
72 (e.g. <acronym>PEAR</acronym>'s
73 <a href="http://pear.php.net/package/HTML_Safe">HTML_Safe</a>,
74 <a href="http://sourceforge.net/projects/kses">kses</a>
75 and
76 <a href="http://simon.incutio.com/archive/2003/02/23/safeHtmlChecker">
77 SafeHtmlChecker.class.php</a>). What sets HTML Purifier apart from them?
78 Aren't all of these choices <q>secure</q>?</p>
80 <p>When it comes to <abbr>HTML</abbr>, <strong>attention to
81 detail</strong> is key. Does the library demonstrate an in-depth
82 knowledge of the <abbr>DTD</abbr> that defines
83 <abbr>HTML</abbr>? Does it perform its filtering off a robust
84 whitelist rather than a usually outdated blacklist? Does it take
85 the care to check every single attribute in the document for validity?
86 Does it actually understand tag markup, or pay lip service with a series
87 of deficient regexes and <code>str_replace</code> calls?</p>
89 <p>Somewhere along the way, all of HTML Purifier's predecessors fall
90 flat. HTML_Safe dooms itself to attacks of the future by using a
91 blacklist. Configurable filters like kses and PHP Input Filter still
92 cannot validate the contents inside attributes. With all these gaps in
93 coverage, none of the usual libraries come close to achieving
94 <strong>standards compliance</strong>. There is a user-unfriendly,
95 draconian <abbr>XML</abbr>-based filter called Safe HTML Checker,
96 but even it forgets that <code>&lt;a&gt;</code> tags cannot be nested
97 within each other!</p>
99 <p><strong>Know thy enemy.</strong> Wily hackers have a huge arsenal of
100 <abbr>XSS</abbr> hidden within the depths of the
101 <abbr>HTML</abbr> specification. HTML Purifier takes its
102 effectiveness from the fact that it will decompose the whole document
103 into tokens, and rigorously process the tokens by removing
104 non-whitelisted elements, transforming bad-practice tags like font into
105 span, properly checking the nesting of tags and their children and
106 validating all attributes according to their <abbr>RFC</abbr>s.
107 HTML Purifier's comprehensive algorithms are complemented by a
108 <strong>breadth of knowledge</strong>, ensuring that richly formatted
109 documents pass through unstripped.</p>
111 <p><a href="comparison.html"><img src="compare.png" class="compare-button" alt="Compare HTML Purifier with other filters" /></a></p>
113 <p>To my knowledge, there is nothing else in the wild that offers
114 protection from <abbr>XSS</abbr>, standards compliance, and the
115 corrective processing of poorly formed <abbr>HTML</abbr>
116 simultaneously. Don't take my word for it though:
117 do your research. Investigate the other libraries, and decide for
118 yourself who you would prefer to be the <strong>gatekeeper</strong> to
119 your system.</p>
121 <p>To find out more, you can read the
122 <a href="comparison.html"><strong>Comparison</strong></a>
123 for a play-by-play analysis of the major filter libraries currently
124 out there.</p>
126 <blockquote class="fancy">
127 <div class="quote">
128 [Y]ou save my day by allowing me not to write another damned HTML parser.
129 </div>
130 <div class="origin">
131 &mdash; Joseph Halter, <em>Technical Director at Akira Web</em>
132 </div>
133 </blockquote>
136 <h2 id="News">News</h2>
138 <div id="news-container" class="news">
140 <div class="item" id="news-pear-channel">
141 <h3 class="title"><acronym>PEAR</acronym> channel available</h3>
142 <div class="date">Sat, 24 March 2007 20:27:42 EDT</div>
144 <div class="body">
145 <p>At the prompting of Lars Olesen, HTML Purifier now
146 has its very own <acronym>PEAR</acronym> channel. This means that
147 installing HTML Purifier is as simple as:</p>
148 <pre class="command">pear channel-discover hp.jpsband.org
149 pear install hp/HTMLPurifier</pre>
150 </div>
151 </div>
153 <div class="item" id="news-1.5.0-released">
154 <h3 class="title">HTML Purifier 1.5.0 released</h3>
155 <div class="date">Fri, 23 March 2007 22:42:12 EDT</div>
157 <div class="body">
158 <p>The 1.5.0 major bugfix
159 release is available today. There have been some major internal
160 refactoring efforts, but these changes are invisible to you.</p>
162 <p>Intrepid souls wanting to test out the new
163 <code>HTMLModuleManager</code> class can check out the
164 <code>HTMLModule</code>s. Also, I will personally assist anyone
165 who has modified <code>HTMLDefinition.php</code>. <strong>If you
166 have patched any files, please consult the Support forums before
167 upgrading.</strong></p>
169 <p>And now, the goodies:</p>
171 <ul>
172 <li><strong><abbr>XHTML</abbr> 1.1-style modularization of
173 <code>HTMLDefinition</code>.</strong> Instead of one monster,
174 huge <code>HTMLDefinition</code> class, the file has been
175 partitioned into modular bits organized into categories
176 like <q>Hypertext</q>, <q>Lists</q> and <q>Tables</q>. The
177 design of these modules makes it possible to arbitrarily
178 add your own elements without ever having to patch a core
179 file. However, the interface is unintuitive, not
180 documented, and definitely going to change. Keep your eyes
181 on this one.</li>
182 <li>Rudimentary internationalization system implemented. It's
183 not used yet, but will become the foundation of a projected
184 error reporting feature HTML Purifier will be getting soon.</li>
185 <li><code>x</code> subtag now allowed in language codes.</li>
186 <li>Buggy chameleon support for <code>ins</code> and <code>del</code>
187 fixed.</li>
188 <li>Element by element AllowedAttribute declaration now possible
189 for global attributes. Instead of <code>*.class</code>, you can write
190 <code>span.class</code> (the old syntax still works, and enables
191 the attribute for all elements).</li>
192 <li>Fatal error when <abbr>PHP</abbr>4 <acronym>DOM</acronym>
193 <abbr>XML</abbr> extension was loaded now fixed.</li>
194 <li>YouTube filter regexp now multiline.</li>
195 </ul>
196 <p>...as well as assorted code refactoring (all
197 bugfixes are covered above). See <a
198 href="http://hp.jpsband.org/svnroot/htmlpurifier/tags/1.5.0/NEWS">News</a>
199 for a complete changelog.</p>
200 </div>
201 </div>
204 <div class="item" id="news-rss-feed">
205 <h3 class="title"><abbr>RSS</abbr> feed!</h3>
206 <div class="date">Sat, 17 March 2007 5:42:12 EDT</div>
208 <div class="body">
209 <p>We have a shiny new <abbr>RSS</abbr> feed
210 at <a href="news.rss">news.rss</a>, which is hooked up to this
211 news feed. Subscribe for release notifications as well as random
212 news about HTML Purifier.</p>
213 </div>
214 </div>
216 <div class="item" id="news-status-on-api">
217 <h3 class="title">Status on 1.5 and the Advanced <abbr>API</abbr></h3>
218 <div class="date">Wed, 14 March 2007 5:31:46 EDT</div>
220 <div class="body">
221 <p>Quick update on the status of version 1.5. The flagship
222 new feature of this release is to be an advanced system for selecting
223 and creating elements and attributes. You can view the <a
224 href="http://hp.jpsband.org/live/docs/dev-advanced-api.html">projected
225 advanced <abbr>API</abbr> here</a>.</p>
227 <p>If you actually took the time to scan the document, you may notice
228 that it's incomplete. This is a very big problem. I am slowly grinding
229 away at the details, but any suggestions and comments would be
230 greatly appreciated. You can post anything you want to see in the
231 <a href="http://hp.jpsband.org/phorum/list.php?2">general forums</a>.</p>
232 </div>
233 </div>
235 <div class="item" id="news-utf8-tutorial">
236 <h3 class="title"><abbr>UTF-8</abbr> Tutorial</h3>
237 <div class="date">Sat, 27 Jan 2007 13:30:56 EST</div>
239 <div class="body">
240 <p>Here's a tutorial on <a
241 href="http://hp.jpsband.org/live/docs/enduser-utf8.html">HTML Purifier
242 and <abbr>UTF-8</abbr> character encoding issues</a>. It discusses
243 how to figure out your character encoding, why you should use
244 <abbr>UTF-8</abbr>, and how to migrate (should you choose to do
245 so). </p>
246 </div>
247 </div>
249 </div> <!-- end news-container -->
251 <h2 id="Plugins">Plugins</h2>
253 <p>HTML Purifier is a great library to integrate with existing
254 <abbr>CMS</abbr>es and other applications or <acronym>WYSIWYG</acronym> editors. Currently, we have plugins
255 for:</p>
257 <ul>
258 <li><a href="http://bart.motd.be/projects/html-purifier-drupal-module">Drupal HTML Purifier Module</a> (beta) by Bart Jansens</li>
259 <li><a href="http://hp.jpsband.org/svnroot/htmlpurifier/trunk/plugins/modx.txt">MODx Content Management System</a></li>
260 </ul>
262 <blockquote class="fancy">
263 <div class="quote">
264 This plugin is on top of my favorite list[.] I am going to heavily
265 depend on it since my clients insist on having <acronym>WYSIWYG</acronym> and I insist on
266 having pages that validate and are semantically sound.
267 </div>
268 <div class="origin">
269 &mdash; David Molliere, <em>MODx Marketing &amp; Design Team</em>
270 </div>
271 </blockquote>
273 <p>Plugins for other major applications gladly accepted!</p>
276 <h2 id="Demo">Demo</h2>
278 <p>Enter your <abbr>HTML</abbr> and see how it will be filtered!</p>
279 <form id="filter" action="http://hp.jpsband.org/live/docs/examples/demo.php?post" method="post">
280 <fieldset>
281 <legend>HTML Purifier Input</legend>
282 <textarea name="html" cols="50" rows="10" id="html"></textarea>
283 <div><abbr>XHTML</abbr> 1.0 Strict output? <input type="checkbox" value="1" name="strict" /></div>
284 <div>
285 <input type="submit" value="Submit" name="submit" class="button" />
286 </div>
287 </fieldset>
288 </form>
290 <p>...or try these sample inputs:</p>
292 <ul>
293 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?get&amp;html=%3Cimg+src%3D%22javascript%3Aevil%28%29%3B%22+onload%3D%22evil%28%29%3B%22+%2F%3E">Malicious code removed</a></li>
294 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?html=%3Cb%3EBold&amp;submit=Submit">Missing end tags fixed</a></li>
295 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?html=%3Cb%3EInline+%3Cdel%3Econtext+%3Cdiv%3ENo+block+allowed%3C%2Fdiv%3E%3C%2Fdel%3E%3C%2Fb%3E&amp;submit=Submit">Illegal nesting fixed</a></li>
296 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?html=%3Ccenter%3ECentered%3C%2Fcenter%3E&amp;strict=1&amp;submit=Submit">Deprecated tags converted</a></li>
297 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?html=%3Cspan+style%3D%22color%3A%23COW%3Bfloat%3Aaround%3Btext-decoration%3Ablink%3B%22%3EText%3C%2Fspan%3E&amp;submit=Submit"><abbr>CSS</abbr> validated</a></li>
298 <li><a href="http://hp.jpsband.org/live/docs/examples/demo.php?html=%3Ctable%3E%0D%0A++%3Ccaption%3E%0D%0A++++Cool+table%0D%0A++%3C%2Fcaption%3E%0D%0A++%3Ctfoot%3E%0D%0A++++%3Ctr%3E%0D%0A++++++%3Cth%3EI+can+do+so+much%21%3C%2Fth%3E%0D%0A++++%3C%2Ftr%3E%0D%0A++%3C%2Ftfoot%3E%0D%0A++%3Ctr%3E%0D%0A++++%3Ctd+style%3D%22font-size%3A16pt%3B%0D%0A++++++color%3A%23F00%3Bfont-family%3Asans-serif%3B%0D%0A++++++text-align%3Acenter%3B%22%3EWow%3C%2Ftd%3E%0D%0A++%3C%2Ftr%3E%0D%0A%3C%2Ftable%3E&amp;submit=Submit">Rich formatting preserved</a></li>
299 </ul>
301 <h2 id="Download">Download</h2>
303 <p>The current version is
304 <strong>1.5.0</strong>. Pick your distribution:</p>
306 <ul>
307 <li><a class="download" href="releases/htmlpurifier-1.5.0.tar.gz">HTML Purifier 1.5.0 (.tar.gz)</a> [<a href="releases/htmlpurifier-1.5.0.tar.gz.sig">sig</a>]</li>
308 <li><a class="download" href="releases/htmlpurifier-1.5.0.zip">HTML Purifier 1.5.0 (.zip)</a> [<a href="releases/htmlpurifier-1.5.0.zip.sig">sig</a>]</li>
309 <li><a class="download" href="releases/htmlpurifier-1.5.0-strict.tar.gz">HTML Purifier 1.5.0 PHP5-strict (.tar.gz)</a> [<a href="releases/htmlpurifier-1.5.0-strict.tar.gz.sig">sig</a>]</li>
310 <li><a class="download" href="releases/htmlpurifier-1.5.0-strict.zip">HTML Purifier 1.5.0 PHP5-strict (.zip)</a> [<a href="releases/htmlpurifier-1.5.0-strict.zip.sig">sig</a>]</li>
311 </ul>
313 <p>The <abbr>PHP</abbr>5-strict version is exactly the same
314 as the regular version with a few tweaks
315 to prevent it from complaining with
316 <a href="http://php.net/manual/en/ref.errorfunc.php#e-strict">E_STRICT</a>
317 warnings. This library is open-source, licensed under the
318 <a href="http://www.gnu.org/licenses/lgpl.html"><abbr>LGPL</abbr> v2.1+</a>.</p>
320 <p>HTML Purifier is also available as a <acronym>PEAR</acronym> package.
321 You can install it by executing:</p>
323 <pre class="command">pear channel-discover hp.jpsband.org
324 pear install hp/HTMLPurifier</pre>
326 <p>You can also grab the latest development code from our Subversion
327 repository. Simply execute this command:</p>
329 <pre class="command">svn co http://hp.jpsband.org/svnroot/htmlpurifier/trunk ./</pre>
331 <p>...or <a href="http://hp.jpsband.org/svnroot/htmlpurifier/trunk/">browse
332 anonymously</a> at that address. Previous releases can be obtained by browsing
333 the <a href="releases/">release directory</a>
334 or checking code out of the
335 <a href="http://hp.jpsband.org/svnroot/htmlpurifier/tags/">tags/
336 directory</a>.</p>
338 <p><acronym>SHA-1</acronym> checksums:</p>
340 <pre>
341 ec0c9cd1f24840a93cc3785e109dad925d8acd8c htmlpurifier-1.5.0-strict.tar.gz
342 c3f48848a33345cfdb01ec61db7dc5e28e4fbef8 htmlpurifier-1.5.0-strict.zip
343 d48dc62acfb26605428e49b4275e8f689cf04439 htmlpurifier-1.5.0.tar.gz
344 e687a1ff0008d0303e9ec964704accdc59ed96ef htmlpurifier-1.5.0.zip
345 </pre>
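<p>As a rough sketch, the checksums listed above could be checked from a shell before unpacking a download. This assumes GNU coreutils' <code>sha1sum</code> is installed and that the archive sits in the current directory; <code>verify_sha1</code> is a hypothetical helper written for this illustration, not something shipped with HTML Purifier:</p>

```shell
#!/bin/sh
# verify_sha1 FILE EXPECTED_HEX
# Hypothetical helper (not part of HTML Purifier): succeeds only when the
# SHA-1 digest of FILE equals EXPECTED_HEX. Assumes sha1sum is available.
verify_sha1() {
    # sha1sum prints "<digest>  <filename>"; keep only the digest field.
    actual=$(sha1sum "$1" | cut -d ' ' -f 1)
    # An empty digest means the file could not be read: treat as failure.
    if [ -z "$actual" ]; then
        return 1
    fi
    [ "$actual" = "$2" ]
}

# Example: compare the .tar.gz release with the checksum listed above.
# The check is skipped when the archive is not in the current directory.
if [ -f htmlpurifier-1.5.0.tar.gz ]; then
    if verify_sha1 htmlpurifier-1.5.0.tar.gz d48dc62acfb26605428e49b4275e8f689cf04439; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH"
    fi
fi
```

<p>On its own, a matching checksum mainly guards against a corrupted download; the <tt>.sig</tt> files are what tie a release to its author.</p>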
347 <p>There are also <tt>.sig</tt> files which you can use to cryptographically verify
348 that the release is from me, Edward Z. Yang. You can find
349 my <a href="http://www.thewritingpot.com/gpgpubkey.asc">public key
350 here (0x869C48DA)</a>. My key's fingerprint is:
351 <tt>3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA</tt>.</p>
353 <p>Verify with this command:</p>
355 <pre class="command">gpg --verify <strong>$filename</strong>.sig</pre>
357 <p>You can be notified of new releases by a low-traffic announce list. Subscribe
358 here:</p>
360 <form method="post" action="http://scripts.dreamhost.com/add_list.cgi">
361 <input type="hidden" name="list" value="htmlpurifier@jpsband.org" />
362 <input type="hidden" name="domain" value="jpsband.org" />
363 <input type="hidden" name="emailit" value="1" />
364 <div>Name: <input name="name" /> E-mail: <input name="email" /></div>
365 <div><input type="submit" name="submit" value="Subscribe to Announcement List" />
366 <input type="submit" name="unsub" value="Unsubscribe" /></div>
367 </form>
369 <h2 id="Resources">Resources</h2>
370 <ul>
371 <li><strong><a href="http://hp.jpsband.org/live/docs/">End-User
372 Documentation</a></strong> &mdash; In-depth documents on how to get
373 the most out of HTML Purifier.</li>
374 <li><a href="mantis/">Mantis Bugtracker</a> &mdash; Found a bug? Report
375 it here!</li>
376 <li><a href="phorum/">Support Forum</a> &mdash; Talk about all things
377 HTML Purifier.</li>
378 <li><a href="http://hp.jpsband.org/live/smoketests/printDefinition.php">Print
379 Definition</a> &mdash; If you want to actually see what HTML Purifier's
380 filtering rules are, look no further than to this page. You can even
381 experiment with the configuration to see how things respond to different
382 directives.</li>
383 <li><a href="http://hp.jpsband.org/live/smoketests/xssAttacks.php"><abbr>XSS</abbr>
384 Attacks Smoketest</a> &mdash; Tests how well HTML Purifier fares
385 against RSnake's famous cheatsheet of <abbr>XSS</abbr> attacks.</li>
386 <li><a href="http://hp.jpsband.org/live/TODO">Roadmap</a>
387 &mdash; Subject to lots of delays, but it's a glimpse of the future.</li>
388 <li><a href="http://hp.jpsband.org/live/art/">Artwork</a>
389 &mdash; Extra media goodies.</li>
390 <li><a href="http://hp.jpsband.org/live/configdoc/plain.html">Configuration
391 documentation</a> &mdash; See the <code>INSTALL</code> document on how to
392 configure your HTML Purifier installation.</li>
393 <li><a href="http://hp.jpsband.org/doxygen/html/">Doxygen-generated
394 Documentation</a> &mdash; No class left undocumented! Cross-referenced
395 code! A must-read for any prospective HTML Purifier hacker.
396 (See also the <a href="http://hp.jpsband.org/phpdoc/">PHPDoc-generated
397 Documentation</a>.)</li>
398 </ul>
400 <h2 id="Propaganda">Spread the Word!</h2>
402 <p>Help spread awareness about HTML Purifier by:</p>
404 <ul>
405 <li><a
406 href="http://del.icio.us/post?v=4&amp;noui&amp;url=http://hp.jpsband.org/&amp;title=HTML%20Purifier%20-%20Filter%20your%20HTML%20the%20standards-compliant%20way!"
407 id="delicious">Bookmarking this website</a> on your <strong>del.icio.us</strong> account, and/or</li>
408 <li>
409 <div>Including this little <strong>label</strong> on your website:
410 <a href="http://hp.jpsband.org/"><img
411 src="http://hp.jpsband.org/live/art/powered.png"
412 alt="Powered by HTML Purifier" border="0" /></a>, with this code:
413 </div>
414 <pre>&lt;a href=&quot;http://hp.jpsband.org/&quot;&gt;&lt;img
415 src=&quot;http://hp.jpsband.org/live/art/powered.png&quot;
416 alt=&quot;Powered by HTML Purifier&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;</pre>
417 </li>
418 </ul>
420 <h2 id="Contact">Contact</h2>
422 <p>You can send me an email at
423 <a href="mailto:htmlpurifier@jpsband.org">htmlpurifier@jpsband.org</a>.
424 However, I prefer that you use the forums for asking general support
425 questions (response time will be the same, I promise!).
426 Any emails I receive will be considered public: if I think a
427 solution I came up with for you would be particularly useful to others,
428 expect it to show up on the website.</p>
430 </div>
432 </body>
433 </html>