How to install HTML Purifier

Because HTML Purifier is a library, there's no fancy GUI that will take you
step-by-step through configuring database credentials and other mumbo-jumbo.
HTML Purifier is designed to run "out of the box." Regardless, there are still
a couple of things you should be mindful of.

HTML Purifier works in both PHP 4 and PHP 5. I have run the test suite on

And can confidently say that HTML Purifier should work in all versions in
between and afterwards. HTML Purifier definitely does not support PHP 4.2,
and PHP 4.3 branch support may go further back than that, but I haven't tested

I have been unable to get PHP 5.0.5 working on my computer, so if someone
wants to test that, be my guest. All tests were done on Windows XP Home,
but the operating system is quite irrelevant in this particular case.
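
If you'd rather fail early than puzzle over odd errors on an old PHP, a version
guard is one option. This is only a sketch: PURIFIER_MIN_PHP and
php_version_ok() are made-up names, not part of HTML Purifier, and the 4.3.0
cutoff is an assumption (the text above only rules out PHP 4.2).

```php
<?php
// Assumed cutoff: the notes above rule out PHP 4.2, but say nothing definite
// about early 4.3 releases, so 4.3.0 here is a guess, not a tested minimum.
define('PURIFIER_MIN_PHP', '4.3.0');

// Hypothetical helper: true if $version is at or above the assumed cutoff.
function php_version_ok($version)
{
    return version_compare($version, PURIFIER_MIN_PHP, '>=');
}

// Warn (rather than die) when the running interpreter looks too old.
if (!php_version_ok(PHP_VERSION)) {
    trigger_error('PHP ' . PHP_VERSION . ' is older than '
        . PURIFIER_MIN_PHP . '; HTML Purifier may not work', E_USER_WARNING);
}
```

Adjust the constant to whatever version you have actually tested against.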

1. Including the proper files

The library/ directory must be added to your include path: HTML Purifier will
not be able to find the necessary includes otherwise. This is as simple as:

    set_include_path('/path/to/htmlpurifier/library' . PATH_SEPARATOR .
        get_include_path());

...replacing /path/to/htmlpurifier with the actual location of the folder. Don't
worry, HTML Purifier is namespaced, so unless you have another file named
HTMLPurifier.php, the files won't collide with any of your includes.

Then, it's a simple matter of including the base file:

    require_once 'HTMLPurifier.php';

...and you're good to go.
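
If several scripts set up the include path, the same directory can end up listed
more than once. A minimal sketch of a guard against that, assuming a helper
named prepend_include_path() (a made-up name, not something HTML Purifier
ships):

```php
<?php
// Hypothetical helper: put $dir at the front of the include path, unless it
// is already listed somewhere in it. Returns the resulting include path.
function prepend_include_path($dir)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());
    if (!in_array($dir, $paths, true)) {
        array_unshift($paths, $dir);
        set_include_path(implode(PATH_SEPARATOR, $paths));
    }
    return get_include_path();
}

// Placeholder location; replace with the real path to the library/ folder.
prepend_include_path('/path/to/htmlpurifier/library');
```

After this call, require_once 'HTMLPurifier.php'; works as described above,
provided the path really points at the library/ directory.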

2. Preparing the proper environment

While no configuration is necessary, you should first take precautions regarding
the surrounding HTML that the filtered content will be output with. Here is a
(short) checklist:

* Have I specified XHTML 1.0 Transitional as the doctype?
* Have I specified UTF-8 as the character encoding?

To find out what these are, browse to your website and view its source code.
You can figure out the doctype from a declaration that looks like

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

If no such declaration is present, the page has no doctype. You can figure out
the character encoding by looking for

    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
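
If you have many pages to check, eyeballing the source gets tedious. Here is a
rough sketch that pulls both values out of a page's HTML. The inspect_page()
function is a made-up name, and the regular expressions are only meant to
handle declarations shaped like the two shown above.

```php
<?php
// Hypothetical helper: report the doctype public identifier and the charset
// declared in a page's HTML. Either value is null when not found.
function inspect_page($html)
{
    $result = array('doctype' => null, 'charset' => null);

    // Capture the public identifier of a <!DOCTYPE html PUBLIC "..."> line.
    if (preg_match('/<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"/i', $html, $m)) {
        $result['doctype'] = $m[1];
    }

    // Capture the charset value from a <meta ... charset=...> tag.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)/i', $html, $m)) {
        $result['charset'] = strtoupper($m[1]);
    }

    return $result;
}

// Sample page used only for illustration.
$sample = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"' . "\n"
    . '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n"
    . '<html><head>'
    . '<meta http-equiv="Content-type" content="text/html;charset=utf-8">'
    . '</head><body></body></html>';
$info = inspect_page($sample);
```

For the sample page, $info['doctype'] holds the XHTML 1.0 Transitional
identifier and $info['charset'] holds UTF-8, so both checklist items pass.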

I cannot stress the importance of these two bullets enough. Omitting either
of them could have dire consequences not only for security but for plain
old usability. You can find a more in-depth discussion of why this is needed
in docs/security.txt. In the meantime, try to change your output so this is
the case.

If, for some reason, you are unable to switch to UTF-8 immediately, you can
switch HTML Purifier's encoding. Note that the availability of encodings is
dependent on iconv, and you'll be missing characters if the charset you
choose doesn't have them.

    $config = HTMLPurifier_Config::createDefault();
    $config->set('Core', 'Encoding', /* put your encoding here */);

An example usage for Latin-1 websites:

    $config = HTMLPurifier_Config::createDefault();
    $config->set('Core', 'Encoding', 'ISO-8859-1');
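
Since alternate encodings depend on iconv, it may be worth checking that iconv
is loaded and knows your charset before setting it. A minimal sketch, assuming
a helper named encoding_available() (a made-up name, not part of HTML
Purifier):

```php
<?php
// Hypothetical helper: true if the iconv extension is loaded and can convert
// from $encoding to UTF-8.
function encoding_available($encoding)
{
    if (!function_exists('iconv')) {
        return false; // iconv extension missing: no alternate encodings
    }
    // iconv() returns false for a charset it does not know; @ hides the
    // notice or warning it raises in that case.
    return @iconv($encoding, 'UTF-8', '') !== false;
}

// Latin-1 byte 0xE9 (e acute) becomes the two-byte UTF-8 sequence C3 A9.
if (encoding_available('ISO-8859-1')) {
    $utf8 = iconv('ISO-8859-1', 'UTF-8', "\xE9");
}
```

If the check fails, the safer route is to switch the site itself to UTF-8.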

The interface is mind-numbingly simple:

    $purifier = new HTMLPurifier();
    $clean_html = $purifier->purify($dirty_html);

Or, if you're using the configuration object:

    $purifier = new HTMLPurifier($config);
    $clean_html = $purifier->purify($dirty_html);

That's it. For more examples, check out docs/examples/. Also, SLOW gives
advice on what to do if HTML Purifier is slowing down your application.

If your website is in UTF-8, use this code:

    set_include_path('/path/to/htmlpurifier/library'
        . PATH_SEPARATOR . get_include_path() );
    require_once 'HTMLPurifier.php';
    $purifier = new HTMLPurifier();

    $clean_html = $purifier->purify($dirty_html);

If your website is in a different encoding, use this code:

    set_include_path('/path/to/htmlpurifier/library'
        . PATH_SEPARATOR . get_include_path() );
    require_once 'HTMLPurifier.php';

    $config = HTMLPurifier_Config::createDefault();
    $config->set('Core', 'Encoding', 'ISO-8859-1'); //replace with your encoding
    $purifier = new HTMLPurifier($config);

    $clean_html = $purifier->purify($dirty_html);