pylit.py version 0.3.1 expand hard-tabs to prevent errors in indentation.
1 <?xml version="1.0" encoding="iso-8859-1" ?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4 <head>
5 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
6 <meta name="generator" content="Docutils 0.4.1: http://docutils.sourceforge.net/" />
7 <title>pylit.py: Literate programming with Python and reStructuredText</title>
8 <meta name="date" content="2007-01-31" />
9 <meta name="copyright" content="2005, 2007 Guenter Milde. Released under the terms of the GNU General Public License (v. 2 or later)" />
10 <style type="text/css">
12 /*
13 :Author: David Goodger
14 :Contact: goodger@users.sourceforge.net
15 :Date: $Date: 2005-12-18 01:56:14 +0100 (Sun, 18 Dec 2005) $
16 :Revision: $Revision: 4224 $
17 :Copyright: This stylesheet has been placed in the public domain.
19 Default cascading style sheet for the HTML output of Docutils.
21 See http://docutils.sf.net/docs/howto/html-stylesheets.html for how to
22 customize this style sheet.
23 */
25 /* used to remove borders from tables and images */
26 .borderless, table.borderless td, table.borderless th {
27 border: 0 }
29 table.borderless td, table.borderless th {
30 /* Override padding for "table.docutils td" with "! important".
31 The right padding separates the table cells. */
32 padding: 0 0.5em 0 0 ! important }
34 .first {
35 /* Override more specific margin styles with "! important". */
36 margin-top: 0 ! important }
38 .last, .with-subtitle {
39 margin-bottom: 0 ! important }
41 .hidden {
42 display: none }
44 a.toc-backref {
45 text-decoration: none ;
46 color: black }
48 blockquote.epigraph {
49 margin: 2em 5em ; }
51 dl.docutils dd {
52 margin-bottom: 0.5em }
54 /* Uncomment (and remove this text!) to get bold-faced definition list terms
55 dl.docutils dt {
56 font-weight: bold }
57 */
59 div.abstract {
60 margin: 2em 5em }
62 div.abstract p.topic-title {
63 font-weight: bold ;
64 text-align: center }
66 div.admonition, div.attention, div.caution, div.danger, div.error,
67 div.hint, div.important, div.note, div.tip, div.warning {
68 margin: 2em ;
69 border: medium outset ;
70 padding: 1em }
72 div.admonition p.admonition-title, div.hint p.admonition-title,
73 div.important p.admonition-title, div.note p.admonition-title,
74 div.tip p.admonition-title {
75 font-weight: bold ;
76 font-family: sans-serif }
78 div.attention p.admonition-title, div.caution p.admonition-title,
79 div.danger p.admonition-title, div.error p.admonition-title,
80 div.warning p.admonition-title {
81 color: red ;
82 font-weight: bold ;
83 font-family: sans-serif }
85 /* Uncomment (and remove this text!) to get reduced vertical space in
86 compound paragraphs.
87 div.compound .compound-first, div.compound .compound-middle {
88 margin-bottom: 0.5em }
90 div.compound .compound-last, div.compound .compound-middle {
91 margin-top: 0.5em }
92 */
94 div.dedication {
95 margin: 2em 5em ;
96 text-align: center ;
97 font-style: italic }
99 div.dedication p.topic-title {
100 font-weight: bold ;
101 font-style: normal }
103 div.figure {
104 margin-left: 2em ;
105 margin-right: 2em }
107 div.footer, div.header {
108 clear: both;
109 font-size: smaller }
111 div.line-block {
112 display: block ;
113 margin-top: 1em ;
114 margin-bottom: 1em }
116 div.line-block div.line-block {
117 margin-top: 0 ;
118 margin-bottom: 0 ;
119 margin-left: 1.5em }
121 div.sidebar {
122 margin-left: 1em ;
123 border: medium outset ;
124 padding: 1em ;
125 background-color: #ffffee ;
126 width: 40% ;
127 float: right ;
128 clear: right }
130 div.sidebar p.rubric {
131 font-family: sans-serif ;
132 font-size: medium }
134 div.system-messages {
135 margin: 5em }
137 div.system-messages h1 {
138 color: red }
140 div.system-message {
141 border: medium outset ;
142 padding: 1em }
144 div.system-message p.system-message-title {
145 color: red ;
146 font-weight: bold }
148 div.topic {
149 margin: 2em }
151 h1.section-subtitle, h2.section-subtitle, h3.section-subtitle,
152 h4.section-subtitle, h5.section-subtitle, h6.section-subtitle {
153 margin-top: 0.4em }
155 h1.title {
156 text-align: center }
158 h2.subtitle {
159 text-align: center }
161 hr.docutils {
162 width: 75% }
164 img.align-left {
165 clear: left }
167 img.align-right {
168 clear: right }
170 ol.simple, ul.simple {
171 margin-bottom: 1em }
173 ol.arabic {
174 list-style: decimal }
176 ol.loweralpha {
177 list-style: lower-alpha }
179 ol.upperalpha {
180 list-style: upper-alpha }
182 ol.lowerroman {
183 list-style: lower-roman }
185 ol.upperroman {
186 list-style: upper-roman }
188 p.attribution {
189 text-align: right ;
190 margin-left: 50% }
192 p.caption {
193 font-style: italic }
195 p.credits {
196 font-style: italic ;
197 font-size: smaller }
199 p.label {
200 white-space: nowrap }
202 p.rubric {
203 font-weight: bold ;
204 font-size: larger ;
205 color: maroon ;
206 text-align: center }
208 p.sidebar-title {
209 font-family: sans-serif ;
210 font-weight: bold ;
211 font-size: larger }
213 p.sidebar-subtitle {
214 font-family: sans-serif ;
215 font-weight: bold }
217 p.topic-title {
218 font-weight: bold }
220 pre.address {
221 margin-bottom: 0 ;
222 margin-top: 0 ;
223 font-family: serif ;
224 font-size: 100% }
226 pre.literal-block, pre.doctest-block {
227 margin-left: 2em ;
228 margin-right: 2em ;
229 background-color: #eeeeee }
231 span.classifier {
232 font-family: sans-serif ;
233 font-style: oblique }
235 span.classifier-delimiter {
236 font-family: sans-serif ;
237 font-weight: bold }
239 span.interpreted {
240 font-family: sans-serif }
242 span.option {
243 white-space: nowrap }
245 span.pre {
246 white-space: pre }
248 span.problematic {
249 color: red }
251 span.section-subtitle {
252 /* font-size relative to parent (h1..h6 element) */
253 font-size: 80% }
255 table.citation {
256 border-left: solid 1px gray;
257 margin-left: 1px }
259 table.docinfo {
260 margin: 2em 4em }
262 table.docutils {
263 margin-top: 0.5em ;
264 margin-bottom: 0.5em }
266 table.footnote {
267 border-left: solid 1px black;
268 margin-left: 1px }
270 table.docutils td, table.docutils th,
271 table.docinfo td, table.docinfo th {
272 padding-left: 0.5em ;
273 padding-right: 0.5em ;
274 vertical-align: top }
276 table.docutils th.field-name, table.docinfo th.docinfo-name {
277 font-weight: bold ;
278 text-align: left ;
279 white-space: nowrap ;
280 padding-left: 0 }
282 h1 tt.docutils, h2 tt.docutils, h3 tt.docutils,
283 h4 tt.docutils, h5 tt.docutils, h6 tt.docutils {
284 font-size: 100% }
286 tt.docutils {
287 background-color: #eeeeee }
289 ul.auto-toc {
290 list-style-type: none }
292 </style>
293 </head>
294 <body>
295 <div class="document" id="pylit-py-literate-programming-with-python-and-restructuredtext">
296 <h1 class="title">pylit.py: Literate programming with Python and reStructuredText</h1>
297 <table class="docinfo" frame="void" rules="none">
298 <col class="docinfo-name" />
299 <col class="docinfo-content" />
300 <tbody valign="top">
301 <tr><th class="docinfo-name">Date:</th>
302 <td>2007-01-31</td></tr>
303 <tr><th class="docinfo-name">Copyright:</th>
304 <td>2005, 2007 Guenter Milde.
305 Released under the terms of the GNU General Public License
306 (v. 2 or later)</td></tr>
307 </tbody>
308 </table>
309 <!-- #!/usr/bin/env python
310 # -*- coding: iso-8859-1 -*- -->
311 <div class="contents topic">
312 <p class="topic-title first"><a id="contents" name="contents">Contents</a></p>
313 <ul class="auto-toc simple">
314 <li><a class="reference" href="#frontmatter" id="id8" name="id8">1&nbsp;&nbsp;&nbsp;Frontmatter</a><ul class="auto-toc">
315 <li><a class="reference" href="#changelog" id="id9" name="id9">1.1&nbsp;&nbsp;&nbsp;Changelog</a></li>
316 <li><a class="reference" href="#requirements" id="id10" name="id10">1.2&nbsp;&nbsp;&nbsp;Requirements</a></li>
317 </ul>
318 </li>
319 <li><a class="reference" href="#customization" id="id11" name="id11">2&nbsp;&nbsp;&nbsp;Customization</a></li>
320 <li><a class="reference" href="#classes" id="id12" name="id12">3&nbsp;&nbsp;&nbsp;Classes</a><ul class="auto-toc">
321 <li><a class="reference" href="#pushiterator" id="id13" name="id13">3.1&nbsp;&nbsp;&nbsp;PushIterator</a></li>
322 <li><a class="reference" href="#converter" id="id14" name="id14">3.2&nbsp;&nbsp;&nbsp;Converter</a><ul class="auto-toc">
323 <li><a class="reference" href="#data-attributes" id="id15" name="id15">3.2.1&nbsp;&nbsp;&nbsp;Data attributes</a></li>
324 <li><a class="reference" href="#instantiation" id="id16" name="id16">3.2.2&nbsp;&nbsp;&nbsp;Instantiation</a></li>
325 <li><a class="reference" href="#converter-str" id="id17" name="id17">3.2.3&nbsp;&nbsp;&nbsp;Converter.__str__</a></li>
326 <li><a class="reference" href="#converter-get-indent" id="id18" name="id18">3.2.4&nbsp;&nbsp;&nbsp;Converter.get_indent</a></li>
327 <li><a class="reference" href="#converter-ensure-trailing-blank-line" id="id19" name="id19">3.2.5&nbsp;&nbsp;&nbsp;Converter.ensure_trailing_blank_line</a></li>
328 <li><a class="reference" href="#converter-collect-blocks" id="id20" name="id20">3.2.6&nbsp;&nbsp;&nbsp;Converter.collect_blocks</a></li>
329 </ul>
330 </li>
331 <li><a class="reference" href="#text2code" id="id21" name="id21">3.3&nbsp;&nbsp;&nbsp;Text2Code</a><ul class="auto-toc">
332 <li><a class="reference" href="#text2code-header" id="id22" name="id22">3.3.1&nbsp;&nbsp;&nbsp;Text2Code.header</a></li>
333 <li><a class="reference" href="#text2code-text-handler-generator" id="id23" name="id23">3.3.2&nbsp;&nbsp;&nbsp;Text2Code.text_handler_generator</a></li>
334 <li><a class="reference" href="#text2code-code-handler-generator" id="id24" name="id24">3.3.3&nbsp;&nbsp;&nbsp;Text2Code.code_handler_generator</a></li>
335 <li><a class="reference" href="#txt2code-remove-literal-marker" id="id25" name="id25">3.3.4&nbsp;&nbsp;&nbsp;Txt2Code.remove_literal_marker</a></li>
336 <li><a class="reference" href="#text2code-iter-strip" id="id26" name="id26">3.3.5&nbsp;&nbsp;&nbsp;Text2Code.iter_strip</a></li>
337 </ul>
338 </li>
339 <li><a class="reference" href="#code2text" id="id27" name="id27">3.4&nbsp;&nbsp;&nbsp;Code2Text</a><ul class="auto-toc">
340 <li><a class="reference" href="#code2text-iter" id="id28" name="id28">3.4.1&nbsp;&nbsp;&nbsp;Code2Text.__iter__</a></li>
341 <li><a class="reference" href="#header-state" id="id29" name="id29">3.4.2&nbsp;&nbsp;&nbsp;&quot;header&quot; state</a></li>
342 <li><a class="reference" href="#code2text-text" id="id30" name="id30">3.4.3&nbsp;&nbsp;&nbsp;Code2Text.text</a></li>
343 <li><a class="reference" href="#code2text-code" id="id31" name="id31">3.4.4&nbsp;&nbsp;&nbsp;Code2Text.code</a></li>
344 <li><a class="reference" href="#code2text-block-is-text" id="id32" name="id32">3.4.5&nbsp;&nbsp;&nbsp;Code2Text.block_is_text</a></li>
345 <li><a class="reference" href="#code2text-strip-literal-marker" id="id33" name="id33">3.4.6&nbsp;&nbsp;&nbsp;Code2Text.strip_literal_marker</a></li>
346 </ul>
347 </li>
348 </ul>
349 </li>
350 <li><a class="reference" href="#command-line-use" id="id34" name="id34">4&nbsp;&nbsp;&nbsp;Command line use</a><ul class="auto-toc">
351 <li><a class="reference" href="#dual-source-handling" id="id35" name="id35">4.1&nbsp;&nbsp;&nbsp;Dual source handling</a><ul class="auto-toc">
352 <li><a class="reference" href="#how-to-determine-which-source-is-up-to-date" id="id36" name="id36">4.1.1&nbsp;&nbsp;&nbsp;How to determine which source is up-to-date?</a></li>
353 <li><a class="reference" href="#recognised-filename-extensions" id="id37" name="id37">4.1.2&nbsp;&nbsp;&nbsp;Recognised Filename Extensions</a></li>
354 </ul>
355 </li>
356 <li><a class="reference" href="#optionvalues" id="id38" name="id38">4.2&nbsp;&nbsp;&nbsp;OptionValues</a></li>
357 <li><a class="reference" href="#pylitoptions" id="id39" name="id39">4.3&nbsp;&nbsp;&nbsp;PylitOptions</a><ul class="auto-toc">
358 <li><a class="reference" href="#id5" id="id40" name="id40">4.3.1&nbsp;&nbsp;&nbsp;Instantiation</a></li>
359 <li><a class="reference" href="#calling" id="id41" name="id41">4.3.2&nbsp;&nbsp;&nbsp;Calling</a></li>
360 <li><a class="reference" href="#pylitoptions-parse-args" id="id42" name="id42">4.3.3&nbsp;&nbsp;&nbsp;PylitOptions.parse_args</a></li>
361 <li><a class="reference" href="#pylitoptions-complete-values" id="id43" name="id43">4.3.4&nbsp;&nbsp;&nbsp;PylitOptions.complete_values</a></li>
362 <li><a class="reference" href="#pylitoptions-get-outfile-name" id="id44" name="id44">4.3.5&nbsp;&nbsp;&nbsp;PylitOptions.get_outfile_name</a></li>
363 </ul>
364 </li>
365 <li><a class="reference" href="#helper-functions" id="id45" name="id45">4.4&nbsp;&nbsp;&nbsp;Helper functions</a><ul class="auto-toc">
366 <li><a class="reference" href="#open-streams" id="id46" name="id46">4.4.1&nbsp;&nbsp;&nbsp;open_streams</a></li>
367 <li><a class="reference" href="#is-newer" id="id47" name="id47">4.4.2&nbsp;&nbsp;&nbsp;is_newer</a></li>
368 <li><a class="reference" href="#get-converter" id="id48" name="id48">4.4.3&nbsp;&nbsp;&nbsp;get_converter</a></li>
369 </ul>
370 </li>
371 <li><a class="reference" href="#use-cases" id="id49" name="id49">4.5&nbsp;&nbsp;&nbsp;Use cases</a><ul class="auto-toc">
372 <li><a class="reference" href="#run-doctest" id="id50" name="id50">4.5.1&nbsp;&nbsp;&nbsp;run_doctest</a></li>
373 <li><a class="reference" href="#diff" id="id51" name="id51">4.5.2&nbsp;&nbsp;&nbsp;diff</a></li>
374 </ul>
375 </li>
376 <li><a class="reference" href="#main" id="id52" name="id52">4.6&nbsp;&nbsp;&nbsp;main</a><ul class="auto-toc">
377 <li><a class="reference" href="#id6" id="id53" name="id53">4.6.1&nbsp;&nbsp;&nbsp;Customization</a></li>
378 </ul>
379 </li>
380 </ul>
381 </li>
382 <li><a class="reference" href="#open-questions" id="id54" name="id54">5&nbsp;&nbsp;&nbsp;Open questions</a><ul class="auto-toc">
383 <li><a class="reference" href="#options" id="id55" name="id55">5.1&nbsp;&nbsp;&nbsp;Options</a></li>
384 <li><a class="reference" href="#parsing-problems" id="id56" name="id56">5.2&nbsp;&nbsp;&nbsp;Parsing Problems</a></li>
385 <li><a class="reference" href="#code-syntax-highlight" id="id57" name="id57">5.3&nbsp;&nbsp;&nbsp;code syntax highlight</a></li>
386 </ul>
387 </li>
388 </ul>
389 </div>
390 <div class="section">
391 <h1><a class="toc-backref" href="#id8" id="frontmatter" name="frontmatter">1&nbsp;&nbsp;&nbsp;Frontmatter</a></h1>
392 <div class="section">
393 <h2><a class="toc-backref" href="#id9" id="changelog" name="changelog">1.1&nbsp;&nbsp;&nbsp;Changelog</a></h2>
394 <table class="docutils field-list" frame="void" rules="none">
395 <col class="field-name" />
396 <col class="field-body" />
397 <tbody valign="top">
398 <tr class="field"><th class="field-name">2005-06-29:</th><td class="field-body">Initial version</td>
399 </tr>
400 <tr class="field"><th class="field-name">2005-06-30:</th><td class="field-body">first literate version of the script</td>
401 </tr>
402 <tr class="field"><th class="field-name">2005-07-01:</th><td class="field-body">object-oriented script using generators</td>
403 </tr>
404 <tr class="field"><th class="field-name">2005-07-10:</th><td class="field-body">Two state machine (later added 'header' state)</td>
405 </tr>
406 <tr class="field"><th class="field-name">2006-12-04:</th><td class="field-body">Start of work on version 0.2 (code restructuring)</td>
407 </tr>
408 <tr class="field"><th class="field-name">2007-01-23:</th><td class="field-body">0.2 published at <a class="reference" href="http://pylit.berlios.de">http://pylit.berlios.de</a></td>
409 </tr>
410 <tr class="field"><th class="field-name">2007-01-25:</th><td class="field-body">0.2.1 Outsourced non-core documentation to the PyLit pages.</td>
411 </tr>
412 <tr class="field"><th class="field-name">2007-01-26:</th><td class="field-body">0.2.2 new behaviour of <cite>diff</cite> function</td>
413 </tr>
414 <tr class="field"><th class="field-name">2007-01-29:</th><td class="field-body">0.2.3 new <cite>header</cite> methods after suggestion by Riccardo Murri</td>
415 </tr>
416 <tr class="field"><th class="field-name">2007-01-31:</th><td class="field-body">0.2.4 raise Error if code indent is too small</td>
417 </tr>
418 <tr class="field"><th class="field-name">2007-02-05:</th><td class="field-body">0.2.5 new command line option --comment-string</td>
419 </tr>
420 <tr class="field"><th class="field-name">2007-02-09:</th><td class="field-body">0.2.6 add section with open questions,
421 Code2Text: let only blank lines (no comment str)
422 separate text and code,
423 fix <cite>Code2Text.header</cite></td>
424 </tr>
425 <tr class="field"><th class="field-name">2007-02-19:</th><td class="field-body">0.2.7 simplify <cite>Code2Text.header,</cite>
426 new <cite>iter_strip</cite> method replacing a lot of <tt class="docutils literal"><span class="pre">if</span></tt>-s</td>
427 </tr>
428 <tr class="field"><th class="field-name">2007-02-22:</th><td class="field-body">0.2.8 set <cite>mtime</cite> of outfile to the one of infile</td>
429 </tr>
430 <tr class="field"><th class="field-name">2007-02-27:</th><td class="field-body">0.3 new <cite>Code2Text</cite> converter after an idea by Riccardo Murri
431 (a new <cite>Text2Code</cite> will follow soon),
432 explicit <cite>option_defaults</cite> dict for easier customization</td>
433 </tr>
434 </tbody>
435 </table>
436 <pre class="literal-block">
437 &quot;&quot;&quot;pylit: Literate programming with Python and reStructuredText
439 PyLit is a bidirectional converter between
441 * a (reStructured) text source with embedded code, and
442 * a code source with embedded text blocks (comments)
443 &quot;&quot;&quot;
445 __docformat__ = 'restructuredtext'
447 _version = &quot;0.3&quot;
448 </pre>
449 </div>
450 <div class="section">
451 <h2><a class="toc-backref" href="#id10" id="requirements" name="requirements">1.2&nbsp;&nbsp;&nbsp;Requirements</a></h2>
452 <ul class="simple">
453 <li>library modules</li>
454 </ul>
455 <pre class="literal-block">
456 import re
457 import os
458 import sys
459 import optparse
460 </pre>
461 <ul class="simple">
462 <li>non-standard extensions</li>
463 </ul>
464 <pre class="literal-block">
465 from simplestates import SimpleStates # generic state machine
466 </pre>
467 </div>
468 </div>
469 <div class="section">
470 <h1><a class="toc-backref" href="#id11" id="customization" name="customization">2&nbsp;&nbsp;&nbsp;Customization</a></h1>
471 <pre class="literal-block">
472 option_defaults = {}
473 </pre>
474 <p>Default language and language specific defaults:</p>
475 <pre class="literal-block">
476 option_defaults[&quot;language&quot;] = &quot;python&quot;
477 option_defaults[&quot;comment_strings&quot;] = {&quot;python&quot;: '# ',
478                                       &quot;slang&quot;: '% ',
479                                       &quot;c++&quot;: '// ',
480                                       &quot;elisp&quot;: ';; '}
481 </pre>
482 <p>Recognized file extensions for text and code versions of the source.
483 Used to guess the language from the filename.</p>
484 <pre class="literal-block">
485 option_defaults[&quot;code_languages&quot;] = {&quot;.py&quot;: &quot;python&quot;,
486                                      &quot;.sl&quot;: &quot;slang&quot;,
487                                      &quot;.c&quot;: &quot;c++&quot;,
488                                      &quot;.el&quot;: &quot;elisp&quot;}
489 option_defaults[&quot;code_extensions&quot;] = option_defaults[&quot;code_languages&quot;].keys()
490 option_defaults[&quot;text_extensions&quot;] = [&quot;.txt&quot;]
491 </pre>
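<p>A minimal sketch of how such tables could be used to guess the language
from a filename. The helper name <tt>guess_language</tt> and the handling of a
doubled extension such as <tt>foo.py.txt</tt> are assumptions made for this
illustration, not part of the listing; the dictionaries repeat the defaults
shown above.</p>

```python
import os

# Copies of the defaults shown above, repeated so the sketch runs standalone.
code_languages = {".py": "python", ".sl": "slang", ".c": "c++", ".el": "elisp"}
text_extensions = [".txt"]


def guess_language(filename, default="python"):
    """Return the language implied by the extension of `filename`.

    Hypothetical helper: a trailing text extension ("foo.py.txt") is dropped
    first, so that text and code versions map to the same language.
    """
    base, ext = os.path.splitext(filename)
    if ext in text_extensions:
        ext = os.path.splitext(base)[1]
    # Unknown or missing extensions fall back to the default language.
    return code_languages.get(ext, default)
```

<p>With these tables, <tt>guess_language("mode.sl")</tt> yields
<tt>"slang"</tt>, and a file without a known extension falls back to the
default language.</p>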
492 <p>Number of spaces to indent code blocks in the code -&gt; text conversion. <a class="footnote-reference" href="#id1">[2]</a></p>
493 <table class="docutils footnote" frame="void" id="id1" rules="none">
494 <colgroup><col class="label" /><col /></colgroup>
495 <tbody valign="top">
496 <tr><td class="label"><a class="fn-backref" href="#id3" name="id1">[2]</a></td><td>For the text -&gt; code conversion, the codeindent is determined by the
497 first recognized code line (leading comment or first indented literal
498 block of the text source).</td></tr>
499 </tbody>
500 </table>
501 <pre class="literal-block">
502 option_defaults[&quot;codeindent&quot;] = 2
503 </pre>
504 </div>
505 <div class="section">
506 <h1><a class="toc-backref" href="#id12" id="classes" name="classes">3&nbsp;&nbsp;&nbsp;Classes</a></h1>
507 <div class="section">
508 <h2><a class="toc-backref" href="#id13" id="pushiterator" name="pushiterator">3.1&nbsp;&nbsp;&nbsp;PushIterator</a></h2>
509 <p>The PushIterator is a minimal implementation of an iterator with
510 backtracking from the <a class="reference" href="http://www.interlink.com.au/anthony/tech/talks/OSCON2005/effective_r27.pdf">Effective Python Programming</a> OSCON 2005 tutorial by
511 Anthony&nbsp;Baxter. As the definition is small, it is inlined now. For the full
512 reasoning and documentation see <a class="reference" href="iterqueue.py.html">iterqueue.py</a>.</p>
513 <pre class="literal-block">
514 class PushIterator(object):
515     def __init__(self, iterable):
516         self.it = iter(iterable)
517         self.cache = []
518     def __iter__(self):
519         &quot;&quot;&quot;Return `self`, as this is already an iterator&quot;&quot;&quot;
520         return self
521     def next(self):
522         return (self.cache and self.cache.pop()) or self.it.next()
523     def push(self, value):
524         self.cache.append(value)
525 </pre>
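<p>For illustration, a Python 3 rendering of the same idea (the listing
above uses the Python 2 <tt>next</tt> method). One deliberate difference: the
sketch tests the cache explicitly, because the <tt>and</tt>/<tt>or</tt> idiom
skips a pushed value that is false in a boolean context, for example an empty
string.</p>

```python
class PushIterator:
    """Iterator with push-back, modelled on the listing above (Python 3)."""

    def __init__(self, iterable):
        self.it = iter(iterable)
        self.cache = []

    def __iter__(self):
        return self

    def __next__(self):
        # Explicit test: a pushed "" or 0 is returned instead of being skipped.
        if self.cache:
            return self.cache.pop()
        return next(self.it)

    def push(self, value):
        self.cache.append(value)


lines = PushIterator(["first\n", "second\n"])
line = next(lines)   # look at the first line ...
lines.push(line)     # ... and put it back for the next consumer
result = list(lines)
```

<p>After the push, iteration starts again with the first line, which is the
kind of backtracking the state handlers rely on.</p>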
526 </div>
527 <div class="section">
528 <h2><a class="toc-backref" href="#id14" id="converter" name="converter">3.2&nbsp;&nbsp;&nbsp;Converter</a></h2>
529 <p>The converter classes implement a simple <cite>state machine</cite> to separate and
530 transform text and code blocks. For this task, only a very limited parsing
531 is needed. Using the full-blown <a class="reference" href="http://docutils.sourceforge.net/">docutils</a> rst parser would introduce a
532 large overhead and slow down the conversion.</p>
533 <p>PyLit's simple parser assumes:</p>
534 <ul class="simple">
535 <li>indented literal blocks in a text source are code blocks.</li>
536 <li>comment lines that start with a matching comment string in a code source
537 are text blocks.</li>
538 </ul>
539 <p>The actual converter classes are derived from <cite>PyLitConverter</cite>:
540 <a class="reference" href="#text2code">Text2Code</a> converts a text source to executable code, while <a class="reference" href="#code2text">Code2Text</a>
541 does the opposite: converting commented code to a text source.</p>
542 <p>The <cite>PyLitConverter</cite> class inherits the state machine framework
543 (initialisation, scheduler, iterator interface, ...) from <cite>SimpleStates</cite>,
544 overrides the <tt class="docutils literal"><span class="pre">__init__</span></tt> method, and adds auxiliary methods and
545 configuration attributes (options).</p>
546 <pre class="literal-block">
547 class PyLitConverter(SimpleStates):
548     &quot;&quot;&quot;parent class for `Text2Code` and `Code2Text`, the state machines
549     converting between text source and code source of a literate program.
550     &quot;&quot;&quot;
551 </pre>
552 <div class="section">
553 <h3><a class="toc-backref" href="#id15" id="data-attributes" name="data-attributes">3.2.1&nbsp;&nbsp;&nbsp;Data attributes</a></h3>
554 <p>The data attributes are class default values. They will be overridden by
555 matching keyword arguments during class instantiation.</p>
556 <p><a class="reference" href="#get-converter">get_converter</a> and <a class="reference" href="#main">main</a> pass on unused keyword arguments to
557 the instantiation of a converter class. This way, keyword arguments
558 to these functions can be used to customize the converter.</p>
559 <p>Default language and language specific defaults:</p>
560 <pre class="literal-block">
561     language = option_defaults[&quot;language&quot;]
562     comment_strings = option_defaults[&quot;comment_strings&quot;]
563 </pre>
564 <p>Number of spaces to indent code blocks in the code -&gt; text conversion:</p>
565 <pre class="literal-block">
566     codeindent = option_defaults[&quot;codeindent&quot;]
567 </pre>
568 <p>Marker string for the first code block. (Should be a valid rst directive
569 that accepts code on the same line, e.g. <tt class="docutils literal"><span class="pre">'..</span> <span class="pre">admonition::'</span></tt>.) No
570 trailing whitespace needed as indented code follows. Default is a comment
571 marker:</p>
572 <pre class="literal-block">
573     header_string = '..'
574 </pre>
575 <p>Export to the output format stripping text or code blocks:</p>
576 <pre class="literal-block">
577     strip = False
578 </pre>
579 <p>Initial state:</p>
580 <pre class="literal-block">
581     state = 'header'
582 </pre>
583 </div>
584 <div class="section">
585 <h3><a class="toc-backref" href="#id16" id="instantiation" name="instantiation">3.2.2&nbsp;&nbsp;&nbsp;Instantiation</a></h3>
586 <p>Initializing sets up the <cite>data</cite> attribute, an iterable object yielding
587 lines of the source to convert. <a class="footnote-reference" href="#id2">[1]</a></p>
588 <pre class="literal-block">
589     def __init__(self, data, **keyw):
590         &quot;&quot;&quot;data   --  iterable data object
591                       (list, file, generator, string, ...)
592            **keyw --  all remaining keyword arguments are
593                       stored as class attributes
594         &quot;&quot;&quot;
595 </pre>
596 <p>As the state handlers need backtracking, the data is wrapped in a
597 <a class="reference" href="#pushiterator">PushIterator</a> if it does not already have a <cite>push</cite> method:</p>
598 <pre class="literal-block">
599         if hasattr(data, 'push'):
600             self.data = data
601         else:
602             self.data = PushIterator(data)
603         self._textindent = 0
604 </pre>
605 <p>Additional keyword arguments are stored as data attributes of the
606 instance, overriding the class defaults:</p>
607 <pre class="literal-block">
608         self.__dict__.update(keyw)
609 </pre>
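<p>The effect of updating the instance dictionary can be sketched with a
stripped-down stand-in class. The class name <tt>Converter</tt> and the literal
default values are placeholders for this illustration; only the mechanism
follows the listing.</p>

```python
class Converter:
    # Class-level defaults, shared by all instances.
    language = "python"
    codeindent = 2
    strip = False

    def __init__(self, data, **keyw):
        self.data = data
        # Keyword arguments become instance attributes; they shadow the
        # class defaults without modifying the class itself.
        self.__dict__.update(keyw)


default = Converter([])
custom = Converter([], codeindent=4, strip=True)
```

<p>Here <tt>custom.codeindent</tt> is 4 while <tt>default.codeindent</tt> and
<tt>Converter.codeindent</tt> remain 2: attribute lookup finds the instance
value first and falls back to the class default otherwise.</p>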
610 <p>The comment string is set to the language's comment string if not given in
611 the keyword arguments:</p>
612 <pre class="literal-block">
613         if not hasattr(self, &quot;comment_string&quot;) or not self.comment_string:
614             self.comment_string = self.comment_strings[self.language]
615 </pre>
616 <table class="docutils footnote" frame="void" id="id2" rules="none">
617 <colgroup><col class="label" /><col /></colgroup>
618 <tbody valign="top">
619 <tr><td class="label"><a name="id2">[1]</a></td><td><p class="first">The most common choice of data is a <cite>file</cite> object with the text
620 or code source.</p>
621 <p class="last">To convert a string into a suitable object, use its splitlines method
622 with the optional <cite>keepends</cite> argument set to True.</p>
623 </td></tr>
624 </tbody>
625 </table>
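<p>As a small illustration of the footnote (the sample string is made up):
splitting with <tt>keepends</tt> set to True preserves the line endings, so
joining the lines reproduces the original string.</p>

```python
source = "Some text::\n\n  print('hello')\n"

# keepends=True keeps the trailing newline on every line, as the converter
# expects when it later joins the converted lines again.
lines = source.splitlines(True)
roundtrip = "".join(lines)
```

<p>The resulting list can be passed as the <cite>data</cite> argument in place
of a file object.</p>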
626 </div>
627 <div class="section">
628 <h3><a class="toc-backref" href="#id17" id="converter-str" name="converter-str">3.2.3&nbsp;&nbsp;&nbsp;Converter.__str__</a></h3>
629 <p>Return converted data as string:</p>
630 <pre class="literal-block">
631     def __str__(self):
632         blocks = [&quot;&quot;.join(block) for block in self()]
633         return &quot;&quot;.join(blocks)
634 </pre>
635 </div>
636 <div class="section">
637 <h3><a class="toc-backref" href="#id18" id="converter-get-indent" name="converter-get-indent">3.2.4&nbsp;&nbsp;&nbsp;Converter.get_indent</a></h3>
638 <p>Return the number of leading spaces in <cite>string</cite> after expanding tabs:</p>
639 <pre class="literal-block">
640     def get_indent(self, string):
641         &quot;&quot;&quot;Return the indentation of `string`.
642         &quot;&quot;&quot;
643         line = string.expandtabs()
644         return len(line) - len(line.lstrip())
645 </pre>
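<p>A standalone sketch of the same computation, written as a plain function
so that it can be tried without a converter instance. The sample lines are
made up for the illustration.</p>

```python
def get_indent(string):
    """Return the indentation of `string` after expanding tabs."""
    line = string.expandtabs()   # default tab size is 8 columns
    return len(line) - len(line.lstrip())


space_indent = get_indent("    x = 1\n")   # four leading spaces
tab_indent = get_indent("\tx = 1\n")       # one hard tab
mixed_indent = get_indent("  \tx = 1\n")   # two spaces, then a tab
```

<p>A hard tab counts as the distance to the next multiple of eight columns,
so a mixed prefix of two spaces and a tab also yields 8. Note that for a
whitespace-only line the newline is stripped as well and therefore counted,
so callers may want to treat blank lines separately.</p>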
646 </div>
647 <div class="section">
648 <h3><a class="toc-backref" href="#id19" id="converter-ensure-trailing-blank-line" name="converter-ensure-trailing-blank-line">3.2.5&nbsp;&nbsp;&nbsp;Converter.ensure_trailing_blank_line</a></h3>
649 <p>Ensure there is a blank line as last element of the list <cite>lines</cite>:</p>
650 <pre class="literal-block">
651     def ensure_trailing_blank_line(self, lines, next_line):
652         if not lines:
653             return
654         if lines[-1].lstrip():
655             sys.stderr.write(&quot;\nWarning: inserted blank line between\n %s %s&quot;
656                              %(lines[-1], next_line))
657             lines.append(&quot;\n&quot;)
658 </pre>
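<p>The behaviour can be sketched with a standalone function. The sketch
assumes that the blank line is appended only when the last line is not
already blank, which is how the listing reads; the sample lists are made
up.</p>

```python
import sys


def ensure_trailing_blank_line(lines, next_line):
    """Append a blank line to `lines` unless the last line is already blank."""
    if not lines:
        return
    if lines[-1].lstrip():
        sys.stderr.write("\nWarning: inserted blank line between\n %s %s"
                         % (lines[-1], next_line))
        lines.append("\n")


code_block = ["x = 1\n"]
ensure_trailing_blank_line(code_block, "Some text\n")   # warns and appends

already_ok = ["x = 1\n", "\n"]
ensure_trailing_blank_line(already_ok, "Some text\n")   # left unchanged
```

<p>The list is modified in place; the second argument is only used for the
warning message.</p>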
659 </div>
660 <div class="section">
661 <h3><a class="toc-backref" href="#id20" id="converter-collect-blocks" name="converter-collect-blocks">3.2.6&nbsp;&nbsp;&nbsp;Converter.collect_blocks</a></h3>
662 <pre class="literal-block">
663     def collect_blocks(self):
664         &quot;&quot;&quot;collect lines in a list
666         return list for each block of lines (paragraph) separated by a
667         blank line (whitespace only)
668         &quot;&quot;&quot;
669         block = []
670         for line in self.data:
671             block.append(line)
672             if not line.rstrip():
673                 yield block
674                 block = []
675         yield block
676 </pre>
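<p>To see what the generator yields, here is the same loop as a standalone
function that takes the lines as an argument (an adaptation for this
illustration; the method itself reads from <tt>self.data</tt>).</p>

```python
def collect_blocks(lines):
    """Yield lists of lines, one list per paragraph.

    A block ends with (and includes) the first whitespace-only line.
    """
    block = []
    for line in lines:
        block.append(line)
        if not line.rstrip():
            yield block
            block = []
    # Whatever is left after the last blank line (possibly nothing).
    yield block


blocks = list(collect_blocks(["a\n", "b\n", "\n", "c\n"]))
```

<p>The separating blank line stays at the end of the block it closes. If the
input ends with a blank line, the final yield returns an empty list, which
callers may need to ignore.</p>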
677 </div>
678 </div>
679 <div class="section">
680 <h2><a class="toc-backref" href="#id21" id="text2code" name="text2code">3.3&nbsp;&nbsp;&nbsp;Text2Code</a></h2>
681 <p>The <cite>Text2Code</cite> class separates code blocks (indented literal blocks) from
682 reStructured text. Code blocks are unindented, text is commented (or
683 filtered, if the <tt class="docutils literal"><span class="pre">strip</span></tt> option is True).</p>
684 <p>Only <cite>indented literal blocks</cite> are extracted. <cite>quoted literal blocks</cite> and
685 <cite>pydoc blocks</cite> are treated as text. This allows the easy inclusion of
686 examples: <a class="footnote-reference" href="#id4" id="id3" name="id3">[3]</a></p>
687 <blockquote>
688 <pre class="doctest-block">
689 &gt;&gt;&gt; 23 + 3
690 26
691 </pre>
692 </blockquote>
693 <table class="docutils footnote" frame="void" id="id4" rules="none">
694 <colgroup><col class="label" /><col /></colgroup>
695 <tbody valign="top">
696 <tr><td class="label"><a name="id4">[3]</a></td><td>Note that there is no double colon before the doctest block in
697 the text source.</td></tr>
698 </tbody>
699 </table>
700 <p>The state handlers are implemented as generators. Iterating over a
701 <cite>Text2Code</cite> instance initializes them to generate iterators for
702 the respective states (see <tt class="docutils literal"><span class="pre">simplestates.py</span></tt>).</p>
703 <pre class="literal-block">
704 class Text2Code(PyLitConverter):
705     &quot;&quot;&quot;Convert a (reStructured) text source to code source
706     &quot;&quot;&quot;
707 </pre>
708 <p>INIT: call the parent class's init method.</p>
709 <p>If the <cite>strip</cite> argument is true, replace the <cite>__iter__</cite> method
710 with a special one that drops &quot;spurious&quot; blocks:</p>
711 <pre class="literal-block">
712     def __init__(self, data, **keyw):
713         PyLitConverter.__init__(self, data, **keyw)
714         if getattr(self, &quot;strip&quot;, False):
715             self.__iter__ = self.iter_strip
716 </pre>
717 <div class="section">
718 <h3><a class="toc-backref" href="#id22" id="text2code-header" name="text2code-header">3.3.1&nbsp;&nbsp;&nbsp;Text2Code.header</a></h3>
719 <p>Convert the header (leading rst comment block) to code:</p>
720 <pre class="literal-block">
721     def header(self):
722         &quot;&quot;&quot;Convert header (comment) to code&quot;&quot;&quot;
723         line = self.data_iterator.next()
724 </pre>
725 <p>Test first line for rst comment: (We need to do this explicitly here, as
726 the code handler will only recognize the start of a text block if a line
727 starting with &quot;matching comment&quot; is preceded by an empty line. However, we
728 have to care for the case of the first line being a &quot;text line&quot;.)</p>
729 <p>Which variant is better?</p>
730 <ol class="arabic">
731 <li><p class="first">starts with comment marker and has
732 something behind the comment on the first line:</p>
733 <pre class="literal-block">
734 # if line.startswith(&quot;..&quot;) and len(line.rstrip()) &gt; 2:
735 </pre>
736 </li>
737 <li><p class="first">Convert any leading comment to code:</p>
738 <pre class="literal-block">
739         if line.startswith(self.header_string):
740 </pre>
741 </li>
742 </ol>
743 <p>Strip leading comment string (typically added by <cite>Code2Text.header</cite>) and
744 return the result of processing the data with the code handler:</p>
745 <pre class="literal-block">
746 self.data_iterator.push(line.replace(self.header_string, &quot;&quot;, 1))
747 return self.code()
748 </pre>
749 <p>No header code found: Push back first non-header line and set state to
750 &quot;text&quot;:</p>
751 <pre class="literal-block">
752 self.data_iterator.push(line)
753 self.state = 'text'
754 return []
755 </pre>
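<p>The header handling above can be sketched as a standalone Python 3 function.
This is only an illustration: the function name <cite>convert_header</cite>, the
returned tuple and the default value <cite>&quot;..&quot;</cite> for <cite>header_string</cite> are
assumptions, not part of pylit.</p>

```python
def convert_header(lines, header_string=".."):
    """Return (state, lines) after handling a leading header comment.

    Hypothetical helper: `header_string` defaulting to ".." is an
    assumption made for this sketch.
    """
    if lines and lines[0].startswith(header_string):
        # Strip the marker once; the remainder is treated as code.
        first = lines[0].replace(header_string, "", 1)
        return "code", [first] + list(lines[1:])
    # No header found: the first line already belongs to a text block.
    return "text", list(lines)
```

<p>The stripped line keeps the whitespace that followed the marker; in the real
converter the code handler removes that indentation afterwards.</p>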
756 </div>
757 <div class="section">
758 <h3><a class="toc-backref" href="#id23" id="text2code-text-handler-generator" name="text2code-text-handler-generator">3.3.2&nbsp;&nbsp;&nbsp;Text2Code.text_handler_generator</a></h3>
759 <p>The 'text' handler processes everything that is not an indented literal
760 comment. Text is quoted with <cite>self.comment_string</cite> or filtered (with
761 strip=True).</p>
762 <p>It is implemented as a generator function that acts on the <cite>data</cite> iterator
763 and yields text blocks.</p>
764 <p>Declaration and initialization:</p>
765 <pre class="literal-block">
def text_handler_generator(self):
    &quot;&quot;&quot;Convert text blocks from rst to comment
    &quot;&quot;&quot;
    lines = []
770 </pre>
771 <p>Iterate over the data_iterator (which yields the data lines):</p>
772 <pre class="literal-block">
for line in self.data_iterator:
    # print &quot;Text: '%s'&quot;%line
775 </pre>
776 <p>Default action: add comment string and collect in <cite>lines</cite> list:</p>
777 <pre class="literal-block">
778 lines.append(self.comment_string + line)
779 </pre>
780 <p>Test for the end of the text block: a line that ends with <cite>::</cite> but is neither
781 a comment nor a directive:</p>
782 <pre class="literal-block">
if (line.rstrip().endswith(&quot;::&quot;)
    and not line.lstrip().startswith(&quot;..&quot;)):
785 </pre>
786 <p>End of text block is detected, now:</p>
787 <p>set the current text indent level (needed by the code handler to find the
788 end of code block) and set the state to &quot;code&quot; (i.e. the next call of
789 <cite>self.next</cite> goes to the code handler):</p>
790 <pre class="literal-block">
791 self._textindent = self.get_indent(line)
792 self.state = 'code'
793 </pre>
<p>Ensure a trailing blank line (which is the paragraph separator in
reStructured Text). Look at the next line: if it is blank, keep it; if it is
not blank, push it back (it should be code) and add a blank line by calling the
<cite>ensure_trailing_blank_line</cite> method (which also issues a warning):</p>
<pre class="literal-block">
line = self.data_iterator.next()
if line.lstrip():
    self.data_iterator.push(line) # push back
    self.ensure_trailing_blank_line(lines, line)
else:
    lines.append(line)
</pre>
<p>Now yield and reset the lines. (There used to be a function call here that
removed a literal marker on a line of its own to shorten the comment. However,
this behaviour was removed, as the resulting difference in line numbers leads
to misleading error messages in doctests):</p>
810 <pre class="literal-block">
811 #remove_literal_marker(lines)
812 yield lines
813 lines = []
814 </pre>
<p>End of data: if we &quot;fall off&quot; the iteration loop, just yield the
remaining lines:</p>
817 <pre class="literal-block">
818 yield lines
819 </pre>
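<p>The end-of-text-block test used by the text handler can be tried in
isolation. The following Python 3 sketch wraps the same condition in a
function; the name <cite>ends_text_block</cite> is hypothetical.</p>

```python
def ends_text_block(line):
    """Return True if `line` announces a literal block.

    The line must end with "::" and must be neither an rst comment
    nor a directive (both start with "..").
    """
    return (line.rstrip().endswith("::")
            and not line.lstrip().startswith(".."))
```

<p>A directive such as <cite>.. note::</cite> also ends with a double colon, which is why
the second half of the condition is needed.</p>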
820 </div>
821 <div class="section">
822 <h3><a class="toc-backref" href="#id24" id="text2code-code-handler-generator" name="text2code-code-handler-generator">3.3.3&nbsp;&nbsp;&nbsp;Text2Code.code_handler_generator</a></h3>
<p>The <cite>code</cite> handler is called when a literal block marker is encountered. It
returns a code block (indented literal block), removing leading whitespace
up to the indentation of the first code line in the file (this deviation
from docutils behaviour allows indented blocks of Python code).</p>
827 <p>As the code handler detects the switch to &quot;text&quot; state by looking at
828 the line indents, it needs to push back the last probed data token. I.e.
829 the data_iterator must support a <cite>push</cite> method. (This is the
830 reason for the use of the PushIterator class in <cite>__init__</cite>.)</p>
831 <pre class="literal-block">
def code_handler_generator(self):
    &quot;&quot;&quot;Convert indented literal blocks to source code
    &quot;&quot;&quot;
    lines = []
    codeindent = None  # indent of first non-blank code line, set below
    indent_string = &quot;&quot;  # leading whitespace chars ...
838 </pre>
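<p>The push-back behaviour the code handler relies on can be illustrated with
a minimal iterator wrapper. This Python 3 sketch is an assumption about how
such a class might look; it is not the <cite>PushIterator</cite> implementation
used by pylit.</p>

```python
class PushIterator:
    """Iterator wrapper with a `push` method (illustrative sketch)."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._cache = []          # values pushed back, last in first out

    def __iter__(self):
        return self

    def __next__(self):
        if self._cache:
            return self._cache.pop()
        return next(self._it)

    def push(self, value):
        """Put `value` back so that the next call to next() returns it."""
        self._cache.append(value)
```

<p>A handler can thus read one line ahead, decide that it belongs to the other
state, and hand it back unchanged.</p>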
839 <p>Iterate over the lines in the input data:</p>
840 <pre class="literal-block">
for line in self.data_iterator:
    # print &quot;Code: '%s'&quot;%line
843 </pre>
<p>Pass on blank lines (no test for the end of the code block is needed or possible):</p>
845 <pre class="literal-block">
if not line.rstrip():
    lines.append(line.replace(indent_string, &quot;&quot;, 1))
    continue
849 </pre>
850 <p>Test for end of code block:</p>
851 <p>A literal block ends with the first less indented, nonblank line.
852 <cite>self._textindent</cite> is set by the text handler to the indent of the
853 preceding paragraph.</p>
<p>To prevent problems with different tabulator settings, hard tabs in code
lines are expanded with the <cite>expandtabs</cite> string method when calculating the
indentation (i.e. each tab advances to the next multiple of 8 columns, by default).</p>
857 <pre class="literal-block">
if self.get_indent(line) &lt;= self._textindent:
    # push back line
    self.data_iterator.push(line)
    self.state = 'text'
    # append blank line (if not already present)
    self.ensure_trailing_blank_line(lines, line)
    yield lines
    # reset list of lines
    lines = []
    continue
868 </pre>
869 <p>OK, we are sure now that the current line is neither blank nor a text line.</p>
870 <p>If still unset, determine the code indentation from first non-blank code
871 line:</p>
872 <pre class="literal-block">
if codeindent is None and line.lstrip():
    codeindent = self.get_indent(line)
    indent_string = line[:codeindent]
876 </pre>
877 <p>Append unindented line to lines cache (but check if we can safely unindent
878 first):</p>
879 <pre class="literal-block">
if not line.startswith(indent_string):
    raise ValueError, &quot;cannot unindent line %r,\n&quot;%line \
    + &quot;  does not start with code indent string %r&quot;%indent_string

lines.append(line[codeindent:])
885 </pre>
886 <p>No more lines in the input data: just return what we have:</p>
887 <pre class="literal-block">
888 yield lines
889 </pre>
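<p>The steps of the code handler (pass on blank lines, stop at the first line
indented no deeper than the preceding text, unindent by the indentation of the
first code line) can be condensed into a small Python 3 sketch. The function
names and the simplification of expanding all tabs before slicing are
assumptions of this example, not pylit's exact behaviour.</p>

```python
def get_indent(line):
    """Return the indentation width of `line`, with hard tabs expanded."""
    expanded = line.expandtabs()
    return len(expanded) - len(expanded.lstrip())


def unindent_block(lines, textindent=0):
    """Collect code lines until one is indented no deeper than `textindent`."""
    code, codeindent = [], None
    for line in lines:
        if not line.rstrip():
            code.append("\n")          # blank lines never end the block
            continue
        if get_indent(line) <= textindent:
            break                      # first less-indented line: text again
        if codeindent is None:
            codeindent = get_indent(line)   # set by first non-blank code line
        code.append(line.expandtabs()[codeindent:])
    return code
```

<p>Unlike the generator above, the sketch simply stops at the end of the block
instead of pushing the line back and switching the state.</p>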
890 </div>
891 <div class="section">
<h3><a class="toc-backref" href="#id25" id="txt2code-remove-literal-marker" name="txt2code-remove-literal-marker">3.3.4&nbsp;&nbsp;&nbsp;Text2Code.remove_literal_marker</a></h3>
<p>Remove the literal marker (::) in &quot;expanded form&quot;, i.e. in a paragraph of its own.</p>
<p>While this cleans up the code source, it confuses doctest and
searches (e.g. grep), as line numbers in the text and code source will
differ.</p>
897 <pre class="literal-block">
def remove_literal_marker(self, lines):
    try:
        # print lines[-3:]
        if (lines[-3].strip() == self.comment_string.strip()
            and lines[-2].strip() == self.comment_string + '::'):
            del(lines[-3:-1])
    except IndexError:
        pass
906 </pre>
907 </div>
908 <div class="section">
909 <h3><a class="toc-backref" href="#id26" id="text2code-iter-strip" name="text2code-iter-strip">3.3.5&nbsp;&nbsp;&nbsp;Text2Code.iter_strip</a></h3>
910 <p>Modification of the <cite>simplestates.__iter__</cite> method that will replace it when
911 the <cite>strip</cite> keyword argument is <cite>True</cite> during class instantiation:</p>
912 <p>Iterate over class instances dropping text blocks:</p>
913 <pre class="literal-block">
def iter_strip(self):
    &quot;&quot;&quot;Generate and return an iterator dropping text blocks
    &quot;&quot;&quot;
    self.data_iterator = self.data
    self._initialize_state_generators()
    while True:
        yield getattr(self, self.state)()
        getattr(self, self.state)() # drop text block
922 </pre>
923 </div>
924 </div>
925 <div class="section">
926 <h2><a class="toc-backref" href="#id27" id="code2text" name="code2text">3.4&nbsp;&nbsp;&nbsp;Code2Text</a></h2>
927 <p>The <cite>Code2Text</cite> class does the opposite of <a class="reference" href="#text2code">Text2Code</a> -- it processes
928 valid source code, extracts comments, and puts non-commented code in literal
929 blocks.</p>
930 <p>The class is derived from the PyLitConverter state machine and adds an
931 <cite>__iter__</cite> method as well as handlers for &quot;text&quot;, and &quot;code&quot; states.</p>
932 <pre class="literal-block">
class Code2Text(PyLitConverter):
    &quot;&quot;&quot;Convert code source to text source
    &quot;&quot;&quot;
936 </pre>
937 <div class="section">
938 <h3><a class="toc-backref" href="#id28" id="code2text-iter" name="code2text-iter">3.4.1&nbsp;&nbsp;&nbsp;Code2Text.__iter__</a></h3>
939 <pre class="literal-block">
940 def __iter__(self):
941 </pre>
<p>If the last text block does not end with a code marker (by default, the
literal-block marker <tt class="docutils literal"><span class="pre">::</span></tt>), the <cite>text</cite> method will set <cite>code_marker</cite> to
a paragraph that will start the next code block. It is yielded if non-empty
at a text-code transition. If there is no preceding text block, <cite>code_marker</cite>
contains the <cite>header_string</cite>:</p>
947 <pre class="literal-block">
if self.strip:
    self.code_marker = []
else:
    self.code_marker = [self.header_string]

for block in self.collect_blocks():
954 </pre>
<p>Test the state of the block with <a class="reference" href="#code2text-block-is-text">Code2Text.block_is_text</a> and yield it
processed with the matching handler:</p>
957 <pre class="literal-block">
if self.block_is_text(block):
    self.state = &quot;text&quot;
else:
    if self.state != &quot;code&quot; and self.code_marker:
        yield self.code_marker
    self.state = &quot;code&quot;
yield getattr(self, self.state)(block)
965 </pre>
966 </div>
967 <div class="section">
968 <h3><a class="toc-backref" href="#id29" id="header-state" name="header-state">3.4.2&nbsp;&nbsp;&nbsp;&quot;header&quot; state</a></h3>
969 <p>Sometimes code needs to remain on the first line(s) of the document to be
970 valid. The most common example is the &quot;shebang&quot; line that tells a POSIX
971 shell how to process an executable file:</p>
972 <pre class="literal-block">
973 #!/usr/bin/env python
974 </pre>
<p>In Python, the <tt class="docutils literal"><span class="pre">#</span> <span class="pre">-*-</span> <span class="pre">coding:</span> <span class="pre">iso-8859-1</span> <span class="pre">-*-</span></tt> line must occur before any
other comment or code.</p>
<p>If we want to keep the line numbers in sync for text and code source, the
reStructured Text markup for these header lines must start at the same line
as the first header line. Therefore, header lines cannot be marked as a
literal block (this would require the <tt class="docutils literal"><span class="pre">::</span></tt> and an empty line above the code).</p>
981 <p>OTOH, a comment may start at the same line as the comment marker and it
982 includes subsequent indented lines. Comments are visible in the reStructured
983 Text source but hidden in the pretty-printed output.</p>
984 <p>With a header converted to comment in the text source, everything before the
985 first text block (i.e. before the first paragraph using the matching comment
986 string) will be hidden away (in HTML or PDF output).</p>
<p>This seems a good compromise. The advantages:</p>
<ul class="simple">
<li>line numbers are kept,</li>
<li>the &quot;normal&quot; code conversion rules (indent/unindent by <cite>codeindent</cite>) apply,</li>
<li>greater flexibility: you can hide a repeating header in a project
consisting of many source files,</li>
</ul>
<p>outweigh the disadvantages:</p>
<ul class="simple">
<li>it may come as a surprise if a part of the file is not &quot;printed&quot;,</li>
<li>one more syntax element to learn for rst newbies starting with pylit
(however, starting from the code source, this will be auto-generated).</li>
</ul>
1000 <p>In the case that there is no matching comment at all, the complete code
1001 source will become a comment -- however, in this case it is not very likely
1002 the source is a literate document anyway.</p>
1003 <p>If needed for the documentation, it is possible to repeat the header in (or
1004 after) the first text block, e.g. with a <cite>line block</cite> in a <cite>block quote</cite>:</p>
1005 <blockquote>
1006 <div class="line-block">
1007 <div class="line"><tt class="docutils literal"><span class="pre">#!/usr/bin/env</span> <span class="pre">python</span></tt></div>
1008 <div class="line"><tt class="docutils literal"><span class="pre">#</span> <span class="pre">-*-</span> <span class="pre">coding:</span> <span class="pre">iso-8859-1</span> <span class="pre">-*-</span></tt></div>
1009 </div>
1010 </blockquote>
1011 <p>The current implementation represents the header state by the setting of
1012 <cite>code_marker</cite> to <tt class="docutils literal"><span class="pre">[self.header_string]</span></tt>. The first non-empty text block
1013 will overwrite this setting.</p>
1014 </div>
1015 <div class="section">
1016 <h3><a class="toc-backref" href="#id30" id="code2text-text" name="code2text-text">3.4.3&nbsp;&nbsp;&nbsp;Code2Text.text</a></h3>
1017 <p>The <em>text state handler</em> converts a comment to a text block by stripping
1018 the leading <cite>comment string</cite> from every line:</p>
1019 <pre class="literal-block">
def text(self, lines):
    &quot;&quot;&quot;Uncomment text blocks in source code
    &quot;&quot;&quot;

    lines = [line.replace(self.comment_string, &quot;&quot;, 1) for line in lines]

    lines = [re.sub(&quot;^&quot;+self.comment_string.rstrip(), &quot;&quot;, line)
             for line in lines]
1028 </pre>
1029 <p>If the code block is stripped, the literal marker would lead to an error
1030 when the text is converted with docutils. Replace it with
1031 <a class="reference" href="#code2text-strip-literal-marker">Code2Text.strip_literal_marker</a>:</p>
1032 <pre class="literal-block">
if self.strip:
    self.strip_literal_marker(lines)
    self.code_marker = []
1036 </pre>
<p>Check for a code block marker (double colon) at the end of the text block and
update the <cite>code_marker</cite> attribute. (The <cite>code_marker</cite> is yielded by
<a class="reference" href="#code2text-iter">Code2Text.__iter__</a> at a text -&gt; code transition if it is not empty):</p>
1040 <pre class="literal-block">
elif len(lines)&gt;1:
    if lines[-2].rstrip().endswith(&quot;::&quot;):
        self.code_marker = []
    else:
        self.code_marker = [&quot;::\n&quot;, &quot;\n&quot;]
1046 </pre>
1047 <p>Return the text block to the calling function:</p>
1048 <pre class="literal-block">
1049 return lines
1050 </pre>
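<p>The text handler (without the <cite>strip</cite> branch) can be sketched as a
standalone Python 3 function. The name <cite>uncomment</cite>, the default comment
string <cite>&quot;# &quot;</cite> and the convention of returning <cite>None</cite> when the marker
should stay unchanged are assumptions of this example.</p>

```python
def uncomment(lines, comment_string="# "):
    """Strip the comment string from each line and choose a code marker.

    Returns (text_lines, code_marker); code_marker is None if the block
    is too short to decide (the previous marker would be kept).
    """
    bare = comment_string.rstrip()
    text = []
    for line in lines:
        if line.startswith(comment_string):
            text.append(line[len(comment_string):])
        elif line.startswith(bare):       # commented blank line, e.g. "#\n"
            text.append(line[len(bare):])
        else:
            text.append(line)
    code_marker = None
    if len(text) > 1:
        if text[-2].rstrip().endswith("::"):
            code_marker = []              # block already ends with "::"
        else:
            code_marker = ["::\n", "\n"]  # paragraph announcing the code
    return text, code_marker
```

<p>The second-to-last line is inspected because a text block is expected to end
with a blank line.</p>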
1051 </div>
1052 <div class="section">
1053 <h3><a class="toc-backref" href="#id31" id="code2text-code" name="code2text-code">3.4.4&nbsp;&nbsp;&nbsp;Code2Text.code</a></h3>
<p>The <cite>code</cite> method is called on non-commented code. Code is returned as an
indented literal block (or filtered, if <tt class="docutils literal"><span class="pre">self.strip</span> <span class="pre">==</span> <span class="pre">True</span></tt>). The amount
of the code indentation is controlled by <cite>self.codeindent</cite> (default 2).</p>
1057 <pre class="literal-block">
def code(self, lines):
    &quot;&quot;&quot;Indent lines or strip if `strip` == `True`
    &quot;&quot;&quot;
    if self.strip == True:
        return []

    return [&quot; &quot;*self.codeindent + line for line in lines]
1065 </pre>
1066 </div>
1067 <div class="section">
1068 <h3><a class="toc-backref" href="#id32" id="code2text-block-is-text" name="code2text-block-is-text">3.4.5&nbsp;&nbsp;&nbsp;Code2Text.block_is_text</a></h3>
<p>A paragraph is a text block if every non-blank line starts with a matching
comment string (the test includes the whitespace of the comment string, except for commented blank lines):</p>
1071 <pre class="literal-block">
def block_is_text(self, block):
    for line in block:
        if (line.rstrip()
            and not line.startswith(self.comment_string)
            and line.rstrip() != self.comment_string.rstrip()):
            return False
    return True
1079 </pre>
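<p>To see which blocks count as text, the test can be run as a plain Python 3
function. The default comment string <cite>&quot;# &quot;</cite> is an assumption taken from the
option help text; the logic mirrors the method above.</p>

```python
def block_is_text(block, comment_string="# "):
    """Return True if every non-blank line is a matching comment line."""
    bare = comment_string.rstrip()
    for line in block:
        if (line.rstrip()
                and not line.startswith(comment_string)
                and line.rstrip() != bare):
            return False
    return True
```

<p>Note that a comment without the trailing space (<cite>#comment</cite>) or an indented
comment does not match, so such blocks are treated as code.</p>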
1080 </div>
1081 <div class="section">
1082 <h3><a class="toc-backref" href="#id33" id="code2text-strip-literal-marker" name="code2text-strip-literal-marker">3.4.6&nbsp;&nbsp;&nbsp;Code2Text.strip_literal_marker</a></h3>
<p>Replace the literal marker according to the equivalent of the docutils replacement rules:</p>
1084 <ul class="simple">
1085 <li>strip <cite>::</cite>-line (and preceding blank line) if on a line on its own</li>
1086 <li>strip <cite>::</cite> if it is preceded by whitespace.</li>
1087 <li>convert <cite>::</cite> to a single colon if preceded by text</li>
1088 </ul>
<p><cite>lines</cite> should be a list of text lines (with a trailing blank line).
It is modified in-place:</p>
1091 <pre class="literal-block">
def strip_literal_marker(self, lines):
    try:
        line = lines[-2]
    except IndexError:  # len(lines) &lt; 2
        return

    # split at rightmost '::'
    try:
        (head, tail) = line.rsplit('::', 1)
    except ValueError:  # only one part (no '::')
        return

    # '::' on an extra line
    if not head.strip():
        del(lines[-2])
        # delete preceding line if it is blank
        if len(lines) &gt;= 2 and not lines[-2].lstrip():
            del(lines[-2])
    # '::' follows whitespace
    elif head.rstrip() &lt; head:
        head = head.rstrip()
        lines[-2] = &quot;&quot;.join((head, tail))
    # '::' follows text
    else:
        lines[-2] = &quot;:&quot;.join((head, tail))
1117 </pre>
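<p>The three replacement rules can be checked with a module-level Python 3
variant of the method. This is a sketch that restates the logic above without
the class; it is not a different algorithm.</p>

```python
def strip_literal_marker(lines):
    """Remove or shorten a trailing '::' in `lines` (modified in place)."""
    if len(lines) < 2:
        return
    line = lines[-2]
    if "::" not in line:
        return
    head, tail = line.rsplit("::", 1)     # split at the rightmost '::'
    if not head.strip():                  # '::' on a line of its own
        del lines[-2]
        if len(lines) >= 2 and not lines[-2].lstrip():
            del lines[-2]                 # also drop the preceding blank line
    elif head != head.rstrip():           # '::' preceded by whitespace
        lines[-2] = head.rstrip() + tail
    else:                                 # '::' directly follows text
        lines[-2] = head + ":" + tail
```

<p>For example, a block ending in <cite>Example::</cite> ends in <cite>Example:</cite> afterwards,
while a block ending in <cite>Example ::</cite> ends in <cite>Example</cite>.</p>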
1118 </div>
1119 </div>
1120 </div>
1121 <div class="section">
1122 <h1><a class="toc-backref" href="#id34" id="command-line-use" name="command-line-use">4&nbsp;&nbsp;&nbsp;Command line use</a></h1>
1123 <p>Using this script from the command line will convert a file according to its
1124 extension. This default can be overridden by a couple of options.</p>
1125 <div class="section">
1126 <h2><a class="toc-backref" href="#id35" id="dual-source-handling" name="dual-source-handling">4.1&nbsp;&nbsp;&nbsp;Dual source handling</a></h2>
1127 <div class="section">
1128 <h3><a class="toc-backref" href="#id36" id="how-to-determine-which-source-is-up-to-date" name="how-to-determine-which-source-is-up-to-date">4.1.1&nbsp;&nbsp;&nbsp;How to determine which source is up-to-date?</a></h3>
1129 <ul>
<li><p class="first">set modification date of <cite>outfile</cite> to the one of <cite>infile</cite></p>
<p>This indicates that the source files are 'synchronized'.</p>
1132 <ul>
1133 <li><p class="first">Are there problems to expect from &quot;backdating&quot; a file? Which?</p>
1134 <p>Looking at <a class="reference" href="http://www.unix.com/showthread.php?t=20526">http://www.unix.com/showthread.php?t=20526</a>, it seems
1135 perfectly legal to set <cite>mtime</cite> (while leaving <cite>ctime</cite>) as <cite>mtime</cite> is a
1136 description of the &quot;actuality&quot; of the data in the file.</p>
1137 </li>
1138 <li><p class="first">Should this become a default or an option?</p>
1139 </li>
1140 </ul>
1141 </li>
1142 <li><p class="first">alternatively move input file to a backup copy (with option: <cite>--replace</cite>)</p>
1143 </li>
1144 <li><p class="first">check modification date before overwriting
1145 (with option: <cite>--overwrite=update</cite>)</p>
1146 </li>
1147 <li><p class="first">check modification date before editing (implemented as <a class="reference" href="http://www.jedsoft.org/jed/">Jed editor</a>
1148 function <cite>pylit_check()</cite> in <a class="reference" href="http://jedmodes.sourceforge.net/mode/pylit/">pylit.sl</a>)</p>
1149 </li>
1150 </ul>
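<p>The first idea in the list above, backdating the output file, could be done
with <cite>os.utime</cite>. The following Python 3 sketch uses a hypothetical helper
name <cite>sync_mtime</cite>; it only illustrates the idea and is not part of the
script.</p>

```python
import os


def sync_mtime(infile, outfile):
    """Set the modification time of `outfile` to that of `infile`.

    The access time of `outfile` is left as it is; the ctime cannot be
    set this way and is updated by the operating system.
    """
    mtime = os.stat(infile).st_mtime
    atime = os.stat(outfile).st_atime
    os.utime(outfile, (atime, mtime))
```

<p>With equal modification times, neither file is newer than the other.</p>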
1151 </div>
1152 <div class="section">
1153 <h3><a class="toc-backref" href="#id37" id="recognised-filename-extensions" name="recognised-filename-extensions">4.1.2&nbsp;&nbsp;&nbsp;Recognised Filename Extensions</a></h3>
1154 <p>Finding an easy to remember, unused filename extension is not easy.</p>
1155 <dl class="docutils">
1156 <dt>.py.txt</dt>
1157 <dd>a double extension (similar to .tar.gz, say) seems most appropriate
1158 (at least on UNIX). However, it fails on FAT16 filesystems.
1159 The same scheme can be used for c.txt, p.txt and the like.</dd>
1160 <dt>.pytxt</dt>
1161 <dd>is recognised as extension by os.path.splitext but also fails on FAT16</dd>
1162 <dt>.pyt</dt>
1163 <dd>(PYthon Text) is used by the Python test interpreter
<a class="reference" href="http://www.zetadev.com/software/pytest/">pytest</a></dd>
1165 <dt>.pyl</dt>
1166 <dd>was even mentioned as extension for &quot;literate Python&quot; files in an
1167 email exchange (<a class="reference" href="http://www.python.org/tim_one/000115.html">http://www.python.org/tim_one/000115.html</a>) but
1168 subsequently used for Python libraries.</dd>
1169 <dt>.lpy</dt>
1170 <dd>seems to be free (as by a Google search, &quot;lpy&quot; is the name of a python
1171 code pretty printer but this should not pose a problem).</dd>
1172 <dt>.tpy</dt>
1173 <dd>seems to be free as well.</dd>
1174 </dl>
<p>Instead of defining a new extension for &quot;pylit&quot; literate programs,
by default <tt class="docutils literal"><span class="pre">.txt</span></tt> will be appended for literate code and stripped by
the conversion to executable code. I.e. for a program foo:</p>
1178 <ul class="simple">
1179 <li>the literate source is called <tt class="docutils literal"><span class="pre">foo.py.txt</span></tt></li>
1180 <li>the html rendering is called <tt class="docutils literal"><span class="pre">foo.py.html</span></tt></li>
1181 <li>the python source is called <tt class="docutils literal"><span class="pre">foo.py</span></tt></li>
1182 </ul>
1183 </div>
1184 </div>
1185 <div class="section">
1186 <h2><a class="toc-backref" href="#id38" id="optionvalues" name="optionvalues">4.2&nbsp;&nbsp;&nbsp;OptionValues</a></h2>
1187 <p>For use as keyword arguments, it is handy to have the options
1188 in a dictionary. The following class adds an <cite>as_dict</cite> method
1189 to <cite>optparse.Values</cite>:</p>
1190 <pre class="literal-block">
class OptionValues(optparse.Values):
    def as_dict(self):
        &quot;&quot;&quot;Return options as dictionary object&quot;&quot;&quot;
        return dict([(option, getattr(self, option)) for option in dir(self)
                     if option not in dir(OptionValues)
                     and option is not None
                    ])
1198 </pre>
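<p>In Python 3 the same idea can be sketched more compactly, because
<cite>optparse.Values</cite> keeps the option values as instance attributes. Using
<cite>vars()</cite> instead of the <cite>dir()</cite>-based filter is a simplification chosen
for this example.</p>

```python
import optparse


class OptionValues(optparse.Values):
    def as_dict(self):
        """Return the option values as a plain dictionary."""
        return dict(vars(self))
```

<p>The resulting dictionary can be passed on as keyword arguments, e.g.
<cite>open_streams(**values.as_dict())</cite>.</p>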
1199 </div>
1200 <div class="section">
1201 <h2><a class="toc-backref" href="#id39" id="pylitoptions" name="pylitoptions">4.3&nbsp;&nbsp;&nbsp;PylitOptions</a></h2>
<p>Options are stored in the values attribute of the <cite>PylitOptions</cite> class.
It is initialized with default values and parsed command line options (and
arguments). This scheme allows easy customization by code importing the
<cite>pylit</cite> module.</p>
1206 <pre class="literal-block">
class PylitOptions(object):
    &quot;&quot;&quot;Storage and handling of program options
    &quot;&quot;&quot;
1210 </pre>
1211 <div class="section">
1212 <h3><a class="toc-backref" href="#id40" id="id5" name="id5">4.3.1&nbsp;&nbsp;&nbsp;Instantiation</a></h3>
1213 <p>Instantiation sets up an OptionParser and initializes it with pylit's
1214 command line options and <cite>default_values</cite>. It then updates the values based
1215 on command line options and sensible defaults:</p>
1216 <pre class="literal-block">
def __init__(self, args=sys.argv[1:], **keyw):
    &quot;&quot;&quot;Set up an `OptionParser` instance and parse and complete arguments
    &quot;&quot;&quot;
    p = optparse.OptionParser(usage=main.__doc__, version=_version)
    # set defaults (from modules option_defaults dict and keyword args)
    defaults = dict(option_defaults) # copy module-level defaults
    defaults.update(keyw)
    p.set_defaults(**defaults)
    # add the options
    p.add_option(&quot;-c&quot;, &quot;--code2txt&quot;, dest=&quot;txt2code&quot;, action=&quot;store_false&quot;,
                 help=&quot;convert code to reStructured text&quot;)
    p.add_option(&quot;--comment-string&quot;, dest=&quot;comment_string&quot;,
                 help=&quot;text block marker (default '# ' (for Python))&quot; )
    p.add_option(&quot;-d&quot;, &quot;--diff&quot;, action=&quot;store_true&quot;,
                 help=&quot;test for differences to existing file&quot;)
    p.add_option(&quot;--doctest&quot;, action=&quot;store_true&quot;,
                 help=&quot;run doctest.testfile() on the text version&quot;)
    p.add_option(&quot;-e&quot;, &quot;--execute&quot;, action=&quot;store_true&quot;,
                 help=&quot;execute code (Python only)&quot;)
    p.add_option(&quot;-f&quot;, &quot;--infile&quot;,
                 help=&quot;input file name ('-' for stdin)&quot; )
    p.add_option(&quot;--language&quot;, action=&quot;store&quot;,
                 choices = option_defaults[&quot;code_languages&quot;].values(),
                 help=&quot;use LANGUAGE native comment style&quot;)
    p.add_option(&quot;--overwrite&quot;, action=&quot;store&quot;,
                 choices = [&quot;yes&quot;, &quot;update&quot;, &quot;no&quot;],
                 help=&quot;overwrite output file (default 'update')&quot;)
    p.add_option(&quot;-o&quot;, &quot;--outfile&quot;,
                 help=&quot;output file name ('-' for stdout)&quot; )
    p.add_option(&quot;--replace&quot;, action=&quot;store_true&quot;,
                 help=&quot;move infile to a backup copy (appending '~')&quot;)
    p.add_option(&quot;-s&quot;, &quot;--strip&quot;, action=&quot;store_true&quot;,
                 help=&quot;export by stripping text or code&quot;)
    p.add_option(&quot;-t&quot;, &quot;--txt2code&quot;, action=&quot;store_true&quot;,
                 help=&quot;convert reStructured text to code&quot;)
    self.parser = p

    # parse to fill a self.Values instance
    self.values = self.parse_args(args)
    # complete with context-sensitive defaults
    self.values = self.complete_values(self.values)
1258 </pre>
1259 </div>
1260 <div class="section">
1261 <h3><a class="toc-backref" href="#id41" id="calling" name="calling">4.3.2&nbsp;&nbsp;&nbsp;Calling</a></h3>
1262 <p>&quot;Calling&quot; an instance updates the option values based on command line
1263 arguments and default values and does a completion of the options based on
1264 &quot;context-sensitive defaults&quot;:</p>
1265 <pre class="literal-block">
def __call__(self, args=sys.argv[1:], **default_values):
    &quot;&quot;&quot;parse and complete command line args
    &quot;&quot;&quot;
    values = self.parse_args(args, **default_values)
    return self.complete_values(values)
1271 </pre>
1272 </div>
1273 <div class="section">
1274 <h3><a class="toc-backref" href="#id42" id="pylitoptions-parse-args" name="pylitoptions-parse-args">4.3.3&nbsp;&nbsp;&nbsp;PylitOptions.parse_args</a></h3>
<p>The <cite>parse_args</cite> method calls the <cite>optparse.OptionParser</cite> on the command
line or the provided args and returns the result as an <cite>OptionValues</cite>
instance. Defaults can be provided as keyword arguments:</p>
1278 <pre class="literal-block">
def parse_args(self, args=sys.argv[1:], **default_values):
    &quot;&quot;&quot;parse command line arguments using `optparse.OptionParser`

      args           --  list of command line arguments.
      default_values --  dictionary of option defaults
    &quot;&quot;&quot;
    # update defaults
    defaults = self.parser.defaults.copy()
    defaults.update(default_values)
    # parse arguments
    (values, args) = self.parser.parse_args(args, OptionValues(defaults))
    # Convert FILE and OUTFILE positional args to option values
    # (other positional arguments are ignored)
    try:
        values.infile = args[0]
        values.outfile = args[1]
    except IndexError:
        pass
    return values
1298 </pre>
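<p>How positional arguments end up as <cite>infile</cite> and <cite>outfile</cite> can be tried
with a reduced parser. The following Python 3 sketch defines only three of the
options and is a standalone function; both choices are simplifications for
illustration.</p>

```python
import optparse


def parse_args(args, **default_values):
    """Parse `args`; map the first two positional args to infile/outfile."""
    parser = optparse.OptionParser()
    parser.add_option("-f", "--infile")
    parser.add_option("-o", "--outfile")
    parser.add_option("-s", "--strip", action="store_true")
    parser.set_defaults(**default_values)
    values, positional = parser.parse_args(args)
    # Positional arguments override the options; extra ones are ignored.
    for name, arg in zip(("infile", "outfile"), positional):
        setattr(values, name, arg)
    return values
```

<p>Options that are neither given nor defaulted stay <cite>None</cite>, which is what
allows the later completion step to fill them in.</p>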
1299 </div>
1300 <div class="section">
1301 <h3><a class="toc-backref" href="#id43" id="pylitoptions-complete-values" name="pylitoptions-complete-values">4.3.4&nbsp;&nbsp;&nbsp;PylitOptions.complete_values</a></h3>
<p>The <cite>complete_values</cite> method uses context information to set missing option values
to sensible defaults (if possible).</p>
1304 <pre class="literal-block">
def complete_values(self, values):
    &quot;&quot;&quot;complete option values with context sensible defaults
    &quot;&quot;&quot;
    values.ensure_value(&quot;infile&quot;, &quot;&quot;)
    # Guess conversion direction from infile filename
    if values.ensure_value(&quot;txt2code&quot;, None) is None:
        in_extension = os.path.splitext(values.infile)[1]
        if in_extension in self.values.text_extensions:
            values.txt2code = True
        elif in_extension in self.values.code_extensions:
            values.txt2code = False
    # Auto-determine the output file name
    values.ensure_value(&quot;outfile&quot;, self.get_outfile_name(values.infile,
                                                         values.txt2code))
    # Guess conversion direction from outfile filename or set to default
    if values.txt2code is None:
        out_extension = os.path.splitext(values.outfile)[1]
        values.txt2code = not (out_extension in self.values.text_extensions)

    # Set the language of the code (default &quot;python&quot;)
    if values.txt2code is True:
        code_extension = os.path.splitext(values.outfile)[1]
    elif values.txt2code is False:
        code_extension = os.path.splitext(values.infile)[1]
    values.ensure_value(&quot;language&quot;,
                        self.values.code_languages.get(code_extension, &quot;python&quot;))
    # Set the default overwrite mode
    values.ensure_value(&quot;overwrite&quot;, 'update')

    return values
1335 </pre>
1336 </div>
1337 <div class="section">
1338 <h3><a class="toc-backref" href="#id44" id="pylitoptions-get-outfile-name" name="pylitoptions-get-outfile-name">4.3.5&nbsp;&nbsp;&nbsp;PylitOptions.get_outfile_name</a></h3>
1339 <p>Construct a matching filename for the output file. The output filename is
1340 constructed from <cite>infile</cite> by the following rules:</p>
1341 <ul class="simple">
1342 <li>'-' (stdin) results in '-' (stdout)</li>
1343 <li>strip the <cite>txt_extension</cite> or add the <cite>code_extension</cite> (txt2code)</li>
<li>add a <cite>txt_extension</cite> (code2txt)</li>
1345 <li>fallback: if no guess can be made, add &quot;.out&quot;</li>
1346 </ul>
1347 <pre class="literal-block">
def get_outfile_name(self, infile, txt2code=None):
    &quot;&quot;&quot;Return a matching output filename for `infile`
    &quot;&quot;&quot;
    # if input is stdin, default output is stdout
    if infile == '-':
        return '-'
    # Modify `infile`
    (base, ext) = os.path.splitext(infile)
    # TODO: should get_outfile_name() use self.values.outfile_extension
    #       if it exists?

    # strip text extension
    if ext in self.values.text_extensions:
        return base
    # add (first) text extension for code files
    if ext in self.values.code_extensions or txt2code == False:
        return infile + self.values.text_extensions[0]
    # give up
    return infile + &quot;.out&quot;
1367 </pre>
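<p>The naming rules can be tested with a standalone Python 3 version of the
method. The extension tuples used as defaults here (<cite>.txt</cite> for text,
<cite>.py</cite> for code) are assumptions for the example; the script takes them
from its option values.</p>

```python
import os


def get_outfile_name(infile, txt2code=None,
                     text_extensions=(".txt",), code_extensions=(".py",)):
    """Return a matching output filename for `infile`."""
    if infile == "-":                 # stdin maps to stdout
        return "-"
    base, ext = os.path.splitext(infile)
    if ext in text_extensions:        # strip the text extension
        return base
    if ext in code_extensions or txt2code is False:
        return infile + text_extensions[0]
    return infile + ".out"            # no guess possible
```

<p>So <cite>foo.py.txt</cite> maps to <cite>foo.py</cite> and back again, while an unknown
extension only gets a text extension if the conversion direction is given
explicitly.</p>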
1368 </div>
1369 </div>
1370 <div class="section">
1371 <h2><a class="toc-backref" href="#id45" id="helper-functions" name="helper-functions">4.4&nbsp;&nbsp;&nbsp;Helper functions</a></h2>
1372 <div class="section">
1373 <h3><a class="toc-backref" href="#id46" id="open-streams" name="open-streams">4.4.1&nbsp;&nbsp;&nbsp;open_streams</a></h3>
<p>Return file objects for in- and output. If the input path is missing,
write usage and abort. (An alternative would be to use stdin as default.
However, this leaves the uninitiated user with a non-responding application
if (s)he just tries the script without any arguments.)</p>
1378 <pre class="literal-block">
def open_streams(infile = '-', outfile = '-', overwrite='update', **keyw):
    &quot;&quot;&quot;Open and return the input and output stream

    open_streams(infile, outfile) -&gt; (in_stream, out_stream)

    in_stream   --  file(infile) or sys.stdin
    out_stream  --  file(outfile) or sys.stdout
    overwrite   --  ['yes', 'update', 'no']
                    if 'update', only open output file if it is older than
                    the input stream.
                    Irrelevant if outfile == '-'.
    &quot;&quot;&quot;
    if not infile:
        strerror = &quot;Missing input file name ('-' for stdin; -h for help)&quot;
        raise IOError, (2, strerror, infile)
    if infile == '-':
        in_stream = sys.stdin
    else:
        in_stream = file(infile, 'r')
    if outfile == '-':
        out_stream = sys.stdout
    elif overwrite == 'no' and os.path.exists(outfile):
        raise IOError, (1, &quot;Output file exists!&quot;, outfile)
    elif overwrite == 'update' and is_newer(outfile, infile):
        raise IOError, (1, &quot;Output file is newer than input file!&quot;, outfile)
    else:
        out_stream = file(outfile, 'w')
    return (in_stream, out_stream)
1407 </pre>
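<p>The overwrite policy can be separated from the actual opening of the
streams. The following Python 3 sketch only illustrates the three settings;
the helper names <cite>mtime_or_oldest</cite> and <cite>check_overwrite</cite> and the file
names are assumptions, not part of pylit.</p>
<pre class="literal-block">
import os
import tempfile


def mtime_or_oldest(path):
    """Modification time of `path`; missing files count as oldest."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return -1


def check_overwrite(infile, outfile, overwrite="update"):
    """Raise OSError if `outfile` must not be replaced."""
    if outfile == "-":
        return                      # stdout: nothing to protect
    if overwrite == "no" and os.path.exists(outfile):
        raise OSError(1, "Output file exists!", outfile)
    if (overwrite == "update"
            and mtime_or_oldest(outfile) > mtime_or_oldest(infile)):
        raise OSError(1, "Output file is newer than input file!", outfile)


with tempfile.TemporaryDirectory() as tmp:
    infile = os.path.join(tmp, "in.txt")
    outfile = os.path.join(tmp, "out.py")
    # create both files; the output gets the later time stamp
    for path, stamp in ((infile, 1000), (outfile, 2000)):
        with open(path, "w") as stream:
            stream.write("x\n")
        os.utime(path, (stamp, stamp))
    try:
        check_overwrite(infile, outfile)        # 'update': refused
        refused = None
    except OSError as ex:
        refused = (ex.errno, ex.strerror)
    check_overwrite(infile, outfile, overwrite="yes")   # always allowed

print(refused)
</pre>
<p>The (errno, strerror, filename) triple is the same one that <cite>main</cite> later
reads from the exception to print a message and to set the exit status.</p>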
1408 </div>
1409 <div class="section">
1410 <h3><a class="toc-backref" href="#id47" id="is-newer" name="is-newer">4.4.2&nbsp;&nbsp;&nbsp;is_newer</a></h3>
1411 <pre class="literal-block">
def is_newer(path1, path2):
    &quot;&quot;&quot;Check if `path1` is newer than `path2` (using mtime)

    Compare modification time of files at path1 and path2.

    Non-existing files are considered oldest: Return False if path1 does not
    exist and True if path2 does not exist.

    Return None for equal modification time. (This evaluates to False in a
    boolean context but allows a test for equality.)

    &quot;&quot;&quot;
    try:
        mtime1 = os.path.getmtime(path1)
    except OSError:
        mtime1 = -1
    try:
        mtime2 = os.path.getmtime(path2)
    except OSError:
        mtime2 = -1
    # print &quot;mtime1&quot;, mtime1, path1, &quot;\n&quot;, &quot;mtime2&quot;, mtime2, path2

    if mtime1 == mtime2:
        return None
    return mtime1 &gt; mtime2
1437 </pre>
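<p>The three possible return values are easiest to see with files whose
time stamps are set explicitly. The sketch below is a Python 3 rendering of
the same logic; the temporary file names and the inner helper <cite>mtime</cite> are
assumptions made for this illustration.</p>
<pre class="literal-block">
import os
import tempfile


def is_newer(path1, path2):
    """Python 3 rendering of the function above (same logic)."""
    def mtime(path):
        try:
            return os.path.getmtime(path)
        except OSError:
            return -1            # missing files count as oldest
    mtime1, mtime2 = mtime(path1), mtime(path2)
    if mtime1 == mtime2:
        return None
    return mtime1 > mtime2


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "old.txt")
    new = os.path.join(tmp, "new.txt")
    missing = os.path.join(tmp, "missing.txt")   # never created
    for path in (old, new):
        with open(path, "w") as stream:
            stream.write("data\n")
    os.utime(old, (1000, 1000))   # (access time, modification time)
    os.utime(new, (2000, 2000))
    results = (is_newer(new, old), is_newer(old, new),
               is_newer(old, old), is_newer(missing, old),
               is_newer(old, missing))

print(results)
</pre>
<p>This prints <tt class="docutils literal"><span class="pre">(True,</span> <span class="pre">False,</span> <span class="pre">None,</span> <span class="pre">False,</span> <span class="pre">True)</span></tt>. Callers that only need
a yes/no answer can use the result directly in a boolean context, while a
test with <tt class="docutils literal"><span class="pre">is</span> <span class="pre">None</span></tt> detects equal time stamps.</p>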
1438 </div>
1439 <div class="section">
1440 <h3><a class="toc-backref" href="#id48" id="get-converter" name="get-converter">4.4.3&nbsp;&nbsp;&nbsp;get_converter</a></h3>
1441 <p>Get an instance of the converter state machine:</p>
1442 <pre class="literal-block">
def get_converter(data, txt2code=True, **keyw):
    if txt2code:
        return Text2Code(data, **keyw)
    else:
        return Code2Text(data, **keyw)
1448 </pre>
1449 </div>
1450 </div>
1451 <div class="section">
1452 <h2><a class="toc-backref" href="#id49" id="use-cases" name="use-cases">4.5&nbsp;&nbsp;&nbsp;Use cases</a></h2>
1453 <div class="section">
1454 <h3><a class="toc-backref" href="#id50" id="run-doctest" name="run-doctest">4.5.1&nbsp;&nbsp;&nbsp;run_doctest</a></h3>
1455 <pre class="literal-block">
def run_doctest(infile=&quot;-&quot;, txt2code=True,
                globs={}, verbose=False, optionflags=0, **keyw):
    &quot;&quot;&quot;run doctest on the text source
    &quot;&quot;&quot;
    from doctest import DocTestParser, DocTestRunner
    (data, out_stream) = open_streams(infile, &quot;-&quot;)
1462 </pre>
1463 <p>If source is code, convert to text, as tests in comments are not found by
1464 doctest:</p>
1465 <pre class="literal-block">
if txt2code is False:
    converter = Code2Text(data, **keyw)
    docstring = str(converter)
else:
    docstring = data.read()
1471 </pre>
1472 <p>Use the doctest Advanced API to do all doctests in a given string:</p>
1473 <pre class="literal-block">
# pass on the `globs` argument (it was silently replaced by an empty dict)
test = DocTestParser().get_doctest(docstring, globs=globs, name=&quot;&quot;,
                                   filename=infile, lineno=0)
runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
runner.run(test)
runner.summarize()
# give feedback also if no failures occurred
if not runner.failures:
    print &quot;%d failures in %d tests&quot;%(runner.failures, runner.tries)
return runner.failures, runner.tries
1482 </pre>
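<p>The doctest calls used above also work on any plain string. A minimal
Python 3 sketch, assuming a made-up text source with one embedded example
(the names <cite>text</cite>, <cite>example</cite> and <cite>example.txt</cite> are placeholders):</p>
<pre class="literal-block">
from doctest import DocTestParser, DocTestRunner

# Hypothetical text source with one interactive example.
text = """
Some prose with a test:

    >>> 6 * 7
    42
"""

# Collect all examples in the string into a single DocTest object ...
test = DocTestParser().get_doctest(text, globs={}, name="example",
                                   filename="example.txt", lineno=0)
# ... and run it.
runner = DocTestRunner(verbose=False)
runner.run(test)
failed, attempted = runner.summarize()
print("%d failures in %d tests" % (failed, attempted))
</pre>
<p>With the passing example above, the last line prints
<tt class="docutils literal"><span class="pre">0</span> <span class="pre">failures</span> <span class="pre">in</span> <span class="pre">1</span> <span class="pre">tests</span></tt>.</p>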
1483 </div>
1484 <div class="section">
1485 <h3><a class="toc-backref" href="#id51" id="diff" name="diff">4.5.2&nbsp;&nbsp;&nbsp;diff</a></h3>
1486 <pre class="literal-block">
def diff(infile='-', outfile='-', txt2code=True, **keyw):
    &quot;&quot;&quot;Report differences between converted infile and existing outfile

    If outfile is '-', do a round-trip conversion and report differences
    &quot;&quot;&quot;

    import difflib

    instream = file(infile)
    # for diffing, we need a copy of the data as list::
    data = instream.readlines()
    # convert
    converter = get_converter(data, txt2code, **keyw)
    new = str(converter).splitlines(True)

    if outfile != '-':
        outstream = file(outfile)
        old = outstream.readlines()
        oldname = outfile
        newname = &quot;&lt;conversion of %s&gt;&quot;%infile
    else:
        old = data
        oldname = infile
        # back-convert the output data
        converter = get_converter(new, not txt2code)
        new = str(converter).splitlines(True)
        newname = &quot;&lt;round-conversion of %s&gt;&quot;%infile

    # find and print the differences
    delta = list(difflib.unified_diff(old, new, fromfile=oldname,
                                      tofile=newname))
    if not delta:
        print oldname
        print newname
        print &quot;no differences found&quot;
        return False
    print &quot;&quot;.join(delta)
    return True
1525 </pre>
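<p>The comparison step relies on <cite>difflib.unified_diff</cite>, which expects two
lists of lines that keep their line endings. A small Python 3 sketch with
made-up data (the list contents and file names are assumptions):</p>
<pre class="literal-block">
import difflib

old = ["first line\n", "second line\n", "third line\n"]
new = ["first line\n", "2nd line\n", "third line\n"]

# unified_diff returns a generator; a list is needed to test for emptiness
delta = list(difflib.unified_diff(old, new, fromfile="old.txt",
                                  tofile="new.txt"))
if delta:
    print("".join(delta), end="")
else:
    print("no differences found")

# Identical inputs yield an empty delta (the case reported as False above).
unchanged = list(difflib.unified_diff(old, old))
</pre>
<p>This is why <cite>diff</cite> reads the input with <cite>readlines</cite> and splits the
converted string with <tt class="docutils literal"><span class="pre">splitlines(True)</span></tt>: both keep the newline at the
end of every line.</p>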
1526 </div>
1527 </div>
1528 <div class="section">
1529 <h2><a class="toc-backref" href="#id52" id="main" name="main">4.6&nbsp;&nbsp;&nbsp;main</a></h2>
1530 <p>If this script is called from the command line, the <cite>main</cite> function will
1531 convert the input (file or stdin) between text and code formats.</p>
1532 <div class="section">
1533 <h3><a class="toc-backref" href="#id53" id="id6" name="id6">4.6.1&nbsp;&nbsp;&nbsp;Customization</a></h3>
<p>Option defaults for the conversion can be given as keyword arguments to <a class="reference" href="#main">main</a>.
1535 The option defaults will be updated by command line options and extended
1536 with &quot;intelligent guesses&quot; by <cite>PylitOptions</cite> and passed on to helper
1537 functions and the converter instantiation.</p>
<p>This allows easy customization for programmatic use -- just call <cite>main</cite>
with the appropriate keyword options (or with an <cite>option_defaults</cite>
dictionary), e.g.:</p>
1541 <pre class="doctest-block">
&gt;&gt;&gt; option_defaults = {'language': &quot;c++&quot;,
...                    'codeindent': 4,
...                    'header_string': '..admonition::'
...                   }
1546 </pre>
1547 <pre class="doctest-block">
1548 &gt;&gt;&gt; main(**option_defaults)
1549 </pre>
1550 <pre class="literal-block">
def main(args=sys.argv[1:], **option_defaults):
    &quot;&quot;&quot;%prog [options] FILE [OUTFILE]

    Convert between reStructured Text with embedded code, and
    Source code with embedded text comment blocks&quot;&quot;&quot;
1556 </pre>
1557 <p>Parse and complete the options:</p>
1558 <pre class="literal-block">
options = PylitOptions(args, **option_defaults).values
1560 </pre>
1561 <p>Run doctests if <tt class="docutils literal"><span class="pre">--doctest</span></tt> option is set:</p>
1562 <pre class="literal-block">
if options.ensure_value(&quot;doctest&quot;, None):
    return run_doctest(**options.as_dict())
1565 </pre>
<p>Do a round-trip and report differences if the <tt class="docutils literal"><span class="pre">--diff</span></tt> option is set:</p>
1567 <pre class="literal-block">
if options.ensure_value(&quot;diff&quot;, None):
    return diff(**options.as_dict())
1570 </pre>
1571 <p>Open in- and output streams:</p>
1572 <pre class="literal-block">
try:
    (data, out_stream) = open_streams(**options.as_dict())
except IOError, ex:
    print &quot;IOError: %s %s&quot; % (ex.filename, ex.strerror)
    sys.exit(ex.errno)
1578 </pre>
1579 <p>Get a converter instance:</p>
1580 <pre class="literal-block">
converter = get_converter(data, **options.as_dict())
1582 </pre>
<p>Execute if the <tt class="docutils literal"><span class="pre">--execute</span></tt> option is set:</p>
1584 <pre class="literal-block">
if options.ensure_value(&quot;execute&quot;, None):
    print &quot;executing &quot; + options.infile
    if options.txt2code:
        code = str(converter)
    else:
        code = data
    exec code
    return
1593 </pre>
1594 <p>Default action: Convert and write to out_stream:</p>
1595 <pre class="literal-block">
out_stream.write(str(converter))

if out_stream is not sys.stdout:
    print &quot;extract written to&quot;, out_stream.name
    out_stream.close()
1601 </pre>
1602 <p>Rename the infile to a backup copy if <tt class="docutils literal"><span class="pre">--replace</span></tt> is set:</p>
1603 <pre class="literal-block">
if options.ensure_value(&quot;replace&quot;, None):
    os.rename(options.infile, options.infile + &quot;~&quot;)
1606 </pre>
<p>If not (and input and output are from files), set the modification time
(<cite>mtime</cite>) of the output file to the one of the input file to indicate that
the contained information is equal. <a class="footnote-reference" href="#id7">[4]</a></p>
1610 <pre class="literal-block">
else:
    try:
        os.utime(options.outfile, (os.path.getatime(options.outfile),
                                   os.path.getmtime(options.infile))
                )
    except OSError:
        pass

## print &quot;mtime&quot;, os.path.getmtime(options.infile), options.infile
## print &quot;mtime&quot;, os.path.getmtime(options.outfile), options.outfile
1621 </pre>
1622 <table class="docutils footnote" frame="void" id="id7" rules="none">
1623 <colgroup><col class="label" /><col /></colgroup>
1624 <tbody valign="top">
<tr><td class="label"><a name="id7">[4]</a></td><td>Make sure the corresponding file object (here <cite>out_stream</cite>) is
closed, as otherwise the change will be overwritten when <cite>close</cite> is
called afterwards (either explicitly or at program exit).</td></tr>
1628 </tbody>
1629 </table>
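<p>The effect of the <cite>os.utime</cite> call can be checked on two scratch files.
The following Python 3 sketch is an illustration only; the file names are
made up and the input time stamp is set by hand.</p>
<pre class="literal-block">
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    infile = os.path.join(tmp, "source.txt")     # hypothetical names
    outfile = os.path.join(tmp, "source.py")
    with open(infile, "w") as stream:
        stream.write("text source\n")
    os.utime(infile, (1000, 1000))               # pretend the input is old
    with open(outfile, "w") as stream:           # closed before os.utime
        stream.write("converted code\n")

    # Keep the access time of the output, copy the mtime of the input.
    os.utime(outfile, (os.path.getatime(outfile),
                       os.path.getmtime(infile)))
    same_mtime = os.path.getmtime(outfile) == os.path.getmtime(infile)

print(same_mtime)
</pre>
<p>With equal time stamps, <cite>is_newer</cite> returns None in both directions, so a
later run with <tt class="docutils literal"><span class="pre">overwrite='update'</span></tt> is not refused for either file.</p>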
1630 <p>Run main, if called from the command line:</p>
1631 <pre class="literal-block">
if __name__ == '__main__':
    main()
1634 </pre>
1635 </div>
1636 </div>
1637 </div>
1638 <div class="section">
1639 <h1><a class="toc-backref" href="#id54" id="open-questions" name="open-questions">5&nbsp;&nbsp;&nbsp;Open questions</a></h1>
1640 <p>Open questions and ideas for further development</p>
1641 <div class="section">
1642 <h2><a class="toc-backref" href="#id55" id="options" name="options">5.1&nbsp;&nbsp;&nbsp;Options</a></h2>
1643 <ul>
1644 <li><p class="first">Collect option defaults in a dictionary (on module level)</p>
1645 <p>Facilitates the setting of options in programmatic use</p>
1646 <p>Use templates for the &quot;intelligent guesses&quot; (with Python syntax for string
1647 replacement with dicts: <tt class="docutils literal"><span class="pre">&quot;hello</span> <span class="pre">%(what)s&quot;</span> <span class="pre">%</span> <span class="pre">{'what':</span> <span class="pre">'world'}</span></tt>)</p>
1648 </li>
1649 <li><p class="first">Is it sensible to offer the <cite>header_string</cite> option also as command line
1650 option?</p>
1651 </li>
1652 <li><p class="first">Configurable</p>
1653 </li>
1654 </ul>
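<p>The first idea could look roughly like the following Python 3 sketch.
It is an assumption about one possible layout, not existing pylit code: the
dictionary keys and the template string are invented for the illustration.</p>
<pre class="literal-block">
# Hypothetical module-level defaults; the names are illustrative only.
option_defaults = {"language": "python", "text_extension": ".txt"}

# A template for one "intelligent guess", filled from the options dict.
outfile_template = "%(infile)s%(text_extension)s"

# Programmatic use: copy the defaults and add the per-call values.
options = dict(option_defaults, infile="example.py")
outfile = outfile_template % options
print(outfile)
</pre>
<p>Here the guess for the output name becomes <cite>example.py.txt</cite>; changing a
default in the dictionary changes every guess that refers to it.</p>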
1655 </div>
1656 <div class="section">
1657 <h2><a class="toc-backref" href="#id56" id="parsing-problems" name="parsing-problems">5.2&nbsp;&nbsp;&nbsp;Parsing Problems</a></h2>
1658 <ul>
1659 <li><p class="first">How can I include a literal block that should not be in the
1660 executable code (e.g. an example, an earlier version or variant)?</p>
1661 <dl class="docutils">
1662 <dt>Workaround:</dt>
<dd><p class="first">Use a <cite>quoted literal block</cite> (with a quotation different from
the comment string used for text blocks) to keep it commented over the
code-text round-trips.</p>
1666 <p class="last">Python <cite>pydoc</cite> examples can also use the special pydoc block syntax (no
1667 double colon!).</p>
1668 </dd>
1669 <dt>Alternative:</dt>
1670 <dd><p class="first last">use a special &quot;code block&quot; directive or a special &quot;no code
1671 block&quot; directive.</p>
1672 </dd>
1673 </dl>
1674 </li>
<li><p class="first">Ignore &quot;matching comments&quot; in literal strings?</p>
<p>(This would need a specific detection algorithm for every language that
supports multi-line literal strings, e.g. C++, PHP, Python.)</p>
1678 </li>
1679 <li><p class="first">Warn if a comment in code will become text after round-trip?</p>
1680 </li>
1681 </ul>
1682 </div>
1683 <div class="section">
1684 <h2><a class="toc-backref" href="#id57" id="code-syntax-highlight" name="code-syntax-highlight">5.3&nbsp;&nbsp;&nbsp;code syntax highlight</a></h2>
1685 <p>use <cite>listing</cite> package in LaTeX-&gt;PDF</p>
1686 <p>in html, see</p>
1687 <ul class="simple">
1688 <li>the syntax highlight support in rest2web
1689 (uses the Moin-Moin Python colorizer, see a version at
1690 <a class="reference" href="http://www.standards-schmandards.com/2005/fangs-093/">http://www.standards-schmandards.com/2005/fangs-093/</a>)</li>
1691 <li>Pygments (pure Python, many languages, rst integration recipe):
1692 <a class="reference" href="http://pygments.org/docs/rstdirective/">http://pygments.org/docs/rstdirective/</a></li>
1693 <li>Silvercity, enscript, ...</li>
1694 </ul>
1695 <p>Some plug-ins require a special &quot;code block&quot; directive instead of the
1696 <cite>::</cite>-literal block. TODO: make this an option</p>
1697 <p>Ask at docutils users|developers</p>
1698 <ul class="simple">
1699 <li>How to handle docstrings in code blocks? (it would be nice to convert them
1700 to rst-text if <tt class="docutils literal"><span class="pre">__docformat__</span> <span class="pre">==</span> <span class="pre">restructuredtext</span></tt>)</li>
1701 </ul>
1702 </div>
1703 </div>
1704 </div>
1705 <div class="footer">
1706 <hr class="footer" />
1707 Generated on: 2007-03-02.
1709 </div>
1710 </body>
1711 </html>