1 <?xml version=
"1.0" encoding=
"iso-8859-1" ?>
2 <!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3 <html xmlns=
"http://www.w3.org/1999/xhtml" xml:
lang=
"en" lang=
"en">
5 <meta http-equiv=
"Content-Type" content=
"text/html; charset=iso-8859-1" />
6 <meta name=
"generator" content=
"Docutils 0.4.1: http://docutils.sourceforge.net/" />
7 <title>pylit.py: Literate programming with Python and reStructuredText
</title>
8 <meta name=
"date" content=
"2007-01-31" />
9 <meta name=
"copyright" content=
"2005, 2007 Guenter Milde. Released under the terms of the GNU General Public License (v. 2 or later)" />
295 <div class=
"document" id=
"pylit-py-literate-programming-with-python-and-restructuredtext">
296 <h1 class=
"title">pylit.py: Literate programming with Python and reStructuredText
</h1>
297 <table class=
"docinfo" frame=
"void" rules=
"none">
298 <col class=
"docinfo-name" />
299 <col class=
"docinfo-content" />
301 <tr><th class=
"docinfo-name">Date:
</th>
302 <td>2007-
01-
31</td></tr>
303 <tr><th class=
"docinfo-name">Copyright:
</th>
304 <td>2005,
2007 Guenter Milde.
305 Released under the terms of the GNU General Public License
306 (v.
2 or later)
</td></tr>
309 <!-- #!/usr/bin/env python
310 # -*- coding: iso-8859-1 -*- -->
311 <div class=
"contents topic">
312 <p class=
"topic-title first"><a id=
"contents" name=
"contents">Contents
</a></p>
313 <ul class=
"auto-toc simple">
314 <li><a class=
"reference" href=
"#frontmatter" id=
"id8" name=
"id8">1 Frontmatter
</a><ul class=
"auto-toc">
315 <li><a class=
"reference" href=
"#changelog" id=
"id9" name=
"id9">1.1 Changelog
</a></li>
316 <li><a class=
"reference" href=
"#requirements" id=
"id10" name=
"id10">1.2 Requirements
</a></li>
319 <li><a class=
"reference" href=
"#customization" id=
"id11" name=
"id11">2 Customization
</a></li>
320 <li><a class=
"reference" href=
"#classes" id=
"id12" name=
"id12">3 Classes
</a><ul class=
"auto-toc">
321 <li><a class=
"reference" href=
"#pushiterator" id=
"id13" name=
"id13">3.1 PushIterator
</a></li>
322 <li><a class=
"reference" href=
"#converter" id=
"id14" name=
"id14">3.2 Converter
</a><ul class=
"auto-toc">
323 <li><a class=
"reference" href=
"#data-attributes" id=
"id15" name=
"id15">3.2.1 Data attributes
</a></li>
324 <li><a class=
"reference" href=
"#instantiation" id=
"id16" name=
"id16">3.2.2 Instantiation
</a></li>
325 <li><a class=
"reference" href=
"#converter-str" id=
"id17" name=
"id17">3.2.3 Converter.__str__
</a></li>
326 <li><a class=
"reference" href=
"#converter-get-indent" id=
"id18" name=
"id18">3.2.4 Converter.get_indent
</a></li>
327 <li><a class=
"reference" href=
"#converter-ensure-trailing-blank-line" id=
"id19" name=
"id19">3.2.5 Converter.ensure_trailing_blank_line
</a></li>
328 <li><a class=
"reference" href=
"#converter-collect-blocks" id=
"id20" name=
"id20">3.2.6 Converter.collect_blocks
</a></li>
331 <li><a class=
"reference" href=
"#text2code" id=
"id21" name=
"id21">3.3 Text2Code
</a><ul class=
"auto-toc">
332 <li><a class=
"reference" href=
"#text2code-header" id=
"id22" name=
"id22">3.3.1 Text2Code.header
</a></li>
333 <li><a class=
"reference" href=
"#text2code-text-handler-generator" id=
"id23" name=
"id23">3.3.2 Text2Code.text_handler_generator
</a></li>
334 <li><a class=
"reference" href=
"#text2code-code-handler-generator" id=
"id24" name=
"id24">3.3.3 Text2Code.code_handler_generator
</a></li>
335 <li><a class=
"reference" href=
"#txt2code-remove-literal-marker" id=
"id25" name=
"id25">3.3.4 Text2Code.remove_literal_marker
</a></li>
336 <li><a class=
"reference" href=
"#text2code-iter-strip" id=
"id26" name=
"id26">3.3.5 Text2Code.iter_strip
</a></li>
339 <li><a class=
"reference" href=
"#code2text" id=
"id27" name=
"id27">3.4 Code2Text
</a><ul class=
"auto-toc">
340 <li><a class=
"reference" href=
"#code2text-iter" id=
"id28" name=
"id28">3.4.1 Code2Text.__iter__
</a></li>
341 <li><a class=
"reference" href=
"#header-state" id=
"id29" name=
"id29">3.4.2 "header
" state
</a></li>
342 <li><a class=
"reference" href=
"#code2text-text" id=
"id30" name=
"id30">3.4.3 Code2Text.text
</a></li>
343 <li><a class=
"reference" href=
"#code2text-code" id=
"id31" name=
"id31">3.4.4 Code2Text.code
</a></li>
344 <li><a class=
"reference" href=
"#code2text-block-is-text" id=
"id32" name=
"id32">3.4.5 Code2Text.block_is_text
</a></li>
345 <li><a class=
"reference" href=
"#code2text-strip-literal-marker" id=
"id33" name=
"id33">3.4.6 Code2Text.strip_literal_marker
</a></li>
350 <li><a class=
"reference" href=
"#command-line-use" id=
"id34" name=
"id34">4 Command line use
</a><ul class=
"auto-toc">
351 <li><a class=
"reference" href=
"#dual-source-handling" id=
"id35" name=
"id35">4.1 Dual source handling
</a><ul class=
"auto-toc">
352 <li><a class=
"reference" href=
"#how-to-determine-which-source-is-up-to-date" id=
"id36" name=
"id36">4.1.1 How to determine which source is up-to-date?
</a></li>
353 <li><a class=
"reference" href=
"#recognised-filename-extensions" id=
"id37" name=
"id37">4.1.2 Recognised Filename Extensions
</a></li>
356 <li><a class=
"reference" href=
"#optionvalues" id=
"id38" name=
"id38">4.2 OptionValues
</a></li>
357 <li><a class=
"reference" href=
"#pylitoptions" id=
"id39" name=
"id39">4.3 PylitOptions
</a><ul class=
"auto-toc">
358 <li><a class=
"reference" href=
"#id5" id=
"id40" name=
"id40">4.3.1 Instantiation
</a></li>
359 <li><a class=
"reference" href=
"#calling" id=
"id41" name=
"id41">4.3.2 Calling
</a></li>
360 <li><a class=
"reference" href=
"#pylitoptions-parse-args" id=
"id42" name=
"id42">4.3.3 PylitOptions.parse_args
</a></li>
361 <li><a class=
"reference" href=
"#pylitoptions-complete-values" id=
"id43" name=
"id43">4.3.4 PylitOptions.complete_values
</a></li>
362 <li><a class=
"reference" href=
"#pylitoptions-get-outfile-name" id=
"id44" name=
"id44">4.3.5 PylitOptions.get_outfile_name
</a></li>
365 <li><a class=
"reference" href=
"#helper-functions" id=
"id45" name=
"id45">4.4 Helper functions
</a><ul class=
"auto-toc">
366 <li><a class=
"reference" href=
"#open-streams" id=
"id46" name=
"id46">4.4.1 open_streams
</a></li>
367 <li><a class=
"reference" href=
"#is-newer" id=
"id47" name=
"id47">4.4.2 is_newer
</a></li>
368 <li><a class=
"reference" href=
"#get-converter" id=
"id48" name=
"id48">4.4.3 get_converter
</a></li>
371 <li><a class=
"reference" href=
"#use-cases" id=
"id49" name=
"id49">4.5 Use cases
</a><ul class=
"auto-toc">
372 <li><a class=
"reference" href=
"#run-doctest" id=
"id50" name=
"id50">4.5.1 run_doctest
</a></li>
373 <li><a class=
"reference" href=
"#diff" id=
"id51" name=
"id51">4.5.2 diff
</a></li>
376 <li><a class=
"reference" href=
"#main" id=
"id52" name=
"id52">4.6 main
</a><ul class=
"auto-toc">
377 <li><a class=
"reference" href=
"#id6" id=
"id53" name=
"id53">4.6.1 Customization
</a></li>
382 <li><a class=
"reference" href=
"#open-questions" id=
"id54" name=
"id54">5 Open questions
</a><ul class=
"auto-toc">
383 <li><a class=
"reference" href=
"#options" id=
"id55" name=
"id55">5.1 Options
</a></li>
384 <li><a class=
"reference" href=
"#parsing-problems" id=
"id56" name=
"id56">5.2 Parsing Problems
</a></li>
385 <li><a class=
"reference" href=
"#code-syntax-highlight" id=
"id57" name=
"id57">5.3 code syntax highlight
</a></li>
390 <div class=
"section">
391 <h1><a class=
"toc-backref" href=
"#id8" id=
"frontmatter" name=
"frontmatter">1 Frontmatter
</a></h1>
392 <div class=
"section">
393 <h2><a class=
"toc-backref" href=
"#id9" id=
"changelog" name=
"changelog">1.1 Changelog
</a></h2>
394 <table class=
"docutils field-list" frame=
"void" rules=
"none">
395 <col class=
"field-name" />
396 <col class=
"field-body" />
398 <tr class=
"field"><th class=
"field-name">2005-
06-
29:
</th><td class=
"field-body">Initial version
</td>
400 <tr class=
"field"><th class=
"field-name">2005-
06-
30:
</th><td class=
"field-body">first literate version of the script
</td>
402 <tr class=
"field"><th class=
"field-name">2005-
07-
01:
</th><td class=
"field-body">object-oriented script using generators
</td>
404 <tr class=
"field"><th class=
"field-name">2005-
07-
10:
</th><td class=
"field-body">Two-state machine (a 'header' state was added later)
</td>
406 <tr class=
"field"><th class=
"field-name">2006-
12-
04:
</th><td class=
"field-body">Start of work on version
0.2 (code restructuring)
</td>
408 <tr class=
"field"><th class=
"field-name">2007-
01-
23:
</th><td class=
"field-body">0.2 published at
<a class=
"reference" href=
"http://pylit.berlios.de">http://pylit.berlios.de
</a></td>
410 <tr class=
"field"><th class=
"field-name">2007-
01-
25:
</th><td class=
"field-body">0.2.1 Outsourced non-core documentation to the PyLit pages.
</td>
412 <tr class=
"field"><th class=
"field-name">2007-
01-
26:
</th><td class=
"field-body">0.2.2 new behaviour of
<cite>diff
</cite> function
</td>
414 <tr class=
"field"><th class=
"field-name">2007-
01-
29:
</th><td class=
"field-body">0.2.3 new
<cite>header
</cite> methods after suggestion by Riccardo Murri
</td>
416 <tr class=
"field"><th class=
"field-name">2007-
01-
31:
</th><td class=
"field-body">0.2.4 raise Error if code indent is too small
</td>
418 <tr class=
"field"><th class=
"field-name">2007-
02-
05:
</th><td class=
"field-body">0.2.5 new command line option --comment-string
</td>
420 <tr class=
"field"><th class=
"field-name">2007-
02-
09:
</th><td class=
"field-body">0.2.6 add section with open questions,
421 Code2Text: let only blank lines (no comment str)
422 separate text and code,
423 fix
<cite>Code2Text.header
</cite></td>
425 <tr class=
"field"><th class=
"field-name">2007-
02-
19:
</th><td class=
"field-body">0.2.7 simplify
<cite>Code2Text.header,
</cite>
426 new
<cite>iter_strip
</cite> method replacing a lot of
<tt class=
"docutils literal"><span class=
"pre">if
</span></tt>-s
</td>
428 <tr class=
"field"><th class=
"field-name">2007-
02-
22:
</th><td class=
"field-body">0.2.8 set
<cite>mtime
</cite> of outfile to the one of infile
</td>
430 <tr class=
"field"><th class=
"field-name">2007-
02-
27:
</th><td class=
"field-body">0.3 new
<cite>Code2Text
</cite> converter after an idea by Riccardo Murri
(a new
<cite>Text2Code
</cite> will follow soon),
explicit
<cite>option_defaults
</cite> dict for easier customization
</td>
436 <pre class=
"literal-block">
437 """pylit: Literate programming with Python and reStructuredText
439 PyLit is a bidirectional converter between
441 * a (reStructured) text source with embedded code, and
442 * a code source with embedded text blocks (comments)
445 __docformat__ = 'restructuredtext'
447 _version =
"0.3"
450 <div class=
"section">
451 <h2><a class=
"toc-backref" href=
"#id10" id=
"requirements" name=
"requirements">1.2 Requirements
</a></h2>
453 <li>library modules
</li>
455 <pre class=
"literal-block">
462 <li>non-standard extensions
</li>
464 <pre class=
"literal-block">
465 from simplestates import SimpleStates # generic state machine
469 <div class=
"section">
470 <h1><a class=
"toc-backref" href=
"#id11" id=
"customization" name=
"customization">2 Customization
</a></h1>
471 <pre class=
"literal-block">
474 <p>Default language and language specific defaults:
</p>
475 <pre class=
"literal-block">
option_defaults["language"] = "python"
option_defaults["comment_strings"] = {"python": '# ',
                                      "slang":  '% ',
                                      "c++":    '// ',
                                      "elisp":  ';; '}
482 <p>Recognized file extensions for text and code versions of the source.
483 Used to guess the language from the filename.
</p>
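<p>As a hedged illustration of how such an extension table might be used to guess the language, the sketch below looks up a filename's extension in a copy of the mapping from this section. The helper name <cite>guess_language</cite> and its <cite>default</cite> argument are assumptions made for this example (written for Python 3); they are not part of pylit.py.</p>

```python
import os.path

# Copy of the extension table from this section (values as listed in the source).
code_languages = {".py": "python",
                  ".sl": "slang",
                  ".c": "c++",
                  ".el": "elisp"}


def guess_language(filename, default="python"):
    """Hypothetical helper: map a filename's extension to a language name.

    Falls back to `default` when the extension is not in the table.
    """
    extension = os.path.splitext(filename)[1]
    return code_languages.get(extension, default)


print(guess_language("pylit.py"))
print(guess_language("init.el"))
print(guess_language("README.txt"))
```

<p>Only the last extension is inspected, so a name like <cite>pylit.py.txt</cite> falls through to the default in this sketch.</p>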
484 <pre class=
"literal-block">
option_defaults["code_languages"] = {".py": "python",
                                     ".sl": "slang",
                                     ".c":  "c++",
                                     ".el": "elisp"}
option_defaults["code_extensions"] = option_defaults["code_languages"].keys()
option_defaults["text_extensions"] = [".txt"]
<p>Number of spaces to indent code blocks in the code -&gt; text conversion.<a class="footnote-reference" href="#id1">[2]</a>
</p>
493 <table class=
"docutils footnote" frame=
"void" id=
"id1" rules=
"none">
494 <colgroup><col class=
"label" /><col /></colgroup>
496 <tr><td class=
"label"><a class=
"fn-backref" href=
"#id3" name=
"id1">[
2]
</a></td><td>For the text -&gt; code conversion, the codeindent is determined by the
first recognized code line (leading comment or first indented literal
block of the text source).
</td></tr>
501 <pre class=
"literal-block">
option_defaults["codeindent"] = 2
505 <div class=
"section">
506 <h1><a class=
"toc-backref" href=
"#id12" id=
"classes" name=
"classes">3 Classes
</a></h1>
507 <div class=
"section">
508 <h2><a class=
"toc-backref" href=
"#id13" id=
"pushiterator" name=
"pushiterator">3.1 PushIterator
</a></h2>
509 <p>The PushIterator is a minimal implementation of an iterator with
510 backtracking from the
<a class=
"reference" href=
"http://www.interlink.com.au/anthony/tech/talks/OSCON2005/effective_r27.pdf">Effective Python Programming
</a> OSCON
2005 tutorial by
511 Anthony
Baxter. As the definition is small, it is inlined now. For the full
512 reasoning and documentation see
<a class=
"reference" href=
"iterqueue.py.html">iterqueue.py
</a>.
</p>
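<p>The push-back idea can be tried out in isolation. The following is a minimal sketch ported to Python 3 (the listing in this document targets Python 2, where the method is called <cite>next</cite>); the sample data is made up for illustration.</p>

```python
class PushIterator:
    """Python 3 sketch of an iterator that allows pushing values back."""

    def __init__(self, iterable):
        self.it = iter(iterable)
        self.cache = []  # pushed-back values, returned last-in first-out

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back values first, then continue with the source.
        if self.cache:
            return self.cache.pop()
        return next(self.it)

    def push(self, value):
        self.cache.append(value)


lines = PushIterator(["text\n", "  code\n", "more text\n"])
probed = next(lines)   # look at the first line ...
lines.push(probed)     # ... and put it back for the next consumer
print(list(lines))
```

<p>Because the cache is checked explicitly, a pushed-back empty string is returned as well instead of being skipped.</p>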
513 <pre class=
"literal-block">
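class PushIterator(object):
    def __init__(self, iterable):
        self.it = iter(iterable)
        self.cache = []
    def __iter__(self):
        """Return `self`, as this is already an iterator"""
        return self
    def next(self):
        # test the cache explicitly, so that pushed-back values that
        # evaluate to False (e.g. "") are not skipped
        if self.cache:
            return self.cache.pop()
        return self.it.next()
    def push(self, value):
        self.cache.append(value)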
527 <div class=
"section">
528 <h2><a class=
"toc-backref" href=
"#id14" id=
"converter" name=
"converter">3.2 Converter
</a></h2>
529 <p>The converter classes implement a simple
<cite>state machine
</cite> to separate and
530 transform text and code blocks. For this task, only a very limited parsing
is needed. Using the full-blown
<a class=
"reference" href=
"http://docutils.sourceforge.net/">docutils
</a> rst parser would introduce a
532 large overhead and slow down the conversion.
</p>
533 <p>PyLit's simple parser assumes:
</p>
535 <li>indented literal blocks in a text source are code blocks.
</li>
536 <li>comment lines that start with a matching comment string in a code source
537 are text blocks.
</li>
539 <p>The actual converter classes are derived from
<cite>PyLitConverter
</cite>:
540 <a class=
"reference" href=
"#text2code">Text2Code
</a> converts a text source to executable code, while
<a class=
"reference" href=
"#code2text">Code2Text
</a>
541 does the opposite: converting commented code to a text source.
</p>
542 <p>The
<cite>PyLitConverter
</cite> class inherits the state machine framework
(initialisation, scheduler, iterator interface, ...) from
<cite>SimpleStates
</cite>,
544 overrides the
<tt class=
"docutils literal"><span class=
"pre">__init__
</span></tt> method, and adds auxiliary methods and
545 configuration attributes (options).
</p>
546 <pre class=
"literal-block">
547 class PyLitConverter(SimpleStates):
548 """parent class for `Text2Code` and `Code2Text`, the state machines
549 converting between text source and code source of a literal program.
552 <div class=
"section">
553 <h3><a class=
"toc-backref" href=
"#id15" id=
"data-attributes" name=
"data-attributes">3.2.1 Data attributes
</a></h3>
554 <p>The data attributes are class default values. They will be overridden by
555 matching keyword arguments during class instantiation.
</p>
556 <p><a class=
"reference" href=
"#get-converter">get_converter
</a> and
<a class=
"reference" href=
"#main">main
</a> pass on unused keyword arguments to
557 the instantiation of a converter class. This way, keyword arguments
558 to these functions can be used to customize the converter.
</p>
559 <p>Default language and language specific defaults:
</p>
560 <pre class=
"literal-block">
    language = option_defaults["language"]
    comment_strings = option_defaults["comment_strings"]
<p>Number of spaces to indent code blocks in the code -&gt; text conversion:
</p>
565 <pre class=
"literal-block">
    codeindent = option_defaults["codeindent"]
568 <p>Marker string for the first code block. (Should be a valid rst directive
569 that accepts code on the same line, e.g.
<tt class=
"docutils literal"><span class=
"pre">'..
</span> <span class=
"pre">admonition::'
</span></tt>.) No
570 trailing whitespace needed as indented code follows. Default is a comment
572 <pre class=
"literal-block">
575 <p>Export to the output format stripping text or code blocks:
</p>
576 <pre class=
"literal-block">
579 <p>Initial state:
</p>
580 <pre class=
"literal-block">
584 <div class=
"section">
585 <h3><a class=
"toc-backref" href=
"#id16" id=
"instantiation" name=
"instantiation">3.2.2 Instantiation
</a></h3>
<p>Initializing sets up the
<cite>data
</cite> attribute, an iterable object yielding
lines of the source to convert.<a class="footnote-reference" href="#id2">[1]</a>
</p>
588 <pre class=
"literal-block">
589 def __init__(self, data, **keyw):
590 """data -- iterable data object
591 (list, file, generator, string, ...)
592 **keyw -- all remaining keyword arguments are
593 stored as class attributes
596 <p>As the state handlers need backtracking, the data is wrapped in a
597 <a class=
"reference" href=
"#pushiterator">PushIterator
</a> if it does not already have a
<cite>push
</cite> method:
</p>
598 <pre class=
"literal-block">
599 if hasattr(data, 'push'):
602 self.data = PushIterator(data)
605 <p>Additional keyword arguments are stored as data attributes, overwriting the
607 <pre class=
"literal-block">
608 self.__dict__.update(keyw)
610 <p>The comment string is set to the language's comment string if not given in
611 the keyword arguments:
</p>
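<p>The pattern used in this section (class-level defaults that keyword arguments override, plus a language-dependent fallback for the comment string) can be sketched on its own. The class <cite>ConverterSketch</cite> below is a hypothetical Python 3 stand-in, not the converter class itself; its default values are copied from the Customization section.</p>

```python
class ConverterSketch:
    """Hypothetical stand-in for a converter class (not pylit's own code)."""

    # Class-level defaults, overridden per instance by keyword arguments.
    language = "python"
    comment_strings = {"python": "# ", "slang": "% ",
                       "c++": "// ", "elisp": ";; "}
    codeindent = 2

    def __init__(self, data, **keyw):
        self.data = data
        # Instance attributes shadow the class defaults of the same name.
        self.__dict__.update(keyw)
        # Fall back to the language's comment string if none was given.
        if not getattr(self, "comment_string", None):
            self.comment_string = self.comment_strings[self.language]


default = ConverterSketch([])
elisp = ConverterSketch([], language="elisp")
custom = ConverterSketch([], comment_string="## ", codeindent=4)
print(repr(default.comment_string), repr(elisp.comment_string),
      repr(custom.comment_string), custom.codeindent)
```

<p>Updating the instance dictionary leaves the class attributes untouched, so other instances keep the defaults.</p>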
612 <pre class=
"literal-block">
        if not hasattr(self, "comment_string") or not self.comment_string:
            self.comment_string = self.comment_strings[self.language]
616 <table class=
"docutils footnote" frame=
"void" id=
"id2" rules=
"none">
617 <colgroup><col class=
"label" /><col /></colgroup>
619 <tr><td class=
"label"><a name=
"id2">[
1]
</a></td><td><p class=
"first">The most common choice of data is a
<cite>file
</cite> object with the text
621 <p class=
"last">To convert a string into a suitable object, use its splitlines method
622 with the optional
<cite>keepends
</cite> argument set to True.
</p>
627 <div class=
"section">
628 <h3><a class=
"toc-backref" href=
"#id17" id=
"converter-str" name=
"converter-str">3.2.3 Converter.__str__
</a></h3>
629 <p>Return converted data as string:
</p>
630 <pre class=
"literal-block">
    def __str__(self):
        blocks = ["".join(block) for block in self()]
        return "".join(blocks)
636 <div class=
"section">
637 <h3><a class=
"toc-backref" href=
"#id18" id=
"converter-get-indent" name=
"converter-get-indent">3.2.4 Converter.get_indent
</a></h3>
638 <p>Return the number of leading spaces in
<cite>string
</cite> after expanding tabs
</p>
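<p>To see the effect of tab expansion on the computed indentation, here is a small stand-alone sketch (a plain function rather than a method, written for Python 3). The sample strings are made up for illustration.</p>

```python
def get_indent(string):
    """Return the number of leading whitespace columns after tab expansion."""
    line = string.expandtabs()  # tab stops every 8 columns by default
    return len(line) - len(line.lstrip())


for sample in ("    x = 1\n", "\tx = 1\n", "  \tx = 1\n", "x = 1\n"):
    print(repr(sample), get_indent(sample))
```

<p>A tab that follows two spaces still ends at column 8, so the second and third samples report the same indentation.</p>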
639 <pre class=
"literal-block">
    def get_indent(self, string):
        """Return the indentation of `string`.
        """
        line = string.expandtabs()
        return len(line) - len(line.lstrip())
647 <div class=
"section">
648 <h3><a class=
"toc-backref" href=
"#id19" id=
"converter-ensure-trailing-blank-line" name=
"converter-ensure-trailing-blank-line">3.2.5 Converter.ensure_trailing_blank_line
</a></h3>
649 <p>Ensure there is a blank line as last element of the list
<cite>lines
</cite>:
</p>
650 <pre class=
"literal-block">
    def ensure_trailing_blank_line(self, lines, next_line):
        if lines[-1].lstrip():
            sys.stderr.write("\nWarning: inserted blank line between\n %s %s"
                             % (lines[-1], next_line))
            lines.append("\n")
660 <div class=
"section">
661 <h3><a class=
"toc-backref" href=
"#id20" id=
"converter-collect-blocks" name=
"converter-collect-blocks">3.2.6 Converter.collect_blocks
</a></h3>
662 <pre class=
"literal-block">
663 def collect_blocks(self):
664 """collect lines in a list
        return list for each block of lines (paragraph) separated by a
667 blank line (whitespace only)
670 for line in self.data:
672 if not line.rstrip():
679 <div class=
"section">
680 <h2><a class=
"toc-backref" href=
"#id21" id=
"text2code" name=
"text2code">3.3 Text2Code
</a></h2>
681 <p>The
<cite>Text2Code
</cite> class separates code blocks (indented literal blocks) from
682 reStructured text. Code blocks are unindented, text is commented (or
683 filtered, if the
<tt class=
"docutils literal"><span class=
"pre">strip
</span></tt> option is True).
</p>
684 <p>Only
<cite>indented literal blocks
</cite> are extracted.
<cite>quoted literal blocks
</cite> and
685 <cite>pydoc blocks
</cite> are treated as text. This allows the easy inclusion of
686 examples:
<a class=
"footnote-reference" href=
"#id1" id=
"id3" name=
"id3">[
2]
</a></p>
688 <pre class=
"doctest-block">
693 <table class=
"docutils footnote" frame=
"void" id=
"id4" rules=
"none">
694 <colgroup><col class=
"label" /><col /></colgroup>
696 <tr><td class=
"label"><a name=
"id4">[
3]
</a></td><td>Note that there is no double colon before the doctest block in
697 the text source.
</td></tr>
700 <p>The state handlers are implemented as generators. Iterating over a
701 <cite>Text2Code
</cite> instance initializes them to generate iterators for
702 the respective states (see
<tt class=
"docutils literal"><span class=
"pre">simplestates.py
</span></tt>).
</p>
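<p>The conversion described at the start of this section (text is commented, indented literal blocks are unindented) can be sketched in a few lines. The function below is a deliberately simplified, hypothetical Python 3 version: it uses a flag instead of generator-based state handlers, does not expand tabs, and does not handle headers or stripping. It is not the algorithm of the <cite>Text2Code</cite> class.</p>

```python
def text2code_sketch(lines, comment_string="# "):
    """Simplified, hypothetical text -> code pass (not pylit's algorithm).

    Text lines are prefixed with `comment_string`; after a paragraph ending
    in '::', more deeply indented lines are treated as code and unindented
    by the indentation of the first code line found.
    """
    out = []
    in_code = False
    text_indent = 0
    code_indent = None
    for line in lines:
        indent = len(line) - len(line.lstrip())
        if in_code and line.strip() and indent <= text_indent:
            in_code = False               # non-blank, not more indented: text
        if in_code:
            if not line.strip():
                out.append("\n")          # pass on blank lines
                continue
            if code_indent is None:
                code_indent = indent      # indent of the first code line
            out.append(line[code_indent:])
        else:
            out.append(comment_string + line)
            if (line.rstrip().endswith("::")
                    and not line.lstrip().startswith("..")):
                in_code = True
                text_indent = indent
    return out


source = ["Some text::\n", "\n", "  x = 1\n", "  if x:\n",
          "      x += 1\n", "\n", "More text.\n"]
print("".join(text2code_sketch(source)), end="")
```

<p>The nested line keeps its relative indentation, because only the indentation of the first code line is removed.</p>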
703 <pre class=
"literal-block">
704 class Text2Code(PyLitConverter):
705 """Convert a (reStructured) text source to code source
<p>INIT: call the parent class's init method.
</p>
<p>If the
<cite>strip
</cite> argument is true, replace the
<cite>__iter__
</cite> method
with a special one that drops
"spurious" blocks:
</p>
711 <pre class=
"literal-block">
    def __init__(self, data, **keyw):
        PyLitConverter.__init__(self, data, **keyw)
        if getattr(self, "strip", False):
            self.__iter__ = self.iter_strip
717 <div class=
"section">
718 <h3><a class=
"toc-backref" href=
"#id22" id=
"text2code-header" name=
"text2code-header">3.3.1 Text2Code.header
</a></h3>
719 <p>Convert the header (leading rst comment block) to code:
</p>
720 <pre class=
"literal-block">
722 """Convert header (comment) to code
"""
723 line = self.data_iterator.next()
<p>Test the first line for an rst comment. (We need to do this explicitly here, as
the code handler will only recognize the start of a text block if a line
starting with "matching comment" is preceded by an empty line. However, we
have to take care of the case where the first line is a "text line".)
</p>
729 <p>Which variant is better?
</p>
731 <li><p class=
"first">starts with comment marker and has
732 something behind the comment on the first line:
</p>
733 <pre class=
"literal-block">
        # if line.startswith("..") and len(line.rstrip()) &gt; 2:
737 <li><p class=
"first">Convert any leading comment to code:
</p>
738 <pre class=
"literal-block">
739 if line.startswith(self.header_string):
743 <p>Strip leading comment string (typically added by
<cite>Code2Text.header
</cite>) and
744 return the result of processing the data with the code handler:
</p>
745 <pre class=
"literal-block">
746 self.data_iterator.push(line.replace(self.header_string,
"",
1))
749 <p>No header code found: Push back first non-header line and set state to
750 "text
":
</p>
751 <pre class=
"literal-block">
752 self.data_iterator.push(line)
757 <div class=
"section">
758 <h3><a class=
"toc-backref" href=
"#id23" id=
"text2code-text-handler-generator" name=
"text2code-text-handler-generator">3.3.2 Text2Code.text_handler_generator
</a></h3>
759 <p>The 'text' handler processes everything that is not an indented literal
760 comment. Text is quoted with
<cite>self.comment_string
</cite> or filtered (with
762 <p>It is implemented as a generator function that acts on the
<cite>data
</cite> iterator
763 and yields text blocks.
</p>
764 <p>Declaration and initialization:
</p>
765 <pre class=
"literal-block">
766 def text_handler_generator(self):
767 """Convert text blocks from rst to comment
771 <p>Iterate over the data_iterator (which yields the data lines):
</p>
772 <pre class=
"literal-block">
773 for line in self.data_iterator:
774 # print
"Text: '%s'
"%line
776 <p>Default action: add comment string and collect in
<cite>lines
</cite> list:
</p>
777 <pre class=
"literal-block">
778 lines.append(self.comment_string + line)
780 <p>Test for the end of the text block: a line that ends with
<cite>::
</cite> but is neither
781 a comment nor a directive:
</p>
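<p>The end-of-text-block test can be isolated as a small predicate. The function name <cite>ends_text_block</cite> is an assumption made for this Python 3 sketch; in the listing the condition is written inline in the handler.</p>

```python
def ends_text_block(line):
    """True if `line` ends with '::' and does not start with '..'.

    Lines starting with '..' are rst comments or directives, which may
    end in '::' without introducing a literal block of code.
    """
    return (line.rstrip().endswith("::")
            and not line.lstrip().startswith(".."))


for sample in ("An example::\n", ".. note::\n", "Plain text.\n", "  Indented::  \n"):
    print(repr(sample), ends_text_block(sample))
```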
782 <pre class=
"literal-block">
            if (line.rstrip().endswith("::")
                and not line.lstrip().startswith("..")):
786 <p>End of text block is detected, now:
</p>
787 <p>set the current text indent level (needed by the code handler to find the
788 end of code block) and set the state to
"code
" (i.e. the next call of
789 <cite>self.next
</cite> goes to the code handler):
</p>
790 <pre class=
"literal-block">
791 self._textindent = self.get_indent(line)
<p>Ensure a trailing blank line (which is the paragraph separator in
reStructuredText). Look at the next line: if it is blank, fine; if it is
not blank, push it back (it should be code) and add a line by calling the
<cite>ensure_trailing_blank_line
</cite> method (which also issues a warning):
</p>
798 <pre class=
"literal-block">
799 line = self.data_iterator.next()
801 self.data_iterator.push(line) # push back
802 self.ensure_trailing_blank_line(lines, line)
<p>Now yield and reset the lines. (There used to be a function call here to remove a
literal marker (if on a line by itself) to shorten the comment. However,
this behaviour was removed, as the resulting difference in line numbers leads
to misleading error messages in doctests):
</p>
810 <pre class=
"literal-block">
811 #remove_literal_marker(lines)
815 <p>End of data: if we
"fall off
" the iteration loop, just join and return the
817 <pre class=
"literal-block">
821 <div class=
"section">
822 <h3><a class=
"toc-backref" href=
"#id24" id=
"text2code-code-handler-generator" name=
"text2code-code-handler-generator">3.3.3 Text2Code.code_handler_generator
</a></h3>
<p>The
<cite>code
</cite> handler is called when a literal block marker is encountered. It
824 returns a code block (indented literal block), removing leading whitespace
825 up to the indentation of the first code line in the file (this deviation
826 from docutils behaviour allows indented blocks of Python code).
</p>
827 <p>As the code handler detects the switch to
"text
" state by looking at
828 the line indents, it needs to push back the last probed data token. I.e.
829 the data_iterator must support a
<cite>push
</cite> method. (This is the
830 reason for the use of the PushIterator class in
<cite>__init__
</cite>.)
</p>
831 <pre class=
"literal-block">
832 def code_handler_generator(self):
833 """Convert indented literal blocks to source code
836 codeindent = None # indent of first non-blank code line, set below
837 indent_string =
"" # leading whitespace chars ...
839 <p>Iterate over the lines in the input data:
</p>
840 <pre class=
"literal-block">
841 for line in self.data_iterator:
842 # print
"Code: '%s'
"%line
844 <p>Pass on blank lines (no test for end of code block needed|possible):
</p>
845 <pre class=
"literal-block">
846 if not line.rstrip():
847 lines.append(line.replace(indent_string,
"",
1))
850 <p>Test for end of code block:
</p>
851 <p>A literal block ends with the first less indented, nonblank line.
852 <cite>self._textindent
</cite> is set by the text handler to the indent of the
853 preceding paragraph.
</p>
854 <p>To prevent problems with different tabulator settings, hard tabs in code
855 lines are expanded with the
<cite>expandtabs
</cite> string method when calculating the
indentation (i.e. each tab is replaced by spaces up to the next multiple of
8 columns, by default).
</p>
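<p>The end-of-code-block rule described here (blank lines are passed on; the first non-blank line that is not indented more than the preceding paragraph ends the block) can be sketched as a stand-alone predicate. The function name <cite>is_end_of_code_block</cite> and its <cite>textindent</cite> parameter are assumptions made for this Python 3 sketch.</p>

```python
def is_end_of_code_block(line, textindent):
    """Hypothetical helper: does `line` end the current literal block?"""
    if not line.rstrip():
        return False                  # blank lines never end a block
    expanded = line.expandtabs()      # hard tabs -> tab stops every 8 columns
    indent = len(expanded) - len(expanded.lstrip())
    return indent <= textindent


print(is_end_of_code_block("More text.\n", 0))
print(is_end_of_code_block("  x = 1\n", 0))
print(is_end_of_code_block("\n", 0))
print(is_end_of_code_block("\tx = 1\n", 4))
```

<p>With tab expansion, a line indented by one hard tab counts as more indented than a paragraph indented by four spaces, whatever the editor's tab width.</p>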
857 <pre class=
"literal-block">
            if self.get_indent(line) &lt;= self._textindent:
860 self.data_iterator.push(line)
862 # append blank line (if not already present)
863 self.ensure_trailing_blank_line(lines, line)
865 # reset list of lines
869 <p>OK, we are sure now that the current line is neither blank nor a text line.
</p>
870 <p>If still unset, determine the code indentation from first non-blank code
872 <pre class=
"literal-block">
873 if codeindent is None and line.lstrip():
874 codeindent = self.get_indent(line)
875 indent_string = line[:codeindent]
877 <p>Append unindented line to lines cache (but check if we can safely unindent
879 <pre class=
"literal-block">
880 if not line.startswith(indent_string):
                raise ValueError("cannot unindent line %r,\n" % line
                                 + " does not start with code indent string %r"
                                 % indent_string)
884 lines.append(line[codeindent:])
886 <p>No more lines in the input data: just return what we have:
</p>
887 <pre class=
"literal-block">
891 <div class=
"section">
<h3><a class="toc-backref" href="#id25" id="txt2code-remove-literal-marker" name="txt2code-remove-literal-marker">3.3.4 Text2Code.remove_literal_marker</a></h3>
893 <p>Remove literal marker (::) in
"expanded form
" i.e. in a paragraph on its own.
</p>
894 <p>While cleaning up the code source, it leads to confusion for doctest and
895 searches (e.g. grep) as line-numbers between text and code source will
897 <pre class=
"literal-block">
def remove_literal_marker(self, lines):
901 if (lines[-
3].strip() == self.comment_string.strip()
902 and lines[-
2].strip() == self.comment_string + '::'):
908 <div class=
"section">
909 <h3><a class=
"toc-backref" href=
"#id26" id=
"text2code-iter-strip" name=
"text2code-iter-strip">3.3.5 Text2Code.iter_strip
</a></h3>
910 <p>Modification of the
<cite>simplestates.__iter__
</cite> method that will replace it when
911 the
<cite>strip
</cite> keyword argument is
<cite>True
</cite> during class instantiation:
</p>
912 <p>Iterate over class instances dropping text blocks:
</p>
913 <pre class=
"literal-block">
914 def iter_strip(self):
915 """Generate and return an iterator dropping text blocks
917 self.data_iterator = self.data
918 self._initialize_state_generators()
920 yield getattr(self, self.state)()
921 getattr(self, self.state)() # drop text block
925 <div class=
"section">
926 <h2><a class=
"toc-backref" href=
"#id27" id=
"code2text" name=
"code2text">3.4 Code2Text
</a></h2>
927 <p>The
<cite>Code2Text
</cite> class does the opposite of
<a class=
"reference" href=
"#text2code">Text2Code
</a> -- it processes
928 valid source code, extracts comments, and puts non-commented code in literal
<p>The class is derived from the PyLitConverter state machine and adds an
<cite>__iter__</cite> method as well as handlers for the "text" and "code" states.</p>
932 <pre class=
"literal-block">
933 class Code2Text(PyLitConverter):
934 """Convert code source to text source
937 <div class=
"section">
938 <h3><a class=
"toc-backref" href=
"#id28" id=
"code2text-iter" name=
"code2text-iter">3.4.1 Code2Text.__iter__
</a></h3>
939 <pre class=
"literal-block">
<p>If the last text block does not end with a code marker (by default, the
literal-block marker <tt class="docutils literal"><span class="pre">::</span></tt>), the <cite>text</cite> method will set <cite>code_marker</cite> to
a paragraph that will start the next code block. It is yielded if non-empty
at a text-code transition. If there is no preceding text block, <cite>code_marker</cite>
contains the <cite>header_string</cite>:</p>
947 <pre class=
"literal-block">
949 self.code_marker = []
951 self.code_marker = [self.header_string]
953 for block in self.collect_blocks():
955 <p>Test the state of the block with
<a class=
"reference" href=
"#code2text-block-is-text">Code2Text.block_is_text
</a>, return it
956 processed with the matching handler:
</p>
957 <pre class=
"literal-block">
958 if self.block_is_text(block):
959 self.state =
"text
"
961 if self.state !=
"code
" and self.code_marker:
962 yield self.code_marker
963 self.state =
"code
"
964 yield getattr(self, self.state)(block)
967 <div class=
"section">
968 <h3><a class=
"toc-backref" href=
"#id29" id=
"header-state" name=
"header-state">3.4.2 "header
" state
</a></h3>
969 <p>Sometimes code needs to remain on the first line(s) of the document to be
970 valid. The most common example is the
"shebang
" line that tells a POSIX
971 shell how to process an executable file:
</p>
972 <pre class=
"literal-block">
973 #!/usr/bin/env python
975 <p>In Python, the
<tt class=
"docutils literal"><span class=
"pre">#
</span> <span class=
"pre">-*-
</span> <span class=
"pre">coding:
</span> <span class=
"pre">iso-
8859-
1</span> <span class=
"pre">-*-
</span></tt> line must occur in the first or second line of the file,
before any code.</p>
<p>If we want to keep the line numbers in sync for text and code source, the
reStructuredText markup for these header lines must start at the same line
as the first header line. Therefore, header lines cannot be marked as a
literal block (this would require the <tt class="docutils literal"><span class="pre">::</span></tt> and an empty line above the code).</p>
<p>On the other hand, a comment may start on the same line as the comment marker, and it
includes subsequent indented lines. Comments are visible in the reStructuredText
source but hidden in the pretty-printed output.</p>
984 <p>With a header converted to comment in the text source, everything before the
985 first text block (i.e. before the first paragraph using the matching comment
986 string) will be hidden away (in HTML or PDF output).
</p>
987 <p>This seems a good compromise, the advantages
</p>
989 <li>line numbers are kept
</li>
<li>the "normal" code conversion rules (indent/unindent by <cite>codeindent</cite>) apply</li>
991 <li>greater flexibility: you can hide a repeating header in a project
992 consisting of many source files.
</li>
<p>outweigh the disadvantages</p>
<li>it may come as a surprise if a part of the file is not "printed",</li>
<li>one more syntax element to learn for reStructuredText newcomers starting with pylit
(however, starting from the code source, this will be auto-generated)</li>
1000 <p>In the case that there is no matching comment at all, the complete code
1001 source will become a comment -- however, in this case it is not very likely
1002 the source is a literate document anyway.
</p>
1003 <p>If needed for the documentation, it is possible to repeat the header in (or
1004 after) the first text block, e.g. with a
<cite>line block
</cite> in a
<cite>block quote
</cite>:
</p>
1006 <div class=
"line-block">
1007 <div class=
"line"><tt class=
"docutils literal"><span class=
"pre">#!/usr/bin/env
</span> <span class=
"pre">python
</span></tt></div>
1008 <div class=
"line"><tt class=
"docutils literal"><span class=
"pre">#
</span> <span class=
"pre">-*-
</span> <span class=
"pre">coding:
</span> <span class=
"pre">iso-
8859-
1</span> <span class=
"pre">-*-
</span></tt></div>
1011 <p>The current implementation represents the header state by the setting of
1012 <cite>code_marker
</cite> to
<tt class=
"docutils literal"><span class=
"pre">[self.header_string]
</span></tt>. The first non-empty text block
1013 will overwrite this setting.
</p>
1015 <div class=
"section">
1016 <h3><a class=
"toc-backref" href=
"#id30" id=
"code2text-text" name=
"code2text-text">3.4.3 Code2Text.text
</a></h3>
1017 <p>The
<em>text state handler
</em> converts a comment to a text block by stripping
1018 the leading
<cite>comment string
</cite> from every line:
</p>
1019 <pre class=
"literal-block">
1020 def text(self, lines):
1021 """Uncomment text blocks in source code
1024 lines = [line.replace(self.comment_string,
"",
1) for line in lines]
1026 lines = [re.sub(
"^
"+self.comment_string.rstrip(),
"", line)
1029 <p>If the code block is stripped, the literal marker would lead to an error
1030 when the text is converted with docutils. Replace it with
1031 <a class=
"reference" href=
"#code2text-strip-literal-marker">Code2Text.strip_literal_marker
</a>:
</p>
1032 <pre class=
"literal-block">
1034 self.strip_literal_marker(lines)
1035 self.code_marker = []
<p>Check for a code block marker (double colon) at the end of the text block and
update the <cite>code_marker</cite> attribute. (The <cite>code_marker</cite> is yielded by
<a class="reference" href="#code2text-iter">Code2Text.__iter__</a> at a text-to-code transition if it is not empty):</p>
1040 <pre class=
"literal-block">
1041 elif len(lines)
>1:
1042 if lines[-
2].rstrip().endswith(
"::
"):
1043 self.code_marker = []
1045 self.code_marker = [
"::\n
",
"\n
"]
1047 <p>Return the text block to the calling function:
</p>
1048 <pre class=
"literal-block">
1052 <div class=
"section">
1053 <h3><a class=
"toc-backref" href=
"#id31" id=
"code2text-code" name=
"code2text-code">3.4.4 Code2Text.code
</a></h3>
<p>The <cite>code</cite> method is called on non-commented code. Code is returned as
an indented literal block (or filtered, if <tt class="docutils literal"><span class="pre">self.strip</span> <span class="pre">==</span> <span class="pre">True</span></tt>). The amount
of the code indentation is controlled by <cite>self.codeindent</cite> (default 2).</p>
1057 <pre class=
"literal-block">
1058 def code(self, lines):
1059 """Indent lines or strip if `strip` == `True`
1061 if self.strip == True:
1064 return [
" "*self.codeindent + line for line in lines]
1067 <div class=
"section">
1068 <h3><a class=
"toc-backref" href=
"#id32" id=
"code2text-block-is-text" name=
"code2text-block-is-text">3.4.5 Code2Text.block_is_text
</a></h3>
<p>A paragraph is a text block if every non-blank line starts with a matching
comment string (the test includes the trailing whitespace of the comment string, except for commented blank lines).</p>
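<p>The rule can be sketched as a standalone function. This is a simplified illustration, not the method itself: the free function and the <cite>comment_string</cite> parameter (defaulting to the Python comment string <tt class="docutils literal"><span class="pre">'# '</span></tt>) are assumptions made so that the example runs on its own:</p>

```python
def block_is_text(block, comment_string="# "):
    """Return True if every non-blank line of `block` is a comment line."""
    bare_marker = comment_string.rstrip()
    for line in block:
        if not line.strip():
            continue  # blank lines never decide the block type
        if line.startswith(comment_string):
            continue  # regular comment line, e.g. "# some text"
        if line.rstrip() == bare_marker:
            continue  # commented blank line: the bare "#" is enough
        return False  # anything else is code
    return True


text_block = ["# A paragraph\n", "#\n", "# continued here.\n"]
code_block = ["# a comment inside code\n", "x = 1\n"]
```

<p>Here <cite>text_block</cite> is recognised as text, while <cite>code_block</cite> is not, because one of its lines lacks the comment string. A line such as <tt class="docutils literal"><span class="pre">#no-space</span></tt> also counts as code, since the whitespace is part of the comment string.</p>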
1071 <pre class=
"literal-block">
1072 def block_is_text(self, block):
1075 and not line.startswith(self.comment_string)
1076 and line.rstrip() != self.comment_string.rstrip()):
1081 <div class=
"section">
1082 <h3><a class=
"toc-backref" href=
"#id33" id=
"code2text-strip-literal-marker" name=
"code2text-strip-literal-marker">3.4.6 Code2Text.strip_literal_marker
</a></h3>
<p>Replace the literal marker following the equivalent of the docutils replacement rules:</p>
1085 <li>strip
<cite>::
</cite>-line (and preceding blank line) if on a line on its own
</li>
1086 <li>strip
<cite>::
</cite> if it is preceded by whitespace.
</li>
1087 <li>convert
<cite>::
</cite> to a single colon if preceded by text
</li>
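<p>The three rules can be tried out with the following standalone sketch. It is an illustration under assumptions: it is written as a free function rather than a method, and it splits the line with <cite>str.rpartition</cite>, which is a choice made for this example:</p>

```python
def strip_literal_marker(lines):
    """Apply the three replacement rules to the last text line, in place."""
    if len(lines) in (0, 1):
        return  # no text line before the trailing blank line
    head, marker, tail = lines[-2].rpartition("::")
    if not marker:
        return  # no literal marker in this line
    if not head.strip():
        # rule 1: marker on a line of its own, drop that line
        del lines[-2]
        # ... together with a blank line directly before it
        if lines[:-1] and not lines[-2].strip():
            del lines[-2]
    elif head != head.rstrip():
        # rule 2: marker preceded by whitespace, drop marker and whitespace
        lines[-2] = head.rstrip() + tail
    else:
        # rule 3: marker preceded by text, keep a single colon
        lines[-2] = head + ":" + tail


lines = ["An example::\n", "\n"]
strip_literal_marker(lines)
# lines is now ["An example:\n", "\n"]
```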
<p><cite>lines</cite> should be a list of text lines (with a trailing blank line).
It is modified in-place:</p>
1091 <pre class=
"literal-block">
1092 def strip_literal_marker(self, lines):
except IndexError: # len(lines) < 2
1098 # split at rightmost '::'
1100 (head, tail) = line.rsplit('::',
1)
1101 except ValueError: # only one part (no '::')
1104 # '::' on an extra line
1105 if not head.strip():
1107 # delete preceding line if it is blank
1108 if len(lines)
>=
2 and not lines[-
2].lstrip():
1110 # '::' follows whitespace
1111 elif head.rstrip()
< head:
1112 head = head.rstrip()
1113 lines[-
2] =
"".join((head, tail))
1116 lines[-
2] =
":
".join((head, tail))
1121 <div class=
"section">
1122 <h1><a class=
"toc-backref" href=
"#id34" id=
"command-line-use" name=
"command-line-use">4 Command line use
</a></h1>
1123 <p>Using this script from the command line will convert a file according to its
1124 extension. This default can be overridden by a couple of options.
</p>
1125 <div class=
"section">
1126 <h2><a class=
"toc-backref" href=
"#id35" id=
"dual-source-handling" name=
"dual-source-handling">4.1 Dual source handling
</a></h2>
1127 <div class=
"section">
1128 <h3><a class=
"toc-backref" href=
"#id36" id=
"how-to-determine-which-source-is-up-to-date" name=
"how-to-determine-which-source-is-up-to-date">4.1.1 How to determine which source is up-to-date?
</a></h3>
<li><p class="first">set modification date of <cite>outfile</cite> to the one of <cite>infile</cite></p>
1131 <p>Points out that the source files are 'synchronized'.
</p>
1133 <li><p class=
"first">Are there problems to expect from
"backdating
" a file? Which?
</p>
1134 <p>Looking at
<a class=
"reference" href=
"http://www.unix.com/showthread.php?t=20526">http://www.unix.com/showthread.php?t=
20526</a>, it seems
1135 perfectly legal to set
<cite>mtime
</cite> (while leaving
<cite>ctime
</cite>) as
<cite>mtime
</cite> is a
1136 description of the
"actuality
" of the data in the file.
</p>
1138 <li><p class=
"first">Should this become a default or an option?
</p>
1142 <li><p class=
"first">alternatively move input file to a backup copy (with option:
<cite>--replace
</cite>)
</p>
1144 <li><p class=
"first">check modification date before overwriting
1145 (with option:
<cite>--overwrite=update
</cite>)
</p>
1147 <li><p class=
"first">check modification date before editing (implemented as
<a class=
"reference" href=
"http://www.jedsoft.org/jed/">Jed editor
</a>
1148 function
<cite>pylit_check()
</cite> in
<a class=
"reference" href=
"http://jedmodes.sourceforge.net/mode/pylit/">pylit.sl
</a>)
</p>
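<p>The modification-date check behind <cite>--overwrite=update</cite> can be sketched as follows. This is a minimal illustration: the use of <tt class="docutils literal"><span class="pre">-1</span></tt> as the time stamp of a missing file is an assumption of this example, chosen so that non-existing files count as oldest:</p>

```python
import os


def is_newer(path1, path2):
    """Return True if `path1` was modified later than `path2`.

    Returns None if both modification times are equal.
    Missing files are treated as older than any existing file.
    """
    def mtime(path):
        try:
            return os.path.getmtime(path)
        except OSError:
            return -1  # assumption: a missing file is "oldest"

    mtime1, mtime2 = mtime(path1), mtime(path2)
    if mtime1 == mtime2:
        return None
    return mtime1 > mtime2
```

<p>Returning <cite>None</cite> for equal time stamps keeps the result false in a boolean context while still allowing an explicit test for "synchronized" files.</p>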
1152 <div class=
"section">
1153 <h3><a class=
"toc-backref" href=
"#id37" id=
"recognised-filename-extensions" name=
"recognised-filename-extensions">4.1.2 Recognised Filename Extensions
</a></h3>
1154 <p>Finding an easy to remember, unused filename extension is not easy.
</p>
1155 <dl class=
"docutils">
1157 <dd>a double extension (similar to .tar.gz, say) seems most appropriate
1158 (at least on UNIX). However, it fails on FAT16 filesystems.
1159 The same scheme can be used for c.txt, p.txt and the like.
</dd>
1161 <dd>is recognised as extension by os.path.splitext but also fails on FAT16
</dd>
<dd>(PYthon Text) is used by the Python test interpreter
<a class="reference" href="http://www.zetadev.com/software/pytest/">pytest</a></dd>
1166 <dd>was even mentioned as extension for
"literate Python
" files in an
1167 email exchange (
<a class=
"reference" href=
"http://www.python.org/tim_one/000115.html">http://www.python.org/tim_one/
000115.html
</a>) but
1168 subsequently used for Python libraries.
</dd>
<dd>seems to be free (according to a Google search, "lpy" is the name of a Python
code pretty printer, but this should not pose a problem).</dd>
1173 <dd>seems to be free as well.
</dd>
<p>Instead of defining a new extension for "pylit" literate programs,
by default <tt class="docutils literal"><span class="pre">.txt</span></tt> will be appended for literate code and stripped by
the conversion to executable code, i.e. for a program foo:</p>
1179 <li>the literate source is called
<tt class=
"docutils literal"><span class=
"pre">foo.py.txt
</span></tt></li>
1180 <li>the html rendering is called
<tt class=
"docutils literal"><span class=
"pre">foo.py.html
</span></tt></li>
1181 <li>the python source is called
<tt class=
"docutils literal"><span class=
"pre">foo.py
</span></tt></li>
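<p>The naming scheme above can be sketched with a few small helpers. The function names and the <cite>TEXT_EXTENSION</cite> constant are hypothetical and exist only in this example:</p>

```python
import os.path

TEXT_EXTENSION = ".txt"  # appended to the name of the code source


def literate_name(code_file):
    """Name of the literate (text) source for a code file."""
    return code_file + TEXT_EXTENSION


def code_name(text_file):
    """Name of the code source, obtained by stripping the text extension."""
    base, ext = os.path.splitext(text_file)
    if ext != TEXT_EXTENSION:
        raise ValueError("not a literate source: %r" % text_file)
    return base


def html_name(text_file):
    """Name of the HTML rendering of a literate source."""
    return code_name(text_file) + ".html"


names = (literate_name("foo.py"), code_name("foo.py.txt"),
         html_name("foo.py.txt"))
# names == ("foo.py.txt", "foo.py", "foo.py.html")
```

<p>Since <cite>os.path.splitext</cite> only splits off the last extension, the double extension keeps the language of the code source recognisable.</p>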
1185 <div class=
"section">
1186 <h2><a class=
"toc-backref" href=
"#id38" id=
"optionvalues" name=
"optionvalues">4.2 OptionValues
</a></h2>
1187 <p>For use as keyword arguments, it is handy to have the options
1188 in a dictionary. The following class adds an
<cite>as_dict
</cite> method
1189 to
<cite>optparse.Values
</cite>:
</p>
1190 <pre class=
"literal-block">
1191 class OptionValues(optparse.Values):
1193 """Return options as dictionary object
"""
1194 return dict([(option, getattr(self, option)) for option in dir(self)
1195 if option not in dir(OptionValues)
1196 and option is not None
1200 <div class=
"section">
1201 <h2><a class=
"toc-backref" href=
"#id39" id=
"pylitoptions" name=
"pylitoptions">4.3 PylitOptions
</a></h2>
<p>Options are stored in the values attribute of the <cite>PylitOptions</cite> class.
It is initialized with default values and parsed command line options (and
arguments). This scheme allows easy customization by code importing the
<cite>pylit</cite> module.</p>
1206 <pre class=
"literal-block">
1207 class PylitOptions(object):
1208 """Storage and handling of program options
1211 <div class=
"section">
1212 <h3><a class=
"toc-backref" href=
"#id40" id=
"id5" name=
"id5">4.3.1 Instantiation
</a></h3>
1213 <p>Instantiation sets up an OptionParser and initializes it with pylit's
1214 command line options and
<cite>default_values
</cite>. It then updates the values based
1215 on command line options and sensible defaults:
</p>
1216 <pre class=
"literal-block">
1217 def __init__(self, args=sys.argv[
1:], **keyw):
1218 """Set up an `OptionParser` instance and parse and complete arguments
1220 p = optparse.OptionParser(usage=main.__doc__, version=_version)
1221 # set defaults (from modules option_defaults dict and keyword args)
1222 defaults = dict(option_defaults) # copy module-level defaults
1223 defaults.update(keyw)
1224 p.set_defaults(**defaults)
1226 p.add_option(
"-c
",
"--code2txt
", dest=
"txt2code
", action=
"store_false
",
1227 help=
"convert code to reStructured text
")
1228 p.add_option(
"--comment-string
", dest=
"comment_string
",
1229 help=
"text block marker (default '# ' (for Python))
" )
1230 p.add_option(
"-d
",
"--diff
", action=
"store_true
",
1231 help=
"test for differences to existing file
")
1232 p.add_option(
"--doctest
", action=
"store_true
",
1233 help=
"run doctest.testfile() on the text version
")
1234 p.add_option(
"-e
",
"--execute
", action=
"store_true
",
1235 help=
"execute code (Python only)
")
1236 p.add_option(
"-f
",
"--infile
",
help="input file name ('-' for stdin)" )
1238 p.add_option(
"--language
", action=
"store
",
1239 choices = option_defaults[
"code_languages
"].values(),
1240 help=
"use LANGUAGE native comment style
")
1241 p.add_option(
"--overwrite
", action=
"store
",
1242 choices = [
"yes
",
"update
",
"no
"],
1243 help=
"overwrite output file (default 'update')
")
1244 p.add_option(
"-o
",
"--outfile
",
1245 help=
"output file name ('-' for stdout)
" )
1246 p.add_option(
"--replace
", action=
"store_true
",
1247 help=
"move infile to a backup copy (appending '~')
")
1248 p.add_option(
"-s
",
"--strip
", action=
"store_true
",
1249 help=
"export by stripping text or code
")
1250 p.add_option(
"-t
",
"--txt2code
", action=
"store_true
",
1251 help=
"convert reStructured text to code
")
1254 # parse to fill a self.Values instance
1255 self.values = self.parse_args(args)
1256 # complete with context-sensitive defaults
1257 self.values = self.complete_values(self.values)
1260 <div class=
"section">
1261 <h3><a class=
"toc-backref" href=
"#id41" id=
"calling" name=
"calling">4.3.2 Calling
</a></h3>
1262 <p>"Calling
" an instance updates the option values based on command line
1263 arguments and default values and does a completion of the options based on
1264 "context-sensitive defaults
":
</p>
1265 <pre class=
"literal-block">
1266 def __call__(self, args=sys.argv[
1:], **default_values):
1267 """parse and complete command line args
1269 values = self.parse_args(args, **default_values)
1270 return self.complete_values(values)
1273 <div class=
"section">
1274 <h3><a class=
"toc-backref" href=
"#id42" id=
"pylitoptions-parse-args" name=
"pylitoptions-parse-args">4.3.3 PylitOptions.parse_args
</a></h3>
<p>The <cite>parse_args</cite> method calls the <cite>optparse.OptionParser</cite> on command
line or provided args and returns the result as a <cite>PylitOptions.Values</cite>
instance. Defaults can be provided as keyword arguments:</p>
1278 <pre class=
"literal-block">
1279 def parse_args(self, args=sys.argv[
1:], **default_values):
1280 """parse command line arguments using `optparse.OptionParser`
1282 args -- list of command line arguments.
1283 default_values -- dictionary of option defaults
1286 defaults = self.parser.defaults.copy()
1287 defaults.update(default_values)
1289 (values, args) = self.parser.parse_args(args, OptionValues(defaults))
1290 # Convert FILE and OUTFILE positional args to option values
1291 # (other positional arguments are ignored)
1293 values.infile = args[
0]
1294 values.outfile = args[
1]
1300 <div class=
"section">
1301 <h3><a class=
"toc-backref" href=
"#id43" id=
"pylitoptions-complete-values" name=
"pylitoptions-complete-values">4.3.4 PylitOptions.complete_values
</a></h3>
<p>The <cite>complete_values</cite> method uses context information to set missing option values
to sensible defaults (if possible).</p>
1304 <pre class=
"literal-block">
1305 def complete_values(self, values):
1306 """complete option values with context sensible defaults
1308 values.ensure_value(
"infile
",
"")
1309 # Guess conversion direction from infile filename
1310 if values.ensure_value(
"txt2code
", None) is None:
1311 in_extension = os.path.splitext(values.infile)[
1]
1312 if in_extension in self.values.text_extensions:
1313 values.txt2code = True
1314 elif in_extension in self.values.code_extensions:
1315 values.txt2code = False
1316 # Auto-determine the output file name
1317 values.ensure_value(
"outfile
", self.get_outfile_name(values.infile,
1319 # Guess conversion direction from outfile filename or set to default
1320 if values.txt2code is None:
1321 out_extension = os.path.splitext(values.outfile)[
1]
1322 values.txt2code = not (out_extension in self.values.text_extensions)
1324 # Set the language of the code (default
"python
")
1325 if values.txt2code is True:
1326 code_extension = os.path.splitext(values.outfile)[
1]
1327 elif values.txt2code is False:
1328 code_extension = os.path.splitext(values.infile)[
1]
1329 values.ensure_value(
"language
",
1330 self.values.code_languages.get(code_extension,
"python
"))
1331 # Set the default overwrite mode
1332 values.ensure_value(
"overwrite
", 'update')
1337 <div class=
"section">
1338 <h3><a class=
"toc-backref" href=
"#id44" id=
"pylitoptions-get-outfile-name" name=
"pylitoptions-get-outfile-name">4.3.5 PylitOptions.get_outfile_name
</a></h3>
1339 <p>Construct a matching filename for the output file. The output filename is
1340 constructed from
<cite>infile
</cite> by the following rules:
</p>
1342 <li>'-' (stdin) results in '-' (stdout)
</li>
1343 <li>strip the
<cite>txt_extension
</cite> or add the
<cite>code_extension
</cite> (txt2code)
</li>
<li>add a <cite>txt_extension</cite> (code2txt)</li>
1345 <li>fallback: if no guess can be made, add
".out
"</li>
1347 <pre class=
"literal-block">
1348 def get_outfile_name(self, infile, txt2code=None):
1349 """Return a matching output filename for `infile`
1351 # if input is stdin, default output is stdout
1355 (base, ext) = os.path.splitext(infile)
1356 # TODO: should get_outfile_name() use self.values.outfile_extension
1359 # strip text extension
1360 if ext in self.values.text_extensions:
1362 # add (first) text extension for code files
1363 if ext in self.values.code_extensions or txt2code == False:
1364 return infile + self.values.text_extensions[
0]
1366 return infile +
".out
"
1370 <div class=
"section">
1371 <h2><a class=
"toc-backref" href=
"#id45" id=
"helper-functions" name=
"helper-functions">4.4 Helper functions
</a></h2>
1372 <div class=
"section">
1373 <h3><a class=
"toc-backref" href=
"#id46" id=
"open-streams" name=
"open-streams">4.4.1 open_streams
</a></h3>
1374 <p>Return file objects for in- and output. If the input path is missing,
1375 write usage and abort. (An alternative would be to use stdin as default.
1376 However, this leaves the uninitiated user with a non-responding application
1377 if (s)he just tries the script without any arguments)
</p>
1378 <pre class=
"literal-block">
1379 def open_streams(infile = '-', outfile = '-', overwrite='update', **keyw):
1380 """Open and return the input and output stream
1382 open_streams(infile, outfile) -
> (in_stream, out_stream)
1384 in_stream -- file(infile) or sys.stdin
1385 out_stream -- file(outfile) or sys.stdout
1386 overwrite -- ['yes', 'update', 'no']
1387 if 'update', only open output file if it is older than
1389 Irrelevant if outfile == '-'.
1392 strerror =
"Missing input file name ('-' for stdin; -h for help)
"
1393 raise IOError, (
2, strerror, infile)
1395 in_stream = sys.stdin
1397 in_stream = file(infile, 'r')
1399 out_stream = sys.stdout
1400 elif overwrite == 'no' and os.path.exists(outfile):
1401 raise IOError, (
1,
"Output file exists!
", outfile)
1402 elif overwrite == 'update' and is_newer(outfile, infile):
1403 raise IOError, (
1,
"Output file is newer than input file!
", outfile)
1405 out_stream = file(outfile, 'w')
1406 return (in_stream, out_stream)
1409 <div class=
"section">
1410 <h3><a class=
"toc-backref" href=
"#id47" id=
"is-newer" name=
"is-newer">4.4.2 is_newer
</a></h3>
1411 <pre class=
"literal-block">
1412 def is_newer(path1, path2):
1413 """Check if `path1` is newer than `path2` (using mtime)
1415 Compare modification time of files at path1 and path2.
Non-existing files are considered oldest: Return False if path1 does not
exist and True if path2 does not exist.
1420 Return None for equal modification time. (This evaluates to False in a
1421 boolean context but allows a test for equality.)
1425 mtime1 = os.path.getmtime(path1)
1429 mtime2 = os.path.getmtime(path2)
1432 # print
"mtime1
", mtime1, path1,
"\n
",
"mtime2
", mtime2, path2
1434 if mtime1 == mtime2:
1436 return mtime1
> mtime2
1439 <div class=
"section">
1440 <h3><a class=
"toc-backref" href=
"#id48" id=
"get-converter" name=
"get-converter">4.4.3 get_converter
</a></h3>
1441 <p>Get an instance of the converter state machine:
</p>
1442 <pre class=
"literal-block">
1443 def get_converter(data, txt2code=True, **keyw):
1445 return Text2Code(data, **keyw)
1447 return Code2Text(data, **keyw)
1451 <div class=
"section">
1452 <h2><a class=
"toc-backref" href=
"#id49" id=
"use-cases" name=
"use-cases">4.5 Use cases
</a></h2>
1453 <div class=
"section">
1454 <h3><a class=
"toc-backref" href=
"#id50" id=
"run-doctest" name=
"run-doctest">4.5.1 run_doctest
</a></h3>
1455 <pre class=
"literal-block">
1456 def run_doctest(infile=
"-
", txt2code=True,
1457 globs={}, verbose=False, optionflags=
0, **keyw):
1458 """run doctest on the text source
1460 from doctest import DocTestParser, DocTestRunner
1461 (data, out_stream) = open_streams(infile,
"-
")
1463 <p>If source is code, convert to text, as tests in comments are not found by
1465 <pre class=
"literal-block">
1466 if txt2code is False:
1467 converter = Code2Text(data, **keyw)
1468 docstring = str(converter)
1470 docstring = data.read()
1472 <p>Use the doctest Advanced API to do all doctests in a given string:
</p>
1473 <pre class=
"literal-block">
1474 test = DocTestParser().get_doctest(docstring, globs={}, name=
"",
1475 filename=infile, lineno=
0)
1476 runner = DocTestRunner(verbose=verbose, optionflags=optionflags)
1479 if not runner.failures:
1480 print
"%d failures in %d tests
"%(runner.failures, runner.tries)
1481 return runner.failures, runner.tries
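<p>The use of the doctest Advanced API on a plain string can also be tried in isolation. The following is a minimal sketch for current Python versions; the helper name <cite>run_doctests_in_string</cite> and the sample text are made up for this example:</p>

```python
from doctest import DocTestParser, DocTestRunner


def run_doctests_in_string(text, name="literate-source", verbose=False):
    """Run all doctest examples found in `text`.

    Returns the tuple (failures, tries).
    """
    # the empty dict gives the examples a fresh namespace
    test = DocTestParser().get_doctest(text, {}, name, None, 0)
    runner = DocTestRunner(verbose=verbose)
    results = runner.run(test)
    return results.failed, results.attempted


text_source = """\
Some prose describing the code.

>>> 6 * 7
42
"""
failures, tries = run_doctests_in_string(text_source)
```

<p>Everything in the string that is not an interactive example is treated as prose, which is why the text source can be tested as a whole.</p>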
1484 <div class=
"section">
1485 <h3><a class=
"toc-backref" href=
"#id51" id=
"diff" name=
"diff">4.5.2 diff
</a></h3>
1486 <pre class=
"literal-block">
1487 def diff(infile='-', outfile='-', txt2code=True, **keyw):
1488 """Report differences between converted infile and existing outfile
1490 If outfile is '-', do a round-trip conversion and report differences
1495 instream = file(infile)
1496 # for diffing, we need a copy of the data as list::
1497 data = instream.readlines()
1499 converter = get_converter(data, txt2code, **keyw)
1500 new = str(converter).splitlines(True)
1503 outstream = file(outfile)
1504 old = outstream.readlines()
1506 newname =
"<conversion of %s
>"%infile
1510 # back-convert the output data
1511 converter = get_converter(new, not txt2code)
1512 new = str(converter).splitlines(True)
1513 newname =
"<round-conversion of %s
>"%infile
1515 # find and print the differences
1516 delta = list(difflib.unified_diff(old, new, fromfile=oldname,
1521 print
"no differences found
"
1523 print
"".join(delta)
1528 <div class=
"section">
1529 <h2><a class=
"toc-backref" href=
"#id52" id=
"main" name=
"main">4.6 main
</a></h2>
1530 <p>If this script is called from the command line, the
<cite>main
</cite> function will
1531 convert the input (file or stdin) between text and code formats.
</p>
1532 <div class=
"section">
1533 <h3><a class=
"toc-backref" href=
"#id53" id=
"id6" name=
"id6">4.6.1 Customization
</a></h3>
<p>Option defaults for the conversion can be passed as keyword arguments to <a class="reference" href="#main">main</a>.
1535 The option defaults will be updated by command line options and extended
1536 with
"intelligent guesses
" by
<cite>PylitOptions
</cite> and passed on to helper
1537 functions and the converter instantiation.
</p>
<p>This allows easy customization for programmatic use -- just call <cite>main</cite>
with the appropriate keyword options (or with an <cite>option_defaults</cite>
dictionary), e.g.:</p>
1541 <pre class=
"doctest-block">
1542 >>> option_defaults = {'language':
"c++
",
1543 ... 'codeindent':
4,
1544 ... 'header_string': '..admonition::'
1547 <pre class=
"doctest-block">
1548 >>> main(**option_defaults)
1550 <pre class=
"literal-block">
1551 def main(args=sys.argv[
1:], **option_defaults):
1552 """%prog [options] FILE [OUTFILE]
1554 Convert between reStructured Text with embedded code, and
1555 Source code with embedded text comment blocks
"""
1557 <p>Parse and complete the options:
</p>
1558 <pre class=
"literal-block">
1559 options = PylitOptions(args, **option_defaults).values
1561 <p>Run doctests if
<tt class=
"docutils literal"><span class=
"pre">--doctest
</span></tt> option is set:
</p>
1562 <pre class=
"literal-block">
1563 if options.ensure_value(
"doctest
", None):
1564 return run_doctest(**options.as_dict())
<p>Do a round-trip and report differences if the <tt class="docutils literal"><span class="pre">--diff</span></tt> option is set:</p>
1567 <pre class=
"literal-block">
1568 if options.ensure_value(
"diff
", None):
1569 return diff(**options.as_dict())
1571 <p>Open in- and output streams:
</p>
1572 <pre class=
"literal-block">
1574 (data, out_stream) = open_streams(**options.as_dict())
1576 print
"IOError: %s %s
" % (ex.filename, ex.strerror)
1579 <p>Get a converter instance:
</p>
1580 <pre class=
"literal-block">
1581 converter = get_converter(data, **options.as_dict())
<p>Execute if the <tt class="docutils literal"><span class="pre">--execute</span></tt> option is set:</p>
1584 <pre class=
"literal-block">
1585 if options.ensure_value(
"execute
", None):
1586 print
"executing
" + options.infile
1587 if options.txt2code:
1588 code = str(converter)
1594 <p>Default action: Convert and write to out_stream:
</p>
1595 <pre class=
"literal-block">
1596 out_stream.write(str(converter))
1598 if out_stream is not sys.stdout:
1599 print
"extract written to
", out_stream.name
1602 <p>Rename the infile to a backup copy if
<tt class=
"docutils literal"><span class=
"pre">--replace
</span></tt> is set:
</p>
1603 <pre class=
"literal-block">
1604 if options.ensure_value(
"replace
", None):
1605 os.rename(options.infile, options.infile +
"~
")
<p>If not (and input and output are from files), set the modification time
(<cite>mtime</cite>) of the output file to the one of the input file to indicate that
the contained information is equal. <a class="footnote-reference" href="#id7">[4]</a></p>
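<p>In isolation, the time stamp synchronisation might look like the sketch below. The helper name <cite>copy_mtime</cite> is hypothetical; the call to <cite>os.utime</cite> follows the pattern used in this section:</p>

```python
import os


def copy_mtime(infile, outfile):
    """Set the mtime of `outfile` to the one of `infile`, keeping its atime."""
    # Close any file object that still writes to `outfile` before
    # calling this; a later flush or close would update the mtime again.
    os.utime(outfile, (os.path.getatime(outfile),
                       os.path.getmtime(infile)))
```

<p>After the call, both files report the same modification time, so a later comparison of the time stamps finds neither file newer than the other.</p>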
1610 <pre class=
"literal-block">
1613 os.utime(options.outfile, (os.path.getatime(options.outfile),
1614 os.path.getmtime(options.infile))
1619 ## print
"mtime
", os.path.getmtime(options.infile), options.infile
1620 ## print
"mtime
", os.path.getmtime(options.outfile), options.outfile
1622 <table class=
"docutils footnote" frame=
"void" id=
"id7" rules=
"none">
1623 <colgroup><col class=
"label" /><col /></colgroup>
1624 <tbody valign=
"top">
1625 <tr><td class=
"label"><a name=
"id7">[
4]
</a></td><td>Make sure the corresponding file object (here
<cite>out_stream
</cite>) is
1626 closed, as otherwise the change will be overwritten when
<cite>close
</cite> is
called afterwards (either explicitly or at program exit).
</td></tr>
1630 <p>Run main, if called from the command line:
</p>
1631 <pre class=
"literal-block">
1632 if __name__ == '__main__':
1638 <div class=
"section">
1639 <h1><a class=
"toc-backref" href=
"#id54" id=
"open-questions" name=
"open-questions">5 Open questions
</a></h1>
1640 <p>Open questions and ideas for further development
</p>
1641 <div class=
"section">
1642 <h2><a class=
"toc-backref" href=
"#id55" id=
"options" name=
"options">5.1 Options
</a></h2>
1644 <li><p class=
"first">Collect option defaults in a dictionary (on module level)
</p>
1645 <p>Facilitates the setting of options in programmatic use
</p>
<p>Use templates for the "intelligent guesses" (with Python syntax for string replacement with dicts: <tt class="docutils literal"><span class="pre">"hello</span> <span class="pre">%(what)s"</span> <span class="pre">%</span> <span class="pre">{'what':</span> <span class="pre">'world'}</span></tt>)</p>
1649 <li><p class=
"first">Is it sensible to offer the
<cite>header_string
</cite> option also as command line
1652 <li><p class=
"first">Configurable
</p>
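<p>The idea of module-level option defaults combined with string templates, as listed above, might look roughly like the sketch below. All names here (<cite>defaults</cite>, <cite>guess_outfile</cite>, and the dictionary keys) are hypothetical and are not taken from pylit's actual option set.</p>

```python
# Hypothetical module-level defaults; the keys are illustrative only.
defaults = {
    "code_extension": ".py",
    "text_extension": ".txt",
    # Template for the guessed output name, filled with dict-based % formatting.
    "outfile_template": "%(infile)s%(text_extension)s",
}


def guess_outfile(infile, **overrides):
    """Fill the output-name template from the defaults plus caller overrides."""
    values = dict(defaults, infile=infile)
    values.update(overrides)
    return values["outfile_template"] % values


name_default = guess_outfile("foo.py")
name_custom = guess_outfile("foo.py", text_extension=".rst")
```

<p>A caller using the module programmatically could then change a guess either by editing the dictionary once or by passing a keyword argument for a single call.</p>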
1656 <div class=
"section">
1657 <h2><a class=
"toc-backref" href=
"#id56" id=
"parsing-problems" name=
"parsing-problems">5.2 Parsing Problems
</a></h2>
1659 <li><p class=
"first">How can I include a literal block that should not be in the
1660 executable code (e.g. an example, an earlier version or variant)?
</p>
1661 <dl class=
"docutils">
1662 <dt>Workaround:
</dt>
<dd><p class="first">Use a <cite>quoted literal block</cite> (with a quotation different from the comment string used for text blocks) to keep it commented out across the code-text round-trips.</p>
1666 <p class=
"last">Python
<cite>pydoc
</cite> examples can also use the special pydoc block syntax (no
1669 <dt>Alternative:
</dt>
<dd><p class="first last">Use a special "code block" directive or a special "no code block" directive.</p>
<li><p class="first">Ignore "matching comments" in literal strings?</p>
<p>(This would need a specific detection algorithm for every language that supports multi-line literal strings, e.g. C++, PHP, Python.)</p>
1679 <li><p class=
"first">Warn if a comment in code will become text after round-trip?
</p>
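<p>For Python sources, the detection of lines that lie inside a multi-line string literal could be delegated to the standard <cite>tokenize</cite> module. The sketch below is an assumption about how such a check might be written, not part of pylit; the function name <cite>lines_inside_strings</cite> is hypothetical, and only plain string literals reported as <cite>STRING</cite> tokens are handled.</p>

```python
import io
import tokenize


def lines_inside_strings(source):
    """Return the 1-based numbers of lines that begin inside a multi-line string."""
    inside = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and tok.end[0] > tok.start[0]:
            # Every line after the one where the literal starts
            # begins inside the string.
            inside.update(range(tok.start[0] + 1, tok.end[0] + 1))
    return inside


sample = 'text = """\n# not a comment\n"""\n# a real comment\n'
result = lines_inside_strings(sample)
```

<p>A converter could skip the comment-string test for the returned line numbers. Other languages would each need their own tokenizer, which is the cost noted above.</p>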
1683 <div class=
"section">
1684 <h2><a class=
"toc-backref" href=
"#id57" id=
"code-syntax-highlight" name=
"code-syntax-highlight">5.3 code syntax highlight
</a></h2>
<p>Use the <cite>listings</cite> package for LaTeX-&gt;PDF output.</p>
<li>the syntax highlight support in rest2web (uses the MoinMoin Python colorizer, see a version at <a class="reference" href="http://www.standards-schmandards.com/2005/fangs-093/">http://www.standards-schmandards.com/2005/fangs-093/</a>)</li>
1691 <li>Pygments (pure Python, many languages, rst integration recipe):
1692 <a class=
"reference" href=
"http://pygments.org/docs/rstdirective/">http://pygments.org/docs/rstdirective/
</a></li>
1693 <li>Silvercity, enscript, ...
</li>
1695 <p>Some plug-ins require a special
"code block
" directive instead of the
1696 <cite>::
</cite>-literal block. TODO: make this an option
</p>
<p>Ask on the Docutils users or developers mailing list.</p>
1699 <li>How to handle docstrings in code blocks? (it would be nice to convert them
1700 to rst-text if
<tt class=
"docutils literal"><span class=
"pre">__docformat__
</span> <span class=
"pre">==
</span> <span class=
"pre">restructuredtext
</span></tt>)
</li>
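<p>As a starting point for the docstring question above, the standard <cite>ast</cite> module can collect docstrings and check the module's <cite>__docformat__</cite>. This is only a sketch under assumptions: the function name <cite>rst_docstrings</cite> is hypothetical, and converting the collected strings into text blocks is left open.</p>

```python
import ast


def rst_docstrings(source):
    """Return the docstrings of a module that declares reStructuredText format."""
    tree = ast.parse(source)
    docformat = None
    for node in tree.body:
        # Look for a top-level assignment: __docformat__ = "..."
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and node.targets[0].id == "__docformat__"
                and isinstance(node.value, ast.Constant)):
            docformat = node.value.value
    if docformat != "restructuredtext":
        return []
    found = []
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(tree):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)
            if doc:
                found.append(doc)
    return found


sample = (
    '"""Module *text*."""\n'
    '__docformat__ = "restructuredtext"\n'
    'def f():\n'
    '    """Function text."""\n'
)
docs = rst_docstrings(sample)
```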
1705 <div class=
"footer">
1706 <hr class=
"footer" />
Generated on: 2007-03-02.