===========================================
Plan for Enthought API Documentation Tool
===========================================

:Contact: docutils-develop@lists.sourceforge.net
:Copyright: 2004 by `Enthought, Inc. <http://www.enthought.com>`_
:License: `Enthought License`_ (BSD-style)

.. _Enthought License: http://docutils.sf.net/licenses/enthought.txt

This document should be read in conjunction with the `Enthought API
Documentation Tool RFP`__ prepared by Janet Swisher.

In March 2004 I met Eric Jones, president and CTO of `Enthought,
Inc.`_, at `PyCon 2004`_ in Washington DC. He told me that Enthought
was using reStructuredText_ for source code documentation, but they
had some issues. He asked if I'd be interested in doing some work on
a customized API documentation tool. Shortly after PyCon, Janet
Swisher, Enthought's senior technical writer, contacted me to work out
details. Some email, a trip to Austin in May, and plenty of Texas
hospitality later, we had a project. This document will record the
details, milestones, and evolution of the project.

In a nutshell, Enthought is sponsoring the implementation of an open
source API documentation tool that meets their needs. Fortuitously,
their needs coincide well with the "Python Source Reader" description
in `PEP 258`_. In other words, Enthought is funding some significant
improvements to Docutils, improvements that were planned but never
implemented due to time and other constraints. The implementation
will take place gradually over several months, on a part-time basis.

This is an ideal example of cooperation between a corporation and an
open-source project. The corporation, the project, I personally, and
the community all benefit. Enthought, whose commitment to open source
is also evidenced by their sponsorship of SciPy_, benefits by
obtaining a useful piece of software, much more quickly than would
have been possible without their support. Docutils benefits directly
from the implementation of one of its core subsystems. I benefit from
the funding, which allows me to justify the long hours to my wife and
family. All the corporations, projects, and individuals that make up
the community will benefit from the end result, which will be great.

All that's left now is to actually do the work!

.. _PyCon 2004: http://pycon.org/dc2004/
.. _reStructuredText: http://docutils.sf.net/rst.html
.. _SciPy: http://www.scipy.org/

1. Analyze prior art, most notably Epydoc_ and HappyDoc_, to see how
   they do what they do. I have no desire to reinvent wheels
   unnecessarily. I want to take the best ideas from each tool,
   combined with the outline in `PEP 258`_ (which will evolve), and
   build at least the foundation of the definitive Python
   auto-documentation tool.

   .. _Epydoc: http://epydoc.sourceforge.net/
   .. _HappyDoc: http://happydoc.sourceforge.net/
   .. _PEP 258:
      http://docutils.sf.net/docs/peps/pep-0258.html#python-source-reader

2. Decide on a base platform. The best way to achieve Enthought's
   goals in a reasonable time frame may be to extend Epydoc or
   HappyDoc. Or it may be necessary to start fresh.

3. Extend the reStructuredText parser. See `Proposed Changes to
   reStructuredText`_ below.

4. Depending on the base platform chosen, build or extend the
   docstring & doc comment extraction tool. This may be the biggest
   part of the project, but I won't be able to break it down into
   details until more is known.

If possible, all software and documentation files will be stored in
the Subversion repository of Docutils and/or the base project, which
are all publicly available via anonymous pserver access.

The Docutils project is very open about granting Subversion write
access; so far, everyone who asked has been given access. Any
Enthought staff member who would like Subversion write access will get
it.

If either Epydoc or HappyDoc is chosen as the base platform, I will
ask the project's administrator for CVS access for myself and any
Enthought staff member who wants it. If sufficient access is not
granted -- although I doubt that there would be any problem -- we may
have to begin a fork, which could be hosted on SourceForge, on
Enthought's Subversion server, or anywhere else deemed appropriate.

Most existing Docutils files have been placed in the public domain, as

    :Copyright: This document has been placed in the public domain.

This is in conjunction with the "Public Domain Dedication" section of

__ http://docutils.sourceforge.net/COPYING.html

The code and documentation originating from Enthought funding will
have Enthought's copyright and license declaration. While I will try
to keep Enthought-specific code and documentation separate from the
existing files, there will inevitably be cases where it makes the most
sense to extend existing files.

I propose the following:

1. New files related to this Enthought-funded work will be identified
   with the following field-list headers::

       :Copyright: 2004 by Enthought, Inc.
       :License: Enthought License (BSD Style)

   The license field text will be linked to the license file itself.

2. For significant or major changes to an existing file (more than 10%
   change), the headers shall change as follows (for example)::

       :Copyright: 2001-2004 by David Goodger
       :Copyright: 2004 by Enthought, Inc.

   If the Enthought-funded portion becomes greater than the previously
   existing portion, Enthought's copyright line will be shown first.

3. In cases of insignificant or minor changes to an existing file
   (less than 10% change), the public domain status shall remain

A section describing all of this will be added to the Docutils
`COPYING`__ instructions file.

If another project is chosen as the base project, similar changes
would be made to their files, subject to negotiation.

__ http://docutils.sf.net/COPYING.html

Proposed Changes to reStructuredText
====================================

The "traits" construct is implemented as dictionaries, where
standalone strings would be Python syntax errors. Therefore traits
require documentation in comments. We also need a way to
differentiate between ordinary "internal" comments and documentation
comments (doc comments).

Javadoc uses the following syntax for doc comments::

     * The first line of a multi-line doc comment begins with a slash
     * and *two* asterisks. The doc comment ends normally.

Python doesn't have multi-line comments; only single-line. A similar
convention in Python might look like this::

    # The first line of a doc comment begins with *two* hash marks.
    # The doc comment ends with the first non-comment line.

    ## The double-hash-marks could occur on the first line of text,
    # saving a line in the source.

How to indicate the end of the doc comment? ::

    # The first line of a doc comment begins with *two* hash marks.
    # The doc comment ends with the first non-comment line, or another

    # This is an ordinary, internal, non-doc comment.

    ## First line of a doc comment, terse syntax.
    # Second (and last) line. Ends here: ##
    # This is an ordinary, internal, non-doc comment.

Or do we even need to worry about this case? A simple blank line

    ## First line of a doc comment, terse syntax.
    # Second (and last) line. Ends with a blank line.

    # This is an ordinary, internal, non-doc comment.

Other possibilities::

    #" Instead of double-hash-marks, we could use a hash mark and a
    # quotation mark to begin the doc comment.

    ## We could require double-hash-marks on every line. This has the
    ## added benefit of delimiting the *end* of the doc comment, as
    ## well as working well with line wrapping in Emacs
    ## ("fill-paragraph" command).
    # Ordinary non-doc comment.

    #" A hash mark and a quotation mark on each line looks funny, and
    #" it doesn't work well with line wrapping in Emacs.

These styles (repeated on each line) work well with line wrapping in

These styles do *not* work well with line wrapping in Emacs::

    #" #' #: #) #. #/ #@ #$ #^ #= #+ #_ #~

The style of doc comment indicator used could be a runtime, global
and/or per-module setting. That may add more complexity than it's

I recommend adopting "#*" on every line::

    # This is an ordinary non-doc comment.

    #* This is a documentation comment, with an asterisk after the
    #* hash marks on every line.
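To get a feel for how an extraction tool might pick up this style, here is a minimal sketch built on the standard ``tokenize`` module. The function name ``extract_doc_comments`` and the ``DOC_PREFIX`` constant are hypothetical, introduced only for illustration; this is not the design of the planned tool, just one way the "#* on every line" rule could be recognized.

```python
import io
import tokenize

# Assumed doc comment marker, following the "#*" recommendation.
DOC_PREFIX = "#*"


def extract_doc_comments(source):
    """Return (first_line_number, text) for each run of "#*" comments.

    A run is a sequence of "#*" comments on consecutive source lines.
    Ordinary "#" comments and any gap in line numbers end the run.
    """
    blocks = []
    current = []     # text lines of the doc comment being collected
    start = None     # line number where the current run began
    last_row = None  # row of the most recent "#*" comment
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        row = tok.start[0]
        is_doc = tok.string.startswith(DOC_PREFIX)
        contiguous = last_row is not None and row == last_row + 1
        if current and not (is_doc and contiguous):
            # The run has ended; store it before going on.
            blocks.append((start, "\n".join(current)))
            current = []
        if is_doc:
            if not current:
                start = row
            current.append(tok.string[len(DOC_PREFIX):].strip())
            last_row = row
        else:
            last_row = None
    if current:
        blocks.append((start, "\n".join(current)))
    return blocks
```

Because every doc comment line carries the marker, the end of the comment needs no extra delimiter: the first line without "#*" closes the run.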

I initially recommended adopting double-hash-marks::

    # This is an ordinary non-doc comment.

    ## This is a documentation comment, with double-hash-marks on

But Janet Swisher rightly pointed out that this could collide with
ordinary comments that are then block-commented. This applies to
double-hash-marks on the first line only as well. So they're out.
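The collision is easy to reproduce. The sketch below assumes the common editor behaviour of commenting out a block by prefixing "#" to every line; the helper names ``block_comment`` and ``looks_like_doc_comment`` are hypothetical.

```python
def block_comment(lines):
    """Comment out a block by prefixing "#" to each line."""
    return ["#" + line for line in lines]


def looks_like_doc_comment(line, marker):
    """Report whether a source line starts with the given doc marker."""
    return line.lstrip().startswith(marker)


original = [
    "# an ordinary comment",
    "x = compute()",
]
commented = block_comment(original)

# The ordinary comment now starts with "##", so a "##" convention
# would mistake it for documentation. A "#*" convention would not.
collides_with_double_hash = looks_like_doc_comment(commented[0], "##")
collides_with_hash_star = looks_like_doc_comment(commented[0], "#*")
```

Here ``collides_with_double_hash`` is true and ``collides_with_hash_star`` is false, which is the argument for "#*" in a nutshell.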

On the other hand, the Javadoc-comment style ("##" on the first line
only, "#" after that) is used in Fredrik Lundh's PythonDoc_. It may
be worthwhile to conform to this syntax, reinforcing it as a standard.
PythonDoc does not support terse doc comments (text after "##" on the

.. _PythonDoc: http://effbot.org/zone/pythondoc.htm

Enthought's Traits system has switched to a metaclass base, and traits
are now defined via ordinary attributes. Therefore doc comments are
no longer absolutely necessary; attribute docstrings will suffice.
Doc comments may still be desirable though, since they allow
documentation to precede the thing being documented.

Docstring Density & Whitespace Minimization
-------------------------------------------

One problem with extensively documented classes and functions is that
a lot of screen space is wasted on whitespace. Here's some
current Enthought code (from lib/cp/fluids/gassmann.py)::

    def max_gas(temperature, pressure, api, specific_gravity=.56):

        Computes the maximum dissolved gas in oil using Batzle and

        temperature : sequence
            Temperature in degrees Celsius

        specific_gravity : sequence
            Specific gravity of gas at STP, default is .56

            Maximum dissolved gas in liters/liter

        This estimate is based on equations given by Mavko, Mukerji,
        and Dvorkin, (1998, pp. 218-219, or 2003, p. 236) obtained
        originally from Batzle and Wang (1992).

The docstring is 24 lines long.

Rather than using subsections, field lists (which exist now) can save

    def max_gas(temperature, pressure, api, specific_gravity=.56):

        Computes the maximum dissolved gas in oil using Batzle and

            temperature : sequence
                Temperature in degrees Celsius

            specific_gravity : sequence
                Specific gravity of gas at STP, default is .56

                Maximum dissolved gas in liters/liter
        :Description: This estimate is based on equations given by
            Mavko, Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003,
            p. 236) obtained originally from Batzle and Wang (1992).

As with the "Description" field above, field bodies may begin on the
same line as the field name, which also saves space.

The output for field lists is typically a table structure. For

        temperature : sequence
            Temperature in degrees Celsius

        specific_gravity : sequence
            Specific gravity of gas at STP, default is .56

            Maximum dissolved gas in liters/liter

        This estimate is based on equations given by Mavko,
        Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003, p. 236)
        obtained originally from Batzle and Wang (1992).

But the definition lists describing the parameters and return values
are still wasteful of space. There are a lot of half-filled lines.

Definition lists are currently defined as::

Where the classifier part is optional. Ideas for improvements:

1. We could allow multiple classifiers::

       term : classifier one : two : three ...

2. We could allow the definition on the same line as the term, using
   some embedded/inline markup:

   * "--" could be used, but only in limited and well-known contexts::

     This is the syntax used by StructuredText (one of
     reStructuredText's predecessors). It was not adopted for
     reStructuredText because it is ambiguous -- people often use "--"
     in their text, as I just did. But given a constrained context,
     the ambiguity would be acceptable (or would it?). That context
     would be: in docstrings, within a field list, perhaps only with
     certain well-defined field names (parameters, returns).

   * The "constrained context" above isn't really enough to make the
     ambiguity acceptable. Instead, a slightly more verbose but far
     less ambiguous syntax is possible::

     This syntax has advantages. Equals signs lend themselves to the
     connotation of "definition". And whereas one or two equals signs
     are commonly used in program code, three equals signs in a row
     have no conflicting meanings that I know of. (Update: there
     *are* uses out there.)

   The problem with this approach is that using inline markup for
   structure is inherently ambiguous in reStructuredText. For
   example, writing *about* definition lists would be difficult::

       ``term === definition`` is an example of a compact definition list item

   The parser checks for structural markup before it does inline
   markup processing. But the "===" should be protected by its inline

3. We could allow the definition on the same line as the term, using
   structural markup. A variation on bullet lists would work well::

       : another term :: and a definition that

   Some ambiguity remains::

       : term ``containing :: double colons`` :: definition

   But the likelihood of such cases is negligible, and they can be
   covered in the documentation.

   Other possibilities for the definition delimiter include::

       : term : classifier -- definition
       : term : classifier --- definition
       : term : classifier : : definition
       : term : classifier === definition

The third idea currently has the best chance of being adopted and

Combining these ideas, the function definition becomes::

    def max_gas(temperature, pressure, api, specific_gravity=.56):

        Computes the maximum dissolved gas in oil using Batzle and

            : temperature : sequence :: Temperature in degrees Celsius
            : pressure : sequence :: Pressure in MPa
            : api : sequence :: Stock tank oil API
            : specific_gravity : sequence :: Specific gravity of gas at

            : max_gor : sequence :: Maximum dissolved gas in liters/liter
        :Description: This estimate is based on equations given by
            Mavko, Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003,
            p. 236) obtained originally from Batzle and Wang (1992).

The docstring is reduced to 14 lines, from the original 24. For
longer docstrings with many parameters and return values, the
difference would be more significant.
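
To make the compact syntax concrete, here is a rough sketch of how lines of the form ``: term : classifier :: definition`` might be split into their parts. It is a plain regular-expression pass, not the reStructuredText parser, and the names ``ITEM`` and ``parse_compact_items`` are hypothetical. Lines that do not start a new item are assumed to continue the previous definition.

```python
import re

# ": head :: definition", where head is "term" or "term : classifier ..."
ITEM = re.compile(r"^:\s+(?P<head>.+?)\s+::\s+(?P<definition>.*)$")


def parse_compact_items(lines):
    """Parse compact definition list items into dictionaries."""
    items = []
    for line in lines:
        stripped = line.strip()
        match = ITEM.match(stripped)
        if match:
            # The first part of the head is the term; the rest are
            # classifiers, which allows more than one of them.
            parts = [p.strip() for p in match.group("head").split(" : ")]
            items.append({
                "term": parts[0],
                "classifiers": parts[1:],
                "definition": match.group("definition"),
            })
        elif stripped and items:
            # A wrapped line continues the previous definition.
            items[-1]["definition"] += " " + stripped
    return items
```

This naive split stops at the first " :: " it finds, so it mis-parses the ambiguous case shown earlier (a term containing double colons inside inline literal markup). A real implementation would have to decide how such cases are handled.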