1 \section{\module{xml.dom} ---
2 The Document Object Model API}
4 \declaremodule{standard}{xml.dom}
5 \modulesynopsis{Document Object Model API for Python.}
6 \sectionauthor{Paul Prescod}{paul@prescod.net}
7 \sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}
9 \versionadded{2.0}
11 The Document Object Model, or ``DOM,'' is a cross-language API from
12 the World Wide Web Consortium (W3C) for accessing and modifying XML
13 documents. A DOM implementation presents an XML document as a tree
14 structure, or allows client code to build such a structure from
15 scratch. It then gives access to the structure through a set of
objects which provide well-known interfaces.
18 The DOM is extremely useful for random-access applications. SAX only
19 allows you a view of one bit of the document at a time. If you are
20 looking at one SAX element, you have no access to another. If you are
21 looking at a text node, you have no access to a containing element.
22 When you write a SAX application, you need to keep track of your
23 program's position in the document somewhere in your own code. SAX
24 does not do it for you. Also, if you need to look ahead in the XML
25 document, you are just out of luck.
Some applications are simply impossible in an event-driven model with
28 no access to a tree. Of course you could build some sort of tree
29 yourself in SAX events, but the DOM allows you to avoid writing that
30 code. The DOM is a standard tree representation for XML data.
32 %What if your needs are somewhere between SAX and the DOM? Perhaps
33 %you cannot afford to load the entire tree in memory but you find the
34 %SAX model somewhat cumbersome and low-level. There is also a module
35 %called xml.dom.pulldom that allows you to build trees of only the
36 %parts of a document that you need structured access to. It also has
37 %features that allow you to find your way around the DOM.
38 % See http://www.prescod.net/python/pulldom
40 The Document Object Model is being defined by the W3C in stages, or
41 ``levels'' in their terminology. The Python mapping of the API is
42 substantially based on the DOM Level~2 recommendation. The mapping of
43 the Level~3 specification, currently only available in draft form, is
44 being developed by the \ulink{Python XML Special Interest
45 Group}{http://www.python.org/sigs/xml-sig/} as part of the
46 \ulink{PyXML package}{http://pyxml.sourceforge.net/}. Refer to the
47 documentation bundled with that package for information on the current
48 state of DOM Level~3 support.
50 DOM applications typically start by parsing some XML into a DOM. How
51 this is accomplished is not covered at all by DOM Level~1, and Level~2
52 provides only limited improvements: There is a
53 \class{DOMImplementation} object class which provides access to
54 \class{Document} creation methods, but no way to access an XML
55 reader/parser/Document builder in an implementation-independent way.
56 There is also no well-defined way to access these methods without an
57 existing \class{Document} object. In Python, each DOM implementation
58 will provide a function \function{getDOMImplementation()}. DOM Level~3
59 adds a Load/Store specification, which defines an interface to the
60 reader, but this is not yet available in the Python standard library.
62 Once you have a DOM document object, you can access the parts of your
63 XML document through its properties and methods. These properties are
64 defined in the DOM specification; this portion of the reference manual
65 describes the interpretation of the specification in Python.
67 The specification provided by the W3C defines the DOM API for Java,
68 ECMAScript, and OMG IDL. The Python mapping defined here is based in
69 large part on the IDL version of the specification, but strict
70 compliance is not required (though implementations are free to support
71 the strict mapping from IDL). See section \ref{dom-conformance},
72 ``Conformance,'' for a detailed discussion of mapping requirements.
75 \begin{seealso}
76 \seetitle[http://www.w3.org/TR/DOM-Level-2-Core/]{Document Object
77 Model (DOM) Level~2 Specification}
78 {The W3C recommendation upon which the Python DOM API is
79 based.}
80 \seetitle[http://www.w3.org/TR/REC-DOM-Level-1/]{Document Object
81 Model (DOM) Level~1 Specification}
82 {The W3C recommendation for the
83 DOM supported by \module{xml.dom.minidom}.}
84 \seetitle[http://pyxml.sourceforge.net]{PyXML}{Users that require a
85 full-featured implementation of DOM should use the PyXML
86 package.}
87 \seetitle[http://www.omg.org/docs/formal/02-11-05.pdf]{Python
88 Language Mapping Specification}
89 {This specifies the mapping from OMG IDL to Python.}
90 \end{seealso}
92 \subsection{Module Contents}
The \module{xml.dom} module contains the following functions:
96 \begin{funcdesc}{registerDOMImplementation}{name, factory}
97 Register the \var{factory} function with the name \var{name}. The
98 factory function should return an object which implements the
99 \class{DOMImplementation} interface. The factory function can return
100 the same object every time, or a new one for each call, as appropriate
101 for the specific implementation (e.g. if that implementation supports
102 some customization).
103 \end{funcdesc}
105 \begin{funcdesc}{getDOMImplementation}{\optional{name\optional{, features}}}
106 Return a suitable DOM implementation. The \var{name} is either
107 well-known, the module name of a DOM implementation, or
\code{None}. If it is not \code{None}, the corresponding module is
imported and a \class{DOMImplementation} object is returned if the import
succeeds. If no name is given, and if the environment variable
111 \envvar{PYTHON_DOM} is set, this variable is used to find the
112 implementation.
If \var{name} is not given, this examines the available implementations to
115 find one with the required feature set. If no implementation can be
116 found, raise an \exception{ImportError}. The features list must be a
117 sequence of \code{(\var{feature}, \var{version})} pairs which are
118 passed to the \method{hasFeature()} method on available
119 \class{DOMImplementation} objects.
120 \end{funcdesc}
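As a rough illustration of the two functions above, the following sketch
registers a factory and then looks up an implementation by name and by
feature list. The name \code{example-dom} and the factory function are
made up for this example; the sketch also assumes that
\module{xml.dom.minidom} is available and that \envvar{PYTHON_DOM} is
not set.

```python
import xml.dom
from xml.dom import minidom


def make_implementation():
    # Hypothetical factory: simply hands back minidom's implementation object.
    return minidom.getDOMImplementation()


# "example-dom" is an illustrative name, not one defined by the library.
xml.dom.registerDOMImplementation("example-dom", make_implementation)

by_name = xml.dom.getDOMImplementation("example-dom")

# Without a name, the registered and well-known implementations are searched
# for one whose hasFeature() accepts every (feature, version) pair.
by_feature = xml.dom.getDOMImplementation(features=[("core", "2.0")])

print(by_name.hasFeature("core", "2.0"))
print(by_feature.hasFeature("core", "2.0"))
```

A factory may return a shared object, as here, or build a new one on each
call.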
123 Some convenience constants are also provided:
125 \begin{datadesc}{EMPTY_NAMESPACE}
126 The value used to indicate that no namespace is associated with a
127 node in the DOM. This is typically found as the
128 \member{namespaceURI} of a node, or used as the \var{namespaceURI}
parameter to a namespace-specific method.
130 \versionadded{2.2}
131 \end{datadesc}
133 \begin{datadesc}{XML_NAMESPACE}
134 The namespace URI associated with the reserved prefix \code{xml}, as
135 defined by
136 \citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML}
137 (section~4).
138 \versionadded{2.2}
139 \end{datadesc}
141 \begin{datadesc}{XMLNS_NAMESPACE}
142 The namespace URI for namespace declarations, as defined by
143 \citetitle[http://www.w3.org/TR/DOM-Level-2-Core/core.html]{Document
144 Object Model (DOM) Level~2 Core Specification} (section~1.1.8).
145 \versionadded{2.2}
146 \end{datadesc}
148 \begin{datadesc}{XHTML_NAMESPACE}
149 The URI of the XHTML namespace as defined by
150 \citetitle[http://www.w3.org/TR/xhtml1/]{XHTML 1.0: The Extensible
151 HyperText Markup Language} (section~3.1.1).
152 \versionadded{2.2}
153 \end{datadesc}
156 % Should the Node documentation go here?
158 In addition, \module{xml.dom} contains a base \class{Node} class and
159 the DOM exception classes. The \class{Node} class provided by this
160 module does not implement any of the methods or attributes defined by
161 the DOM specification; concrete DOM implementations must provide
162 those. The \class{Node} class provided as part of this module does
163 provide the constants used for the \member{nodeType} attribute on
164 concrete \class{Node} objects; they are located within the class
165 rather than at the module level to conform with the DOM
166 specifications.
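The constants on \class{Node} can be compared against the
\member{nodeType} of concrete nodes. The sketch below assumes
\module{xml.dom.minidom} as the concrete implementation; the \code{names}
dictionary is only a helper for this example.

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<root>text<!-- note --><child/></root>")

# Map the numeric nodeType values back to the names of the Node constants.
names = {
    Node.ELEMENT_NODE: "ELEMENT_NODE",
    Node.TEXT_NODE: "TEXT_NODE",
    Node.COMMENT_NODE: "COMMENT_NODE",
    Node.DOCUMENT_NODE: "DOCUMENT_NODE",
}

kinds = [names[child.nodeType] for child in doc.documentElement.childNodes]
print(names[doc.nodeType], kinds)
```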
169 \subsection{Objects in the DOM \label{dom-objects}}
171 The definitive documentation for the DOM is the DOM specification from
172 the W3C.
174 Note that DOM attributes may also be manipulated as nodes instead of
175 as simple strings. It is fairly rare that you must do this, however,
176 so this usage is not yet documented.
179 \begin{tableiii}{l|l|l}{class}{Interface}{Section}{Purpose}
180 \lineiii{DOMImplementation}{\ref{dom-implementation-objects}}
181 {Interface to the underlying implementation.}
182 \lineiii{Node}{\ref{dom-node-objects}}
183 {Base interface for most objects in a document.}
184 \lineiii{NodeList}{\ref{dom-nodelist-objects}}
185 {Interface for a sequence of nodes.}
186 \lineiii{DocumentType}{\ref{dom-documenttype-objects}}
187 {Information about the declarations needed to process a document.}
188 \lineiii{Document}{\ref{dom-document-objects}}
189 {Object which represents an entire document.}
190 \lineiii{Element}{\ref{dom-element-objects}}
191 {Element nodes in the document hierarchy.}
192 \lineiii{Attr}{\ref{dom-attr-objects}}
193 {Attribute value nodes on element nodes.}
194 \lineiii{Comment}{\ref{dom-comment-objects}}
195 {Representation of comments in the source document.}
196 \lineiii{Text}{\ref{dom-text-objects}}
197 {Nodes containing textual content from the document.}
198 \lineiii{ProcessingInstruction}{\ref{dom-pi-objects}}
199 {Processing instruction representation.}
200 \end{tableiii}
202 An additional section describes the exceptions defined for working
203 with the DOM in Python.
206 \subsubsection{DOMImplementation Objects
207 \label{dom-implementation-objects}}
209 The \class{DOMImplementation} interface provides a way for
210 applications to determine the availability of particular features in
211 the DOM they are using. DOM Level~2 added the ability to create new
212 \class{Document} and \class{DocumentType} objects using the
213 \class{DOMImplementation} as well.
215 \begin{methoddesc}[DOMImplementation]{hasFeature}{feature, version}
216 Return true if the feature identified by the pair of strings
217 \var{feature} and \var{version} is implemented.
218 \end{methoddesc}
220 \begin{methoddesc}[DOMImplementation]{createDocument}{namespaceUri, qualifiedName, doctype}
221 Return a new \class{Document} object (the root of the DOM), with a
222 child \class{Element} object having the given \var{namespaceUri} and
223 \var{qualifiedName}. The \var{doctype} must be a \class{DocumentType}
224 object created by \method{createDocumentType()}, or \code{None}.
225 In the Python DOM API, the first two arguments can also be \code{None}
226 in order to indicate that no \class{Element} child is to be created.
227 \end{methoddesc}
229 \begin{methoddesc}[DOMImplementation]{createDocumentType}{qualifiedName, publicId, systemId}
230 Return a new \class{DocumentType} object that encapsulates the given
231 \var{qualifiedName}, \var{publicId}, and \var{systemId} strings,
232 representing the information contained in an XML document type
233 declaration.
234 \end{methoddesc}
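A minimal sketch of document creation, assuming the
\class{DOMImplementation} supplied by \module{xml.dom.minidom}. The XHTML
identifiers are used only as familiar sample values.

```python
from xml.dom import XHTML_NAMESPACE, minidom

impl = minidom.getDOMImplementation()

doctype = impl.createDocumentType(
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
doc = impl.createDocument(XHTML_NAMESPACE, "html", doctype)

# Passing None for all three arguments yields a document with no root element.
empty = impl.createDocument(None, None, None)

print(doc.documentElement.tagName, doc.doctype.publicId)
print(empty.documentElement)
```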
237 \subsubsection{Node Objects \label{dom-node-objects}}
239 All of the components of an XML document are subclasses of
240 \class{Node}.
242 \begin{memberdesc}[Node]{nodeType}
243 An integer representing the node type. Symbolic constants for the
244 types are on the \class{Node} object:
245 \constant{ELEMENT_NODE}, \constant{ATTRIBUTE_NODE},
246 \constant{TEXT_NODE}, \constant{CDATA_SECTION_NODE},
247 \constant{ENTITY_NODE}, \constant{PROCESSING_INSTRUCTION_NODE},
248 \constant{COMMENT_NODE}, \constant{DOCUMENT_NODE},
249 \constant{DOCUMENT_TYPE_NODE}, \constant{NOTATION_NODE}.
250 This is a read-only attribute.
251 \end{memberdesc}
253 \begin{memberdesc}[Node]{parentNode}
254 The parent of the current node, or \code{None} for the document node.
255 The value is always a \class{Node} object or \code{None}. For
256 \class{Element} nodes, this will be the parent element, except for the
257 root element, in which case it will be the \class{Document} object.
258 For \class{Attr} nodes, this is always \code{None}.
259 This is a read-only attribute.
260 \end{memberdesc}
262 \begin{memberdesc}[Node]{attributes}
263 A \class{NamedNodeMap} of attribute objects. Only elements have
264 actual values for this; others provide \code{None} for this attribute.
265 This is a read-only attribute.
266 \end{memberdesc}
268 \begin{memberdesc}[Node]{previousSibling}
The node that immediately precedes this one with the same parent. For
instance, the element with an end-tag that comes just before the
271 \var{self} element's start-tag. Of course, XML documents are made
272 up of more than just elements so the previous sibling could be text, a
273 comment, or something else. If this node is the first child of the
274 parent, this attribute will be \code{None}.
275 This is a read-only attribute.
276 \end{memberdesc}
278 \begin{memberdesc}[Node]{nextSibling}
279 The node that immediately follows this one with the same parent. See
280 also \member{previousSibling}. If this is the last child of the
281 parent, this attribute will be \code{None}.
282 This is a read-only attribute.
283 \end{memberdesc}
285 \begin{memberdesc}[Node]{childNodes}
286 A list of nodes contained within this node.
287 This is a read-only attribute.
288 \end{memberdesc}
290 \begin{memberdesc}[Node]{firstChild}
291 The first child of the node, if there are any, or \code{None}.
292 This is a read-only attribute.
293 \end{memberdesc}
295 \begin{memberdesc}[Node]{lastChild}
296 The last child of the node, if there are any, or \code{None}.
297 This is a read-only attribute.
298 \end{memberdesc}
300 \begin{memberdesc}[Node]{localName}
301 The part of the \member{tagName} following the colon if there is one,
302 else the entire \member{tagName}. The value is a string.
303 \end{memberdesc}
305 \begin{memberdesc}[Node]{prefix}
306 The part of the \member{tagName} preceding the colon if there is one,
else the empty string. The value is a string, or \code{None}.
308 \end{memberdesc}
310 \begin{memberdesc}[Node]{namespaceURI}
311 The namespace associated with the element name. This will be a
312 string or \code{None}. This is a read-only attribute.
313 \end{memberdesc}
315 \begin{memberdesc}[Node]{nodeName}
316 This has a different meaning for each node type; see the DOM
317 specification for details. You can always get the information you
318 would get here from another property such as the \member{tagName}
319 property for elements or the \member{name} property for attributes.
320 For all node types, the value of this attribute will be either a
321 string or \code{None}. This is a read-only attribute.
322 \end{memberdesc}
324 \begin{memberdesc}[Node]{nodeValue}
325 This has a different meaning for each node type; see the DOM
326 specification for details. The situation is similar to that with
327 \member{nodeName}. The value is a string or \code{None}.
328 \end{memberdesc}
330 \begin{methoddesc}[Node]{hasAttributes}{}
331 Returns true if the node has any attributes.
332 \end{methoddesc}
334 \begin{methoddesc}[Node]{hasChildNodes}{}
335 Returns true if the node has any child nodes.
336 \end{methoddesc}
338 \begin{methoddesc}[Node]{isSameNode}{other}
339 Returns true if \var{other} refers to the same node as this node.
340 This is especially useful for DOM implementations which use any sort
341 of proxy architecture (because more than one object can refer to the
342 same node).
344 \begin{notice}
345 This is based on a proposed DOM Level~3 API which is still in the
346 ``working draft'' stage, but this particular interface appears
347 uncontroversial. Changes from the W3C will not necessarily affect
348 this method in the Python DOM interface (though any new W3C API for
349 this would also be supported).
350 \end{notice}
351 \end{methoddesc}
353 \begin{methoddesc}[Node]{appendChild}{newChild}
354 Add a new child node to this node at the end of the list of children,
355 returning \var{newChild}.
356 \end{methoddesc}
358 \begin{methoddesc}[Node]{insertBefore}{newChild, refChild}
359 Insert a new child node before an existing child. It must be the case
360 that \var{refChild} is a child of this node; if not,
361 \exception{ValueError} is raised. \var{newChild} is returned. If
362 \var{refChild} is \code{None}, it inserts \var{newChild} at the end of
363 the children's list.
364 \end{methoddesc}
366 \begin{methoddesc}[Node]{removeChild}{oldChild}
367 Remove a child node. \var{oldChild} must be a child of this node; if
368 not, \exception{ValueError} is raised. \var{oldChild} is returned on
369 success. If \var{oldChild} will not be used further, its
370 \method{unlink()} method should be called.
371 \end{methoddesc}
373 \begin{methoddesc}[Node]{replaceChild}{newChild, oldChild}
374 Replace an existing node with a new node. It must be the case that
375 \var{oldChild} is a child of this node; if not,
376 \exception{ValueError} is raised.
377 \end{methoddesc}
379 \begin{methoddesc}[Node]{normalize}{}
380 Join adjacent text nodes so that all stretches of text are stored as
381 single \class{Text} instances. This simplifies processing text from a
382 DOM tree for many applications.
383 \versionadded{2.1}
384 \end{methoddesc}
386 \begin{methoddesc}[Node]{cloneNode}{deep}
387 Clone this node. Setting \var{deep} means to clone all child nodes as
388 well. This returns the clone.
389 \end{methoddesc}
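The tree-editing methods of \class{Node} can be combined as in the
following sketch. It assumes \module{xml.dom.minidom}; other
implementations may differ in details such as which exception is raised
for an invalid argument, so only the successful paths are shown.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><item>b</item></list>")
root = doc.documentElement
b = root.firstChild

# Build <item>a</item> and place it in front of the existing item.
a = doc.createElement("item")
a.appendChild(doc.createTextNode("a"))
root.insertBefore(a, b)

# Append an empty item, then take it out again; removeChild() returns it.
extra = root.appendChild(doc.createElement("item"))
removed = root.removeChild(extra)

# A deep clone copies the subtree but is not attached to any parent.
copy = b.cloneNode(True)

# Two adjacent text nodes collapse into one after normalize().
a.appendChild(doc.createTextNode("!"))
a.normalize()

print(root.toxml())
```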
392 \subsubsection{NodeList Objects \label{dom-nodelist-objects}}
394 A \class{NodeList} represents a sequence of nodes. These objects are
395 used in two ways in the DOM Core recommendation: the
\class{Element} objects provide one as their list of child nodes, and
397 the \method{getElementsByTagName()} and
398 \method{getElementsByTagNameNS()} methods of \class{Node} return
399 objects with this interface to represent query results.
401 The DOM Level~2 recommendation defines one method and one attribute
402 for these objects:
404 \begin{methoddesc}[NodeList]{item}{i}
405 Return the \var{i}'th item from the sequence, if there is one, or
\code{None}. The index \var{i} is not allowed to be less than zero
407 or greater than or equal to the length of the sequence.
408 \end{methoddesc}
410 \begin{memberdesc}[NodeList]{length}
411 The number of nodes in the sequence.
412 \end{memberdesc}
414 In addition, the Python DOM interface requires that some additional
415 support is provided to allow \class{NodeList} objects to be used as
416 Python sequences. All \class{NodeList} implementations must include
417 support for \method{__len__()} and \method{__getitem__()}; this allows
418 iteration over the \class{NodeList} in \keyword{for} statements and
419 proper support for the \function{len()} built-in function.
421 If a DOM implementation supports modification of the document, the
422 \class{NodeList} implementation must also support the
423 \method{__setitem__()} and \method{__delitem__()} methods.
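
The sequence behaviour described above can be sketched as follows, again
assuming \module{xml.dom.minidom}. Returning \code{None} for an
out-of-range index is what that implementation does for \method{item()}.

```python
from xml.dom import minidom

doc = minidom.parseString("<r><a/><b/><c/></r>")
nodes = doc.documentElement.childNodes

tags = [node.tagName for node in nodes]   # iteration in a for clause
first = nodes[0]                          # __getitem__()
second = nodes.item(1)                    # DOM-style access
missing = nodes.item(10)                  # out of range

print(len(nodes), nodes.length, tags)
```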
426 \subsubsection{DocumentType Objects \label{dom-documenttype-objects}}
428 Information about the notations and entities declared by a document
429 (including the external subset if the parser uses it and can provide
430 the information) is available from a \class{DocumentType} object. The
431 \class{DocumentType} for a document is available from the
432 \class{Document} object's \member{doctype} attribute; if there is no
433 \code{DOCTYPE} declaration for the document, the document's
434 \member{doctype} attribute will be set to \code{None} instead of an
435 instance of this interface.
437 \class{DocumentType} is a specialization of \class{Node}, and adds the
438 following attributes:
440 \begin{memberdesc}[DocumentType]{publicId}
441 The public identifier for the external subset of the document type
442 definition. This will be a string or \code{None}.
443 \end{memberdesc}
445 \begin{memberdesc}[DocumentType]{systemId}
446 The system identifier for the external subset of the document type
447 definition. This will be a URI as a string, or \code{None}.
448 \end{memberdesc}
450 \begin{memberdesc}[DocumentType]{internalSubset}
451 A string giving the complete internal subset from the document.
452 This does not include the brackets which enclose the subset. If the
453 document has no internal subset, this should be \code{None}.
454 \end{memberdesc}
456 \begin{memberdesc}[DocumentType]{name}
457 The name of the root element as given in the \code{DOCTYPE}
458 declaration, if present.
459 \end{memberdesc}
461 \begin{memberdesc}[DocumentType]{entities}
462 This is a \class{NamedNodeMap} giving the definitions of external
463 entities. For entity names defined more than once, only the first
464 definition is provided (others are ignored as required by the XML
465 recommendation). This may be \code{None} if the information is not
466 provided by the parser, or if no entities are defined.
467 \end{memberdesc}
469 \begin{memberdesc}[DocumentType]{notations}
470 This is a \class{NamedNodeMap} giving the definitions of notations.
471 For notation names defined more than once, only the first definition
472 is provided (others are ignored as required by the XML
473 recommendation). This may be \code{None} if the information is not
474 provided by the parser, or if no notations are defined.
475 \end{memberdesc}
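As an informal example, the attributes above can be read from a parsed
document. The sketch assumes \module{xml.dom.minidom} with its default
parser, which records the identifiers from the declaration without
fetching the external subset.

```python
from xml.dom import minidom

source = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<html/>"
)
doc = minidom.parseString(source)
doctype = doc.doctype
print(doctype.name)
print(doctype.publicId)
print(doctype.systemId)

# A document without a DOCTYPE declaration has no DocumentType object.
plain = minidom.parseString("<html/>")
print(plain.doctype)
```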
478 \subsubsection{Document Objects \label{dom-document-objects}}
480 A \class{Document} represents an entire XML document, including its
constituent elements, attributes, processing instructions, comments,
etc. Remember that it inherits properties from \class{Node}.
484 \begin{memberdesc}[Document]{documentElement}
485 The one and only root element of the document.
486 \end{memberdesc}
488 \begin{methoddesc}[Document]{createElement}{tagName}
489 Create and return a new element node. The element is not inserted
490 into the document when it is created. You need to explicitly insert
491 it with one of the other methods such as \method{insertBefore()} or
492 \method{appendChild()}.
493 \end{methoddesc}
495 \begin{methoddesc}[Document]{createElementNS}{namespaceURI, tagName}
496 Create and return a new element with a namespace. The
497 \var{tagName} may have a prefix. The element is not inserted into the
498 document when it is created. You need to explicitly insert it with
499 one of the other methods such as \method{insertBefore()} or
500 \method{appendChild()}.
501 \end{methoddesc}
503 \begin{methoddesc}[Document]{createTextNode}{data}
504 Create and return a text node containing the data passed as a
505 parameter. As with the other creation methods, this one does not
506 insert the node into the tree.
507 \end{methoddesc}
509 \begin{methoddesc}[Document]{createComment}{data}
510 Create and return a comment node containing the data passed as a
511 parameter. As with the other creation methods, this one does not
512 insert the node into the tree.
513 \end{methoddesc}
515 \begin{methoddesc}[Document]{createProcessingInstruction}{target, data}
516 Create and return a processing instruction node containing the
517 \var{target} and \var{data} passed as parameters. As with the other
518 creation methods, this one does not insert the node into the tree.
519 \end{methoddesc}
521 \begin{methoddesc}[Document]{createAttribute}{name}
522 Create and return an attribute node. This method does not associate
523 the attribute node with any particular element. You must use
524 \method{setAttributeNode()} on the appropriate \class{Element} object
525 to use the newly created attribute instance.
526 \end{methoddesc}
528 \begin{methoddesc}[Document]{createAttributeNS}{namespaceURI, qualifiedName}
529 Create and return an attribute node with a namespace. The
\var{qualifiedName} may have a prefix. This method does not associate the
531 attribute node with any particular element. You must use
532 \method{setAttributeNode()} on the appropriate \class{Element} object
533 to use the newly created attribute instance.
534 \end{methoddesc}
536 \begin{methoddesc}[Document]{getElementsByTagName}{tagName}
537 Search for all descendants (direct children, children's children,
538 etc.) with a particular element type name.
539 \end{methoddesc}
541 \begin{methoddesc}[Document]{getElementsByTagNameNS}{namespaceURI, localName}
542 Search for all descendants (direct children, children's children,
etc.) with a particular namespace URI and local name. The local name is
the part of the qualified name after the prefix.
545 \end{methoddesc}
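The creation and search methods of \class{Document} might be used
together as in this sketch. It assumes \module{xml.dom.minidom}; the
namespace URI \code{http://example.org/ns} and the prefix \code{ex} are
placeholders invented for the example.

```python
from xml.dom import minidom

doc = minidom.getDOMImplementation().createDocument(None, "library", None)
root = doc.documentElement

book = doc.createElement("book")
detached = book.parentNode is None        # not yet part of the tree
root.appendChild(book)
book.appendChild(doc.createTextNode("Dune"))
root.appendChild(doc.createComment(" more to come "))

# Placeholder namespace URI and prefix, used only for illustration.
NS = "http://example.org/ns"
tagged = root.appendChild(doc.createElementNS(NS, "ex:book"))

plain_books = doc.getElementsByTagName("book")
ns_books = doc.getElementsByTagNameNS(NS, "book")
print(len(plain_books), ns_books[0].tagName)
```

Note that the namespaced search takes the local name \code{book}, while
the plain search compares against the full tag name.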
548 \subsubsection{Element Objects \label{dom-element-objects}}
550 \class{Element} is a subclass of \class{Node}, so inherits all the
551 attributes of that class.
553 \begin{memberdesc}[Element]{tagName}
554 The element type name. In a namespace-using document it may have
555 colons in it. The value is a string.
556 \end{memberdesc}
558 \begin{methoddesc}[Element]{getElementsByTagName}{tagName}
Same as the equivalent method in the \class{Document} class.
560 \end{methoddesc}
\begin{methoddesc}[Element]{getElementsByTagNameNS}{namespaceURI, localName}
Same as the equivalent method in the \class{Document} class.
564 \end{methoddesc}
566 \begin{methoddesc}[Element]{hasAttribute}{name}
567 Returns true if the element has an attribute named by \var{name}.
568 \end{methoddesc}
570 \begin{methoddesc}[Element]{hasAttributeNS}{namespaceURI, localName}
571 Returns true if the element has an attribute named by
572 \var{namespaceURI} and \var{localName}.
573 \end{methoddesc}
575 \begin{methoddesc}[Element]{getAttribute}{name}
576 Return the value of the attribute named by \var{name} as a
577 string. If no such attribute exists, an empty string is returned,
578 as if the attribute had no value.
579 \end{methoddesc}
581 \begin{methoddesc}[Element]{getAttributeNode}{attrname}
582 Return the \class{Attr} node for the attribute named by
583 \var{attrname}.
584 \end{methoddesc}
586 \begin{methoddesc}[Element]{getAttributeNS}{namespaceURI, localName}
587 Return the value of the attribute named by \var{namespaceURI} and
588 \var{localName} as a string. If no such attribute exists, an empty
589 string is returned, as if the attribute had no value.
590 \end{methoddesc}
592 \begin{methoddesc}[Element]{getAttributeNodeNS}{namespaceURI, localName}
593 Return an attribute value as a node, given a \var{namespaceURI} and
594 \var{localName}.
595 \end{methoddesc}
597 \begin{methoddesc}[Element]{removeAttribute}{name}
598 Remove an attribute by name. No exception is raised if there is no
599 matching attribute.
600 \end{methoddesc}
602 \begin{methoddesc}[Element]{removeAttributeNode}{oldAttr}
603 Remove and return \var{oldAttr} from the attribute list, if present.
604 If \var{oldAttr} is not present, \exception{NotFoundErr} is raised.
605 \end{methoddesc}
607 \begin{methoddesc}[Element]{removeAttributeNS}{namespaceURI, localName}
608 Remove an attribute by name. Note that it uses a localName, not a
609 qname. No exception is raised if there is no matching attribute.
610 \end{methoddesc}
612 \begin{methoddesc}[Element]{setAttribute}{name, value}
613 Set an attribute value from a string.
614 \end{methoddesc}
616 \begin{methoddesc}[Element]{setAttributeNode}{newAttr}
617 Add a new attribute node to the element, replacing an existing
618 attribute if necessary if the \member{name} attribute matches. If a
619 replacement occurs, the old attribute node will be returned. If
620 \var{newAttr} is already in use, \exception{InuseAttributeErr} will be
621 raised.
622 \end{methoddesc}
624 \begin{methoddesc}[Element]{setAttributeNodeNS}{newAttr}
625 Add a new attribute node to the element, replacing an existing
626 attribute if necessary if the \member{namespaceURI} and
627 \member{localName} attributes match. If a replacement occurs, the old
628 attribute node will be returned. If \var{newAttr} is already in use,
629 \exception{InuseAttributeErr} will be raised.
630 \end{methoddesc}
632 \begin{methoddesc}[Element]{setAttributeNS}{namespaceURI, qname, value}
633 Set an attribute value from a string, given a \var{namespaceURI} and a
\var{qname}. Note that a qname is the whole attribute name, including any
prefix, unlike the \var{localName} taken by the methods above.
636 \end{methoddesc}
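The attribute methods of \class{Element} can be exercised as in the
sketch below, which assumes \module{xml.dom.minidom}. The XLink namespace
is used only as a sample namespace URI.

```python
from xml.dom import minidom

doc = minidom.parseString("<a/>")
link = doc.documentElement

link.setAttribute("href", "index.html")
href = link.getAttribute("href")
missing = link.getAttribute("title")      # empty string for an absent attribute

# Namespaced variant: set with the qualified name, read with the local name.
XLINK = "http://www.w3.org/1999/xlink"
link.setAttributeNS(XLINK, "xlink:type", "simple")
kind = link.getAttributeNS(XLINK, "type")
node = link.getAttributeNodeNS(XLINK, "type")

link.removeAttribute("href")
print(href, repr(missing), kind, node.name, link.hasAttribute("href"))
```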
639 \subsubsection{Attr Objects \label{dom-attr-objects}}
641 \class{Attr} inherits from \class{Node}, so inherits all its
642 attributes.
644 \begin{memberdesc}[Attr]{name}
645 The attribute name. In a namespace-using document it may have colons
646 in it.
647 \end{memberdesc}
649 \begin{memberdesc}[Attr]{localName}
650 The part of the name following the colon if there is one, else the
651 entire name. This is a read-only attribute.
652 \end{memberdesc}
654 \begin{memberdesc}[Attr]{prefix}
655 The part of the name preceding the colon if there is one, else the
656 empty string.
657 \end{memberdesc}
660 \subsubsection{NamedNodeMap Objects \label{dom-attributelist-objects}}
662 \class{NamedNodeMap} does \emph{not} inherit from \class{Node}.
664 \begin{memberdesc}[NamedNodeMap]{length}
665 The length of the attribute list.
666 \end{memberdesc}
668 \begin{methoddesc}[NamedNodeMap]{item}{index}
669 Return an attribute with a particular index. The order you get the
670 attributes in is arbitrary but will be consistent for the life of a
671 DOM. Each item is an attribute node. Get its value with the
672 \member{value} attribute.
673 \end{methoddesc}
675 There are also experimental methods that give this class more mapping
676 behavior. You can use them or you can use the standardized
677 \method{getAttribute*()} family of methods on the \class{Element}
678 objects.
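
A short sketch of the standardized part of this interface, assuming
\module{xml.dom.minidom}. Because the order of items is arbitrary, the
example sorts the results before printing them.

```python
from xml.dom import minidom

doc = minidom.parseString('<point x="1" y="2"/>')
attrs = doc.documentElement.attributes

# item() order is arbitrary, so sort before relying on it.
pairs = sorted(
    (attrs.item(i).name, attrs.item(i).value) for i in range(attrs.length)
)
print(pairs)

# Only elements carry a NamedNodeMap; the document node reports None.
print(doc.attributes)
```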
681 \subsubsection{Comment Objects \label{dom-comment-objects}}
683 \class{Comment} represents a comment in the XML document. It is a
684 subclass of \class{Node}, but cannot have child nodes.
686 \begin{memberdesc}[Comment]{data}
687 The content of the comment as a string. The attribute contains all
688 characters between the leading \code{<!-}\code{-} and trailing
689 \code{-}\code{->}, but does not include them.
690 \end{memberdesc}
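For example, with \module{xml.dom.minidom} (assumed here as the
implementation), whitespace inside the comment delimiters is part of
\member{data}:

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<r><!-- keep the spaces --></r>")
comment = doc.documentElement.firstChild

print(comment.nodeType == Node.COMMENT_NODE, repr(comment.data))
```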
693 \subsubsection{Text and CDATASection Objects \label{dom-text-objects}}
695 The \class{Text} interface represents text in the XML document. If
696 the parser and DOM implementation support the DOM's XML extension,
697 portions of the text enclosed in CDATA marked sections are stored in
698 \class{CDATASection} objects. These two interfaces are identical, but
699 provide different values for the \member{nodeType} attribute.
701 These interfaces extend the \class{Node} interface. They cannot have
702 child nodes.
704 \begin{memberdesc}[Text]{data}
705 The content of the text node as a string.
706 \end{memberdesc}
708 \begin{notice}
709 The use of a \class{CDATASection} node does not indicate that the
710 node represents a complete CDATA marked section, only that the
711 content of the node was part of a CDATA section. A single CDATA
712 section may be represented by more than one node in the document
713 tree. There is no way to determine whether two adjacent
714 \class{CDATASection} nodes represent different CDATA marked
715 sections.
716 \end{notice}
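The difference between the two node types can be observed in a sketch
like the following. It assumes \module{xml.dom.minidom} with its default
parser, which keeps CDATA marked sections as \class{CDATASection} nodes;
other implementations may report plain \class{Text} nodes instead.

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<r>a<![CDATA[<b>]]>c</r>")
children = doc.documentElement.childNodes

types = [child.nodeType for child in children]
data = [child.data for child in children]
print(types, data)
```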
719 \subsubsection{ProcessingInstruction Objects \label{dom-pi-objects}}
721 Represents a processing instruction in the XML document; this inherits
722 from the \class{Node} interface and cannot have child nodes.
724 \begin{memberdesc}[ProcessingInstruction]{target}
725 The content of the processing instruction up to the first whitespace
726 character. This is a read-only attribute.
727 \end{memberdesc}
729 \begin{memberdesc}[ProcessingInstruction]{data}
730 The content of the processing instruction following the first
731 whitespace character.
732 \end{memberdesc}
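A brief sketch of the two attributes, assuming
\module{xml.dom.minidom}; the stylesheet instruction is just sample
input.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml-stylesheet href="style.css" type="text/css"?><root/>'
)
pi = doc.firstChild

print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE)
print(pi.target)
print(pi.data)
```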
\subsubsection{Exceptions \label{dom-exceptions}}

\versionadded{2.1}

The DOM Level~2 recommendation defines a single exception,
\exception{DOMException}, and a number of constants that allow
applications to determine what sort of error occurred.
\exception{DOMException} instances carry a \member{code} attribute
that provides the appropriate value for the specific exception.

The Python DOM interface provides the constants, but also expands the
set of exceptions so that a specific exception exists for each of the
exception codes defined by the DOM. The implementations must raise
the appropriate specific exception, each of which carries the
appropriate value for the \member{code} attribute.

\begin{excdesc}{DOMException}
Base exception class used for all specific DOM exceptions. This
exception class cannot be directly instantiated.
\end{excdesc}

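
A brief sketch of this restriction follows. It assumes the
\module{xml.dom} package distributed with CPython, where attempting to
instantiate the base class raises \exception{RuntimeError}; that
particular exception type is an observation about that package, not a
requirement stated here.

```python
import xml.dom

try:
    xml.dom.DOMException()
except RuntimeError as exc:
    # CPython's xml.dom refuses to instantiate the base class directly.
    print("refused:", exc)

# The specific subclasses can be instantiated and are DOMException instances.
err = xml.dom.NotSupportedErr("example message")
print(isinstance(err, xml.dom.DOMException))
```
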
\begin{excdesc}{DomstringSizeErr}
Raised when a specified range of text does not fit into a string.
This is not known to be used in the Python DOM implementations, but
may be received from DOM implementations not written in Python.
\end{excdesc}

\begin{excdesc}{HierarchyRequestErr}
Raised when an attempt is made to insert a node where the node type
is not allowed.
\end{excdesc}

\begin{excdesc}{IndexSizeErr}
Raised when an index or size parameter to a method is negative or
exceeds the allowed values.
\end{excdesc}

\begin{excdesc}{InuseAttributeErr}
Raised when an attempt is made to insert an \class{Attr} node that
is already present elsewhere in the document.
\end{excdesc}

\begin{excdesc}{InvalidAccessErr}
Raised if a parameter or an operation is not supported on the
underlying object.
\end{excdesc}

\begin{excdesc}{InvalidCharacterErr}
Raised when a string parameter contains a character that the XML 1.0
recommendation does not permit in the context in which it is used.
For example, attempting to create an \class{Element} node with a
space in the element type name will cause this error to be raised.
\end{excdesc}

\begin{excdesc}{InvalidModificationErr}
Raised when an attempt is made to modify the type of a node.
\end{excdesc}

\begin{excdesc}{InvalidStateErr}
Raised when an attempt is made to use an object that is not defined
or is no longer usable.
\end{excdesc}

\begin{excdesc}{NamespaceErr}
Raised when an attempt is made to change any object in a way that is
not permitted with regard to the
\citetitle[http://www.w3.org/TR/REC-xml-names/]{Namespaces in XML}
recommendation.
\end{excdesc}

\begin{excdesc}{NotFoundErr}
Raised when a node does not exist in the referenced context. For
example, \method{NamedNodeMap.removeNamedItem()} will raise this if
the node passed in does not exist in the map.
\end{excdesc}

\begin{excdesc}{NotSupportedErr}
Raised when the implementation does not support the requested type
of object or operation.
\end{excdesc}

\begin{excdesc}{NoDataAllowedErr}
This is raised if data is specified for a node which does not
support data.
% XXX a better explanation is needed!
\end{excdesc}

\begin{excdesc}{NoModificationAllowedErr}
Raised on attempts to modify an object where modifications are not
allowed (such as for read-only nodes).
\end{excdesc}

\begin{excdesc}{SyntaxErr}
Raised when an invalid or illegal string is specified.
% XXX how is this different from InvalidCharacterErr ???
\end{excdesc}

\begin{excdesc}{WrongDocumentErr}
Raised when a node is inserted into a document other than the one it
currently belongs to, and the implementation does not support
migrating the node from one document to the other.
\end{excdesc}

The exception codes defined in the DOM recommendation map to the
exceptions described above according to this table:

\begin{tableii}{l|l}{constant}{Constant}{Exception}
  \lineii{DOMSTRING_SIZE_ERR}{\exception{DomstringSizeErr}}
  \lineii{HIERARCHY_REQUEST_ERR}{\exception{HierarchyRequestErr}}
  \lineii{INDEX_SIZE_ERR}{\exception{IndexSizeErr}}
  \lineii{INUSE_ATTRIBUTE_ERR}{\exception{InuseAttributeErr}}
  \lineii{INVALID_ACCESS_ERR}{\exception{InvalidAccessErr}}
  \lineii{INVALID_CHARACTER_ERR}{\exception{InvalidCharacterErr}}
  \lineii{INVALID_MODIFICATION_ERR}{\exception{InvalidModificationErr}}
  \lineii{INVALID_STATE_ERR}{\exception{InvalidStateErr}}
  \lineii{NAMESPACE_ERR}{\exception{NamespaceErr}}
  \lineii{NOT_FOUND_ERR}{\exception{NotFoundErr}}
  \lineii{NOT_SUPPORTED_ERR}{\exception{NotSupportedErr}}
  \lineii{NO_DATA_ALLOWED_ERR}{\exception{NoDataAllowedErr}}
  \lineii{NO_MODIFICATION_ALLOWED_ERR}{\exception{NoModificationAllowedErr}}
  \lineii{SYNTAX_ERR}{\exception{SyntaxErr}}
  \lineii{WRONG_DOCUMENT_ERR}{\exception{WrongDocumentErr}}
\end{tableii}


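
The correspondence between specific exceptions and codes can be
checked with a small sketch. It assumes \module{xml.dom.minidom} as the
implementation; the element and attribute names are hypothetical.

```python
import xml.dom
from xml.dom.minidom import Document

doc = Document()
parent = doc.createElement("parent")      # hypothetical element names
stranger = doc.createElement("stranger")
doc.appendChild(parent)

# Removing a node that is not a child raises the specific exception,
# which can also be caught through the DOMException base class.
try:
    parent.removeChild(stranger)
except xml.dom.DOMException as exc:
    print(type(exc).__name__, exc.code == xml.dom.NOT_FOUND_ERR)

# An Attr node is not a permitted child of an Element.
try:
    parent.appendChild(doc.createAttribute("attr"))
except xml.dom.HierarchyRequestErr as exc:
    print(exc.code == xml.dom.HIERARCHY_REQUEST_ERR)
```

Catching \exception{DOMException} and inspecting \member{code} suits
code ported from other DOM bindings, while catching the specific
exception class is usually more direct in Python.
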
\subsection{Conformance \label{dom-conformance}}

This section describes the conformance requirements and relationships
between the Python DOM API, the W3C DOM recommendations, and the OMG
IDL mapping for Python.


\subsubsection{Type Mapping \label{dom-type-mapping}}

The primitive IDL types used in the DOM specification are mapped to
Python types according to the following table.

\begin{tableii}{l|l}{code}{IDL Type}{Python Type}
  \lineii{boolean}{\code{IntegerType} (with a value of \code{0} or \code{1})}
  \lineii{int}{\code{IntegerType}}
  \lineii{long int}{\code{IntegerType}}
  \lineii{unsigned int}{\code{IntegerType}}
\end{tableii}

Additionally, the \class{DOMString} defined in the recommendation is
mapped to a Python string or Unicode string. Applications should
be able to handle Unicode whenever a string is returned from the DOM.

The IDL \keyword{null} value is mapped to \code{None}, which may be
accepted or provided by the implementation whenever \keyword{null} is
allowed by the API.


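
The mapping can be observed with a short sketch. It assumes
\module{xml.dom.minidom} running under Python~3, where every string is
a Unicode string and \code{bool} is a subclass of the integer type; the
element and attribute names are hypothetical.

```python
from xml.dom.minidom import Document

doc = Document()
elem = doc.createElement("item")  # hypothetical element name
elem.setAttribute("id", "42")

# DOMString values come back as Python strings.
print(isinstance(elem.tagName, str), isinstance(elem.getAttribute("id"), str))

# IDL boolean results are usable as integers (bool is an int subclass).
print(isinstance(elem.hasChildNodes(), int))

# IDL null is represented by None.
print(elem.parentNode is None, elem.firstChild is None)
print(elem.getAttributeNode("missing") is None)
```
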
\subsubsection{Accessor Methods \label{dom-accessor-methods}}

The mapping from OMG IDL to Python defines accessor functions for IDL
\keyword{attribute} declarations in much the way the Java mapping
does. Mapping the IDL declarations

\begin{verbatim}
readonly attribute string someValue;
attribute string anotherValue;
\end{verbatim}

yields three accessor functions: a ``get'' method for
\member{someValue} (\method{_get_someValue()}), and ``get'' and
``set'' methods for
\member{anotherValue} (\method{_get_anotherValue()} and
\method{_set_anotherValue()}). The mapping, in particular, does not
require that the IDL attributes are accessible as normal Python
attributes: \code{\var{object}.someValue} is \emph{not} required to
work, and may raise an \exception{AttributeError}.

The Python DOM API, however, \emph{does} require that normal attribute
access work. This means that the typical surrogates generated by
Python IDL compilers are not likely to work, and wrapper objects may
be needed on the client if the DOM objects are accessed via CORBA.
While this does require some additional consideration for CORBA DOM
clients, the implementers with experience using DOM over CORBA from
Python do not consider this a problem. Attributes that are declared
\keyword{readonly} may not restrict write access in all DOM
implementations.

In the Python DOM API, accessor functions are not required. If provided,
they should take the form defined by the Python IDL mapping, but
these methods are considered unnecessary since the attributes are
accessible directly from Python. ``Set'' accessors should never be
provided for \keyword{readonly} attributes.

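
One way these rules can fit together is sketched below. The class
\class{SketchNode} and its private fields are hypothetical and belong to
no DOM implementation; the sketch only shows optional IDL-style
accessors sitting alongside the required normal attribute access, using
the two IDL declarations from the example above.

```python
class SketchNode:
    """Hypothetical object: plain attribute access plus optional accessors."""

    def __init__(self, some_value, another_value):
        self._some_value = some_value
        self._another_value = another_value

    # Optional accessors, named as in the Python IDL mapping.
    def _get_someValue(self):
        return self._some_value

    def _get_anotherValue(self):
        return self._another_value

    def _set_anotherValue(self, value):
        self._another_value = value

    # Normal attribute access, which the Python DOM API requires.
    # someValue is readonly in the IDL, so it gets no "set" accessor.
    someValue = property(_get_someValue)
    anotherValue = property(_get_anotherValue, _set_anotherValue)


node = SketchNode("fixed", "initial")
node.anotherValue = "changed"
print(node.someValue, node.anotherValue, node._get_anotherValue())
```

In this sketch assigning to \code{node.someValue} raises
\exception{AttributeError}. That is a choice made for the example; as
noted above, not every DOM implementation restricts write access to
\keyword{readonly} attributes.
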
The IDL definitions do not fully embody the requirements of the W3C DOM
API, such as the notion that certain objects, for example the return
value of \method{getElementsByTagName()}, are ``live''. The Python DOM
API does not require implementations to enforce such requirements.
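
The practical effect can be seen in the following sketch. It assumes
\module{xml.dom.minidom}, whose \method{getElementsByTagName()} returns
a list computed at call time; other implementations may behave
differently. The document content is made up for the example.

```python
from xml.dom.minidom import parseString

doc = parseString("<list><item/><item/></list>")
items = doc.getElementsByTagName("item")
count_before = len(items)

# Add a third <item> after the list was obtained.
doc.documentElement.appendChild(doc.createElement("item"))

# A "live" list would now include the new element; here the earlier
# result is unchanged and only a fresh call sees the addition.
print(count_before, len(items), len(doc.getElementsByTagName("item")))
```

Code that modifies the tree while iterating over such a result should
therefore not rely on the list tracking the changes.
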