:mod:`xml.dom` --- The Document Object Model API
================================================
4 |
|
5 .. module:: xml.dom |
|
6 :synopsis: Document Object Model API for Python. |
|
7 .. sectionauthor:: Paul Prescod <paul@prescod.net> |
|
8 .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> |
|
9 |
|
10 |
|
11 .. versionadded:: 2.0 |
|
12 |
|
The Document Object Model, or "DOM," is a cross-language API from the World Wide
Web Consortium (W3C) for accessing and modifying XML documents.  A DOM
implementation presents an XML document as a tree structure, or allows client
code to build such a structure from scratch.  It then gives access to the
structure through a set of objects which provide well-known interfaces.
|
18 |
|
19 The DOM is extremely useful for random-access applications. SAX only allows you |
|
20 a view of one bit of the document at a time. If you are looking at one SAX |
|
21 element, you have no access to another. If you are looking at a text node, you |
|
22 have no access to a containing element. When you write a SAX application, you |
|
23 need to keep track of your program's position in the document somewhere in your |
|
24 own code. SAX does not do it for you. Also, if you need to look ahead in the |
|
25 XML document, you are just out of luck. |
|
26 |
|
Some applications are simply impossible in an event-driven model with no access
to a tree.  Of course you could build some sort of tree yourself from SAX events,
but the DOM allows you to avoid writing that code.  The DOM is a standard tree
representation for XML data.
|
31 |
|
32 The Document Object Model is being defined by the W3C in stages, or "levels" in |
|
33 their terminology. The Python mapping of the API is substantially based on the |
|
34 DOM Level 2 recommendation. |
|
35 |
|
36 .. XXX PyXML is dead... |
|
37 .. The mapping of the Level 3 specification, currently |
|
38 only available in draft form, is being developed by the `Python XML Special |
|
39 Interest Group <http://www.python.org/sigs/xml-sig/>`_ as part of the `PyXML |
|
40 package <http://pyxml.sourceforge.net/>`_. Refer to the documentation bundled |
|
41 with that package for information on the current state of DOM Level 3 support. |
|
42 |
|
43 .. What if your needs are somewhere between SAX and the DOM? Perhaps |
|
44 you cannot afford to load the entire tree in memory but you find the |
|
45 SAX model somewhat cumbersome and low-level. There is also a module |
|
46 called xml.dom.pulldom that allows you to build trees of only the |
|
47 parts of a document that you need structured access to. It also has |
|
48 features that allow you to find your way around the DOM. |
|
49 See http://www.prescod.net/python/pulldom |
|
50 |
|
51 DOM applications typically start by parsing some XML into a DOM. How this is |
|
52 accomplished is not covered at all by DOM Level 1, and Level 2 provides only |
|
53 limited improvements: There is a :class:`DOMImplementation` object class which |
|
54 provides access to :class:`Document` creation methods, but no way to access an |
|
55 XML reader/parser/Document builder in an implementation-independent way. There |
|
56 is also no well-defined way to access these methods without an existing |
|
57 :class:`Document` object. In Python, each DOM implementation will provide a |
|
58 function :func:`getDOMImplementation`. DOM Level 3 adds a Load/Store |
|
59 specification, which defines an interface to the reader, but this is not yet |
|
60 available in the Python standard library. |
|
61 |
|
62 Once you have a DOM document object, you can access the parts of your XML |
|
63 document through its properties and methods. These properties are defined in |
|
64 the DOM specification; this portion of the reference manual describes the |
|
65 interpretation of the specification in Python. |
|
66 |
|
67 The specification provided by the W3C defines the DOM API for Java, ECMAScript, |
|
68 and OMG IDL. The Python mapping defined here is based in large part on the IDL |
|
69 version of the specification, but strict compliance is not required (though |
|
70 implementations are free to support the strict mapping from IDL). See section |
|
71 :ref:`dom-conformance` for a detailed discussion of mapping requirements. |
|
72 |
|
73 |
|
74 .. seealso:: |
|
75 |
|
76 `Document Object Model (DOM) Level 2 Specification <http://www.w3.org/TR/DOM-Level-2-Core/>`_ |
|
77 The W3C recommendation upon which the Python DOM API is based. |
|
78 |
|
79 `Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_ |
|
80 The W3C recommendation for the DOM supported by :mod:`xml.dom.minidom`. |
|
81 |
|
82 `Python Language Mapping Specification <http://www.omg.org/docs/formal/02-11-05.pdf>`_ |
|
83 This specifies the mapping from OMG IDL to Python. |
|
84 |
|
85 |
|
86 Module Contents |
|
87 --------------- |
|
88 |
|
The :mod:`xml.dom` module contains the following functions:
|
90 |
|
91 |
|
92 .. function:: registerDOMImplementation(name, factory) |
|
93 |
|
94 Register the *factory* function with the name *name*. The factory function |
|
95 should return an object which implements the :class:`DOMImplementation` |
|
96 interface. The factory function can return the same object every time, or a new |
|
97 one for each call, as appropriate for the specific implementation (e.g. if that |
|
98 implementation supports some customization). |
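
As a minimal sketch of registration, assuming :mod:`xml.dom.minidom` as the backing implementation: the name ``example-dom`` and the function ``minidom_factory`` are made up for illustration and are not part of the module.

```python
from xml.dom import getDOMImplementation, registerDOMImplementation
from xml.dom import minidom


def minidom_factory():
    # Hypothetical factory: return minidom's DOMImplementation object.
    return minidom.getDOMImplementation()


# "example-dom" is an arbitrary name chosen for this sketch.
registerDOMImplementation("example-dom", minidom_factory)

# The registered name can now be passed to getDOMImplementation().
impl = getDOMImplementation("example-dom")
print(impl.hasFeature("core", "2.0"))
```

This factory returns the same object on every call; a factory for a customizable implementation could build a new object each time instead.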
|
99 |
|
100 |
|
101 .. function:: getDOMImplementation([name[, features]]) |
|
102 |
|
103 Return a suitable DOM implementation. The *name* is either well-known, the |
|
104 module name of a DOM implementation, or ``None``. If it is not ``None``, imports |
|
105 the corresponding module and returns a :class:`DOMImplementation` object if the |
|
106 import succeeds. If no name is given, and if the environment variable |
|
107 :envvar:`PYTHON_DOM` is set, this variable is used to find the implementation. |
|
108 |
|
109 If name is not given, this examines the available implementations to find one |
|
110 with the required feature set. If no implementation can be found, raise an |
|
111 :exc:`ImportError`. The features list must be a sequence of ``(feature, |
|
112 version)`` pairs which are passed to the :meth:`hasFeature` method on available |
|
113 :class:`DOMImplementation` objects. |
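
The two lookup styles might be used as follows. This sketch assumes that the well-known name ``minidom`` is available, as it is in the standard library, and that the :envvar:`PYTHON_DOM` environment variable is not set.

```python
from xml.dom import getDOMImplementation

# Look up an implementation by its well-known name.
impl = getDOMImplementation("minidom")

# Or let the module search for any implementation that reports
# support for the requested (feature, version) pairs.
impl_by_feature = getDOMImplementation(features=[("core", "2.0")])

print(impl_by_feature.hasFeature("core", "2.0"))
```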
|
114 |
|
115 Some convenience constants are also provided: |
|
116 |
|
117 |
|
118 .. data:: EMPTY_NAMESPACE |
|
119 |
|
120 The value used to indicate that no namespace is associated with a node in the |
|
121 DOM. This is typically found as the :attr:`namespaceURI` of a node, or used as |
|
122 the *namespaceURI* parameter to a namespaces-specific method. |
|
123 |
|
124 .. versionadded:: 2.2 |
|
125 |
|
126 |
|
127 .. data:: XML_NAMESPACE |
|
128 |
|
129 The namespace URI associated with the reserved prefix ``xml``, as defined by |
|
130 `Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_ (section 4). |
|
131 |
|
132 .. versionadded:: 2.2 |
|
133 |
|
134 |
|
135 .. data:: XMLNS_NAMESPACE |
|
136 |
|
137 The namespace URI for namespace declarations, as defined by `Document Object |
|
138 Model (DOM) Level 2 Core Specification |
|
139 <http://www.w3.org/TR/DOM-Level-2-Core/core.html>`_ (section 1.1.8). |
|
140 |
|
141 .. versionadded:: 2.2 |
|
142 |
|
143 |
|
144 .. data:: XHTML_NAMESPACE |
|
145 |
|
146 The URI of the XHTML namespace as defined by `XHTML 1.0: The Extensible |
|
147 HyperText Markup Language <http://www.w3.org/TR/xhtml1/>`_ (section 3.1.1). |
|
148 |
|
149 .. versionadded:: 2.2 |
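
A short sketch of how these constants might be used with the namespace-aware creation methods, assuming the :mod:`xml.dom.minidom` implementation; the element name ``note`` is arbitrary.

```python
from xml.dom import (
    EMPTY_NAMESPACE,
    XHTML_NAMESPACE,
    XML_NAMESPACE,
    XMLNS_NAMESPACE,
    getDOMImplementation,
)

doc = getDOMImplementation("minidom").createDocument(XHTML_NAMESPACE, "html", None)

# An element in no namespace: pass EMPTY_NAMESPACE as the namespace URI.
plain = doc.createElementNS(EMPTY_NAMESPACE, "note")

print(doc.documentElement.namespaceURI)
print(plain.namespaceURI)
print(XML_NAMESPACE, XMLNS_NAMESPACE)
```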
|
150 |
|
151 In addition, :mod:`xml.dom` contains a base :class:`Node` class and the DOM |
|
152 exception classes. The :class:`Node` class provided by this module does not |
|
153 implement any of the methods or attributes defined by the DOM specification; |
|
154 concrete DOM implementations must provide those. The :class:`Node` class |
|
155 provided as part of this module does provide the constants used for the |
|
156 :attr:`nodeType` attribute on concrete :class:`Node` objects; they are located |
|
157 within the class rather than at the module level to conform with the DOM |
|
158 specifications. |
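
The node type constants can be compared against :attr:`nodeType` to dispatch on the kind of node. The following is only a sketch using :mod:`xml.dom.minidom`; the helper ``describe`` is hypothetical.

```python
from xml.dom import Node
from xml.dom import minidom

doc = minidom.parseString("<root>text<child/></root>")


def describe(node):
    # Hypothetical helper: map a few nodeType values to readable names.
    if node.nodeType == Node.ELEMENT_NODE:
        return "element"
    if node.nodeType == Node.TEXT_NODE:
        return "text"
    if node.nodeType == Node.DOCUMENT_NODE:
        return "document"
    return "other"


kinds = [describe(child) for child in doc.documentElement.childNodes]
print(describe(doc), kinds)
```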
|
159 |
|
160 .. Should the Node documentation go here? |
|
161 |
|
162 |
|
163 .. _dom-objects: |
|
164 |
|
165 Objects in the DOM |
|
166 ------------------ |
|
167 |
|
168 The definitive documentation for the DOM is the DOM specification from the W3C. |
|
169 |
|
170 Note that DOM attributes may also be manipulated as nodes instead of as simple |
|
171 strings. It is fairly rare that you must do this, however, so this usage is not |
|
172 yet documented. |
|
173 |
|
+--------------------------------+-----------------------------------+---------------------------------+
| Interface                      | Section                           | Purpose                         |
+================================+===================================+=================================+
| :class:`DOMImplementation`     | :ref:`dom-implementation-objects` | Interface to the underlying     |
|                                |                                   | implementation.                 |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Node`                  | :ref:`dom-node-objects`           | Base interface for most objects |
|                                |                                   | in a document.                  |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`NodeList`              | :ref:`dom-nodelist-objects`       | Interface for a sequence of     |
|                                |                                   | nodes.                          |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`DocumentType`          | :ref:`dom-documenttype-objects`   | Information about the           |
|                                |                                   | declarations needed to process  |
|                                |                                   | a document.                     |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Document`              | :ref:`dom-document-objects`       | Object which represents an      |
|                                |                                   | entire document.                |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Element`               | :ref:`dom-element-objects`        | Element nodes in the document   |
|                                |                                   | hierarchy.                      |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Attr`                  | :ref:`dom-attr-objects`           | Attribute value nodes on        |
|                                |                                   | element nodes.                  |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Comment`               | :ref:`dom-comment-objects`        | Representation of comments in   |
|                                |                                   | the source document.            |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`Text`                  | :ref:`dom-text-objects`           | Nodes containing textual        |
|                                |                                   | content from the document.      |
+--------------------------------+-----------------------------------+---------------------------------+
| :class:`ProcessingInstruction` | :ref:`dom-pi-objects`             | Processing instruction          |
|                                |                                   | representation.                 |
+--------------------------------+-----------------------------------+---------------------------------+
|
208 |
|
209 An additional section describes the exceptions defined for working with the DOM |
|
210 in Python. |
|
211 |
|
212 |
|
213 .. _dom-implementation-objects: |
|
214 |
|
215 DOMImplementation Objects |
|
216 ^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
217 |
|
218 The :class:`DOMImplementation` interface provides a way for applications to |
|
219 determine the availability of particular features in the DOM they are using. |
|
220 DOM Level 2 added the ability to create new :class:`Document` and |
|
221 :class:`DocumentType` objects using the :class:`DOMImplementation` as well. |
|
222 |
|
223 |
|
224 .. method:: DOMImplementation.hasFeature(feature, version) |
|
225 |
|
226 Return true if the feature identified by the pair of strings *feature* and |
|
227 *version* is implemented. |
|
228 |
|
229 |
|
230 .. method:: DOMImplementation.createDocument(namespaceUri, qualifiedName, doctype) |
|
231 |
|
232 Return a new :class:`Document` object (the root of the DOM), with a child |
|
233 :class:`Element` object having the given *namespaceUri* and *qualifiedName*. The |
|
234 *doctype* must be a :class:`DocumentType` object created by |
|
235 :meth:`createDocumentType`, or ``None``. In the Python DOM API, the first two |
|
236 arguments can also be ``None`` in order to indicate that no :class:`Element` |
|
237 child is to be created. |
|
238 |
|
239 |
|
240 .. method:: DOMImplementation.createDocumentType(qualifiedName, publicId, systemId) |
|
241 |
|
242 Return a new :class:`DocumentType` object that encapsulates the given |
|
243 *qualifiedName*, *publicId*, and *systemId* strings, representing the |
|
244 information contained in an XML document type declaration. |
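
The two creation methods can be combined as sketched below, assuming the :mod:`xml.dom.minidom` implementation. The XHTML identifiers are used purely as sample values.

```python
from xml.dom import XHTML_NAMESPACE, getDOMImplementation

impl = getDOMImplementation("minidom")

# Build the document type first, then hand it to createDocument().
doctype = impl.createDocumentType(
    "html",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
)
doc = impl.createDocument(XHTML_NAMESPACE, "html", doctype)

# Passing None for every argument yields a document with no root element.
empty_doc = impl.createDocument(None, None, None)

print(doc.documentElement.tagName, doc.doctype.publicId)
```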
|
245 |
|
246 |
|
247 .. _dom-node-objects: |
|
248 |
|
249 Node Objects |
|
250 ^^^^^^^^^^^^ |
|
251 |
|
252 All of the components of an XML document are subclasses of :class:`Node`. |
|
253 |
|
254 |
|
255 .. attribute:: Node.nodeType |
|
256 |
|
257 An integer representing the node type. Symbolic constants for the types are on |
|
258 the :class:`Node` object: :const:`ELEMENT_NODE`, :const:`ATTRIBUTE_NODE`, |
|
259 :const:`TEXT_NODE`, :const:`CDATA_SECTION_NODE`, :const:`ENTITY_NODE`, |
|
260 :const:`PROCESSING_INSTRUCTION_NODE`, :const:`COMMENT_NODE`, |
|
261 :const:`DOCUMENT_NODE`, :const:`DOCUMENT_TYPE_NODE`, :const:`NOTATION_NODE`. |
|
262 This is a read-only attribute. |
|
263 |
|
264 |
|
265 .. attribute:: Node.parentNode |
|
266 |
|
267 The parent of the current node, or ``None`` for the document node. The value is |
|
268 always a :class:`Node` object or ``None``. For :class:`Element` nodes, this |
|
269 will be the parent element, except for the root element, in which case it will |
|
270 be the :class:`Document` object. For :class:`Attr` nodes, this is always |
|
271 ``None``. This is a read-only attribute. |
|
272 |
|
273 |
|
274 .. attribute:: Node.attributes |
|
275 |
|
276 A :class:`NamedNodeMap` of attribute objects. Only elements have actual values |
|
277 for this; others provide ``None`` for this attribute. This is a read-only |
|
278 attribute. |
|
279 |
|
280 |
|
281 .. attribute:: Node.previousSibling |
|
282 |
|
283 The node that immediately precedes this one with the same parent. For |
|
284 instance the element with an end-tag that comes just before the *self* |
|
285 element's start-tag. Of course, XML documents are made up of more than just |
|
286 elements so the previous sibling could be text, a comment, or something else. |
|
287 If this node is the first child of the parent, this attribute will be |
|
288 ``None``. This is a read-only attribute. |
|
289 |
|
290 |
|
291 .. attribute:: Node.nextSibling |
|
292 |
|
293 The node that immediately follows this one with the same parent. See also |
|
294 :attr:`previousSibling`. If this is the last child of the parent, this |
|
295 attribute will be ``None``. This is a read-only attribute. |
|
296 |
|
297 |
|
298 .. attribute:: Node.childNodes |
|
299 |
|
300 A list of nodes contained within this node. This is a read-only attribute. |
|
301 |
|
302 |
|
303 .. attribute:: Node.firstChild |
|
304 |
|
305 The first child of the node, if there are any, or ``None``. This is a read-only |
|
306 attribute. |
|
307 |
|
308 |
|
309 .. attribute:: Node.lastChild |
|
310 |
|
311 The last child of the node, if there are any, or ``None``. This is a read-only |
|
312 attribute. |
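
The navigation attributes described so far can be exercised on a small tree. This is an illustrative sketch using :mod:`xml.dom.minidom`; the element names are arbitrary.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a/><b/><c/></root>")
root = doc.documentElement
b = root.childNodes[1]

# Siblings share the same parent; the ends of the child list give None.
print(b.previousSibling.tagName, b.nextSibling.tagName)
print(root.firstChild.tagName, root.lastChild.tagName)
print(root.firstChild.previousSibling, root.lastChild.nextSibling)

# The root element's parent is the Document, which itself has no parent.
print(root.parentNode is doc, doc.parentNode)
```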
|
313 |
|
314 |
|
315 .. attribute:: Node.localName |
|
316 |
|
317 The part of the :attr:`tagName` following the colon if there is one, else the |
|
318 entire :attr:`tagName`. The value is a string. |
|
319 |
|
320 |
|
321 .. attribute:: Node.prefix |
|
322 |
|
   The part of the :attr:`tagName` preceding the colon if there is one, else the
   empty string.  The value is a string, or ``None``.
|
325 |
|
326 |
|
327 .. attribute:: Node.namespaceURI |
|
328 |
|
329 The namespace associated with the element name. This will be a string or |
|
330 ``None``. This is a read-only attribute. |
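
For a namespaced element, the three attributes above relate to the tag name as sketched here. The example assumes the namespace-aware parser used by :mod:`xml.dom.minidom`; the URI ``urn:example:p`` is a made-up value.

```python
from xml.dom import minidom

doc = minidom.parseString('<p:root xmlns:p="urn:example:p"><plain/></p:root>')
root = doc.documentElement
plain = root.firstChild

print(root.tagName)        # the qualified name, prefix included
print(root.prefix)         # the part before the colon
print(root.localName)      # the part after the colon
print(root.namespaceURI)   # the URI bound to the prefix
print(plain.namespaceURI)  # no namespace applies to this element
```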
|
331 |
|
332 |
|
333 .. attribute:: Node.nodeName |
|
334 |
|
335 This has a different meaning for each node type; see the DOM specification for |
|
336 details. You can always get the information you would get here from another |
|
337 property such as the :attr:`tagName` property for elements or the :attr:`name` |
|
338 property for attributes. For all node types, the value of this attribute will be |
|
339 either a string or ``None``. This is a read-only attribute. |
|
340 |
|
341 |
|
342 .. attribute:: Node.nodeValue |
|
343 |
|
344 This has a different meaning for each node type; see the DOM specification for |
|
345 details. The situation is similar to that with :attr:`nodeName`. The value is |
|
346 a string or ``None``. |
|
347 |
|
348 |
|
349 .. method:: Node.hasAttributes() |
|
350 |
|
351 Returns true if the node has any attributes. |
|
352 |
|
353 |
|
354 .. method:: Node.hasChildNodes() |
|
355 |
|
356 Returns true if the node has any child nodes. |
|
357 |
|
358 |
|
359 .. method:: Node.isSameNode(other) |
|
360 |
|
361 Returns true if *other* refers to the same node as this node. This is especially |
|
362 useful for DOM implementations which use any sort of proxy architecture (because |
|
363 more than one object can refer to the same node). |
|
364 |
|
365 .. note:: |
|
366 |
|
367 This is based on a proposed DOM Level 3 API which is still in the "working |
|
368 draft" stage, but this particular interface appears uncontroversial. Changes |
|
369 from the W3C will not necessarily affect this method in the Python DOM interface |
|
370 (though any new W3C API for this would also be supported). |
|
371 |
|
372 |
|
373 .. method:: Node.appendChild(newChild) |
|
374 |
|
   Add a new child node to this node at the end of the list of
   children, returning *newChild*.  If the node was already in
   the tree, it is removed first.
|
378 |
|
379 |
|
380 .. method:: Node.insertBefore(newChild, refChild) |
|
381 |
|
382 Insert a new child node before an existing child. It must be the case that |
|
383 *refChild* is a child of this node; if not, :exc:`ValueError` is raised. |
|
384 *newChild* is returned. If *refChild* is ``None``, it inserts *newChild* at the |
|
385 end of the children's list. |
|
386 |
|
387 |
|
388 .. method:: Node.removeChild(oldChild) |
|
389 |
|
390 Remove a child node. *oldChild* must be a child of this node; if not, |
|
391 :exc:`ValueError` is raised. *oldChild* is returned on success. If *oldChild* |
|
392 will not be used further, its :meth:`unlink` method should be called. |
|
393 |
|
394 |
|
395 .. method:: Node.replaceChild(newChild, oldChild) |
|
396 |
|
397 Replace an existing node with a new node. It must be the case that *oldChild* |
|
398 is a child of this node; if not, :exc:`ValueError` is raised. |
|
399 |
|
400 |
|
401 .. method:: Node.normalize() |
|
402 |
|
403 Join adjacent text nodes so that all stretches of text are stored as single |
|
404 :class:`Text` instances. This simplifies processing text from a DOM tree for |
|
405 many applications. |
|
406 |
|
407 .. versionadded:: 2.1 |
|
408 |
|
409 |
|
410 .. method:: Node.cloneNode(deep) |
|
411 |
|
412 Clone this node. Setting *deep* means to clone all child nodes as well. This |
|
413 returns the clone. |
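
A small sketch of :meth:`normalize` and :meth:`cloneNode`, assuming :mod:`xml.dom.minidom`: two adjacent text nodes are merged, then the element is cloned with and without its children.

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

# Appending two text nodes leaves them as separate, adjacent children.
root.appendChild(doc.createTextNode("Hello, "))
root.appendChild(doc.createTextNode("world"))
count_before = len(root.childNodes)

# normalize() joins the adjacent text nodes into one.
root.normalize()
count_after = len(root.childNodes)

shallow = root.cloneNode(False)  # the element only
deep = root.cloneNode(True)      # the element and its descendants

print(count_before, count_after, root.firstChild.data)
```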
|
414 |
|
415 |
|
416 .. _dom-nodelist-objects: |
|
417 |
|
418 NodeList Objects |
|
419 ^^^^^^^^^^^^^^^^ |
|
420 |
|
A :class:`NodeList` represents a sequence of nodes.  These objects are used in
two ways in the DOM Core recommendation: an :class:`Element` object provides
one as its list of child nodes, and the :meth:`getElementsByTagName` and
:meth:`getElementsByTagNameNS` methods of :class:`Node` return objects with this
interface to represent query results.
|
426 |
|
427 The DOM Level 2 recommendation defines one method and one attribute for these |
|
428 objects: |
|
429 |
|
430 |
|
431 .. method:: NodeList.item(i) |
|
432 |
|
   Return the *i*'th item from the sequence, if there is one, or ``None``.  The
   index *i* is not allowed to be less than zero or greater than or equal to the
   length of the sequence.
|
436 |
|
437 |
|
438 .. attribute:: NodeList.length |
|
439 |
|
440 The number of nodes in the sequence. |
|
441 |
|
442 In addition, the Python DOM interface requires that some additional support is |
|
443 provided to allow :class:`NodeList` objects to be used as Python sequences. All |
|
444 :class:`NodeList` implementations must include support for :meth:`__len__` and |
|
445 :meth:`__getitem__`; this allows iteration over the :class:`NodeList` in |
|
446 :keyword:`for` statements and proper support for the :func:`len` built-in |
|
447 function. |
|
448 |
|
449 If a DOM implementation supports modification of the document, the |
|
450 :class:`NodeList` implementation must also support the :meth:`__setitem__` and |
|
451 :meth:`__delitem__` methods. |
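
Both the DOM-style and the Python-style access paths are shown in this sketch, which assumes the :class:`NodeList` returned by :mod:`xml.dom.minidom`.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><item>one</item><item>two</item></list>")
items = doc.getElementsByTagName("item")

# DOM-style access: the length attribute and the item() method.
first = items.item(0)
missing = items.item(5)  # out of range, so None rather than an exception

# Python-style access: len(), indexing and iteration.
texts = [item.firstChild.data for item in items]

print(items.length, len(items), texts)
```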
|
452 |
|
453 |
|
454 .. _dom-documenttype-objects: |
|
455 |
|
456 DocumentType Objects |
|
457 ^^^^^^^^^^^^^^^^^^^^ |
|
458 |
|
459 Information about the notations and entities declared by a document (including |
|
460 the external subset if the parser uses it and can provide the information) is |
|
461 available from a :class:`DocumentType` object. The :class:`DocumentType` for a |
|
462 document is available from the :class:`Document` object's :attr:`doctype` |
|
463 attribute; if there is no ``DOCTYPE`` declaration for the document, the |
|
464 document's :attr:`doctype` attribute will be set to ``None`` instead of an |
|
465 instance of this interface. |
|
466 |
|
467 :class:`DocumentType` is a specialization of :class:`Node`, and adds the |
|
468 following attributes: |
|
469 |
|
470 |
|
471 .. attribute:: DocumentType.publicId |
|
472 |
|
473 The public identifier for the external subset of the document type definition. |
|
474 This will be a string or ``None``. |
|
475 |
|
476 |
|
477 .. attribute:: DocumentType.systemId |
|
478 |
|
479 The system identifier for the external subset of the document type definition. |
|
480 This will be a URI as a string, or ``None``. |
|
481 |
|
482 |
|
483 .. attribute:: DocumentType.internalSubset |
|
484 |
|
485 A string giving the complete internal subset from the document. This does not |
|
486 include the brackets which enclose the subset. If the document has no internal |
|
487 subset, this should be ``None``. |
|
488 |
|
489 |
|
490 .. attribute:: DocumentType.name |
|
491 |
|
492 The name of the root element as given in the ``DOCTYPE`` declaration, if |
|
493 present. |
|
494 |
|
495 |
|
496 .. attribute:: DocumentType.entities |
|
497 |
|
498 This is a :class:`NamedNodeMap` giving the definitions of external entities. |
|
499 For entity names defined more than once, only the first definition is provided |
|
500 (others are ignored as required by the XML recommendation). This may be |
|
501 ``None`` if the information is not provided by the parser, or if no entities are |
|
502 defined. |
|
503 |
|
504 |
|
505 .. attribute:: DocumentType.notations |
|
506 |
|
507 This is a :class:`NamedNodeMap` giving the definitions of notations. For |
|
508 notation names defined more than once, only the first definition is provided |
|
509 (others are ignored as required by the XML recommendation). This may be |
|
510 ``None`` if the information is not provided by the parser, or if no notations |
|
511 are defined. |
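
The identifier attributes can be read back from a parsed document as sketched below, assuming :mod:`xml.dom.minidom`. The public and system identifiers are made-up values, and the external subset is not loaded.

```python
from xml.dom import minidom

# Hypothetical document; the identifiers are invented for illustration.
source = (
    '<!DOCTYPE root PUBLIC "-//Example//DTD Example 1.0//EN" "example.dtd">'
    "<root/>"
)
doc = minidom.parseString(source)
doctype = doc.doctype

print(doctype.name)
print(doctype.publicId)
print(doctype.systemId)

# A document without a DOCTYPE declaration has no DocumentType object.
print(minidom.parseString("<root/>").doctype)
```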
|
512 |
|
513 |
|
514 .. _dom-document-objects: |
|
515 |
|
516 Document Objects |
|
517 ^^^^^^^^^^^^^^^^ |
|
518 |
|
519 A :class:`Document` represents an entire XML document, including its constituent |
|
520 elements, attributes, processing instructions, comments etc. Remember that it |
|
521 inherits properties from :class:`Node`. |
|
522 |
|
523 |
|
524 .. attribute:: Document.documentElement |
|
525 |
|
526 The one and only root element of the document. |
|
527 |
|
528 |
|
529 .. method:: Document.createElement(tagName) |
|
530 |
|
531 Create and return a new element node. The element is not inserted into the |
|
532 document when it is created. You need to explicitly insert it with one of the |
|
533 other methods such as :meth:`insertBefore` or :meth:`appendChild`. |
|
534 |
|
535 |
|
536 .. method:: Document.createElementNS(namespaceURI, tagName) |
|
537 |
|
538 Create and return a new element with a namespace. The *tagName* may have a |
|
539 prefix. The element is not inserted into the document when it is created. You |
|
540 need to explicitly insert it with one of the other methods such as |
|
541 :meth:`insertBefore` or :meth:`appendChild`. |
|
542 |
|
543 |
|
544 .. method:: Document.createTextNode(data) |
|
545 |
|
546 Create and return a text node containing the data passed as a parameter. As |
|
547 with the other creation methods, this one does not insert the node into the |
|
548 tree. |
|
549 |
|
550 |
|
551 .. method:: Document.createComment(data) |
|
552 |
|
553 Create and return a comment node containing the data passed as a parameter. As |
|
554 with the other creation methods, this one does not insert the node into the |
|
555 tree. |
|
556 |
|
557 |
|
558 .. method:: Document.createProcessingInstruction(target, data) |
|
559 |
|
560 Create and return a processing instruction node containing the *target* and |
|
561 *data* passed as parameters. As with the other creation methods, this one does |
|
562 not insert the node into the tree. |
|
563 |
|
564 |
|
565 .. method:: Document.createAttribute(name) |
|
566 |
|
567 Create and return an attribute node. This method does not associate the |
|
568 attribute node with any particular element. You must use |
|
569 :meth:`setAttributeNode` on the appropriate :class:`Element` object to use the |
|
570 newly created attribute instance. |
|
571 |
|
572 |
|
573 .. method:: Document.createAttributeNS(namespaceURI, qualifiedName) |
|
574 |
|
   Create and return an attribute node with a namespace.  The *qualifiedName* may
   have a prefix.  This method does not associate the attribute node with any
   particular element.  You must use :meth:`setAttributeNode` on the appropriate
   :class:`Element` object to use the newly created attribute instance.
|
579 |
|
580 |
|
581 .. method:: Document.getElementsByTagName(tagName) |
|
582 |
|
583 Search for all descendants (direct children, children's children, etc.) with a |
|
584 particular element type name. |
|
585 |
|
586 |
|
587 .. method:: Document.getElementsByTagNameNS(namespaceURI, localName) |
|
588 |
|
   Search for all descendants (direct children, children's children, etc.) with a
   particular namespace URI and local name.  The local name is the part of the
   qualified name after the prefix.
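
The creation methods return detached nodes, so a typical sequence creates a node and then inserts it. The following sketch assumes :mod:`xml.dom.minidom`; the element names and the processing instruction data are illustrative only.

```python
from xml.dom import Node, getDOMImplementation

impl = getDOMImplementation("minidom")
doc = impl.createDocument(None, "inventory", None)
root = doc.documentElement

# Each create*() call returns a node that is not yet part of the tree.
item = doc.createElement("item")
item.appendChild(doc.createTextNode("widget"))
root.appendChild(item)
root.appendChild(doc.createComment(" end of list "))

# A processing instruction placed before the root element.
pi = doc.createProcessingInstruction("xml-stylesheet", 'href="style.css"')
doc.insertBefore(pi, root)

found = doc.getElementsByTagName("item")
print(found.length, found[0].firstChild.data)
```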
|
592 |
|
593 |
|
594 .. _dom-element-objects: |
|
595 |
|
596 Element Objects |
|
597 ^^^^^^^^^^^^^^^ |
|
598 |
|
599 :class:`Element` is a subclass of :class:`Node`, so inherits all the attributes |
|
600 of that class. |
|
601 |
|
602 |
|
603 .. attribute:: Element.tagName |
|
604 |
|
605 The element type name. In a namespace-using document it may have colons in it. |
|
606 The value is a string. |
|
607 |
|
608 |
|
609 .. method:: Element.getElementsByTagName(tagName) |
|
610 |
|
611 Same as equivalent method in the :class:`Document` class. |
|
612 |
|
613 |
|
.. method:: Element.getElementsByTagNameNS(namespaceURI, localName)

   Same as the equivalent method in the :class:`Document` class.
|
617 |
|
618 |
|
619 .. method:: Element.hasAttribute(name) |
|
620 |
|
621 Returns true if the element has an attribute named by *name*. |
|
622 |
|
623 |
|
624 .. method:: Element.hasAttributeNS(namespaceURI, localName) |
|
625 |
|
626 Returns true if the element has an attribute named by *namespaceURI* and |
|
627 *localName*. |
|
628 |
|
629 |
|
630 .. method:: Element.getAttribute(name) |
|
631 |
|
632 Return the value of the attribute named by *name* as a string. If no such |
|
633 attribute exists, an empty string is returned, as if the attribute had no value. |
|
634 |
|
635 |
|
636 .. method:: Element.getAttributeNode(attrname) |
|
637 |
|
638 Return the :class:`Attr` node for the attribute named by *attrname*. |
|
639 |
|
640 |
|
641 .. method:: Element.getAttributeNS(namespaceURI, localName) |
|
642 |
|
643 Return the value of the attribute named by *namespaceURI* and *localName* as a |
|
644 string. If no such attribute exists, an empty string is returned, as if the |
|
645 attribute had no value. |
|
646 |
|
647 |
|
648 .. method:: Element.getAttributeNodeNS(namespaceURI, localName) |
|
649 |
|
650 Return an attribute value as a node, given a *namespaceURI* and *localName*. |
|
651 |
|
652 |
|
653 .. method:: Element.removeAttribute(name) |
|
654 |
|
655 Remove an attribute by name. If there is no matching attribute, a |
|
656 :exc:`NotFoundErr` is raised. |
|
657 |
|
658 |
|
659 .. method:: Element.removeAttributeNode(oldAttr) |
|
660 |
|
661 Remove and return *oldAttr* from the attribute list, if present. If *oldAttr* is |
|
662 not present, :exc:`NotFoundErr` is raised. |
|
663 |
|
664 |
|
665 .. method:: Element.removeAttributeNS(namespaceURI, localName) |
|
666 |
|
667 Remove an attribute by name. Note that it uses a localName, not a qname. No |
|
668 exception is raised if there is no matching attribute. |
|
669 |
|
670 |
|
671 .. method:: Element.setAttribute(name, value) |
|
672 |
|
673 Set an attribute value from a string. |
|
674 |
|
675 |
|
676 .. method:: Element.setAttributeNode(newAttr) |
|
677 |
|
   Add a new attribute node to the element, replacing an existing attribute
   whose :attr:`name` attribute matches.  If a replacement occurs, the
   old attribute node will be returned.  If *newAttr* is already in use,
   :exc:`InuseAttributeErr` will be raised.
|
682 |
|
683 |
|
684 .. method:: Element.setAttributeNodeNS(newAttr) |
|
685 |
|
   Add a new attribute node to the element, replacing an existing attribute
   whose :attr:`namespaceURI` and :attr:`localName` attributes match.
   If a replacement occurs, the old attribute node will be returned.  If *newAttr*
   is already in use, :exc:`InuseAttributeErr` will be raised.
|
690 |
|
691 |
|
692 .. method:: Element.setAttributeNS(namespaceURI, qname, value) |
|
693 |
|
   Set an attribute value from a string, given a *namespaceURI* and a *qname*.
   Note that a qname is the whole attribute name, including any prefix.  This
   differs from :meth:`removeAttributeNS`, which takes a local name.
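
The string-based attribute methods can be tried out as follows. This sketch assumes :mod:`xml.dom.minidom`; the namespace URI ``urn:example`` and the attribute names are made up.

```python
from xml.dom import minidom

doc = minidom.parseString("<root/>")
root = doc.documentElement

# Plain attributes are addressed by name.
root.setAttribute("id", "r1")
print(root.hasAttribute("id"), root.getAttribute("id"))
print(repr(root.getAttribute("missing")))  # absent attributes read as ""

# Namespaced attributes are set with a qualified name ("ex:lang") but
# looked up and removed with the namespace URI and local name ("lang").
root.setAttributeNS("urn:example", "ex:lang", "en")
lang = root.getAttributeNS("urn:example", "lang")
print(lang)

root.removeAttribute("id")
root.removeAttributeNS("urn:example", "lang")
print(root.hasAttribute("id"), root.hasAttributeNS("urn:example", "lang"))
```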
|
696 |
|
697 |
|
698 .. _dom-attr-objects: |
|
699 |
|
700 Attr Objects |
|
701 ^^^^^^^^^^^^ |
|
702 |
|
703 :class:`Attr` inherits from :class:`Node`, so inherits all its attributes. |
|
704 |
|
705 |
|
706 .. attribute:: Attr.name |
|
707 |
|
708 The attribute name. In a namespace-using document it may have colons in it. |
|
709 |
|
710 |
|
711 .. attribute:: Attr.localName |
|
712 |
|
713 The part of the name following the colon if there is one, else the entire name. |
|
714 This is a read-only attribute. |
|
715 |
|
716 |
|
717 .. attribute:: Attr.prefix |
|
718 |
|
719 The part of the name preceding the colon if there is one, else the empty string. |
|
720 |
|
721 |
|
722 .. _dom-attributelist-objects: |
|
723 |
|
724 NamedNodeMap Objects |
|
725 ^^^^^^^^^^^^^^^^^^^^ |
|
726 |
|
727 :class:`NamedNodeMap` does *not* inherit from :class:`Node`. |
|
728 |
|
729 |
|
730 .. attribute:: NamedNodeMap.length |
|
731 |
|
732 The length of the attribute list. |
|
733 |
|
734 |
|
735 .. method:: NamedNodeMap.item(index) |
|
736 |
|
737 Return an attribute with a particular index. The order you get the attributes |
|
738 in is arbitrary but will be consistent for the life of a DOM. Each item is an |
|
739 attribute node. Get its value with the :attr:`value` attribute. |
|
740 |
|
741 There are also experimental methods that give this class more mapping behavior. |
|
742 You can use them or you can use the standardized :meth:`getAttribute\*` family |
|
743 of methods on the :class:`Element` objects. |
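
A short sketch of walking a :class:`NamedNodeMap` using only the standardized
:attr:`length` and :meth:`item` members, assuming :mod:`xml.dom.minidom`. The
sample document is invented, and the result is sorted because the order of the
items is arbitrary::

   from xml.dom.minidom import parseString

   doc = parseString('<point x="1" y="2"/>')
   attrs = doc.documentElement.attributes

   pairs = []
   for index in range(attrs.length):
       node = attrs.item(index)          # each item is an attribute node
       pairs.append((node.name, node.value))
   pairs.sort()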
|
744 |
|
745 |
|
746 .. _dom-comment-objects: |
|
747 |
|
748 Comment Objects |
|
749 ^^^^^^^^^^^^^^^ |
|
750 |
|
751 :class:`Comment` represents a comment in the XML document. It is a subclass of |
|
752 :class:`Node`, but cannot have child nodes. |
|
753 |
|
754 |
|
755 .. attribute:: Comment.data |
|
756 |
|
757 The content of the comment as a string. The attribute contains all characters |
|
758 between the leading ``<!-``\ ``-`` and trailing ``-``\ ``->``, but does not |
|
759 include them. |
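
For instance, a sketch assuming :mod:`xml.dom.minidom` and an invented sample
document; note that the surrounding whitespace is part of :attr:`data`, while
the comment delimiters are not::

   from xml.dom import Node
   from xml.dom.minidom import parseString

   doc = parseString("<root><!-- a note --></root>")
   comment = doc.documentElement.firstChild

   is_comment = comment.nodeType == Node.COMMENT_NODE
   text = comment.data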
|
760 |
|
761 |
|
762 .. _dom-text-objects: |
|
763 |
|
764 Text and CDATASection Objects |
|
765 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
766 |
|
767 The :class:`Text` interface represents text in the XML document. If the parser |
|
768 and DOM implementation support the DOM's XML extension, portions of the text |
|
769 enclosed in CDATA marked sections are stored in :class:`CDATASection` objects. |
|
770 These two interfaces are identical, but provide different values for the |
|
771 :attr:`nodeType` attribute. |
|
772 |
|
773 These interfaces extend the :class:`Node` interface. They cannot have child |
|
774 nodes. |
|
775 |
|
776 |
|
777 .. attribute:: Text.data |
|
778 |
|
779 The content of the text node as a string. |
|
780 |
|
781 .. note:: |
|
782 |
|
783 The use of a :class:`CDATASection` node does not indicate that the node |
|
784 represents a complete CDATA marked section, only that the content of the node |
|
785 was part of a CDATA section. A single CDATA section may be represented by more |
|
786 than one node in the document tree. There is no way to determine whether two |
|
787 adjacent :class:`CDATASection` nodes represent different CDATA marked sections. |
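
The following sketch, which assumes :mod:`xml.dom.minidom` and an invented
sample document, separates ordinary text from CDATA content by checking
:attr:`nodeType`. Because a single CDATA section may be split across several
nodes, the CDATA content is joined rather than read from one node::

   from xml.dom import Node
   from xml.dom.minidom import parseString

   doc = parseString("<root>plain<![CDATA[<raw>]]></root>")
   children = doc.documentElement.childNodes

   plain = "".join(node.data for node in children
                   if node.nodeType == Node.TEXT_NODE)
   cdata = "".join(node.data for node in children
                   if node.nodeType == Node.CDATA_SECTION_NODE)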
|
788 |
|
789 |
|
790 .. _dom-pi-objects: |
|
791 |
|
792 ProcessingInstruction Objects |
|
793 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
794 |
|
795 Represents a processing instruction in the XML document; this inherits from the |
|
796 :class:`Node` interface and cannot have child nodes. |
|
797 |
|
798 |
|
799 .. attribute:: ProcessingInstruction.target |
|
800 |
|
801 The content of the processing instruction up to the first whitespace character. |
|
802 This is a read-only attribute. |
|
803 |
|
804 |
|
805 .. attribute:: ProcessingInstruction.data |
|
806 |
|
807 The content of the processing instruction following the first whitespace |
|
808 character. |
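
A brief sketch of the split between :attr:`target` and :attr:`data`, assuming
:mod:`xml.dom.minidom`. The stylesheet instruction is only sample input::

   from xml.dom import Node
   from xml.dom.minidom import parseString

   doc = parseString(
       '<?xml-stylesheet href="style.css" type="text/css"?><root/>')
   pi = doc.firstChild

   is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE
   target = pi.target   # up to the first whitespace character
   data = pi.data       # everything after it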
|
809 |
|
810 |
|
811 .. _dom-exceptions: |
|
812 |
|
813 Exceptions |
|
814 ^^^^^^^^^^ |
|
815 |
|
816 .. versionadded:: 2.1 |
|
817 |
|
818 The DOM Level 2 recommendation defines a single exception, :exc:`DOMException`, |
|
819 and a number of constants that allow applications to determine what sort of |
|
820 error occurred. :exc:`DOMException` instances carry a :attr:`code` attribute |
|
821 that provides the appropriate value for the specific exception. |
|
822 |
|
823 The Python DOM interface provides the constants, but also expands the set of |
|
824 exceptions so that a specific exception exists for each of the exception codes |
|
825 defined by the DOM. The implementations must raise the appropriate specific |
|
826 exception, each of which carries the appropriate value for the :attr:`code` |
|
827 attribute. |
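
Client code can therefore catch either the base class or a specific exception.
A minimal sketch, assuming :mod:`xml.dom.minidom` and invented element names::

   import xml.dom
   from xml.dom.minidom import getDOMImplementation

   doc = getDOMImplementation().createDocument(None, "root", None)
   stray = doc.createElement("stray")   # never added to the tree

   caught = None
   try:
       doc.documentElement.removeChild(stray)
   except xml.dom.DOMException as exc:
       # Catching the base class works; the code identifies the error.
       caught = exc

   code = caught.code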
|
828 |
|
829 |
|
830 .. exception:: DOMException |
|
831 |
|
832 Base exception class used for all specific DOM exceptions. This exception class |
|
833 cannot be directly instantiated. |
|
834 |
|
835 |
|
836 .. exception:: DomstringSizeErr |
|
837 |
|
838 Raised when a specified range of text does not fit into a string. This is not |
|
839 known to be used in the Python DOM implementations, but may be received from DOM |
|
840 implementations not written in Python. |
|
841 |
|
842 |
|
843 .. exception:: HierarchyRequestErr |
|
844 |
|
845 Raised when an attempt is made to insert a node where the node type is not |
|
846 allowed. |
|
847 |
|
848 |
|
849 .. exception:: IndexSizeErr |
|
850 |
|
851 Raised when an index or size parameter to a method is negative or exceeds the |
|
852 allowed values. |
|
853 |
|
854 |
|
855 .. exception:: InuseAttributeErr |
|
856 |
|
857 Raised when an attempt is made to insert an :class:`Attr` node that is already |
|
858 present elsewhere in the document. |
|
859 |
|
860 |
|
861 .. exception:: InvalidAccessErr |
|
862 |
|
863 Raised if a parameter or an operation is not supported on the underlying object. |
|
864 |
|
865 |
|
866 .. exception:: InvalidCharacterErr |
|
867 |
|
868 Raised when a string parameter contains a character that the XML 1.0 |
|
869 recommendation does not permit in the context in which it is used. |
|
870 For example, attempting to create an :class:`Element` node with a space in the |
|
871 element type name will cause this error to be raised. |
|
872 |
|
873 |
|
874 .. exception:: InvalidModificationErr |
|
875 |
|
876 Raised when an attempt is made to modify the type of a node. |
|
877 |
|
878 |
|
879 .. exception:: InvalidStateErr |
|
880 |
|
881 Raised when an attempt is made to use an object that is not defined or is no |
|
882 longer usable. |
|
883 |
|
884 |
|
885 .. exception:: NamespaceErr |
|
886 |
|
887 Raised when an attempt is made to change an object in a way that is not |
|
888 permitted by the `Namespaces in XML <http://www.w3.org/TR/REC-xml-names/>`_ |
|
889 recommendation. |
|
890 |
|
891 |
|
892 .. exception:: NotFoundErr |
|
893 |
|
894 Raised when a node does not exist in the referenced context. For example, |
|
895 :meth:`NamedNodeMap.removeNamedItem` will raise this if the node passed in does |
|
896 not exist in the map. |
|
897 |
|
898 |
|
899 .. exception:: NotSupportedErr |
|
900 |
|
901 Raised when the implementation does not support the requested type of object or |
|
902 operation. |
|
903 |
|
904 |
|
905 .. exception:: NoDataAllowedErr |
|
906 |
|
907 Raised if data is specified for a node that does not support data. |
|
908 |
|
909 .. XXX a better explanation is needed! |
|
910 |
|
911 |
|
912 .. exception:: NoModificationAllowedErr |
|
913 |
|
914 Raised on attempts to modify an object where modifications are not allowed (such |
|
915 as for read-only nodes). |
|
916 |
|
917 |
|
918 .. exception:: SyntaxErr |
|
919 |
|
920 Raised when an invalid or illegal string is specified. |
|
921 |
|
922 .. XXX how is this different from InvalidCharacterErr? |
|
923 |
|
924 |
|
925 .. exception:: WrongDocumentErr |
|
926 |
|
927 Raised when a node is inserted in a different document than it currently belongs |
|
928 to, and the implementation does not support migrating the node from one document |
|
929 to the other. |
|
930 |
|
931 The exception codes defined in the DOM recommendation map to the exceptions |
|
932 described above according to this table: |
|
933 |
|
934 +--------------------------------------+---------------------------------+ |
|
935 | Constant | Exception | |
|
936 +======================================+=================================+ |
|
937 | :const:`DOMSTRING_SIZE_ERR` | :exc:`DomstringSizeErr` | |
|
938 +--------------------------------------+---------------------------------+ |
|
939 | :const:`HIERARCHY_REQUEST_ERR` | :exc:`HierarchyRequestErr` | |
|
940 +--------------------------------------+---------------------------------+ |
|
941 | :const:`INDEX_SIZE_ERR` | :exc:`IndexSizeErr` | |
|
942 +--------------------------------------+---------------------------------+ |
|
943 | :const:`INUSE_ATTRIBUTE_ERR` | :exc:`InuseAttributeErr` | |
|
944 +--------------------------------------+---------------------------------+ |
|
945 | :const:`INVALID_ACCESS_ERR` | :exc:`InvalidAccessErr` | |
|
946 +--------------------------------------+---------------------------------+ |
|
947 | :const:`INVALID_CHARACTER_ERR` | :exc:`InvalidCharacterErr` | |
|
948 +--------------------------------------+---------------------------------+ |
|
949 | :const:`INVALID_MODIFICATION_ERR` | :exc:`InvalidModificationErr` | |
|
950 +--------------------------------------+---------------------------------+ |
|
951 | :const:`INVALID_STATE_ERR` | :exc:`InvalidStateErr` | |
|
952 +--------------------------------------+---------------------------------+ |
|
953 | :const:`NAMESPACE_ERR` | :exc:`NamespaceErr` | |
|
954 +--------------------------------------+---------------------------------+ |
|
955 | :const:`NOT_FOUND_ERR` | :exc:`NotFoundErr` | |
|
956 +--------------------------------------+---------------------------------+ |
|
957 | :const:`NOT_SUPPORTED_ERR` | :exc:`NotSupportedErr` | |
|
958 +--------------------------------------+---------------------------------+ |
|
959 | :const:`NO_DATA_ALLOWED_ERR` | :exc:`NoDataAllowedErr` | |
|
960 +--------------------------------------+---------------------------------+ |
|
961 | :const:`NO_MODIFICATION_ALLOWED_ERR` | :exc:`NoModificationAllowedErr` | |
|
962 +--------------------------------------+---------------------------------+ |
|
963 | :const:`SYNTAX_ERR` | :exc:`SyntaxErr` | |
|
964 +--------------------------------------+---------------------------------+ |
|
965 | :const:`WRONG_DOCUMENT_ERR` | :exc:`WrongDocumentErr` | |
|
966 +--------------------------------------+---------------------------------+ |
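
The correspondence in the table can be spot-checked from Python, since each
exception class in :mod:`xml.dom` carries its constant as the :attr:`code`
attribute. The sketch below checks a few rows only::

   import xml.dom

   # A few rows of the table, as (constant, exception class) pairs.
   rows = [
       (xml.dom.INDEX_SIZE_ERR, xml.dom.IndexSizeErr),
       (xml.dom.HIERARCHY_REQUEST_ERR, xml.dom.HierarchyRequestErr),
       (xml.dom.NOT_FOUND_ERR, xml.dom.NotFoundErr),
       (xml.dom.WRONG_DOCUMENT_ERR, xml.dom.WrongDocumentErr),
   ]

   mismatches = [cls.__name__ for const, cls in rows if cls.code != const]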
|
967 |
|
968 |
|
969 .. _dom-conformance: |
|
970 |
|
971 Conformance |
|
972 ----------- |
|
973 |
|
974 This section describes the conformance requirements and relationships between |
|
975 the Python DOM API, the W3C DOM recommendations, and the OMG IDL mapping for |
|
976 Python. |
|
977 |
|
978 |
|
979 .. _dom-type-mapping: |
|
980 |
|
981 Type Mapping |
|
982 ^^^^^^^^^^^^ |
|
983 |
|
984 The primitive IDL types used in the DOM specification are mapped to Python types |
|
985 according to the following table. |
|
986 |
|
987 +------------------+-------------------------------------------+ |
|
988 | IDL Type | Python Type | |
|
989 +==================+===========================================+ |
|
990 | ``boolean`` | ``IntegerType`` (with a value of ``0`` or | |
|
991 | | ``1``) | |
|
992 +------------------+-------------------------------------------+ |
|
993 | ``int`` | ``IntegerType`` | |
|
994 +------------------+-------------------------------------------+ |
|
995 | ``long int`` | ``IntegerType`` | |
|
996 +------------------+-------------------------------------------+ |
|
997 | ``unsigned int`` | ``IntegerType`` | |
|
998 +------------------+-------------------------------------------+ |
|
999 |
|
1000 Additionally, the :class:`DOMString` defined in the recommendation is mapped to |
|
1001 a Python string or Unicode string. Applications should be able to handle |
|
1002 Unicode whenever a string is returned from the DOM. |
|
1003 |
|
1004 The IDL ``null`` value is mapped to ``None``, which may be accepted or |
|
1005 provided by the implementation whenever ``null`` is allowed by the API. |
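
A small sketch of what these mappings look like in practice, assuming
:mod:`xml.dom.minidom` and an invented sample document::

   from xml.dom.minidom import parseString

   doc = parseString('<root id="r1"><child/></root>')
   root = doc.documentElement

   has_children = root.hasChildNodes()        # IDL boolean
   attr_count = root.attributes.length        # IDL integer type
   tag = root.tagName                         # DOMString
   no_parent = doc.parentNode                 # IDL null becomes None
   missing = root.getAttributeNode("absent")  # IDL null becomes None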
|
1006 |
|
1007 |
|
1008 .. _dom-accessor-methods: |
|
1009 |
|
1010 Accessor Methods |
|
1011 ^^^^^^^^^^^^^^^^ |
|
1012 |
|
1013 The mapping from OMG IDL to Python defines accessor functions for IDL |
|
1014 ``attribute`` declarations in much the way the Java mapping does. |
|
1015 Mapping the IDL declarations :: |
|
1016 |
|
1017 readonly attribute string someValue; |
|
1018 attribute string anotherValue; |
|
1019 |
|
1020 yields three accessor functions: a "get" method for :attr:`someValue` |
|
1021 (:meth:`_get_someValue`), and "get" and "set" methods for :attr:`anotherValue` |
|
1022 (:meth:`_get_anotherValue` and :meth:`_set_anotherValue`). The mapping, in |
|
1023 particular, does not require that the IDL attributes are accessible as normal |
|
1024 Python attributes: ``object.someValue`` is *not* required to work, and may |
|
1025 raise an :exc:`AttributeError`. |
|
1026 |
|
1027 The Python DOM API, however, *does* require that normal attribute access work. |
|
1028 This means that the typical surrogates generated by Python IDL compilers are not |
|
1029 likely to work, and wrapper objects may be needed on the client if the DOM |
|
1030 objects are accessed via CORBA. While this does require some additional |
|
1031 consideration for CORBA DOM clients, the implementers with experience using DOM |
|
1032 over CORBA from Python do not consider this a problem. Attributes that are |
|
1033 declared ``readonly`` may not restrict write access in all DOM |
|
1034 implementations. |
|
1035 |
|
1036 In the Python DOM API, accessor functions are not required. If provided, they |
|
1037 should take the form defined by the Python IDL mapping, but these methods are |
|
1038 considered unnecessary since the attributes are accessible directly from Python. |
|
1039 "Set" accessors should never be provided for ``readonly`` attributes. |
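
In client code this means attribute access is the portable form, and accessor
methods should be treated as optional. A sketch, assuming
:mod:`xml.dom.minidom`; the accessor is looked up defensively because the
Python DOM API does not guarantee that it exists::

   from xml.dom.minidom import parseString

   doc = parseString("<root><child/></root>")

   # Normal attribute access is required to work.
   root = doc.documentElement

   # Accessor methods are optional, so fall back to the attribute.
   accessor = getattr(doc, "_get_documentElement", None)
   via_accessor = accessor() if accessor is not None else root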
|
1040 |
|
1041 The IDL definitions do not fully embody the requirements of the W3C DOM API. |
|
1042 For example, they do not capture the notion that certain objects, such as the return value of |
|
1043 :meth:`getElementsByTagName`, are "live". The Python DOM API does not require |
|
1044 implementations to enforce such requirements. |
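
As an illustration, :mod:`xml.dom.minidom` returns a snapshot rather than a
live list. The sketch below, with an invented sample document, shows why
portable code should query again after modifying the tree instead of relying
on an earlier result being updated::

   from xml.dom.minidom import parseString

   doc = parseString("<list><item/></list>")
   root = doc.documentElement

   found = root.getElementsByTagName("item")
   count_before = len(found)

   root.appendChild(doc.createElement("item"))

   # A "live" list would now report 2; minidom's earlier result does not change.
   count_same_object = len(found)
   count_fresh = len(root.getElementsByTagName("item"))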
|
1045 |