<HTML>
<TITLE>Canonical XML</TITLE>
<BODY>
<H1>Canonical XML</H1>
<P>
This document defines a subset of XML called canonical XML.
The intended use of canonical XML is in testing XML processors,
as a representation of the result of parsing an XML document.
<P>
Every well-formed XML document has a unique structurally equivalent
canonical XML document. Two structurally equivalent XML
documents have a byte-for-byte identical canonical XML document.
Canonicalizing an XML document requires only information that an XML
processor is required to make available to an application.
<P>
A canonical XML document conforms to the following grammar:
<PRE>
CanonXML ::= Pi* element Pi*
element  ::= Stag (Datachar | Pi | element)* Etag
Stag     ::= '<' Name Atts '>'
Etag     ::= '</' Name '>'
Pi       ::= '<?' Name ' ' (((Char - S) Char*)? - (Char* '?>' Char*)) '?>'
Atts     ::= (' ' Name '=' '"' Datachar* '"')*
Datachar ::= '&amp;' | '&lt;' | '&gt;' | '&quot;'
           | '&#9;' | '&#10;' | '&#13;'
           | (Char - ('&' | '<' | '>' | '"' | #x9 | #xA | #xD))
Name     ::= (see XML spec)
Char     ::= (see XML spec)
S        ::= (see XML spec)
</PRE>
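<P>
As an illustration of the Datachar production, the following Python
sketch rewrites character data so that each character becomes a single
Datachar token. The function name <CODE>escape_datachars</CODE> and the
table name are hypothetical and are not part of this definition.

```python
# Replacement table read directly off the Datachar production:
# the four markup characters and the three white space characters
# #x9, #xA and #xD are written as references; every other character
# is written as itself.
_DATACHAR_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
}


def escape_datachars(text):
    """Return text rewritten as a sequence of Datachar tokens."""
    return "".join(_DATACHAR_ESCAPES.get(ch, ch) for ch in text)
```

The same escaping would apply to element content and to attribute
values, since Atts is defined in terms of Datachar as well.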
<P>
The attributes of each start-tag are in lexicographical order of their names (in Unicode bit order).
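<P>
A minimal Python sketch of this ordering follows. It assumes that
"Unicode bit order" means comparing attribute names code point by code
point, which is how Python compares strings, and that the sort key is
the attribute name. The function name <CODE>order_attributes</CODE> is
hypothetical.

```python
def order_attributes(attributes):
    """Return (name, value) pairs sorted by attribute name.

    Python compares str values code point by code point, so upper-case
    ASCII letters sort before lower-case ones, and non-ASCII characters
    sort after ASCII.
    """
    return sorted(attributes.items(), key=lambda item: item[0])
```

Under that assumption, the names <CODE>b</CODE>, <CODE>a</CODE> and
<CODE>B</CODE> would be written in the order <CODE>B</CODE>,
<CODE>a</CODE>, <CODE>b</CODE>.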
<P>
A canonical XML document is encoded in UTF-8.
<P>
Ignorable white space is considered significant and is treated equivalently
to data.
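<P>
The rules above can be combined into a small canonicalizer. The
following is a sketch only, built on Python's standard
<CODE>xml.sax</CODE> module rather than on any processor named in this
document. The names <CODE>CanonicalWriter</CODE> and
<CODE>canonicalize</CODE> are hypothetical. It assumes that sorting
attribute names by code point matches the intended order, and it
handles white space that the parser reports as ignorable exactly like
other character data.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Datachar escapes, applied to element content and attribute values.
_ESCAPES = str.maketrans({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;",
    "\t": "&#9;", "\n": "&#10;", "\r": "&#13;",
})


class CanonicalWriter(ContentHandler):
    """Collects the canonical form of the events reported by the parser."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        # Stag: attributes sorted by name, each preceded by one space.
        self.parts.append("<" + name)
        for key in sorted(attrs.getNames()):
            value = attrs.getValue(key).translate(_ESCAPES)
            self.parts.append(' %s="%s"' % (key, value))
        self.parts.append(">")

    def endElement(self, name):
        # Etag: empty elements are also written as a start/end pair.
        self.parts.append("</" + name + ">")

    def characters(self, content):
        self.parts.append(content.translate(_ESCAPES))

    def ignorableWhitespace(self, whitespace):
        # Ignorable white space is treated equivalently to data.
        self.characters(whitespace)

    def processingInstruction(self, target, data):
        # Pi: exactly one space after the name, even if data is empty.
        self.parts.append("<?" + target + " " + data + "?>")


def canonicalize(document):
    """Parse document (bytes) and return its canonical form in UTF-8."""
    handler = CanonicalWriter()
    xml.sax.parseString(document, handler)
    return "".join(handler.parts).encode("utf-8")
```

With this sketch, two inputs that differ only in quoting, attribute
order, empty-element syntax and the use of a CDATA section, for example
<CODE>canonicalize(b"&lt;doc b='2' a='1'&gt;&lt;e/&gt;&lt;/doc&gt;")</CODE> and
<CODE>canonicalize(b'&lt;doc a="1" b="2"&gt;&lt;e&gt;&lt;/e&gt;&lt;/doc&gt;')</CODE>,
would be expected to return the same bytes.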
<P>
<ADDRESS>
<A HREF="mailto:jjc@jclark.com">James Clark</A>
</ADDRESS>
</BODY>
</HTML>