<HTML>
<TITLE>Canonical XML</TITLE>
<BODY>
<H1>Canonical XML</H1>
<P>
This document defines a subset of XML called canonical XML.
The intended use of canonical XML is in testing XML processors,
as a representation of the result of parsing an XML document.
<P>
Every well-formed XML document has a unique structurally equivalent
canonical XML document.  Two structurally equivalent XML
documents have a byte-for-byte identical canonical XML document.
Canonicalizing an XML document requires only information that an XML
processor is required to make available to an application.
<P>
A canonical XML document conforms to the following grammar:
<PRE>
CanonXML ::= Pi* element Pi*
element  ::= Stag (Datachar | Pi | element)* Etag
Stag     ::= '<' Name Atts '>'
Etag     ::= '</' Name '>'
Pi       ::= '<?' Name ' ' (((Char - S) Char*)? - (Char* '?>' Char*)) '?>'
Atts     ::= (' ' Name '=' '"' Datachar* '"')*
Datachar ::= '&amp;' | '&lt;' | '&gt;' | '&quot;'
             | '&#9;' | '&#10;' | '&#13;'
             | (Char - ('&' | '<' | '>' | '"' | #x9 | #xA | #xD))
Name     ::= (see XML spec)
Char     ::= (see XML spec)
S        ::= (see XML spec)
</PRE>
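<P>
As an illustration of the Datachar production, the following Python sketch
escapes a string of character data so that every character is written in one
of the forms the grammar allows. The names ESCAPES and escape_datachar are
hypothetical and not part of this document; the sketch assumes its input
already consists only of characters matching Char.

```python
# Hypothetical helper, not defined by this document: writes character data
# using only the alternatives listed in the Datachar production.
ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "\t": "&#9;",
    "\n": "&#10;",
    "\r": "&#13;",
}


def escape_datachar(text: str) -> str:
    """Replace each of the seven special characters with its fixed
    reference and pass every other character through unchanged."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)
```

The same escaping would apply to both element content and attribute values,
since Atts is defined in terms of Datachar.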
<P>
Attributes are in lexicographical order (in Unicode bit order).
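<P>
A minimal sketch of the attribute ordering rule, under the assumption that
"Unicode bit order" means comparing attribute names by code point. Python
compares strings that way, and a byte-wise comparison of the UTF-8 encodings
gives the same order. The function names start_tag and escape are
hypothetical.

```python
# Assumption: "Unicode bit order" is read as code point order, which is
# how Python's default string comparison works.
_TABLE = str.maketrans({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;",
    "\t": "&#9;", "\n": "&#10;", "\r": "&#13;",
})


def escape(text: str) -> str:
    """Escape character data as required by the Datachar production."""
    return text.translate(_TABLE)


def start_tag(name: str, attrs: dict) -> str:
    """Serialize a start tag (Stag) with attributes sorted by name."""
    parts = ["<", name]
    for key in sorted(attrs):  # code point order
        parts.append(' %s="%s"' % (key, escape(attrs[key])))
    parts.append(">")
    return "".join(parts)
```

With this ordering an uppercase name such as B sorts before a lowercase name
such as a, because its code point is smaller.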
<P>
A canonical XML document is encoded in UTF-8.
<P>
Ignorable white space is considered significant and is treated equivalently
to data.
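<P>
The rules above can be combined into a rough canonicalizer. The sketch below
uses Python's xml.dom.minidom as the XML processor; that choice, the name
canonicalize, and the decision to drop comments and the document type
declaration (neither has a production in the grammar) are assumptions of this
example rather than statements of this document.

```python
from xml.dom import Node, minidom

_TABLE = str.maketrans({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;",
    "\t": "&#9;", "\n": "&#10;", "\r": "&#13;",
})


def canonicalize(xml_bytes: bytes) -> bytes:
    """Return a canonical form of a well-formed document (sketch only)."""
    out = []

    def walk(node):
        if node.nodeType == Node.ELEMENT_NODE:
            out.append("<" + node.tagName)
            # (name, value) pairs, sorted by name in code point order.
            for name, value in sorted(node.attributes.items()):
                out.append(' %s="%s"' % (name, value.translate(_TABLE)))
            out.append(">")
            for child in node.childNodes:
                walk(child)
            # Empty elements are written as a start tag plus an end tag.
            out.append("</" + node.tagName + ">")
        elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            # All text, including white-space-only text, is kept as data.
            out.append(node.data.translate(_TABLE))
        elif node.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            out.append("<?%s %s?>" % (node.target, node.data))
        # Comments and the doctype are skipped (assumption, see above).

    for child in minidom.parseString(xml_bytes).childNodes:
        walk(child)
    return "".join(out).encode("utf-8")
```

Two inputs that differ only in attribute order, CDATA usage, or empty-element
syntax should produce the same bytes under this sketch.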
<P>
<ADDRESS>
<A HREF="mailto:jjc@jclark.com">James Clark</A>
</ADDRESS>

</BODY>
</HTML>