author | eckhart.koppen@nokia.com |
Wed, 31 Mar 2010 11:06:36 +0300 | |
changeset 7 | f7bc934e204c |
parent 0 | 1918ee327afb |
permissions | -rw-r--r-- |
7
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
1 |
<HTML> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
2 |
<TITLE>Canonical XML</TITLE> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
3 |
<BODY> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
4 |
<H1>Canonical XML</H1> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
5 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
6 |
This document defines a subset of XML called canonical XML. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
7 |
The intended use of canonical XML is in testing XML processors, |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
8 |
as a representation of the result of parsing an XML document. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
9 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
10 |
Every well-formed XML document has a unique structurally equivalent |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
11 |
canonical XML document. Two structurally equivalent XML |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
12 |
documents have a byte-for-byte identical canonical XML document. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
13 |
Canonicalizing an XML document requires only information that an XML |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
14 |
processor is required to make available to an application. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
15 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
16 |
A canonical XML document conforms to the following grammar: |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
17 |
<PRE> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
18 |
CanonXML ::= Pi* element Pi* |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
19 |
element ::= Stag (Datachar | Pi | element)* Etag |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
20 |
Stag ::= '<' Name Atts '>' |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
21 |
Etag ::= '</' Name '>' |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
22 |
Pi ::= '<?' Name ' ' (((Char - S) Char*)? - (Char* '?>' Char*)) '?>' |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
23 |
Atts ::= (' ' Name '=' '"' Datachar* '"')* |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
24 |
Datachar ::= '&amp;' | '&lt;' | '&gt;' | '&quot;' |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
25 |
| '&#9;'| '&#10;'| '&#13;' |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
26 |
| (Char - ('&' | '<' | '>' | '"' | #x9 | #xA | #xD)) |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
27 |
Name ::= (see XML spec) |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
28 |
Char ::= (see XML spec) |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
29 |
S ::= (see XML spec) |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
30 |
</PRE> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
31 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
32 |
Attributes are in lexicographical order (in Unicode bit order). |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
33 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
34 |
A canonical XML document is encoded in UTF-8. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
35 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
36 |
Ignorable white space is considered significant and is treated equivalently |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
37 |
to data. |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
38 |
<P> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
39 |
<ADDRESS> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
40 |
<A HREF="mailto:jjc@jclark.com">James Clark</A> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
41 |
</ADDRESS> |
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
42 |
|
f7bc934e204c
5cabc75a39ca2f064f70b40f72ed93c74c4dc19b
eckhart.koppen@nokia.com
parents:
0
diff
changeset
|
43 |
</BODY> |
0 | 44 |
</HTML> |