tests/auto/qxmlstream/XML-Test-Suite/xmlconf/xmltest/canonxml.html
changeset 0 1918ee327afb
child 7 3f74d0d4af4c
-1:000000000000 0:1918ee327afb
       
     1 <HTML>
       
     2 <TITLE>Canonical XML</TITLE>
       
     3 <BODY>
       
     4 <H1>Canonical XML</H1>
       
     5 <P>
       
     6 This document defines a subset of XML called canonical XML.
       
     7 The intended use of canonical XML is in testing XML processors,
       
     8 as a representation of the result of parsing an XML document.
       
     9 <P>
       
    10 Every well-formed XML document has a unique structurally equivalent
       
    11 canonical XML document.  Two structurally equivalent XML
       
    12 documents have a byte-for-byte identical canonical XML document.
       
    13 Canonicalizing an XML document requires only information that an XML
       
    14 processor is required to make available to an application.
       
    15 <P>
       
    16 A canonical XML document conforms to the following grammar:
       
    17 <PRE>
       
    18 CanonXML    ::= Pi* element Pi*
       
    19 element     ::= Stag (Datachar | Pi | element)* Etag
       
    20 Stag        ::= '&lt;'  Name Atts '&gt;'
       
    21 Etag        ::= '&lt;/' Name '&gt;'
       
    22 Pi          ::= '&lt;?' Name ' ' (((Char - S) Char*)? - (Char* '?&gt;' Char*)) '?&gt;'
       
    23 Atts        ::= (' ' Name '=' '"' Datachar* '"')*
       
    24 Datachar    ::= '&amp;amp;' | '&amp;lt;' | '&amp;gt;' | '&amp;quot;'
       
    25                  | '&amp;#9;'| '&amp;#10;'| '&amp;#13;'
       
    26                  | (Char - ('&amp;' | '&lt;' | '&gt;' | '"' | #x9 | #xA | #xD))
       
    27 Name        ::= (see XML spec)
       
    28 Char        ::= (see XML spec)
       
    29 S           ::= (see XML spec)
       
    30 </PRE>
       
    31 <P>
       
    32 Attributes are in lexicographical order (in Unicode bit order).
       
    33 <P>
       
    34 A canonical XML document is encoded in UTF-8.
       
    35 <P>
       
    36 Ignorable white space is considered significant and is treated equivalently
       
    37 to data.
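<P>
As an illustration of the grammar and the three rules above, the following is a minimal sketch of a canonicalizer in Python. It is not part of the test suite. The names escape_datachar and canonicalize are hypothetical, the standard library's expat binding is assumed as the XML processor, and "Unicode bit order" is interpreted here as ordering attribute names by code point, which is an assumption about the intended ordering.

```python
import xml.parsers.expat

# Characters that the Datachar production requires to be written as references.
_ESCAPES = {
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;",
    "\t": "&#9;", "\n": "&#10;", "\r": "&#13;",
}


def escape_datachar(text):
    """Escape text so that every character matches the Datachar production."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def canonicalize(xml_bytes):
    """Return a canonical XML form of a well-formed document (hypothetical helper)."""
    out = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # Stag: '<' Name Atts '>', with attributes sorted by name.
        # Python compares strings by code point (assumed to be the intended order).
        out.append("<" + name)
        for key in sorted(attrs):
            out.append(' %s="%s"' % (key, escape_datachar(attrs[key])))
        out.append(">")

    def end(name):
        # Etag: empty-element tags are expanded into a start/end pair.
        out.append("</" + name + ">")

    def chars(data):
        # All character data inside elements is kept, white space included.
        out.append(escape_datachar(data))

    def pi(target, data):
        # Pi: target, exactly one space, then the data (possibly empty).
        out.append("<?%s %s?>" % (target, data))

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.ProcessingInstructionHandler = pi
    parser.Parse(xml_bytes, True)

    # A canonical XML document is encoded in UTF-8.
    return "".join(out).encode("utf-8")
```

Comments and the XML declaration do not appear in the output because no handlers are registered for them, which matches the grammar having no production for either.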
       
    38 <P>
       
    39 <ADDRESS>
       
    40 <A HREF="mailto:jjc@jclark.com">James Clark</A>
       
    41 </ADDRESS>
       
    42 
       
    43 </BODY>
       
    44 </HTML>