Symbian3/SDK/Source/GUID-CB568D14-1B0C-568F-B9CA-DDD15C53EEF3.dita
author Dominic Pinkman <dominic.pinkman@nokia.com>
Wed, 16 Jun 2010 10:24:13 +0100

<?xml version="1.0" encoding="utf-8"?>
<!-- Copyright (c) 2007-2010 Nokia Corporation and/or its subsidiary(-ies) All rights reserved. -->
<!-- This component and the accompanying materials are made available under the terms of the License 
"Eclipse Public License v1.0" which accompanies this distribution, 
and is available at the URL "http://www.eclipse.org/legal/epl-v10.html". -->
<!-- Initial Contributors:
    Nokia Corporation - initial contribution.
Contributors: 
-->
<!DOCTYPE concept
  PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept xml:lang="en" id="GUID-CB568D14-1B0C-568F-B9CA-DDD15C53EEF3"><title>Writing a Parser Plug-in</title><prolog><metadata><keywords/></metadata></prolog><conbody><p>This section describes how to write a parser plug-in. </p> <section><title>Introduction</title> <p>The Symbian XML framework supplies an XML parser plug-in which is based on <xref scope="external" href="http://expat.sourceforge.net/">Expat</xref>. The framework provides plug-ins with standard features. However, you can customise a plug-in to suit your requirements, for example to parse only part of a document or to release a specific resource. </p> <p>The Symbian platform XML framework defines certain standard features which a parser may have. When designing a parser, consider which of these features it provides. A parser may provide the following standard features: </p> <ul><li id="GUID-86B70B90-0CAA-53CE-9B81-1B22A9B83A89"><p>Report unrecognised tags, namespaces, namespace prefixes and namespace mappings. </p> </li> <li id="GUID-20C1FD61-F800-57D0-A489-5C4243D661CA"><p>Convert element and attribute names to lower case, so that parsing is case-insensitive, as in an HTML parser. </p> </li> <li id="GUID-D67D42B2-0B50-5F2A-A41F-57D04FDEB610"><p>Describe the data in a specified encoding. The default is UTF-8. </p> </li> <li id="GUID-99EA2A0E-D83E-5E1A-8E81-B1005AEE67E8"><p>Accept both XML 1.0 and XML 1.1. The default is to accept XML 1.0 only. </p> </li> </ul> <p>A user-defined parser plug-in must implement the <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref> interface, which has six pure virtual functions. Three of them concern the parser features listed above. 
Two other methods perform the parsing: they implement the parse functions of the <xref href="GUID-3C824E3B-68AB-31C5-A3D7-26A73B53D076.dita"><apiname>CParser</apiname></xref> class discussed in <xref href="GUID-E16070E5-379A-5818-81CC-B00059A40084.dita">Choosing a Parser Plug-in</xref>. The remaining method releases resources. The following table lists the functions of <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref>: </p> <table id="GUID-66D19903-DB61-5E37-B79E-65DD12F80A7C"><tgroup cols="2"><colspec colname="col0"/><colspec colname="col1"/><thead><row><entry>API</entry> <entry>Description</entry> </row> </thead> <tbody><row><entry><p> <xref href="GUID-7A7E7EF1-BF0B-32E7-B278-2E0D32B51629.dita"><apiname>EnableFeature()</apiname></xref>  </p> </entry> <entry><p>Enables the specified feature. </p> </entry> </row> <row><entry><p> <xref href="GUID-E5481858-1D95-3870-B0E9-66DFA521E873.dita"><apiname>DisableFeature()</apiname></xref>  </p> </entry> <entry><p>Disables the specified feature. </p> </entry> </row> <row><entry><p> <xref href="GUID-F593E8CE-4D9C-39B6-9CE1-AA4CA6D5A240.dita"><apiname>IsFeatureEnabled()</apiname></xref>  </p> </entry> <entry><p>Checks whether the specified feature is enabled. </p> </entry> </row> <row><entry><p> <xref href="GUID-E603C5F8-7D4E-3857-A82F-45C78D654C68.dita"><apiname>ParseChunkL()</apiname></xref>  </p> </entry> <entry><p>Parses part of a document. Implements <xref href="GUID-3C824E3B-68AB-31C5-A3D7-26A73B53D076.dita#GUID-3C824E3B-68AB-31C5-A3D7-26A73B53D076/GUID-ED062E34-DE9F-3191-952C-9E5DB081E389"><apiname>CParser::ParseL()</apiname></xref>. </p> </entry> </row> <row><entry><p> <xref href="GUID-FC9D6980-D12C-39DE-9DE3-C3236D699EFB.dita"><apiname>ParseLastChunkL()</apiname></xref>  </p> </entry> <entry><p>Parses the last part of a document. Implements <codeph>CParser::ParseEndL()</codeph>. 
</p> </entry> </row> <row><entry><p> <xref href="GUID-7F8FDB43-B847-3AFF-A78F-48F2E3DBFDC2.dita"><apiname>Release()</apiname></xref>  </p> </entry> <entry><p>Must be called to release resources when the framework has finished using the parser implementation. </p> </entry> </row> </tbody> </tgroup> </table> <p>Some documents contain markup from more than one XML application, which means that the parser may encounter tags and attributes which look the same but belong to different namespaces. This is why the <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref> interface provides a feature for reporting namespaces. XML associates tags and attributes with namespaces by adding a prefix to them, and each prefix is mapped to the URI that identifies the namespace. The class <xref href="GUID-6CEDFB6D-16B6-3860-922B-15A23C182DB2.dita"><apiname>RTagInfo</apiname></xref> is provided to hold this information. It is initialised with three strings representing the URI, prefix and local name, and this information can be retrieved by calling <xref href="GUID-DA922EE4-F0EB-38BA-9A68-08B9AD536E22.dita"><apiname>Uri()</apiname></xref>, <xref href="GUID-C9E56BB6-4DD9-3735-84F6-48C671B7270E.dita"><apiname>Prefix()</apiname></xref> and <xref href="GUID-BE775E7E-560D-3B41-9BE3-87F7495808C3.dita"><apiname>LocalName()</apiname></xref> respectively. If the application has to parse documents which combine multiple namespaces, then the implementation of <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref> must hold each parsed tag and attribute in an <xref href="GUID-6CEDFB6D-16B6-3860-922B-15A23C182DB2.dita"><apiname>RTagInfo</apiname></xref> object. The content handler then has sufficient information to react differently to tags in different namespaces. </p> <p>Some XML applications, notably WBXML, extend XML syntax by adding extension tokens to the markup language. 
The <xref scope="external" href="http://www.w3.org/TR/wbxml/">WBXML specification</xref> defines nine global extension tokens but does not assign semantics to them. The meaning of extension tokens is specific to the document in which they are used, but they are typically used in compression, to identify data which must be compressed in a specific way. For instance, extension tokens are sometimes used to identify data as being variables rather than constants, or as having a particular data type. To handle extension tokens, a parser plug-in must implement the method <xref href="GUID-9D19F08C-CB2F-3644-A74F-8A7C453690B5.dita#GUID-9D19F08C-CB2F-3644-A74F-8A7C453690B5/GUID-60CBCF44-5675-359A-8626-58BDEF767C0C"><apiname>WbxmlExtensionHandler::OnExtensionL()</apiname></xref>, which takes three parameters: <xref href="GUID-33D3FD1B-06E6-38A5-9446-571A116894B0.dita"><apiname>aData</apiname></xref>, <xref href="GUID-52E94B14-A781-3645-8794-1442A1B05D71.dita"><apiname>aToken</apiname></xref> and <xref href="GUID-A7FB1F60-B735-37BC-ACEA-675C4B44CCF8.dita"><apiname>aErrorCode</apiname></xref>. The first parameter holds the actual data, the second specifies the global extension token and the third is the error code. </p> </section> <section><title>Procedure</title> <p>To write a parser plug-in, follow these steps: </p> <ol id="GUID-33F33876-0B72-528F-AAF5-3AC785AF7ED7"><li id="GUID-92AFD990-5EE3-5244-9DD7-38E3730A5352"><p>Encapsulate the data structures in <codeph>TParserInitParams</codeph>. </p> <p>The main data structures required are contained in the <xref href="GUID-5F929E9F-E895-3DA6-8072-28C4B8B1CF81.dita"><apiname>TParserInitParams</apiname></xref> class, which is typically passed as a parameter to the constructor of an <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref> implementation. 
<xref href="GUID-5F929E9F-E895-3DA6-8072-28C4B8B1CF81.dita"><apiname>TParserInitParams</apiname></xref> has members of the following classes: </p> <table id="GUID-342A5B66-015B-5490-BDDA-B5ED84068652"><tgroup cols="2"><colspec colname="col0"/><colspec colname="col1"/><thead><row><entry>Class</entry> <entry>Description</entry> </row> </thead> <tbody><row><entry><p> <xref href="GUID-E221E7DA-BDB3-3B92-9D96-5D4605CBF195.dita"><apiname>CCharSetConverter</apiname></xref>  </p> </entry> <entry><p>Used to convert text to and from Unicode. </p> </entry> </row> <row><entry><p> <xref href="GUID-A6B8386B-29F6-3BEC-9D77-D8E0900DEAC2.dita"><apiname>MContentHandler</apiname></xref>  </p> </entry> <entry><p>Interface to the application which is written to handle the output of the parser. It is discussed in Using Symbian XML Framework. </p> </entry> </row> <row><entry><p> <xref href="GUID-094A4884-182E-3A10-80F5-85A925020BC1.dita"><apiname>RStringDictionaryCollection</apiname></xref>  </p> </entry> <entry><p>A collection of string dictionaries, discussed in <xref href="GUID-F4099885-55A0-5ACF-A73B-9C647B02B142.dita">Customising a Parser</xref>. A string dictionary is an implementation of the <xref href="GUID-AC915E8D-8B87-3A83-BF28-6C0B3B6CFB89.dita"><apiname>MStringDictionary</apiname></xref> interface, which is used to tokenise XML input into tagged elements in accordance with the DTD associated with the document to be parsed. </p> </entry> </row> <row><entry><p> <xref href="GUID-AE53784D-B405-34D8-9A93-ACDE6F8ECA44.dita"><apiname>RElementStack</apiname></xref>  </p> </entry> <entry><p>An array structure used to stack elements in the order in which the parser encounters them. </p> </entry> </row> </tbody> </tgroup> </table> <codeblock id="GUID-EC2BA881-38B9-5E89-AB69-03CF224998DF" xml:space="preserve">
// Factory function: aInitParams is expected to point to a TParserInitParams object.
MParser* CMyParser::NewL(TAny* aInitParams)
    {
    CMyParser* self = new( ELeave ) CMyParser( reinterpret_cast&lt;TParserInitParams*&gt;( aInitParams ) );
    return( static_cast&lt;MParser*&gt;( self ) );
    }

// Constructor: stores the pointers supplied in TParserInitParams.
CMyParser::CMyParser( TParserInitParams* aInitParams )
:   iContentHandler( reinterpret_cast&lt;MContentHandler*&gt;( aInitParams-&gt;iContentHandler ) ),
    iStringDictionaryCollection( reinterpret_cast&lt;RStringDictionaryCollection*&gt;( aInitParams-&gt;iStringDictionaryCollection ) ),
    iCharSetConverter( reinterpret_cast&lt;CCharSetConverter*&gt;( aInitParams-&gt;iCharSetConverter ) ),
    iElementStack( reinterpret_cast&lt;RElementStack*&gt;( aInitParams-&gt;iElementStack ) )
    {
    }</codeblock> </li> <li id="GUID-1C9D73E8-1A14-51EF-9E41-03E6EFA3165B"><p>Select the XML parser features that the plug-in supports. </p> </li> <li id="GUID-B5493F5A-41CF-5A3A-806C-D7A56C26B981"><p>Implement the <codeph>CMyParser</codeph> class, derived from <codeph>MParser</codeph>. </p> <codeblock id="GUID-2D9DFA8A-A6E1-5279-AAD3-870AB12387BF" xml:space="preserve">
class CMyParser : public MParser
    {
    public:
        /** Factory function; public so that the code creating the plug-in can call it. */
        static MParser* NewL( TAny* aInitParams );
        virtual ~CMyParser();

        /** Enable a feature. */
        TInt EnableFeature( TInt aParserFeature )
            {
            // your code here to enable the specified feature
            return KErrNotSupported; // placeholder: replace with the real result
            }
        /** Disable a feature. */
        TInt DisableFeature( TInt aParserFeature )
            {
            // your code here to disable the specified feature
            return KErrNotSupported; // placeholder: replace with the real result
            }
        /** See if a feature is enabled. */
        TBool IsFeatureEnabled( TInt aParserFeature ) const
            {
            // your code here to check if the specified feature is enabled
            return EFalse; // placeholder: replace with the real result
            }
        /** Parses a descriptor that contains part of a document. */
        void ParseChunkL( const TDesC8&amp; aChunk )
            {
            // your code here
            }
        /** Parses a descriptor that contains the last part of a document. */
        void ParseLastChunkL( const TDesC8&amp; aFinalChunk )
            {
            // your code here
            }
        /** Interfaces don't have a destructor, so we have an explicit method instead. */
        void Release()
            {
            // your code here, for example to free resources and delete this object
            }
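
    private:
        // Illustrative sketch only: the declarations below are assumptions added to
        // match the constructor and member initialisers shown in step 1. They are
        // not part of the MParser interface, and the pointers are assumed to refer
        // to objects that the plug-in does not own.
        CMyParser( TParserInitParams* aInitParams );

        MContentHandler* iContentHandler;
        RStringDictionaryCollection* iStringDictionaryCollection;
        CCharSetConverter* iCharSetConverter;
        RElementStack* iElementStack;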
    };</codeblock> </li> <li id="GUID-B6B9ED7E-EFD0-5788-A0BE-6C8911FBF66E"><p>Release resources using <xref href="GUID-7F8FDB43-B847-3AFF-A78F-48F2E3DBFDC2.dita"><apiname>Release()</apiname></xref>. </p> <p>When a parse fails, the parser object must be destroyed. The implementations of the <xref href="GUID-6F334B00-8026-3FA3-AE96-B0A511030B7B.dita"><apiname>MParser</apiname></xref> and <xref href="GUID-A6B8386B-29F6-3BEC-9D77-D8E0900DEAC2.dita"><apiname>MContentHandler</apiname></xref> methods must therefore call <xref href="GUID-C197C9A7-EA05-3F24-9854-542E984C612D.dita#GUID-C197C9A7-EA05-3F24-9854-542E984C612D/GUID-96AFAC46-F3AD-392B-8A97-AFCBF2978CFB"><apiname>User::LeaveIfError()</apiname></xref>, passing the error code as the parameter. Specific error codes are supplied for various cases, as discussed in the <xref href="GUID-5DACAB53-6D32-5250-9BC2-3E8597C3E2B2.dita">Error Codes</xref> section of this document. </p> </li> </ol> </section> </conbody></concept>