author | MattD <mattd@symbian.org> |
Tue, 04 Aug 2009 14:40:11 +0100 | |
changeset 307 | 989c70555820 |
permissions | -rw-r--r-- |
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
=head1 NAME

XML::SAX::Intro - An Introduction to SAX Parsing with Perl

=head1 Introduction

XML::SAX is a new way to work with XML parsers in Perl. In this article
we'll discuss why you should be using SAX, why you should be using
XML::SAX, and we'll see some of the finer implementation details. The
text below assumes some familiarity with callback (push-based) parsing.
If you are unfamiliar with these techniques, a good place to start is
Kip Hampton's excellent series of articles on XML.com.

=head1 Replacing XML::Parser

The de facto way of parsing XML in Perl is to use Larry Wall and
Clark Cooper's XML::Parser. This module is a Perl and XS wrapper around
the expat XML parser library by James Clark. It has been a hugely
successful project, but suffers from a couple of rather major flaws.
Firstly, it is a proprietary API, designed before the SAX API was
conceived, which means that it is not easily replaceable by other
streaming parsers. Secondly, its callbacks are subrefs. This doesn't
sound like much of an issue, but unfortunately leads to code like:

  sub handle_start {
      my ($e, $el, %attrs) = @_;
      if ($el eq 'foo') {
          $e->{inside_foo}++; # BAD! $e is an XML::Parser::Expat object.
      }
  }

As you can see, we're using the $e object to hold our state
information, which is a bad idea because we don't own that object - we
didn't create it. It's an internal object of XML::Parser that happens
to be a hashref. We could all too easily overwrite XML::Parser's internal
state variables by using this, or Clark could change it to an array ref
(not that he would, because it would break so much code, but he could).

Currently, the only safe way to maintain state with XML::Parser is to
use a closure:

  my $state = MyState->new();
  $parser->setHandlers(Start => sub { handle_start($state, @_) });

This closure traps the $state variable, which now gets passed as the
first parameter to your callback. Unfortunately very few people use
this technique, as it is not documented in the XML::Parser POD files.
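
To make the closure technique concrete, here is a minimal sketch of a
complete script. The plain hashref used for $state, the handle_end
callback, the foo_count key, and the sample XML are illustrative
assumptions of ours; only setHandlers and the arguments passed to the
Start and End handlers come from XML::Parser itself.

```perl
use strict;
use warnings;
use XML::Parser;

# State that we own (a plain hashref here; any object of ours would do).
my $state = { inside_foo => 0, foo_count => 0 };

# Our callbacks take $state first, followed by XML::Parser's usual arguments.
sub handle_start {
    my ($state, $e, $el, %attrs) = @_;
    if ($el eq 'foo') {
        $state->{inside_foo}++;   # safe: we never touch $e's internals
        $state->{foo_count}++;
    }
}

sub handle_end {
    my ($state, $e, $el) = @_;
    $state->{inside_foo}-- if $el eq 'foo';
}

my $parser = XML::Parser->new;
$parser->setHandlers(
    # Each anonymous sub closes over $state and prepends it to the arguments.
    Start => sub { handle_start($state, @_) },
    End   => sub { handle_end($state, @_) },
);

$parser->parse('<root><foo/><foo><bar/></foo></root>');

print "foo elements seen: $state->{foo_count}\n";
```

The Expat object still arrives as $e, but nothing is ever stored on it,
so a change to its internals would not affect our bookkeeping.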

Another reason you might not want to use XML::Parser is that you
need some feature that it doesn't provide (such as validation), or you
might need to use a library that doesn't use expat, either because expat
is not installed on your system or because your ISP is restrictive. Using
SAX allows you to work around these restrictions.

=head1 Introducing SAX

SAX stands for the Simple API for XML. And simple it really is.
Constructing a SAX parser and passing events to handlers is done as
simply as:

  use XML::SAX;
  use MySAXHandler;

  my $parser = XML::SAX::ParserFactory->parser(
      Handler => MySAXHandler->new
  );

  $parser->parse_uri("foo.xml");

The important concept to grasp here is that SAX uses a factory class
called XML::SAX::ParserFactory to create a new parser instance. The
reason for this is so that you can support other underlying
parser implementations for different feature sets. This is one thing
that XML::Parser has always sorely lacked.
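
As a small sketch of what the factory buys you, the following asks for
one specific implementation by setting the $XML::SAX::ParserPackage
variable that XML::SAX::ParserFactory consults. We assume here that
XML::SAX::PurePerl, the pure-Perl parser that ships with XML::SAX, is
installed; the do-nothing XML::SAX::Base handler is only a placeholder.

```perl
use strict;
use warnings;
use XML::SAX;
use XML::SAX::Base;

# Ask the factory for one particular implementation. Without this line
# the factory picks a parser from the ones registered on the system.
local $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';

my $parser = XML::SAX::ParserFactory->parser(
    Handler => XML::SAX::Base->new,   # placeholder handler that ignores events
);

# The calling code is the same whichever implementation was chosen.
print "Parser class: ", ref($parser), "\n";
$parser->parse_string('<doc/>');
```

Swapping in another installed SAX parser should only require changing
the package name; the handler and the parse calls stay the same.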

In the code above we see the parse_uri method used, but we could
have equally well called parse_file, parse_string, or parse(). Please
see XML::SAX::Base for what these methods take as parameters, but don't
be fooled into believing parse_file takes a filename. No, it takes a
file handle, a glob, or a subclass of IO::Handle. Beware.
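
The difference is easy to see side by side. This is a hedged sketch:
the ElementCounter handler, its count key, and the temporary file are
made up for the illustration, and whichever parser the factory selects
on your system is used.

```perl
use strict;
use warnings;
use XML::SAX;
use File::Temp qw(tempfile);

{
    package ElementCounter;            # hypothetical handler for this example
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $el) = @_;
        $self->{count}++;              # count every element start event
    }
}

my $handler = ElementCounter->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# parse_string takes the XML text itself.
$parser->parse_string('<doc><item/></doc>');

# parse_file takes an open file handle, not a filename.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} '<doc><item/></doc>';
close $tmp_fh or die "Cannot close $tmp_name: $!";

open my $fh, '<', $tmp_name or die "Cannot open $tmp_name: $!";
$parser->parse_file($fh);
close $fh;

print "start_element events: $handler->{count}\n";
```

If all you have is a filename, either open it yourself as shown or use
parse_uri, as in the earlier example.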

SAX works very similarly to XML::Parser's default callback method,
except it has one major difference: rather than setting individual
callbacks, you create a new class in which to receive the callbacks.
Each callback is called as a method call on an instance of that handler
class. An example will best demonstrate this:

  package MySAXHandler;
  use base qw(XML::SAX::Base);

  sub start_document {
      my ($self, $doc) = @_;
      # process document start event
  }

  sub start_element {
      my ($self, $el) = @_;
      # process element start event
  }

Now, when we instantiate this as above, and parse some XML with this as
the handler, the methods start_document and start_element will be
called as method calls, so this would be the equivalent of directly
calling:

  $object->start_element($el);

Notice how this is different from XML::Parser's calling style, which
calls:

  start_element($e, $name, %attribs);

It's this difference between function calls and method calls that
allows you to subclass SAX handlers, which contributes to SAX being a
powerful solution.
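
Here is a minimal sketch of what subclassing a handler can look like.
The class names DepthTracker and FooFinder, the depth and found keys,
and the sample XML are all hypothetical; the only XML::SAX parts are
XML::SAX::Base, the factory, and the Name key of the element hash.

```perl
use strict;
use warnings;
use XML::SAX;

{
    package DepthTracker;                  # hypothetical base handler
    use base qw(XML::SAX::Base);

    sub start_element {
        my ($self, $el) = @_;
        $self->{depth}++;                  # one level deeper
    }

    sub end_element {
        my ($self, $el) = @_;
        $self->{depth}--;                  # back up one level
    }
}

{
    package FooFinder;                     # hypothetical subclass
    use parent -norequire, 'DepthTracker'; # parent class lives in this file

    sub start_element {
        my ($self, $el) = @_;
        $self->SUPER::start_element($el);  # keep the inherited depth tracking
        push @{ $self->{found} }, $self->{depth} if $el->{Name} eq 'foo';
    }
}

my $handler = FooFinder->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<root><foo/><bar><foo/></bar></root>');

print "foo found at depths: @{ $handler->{found} }\n";
```

Because the events arrive as method calls, FooFinder only overrides the
one event it cares about and inherits end_element unchanged.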

As you can see, unlike XML::Parser, we have to define a new package in
which to do our processing (there are hacks you can do to make this
unnecessary, but I'll leave figuring those out to the experts). The
biggest benefit of this is that you maintain your own state variable
($self in the above example) thus freeing you of the concerns listed
above. It is also an improvement in maintainability - you can place the
code in a separate file if you wish to, and your callback methods are
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
125 |
always called the same thing, rather than having to choose a suitable |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
126 |
name for them as you had to with XML::Parser. This is an obvious win. |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
127 |
|
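To make the point about state concrete, here is a small sketch in which each
handler object carries its own counters in $self. DepthCounter is a
hypothetical package made up for this example, and the events are fired by
hand rather than by a parser:

```perl
use strict;
use warnings;

# Hypothetical handler that tracks nesting depth in $self, not in globals.
package DepthCounter;

sub new {
    my ($class) = @_;
    return bless { depth => 0, max_depth => 0 }, $class;
}

sub start_element {
    my ($self, $el) = @_;
    $self->{depth}++;
    $self->{max_depth} = $self->{depth}
        if $self->{depth} > $self->{max_depth};
    return;
}

sub end_element {
    my ($self, $el) = @_;
    $self->{depth}--;
    return;
}

package main;

# Two handlers keep completely separate state.
my $first  = DepthCounter->new;
my $second = DepthCounter->new;

# Simulates <a><b/></a> for the first handler only.
$first->start_element({ Name => 'a' });
$first->start_element({ Name => 'b' });
$first->end_element({ Name => 'b' });
$first->end_element({ Name => 'a' });

print "first: $first->{max_depth}, second: $second->{max_depth}\n";
```

Nothing here touches a package variable, so two parses running side by side
cannot trample each other's bookkeeping.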
SAX parsers are also very flexible in how you pass a handler to them.
You can use a constructor parameter as we saw above, or you can pass the
handler directly in the call to one of the parse methods:

  $parser->parse(Handler => $handler,
                 Source => { SystemId => "foo.xml" });
  # or...
  $parser->parse_file($fh, Handler => $handler);

This flexibility allows for one parser to be used in many different
scenarios throughout your script (though one shouldn't feel pressure to
use this method, as parser construction is generally not a
time-consuming process).

=head1 Callback Parameters

The only other thing you need to know to understand basic SAX is the
structure of the parameters passed to each of the callbacks. In
XML::Parser, all parameters are passed to the callbacks as a flat
argument list, so for example the Start callback would be called as
my_start($e, $name, %attributes), and the PI callback would be called
as my_processing_instruction($e, $target, $data). In SAX, every
callback is passed a hash reference containing entries that define our
"node". The key callbacks and the structures they receive are:

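The two calling conventions can be put side by side in a short sketch. Both
callbacks are called by hand here, MyHandler is a hypothetical package, and
the attribute data is made up for the illustration:

```perl
use strict;
use warnings;

# XML::Parser style: a plain function receiving a flat argument list.
sub my_start {
    my ($e, $name, %attributes) = @_;
    return "$name has " . scalar(keys %attributes) . " attribute(s)";
}

# SAX style: a method receiving one hash reference describing the node.
package MyHandler;

sub new { return bless {}, shift }

sub start_element {
    my ($self, $el) = @_;
    my $count = scalar keys %{ $el->{Attributes} };
    return "$el->{Name} has $count attribute(s)";
}

package main;

# No parser is involved; the arguments are built by hand.
my $old_style = my_start(undef, 'foo', id => '1');
my $sax_style = MyHandler->new->start_element({
    Name       => 'foo',
    Attributes => {
        '{}id' => { Name => 'id', LocalName => 'id', Value => '1' },
    },
});

print "$old_style\n$sax_style\n";
```

The hash reference costs a little more typing, but new entries can be added
to it later without breaking the argument order of existing handlers.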
=head2 start_element

The start_element handler is called whenever a parser sees an opening
tag. It is passed an element structure consisting of:

=over 4

=item LocalName

The name of the element minus any namespace prefix it may
have come with in the document.

=item NamespaceURI

The URI of the namespace associated with this element,
or the empty string for none.

=item Attributes

A set of attributes as described below.

=item Name

The name of the element as it was seen in the document (i.e.
including any prefix associated with it).

=item Prefix

The prefix used to qualify this element's namespace, or the
empty string if none.

=back

The B<Attributes> are a hash reference, keyed by what we have called
"James Clark" notation. This means that the attribute name has been
expanded to include any associated namespace URI, and put together as
{ns}name, where "ns" is the expanded namespace URI of the attribute if
and only if the attribute had a prefix, and "name" is the LocalName of
the attribute. An attribute with no prefix is therefore keyed as {}name.

The value of each entry in the attributes hash is another hash
structure consisting of:

=over 4

=item LocalName

The name of the attribute minus any namespace prefix it may have
come with in the document.

=item NamespaceURI

The URI of the namespace associated with this attribute. If the
attribute had no prefix, then this consists of just the empty string.

=item Name

The attribute's name as it appeared in the document, including any
namespace prefix.

=item Prefix

The prefix used to qualify this attribute's namespace, or the
empty string if none.

=item Value

The value of the attribute.

=back

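Looking up an attribute by its {ns}name key might be sketched as follows.
The helper attribute_key is a hypothetical convenience function, not
something XML::SAX provides, and the element structure, namespace URI and
values are hand-built purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: builds the "{ns}name" key for an attribute lookup.
sub attribute_key {
    my ($namespace_uri, $local_name) = @_;
    $namespace_uri = '' unless defined $namespace_uri;
    return '{' . $namespace_uri . '}' . $local_name;
}

# Hand-built element structure of the shape described above.
my $element = {
    Name         => 'x:item',
    LocalName    => 'item',
    Prefix       => 'x',
    NamespaceURI => 'http://example.com/ns',
    Attributes   => {
        '{}id' => {
            Name => 'id', LocalName => 'id', Prefix => '',
            NamespaceURI => '', Value => '42',
        },
        '{http://example.com/ns}lang' => {
            Name => 'x:lang', LocalName => 'lang', Prefix => 'x',
            NamespaceURI => 'http://example.com/ns', Value => 'en',
        },
    },
};

my $attrs = $element->{Attributes};
my $id    = $attrs->{ attribute_key('', 'id') }{Value};
my $lang  = $attrs->{ attribute_key('http://example.com/ns', 'lang') }{Value};

print "id=$id lang=$lang\n";
```

Keying on the expanded URI rather than the prefix means the lookup keeps
working even if a document chooses a different prefix for the same
namespace.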
So a full example, as output by Data::Dumper, might be:

  ....

=head2 end_element

The end_element handler is called either when a parser sees a closing
tag, or after start_element has been called for an empty element. Note,
however, that a parser may, if it is so inclined, call characters
with an empty string when it sees an empty element. There is no simple
way in SAX to determine whether the parser in fact saw an empty element
or a start and end element with no content.

The end_element handler receives exactly the same structure as
start_element, minus the Attributes entry. Note, though, that it
should not be a reference to the same data that start_element receives,
so you may change the values in start_element without affecting
the values later seen by end_element.

989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
243 |
=head2 characters |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
244 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
245 |
The characters callback may be called in serveral circumstances. The |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
246 |
most obvious one is when seeing ordinary character data in the markup. |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
247 |
But it is also called for text in a CDATA section, and is also called |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
248 |
in other situations. A SAX parser has to make no guarantees whatsoever |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
249 |
about how many times it may call characters for a stretch of text in an |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
250 |
XML document - it may call once, or it may call once for every |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
251 |
character in the text. In order to work around this it is often |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
252 |
important for the SAX developer to use a bundling technique, where text |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
253 |
is gathered up and processed in one of the other callbacks. This is not |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
254 |
always necessary, but it is a worthwhile technique to learn, which we |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
255 |
will cover in XML::SAX::Advanced (when I get around to writing it). |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
256 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
257 |
The characters handler is called with a very simple structure - a hash |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
258 |
reference consisting of just one entry: |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
259 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
260 |
=over 4 |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
261 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
262 |
=item Data |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
263 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
264 |
The text data that was received. |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
265 |
|
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
266 |
=back |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
267 |
|
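The bundling technique mentioned above can be sketched roughly as follows. This is only an illustration: the C<TextCollector> class name and its C<buffer> and C<texts> fields are made up for this example, and the callbacks are driven by hand instead of by a real parser, so that the chunking of the text is visible. It also assumes elements are not nested; nested elements would need a stack of buffers.

```perl
package TextCollector;
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { buffer => '', texts => [] }, $class;
}

# Start with an empty buffer whenever an element opens.
sub start_element {
    my ($self, $el) = @_;
    $self->{buffer} = '';
}

# May be called any number of times for one stretch of text,
# so only append here and do the real work later.
sub characters {
    my ($self, $chars) = @_;
    $self->{buffer} .= $chars->{Data};
}

# By the time the element closes, all of its text has been seen.
sub end_element {
    my ($self, $el) = @_;
    push @{ $self->{texts} }, "$el->{Name}: $self->{buffer}";
    $self->{buffer} = '';
}

package main;

my $collector = TextCollector->new;

# Simulate a parser that delivers "Hello World" in three chunks.
$collector->start_element({ Name => 'title', Attributes => {} });
$collector->characters({ Data => $_ }) for 'Hel', 'lo ', 'World';
$collector->end_element({ Name => 'title' });

print "$_\n" for @{ $collector->{texts} };
```

However the parser chooses to split the text, the end_element callback sees it as a single string.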
=head2 comment

The comment callback is called for comment text. Unlike with
C<characters()>, the comment callback *must* be invoked just once for an
entire comment string. It receives a single simple structure - a hash
reference containing just one entry:

=over 4

=item Data

The text of the comment.

=back

=head2 processing_instruction

The processing instruction handler is called for all processing
instructions in the document. Note that these processing instructions
may appear before the document root element, or after it, or anywhere
where text and elements would normally appear within the document,
according to the XML specification.

The handler is passed a structure containing just two entries:

=over 4

=item Target

The target of the processing instruction.

=item Data

The text data in the processing instruction. Can be an empty
string for a processing instruction that has no data part.
For example E<lt>?wiggle?E<gt> is a perfectly valid processing instruction.

=back

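As a small sketch of how a handler might use these two entries, the following records each processing instruction it is given. The C<PICollector> class and the sample stylesheet instruction are invented for this example, and the events are again passed in by hand, not produced by a parser.

```perl
package PICollector;
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { seen => [] }, $class;
}

sub processing_instruction {
    my ($self, $pi) = @_;
    # Data may be an empty string, as for <?wiggle?>.
    my $data = length($pi->{Data}) ? $pi->{Data} : '(no data)';
    push @{ $self->{seen} }, "$pi->{Target} => $data";
}

package main;

my $pis = PICollector->new;

# Simulated events for <?xml-stylesheet href="a.css"?> and <?wiggle?>.
$pis->processing_instruction({ Target => 'xml-stylesheet', Data => 'href="a.css"' });
$pis->processing_instruction({ Target => 'wiggle',         Data => '' });

print "$_\n" for @{ $pis->{seen} };
```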
=head1 Tip of the iceberg

What we have discussed above is really the tip of the SAX iceberg. And
so far it looks like there's not much of interest to SAX beyond what we
have seen with XML::Parser. But it does go much further than that, I
promise.

People who hate Object Oriented code for the sake of it may be thinking
here that creating a new package just to parse something is a waste
when they've been parsing things just fine up to now using procedural
code. But there's a reason for all this madness. And that reason is SAX
Filters.

As you saw right at the very start, to let the parser know about our
class, we pass it an instance of our class as the Handler to the
parser. But now imagine what would happen if our class could also take
a Handler option, and simply do some processing and pass on our data
further down the line? That in a nutshell is how SAX filters work. It's
Unix pipes for the 21st century!

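To make the idea concrete, here is a hand-rolled sketch of such a class. Everything in it is hypothetical: C<UpcaseFilter> and C<Recorder> are names invented for this example, only three callbacks are forwarded, and the events are fed in by hand where a parser would normally sit. It is meant to show the shape of a filter, not a recommended way to write one.

```perl
package UpcaseFilter;
use strict;
use warnings;

# Like the parser, the filter accepts a Handler option.
sub new {
    my ($class, %opt) = @_;
    return bless { Handler => $opt{Handler} }, $class;
}

# Upper-case the element name, then pass the event down the line.
# A copy is made so the structure we were given is left untouched.
sub start_element {
    my ($self, $el) = @_;
    my %copy = (%$el, Name => uc($el->{Name}));
    $self->{Handler}->start_element(\%copy);
}

sub end_element {
    my ($self, $el) = @_;
    my %copy = (%$el, Name => uc($el->{Name}));
    $self->{Handler}->end_element(\%copy);
}

# Text is forwarded unchanged.
sub characters {
    my ($self, $chars) = @_;
    $self->{Handler}->characters($chars);
}

package Recorder;

# A final handler that just records what reaches the end of the chain.
sub new { my ($class) = @_; return bless { log => [] }, $class }
sub start_element { my ($self, $el) = @_; push @{ $self->{log} }, "<$el->{Name}>" }
sub end_element   { my ($self, $el) = @_; push @{ $self->{log} }, "</$el->{Name}>" }
sub characters    { my ($self, $ch) = @_; push @{ $self->{log} }, $ch->{Data} }

package main;

my $recorder = Recorder->new;
my $filter   = UpcaseFilter->new(Handler => $recorder);

# Stand in for the parser: send one element through the filter.
$filter->start_element({ Name => 'para', Attributes => {} });
$filter->characters({ Data => 'hi' });
$filter->end_element({ Name => 'para' });

print join('', @{ $recorder->{log} }), "\n";
```

Because the filter has the same interface as a handler, another filter could be passed as its Handler, which is what makes chaining possible.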
There are two downsides to this. First, writing SAX filters can be
tricky. If you look into the future and read the advanced tutorial I'm
writing, you'll see that Handler can come in several shapes and sizes.
So making sure your filter does the right thing can be tricky.
Second, constructing complex filter chains can be difficult, and
simple thinking tells us that we only get one pass at our document,
when often we'll need more than that.

Luckily though, those downsides have been fixed by the release of two
very cool modules. What's even better is that I didn't write either of
them!

The first module is XML::SAX::Base. This is a VITAL SAX module that
acts as a base class for all SAX parsers and filters. It provides an
abstraction over calling the handler methods that makes sure your
filter or parser does the right thing, and it does it FAST. So, if you
ever need to write a SAX filter (and if you're processing XML -> XML or
XML -> HTML, you probably do), you should write it as a subclass of
XML::SAX::Base. Really - this is advice not to be ignored lightly. I
will not go into the details of writing a SAX filter here. Kip Hampton,
the author of XML::SAX::Base, has covered this nicely in his article on
XML.com here <URI>.

989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
350 |
To construct SAX pipelines, Barrie Slaymaker, a long time Perl hacker |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
351 |
who's modules you will probably have heard of or used, wrote a very |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
352 |
clever module called XML::SAX::Machines. This combines some really |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
353 |
clever SAX filter-type modules, with a construction toolkit for filters |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
354 |
that makes building pipelines easy. But before we see how it makes |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
355 |
things easy, first lets see how tricky it looks to build complex SAX |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
356 |
filter pipelines. |
989c70555820
raptor parser.pl - adding XML::SAX perl modules (public domain code - http://www.saxproject.org/faq.html) so we don't have to patch up every single build machine.
MattD <mattd@symbian.org>
parents:
diff
changeset
|
357 |
|
  use XML::SAX::ParserFactory;
  use XML::SAX::Filter1;
  use XML::SAX::Filter2;
  use XML::SAX::Writer;

  # Build the pipeline back to front: each stage is handed the
  # next stage as its Handler, so the writer has to exist first.
  my $output_string;
  my $writer = XML::SAX::Writer->new(Output => \$output_string);
  my $filter2 = XML::SAX::Filter2->new(Handler => $writer);
  my $filter1 = XML::SAX::Filter1->new(Handler => $filter2);
  my $parser = XML::SAX::ParserFactory->parser(Handler => $filter1);

  $parser->parse_uri("foo.xml");

This is a lot easier with XML::SAX::Machines:

  use XML::SAX::Machines qw(Pipeline);

  my $output_string;
  my $parser = Pipeline(
    XML::SAX::Filter1 => XML::SAX::Filter2 => \$output_string
  );

  $parser->parse_uri("foo.xml");

One of the main benefits of XML::SAX::Machines is that the pipelines
are constructed in natural order, rather than the reverse order we saw
with manual pipeline construction. XML::SAX::Machines takes care of all
the internals of pipe construction, providing you at the end with just
a parser you can use (and you can re-use the same parser as many times
as you need to).

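To see why manual construction runs backwards, and how a helper can hide that, here is a minimal sketch in plain Perl. It does not use XML::SAX at all: the Collector, Upcase and Exclaim classes and the build_pipeline function are hypothetical stand-ins invented for illustration, and this is not a description of how XML::SAX::Machines works internally.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for SAX stages; none of this is XML::SAX API.
# Each stage receives text via characters() and forwards it to the
# handler it was given at construction time.

package Collector;    # end of the chain: stores what it receives
sub new { my ($class) = @_; return bless { seen => [] }, $class }
sub characters { my ($self, $text) = @_; push @{ $self->{seen} }, $text }

package Upcase;       # a filter: upper-cases text, then passes it on
sub new { my ($class, %args) = @_; return bless { Handler => $args{Handler} }, $class }
sub characters { my ($self, $text) = @_; $self->{Handler}->characters(uc $text) }

package Exclaim;      # a filter: appends "!", then passes it on
sub new { my ($class, %args) = @_; return bless { Handler => $args{Handler} }, $class }
sub characters { my ($self, $text) = @_; $self->{Handler}->characters("$text!") }

package main;

# Takes the final handler plus stage class names in natural
# (upstream-first) order, and wires them up back to front so the
# caller does not have to.
sub build_pipeline {
    my ($final, @classes) = @_;
    my $handler = $final;
    $handler = $_->new(Handler => $handler) for reverse @classes;
    return $handler;    # the head of the chain
}

my $sink = Collector->new;
my $head = build_pipeline($sink, 'Upcase', 'Exclaim');
$head->characters('hello');
print "$sink->{seen}[0]\n";
```

Here the stages are listed in the order the data flows (Upcase first, then Exclaim), and the script prints HELLO!. The reversal still happens, but only once, inside the helper.
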
Just a final tip. If you ever get stuck and are confused about what is
being passed from one SAX filter or parser to the next, then
Devel::TraceSAX will come to your rescue. This Perl debugger plugin
will allow you to dump the SAX stream of events as it goes by. Usage is
really very simple: just call your Perl script that uses SAX as follows:

  $ perl -d:TraceSAX <scriptname>

And preferably pipe the output to a pager of some sort, such as more or
less. The output is extremely verbose, but should help clear some
issues up.

=head1 AUTHOR

Matt Sergeant, matt@sergeant.org

$Id: Intro.pod,v 1.3 2002/04/30 07:16:00 matt Exp $

=cut