libraries/spcre/libpcre/pcre/doc/pcrepartial.3
author Tom Sutcliffe <thomas.sutcliffe@accenture.com>
Sat, 31 Jul 2010 19:07:57 +0100
changeset 23 092bcc217d9d
parent 0 7f656887cf89
permissions -rw-r--r--
Tidied iocli exports, build macro tweaks. Removed 4 overloads of CCommandBase::RunCommand[L] that are no longer used at all, and changed one more to not be exported as it's only used internally to iocli.dll. fixed builds on platforms that don't support btrace or any form of tracing.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     1
.TH PCREPARTIAL 3
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     2
.SH NAME
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     3
PCRE - Perl-compatible regular expressions
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     4
.SH "PARTIAL MATCHING IN PCRE"
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     5
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     6
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     7
In normal use of PCRE, if the subject string that is passed to
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     8
\fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP matches as far as it goes, but is
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
     9
too short to match the entire pattern, PCRE_ERROR_NOMATCH is returned. There
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    10
are circumstances where it might be helpful to distinguish this case from other
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    11
cases in which there is no match.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    12
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    13
Consider, for example, an application where a human is required to type in data
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    14
for a field with specific formatting requirements. An example might be a date
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    15
in the form \fIddmmmyy\fP, defined by this pattern:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    16
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    17
  ^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    18
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    19
If the application sees the user's keystrokes one by one, and can check that
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    20
what has been typed so far is potentially valid, it is able to raise an error
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    21
as soon as a mistake is made, possibly beeping and not reflecting the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    22
character that has been typed. This immediate feedback is likely to be a better
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    23
user interface than a check that is delayed until the entire string has been
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    24
entered.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    25
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    26
PCRE supports the concept of partial matching by means of the PCRE_PARTIAL
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    27
option, which can be set when calling \fBpcre_exec()\fP or
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    28
\fBpcre_dfa_exec()\fP. When this flag is set for \fBpcre_exec()\fP, the return
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    29
code PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if at any time
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    30
during the matching process the last part of the subject string matched part of
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    31
the pattern. Unfortunately, for non-anchored matching, it is not possible to
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    32
obtain the position of the start of the partial match. No captured data is set
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    33
when PCRE_ERROR_PARTIAL is returned.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    34
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    35
When PCRE_PARTIAL is set for \fBpcre_dfa_exec()\fP, the return code
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    36
PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if the end of the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    37
subject is reached, there have been no complete matches, but there is still at
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    38
least one matching possibility. The portion of the string that provided the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    39
partial match is set as the first matching string.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    40
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    41
Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    42
last literal byte in a pattern, and abandons matching immediately if such a
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    43
byte is not present in the subject string. This optimization cannot be used
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    44
for a subject string that might match only partially.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    45
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    46
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    47
.SH "RESTRICTED PATTERNS FOR PCRE_PARTIAL"
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    48
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    49
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    50
Because of the way certain internal optimizations are implemented in the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    51
\fBpcre_exec()\fP function, the PCRE_PARTIAL option cannot be used with all
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    52
patterns. These restrictions do not apply when \fBpcre_dfa_exec()\fP is used.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    53
For \fBpcre_exec()\fP, repeated single characters such as
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    54
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    55
  a{2,4}
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    56
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    57
and repeated single metasequences such as
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    58
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    59
  \ed+
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    60
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    61
are not permitted if the maximum number of occurrences is greater than one.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    62
Optional items such as \ed? (where the maximum is one) are permitted.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    63
Quantifiers with any values are permitted after parentheses, so the invalid
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    64
examples above can be coded thus:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    65
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    66
  (a){2,4}
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    67
  (\ed)+
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    68
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    69
These constructions run more slowly, but for the kinds of application that are
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    70
envisaged for this facility, this is not felt to be a major restriction.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    71
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    72
If PCRE_PARTIAL is set for a pattern that does not conform to the restrictions,
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    73
\fBpcre_exec()\fP returns the error code PCRE_ERROR_BADPARTIAL (-13).
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    74
You can use the PCRE_INFO_OKPARTIAL call to \fBpcre_fullinfo()\fP to find out
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    75
if a compiled pattern can be used for partial matching.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    76
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    77
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    78
.SH "EXAMPLE OF PARTIAL MATCHING USING PCRETEST"
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    79
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    80
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    81
If the escape sequence \eP is present in a \fBpcretest\fP data line, the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    82
PCRE_PARTIAL flag is used for the match. Here is a run of \fBpcretest\fP that
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    83
uses the date example quoted above:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    84
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    85
    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    86
  data> 25jun04\eP
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    87
   0: 25jun04
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    88
   1: jun
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    89
  data> 25dec3\eP
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    90
  Partial match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    91
  data> 3ju\eP
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    92
  Partial match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    93
  data> 3juj\eP
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    94
  No match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    95
  data> j\eP
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    96
  No match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    97
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    98
The first data string is matched completely, so \fBpcretest\fP shows the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
    99
matched substrings. The remaining four strings do not match the complete
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   100
pattern, but the first two are partial matches. The same test, using
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   101
\fBpcre_dfa_exec()\fP matching (by means of the \eD escape sequence), produces
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   102
the following output:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   103
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   104
    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   105
  data> 25jun04\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   106
   0: 25jun04
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   107
  data> 23dec3\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   108
  Partial match: 23dec3
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   109
  data> 3ju\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   110
  Partial match: 3ju
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   111
  data> 3juj\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   112
  No match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   113
  data> j\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   114
  No match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   115
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   116
Notice that in this case the portion of the string that was matched is made
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   117
available.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   118
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   119
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   120
.SH "MULTI-SEGMENT MATCHING WITH pcre_dfa_exec()"
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   121
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   122
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   123
When a partial match has been found using \fBpcre_dfa_exec()\fP, it is possible
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   124
to continue the match by providing additional subject data and calling
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   125
\fBpcre_dfa_exec()\fP again with the same compiled regular expression, this
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   126
time setting the PCRE_DFA_RESTART option. You must also pass the same working
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   127
space as before, because this is where details of the previous partial match
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   128
are stored. Here is an example using \fBpcretest\fP, using the \eR escape
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   129
sequence to set the PCRE_DFA_RESTART option (\eP and \eD are as above):
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   130
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   131
    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   132
  data> 23ja\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   133
  Partial match: 23ja
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   134
  data> n05\eR\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   135
   0: n05
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   136
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   137
The first call has "23ja" as the subject, and requests partial matching; the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   138
second call has "n05" as the subject for the continued (restarted) match.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   139
Notice that when the match is complete, only the last part is shown; PCRE does
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   140
not retain the previously partially-matched string. It is up to the calling
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   141
program to do that if it needs to.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   142
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   143
You can set PCRE_PARTIAL with PCRE_DFA_RESTART to continue partial matching
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   144
over multiple segments. This facility can be used to pass very long subject
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   145
strings to \fBpcre_dfa_exec()\fP. However, some care is needed for certain
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   146
types of pattern.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   147
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   148
1. If the pattern contains tests for the beginning or end of a line, you need
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   149
to pass the PCRE_NOTBOL or PCRE_NOTEOL options, as appropriate, when the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   150
subject string for any call does not contain the beginning or end of a line.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   151
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   152
2. If the pattern contains backward assertions (including \eb or \eB), you need
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   153
to arrange for some overlap in the subject strings to allow for this. For
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   154
example, you could pass the subject in chunks that are 500 bytes long, but in
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   155
a buffer of 700 bytes, with the starting offset set to 200 and the previous 200
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   156
bytes at the start of the buffer.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   157
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   158
3. Matching a subject string that is split into multiple segments does not
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   159
always produce exactly the same result as matching over one single long string.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   160
The difference arises when there are multiple matching possibilities, because a
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   161
partial match result is given only when there are no completed matches in a
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   162
call to \fBpcre_dfa_exec()\fP. This means that as soon as the shortest match has
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   163
been found, continuation to a new subject segment is no longer possible.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   164
Consider this \fBpcretest\fP example:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   165
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   166
    re> /dog(sbody)?/
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   167
  data> do\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   168
  Partial match: do
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   169
  data> gsb\eR\eP\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   170
   0: g
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   171
  data> dogsbody\eD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   172
   0: dogsbody
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   173
   1: dog
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   174
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   175
The pattern matches the words "dog" or "dogsbody". When the subject is
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   176
presented in several parts ("do" and "gsb" being the first two) the match stops
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   177
when "dog" has been found, and it is not possible to continue. On the other
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   178
hand, if "dogsbody" is presented as a single string, both matches are found.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   179
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   180
Because of this phenomenon, it does not usually make sense to end a pattern
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   181
that is going to be matched in this way with a variable repeat.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   182
.P
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   183
4. Patterns that contain alternatives at the top level which do not all
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   184
start with the same pattern item may not work as expected. For example,
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   185
consider this pattern:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   186
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   187
  1234|3789
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   188
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   189
If the first part of the subject is "ABC123", a partial match of the first
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   190
alternative is found at offset 3. There is no partial match for the second
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   191
alternative, because such a match does not start at the same point in the
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   192
subject string. Attempting to continue with the string "789" does not yield a
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   193
match because only those alternatives that match at one point in the subject
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   194
are remembered. The problem arises because the start of the second alternative
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   195
matches within the first alternative. There is no problem with anchored
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   196
patterns or patterns such as:
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   197
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   198
  1234|ABCD
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   199
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   200
where no string can be a partial match for both alternatives.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   201
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   202
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   203
.SH AUTHOR
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   204
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   205
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   206
.nf
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   207
Philip Hazel
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   208
University Computing Service
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   209
Cambridge CB2 3QH, England.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   210
.fi
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   211
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   212
.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   213
.SH REVISION
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   214
.rs
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   215
.sp
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   216
.nf
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   217
Last updated: 04 June 2007
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   218
Copyright (c) 1997-2007 University of Cambridge.
7f656887cf89 First submission to Symbian Foundation staging server.
Tom Sutcliffe <thomas.sutcliffe@accenture.com>
parents:
diff changeset
   219
.fi