libraries/spcre/libpcre/pcre/doc/pcregrep.txt
changeset 0 7f656887cf89
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/libraries/spcre/libpcre/pcre/doc/pcregrep.txt	Wed Jun 23 15:52:26 2010 +0100
@@ -0,0 +1,498 @@
+PCREGREP(1)                                                        PCREGREP(1)
+
+
+NAME
+       pcregrep - a grep with Perl-compatible regular expressions.
+
+
+SYNOPSIS
+       pcregrep [options] [long options] [pattern] [path1 path2 ...]
+
+
+DESCRIPTION
+
+       pcregrep  searches  files  for  character  patterns, in the same way as
+       other grep commands do, but it uses the PCRE regular expression library
+       to support patterns that are compatible with the regular expressions of
+       Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
+       tics of the regular expressions that PCRE supports.
+
+       Patterns,  whether  supplied on the command line or in a separate file,
+       are given without delimiters. For example:
+
+         pcregrep Thursday /etc/motd
+
+       If you attempt to use delimiters (for example, by surrounding a pattern
+       with  slashes,  as  is common in Perl scripts), they are interpreted as
+       part of the pattern. Quotes can of course be used to  delimit  patterns
+       on  the  command  line  because  they are interpreted by the shell, and
+       indeed they are required if a pattern contains  white  space  or  shell
+       metacharacters.
+
+       The  first  argument that follows any option settings is treated as the
+       single pattern to be matched when neither -e nor -f is  present.   Con-
+       versely,  when  one  or  both of these options are used to specify pat-
+       terns, all arguments are treated as path names. At least one of -e, -f,
+       or an argument pattern must be provided.
+
+       If no files are specified, pcregrep reads the standard input. The stan-
+       dard input can also be referenced by a  name  consisting  of  a  single
+       hyphen.  For example:
+
+         pcregrep some-pattern /file1 - /file3
+
+       By  default, each line that matches a pattern is copied to the standard
+       output, and if there is more than one file, the file name is output  at
+       the start of each line, followed by a colon. However, there are options
+       that can change how pcregrep behaves.  In  particular,  the  -M  option
+       makes  it  possible  to  search for patterns that span line boundaries.
+       What defines a line  boundary  is  controlled  by  the  -N  (--newline)
+       option.
+
+       Patterns  are  limited  to  8K  or  BUFSIZ characters, whichever is the
+       greater.  BUFSIZ is defined in <stdio.h>. When there is more  than  one
+       pattern (specified by the use of -e and/or -f), each pattern is applied
+       to each line in the order in which they are defined,  except  that  all
+       the  -e  patterns are tried before the -f patterns. As soon as one pat-
+       tern matches (or fails to match when -v is used), no  further  patterns
+       are considered.
+
+       When  --only-matching,  --file-offsets,  or --line-offsets is used, the
+       output is the part of the line that matched (either shown literally, or
+       as an offset). In this case, scanning resumes immediately following the
+       match, so that further matches on the same line can be found.  If there
+       are multiple patterns, they are all tried on the remainder of the line.
+       However, patterns that follow the one that matched are not tried on the
+       earlier part of the line.
+
+       If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses
+       the value to set a locale when calling the PCRE library.  The  --locale
+       option can be used to override this.
+
+
+SUPPORT FOR COMPRESSED FILES
+
+       It  is  possible  to compile pcregrep so that it uses libz or libbz2 to
+       read files whose names end in .gz or .bz2, respectively. You  can  find
+       out whether your binary has support for one or both of these file types
+       by running it with the --help option. If the appropriate support is not
+       present,  files are treated as plain text. The standard input is always
+       so treated.
+
+
+OPTIONS
+
+       --        This terminate the list of options. It is useful if the  next
+                 item  on  the command line starts with a hyphen but is not an
+                 option. This allows for the processing of patterns and  file-
+                 names that start with hyphens.
+
+       -A number, --after-context=number
+                 Output  number  lines of context after each matching line. If
+                 filenames and/or line numbers are being output, a hyphen sep-
+                 arator  is  used  instead of a colon for the context lines. A
+                 line containing "--" is output between each group  of  lines,
+                 unless  they  are  in  fact contiguous in the input file. The
+                 value of number is expected to be relatively small.  However,
+                 pcregrep guarantees to have up to 8K of following text avail-
+                 able for context output.
+
+       -B number, --before-context=number
+                 Output number lines of context before each matching line.  If
+                 filenames and/or line numbers are being output, a hyphen sep-
+                 arator is used instead of a colon for the  context  lines.  A
+                 line  containing  "--" is output between each group of lines,
+                 unless they are in fact contiguous in  the  input  file.  The
+                 value  of number is expected to be relatively small. However,
+                 pcregrep guarantees to have up to 8K of preceding text avail-
+                 able for context output.
+
+       -C number, --context=number
+                 Output  number  lines  of  context both before and after each
+                 matching line.  This is equivalent to setting both -A and  -B
+                 to the same value.
+
+       -c, --count
+                 Do  not  output individual lines; instead just output a count
+                 of the number of lines that would otherwise have been output.
+                 If  several  files  are  given, a count is output for each of
+                 them. In this mode, the -A, -B, and -C options are ignored.
+
+       --colour, --color
+                 If this option is given without any data, it is equivalent to
+                 "--colour=auto".   If  data  is required, it must be given in
+                 the same shell item, separated by an equals sign.
+
+       --colour=value, --color=value
+                 This option specifies under what circumstances the part of  a
+                 line that matched a pattern should be coloured in the output.
+                 The value may be "never" (the default), "always", or  "auto".
+                 In  the  latter  case, colouring happens only if the standard
+                 output is connected to a terminal. The colour can  be  speci-
+                 fied  by  setting the environment variable PCREGREP_COLOUR or
+                 PCREGREP_COLOR. The value of this variable should be a string
+                 of  two  numbers,  separated by a semicolon.  They are copied
+                 directly into the control string for setting colour on a ter-
+                 minal,  so it is your responsibility to ensure that they make
+                 sense. If neither of the environment variables  is  set,  the
+                 default is "1;31", which gives red.
+
+       -D action, --devices=action
+                 If  an  input  path  is  not  a  regular file or a directory,
+                 "action" specifies how it is to be  processed.  Valid  values
+                 are  "read" (the default) or "skip" (silently skip the path).
+
+       -d action, --directories=action
+                 If an input path is a directory, "action" specifies how it is
+                 to  be  processed.   Valid  values  are "read" (the default),
+                 "recurse" (equivalent to the -r option), or "skip"  (silently
+                 skip  the path). In the default case, directories are read as
+                 if they were ordinary files. In some  operating  systems  the
+                 effect  of reading a directory like this is an immediate end-
+                 of-file.
+
+       -e pattern, --regex=pattern, --regexp=pattern
+                 Specify a pattern to be matched. This option can be used mul-
+                 tiple times in order to specify several patterns. It can also
+                 be used as a way of specifying a single pattern  that  starts
+                 with  a hyphen. When -e is used, no argument pattern is taken
+                 from the command line; all  arguments  are  treated  as  file
+                 names.  There is an overall maximum of 100 patterns. They are
+                 applied to each line in the order in which they  are  defined
+                 until one matches (or fails to match if -v is used). If -f is
+                 used with -e, the command line patterns  are  matched  first,
+                 followed  by  the  patterns from the file, independent of the
+                 order in which these options are specified. Note that  multi-
+                 ple use of -e is not the same as a single pattern with alter-
+                 natives. For example, X|Y finds the first character in a line
+                 that  is  X or Y, whereas if the two patterns are given sepa-
+                 rately, pcregrep finds X if it is present, even if it follows
+                 Y  in the line. It finds Y only if there is no X in the line.
+                 This really matters only if you are  using  -o  to  show  the
+                 part(s) of the line that matched.
+
+       --exclude=pattern
+                 When pcregrep is searching the files in a directory as a con-
+                 sequence of the -r (recursive  search)  option,  any  regular
+                 files whose names match the pattern are excluded. Subdirecto-
+                 ries are not excluded  by  this  option;  they  are  searched
+                 recursively,  subject  to the --exclude_dir and --include_dir
+                 options. The pattern is a PCRE  regular  expression,  and  is
+                 matched against the final component of the file name (not the
+                 entire path). If a  file  name  matches  both  --include  and
+                 --exclude,  it  is excluded.  There is no short form for this
+                 option.
+
+       --exclude_dir=pattern
+                 When pcregrep is searching the contents of a directory  as  a
+                 consequence  of  the -r (recursive search) option, any subdi-
+                 rectories whose names match the pattern are  excluded.  (Note
+                 that  the  --exclude  option does not affect subdirectories.)
+                 The pattern is a PCRE  regular  expression,  and  is  matched
+                 against  the  final  component  of  the  name (not the entire
+                 path). If a subdirectory name matches both --include_dir  and
+                 --exclude_dir,  it  is  excluded.  There is no short form for
+                 this option.
+
+       -F, --fixed-strings
+                 Interpret each pattern as a list of fixed strings,  separated
+                 by  newlines,  instead  of  as  a  regular expression. The -w
+                 (match as a word) and -x (match whole line)  options  can  be
+                 used with -F. They apply to each of the fixed strings. A line
+                 is selected if any of the fixed strings are found in it (sub-
+                 ject to -w or -x, if present).
+
+       -f filename, --file=filename
+                 Read  a  number  of patterns from the file, one per line, and
+                 match them against each line of input. A data line is  output
+                 if any of the patterns match it. The filename can be given as
+                 "-" to refer to the standard input. When -f is used, patterns
+                 specified  on  the command line using -e may also be present;
+                 they are tested before the file's patterns. However, no other
+                 pattern  is  taken  from  the command line; all arguments are
+                 treated as file names. There is an  overall  maximum  of  100
+                 patterns. Trailing white space is removed from each line, and
+                 blank lines are ignored. An empty file contains  no  patterns
+                 and  therefore  matches  nothing. See also the comments about
+                 multiple patterns versus a single pattern  with  alternatives
+                 in the description of -e above.
+
+       --file-offsets
+                 Instead  of  showing lines or parts of lines that match, show
+                 each match as an offset from the start  of  the  file  and  a
+                 length,  separated  by  a  comma. In this mode, no context is
+                 shown. That is, the -A, -B, and -C options  are  ignored.  If
+                 there is more than one match in a line, each of them is shown
+                 separately. This option is mutually  exclusive  with  --line-
+                 offsets and --only-matching.
+
+       -H, --with-filename
+                 Force  the  inclusion  of the filename at the start of output
+                 lines when searching a single file. By default, the  filename
+                 is  not  shown in this case. For matching lines, the filename
+                 is followed by a colon and a  space;  for  context  lines,  a
+                 hyphen separator is used. If a line number is also being out-
+                 put, it follows the file name without a space.
+
+       -h, --no-filename
+                 Suppress the output filenames when searching multiple  files.
+                 By  default,  filenames  are  shown  when  multiple files are
+                 searched. For matching lines, the filename is followed  by  a
+                 colon  and  a space; for context lines, a hyphen separator is
+                 used. If a line number is also being output, it  follows  the
+                 file name without a space.
+
+       --help    Output  a  help  message, giving brief details of the command
+                 options and file type support, and then exit.
+
+       -i, --ignore-case
+                 Ignore upper/lower case distinctions during comparisons.
+
+       --include=pattern
+                 When pcregrep is searching the files in a directory as a con-
+                 sequence of the -r (recursive search) option, only those reg-
+                 ular files whose names match the pattern are included. Subdi-
+                 rectories  are always included and searched recursively, sub-
+                 ject to the --include_dir and --exclude_dir options. The pat-
+                 tern is a PCRE regular expression, and is matched against the
+                 final component of the file name (not the entire path). If  a
+                 file  name  matches  both  --include  and  --exclude,  it  is
+                 excluded. There is no short form for this option.
+
+       --include_dir=pattern
+                 When pcregrep is searching the contents of a directory  as  a
+                 consequence  of  the -r (recursive search) option, only those
+                 subdirectories whose names match the  pattern  are  included.
+                 (Note  that  the --include option does not affect subdirecto-
+                 ries.) The pattern is  a  PCRE  regular  expression,  and  is
+                 matched  against  the  final  component  of the name (not the
+                 entire  path).  If   a   subdirectory   name   matches   both
+                 --include_dir  and --exclude_dir, it is excluded. There is no
+                 short form for this option.
+
+       -L, --files-without-match
+                 Instead of outputting lines from the files, just  output  the
+                 names  of  the files that do not contain any lines that would
+                 have been output. Each file name is output once, on  a  sepa-
+                 rate line.
+
+       -l, --files-with-matches
+                 Instead  of  outputting lines from the files, just output the
+                 names of the files containing lines that would have been out-
+                 put.  Each  file  name  is  output  once, on a separate line.
+                 Searching stops as soon as a matching  line  is  found  in  a
+                 file.
+
+       --label=name
+                 This option supplies a name to be used for the standard input
+                 when file names are being output. If not supplied, "(standard
+                 input)" is used. There is no short form for this option.
+
+       --line-offsets
+                 Instead  of  showing lines or parts of lines that match, show
+                 each match as a line number, the offset from the start of the
+                 line,  and a length. The line number is terminated by a colon
+                 (as usual; see the -n option), and the offset and length  are
+                 separated  by  a  comma.  In  this mode, no context is shown.
+                 That is, the -A, -B, and -C options are ignored. If there  is
+                 more  than  one  match in a line, each of them is shown sepa-
+                 rately. This option is mutually exclusive with --file-offsets
+                 and --only-matching.
+
+       --locale=locale-name
+                 This  option specifies a locale to be used for pattern match-
+                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
+                 ronment  variables.  If  no  locale  is  specified,  the PCRE
+                 library's default (usually the "C" locale) is used. There  is
+                 no short form for this option.
+
+       -M, --multiline
+                 Allow  patterns to match more than one line. When this option
+                 is given, patterns may usefully contain literal newline char-
+                 acters  and  internal  occurrences of ^ and $ characters. The
+                 output for any one match may consist of more than  one  line.
+                 When  this option is set, the PCRE library is called in "mul-
+                 tiline" mode.  There is a limit to the number of  lines  that
+                 can  be matched, imposed by the way that pcregrep buffers the
+                 input file as it scans it. However, pcregrep ensures that  at
+                 least 8K characters or the rest of the document (whichever is
+                 the shorter) are available for forward  matching,  and  simi-
+                 larly the previous 8K characters (or all the previous charac-
+                 ters, if fewer than 8K) are guaranteed to  be  available  for
+                 lookbehind assertions.
+
+       -N newline-type, --newline=newline-type
+                 The  PCRE  library  supports  five  different conventions for
+                 indicating the ends of lines. They are  the  single-character
+                 sequences  CR  (carriage  return) and LF (linefeed), the two-
+                 character sequence CRLF, an "anycrlf" convention, which  rec-
+                 ognizes  any  of the preceding three types, and an "any" con-
+                 vention, in which any Unicode line ending sequence is assumed
+                 to  end a line. The Unicode sequences are the three just men-
+                 tioned,  plus  VT  (vertical  tab,  U+000B),  FF   (formfeed,
+                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
+                 U+2028), and PS (paragraph separator, U+2029).
+
+                 When  the  PCRE  library  is  built,  a  default  line-ending
+                 sequence   is  specified.   This  is  normally  the  standard
+                 sequence for the operating system. Unless otherwise specified
+                 by  this  option,  pcregrep  uses the library's default.  The
+                 possible values for this option are CR, LF, CRLF, ANYCRLF, or
+                 ANY.  This  makes  it  possible to use pcregrep on files that
+                 have come from other environments without  having  to  modify
+                 their  line  endings.  If the data that is being scanned does
+                 not agree with the convention set by  this  option,  pcregrep
+                 may behave in strange ways.
+
+       -n, --line-number
+                 Precede each output line by its line number in the file, fol-
+                 lowed by a colon and a space for matching lines or  a  hyphen
+                 and  a space for context lines. If the filename is also being
+                 output, it precedes the line number. This option is forced if
+                 --line-offsets is used.
+
+       -o, --only-matching
+                 Show  only  the  part  of the line that matched a pattern. In
+                 this mode, no context is shown. That is, the -A, -B,  and  -C
+                 options  are  ignored.  If  there is more than one match in a
+                 line, each of them is shown separately.  If  -o  is  combined
+                 with  -v  (invert the sense of the match to find non-matching
+                 lines), no output is generated, but the return  code  is  set
+                 appropriately. This option is mutually exclusive with --file-
+                 offsets and --line-offsets.
+
+       -q, --quiet
+                 Work quietly, that is, display nothing except error messages.
+                 The  exit  status  indicates  whether or not any matches were
+                 found.
+
+       -r, --recursive
+                 If any given path is a directory, recursively scan the  files
+                 it  contains, taking note of any --include and --exclude set-
+                 tings. By default, a directory is read as a normal  file;  in
+                 some  operating  systems this gives an immediate end-of-file.
+                 This option is a shorthand  for  setting  the  -d  option  to
+                 "recurse".
+
+       -s, --no-messages
+                 Suppress  error  messages  about  non-existent  or unreadable
+                 files. Such files are quietly skipped.  However,  the  return
+                 code is still 2, even if matches were found in other files.
+
+       -u, --utf-8
+                 Operate  in UTF-8 mode. This option is available only if PCRE
+                 has been compiled with UTF-8 support. Both patterns and  sub-
+                 ject lines must be valid strings of UTF-8 characters.
+
+       -V, --version
+                 Write  the  version  numbers of pcregrep and the PCRE library
+                 that is being used to the standard error stream.
+
+       -v, --invert-match
+                 Invert the sense of the match, so that  lines  which  do  not
+                 match any of the patterns are the ones that are found.
+
+       -w, --word-regex, --word-regexp
+                 Force the patterns to match only whole words. This is equiva-
+                 lent to having \b at the start and end of the pattern.
+
+       -x, --line-regex, --line-regexp
+                 Force the patterns to be anchored (each must  start  matching
+                 at  the beginning of a line) and in addition, require them to
+                 match entire lines. This is equivalent  to  having  ^  and  $
+                 characters at the start and end of each alternative branch in
+                 every pattern.
+
+
+ENVIRONMENT VARIABLES
+
+       The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
+       order,  for  a  locale.  The first one that is set is used. This can be
+       overridden by the --locale option.  If  no  locale  is  set,  the  PCRE
+       library's default (usually the "C" locale) is used.
+
+
+NEWLINES
+
+       The  -N (--newline) option allows pcregrep to scan files with different
+       newline conventions from the default.  However,  the  setting  of  this
+       option  does not affect the way in which pcregrep writes information to
+       the standard error and output streams. It uses the  string  "\n"  in  C
+       printf()  calls  to  indicate newlines, relying on the C I/O library to
+       convert this to an appropriate sequence if the  output  is  sent  to  a
+       file.
+
+
+OPTIONS COMPATIBILITY
+
+       The majority of short and long forms of pcregrep's options are the same
+       as in the GNU grep program. Any long option of  the  form  --xxx-regexp
+       (GNU  terminology) is also available as --xxx-regex (PCRE terminology).
+       However, the --locale, -M, --multiline, -u,  and  --utf-8  options  are
+       specific to pcregrep.
+
+
+OPTIONS WITH DATA
+
+       There are four different ways in which an option with data can be spec-
+       ified.  If a short form option is used, the  data  may  follow  immedi-
+       ately, or in the next command line item. For example:
+
+         -f/some/file
+         -f /some/file
+
+       If  a long form option is used, the data may appear in the same command
+       line item, separated by an equals character, or (with one exception) it
+       may appear in the next command line item. For example:
+
+         --file=/some/file
+         --file /some/file
+
+       Note,  however, that if you want to supply a file name beginning with ~
+       as data in a shell command, and have the  shell  expand  ~  to  a  home
+       directory, you must separate the file name from the option, because the
+       shell does not treat ~ specially unless it is at the start of an  item.
+
+       The  exception  to  the  above is the --colour (or --color) option, for
+       which the data is optional. If this option does have data, it  must  be
+       given  in  the first form, using an equals character. Otherwise it will
+       be assumed that it has no data.
+
+
+MATCHING ERRORS
+
+       It is possible to supply a regular expression that takes  a  very  long
+       time  to  fail  to  match certain lines. Such patterns normally involve
+       nested indefinite repeats, for example: (a+)*\d when matched against  a
+       line  of  a's  with  no  final  digit. The PCRE matching function has a
+       resource limit that causes it to abort in these circumstances. If  this
+       happens, pcregrep outputs an error message and the line that caused the
+       problem to the standard error stream. If there are more  than  20  such
+       errors, pcregrep gives up.
+
+
+DIAGNOSTICS
+
+       Exit status is 0 if any matches were found, 1 if no matches were found,
+       and 2 for syntax errors and non-existent or inacessible files (even  if
+       matches  were  found in other files) or too many matching errors. Using
+       the -s option to suppress error messages about inaccessble  files  does
+       not affect the return code.
+
+
+SEE ALSO
+
+       pcrepattern(3), pcretest(1).
+
+
+AUTHOR
+
+       Philip Hazel
+       University Computing Service
+       Cambridge CB2 3QH, England.
+
+
+REVISION
+
+       Last updated: 08 March 2008
+       Copyright (c) 1997-2008 University of Cambridge.