Diffstat (limited to 'doc/pcre2grep.txt')
-rw-r--r-- doc/pcre2grep.txt 103
1 files changed, 80 insertions, 23 deletions
diff --git a/doc/pcre2grep.txt b/doc/pcre2grep.txt
index 29cd75c..31aa610 100644
--- a/doc/pcre2grep.txt
+++ b/doc/pcre2grep.txt
@@ -493,14 +493,20 @@ OPTIONS
end of that line.
When this option is set, the PCRE2 library is called in "mul-
- tiline" mode. However, pcre2grep still processes the input
- line by line. The difference is that a matched string may
- extend past the end of a line and continue on one or more
- subsequent lines. The newline sequence must be matched as
- part of the pattern. For example, to find the phrase "regular
- expression" in a file where "regular" might be at the end of
- a line and "expression" at the start of the next line, you
- could use this command:
+ tiline" mode. This allows a matched string to extend past the
+ end of a line and continue on one or more subsequent lines.
+ However, pcre2grep still processes the input line by line.
+ Once a match has been handled, scanning restarts at the
+ beginning of the next line, just as it does when -M is not
+ present. This means that it is possible for the second or
+ subsequent lines in a multiline match to be output again as
+ part of another match.
+
+ The newline sequence that separates multiple lines must be
+ matched as part of the pattern. For example, to find the
+ phrase "regular expression" in a file where "regular" might
+ be at the end of a line and "expression" at the start of the
+ next line, you could use this command:
pcre2grep -M 'regular\s+expression' <file>
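
The restart rule described in the new -M text can be illustrated with a small model. The following is only a sketch in Python, not pcre2grep's implementation: the function name multiline_grep is hypothetical, Python's re module stands in for PCRE2, and the whole input is assumed to fit in memory. It tries a match starting on each line in turn, lets the match run onto later lines, and then restarts at the beginning of the next line.

```python
import bisect
import re


def multiline_grep(pattern, text):
    """Return the blocks of lines a line-by-line multiline scan would output.

    Hypothetical model of the behaviour described for -M; not pcre2grep code.
    """
    regex = re.compile(pattern)
    lines = text.splitlines(keepends=True)

    # Offset of the first character of each line within the whole text.
    starts = []
    offset = 0
    for line in lines:
        starts.append(offset)
        offset += len(line)

    blocks = []
    for i, start in enumerate(starts):
        line_end = start + len(lines[i])
        m = regex.search(text, start)
        # A match counts for this line only if it begins on this line;
        # it may still end on a later line.
        if m is None or m.start() >= line_end:
            continue
        # Index of the last line touched by the match.
        last_char = max(m.end() - 1, m.start())
        j = bisect.bisect_right(starts, last_char) - 1
        blocks.append("".join(lines[i:j + 1]))
        # Scanning restarts at line i + 1, even if the match covered it.
    return blocks


sample = "x regular\nexpression regular\nexpression y\n"
blocks = multiline_grep(r"regular\s+expression", sample)
for block in blocks:
    print(block, end="")
```

In this sample the middle line ends one match and begins another, so it appears in both output blocks, which is the repeated output the text warns about.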
@@ -725,35 +731,86 @@ OPTIONS WITH DATA
equals character. Otherwise pcre2grep will assume that it has no data.
+CALLING EXTERNAL SCRIPTS
+
+ On non-Windows systems, pcre2grep has, by default, support for calling
+ external programs or scripts during matching by making use of PCRE2's
+ callout facility. However, this support can be disabled when pcre2grep
+ is built. You can find out whether your binary has support for call-
+ outs by running it with the --help option. If the support is not
+ enabled, all callouts in patterns are ignored by pcre2grep.
+
+ A callout in a PCRE2 pattern is of the form (?C<arg>) where the argu-
+ ment is either a number or a quoted string (see the pcre2callout docu-
+ mentation for details). Numbered callouts are ignored by pcre2grep.
+ String arguments are parsed as a list of substrings separated by pipe
+ (vertical bar) characters. The first substring must be an executable
+ name, with the following substrings specifying arguments:
+
+ executable_name|arg1|arg2|...
+
+ Any substring (including the executable name) may contain escape
+ sequences started by a dollar character: $<digits> or ${<digits>} is
+ replaced by the captured substring of the given decimal number, which
+ must be greater than zero. If the number is greater than the number of
+ capturing substrings, or if the capture is unset, the replacement is
+ empty.
+
+ Any other character is substituted by itself. In particular, $$ is
+ replaced by a single dollar and $| is replaced by a pipe character.
+ Here is an example:
+
+ echo -e "abcde\n12345" | pcre2grep \
+ '(?x)(.)(..(.))
+ (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' -
+
+ Output:
+
+ Arg1: [a] [bcd] [d] Arg2: |a| ()
+ abcde
+ Arg1: [1] [234] [4] Arg2: |1| ()
+ 12345
+
+ The parameters for the execv() system call that is used to run the pro-
+ gram or script are zero-terminated strings. This means that binary zero
+ characters in the callout argument will cause premature termination of
+ their substrings, and therefore should not be present. Any syntax
+ errors in the string (for example, a dollar not followed by another
+ character) cause the callout to be ignored. If running the program
+ fails for any reason (including the non-existence of the executable), a
+ local matching failure occurs and the matcher backtracks in the normal
+ way.
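
The substitution rules for callout strings can be sketched as follows. This is an illustrative Python model, not the code pcre2grep uses: the function name expand_callout is hypothetical, captures are passed in as a plain list (None for an unset group), and treating $0 or a malformed ${...} as a syntax error is an assumption, since the text only says the number must be greater than zero. The function builds the argument list and never runs the program.

```python
DIGITS = "0123456789"


def expand_callout(callout, captures):
    """Expand a callout string into [executable, arg1, ...].

    captures[0] holds group 1; None marks an unset group.
    Returns None on a syntax error (the callout would then be ignored).
    Hypothetical helper for illustration only.
    """
    def capture(number):
        # Out-of-range or unset groups expand to the empty string.
        if number > len(captures) or captures[number - 1] is None:
            return ""
        return captures[number - 1]

    parts = [""]
    i = 0
    while i < len(callout):
        ch = callout[i]
        if ch == "|":              # unescaped pipe: start the next substring
            parts.append("")
            i += 1
        elif ch != "$":            # ordinary character
            parts[-1] += ch
            i += 1
        else:
            i += 1
            if i >= len(callout):  # dollar not followed by another character
                return None
            if callout[i] == "{":
                close = callout.find("}", i)
                if close == -1:
                    return None
                digits = callout[i + 1:close]
                i = close + 1
            elif callout[i] in DIGITS:
                j = i
                while j < len(callout) and callout[j] in DIGITS:
                    j += 1
                digits = callout[i:j]
                i = j
            else:                  # $$ gives "$", $| gives "|", and so on
                parts[-1] += callout[i]
                i += 1
                continue
            # Assumption: an empty, non-numeric, or zero group number is an error.
            if (not digits or any(c not in DIGITS for c in digits)
                    or int(digits) == 0):
                return None
            parts[-1] += capture(int(digits))
    return parts


argv = expand_callout(
    "/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)",
    ["a", "bcd", "d", None],
)
print(argv)
```

With the captures from the "abcde" line of the example, the two arguments produced here are the strings that /bin/echo joins with a space to give the first output line shown above.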
+
+
MATCHING ERRORS
- It is possible to supply a regular expression that takes a very long
- time to fail to match certain lines. Such patterns normally involve
- nested indefinite repeats, for example: (a+)*\d when matched against a
- line of a's with no final digit. The PCRE2 matching function has a
- resource limit that causes it to abort in these circumstances. If this
- happens, pcre2grep outputs an error message and the line that caused
- the problem to the standard error stream. If there are more than 20
+ It is possible to supply a regular expression that takes a very long
+ time to fail to match certain lines. Such patterns normally involve
+ nested indefinite repeats, for example: (a+)*\d when matched against a
+ line of a's with no final digit. The PCRE2 matching function has a
+ resource limit that causes it to abort in these circumstances. If this
+ happens, pcre2grep outputs an error message and the line that caused
+ the problem to the standard error stream. If there are more than 20
such errors, pcre2grep gives up.
- The --match-limit option of pcre2grep can be used to set the overall
- resource limit; there is a second option called --recursion-limit that
- sets a limit on the amount of memory (usually stack) that is used (see
+ The --match-limit option of pcre2grep can be used to set the overall
+ resource limit; there is a second option called --recursion-limit that
+ sets a limit on the amount of memory (usually stack) that is used (see
the discussion of these options above).
DIAGNOSTICS
Exit status is 0 if any matches were found, 1 if no matches were found,
- and 2 for syntax errors, overlong lines, non-existent or inaccessible
- files (even if matches were found in other files) or too many matching
+ and 2 for syntax errors, overlong lines, non-existent or inaccessible
+ files (even if matches were found in other files) or too many matching
errors. Using the -s option to suppress error messages about inaccessi-
ble files does not affect the return code.
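
The precedence among the three exit codes can be summarised in a few lines. This is a hedged sketch: the function name exit_status and its two boolean parameters are hypothetical, and it only models the rule stated in the paragraph above.

```python
def exit_status(matched_any, had_error):
    """Model of the documented exit status (hypothetical helper).

    had_error covers syntax errors, overlong lines, non-existent or
    inaccessible files, and too many matching errors. It is unaffected
    by -s, which only suppresses the messages.
    """
    if had_error:
        return 2  # errors win even if other files produced matches
    return 0 if matched_any else 1


print(exit_status(matched_any=True, had_error=True))
```

A calling script that needs to distinguish "no match" from "something went wrong" would therefore test for 2 before testing for 1.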
SEE ALSO
- pcre2pattern(3), pcre2syntax(3).
+ pcre2pattern(3), pcre2syntax(3), pcre2callout(3).
AUTHOR
@@ -765,5 +822,5 @@ AUTHOR
REVISION
- Last updated: 03 January 2015
- Copyright (c) 1997-2015 University of Cambridge.
+ Last updated: 19 June 2016
+ Copyright (c) 1997-2016 University of Cambridge.