Diffstat (limited to 'doc/pcre2grep.txt')
-rw-r--r-- doc/pcre2grep.txt 127
1 files changed, 69 insertions, 58 deletions
diff --git a/doc/pcre2grep.txt b/doc/pcre2grep.txt
index 30517b4..000239c 100644
--- a/doc/pcre2grep.txt
+++ b/doc/pcre2grep.txt
@@ -56,17 +56,17 @@ DESCRIPTION
that is obtained at the start of processing. If an input file contains
very long lines, a larger buffer may be needed; this is handled by
automatically extending the buffer, up to the limit specified by --max-
- buffer-size. The default values for these parameters are specified when
- pcre2grep is built, with the default defaults being 20K and 1M respec-
- tively. An error occurs if a line is too long and the buffer can no
- longer be expanded.
+ buffer-size. The default values for these parameters can be set when
+ pcre2grep is built; if nothing is specified, the defaults are set to
+ 20KiB and 1MiB respectively. An error occurs if a line is too long and
+ the buffer can no longer be expanded.
The block of memory that is actually used is three times the "buffer
size", to allow for buffering "before" and "after" lines. If the buffer
size is too small, fewer than requested "before" and "after" lines may
be output.
- Patterns can be no longer than 8K or BUFSIZ bytes, whichever is the
+ Patterns can be no longer than 8KiB or BUFSIZ bytes, whichever is the
greater. BUFSIZ is defined in <stdio.h>. When there is more than one
pattern (specified by the use of -e and/or -f), each pattern is applied
to each line in the order in which they are defined, except that all
@@ -122,6 +122,13 @@ BINARY FILES
handled.
+BINARY ZEROS IN PATTERNS
+
+ Patterns passed from the command line are strings that are terminated
+ by a binary zero, so cannot contain internal zeros. However, patterns
+ that are read from a file via the -f option may contain binary zeros.
+
+
OPTIONS
The order in which some of the options appear can affect the output.
@@ -329,36 +336,40 @@ OPTIONS
-f filename, --file=filename
Read patterns from the file, one per line, and match them
- against each line of input. What constitutes a newline when
- reading the file is the operating system's default. The
- --newline option has no effect on this option. Trailing
- white space is removed from each line, and blank lines are
- ignored. An empty file contains no patterns and therefore
- matches nothing. See also the comments about multiple pat-
- terns versus a single pattern with alternatives in the
- description of -e above.
-
- If this option is given more than once, all the specified
- files are read. A data line is output if any of the patterns
- match it. A file name can be given as "-" to refer to the
- standard input. When -f is used, patterns specified on the
- command line using -e may also be present; they are tested
- before the file's patterns. However, no other pattern is
+ against each line of input. As is the case with patterns on
+ the command line, no delimiters should be used. What consti-
+ tutes a newline when reading the file is the operating sys-
+ tem's default interpretation of \n. The --newline option has
+ no effect on this option. Trailing white space is removed
+ from each line, and blank lines are ignored. An empty file
+ contains no patterns and therefore matches nothing. Patterns
+ read from a file in this way may contain binary zeros, which
+ are treated as ordinary data characters. See also the com-
+ ments about multiple patterns versus a single pattern with
+ alternatives in the description of -e above.
+
+ If this option is given more than once, all the specified
+ files are read. A data line is output if any of the patterns
+ match it. A file name can be given as "-" to refer to the
+ standard input. When -f is used, patterns specified on the
+ command line using -e may also be present; they are tested
+ before the file's patterns. However, no other pattern is
taken from the command line; all arguments are treated as the
names of paths to be searched.
--file-list=filename
- Read a list of files and/or directories that are to be
- scanned from the given file, one per line. Trailing white
- space is removed from each line, and blank lines are ignored.
- These paths are processed before any that are listed on the
- command line. The file name can be given as "-" to refer to
- the standard input. If --file and --file-list are both spec-
- ified as "-", patterns are read first. This is useful only
- when the standard input is a terminal, from which further
- lines (the list of files) can be read after an end-of-file
- indication. If this option is given more than once, all the
- specified files are read.
+ Read a list of files and/or directories that are to be
+ scanned from the given file, one per line. What constitutes a
+ newline when reading the file is the operating system's
+ default. Trailing white space is removed from each line, and
+ blank lines are ignored. These paths are processed before any
+ that are listed on the command line. The file name can be
+ given as "-" to refer to the standard input. If --file and
+ --file-list are both specified as "-", patterns are read
+ first. This is useful only when the standard input is a ter-
+ minal, from which further lines (the list of files) can be
+ read after an end-of-file indication. If this option is given
+ more than once, all the specified files are read.
--file-offsets
Instead of showing lines or parts of lines that match, show
@@ -464,14 +475,14 @@ OPTIONS
processed line by line, and the output is flushed after each
write. By default, input is read in large chunks, unless
pcre2grep can determine that it is reading from a terminal
- (which is currently possible only in Unix-like environments).
- Output to terminal is normally automatically flushed by the
- operating system. This option can be useful when the input or
- output is attached to a pipe and you do not want pcre2grep to
- buffer up large amounts of data. However, its use will affect
- performance, and the -M (multiline) option ceases to work.
- When input is from a compressed .gz or .bz2 file, --line-
- buffered is ignored.
+ (which is currently possible only in Unix-like environments
+ or Windows). Output to terminal is normally automatically
+ flushed by the operating system. This option can be useful
+ when the input or output is attached to a pipe and you do not
+ want pcre2grep to buffer up large amounts of data. However,
+ its use will affect performance, and the -M (multiline)
+ option ceases to work. When input is from a compressed .gz or
+ .bz2 file, --line-buffered is ignored.
--line-offsets
Instead of showing lines or parts of lines that match, show
@@ -506,12 +517,12 @@ OPTIONS
processing loop. If the value set by --match-limit is
reached, an error occurs.
- The --heap-limit option specifies, as a number of kilobytes,
- the amount of heap memory that may be used for matching. Heap
- memory is needed only if matching the pattern requires a sig-
- nificant number of nested backtracking points to be remem-
- bered. This parameter can be set to zero to forbid the use of
- heap memory altogether.
+ The --heap-limit option specifies, as a number of kibibytes
+ (units of 1024 bytes), the amount of heap memory that may be
+ used for matching. Heap memory is needed only if matching the
+ pattern requires a significant number of nested backtracking
+ points to be remembered. This parameter can be set to zero to
+ forbid the use of heap memory altogether.
The --depth-limit option limits the depth of nested back-
tracking points, which indirectly limits the amount of memory
@@ -521,10 +532,10 @@ OPTIONS
limit acts varies from pattern to pattern. This limit is of
use only if it is set smaller than --match-limit.
- There are no short forms for these options. The default set-
- tings are specified when the PCRE2 library is compiled, with
- the default defaults being very large and so effectively
- unlimited.
+ There are no short forms for these options. The default lim-
+ its can be set when the PCRE2 library is compiled; if they
+ are not specified, the defaults are very large and so effec-
+ tively unlimited.
--max-buffer-size=number
This limits the expansion of the processing buffer, whose
@@ -758,13 +769,13 @@ NEWLINES
newline conventions from the default. Any parts of the input files that
are written to the standard output are copied identically, with what-
ever newline sequences they have in the input. However, the setting of
- this option does not affect the interpretation of files specified by
- the -f, --exclude-from, or --include-from options, which are assumed to
- use the operating system's standard newline sequence, nor does it
- affect the way in which pcre2grep writes informational messages to the
- standard error and output streams. For these it uses the string "\n" to
- indicate newlines, relying on the C I/O library to convert this to an
- appropriate sequence.
+ this option affects only the way scanned files are processed. It does
+ not affect the interpretation of files specified by the -f, --file-
+ list, --exclude-from, or --include-from options, nor does it affect the
+ way in which pcre2grep writes informational messages to the standard
+ error and output streams. For these it uses the string "\n" to indicate
+ newlines, relying on the C I/O library to convert this to an appropri-
+ ate sequence.
OPTIONS COMPATIBILITY
@@ -929,5 +940,5 @@ AUTHOR
REVISION
- Last updated: 13 November 2017
- Copyright (c) 1997-2017 University of Cambridge.
+ Last updated: 24 February 2018
+ Copyright (c) 1997-2018 University of Cambridge.
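
The rules that the patched text gives for -f pattern files (one pattern per line, trailing white space removed, blank lines ignored, binary zeros kept as data) can be illustrated with a minimal sketch. This is not pcre2grep's actual code: the function names read_patterns and as_c_string are hypothetical, and splitting on a literal "\n" byte is an assumption standing in for "the operating system's default interpretation of \n". The second helper imitates how a zero-terminated command-line string ends at its first binary zero, which is why such patterns cannot contain internal zeros.

```python
from typing import List


def read_patterns(data: bytes) -> List[bytes]:
    """Split pattern-file contents into patterns (illustrative only).

    Assumption: lines are separated by a single b"\n" byte.
    """
    patterns = []
    for line in data.split(b"\n"):
        # Remove trailing white space; a binary zero is not white space,
        # so it is kept as an ordinary data byte.
        line = line.rstrip()
        if line:  # blank lines are ignored
            patterns.append(line)
    return patterns


def as_c_string(arg: bytes) -> bytes:
    """Imitate a zero-terminated string: everything after the first
    binary zero is invisible to code that relies on the terminator."""
    return arg.split(b"\0", 1)[0]


if __name__ == "__main__":
    sample = b"abc\0def  \n\n  xyz\t\n"
    print(read_patterns(sample))       # two patterns, the first contains a zero byte
    print(as_c_string(b"abc\0def"))    # only the part before the zero byte
    print(read_patterns(b""))          # an empty file yields no patterns
```

Under these assumptions, a pattern containing a zero byte survives intact when it is read from a file, while the same bytes treated as a zero-terminated string are cut short at the zero.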