author Clint Adams <clint@debian.org> 2013-05-09 01:03:12 -0400
committer Clint Adams <clint@debian.org> 2013-05-09 01:03:12 -0400
commit d75f3c567505ad7acd2c1943207b367593652739
tree e519be160770e6b20bfe88eb923ea6aa8edb3e58 /doc/sed-in.texi
parent 86bbc911e93efe1f0957ee887182b3d64bb0eec4
Imported Upstream version 4.2.2
Diffstat (limited to 'doc/sed-in.texi')
-rw-r--r-- doc/sed-in.texi 146
1 files changed, 108 insertions, 38 deletions
diff --git a/doc/sed-in.texi b/doc/sed-in.texi
index c8bb21d..bf5158c 100644
--- a/doc/sed-in.texi
+++ b/doc/sed-in.texi
@@ -1,7 +1,7 @@
\input texinfo @c -*-texinfo-*-
@c
@c -- Stuff that needs adding: ----------------------------------------------
-@c (document the `;' command-separator)
+@c (nothing!)
@c --------------------------------------------------------------------------
@c Check for consistency: regexps in @code, text that they match in @samp.
@c
@@ -280,6 +280,7 @@ A length of 0 (zero) means to never wrap long lines. If
not specified, it is taken to be 70.
@item --posix
+@opindex --posix
@cindex @value{SSEDEXT}, disabling
@value{SSED} includes several extensions to @acronym{POSIX}
sed. In order to simplify writing portable scripts, this
@@ -346,6 +347,8 @@ Perl-style regular expressions}.
@item -s
@itemx --separate
+@opindex -s
+@opindex --separate
@cindex Working on separate files
By default, @command{sed} will consider the files specified on the
command line as a single continuous long stream. This @value{SSED}
@@ -366,6 +369,16 @@ Buffer both input and output as minimally as practical.
the likes of @samp{tail -f}, and you wish to see the transformed
output as soon as possible.)
+@item -z
+@itemx --null-data
+@itemx --zero-terminated
+@opindex -z
+@opindex --null-data
+@opindex --zero-terminated
+Treat the input as a set of lines, each terminated by a zero byte
+(the ASCII @samp{NUL} character) instead of a newline. This option can
+be used with commands like @samp{sort -z} and @samp{find -print0}
+to process arbitrary file names.
@end table
If no @option{-e}, @option{-f}, @option{--expression}, or @option{--file}
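
Illustration (not part of the patch): a minimal sketch of the -z option documented in the hunk above, assuming GNU sed 4.2.2 or later and a printf that emits a NUL byte for \0. The sample records and the bracket markers are made up for the example. Because records end at NUL rather than newline, the newline embedded in the second record stays inside a single pattern space.

```shell
# Two NUL-terminated records; the second one contains an embedded newline.
# tr converts the NUL terminators to newlines only so the result is printable.
z_out=$(printf 'a b\0c\nd\0' | sed -z 's/^/[/; s/$/]/' | tr '\0' '\n')
printf '%s\n' "$z_out"
```

Each record is bracketed as a whole, which is the property that makes the option suitable for file names produced by find -print0.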
@@ -396,6 +409,14 @@ This document will refer to ``the'' @command{sed} script;
this is understood to mean the in-order catenation
of all of the @var{script}s and @var{script-file}s passed in.
+Commands within a @var{script} or @var{script-file} can be
+separated by semicolons (@code{;}) or newlines (ASCII 10).
+Some commands, due to their syntax, cannot be followed by semicolons
+working as command separators and thus should be terminated
+with newlines or be placed at the end of a @var{script} or @var{script-file}.
+Commands can also be preceded with optional non-significant
+whitespace characters.
+
Each @code{sed} command consists of an optional address or
address range, followed by a one-character command name
and any additional command-specific code.
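
Illustration (not part of the patch): a small sketch of the separator rules added above, assuming GNU sed (the one-line form of the a command is a GNU extension). The input lines are arbitrary. The s and p commands can be chained with a semicolon, while the text of a runs to the end of its line, so the next command is started with a separate -e, which acts like a newline.

```shell
# Semicolons separate ordinary commands such as s and p.
semi_out=$(printf 'one\ntwo\n' | sed -n 's/o/0/; p')

# The text of a (append) extends to the end of the line, so a following
# ";" would become part of the appended text. A second -e starts a new
# script line instead.
append_out=$(printf 'one\ntwo\n' | sed -e '1a hello' -e 's/o/0/')

printf '%s\n' "$semi_out" "$append_out"
```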
@@ -424,7 +445,7 @@ and any additional command-specific code.
and the auxiliary @emph{hold} space. Both are initially empty.
@command{sed} operates by performing the following cycle on each
-lines of input: first, @command{sed} reads one line from the input
+line of input: first, @command{sed} reads one line from the input
stream, removes any trailing newline, and places it in the pattern space.
Then commands are executed; each command can have an address associated
to it: addresses are a kind of condition code, and a command is only
@@ -521,15 +542,16 @@ a case-insensitive manner.
@item /@var{regexp}/M
@itemx \%@var{regexp}%M
-@ifset PERL
@cindex @value{SSEDEXT}, @code{M} modifier
-@end ifset
+@ifset PERL
@cindex Perl-style regular expressions, multiline
+@end ifset
The @code{M} modifier to regular-expression matching is a @value{SSED}
-extension which causes @code{^} and @code{$} to match respectively
-(in addition to the normal behavior) the empty string after a newline,
-and the empty string before a newline. There are special character
-sequences
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode. The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline. There are
+special character sequences
@ifset PERL
(@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
in basic or extended regular expression modes)
@@ -538,7 +560,12 @@ in basic or extended regular expression modes)
(@code{\`} and @code{\'})
@end ifclear
which always match the beginning or the end of the buffer.
-@code{M} stands for @cite{multi-line}.
+In addition,
+@ifset PERL
+just like in Perl mode without the @code{S} modifier,
+@end ifset
+the period character does not match a new-line character in
+multi-line mode.
@ifset PERL
@item /@var{regexp}/S
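
Illustration (not part of the patch): a sketch of the M address modifier as described in the two hunks above, assuming GNU sed. The two-line input is arbitrary; N is used only to get a newline into pattern space.

```shell
# N joins both input lines, so pattern space holds "a<newline>b".

# Without M, ^ matches only at the start of the buffer: nothing is printed.
plain=$(printf 'a\nb\n' | sed -n 'N; /^b/p')

# With M, ^ also matches right after the embedded newline.
multi=$(printf 'a\nb\n' | sed -n 'N; /^b/Mp')

# Without M the period matches the newline; in multi-line mode it does not.
dot_plain=$(printf 'a\nb\n' | sed -n 'N; /a.b/p')
dot_multi=$(printf 'a\nb\n' | sed -n 'N; /a.b/Mp')
```

Only the second and third commands print the joined pattern space; the other two print nothing.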
@@ -835,7 +862,7 @@ string), while the second matches only strings containing
at least one character.
@item ^main.*(.*)
-his matches a string starting with @samp{main},
+This matches a string starting with @samp{main},
followed by an opening and closing
parenthesis. The @samp{n}, @samp{(} and @samp{)} need not
be adjacent.
@@ -1004,6 +1031,32 @@ to uppercase,
Stop case conversion started by @code{\L} or @code{\U}.
@end table
+When the @code{g} flag is being used, case conversion does not
+propagate from one occurrence of the regular expression to
+another. For example, when the following command is executed
+with @samp{a-b-} in pattern space:
+@example
+s/\(b\?\)-/x\u\1/g
+@end example
+
+@noindent
+the output is @samp{axxB}. When replacing the first @samp{-},
+the @samp{\u} sequence only affects the empty replacement of
+@samp{\1}. It does not affect the @code{x} character that is
+added to pattern space when replacing @code{b-} with @code{xB}.
+
+On the other hand, @code{\l} and @code{\u} do affect the remainder
+of the replacement text if they are followed by an empty substitution.
+With @samp{a-b-} in pattern space, the following command:
+@example
+s/\(b\?\)-/\u\1x/g
+@end example
+
+@noindent
+will replace @samp{-} with @samp{X} (uppercase) and @samp{b-} with
+@samp{Bx}. If this behavior is undesirable, you can prevent it by
+adding a @samp{\E} sequence---after @samp{\1} in this case.
+
To include a literal @code{\}, @code{&}, or newline in the final
replacement, be sure to precede the desired @code{\}, @code{&},
or newline in the @var{replacement} with a @code{\}.
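
Illustration (not part of the patch): the two substitutions from the added text, run as complete commands, assuming GNU sed 4.2.2 or later (\? and \u are GNU extensions). The results are the ones the new documentation states.

```shell
# \u applies only to the (empty) \1 of the first match and does not
# carry over to the x added by the second match.
first=$(printf 'a-b-\n' | sed 's/\(b\?\)-/x\u\1/g')

# Here \u is followed by an empty \1, so it affects the x that comes next.
second=$(printf 'a-b-\n' | sed 's/\(b\?\)-/\u\1x/g')

printf '%s\n' "$first" "$second"   # axxB, then aXBx
```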
@@ -1091,10 +1144,11 @@ case-insensitive manner.
@cindex Perl-style regular expressions, multiline
@end ifset
The @code{M} modifier to regular-expression matching is a @value{SSED}
-extension which causes @code{^} and @code{$} to match respectively
-(in addition to the normal behavior) the empty string after a newline,
-and the empty string before a newline. There are special character
-sequences
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode. The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline. There are
+special character sequences
@ifset PERL
(@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
in basic or extended regular expression modes)
@@ -1103,7 +1157,12 @@ in basic or extended regular expression modes)
(@code{\`} and @code{\'})
@end ifclear
which always match the beginning or the end of the buffer.
-@code{M} stands for @cite{multi-line}.
+In addition,
+@ifset PERL
+just like in Perl mode without the @code{S} modifier,
+@end ifset
+the period character does not match a new-line character in
+multi-line mode.
@ifset PERL
@item S
@@ -1259,19 +1318,18 @@ error, and @file{/dev/stdout}, which writes to the standard
output.@footnote{This is equivalent to @code{p} unless the @option{-i}
option is being used.}
-The file will be created (or truncated) before the
-first input line is read; all @code{w} commands
-(including instances of @code{w} flag on successful @code{s} commands)
-which refer to the same @var{filename} are output without
-closing and reopening the file.
+The file will be created (or truncated) before the first input line is
+read; all @code{w} commands (including instances of the @code{w} flag
+on successful @code{s} commands) which refer to the same @var{filename}
+are output without closing and reopening the file.
@item D
@findex D (delete first line) command
@cindex Delete first line from pattern space
-Delete text in the pattern space up to the first newline.
-If any text is left, restart cycle with the resultant
-pattern space (without reading a new line of input),
-otherwise start a normal new cycle.
+If pattern space contains no newline, start a normal new cycle as if
+the @code{d} command was issued. Otherwise, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
@item N
@findex N (append Next line) command
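
Illustration (not part of the patch): a sketch of both branches of the reworded D description, using the well-known N/P/D idiom for dropping consecutive duplicate lines. It assumes GNU sed; the input is arbitrary.

```shell
# $!N   append the next line, except on the last line
# !P    print the first line unless both lines are identical
# D     with a newline present: drop the first line and restart the
#       cycle without reading input; without one: behave like d
dedup=$(printf 'a\na\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$dedup"
```

On the final line pattern space holds only c, so D finds no newline and simply ends the cycle as d would.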
@@ -1383,13 +1441,24 @@ replaces the pattern space with the output; a trailing newline
is suppressed.
If a parameter is specified, instead, the @code{e} command
-interprets it as a command and sends its output to the output stream
-(like @code{r} does). The command can run across multiple
-lines, all but the last ending with a back-slash.
+interprets it as a command and sends its output to the output stream.
+The command can run across multiple lines, all but the last ending with
+a back-slash.
In both cases, the results are undefined if the command to be
executed contains a @sc{nul} character.
+Note that, unlike the @code{r} command, the output of the command will
+be printed immediately; the @code{r} command instead delays the output
+to the end of the current cycle.
+
+@item F
+@findex F (File name) command
+@cindex Printing file name
+@cindex File name, printing
+Print out the file name of the current input file (with a trailing
+newline).
+
@item L @var{n}
@findex L (fLow paragraphs) command
@cindex Reformat pattern space
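
Illustration (not part of the patch): a sketch contrasting the output timing of e and r, and showing the new F command. It assumes GNU sed 4.2.2 or later, a sed that is not running in a restricted mode that disables e, and the mktemp utility; the temporary file and its contents are hypothetical.

```shell
# e with a parameter: the command's output is printed immediately,
# before the current line is output.
e_out=$(printf 'line\n' | sed '1e echo before')

# r queues the file; its contents appear at the end of the cycle,
# after the current line.
tmp=$(mktemp)
printf 'after\n' > "$tmp"
r_out=$(printf 'line\n' | sed "1r $tmp")

# F prints the name of the current input file, followed by a newline.
f_out=$(sed -n '1F' "$tmp")
rm -f "$tmp"
```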
@@ -1712,7 +1781,7 @@ and then again substituting underscores with zeros.
/[^0-9]/ d
-# replace all leading 9s by _ (any other character except digits, could
+# replace all trailing 9s by _ (any other character except digits, could
# be used)
:d
s/9\(_*\)$/_\1/
@@ -1720,9 +1789,6 @@ td
# incr last digit only. The first line adds a most-significant
# digit of 1 if we have to add a digit.
-#
-# The @code{tn} commands are not necessary, but make the thing
-# faster
s/^\(_*\)$/1\1/; tn
s/8\(_*\)$/9\1/; tn
@@ -1841,9 +1907,12 @@ x
G
# check if converted file name is equal to original file name,
-# if it is, do not print nothing
+# if it is, do not print anything
/^.*\/\(.*\)\n\1/b
+# escape special characters for the shell
+s/["$`\\]/\\&/g
+
# now, transform path/fromfile\n, into
# mv path/fromfile path/tofile and print it
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
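
Illustration (not part of the patch): the escaping substitution added in this hunk, applied to a made-up file name that contains each of the four characters it targets. This only shows what the substitution does to the text; it is a sketch, not a claim that the result is safe in every shell context.

```shell
# Prefix ", $, ` and \ with a backslash so they survive inside the
# double quotes of the generated mv command.
escaped=$(printf '%s\n' 'a"b$c`d\e' | sed 's/["$`\\]/\\&/g')
printf '%s\n' "$escaped"
```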
@@ -2621,8 +2690,7 @@ for the @code{sed-users} mailing list.
@chapter Reporting Bugs
@cindex Bugs, reporting
-Email bug reports to @email{bonzini@@gnu.org}.
-Be sure to include the word ``sed'' somewhere in the @code{Subject:} field.
+Email bug reports to @email{bug-sed@@gnu.org}.
Also, please include the output of @samp{sed --version} in the body
of your report if at all possible.
@@ -2808,10 +2876,12 @@ the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}.
The only difference between basic and extended regular expressions is in
the behavior of a few characters: @samp{?}, @samp{+}, parentheses,
-and braces (@samp{@{@}}). While basic regular expressions require
-these to be escaped if you want them to behave as special characters,
-when using extended regular expressions you must escape them if
-you want them @emph{to match a literal character}.
+braces (@samp{@{@}}), and @samp{|}. While basic regular expressions
+require these to be escaped if you want them to behave as special
+characters, when using extended regular expressions you must escape
+them if you want them @emph{to match a literal character}. @samp{|}
+is special here because @samp{\|} is a GNU extension -- standard
+basic regular expressions do not provide its functionality.
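
Illustration (not part of the patch): a sketch of the escaping rule stated above, assuming GNU sed with the -r option for extended regular expressions. The sample strings are arbitrary.

```shell
# Basic regular expressions: + is literal, \+ is special (GNU).
bre_literal=$(printf 'a+b\n' | sed 's/a+b/X/')
bre_special=$(printf 'aab\n' | sed 's/a\+b/X/')

# Extended regular expressions: the roles are swapped.
ere_special=$(printf 'aab\n' | sed -r 's/a+b/X/')
ere_literal=$(printf 'a+b\n' | sed -r 's/a\+b/X/')

# Alternation: \| in basic mode is a GNU extension, | is standard in
# extended mode.
bre_alt=$(printf 'dog\n' | sed 's/cat\|dog/pet/')
ere_alt=$(printf 'dog\n' | sed -r 's/cat|dog/pet/')
```

The four + commands all produce X, and both alternation commands produce pet.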
@noindent
Examples: