Imported Upstream version 4.2.2

author: Clint Adams <clint@debian.org> 2013-05-09 01:03:12 -0400
committer: Clint Adams <clint@debian.org> 2013-05-09 01:03:12 -0400
commit: d75f3c567505ad7acd2c1943207b367593652739
tree: e519be160770e6b20bfe88eb923ea6aa8edb3e58 /doc/sed-in.texi
parent: 86bbc911e93efe1f0957ee887182b3d64bb0eec4
1 file changed, 108 insertions, 38 deletions
diff --git a/doc/sed-in.texi b/doc/sed-in.texi
index c8bb21d..bf5158c 100644
--- a/doc/sed-in.texi
+++ b/doc/sed-in.texi
@@ -1,7 +1,7 @@
 \input texinfo  @c -*-texinfo-*-
 @c
 @c -- Stuff that needs adding: ----------------------------------------------
-@c (document the `;' command-separator)
+@c (nothing!)
 @c --------------------------------------------------------------------------
 @c Check for consistency: regexps in @code, text that they match in @samp.
 @c 
@@ -280,6 +280,7 @@ A length of 0 (zero) means to never wrap long lines.  If
 not specified, it is taken to be 70.
 
 @item --posix
+@opindex --posix
 @cindex @value{SSEDEXT}, disabling
 @value{SSED} includes several extensions to @acronym{POSIX}
 sed.  In order to simplify writing portable scripts, this
@@ -346,6 +347,8 @@ Perl-style regular expressions}.
 
 @item -s
 @itemx --separate
+@opindex -s
+@opindex --separate
 @cindex Working on separate files
 By default, @command{sed} will consider the files specified on the
 command line as a single continuous long stream.  This @value{SSED}
@@ -366,6 +369,16 @@ Buffer both input and output as minimally as practical.
 the likes of @samp{tail -f}, and you wish to see the transformed
 output as soon as possible.)
 
+@item -z
+@itemx --null-data
+@itemx --zero-terminated
+@opindex -z
+@opindex --null-data
+@opindex --zero-terminated
+Treat the input as a set of lines, each terminated by a zero byte
+(the ASCII @samp{NUL} character) instead of a newline.  This option can
+be used with commands like @samp{sort -z} and @samp{find -print0}
+to process arbitrary file names.
 @end table
 
 If no @option{-e}, @option{-f}, @option{--expression}, or @option{--file}
@@ -396,6 +409,14 @@ This document will refer to ``the'' @command{sed} script;
 this is understood to mean the in-order catenation
 of all of the @var{script}s and @var{script-file}s passed in.
 
+Commands within a @var{script} or @var{script-file} can be
+separated by semicolons (@code{;}) or newlines (ASCII 10).
+Some commands, due to their syntax, cannot be followed by semicolons
+working as command separators and thus should be terminated
+with newlines or be placed at the end of a @var{script} or @var{script-file}.
+Commands can also be preceded with optional non-significant
+whitespace characters.
+
 Each @code{sed} command consists of an optional address or
 address range, followed by a one-character command name
 and any additional command-specific code.
@@ -424,7 +445,7 @@ and any additional command-specific code.
 and the auxiliary @emph{hold} space. Both are initially empty.
 
 @command{sed} operates by performing the following cycle on each
-lines of input: first, @command{sed} reads one line from the input
+line of input: first, @command{sed} reads one line from the input
 stream, removes any trailing newline, and places it in the pattern space.
 Then commands are executed; each command can have an address associated
 to it: addresses are a kind of condition code, and a command is only
@@ -521,15 +542,16 @@ a case-insensitive manner.
 
 @item /@var{regexp}/M
 @itemx \%@var{regexp}%M
-@ifset PERL
 @cindex @value{SSEDEXT}, @code{M} modifier
-@end ifset
+@ifset PERL
 @cindex Perl-style regular expressions, multiline
+@end ifset
 The @code{M} modifier to regular-expression matching is a @value{SSED}
-extension which causes @code{^} and @code{$} to match respectively
-(in addition to the normal behavior) the empty string after a newline,
-and the empty string before a newline.  There are special character
-sequences
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode.  The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline.  There are
+special character sequences
 @ifset PERL
 (@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
 in basic or extended regular expression modes)
@@ -538,7 +560,12 @@ in basic or extended regular expression modes)
 (@code{\`} and @code{\'})
 @end ifclear
 which always match the beginning or the end of the buffer.
-@code{M} stands for @cite{multi-line}.
+In addition,
+@ifset PERL
+just like in Perl mode without the @code{S} modifier,
+@end ifset
+the period character does not match a new-line character in
+multi-line mode.
 
 @ifset PERL
 @item /@var{regexp}/S
@@ -835,7 +862,7 @@ string), while the second matches only strings containing
 at least one character.
 
 @item ^main.*(.*)
-his matches a string starting with @samp{main},
+This matches a string starting with @samp{main},
 followed by an opening and closing
 parenthesis.  The @samp{n}, @samp{(} and @samp{)} need not
 be adjacent.
@@ -1004,6 +1031,32 @@ to uppercase,
 Stop case conversion started by @code{\L} or @code{\U}.
 @end table
 
+When the @code{g} flag is being used, case conversion does not
+propagate from one occurrence of the regular expression to
+another.  For example, when the following command is executed
+with @samp{a-b-} in pattern space:
+@example
+s/\(b\?\)-/x\u\1/g
+@end example
+
+@noindent
+the output is @samp{axxB}.  When replacing the first @samp{-},
+the @samp{\u} sequence only affects the empty replacement of
+@samp{\1}.  It does not affect the @code{x} character that is
+added to pattern space when replacing @code{b-} with @code{xB}.
+
+On the other hand, @code{\l} and @code{\u} do affect the remainder
+of the replacement text if they are followed by an empty substitution.
+With @samp{a-b-} in pattern space, the following command:
+@example
+s/\(b\?\)-/\u\1x/g
+@end example
+
+@noindent
+will replace @samp{-} with @samp{X} (uppercase) and @samp{b-} with
+@samp{Bx}.  If this behavior is undesirable, you can prevent it by
+adding a @samp{\E} sequence---after @samp{\1} in this case.
+
 To include a literal @code{\}, @code{&}, or newline in the final
 replacement, be sure to precede the desired @code{\}, @code{&},
 or newline in the @var{replacement} with a @code{\}.
@@ -1091,10 +1144,11 @@ case-insensitive manner.
 @cindex Perl-style regular expressions, multiline
 @end ifset
 The @code{M} modifier to regular-expression matching is a @value{SSED}
-extension which causes @code{^} and @code{$} to match respectively
-(in addition to the normal behavior) the empty string after a newline,
-and the empty string before a newline.  There are special character
-sequences
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode.  The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline.  There are
+special character sequences
 @ifset PERL
 (@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
 in basic or extended regular expression modes)
@@ -1103,7 +1157,12 @@ in basic or extended regular expression modes)
 (@code{\`} and @code{\'})
 @end ifclear
 which always match the beginning or the end of the buffer.
-@code{M} stands for @cite{multi-line}.
+In addition,
+@ifset PERL
+just like in Perl mode without the @code{S} modifier,
+@end ifset
+the period character does not match a new-line character in
+multi-line mode.
 
 @ifset PERL
 @item S
@@ -1259,19 +1318,18 @@ error, and @file{/dev/stdout}, which writes to the standard
 output.@footnote{This is equivalent to @code{p} unless the @option{-i}
 option is being used.}
 
-The file will be created (or truncated) before the
-first input line is read; all @code{w} commands
-(including instances of @code{w} flag on successful @code{s} commands)
-which refer to the same @var{filename} are output without
-closing and reopening the file.
+The file will be created (or truncated) before the first input line is
+read; all @code{w} commands (including instances of the @code{w} flag
+on successful @code{s} commands) which refer to the same @var{filename}
+are output without closing and reopening the file.
 
 @item D
 @findex D (delete first line) command
 @cindex Delete first line from pattern space
-Delete text in the pattern space up to the first newline.
-If any text is left, restart cycle with the resultant
-pattern space (without reading a new line of input),
-otherwise start a normal new cycle.
+If pattern space contains no newline, start a normal new cycle as if
+the @code{d} command was issued.  Otherwise, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
 
 @item N
 @findex N (append Next line) command
@@ -1383,13 +1441,24 @@ replaces the pattern space with the output; a trailing newline
 is suppressed.
 
 If a parameter is specified, instead, the @code{e} command
-interprets it as a command and sends its output to the output stream
-(like @code{r} does).  The command can run across multiple
-lines, all but the last ending with a back-slash.
+interprets it as a command and sends its output to the output stream.
+The command can run across multiple lines, all but the last ending with
+a back-slash.
 
 In both cases, the results are undefined if the command to be
 executed contains a @sc{nul} character.
 
+Note that, unlike the @code{r} command, the output of the command will
+be printed immediately; the @code{r} command instead delays the output
+to the end of the current cycle.
+
+@item F
+@findex F (File name) command
+@cindex Printing file name
+@cindex File name, printing
+Print out the file name of the current input file (with a trailing
+newline).
+
 @item L @var{n}
 @findex L (fLow paragraphs) command
 @cindex Reformat pattern space
@@ -1712,7 +1781,7 @@ and then again substituting underscores with zeros.
 
 /[^0-9]/ d
 
-# replace all leading 9s by _ (any other character except digits, could
+# replace all trailing 9s by _ (any other character except digits, could
 # be used)
 :d
 s/9\(_*\)$/_\1/
@@ -1720,9 +1789,6 @@ td
 
 # incr last digit only.  The first line adds a most-significant
 # digit of 1 if we have to add a digit.
-#
-# The @code{tn} commands are not necessary, but make the thing
-# faster
 
 s/^\(_*\)$/1\1/; tn
 s/8\(_*\)$/9\1/; tn
@@ -1841,9 +1907,12 @@ x
 G
 
 # check if converted file name is equal to original file name,
-# if it is, do not print nothing
+# if it is, do not print anything
 /^.*\/\(.*\)\n\1/b
 
+# escape special characters for the shell
+s/["$`\\]/\\&/g
+
 # now, transform path/fromfile\n, into
 # mv path/fromfile path/tofile and print it
 s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
@@ -2621,8 +2690,7 @@ for the @code{sed-users} mailing list.
 @chapter Reporting Bugs
 
 @cindex Bugs, reporting
-Email bug reports to @email{bonzini@@gnu.org}.
-Be sure to include the word ``sed'' somewhere in the @code{Subject:} field.
+Email bug reports to @email{bug-sed@@gnu.org}.
 Also, please include the output of @samp{sed --version} in the body
 of your report if at all possible.
 
@@ -2808,10 +2876,12 @@ the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}.
 
 The only difference between basic and extended regular expressions is in
 the behavior of a few characters: @samp{?}, @samp{+}, parentheses,
-and braces (@samp{@{@}}).  While basic regular expressions require
-these to be escaped if you want them to behave as special characters,
-when using extended regular expressions you must escape them if
-you want them @emph{to match a literal character}.
+braces (@samp{@{@}}), and @samp{|}.  While basic regular expressions
+require these to be escaped if you want them to behave as special
+characters, when using extended regular expressions you must escape
+them if you want them @emph{to match a literal character}.  @samp{|}
+is special here because @samp{\|} is a GNU extension -- standard
+basic regular expressions do not provide its functionality.
 
 @noindent
 Examples:
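
Several of the behaviours documented by this change can be checked from a shell. The following is a minimal sketch, assuming GNU sed 4.2.2 or later is installed as `sed`; the variable names are illustrative and not part of the patch. The results noted for the two case-conversion commands are the ones stated in the added documentation.

```shell
#!/bin/sh
# Assumes GNU sed >= 4.2.2 on PATH; variable names are illustrative only.

# Semicolons separate commands within a single script.
semi=$(printf 'hello\n' | sed 's/h/H/; s/o/O/')            # HellO

# -z: records end at a NUL byte, so a newline is ordinary data inside a
# record.  tr turns the NUL terminators back into newlines for display.
nul=$(printf 'a\nb\0c\0' | sed -z 's/\n/+/' | tr '\0' '\n')

# With the g flag, a \u left pending by an empty \1 in one match does not
# carry into the replacement of the next match.
case_first=$(printf 'a-b-\n' | sed 's/\(b\?\)-/x\u\1/g')   # axxB

# Within one replacement, \u followed by an empty \1 applies to the text
# that comes after it.
case_carry=$(printf 'a-b-\n' | sed 's/\(b\?\)-/\u\1x/g')   # aXBx

# Basic regexps need \+ for repetition; with -r (extended) a bare + works.
bre=$(printf 'aab\n' | sed 's/a\+b/X/')                    # X
ere=$(printf 'aab\n' | sed -r 's/a+b/X/')                  # X

printf '%s\n' "$semi" "$nul" "$case_first" "$case_carry" "$bre" "$ere"
```

The -z record prints as `a+b` followed by `c` on the next line: the embedded newline was rewritten as data rather than treated as a record boundary.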