author | Clint Adams <clint@debian.org> | 2013-05-09 01:03:12 -0400 |
---|---|---|
committer | Clint Adams <clint@debian.org> | 2013-05-09 01:03:12 -0400 |
commit | d75f3c567505ad7acd2c1943207b367593652739 (patch) | |
tree | e519be160770e6b20bfe88eb923ea6aa8edb3e58 /doc/sed-in.texi | |
parent | 86bbc911e93efe1f0957ee887182b3d64bb0eec4 (diff) |
Imported Upstream version 4.2.2
Diffstat (limited to 'doc/sed-in.texi')
-rw-r--r-- | doc/sed-in.texi | 146 |
1 files changed, 108 insertions, 38 deletions
diff --git a/doc/sed-in.texi b/doc/sed-in.texi index c8bb21d..bf5158c 100644 --- a/doc/sed-in.texi +++ b/doc/sed-in.texi @@ -1,7 +1,7 @@ \input texinfo @c -*-texinfo-*- @c @c -- Stuff that needs adding: ---------------------------------------------- -@c (document the `;' command-separator) +@c (nothing!) @c -------------------------------------------------------------------------- @c Check for consistency: regexps in @code, text that they match in @samp. @c @@ -280,6 +280,7 @@ A length of 0 (zero) means to never wrap long lines. If not specified, it is taken to be 70. @item --posix +@opindex --posix @cindex @value{SSEDEXT}, disabling @value{SSED} includes several extensions to @acronym{POSIX} sed. In order to simplify writing portable scripts, this @@ -346,6 +347,8 @@ Perl-style regular expressions}. @item -s @itemx --separate +@opindex -s +@opindex --separate @cindex Working on separate files By default, @command{sed} will consider the files specified on the command line as a single continuous long stream. This @value{SSED} @@ -366,6 +369,16 @@ Buffer both input and output as minimally as practical. the likes of @samp{tail -f}, and you wish to see the transformed output as soon as possible.) +@item -z +@itemx --null-data +@itemx --zero-terminated +@opindex -z +@opindex --null-data +@opindex --zero-terminated +Treat the input as a set of lines, each terminated by a zero byte +(the ASCII @samp{NUL} character) instead of a newline. This option can +be used with commands like @samp{sort -z} and @samp{find -print0} +to process arbitrary file names. @end table If no @option{-e}, @option{-f}, @option{--expression}, or @option{--file} @@ -396,6 +409,14 @@ This document will refer to ``the'' @command{sed} script; this is understood to mean the in-order catenation of all of the @var{script}s and @var{script-file}s passed in. +Commands within a @var{script} or @var{script-file} can be +separated by semicolons (@code{;}) or newlines (ASCII 10). 
+Some commands, due to their syntax, cannot be followed by semicolons +working as command separators and thus should be terminated +with newlines or be placed at the end of a @var{script} or @var{script-file}. +Commands can also be preceded with optional non-significant +whitespace characters. + Each @code{sed} command consists of an optional address or address range, followed by a one-character command name and any additional command-specific code. @@ -424,7 +445,7 @@ and any additional command-specific code. and the auxiliary @emph{hold} space. Both are initially empty. @command{sed} operates by performing the following cycle on each -lines of input: first, @command{sed} reads one line from the input +line of input: first, @command{sed} reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only @@ -521,15 +542,16 @@ a case-insensitive manner. @item /@var{regexp}/M @itemx \%@var{regexp}%M -@ifset PERL @cindex @value{SSEDEXT}, @code{M} modifier -@end ifset +@ifset PERL @cindex Perl-style regular expressions, multiline +@end ifset The @code{M} modifier to regular-expression matching is a @value{SSED} -extension which causes @code{^} and @code{$} to match respectively -(in addition to the normal behavior) the empty string after a newline, -and the empty string before a newline. There are special character -sequences +extension which directs @value{SSED} to match the regular expression +in @cite{multi-line} mode. The modifier causes @code{^} and @code{$} to +match respectively (in addition to the normal behavior) the empty string +after a newline, and the empty string before a newline. 
There are +special character sequences @ifset PERL (@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'} in basic or extended regular expression modes) @@ -538,7 +560,12 @@ in basic or extended regular expression modes) (@code{\`} and @code{\'}) @end ifclear which always match the beginning or the end of the buffer. -@code{M} stands for @cite{multi-line}. +In addition, +@ifset PERL +just like in Perl mode without the @code{S} modifier, +@end ifset +the period character does not match a new-line character in +multi-line mode. @ifset PERL @item /@var{regexp}/S @@ -835,7 +862,7 @@ string), while the second matches only strings containing at least one character. @item ^main.*(.*) -his matches a string starting with @samp{main}, +This matches a string starting with @samp{main}, followed by an opening and closing parenthesis. The @samp{n}, @samp{(} and @samp{)} need not be adjacent. @@ -1004,6 +1031,32 @@ to uppercase, Stop case conversion started by @code{\L} or @code{\U}. @end table +When the @code{g} flag is being used, case conversion does not +propagate from one occurrence of the regular expression to +another. For example, when the following command is executed +with @samp{a-b-} in pattern space: +@example +s/\(b\?\)-/x\u\1/g +@end example + +@noindent +the output is @samp{axxB}. When replacing the first @samp{-}, +the @samp{\u} sequence only affects the empty replacement of +@samp{\1}. It does not affect the @code{x} character that is +added to pattern space when replacing @code{b-} with @code{xB}. + +On the other hand, @code{\l} and @code{\u} do affect the remainder +of the replacement text if they are followed by an empty substitution. +With @samp{a-b-} in pattern space, the following command: +@example +s/\(b\?\)-/\u\1x/g +@end example + +@noindent +will replace @samp{-} with @samp{X} (uppercase) and @samp{b-} with +@samp{Bx}. If this behavior is undesirable, you can prevent it by +adding a @samp{\E} sequence---after @samp{\1} in this case. 
+ To include a literal @code{\}, @code{&}, or newline in the final replacement, be sure to precede the desired @code{\}, @code{&}, or newline in the @var{replacement} with a @code{\}. @@ -1091,10 +1144,11 @@ case-insensitive manner. @cindex Perl-style regular expressions, multiline @end ifset The @code{M} modifier to regular-expression matching is a @value{SSED} -extension which causes @code{^} and @code{$} to match respectively -(in addition to the normal behavior) the empty string after a newline, -and the empty string before a newline. There are special character -sequences +extension which directs @value{SSED} to match the regular expression +in @cite{multi-line} mode. The modifier causes @code{^} and @code{$} to +match respectively (in addition to the normal behavior) the empty string +after a newline, and the empty string before a newline. There are +special character sequences @ifset PERL (@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'} in basic or extended regular expression modes) @@ -1103,7 +1157,12 @@ in basic or extended regular expression modes) (@code{\`} and @code{\'}) @end ifclear which always match the beginning or the end of the buffer. -@code{M} stands for @cite{multi-line}. +In addition, +@ifset PERL +just like in Perl mode without the @code{S} modifier, +@end ifset +the period character does not match a new-line character in +multi-line mode. @ifset PERL @item S @@ -1259,19 +1318,18 @@ error, and @file{/dev/stdout}, which writes to the standard output.@footnote{This is equivalent to @code{p} unless the @option{-i} option is being used.} -The file will be created (or truncated) before the -first input line is read; all @code{w} commands -(including instances of @code{w} flag on successful @code{s} commands) -which refer to the same @var{filename} are output without -closing and reopening the file. 
+The file will be created (or truncated) before the first input line is +read; all @code{w} commands (including instances of the @code{w} flag +on successful @code{s} commands) which refer to the same @var{filename} +are output without closing and reopening the file. @item D @findex D (delete first line) command @cindex Delete first line from pattern space -Delete text in the pattern space up to the first newline. -If any text is left, restart cycle with the resultant -pattern space (without reading a new line of input), -otherwise start a normal new cycle. +If pattern space contains no newline, start a normal new cycle as if +the @code{d} command was issued. Otherwise, delete text in the pattern +space up to the first newline, and restart cycle with the resultant +pattern space, without reading a new line of input. @item N @findex N (append Next line) command @@ -1383,13 +1441,24 @@ replaces the pattern space with the output; a trailing newline is suppressed. If a parameter is specified, instead, the @code{e} command -interprets it as a command and sends its output to the output stream -(like @code{r} does). The command can run across multiple -lines, all but the last ending with a back-slash. +interprets it as a command and sends its output to the output stream. +The command can run across multiple lines, all but the last ending with +a back-slash. In both cases, the results are undefined if the command to be executed contains a @sc{nul} character. +Note that, unlike the @code{r} command, the output of the command will +be printed immediately; the @code{r} command instead delays the output +to the end of the current cycle. + +@item F +@findex F (File name) command +@cindex Printing file name +@cindex File name, printing +Print out the file name of the current input file (with a trailing +newline). + @item L @var{n} @findex L (fLow paragraphs) command @cindex Reformat pattern space @@ -1712,7 +1781,7 @@ and then again substituting underscores with zeros. 
/[^0-9]/ d -# replace all leading 9s by _ (any other character except digits, could +# replace all trailing 9s by _ (any other character except digits, could # be used) :d s/9\(_*\)$/_\1/ @@ -1720,9 +1789,6 @@ td # incr last digit only. The first line adds a most-significant # digit of 1 if we have to add a digit. -# -# The @code{tn} commands are not necessary, but make the thing -# faster s/^\(_*\)$/1\1/; tn s/8\(_*\)$/9\1/; tn @@ -1841,9 +1907,12 @@ x G # check if converted file name is equal to original file name, -# if it is, do not print nothing +# if it is, do not print anything /^.*\/\(.*\)\n\1/b +# escape special characters for the shell +s/["$`\\]/\\&/g + # now, transform path/fromfile\n, into # mv path/fromfile path/tofile and print it s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p @@ -2621,8 +2690,7 @@ for the @code{sed-users} mailing list. @chapter Reporting Bugs @cindex Bugs, reporting -Email bug reports to @email{bonzini@@gnu.org}. -Be sure to include the word ``sed'' somewhere in the @code{Subject:} field. +Email bug reports to @email{bug-sed@@gnu.org}. Also, please include the output of @samp{sed --version} in the body of your report if at all possible. @@ -2808,10 +2876,12 @@ the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}. The only difference between basic and extended regular expressions is in the behavior of a few characters: @samp{?}, @samp{+}, parentheses, -and braces (@samp{@{@}}). While basic regular expressions require -these to be escaped if you want them to behave as special characters, -when using extended regular expressions you must escape them if -you want them @emph{to match a literal character}. +braces (@samp{@{@}}), and @samp{|}. While basic regular expressions +require these to be escaped if you want them to behave as special +characters, when using extended regular expressions you must escape +them if you want them @emph{to match a literal character}. 
@samp{|} +is special here because @samp{\|} is a GNU extension -- standard +basic regular expressions do not provide its functionality. @noindent Examples: