From 7725efc4b7e55c8fbbf9335a7de3a1e81ff8c76e Mon Sep 17 00:00:00 2001
From: John Millaway
Date: Thu, 3 Apr 2003 00:22:08 +0000
Subject: xml now validates.

---
 doc/flex.xml | 227 +++++++++++++++++++++++++++++------------------------------
 1 file changed, 111 insertions(+), 116 deletions(-)

diff --git a/doc/flex.xml b/doc/flex.xml
index 5a08ed0..71edc75 100644
--- a/doc/flex.xml
+++ b/doc/flex.xml
@@ -5,6 +5,13 @@ flex: a fast lexical analyzer generator + +1990 +1997 +The Regents of the University of California. +All rights reserved. + + -@defindex hk - -@defindex op -@dircategory Programming -@direntry - - - -This manual describes flex, a tool for generating programs that -perform pattern-matching on text. The manual includes both tutorial and -reference sections. - -This edition of @cite{The flex Manual} documents flex version -@value{VERSION}. It was last updated on @value{UPDATED}. - - - - -Copyright - - - - -The flex manual is placed under the same licensing conditions as the -rest of flex: - - -1990 -1997 -The Regents of the University of California. -All rights reserved. - - This code is derived from software contributed to Berkeley by Vern Paxson. @@ -87,6 +58,35 @@ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. + + + +@defindex hk + +@defindex op +@dircategory Programming +@direntry + + + +This manual describes flex, a tool for generating programs that +perform pattern-matching on text. The manual includes both tutorial and +reference sections. + +This edition of @cite{The flex Manual} documents flex version +@value{VERSION}. It was last updated on @value{UPDATED}. + + + + +Copyright + + + + +The flex manual is placed under the same licensing conditions as the +rest of flex: + @@ -281,7 +281,8 @@ line containing only @samp{%%}. * Rules Section:: * User Code Section:: * Comments in the Input:: -@end menu--> +@end menu +-->
@@ -498,6 +499,7 @@ ruleD ECHO; +
@@ -725,7 +727,7 @@ input yourself, or explicitly use @samp{r/\r\n} for @samp{r$}. -r +<s>r an @samp{r}, but only in start condition @code{s} (see @ref{Start Conditions} for discussion of start conditions). @@ -733,14 +735,14 @@ Conditions} for discussion of start conditions). -r +<s1,s2,s3>r same, but in any of start conditions @code{s1}, @code{s2}, or @code{s3}. -<*>r +<*>r an @samp{r} in any start condition, even an exclusive one. @@ -749,14 +751,14 @@ an @samp{r} in any start condition, even an exclusive one. -<> +<<EOF>> an end-of-file. -<> +<s1,s2><<EOF>> an end-of-file when in start condition @code{s1} or @code{s2} @@ -917,7 +919,7 @@ input unless there's another quote in the input. A rule can have at most one instance of trailing context (the @samp{/} operator -or the @samp{$} operator). The start condition, @samp{^}, and @samp{<>} patterns +or the @samp{$} operator). The start condition, @samp{^}, and @samp{<<EOF>>} patterns can only occur at the beginning of a pattern, and, as well as with @samp{/} and @samp{$}, cannot be grouped inside parentheses. A @samp{^} which does not occur at the beginning of a rule or a @samp{$} which does not occur at the end of @@ -1450,7 +1452,7 @@ the @code{YY_DECL} macro. For example, you could use: to give the scanning routine the name @code{lexscan}, returning a float, and taking two floats as arguments. Note that if you give arguments to -the scanning routine using a K&R-style/non-prototyped function +the scanning routine using a K&R-style/non-prototyped function declaration, you must terminate the definition with a semi-colon (;). flex generates @samp{C99} function definitions by @@ -1460,7 +1462,7 @@ bootstrapping gcc on old systems. Unfortunately, traditional definitions prevent us from using any standard data types smaller than int (such as short, char, or bool) as function arguments. For this reason, future versions of flex may generate standard C99 code -only, leaving K&R-style functions to the historians. 
Currently, if you +only, leaving K&R-style functions to the historians. Currently, if you do @strong{not} want @samp{C99} definitions, then you must use @code{%option noansi-definitions}. @@ -1563,8 +1565,8 @@ assigning it to some other @code{FILE} pointer. flex provides a mechanism for conditionally activating rules. -Any rule whose pattern is prefixed with @samp{} will only be active -when the scanner is in the @dfn{start condition} named @code{sc}. For +Any rule whose pattern is prefixed with @samp{<NAME>} will only be active +when the scanner is in the @dfn{start condition} named @code{NAME}. For example, @@ -1647,7 +1649,7 @@ is equivalent to -Without the @code{} qualifier, the @code{bar} pattern in +Without the @code{<INITIAL,example>} qualifier, the @code{bar} pattern in the second example wouldn't be active (i.e., couldn't match) when in start condition @code{example}. If we just used @code{example>} to qualify @code{bar}, though, then it would only be active in @@ -1657,11 +1659,11 @@ start condition is an inclusive @code{(%s)} start condition. Also note that the special start-condition specifier -@code{<*>} +@code{<*>} matches every start condition. Thus, the above example could also have been written: - + Flex provides @code{YYSTATE} as an alias for @code{YY_START} (since that -is what's used by AT&T @code{lex}). +is what's used by & @code{lex}). For historical reasons, start conditions do not have their own name-space within the generated scanner. The start condition names are @@ -2095,7 +2097,7 @@ current buffer. It should not be used as an lvalue. Here are two examples of using these features for writing a scanner which expands include files (the -@code{<>} +@code{<<EOF>>} feature is discussed below). This first example uses yypush_buffer_state and yypop_buffer_state. Flex @@ -2250,7 +2252,7 @@ reflecting the size of the buffer. 
End-of-File Rules -The special rule @code{<>} indicates +The special rule @code{<<EOF>>} indicates actions which are to be taken when an end-of-file is encountered and yywrap returns non-zero (i.e., indicates no further files to process). The action must finish @@ -2284,10 +2286,10 @@ shown in the example above. -<> rules may not be used with other patterns; they may only be -qualified with a list of start conditions. If an unqualified <> +<<EOF>> rules may not be used with other patterns; they may only be +qualified with a list of start conditions. If an unqualified <<EOF>> rule is given, it applies to @emph{all} start conditions which do not -already have <> actions. To specify an <> rule for only the +already have <<EOF>> actions. To specify an <<EOF>> rule for only the initial start condition, use: @@ -2556,7 +2558,8 @@ menu. If you want to lookup a particular option by name, @xref{Index of Scanner * Options for Scanner Speed and Size:: * Debugging Options:: * Miscellaneous Options:: -@end menu--> +@end menu +--> Even though there are many scanner options, a typical scanner might only @@ -2621,8 +2624,6 @@ corresponding routine not appearing in the generated scanner: (though yy_push_state and friends won't appear anyway unless you use @code{%option stack)}. - -
Options for Specifing Filenames @@ -2750,7 +2751,7 @@ not be folded). For tricky behavior, see @ref{case and character ranges}. -l, --lex-compat, @code{%option lex-compat} -turns on maximum compatibility with the original AT&T @code{lex} +turns on maximum compatibility with the original & @code{lex} implementation. Note that this does not mean @emph{full} compatibility. Use of this option costs a considerable amount of performance, and it cannot be used with the @samp{--c++}, @samp{--full}, @samp{--fast}, @samp{-Cf}, or @@ -2927,7 +2928,7 @@ in behavior. At the current writing the known differences between -In POSIX and AT&T @code{lex}, the repeat operator, @samp{@{@}}, has lower +In POSIX and & @code{lex}, the repeat operator, @samp{@{@}}, has lower precedence than concatenation (thus @samp{ab@{3@}} yields @samp{ababab}). Most POSIX utilities use an Extended Regular Expression (ERE) precedence that has the precedence of the repeat operator higher than concatenation @@ -2935,7 +2936,7 @@ that has the precedence of the repeat operator higher than concatenation places the precedence of the repeat operator higher than concatenation which matches the ERE processing of other POSIX utilities. When either @samp{--posix} or @samp{-l} are specified, flex will use the -traditional AT&T and POSIX-compliant precedence for the repeat operator +traditional & and POSIX-compliant precedence for the repeat operator where concatenation has higher precedence than the repeat operator. @@ -3291,7 +3292,6 @@ between small scanners and fast scanners. @opindex -C - -C @@ -3688,6 +3688,7 @@ prints the version number to stdout and exits. +
@@ -4388,10 +4389,9 @@ multi-threaded applications. Any thread may create and execute a reentrant * Reentrant Example:: * Reentrant Detail:: * Reentrant Functions:: -@end menu--> - +@end menu +--> -
Uses for Reentrant Scanners @@ -4547,7 +4547,8 @@ Here are the things you need to do or know to use the reentrant C API of * Accessor Methods:: * Extra Data:: * About yyscan_t:: -@end menu--> +@end menu +-->
@@ -4961,6 +4962,7 @@ yylloc assumes that @code{YYSLYPE} is a valid type. Typically, these types are generated by bison, and are included in section 1 of the flex input. +
@@ -4969,7 +4971,7 @@ input. -flex is a rewrite of the AT&T Unix @emph{lex} tool (the two +flex is a rewrite of the & Unix @emph{lex} tool (the two implementations do not share any code, though), with some extensions and incompatibilities, both of which are of concern to those who wish to write scanners acceptable to both implementations. flex is fully @@ -4977,9 +4979,9 @@ compliant with the POSIX @code{lex} specification, except that when using @code{%pointer} (the default), a call to unput destroys the contents of yytext, which is counter to the POSIX specification. In this section we discuss all of the known areas of -incompatibility between flex, AT&T @code{lex}, and the POSIX +incompatibility between flex, & @code{lex}, and the POSIX specification. flex's @samp{-l} option turns on maximum -compatibility with the original AT&T @code{lex} implementation, at the +compatibility with the original & @code{lex} implementation, at the cost of a major loss in the generated scanner's performance. We note below which incompatibilities can be overcome using the @samp{-l} option. flex is fully compatible with @code{lex} with the @@ -5118,7 +5120,7 @@ and so the string @samp{foo} will match. Note that if the definition begins with @samp{^} or ends with @samp{$} then it is @emph{not} expanded with parentheses, to allow these operators to appear in definitions without losing their special -meanings. But the @samp{}, @samp{/}, and @code{<>} operators +meanings. But the @samp{<s>}, @samp{/}, and @code{<<EOF>>} operators cannot be used in a flex definition. @@ -5169,7 +5171,7 @@ This is not the case with @code{lex} or the POSIX specification. The The precedence of the @samp{@{,@}} (numeric range) operator is -different. The AT&T and POSIX specifications of @code{lex} +different. 
The & and POSIX specifications of @code{lex} interpret @samp{abc@{1,3@}} as match one, two, or three occurrences of @samp{abc}'', whereas flex interprets it as ``match @samp{ab} followed by one, two, or three occurrences of @@ -5252,11 +5254,11 @@ yy_set_bol() YY_AT_BOL() - <> + <<EOF>> -<*> +<*> @@ -5338,11 +5340,10 @@ override the default behavior. * The Default Memory Management:: * Overriding The Default Memory Management:: * A Note About yytext And Memory:: -@end menu--> +@end menu +--> - -
The Default Memory Management @@ -5556,6 +5557,7 @@ To prevent memory leaks from strdup'd yytext, you will have to track the memory somehow. Our experience has shown that a garbage collection mechanism or a pooled memory mechanism will save you a lot of grief when writing parsers. +
@@ -5580,10 +5582,10 @@ scanning begins. The tables may be discarded when scanning is finished. * Creating Serialized Tables:: * Loading and Unloading Serialized Tables:: * Tables File Format:: -@end menu--> +@end menu +--> -
Creating Serialized Tables @@ -5969,6 +5971,7 @@ calculated from the beginning of this table. +
@@ -6084,8 +6087,8 @@ or, as noted above, switch to using the C++ scanner class. -@samp{too many start conditions in <> construct!} you listed more start -conditions in a <> construct than exist (so you must have listed at +@samp{too many start conditions in <> construct!} you listed more start +conditions in a <> construct than exist (so you must have listed at least one of them twice). @@ -6166,7 +6169,7 @@ You may wish to read more about the following programs: The following books may contain material of interest: John Levine, Tony Mason, and Doug Brown, -@emph{Lex & Yacc}, +@emph{Lex & Yacc}, O'Reilly and Associates. Be sure to get the 2nd edition. M. E. Lesk and E. Schmidt, @@ -6177,9 +6180,9 @@ Techniques and Tools}, Addison-Wesley (1986). Describes the pattern-matching techniques used by flex (deterministic finite automata). - + -
+ FAQ From time to time, the flex maintainer receives certain @@ -6219,7 +6222,7 @@ publish them here. * How can I expand macros in the input?:: * How can I build a two-pass scanner?:: * How do I match any string not matched in the preceding rules?:: -* I am trying to port code from AT&T lex that uses yysptr and yysbuf.:: +* I am trying to port code from & lex that uses yysptr and yysbuf.:: * Is there a way to make flex treat NULL like a regular character?:: * Whenever flex can not match the input it says "flex scanner jammed".:: * Why doesnt flex have non-greedy operators like perl does?:: @@ -6290,8 +6293,6 @@ publish them here. @end menu--> -
-
When was flex born? @@ -6394,7 +6395,7 @@ data_.* yyless( 5 ); BEGIN BLOCKIDSTATE; Another fix would be to make the second rule active only during the -@code{} start condition, and make that start condition exclusive +@code{<BLOCKIDSTATE>} start condition, and make that start condition exclusive by declaring it with @code{%x} instead of @code{%s}. A final fix is to change the input language so that the ambiguity for @@ -6523,8 +6524,8 @@ real @code{EOF} next time it's called). Then you could write: How can I make REJECT cascade across start condition boundaries? You can do this as follows. Suppose you have a start condition @samp{A}, and -after exhausting all of the possible matches in @samp{}, you want to try -matches in @samp{}. Then you could use the following: +after exhausting all of the possible matches in @samp{<A>}, you want to try +matches in @samp{<INITIAL>}. Then you could use the following: @@ -6817,7 +6818,7 @@ did_init = 1;
How do I execute code at termination? -You can specify an action for the @code{<>} rule. +You can specify an action for the @code{<<EOF>>} rule.
@@ -6944,9 +6945,9 @@ one to match.
-I am trying to port code from AT&T lex that uses yysptr and yysbuf. +I am trying to port code from <acronym>&</acronym> lex that uses yysptr and yysbuf. -Those are internal variables pointing into the AT&T scanner's input buffer. I +Those are internal variables pointing into the & scanner's input buffer. I imagine they're being manipulated in user versions of the input and unput functions. If so, what you need to do is analyze those functions to figure out what they're doing, and then replace input with an appropriate definition of @@ -7548,7 +7549,7 @@ trailing context operator, and have it enclosed in ()'s. Flex does not allow this operator to be enclosed in ()'s because doing so allows undefined regular expressions such as "(a/b)+". So the solution is to remove the parentheses. Note that you must also be building the scanner with the -l -option for AT&T lex compatibility. Without this option, flex automatically +option for & lex compatibility. Without this option, flex automatically encloses the definitions in parentheses. Vern @@ -9179,7 +9180,7 @@ Well, your problem is the switch (yybgin-yysvec-1) { /* witchcraft */ at the beginning of lex rules. "witchcraft" == "non-portable". It's -assuming knowledge of the AT&T lex's internal variables. +assuming knowledge of the & lex's internal variables. For flex, you can probably do the equivalent using a switch on YYSTATE. @@ -9282,18 +9283,13 @@ then the problem is that the last rule needs to be "{whitespace}" ! ]]> +
-@appendix Appendices - - - + + -@appendixsec Makefiles and Flex + +Makefiles and Flex @@ -9409,9 +9405,9 @@ with your specific implementation of @command{make}. For more details on writing Makefiles, see @ref{Top, , , make, The GNU Make Manual}. - + -
+ C Scanners with Bison Parsers @@ -9521,9 +9517,9 @@ As you can see, there really is no magic here. We just use -
+ -
+ M4 Dependency @@ -9554,12 +9550,11 @@ code such as @code{x[y[z]]}. scanner is ordinary C or C++, and does @emph{not} require m4. -
+ -
+ - - -
+@end menu
Concept Index @@ -9619,6 +9611,9 @@ to specific locations in the generated scanner, and may be used to insert arbitr Index of Scanner Options @printindex op +
+ +-->