2 files changed, 6399 insertions, 0 deletions
diff --git a/MISC/texinfo/flex.info b/MISC/texinfo/flex.info
new file mode 100644
index 0000000..9269418
--- /dev/null
+++ b/MISC/texinfo/flex.info
@@ -0,0 +1,2951 @@
+This is Info file flex.info, produced by Makeinfo-1.55 from the input
+file flex.texi.
+
+START-INFO-DIR-ENTRY
+* Flex: (flex).         A fast scanner generator.
+END-INFO-DIR-ENTRY
+
+   This file documents Flex.
+
+   Copyright (c) 1990 The Regents of the University of California.  All
+rights reserved.
+
+   This code is derived from software contributed to Berkeley by Vern
+Paxson.
+
+   The United States Government has rights in this work pursuant to
+contract no. DE-AC03-76SF00098 between the United States Department of
+Energy and the University of California.
+
+   Redistribution and use in source and binary forms with or without
+modification are permitted provided that: (1) source distributions
+retain this entire copyright notice and comment, and (2) distributions
+including binaries display the following acknowledgement:  "This
+product includes software developed by the University of California,
+Berkeley and its contributors" in the documentation or other materials
+provided with the distribution and in all advertising materials
+mentioning features or use of this software.  Neither the name of the
+University nor the names of its contributors may be used to endorse or
+promote products derived from this software without specific prior
+written permission.
+
+   THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
+WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+
+File: flex.info,  Node: Top,  Next: Name,  Prev: (dir),  Up: (dir)
+
+flex
+****
+
+   This manual documents `flex'.  It covers release 2.5.
+
+* Menu:
+
+* Name::                        Name
+* Synopsis::                    Synopsis
+* Overview::                    Overview
+* Description::                 Description
+* Examples::                    Some simple examples
+* Format::                      Format of the input file
+* Patterns::                    Patterns
+* Matching::                    How the input is matched
+* Actions::                     Actions
+* Generated scanner::           The generated scanner
+* Start conditions::            Start conditions
+* Multiple buffers::            Multiple input buffers
+* End-of-file rules::           End-of-file rules
+* Miscellaneous::               Miscellaneous macros
+* User variables::              Values available to the user
+* YACC interface::              Interfacing with `yacc'
+* Options::                     Options
+* Performance::                 Performance considerations
+* C++::                         Generating C++ scanners
+* Incompatibilities::           Incompatibilities with `lex' and POSIX
+* Diagnostics::                 Diagnostics
+* Files::                       Files
+* Deficiencies::                Deficiencies / Bugs
+* See also::                    See also
+* Author::                      Author
+
+
+File: flex.info,  Node: Name,  Next: Synopsis,  Prev: Top,  Up: Top
+
+Name
+====
+
+   flex - fast lexical analyzer generator
+
+
+File: flex.info,  Node: Synopsis,  Next: Overview,  Prev: Name,  Up: Top
+
+Synopsis
+========
+
+     flex [-bcdfhilnpstvwBFILTV78+? -C[aefFmr] -ooutput -Pprefix -Sskeleton]
+     [--help --version] [FILENAME ...]
+
+
+File: flex.info,  Node: Overview,  Next: Description,  Prev: Synopsis,  Up: Top
+
+Overview
+========
+
+   This manual describes `flex', a tool for generating programs that
+perform pattern-matching on text.  The manual includes both tutorial
+and reference sections:
+
+Description
+     a brief overview of the tool
+
+Some Simple Examples
+Format Of The Input File
+Patterns
+     the extended regular expressions used by flex
+
+How The Input Is Matched
+     the rules for determining what has been matched
+
+Actions
+     how to specify what to do when a pattern is matched
+
+The Generated Scanner
+     details regarding the scanner that flex produces; how to control
+     the input source
+
+Start Conditions
+     introducing context into your scanners, and managing
+     "mini-scanners"
+
+Multiple Input Buffers
+     how to manipulate multiple input sources; how to scan from strings
+     instead of files
+
+End-of-file Rules
+     special rules for matching the end of the input
+
+Miscellaneous Macros
+     a summary of macros available to the actions
+
+Values Available To The User
+     a summary of values available to the actions
+
+Interfacing With Yacc
+     connecting flex scanners together with yacc parsers
+
+Options
+     flex command-line options, and the "%option" directive
+
+Performance Considerations
+     how to make your scanner go as fast as possible
+
+Generating C++ Scanners
+     the (experimental) facility for generating C++ scanner classes
+
+Incompatibilities With Lex And POSIX
+     how flex differs from AT&T lex and the POSIX lex standard
+
+Diagnostics
+     those error messages produced by flex (or scanners it generates)
+     whose meanings might not be apparent
+
+Files
+     files used by flex
+
+Deficiencies / Bugs
+     known problems with flex
+
+See Also
+     other documentation, related tools
+
+Author
+     includes contact information
+
+
+File: flex.info,  Node: Description,  Next: Examples,  Prev: Overview,  Up: Top
+
+Description
+===========
+
+   `flex' is a tool for generating "scanners": programs which
+recognize lexical patterns in text.  `flex' reads the given input
+files, or its standard input if no file names are given, for a
+description of a scanner to generate.  The description is in the form
+of pairs of regular expressions and C code, called "rules". `flex'
+generates as output a C source file, `lex.yy.c', which defines a
+routine `yylex()'.  This file is compiled and linked with the `-lfl'
+library to produce an executable.  When the executable is run, it
+analyzes its input for occurrences of the regular expressions.
+Whenever it finds one, it executes the corresponding C code.
+
+
+File: flex.info,  Node: Examples,  Next: Format,  Prev: Description,  Up: Top
+
+Some simple examples
+====================
+
+   First some simple examples to get the flavor of how one uses `flex'.
+The following `flex' input specifies a scanner which whenever it
+encounters the string "username" will replace it with the user's login
+name:
+
+     %%
+     username    printf( "%s", getlogin() );
+
+   By default, any text not matched by a `flex' scanner is copied to
+the output, so the net effect of this scanner is to copy its input file
+to its output with each occurrence of "username" expanded.  In this
+input, there is just one rule.  "username" is the PATTERN and the
+"printf" is the ACTION.  The "%%" marks the beginning of the rules.
+
+   Here's another simple example:
+
+             int num_lines = 0, num_chars = 0;
+     
+     %%
+     \n      ++num_lines; ++num_chars;
+     .       ++num_chars;
+     
+     %%
+     main()
+             {
+             yylex();
+             printf( "# of lines = %d, # of chars = %d\n",
+                     num_lines, num_chars );
+             }
+
+   This scanner counts the number of characters and the number of lines
+in its input (it produces no output other than the final report on the
+counts).  The first line declares two globals, "num_lines" and
+"num_chars", which are accessible both inside `yylex()' and in the
+`main()' routine declared after the second "%%".  There are two rules,
+one which matches a newline ("\n") and increments both the line count
+and the character count, and one which matches any character other than
+a newline (indicated by the "." regular expression).
+
+   A somewhat more complicated example:
+
+     /* scanner for a toy Pascal-like language */
+     
+     %{
+     /* need this for the calls to atoi() and atof() below */
+     #include <stdlib.h>
+     %}
+     
+     DIGIT    [0-9]
+     ID       [a-z][a-z0-9]*
+     
+     %%
+     
+     {DIGIT}+    {
+                 printf( "An integer: %s (%d)\n", yytext,
+                         atoi( yytext ) );
+                 }
+     
+     {DIGIT}+"."{DIGIT}*        {
+                 printf( "A float: %s (%g)\n", yytext,
+                         atof( yytext ) );
+                 }
+     
+     if|then|begin|end|procedure|function        {
+                 printf( "A keyword: %s\n", yytext );
+                 }
+     
+     {ID}        printf( "An identifier: %s\n", yytext );
+     
+     "+"|"-"|"*"|"/"   printf( "An operator: %s\n", yytext );
+     
+     "{"[^}\n]*"}"     /* eat up one-line comments */
+     
+     [ \t\n]+          /* eat up whitespace */
+     
+     .           printf( "Unrecognized character: %s\n", yytext );
+     
+     %%
+     
+     main( argc, argv )
+     int argc;
+     char **argv;
+         {
+         ++argv, --argc;  /* skip over program name */
+         if ( argc > 0 )
+                 yyin = fopen( argv[0], "r" );
+         else
+                 yyin = stdin;
+     
+         yylex();
+         }
+
+   This is the beginning of a simple scanner for a language like
+Pascal.  It identifies different types of TOKENS and reports on what it
+has seen.
+
+   The details of this example will be explained in the following
+sections.
+
+
+File: flex.info,  Node: Format,  Next: Patterns,  Prev: Examples,  Up: Top
+
+Format of the input file
+========================
+
+   The `flex' input file consists of three sections, separated by a
+line with just `%%' in it:
+
+     definitions
+     %%
+     rules
+     %%
+     user code
+
+   The "definitions" section contains declarations of simple "name"
+definitions to simplify the scanner specification, and declarations of
+"start conditions", which are explained in a later section.  Name
+definitions have the form:
+
+     name definition
+
+   The "name" is a word beginning with a letter or an underscore ('_')
+followed by zero or more letters, digits, '_', or '-' (dash).  The
+definition is taken to begin at the first non-white-space character
+following the name and continuing to the end of the line.  The
+definition can subsequently be referred to using "{name}", which will
+expand to "(definition)".  For example,
+
+     DIGIT    [0-9]
+     ID       [a-z][a-z0-9]*
+
+defines "DIGIT" to be a regular expression which matches a single
+digit, and "ID" to be a regular expression which matches a letter
+followed by zero-or-more letters-or-digits.  A subsequent reference to
+
+     {DIGIT}+"."{DIGIT}*
+
+is identical to
+
+     ([0-9])+"."([0-9])*
+
+and matches one-or-more digits followed by a '.' followed by
+zero-or-more digits.
+
+   The RULES section of the `flex' input contains a series of rules of
+the form:
+
+     pattern   action
+
+where the pattern must be unindented and the action must begin on the
+same line.
+
+   See below for a further description of patterns and actions.
+
+   Finally, the user code section is simply copied to `lex.yy.c'
+verbatim.  It is used for companion routines which call or are called
+by the scanner.  The presence of this section is optional; if it is
+missing, the second `%%' in the input file may be skipped, too.
+
+   In the definitions and rules sections, any *indented* text or text
+enclosed in `%{' and `%}' is copied verbatim to the output (with the
+`%{}''s removed).  The `%{}''s must appear unindented on lines by
+themselves.
+
+   In the rules section, any indented or %{} text appearing before the
+first rule may be used to declare variables which are local to the
+scanning routine and (after the declarations) code which is to be
+executed whenever the scanning routine is entered.  Other indented or
+%{} text in the rule section is still copied to the output, but its
+meaning is not well-defined and it may well cause compile-time errors
+(this feature is present for `POSIX' compliance; see below for other
+such features).
+
+   In the definitions section (but not in the rules section), an
+unindented comment (i.e., a line beginning with "/*") is also copied
+verbatim to the output up to the next "*/".
+
+
+File: flex.info,  Node: Patterns,  Next: Matching,  Prev: Format,  Up: Top
+
+Patterns
+========
+
+   The patterns in the input are written using an extended set of
+regular expressions.  These are:
+
+`x'
+     match the character `x'
+
+`.'
+     any character (byte) except newline
+
+`[xyz]'
+     a "character class"; in this case, the pattern matches either an
+     `x', a `y', or a `z'
+
+`[abj-oZ]'
+     a "character class" with a range in it; matches an `a', a `b', any
+     letter from `j' through `o', or a `Z'
+
+`[^A-Z]'
+     a "negated character class", i.e., any character but those in the
+     class.  In this case, any character EXCEPT an uppercase letter.
+
+`[^A-Z\n]'
+     any character EXCEPT an uppercase letter or a newline
+
+`R*'
+     zero or more R's, where R is any regular expression
+
+`R+'
+     one or more R's
+
+`R?'
+     zero or one R's (that is, "an optional R")
+
+`R{2,5}'
+     anywhere from two to five R's
+
+`R{2,}'
+     two or more R's
+
+`R{4}'
+     exactly 4 R's
+
+`{NAME}'
+     the expansion of the "NAME" definition (see above)
+
+`"[xyz]\"foo"'
+     the literal string: `[xyz]"foo'
+
+`\X'
+     if X is an `a', `b', `f', `n', `r', `t', or `v', then the ANSI-C
+     interpretation of \X.  Otherwise, a literal `X' (used to escape
+     operators such as `*')
+
+`\0'
+     a NUL character (ASCII code 0)
+
+`\123'
+     the character with octal value 123
+
+`\x2a'
+     the character with hexadecimal value `2a'
+
+`(R)'
+     match an R; parentheses are used to override precedence (see below)
+
+`RS'
+     the regular expression R followed by the regular expression S;
+     called "concatenation"
+
+`R|S'
+     either an R or an S
+
+`R/S'
+     an R but only if it is followed by an S.  The text matched by S is
+     included when determining whether this rule is the "longest
+     match", but is then returned to the input before the action is
+     executed.  So the action only sees the text matched by R.  This
+     type of pattern is called "trailing context".  (There are some
+     combinations of `R/S' that `flex' cannot match correctly; see
+     notes in the Deficiencies / Bugs section below regarding
+     "dangerous trailing context".)
+
+`^R'
+     an R, but only at the beginning of a line (i.e., when just
+     starting to scan, or right after a newline has been scanned).
+
+`R$'
+     an R, but only at the end of a line (i.e., just before a newline).
+     Equivalent to "R/\n".
+
+     Note that flex's notion of "newline" is exactly whatever the C
+     compiler used to compile flex interprets '\n' as; in particular,
+     on some DOS systems you must either filter out \r's in the input
+     yourself, or explicitly use R/\r\n for "R$".
+
+`<S>R'
+     an R, but only in start condition S (see below for discussion of
+     start conditions); `<S1,S2,S3>R' is the same, but in any of start
+     conditions S1, S2, or S3
+
+`<*>R'
+     an R in any start condition, even an exclusive one.
+
+`<<EOF>>'
+     an end-of-file; `<S1,S2><<EOF>>' is an end-of-file when in start
+     condition S1 or S2
+
+   Note that inside of a character class, all regular expression
+operators lose their special meaning except escape ('\') and the
+character class operators, '-', ']', and, at the beginning of the
+class, '^'.
+
+   The regular expressions listed above are grouped according to
+precedence, from highest precedence at the top to lowest at the bottom.
+Those grouped together have equal precedence.  For example,
+
+     foo|bar*
+
+is the same as
+
+     (foo)|(ba(r*))
+
+since the '*' operator has higher precedence than concatenation, and
+concatenation higher than alternation ('|').  This pattern therefore
+matches *either* the string "foo" *or* the string "ba" followed by
+zero-or-more r's.  To match "foo" or zero-or-more "bar"'s, use:
+
+     foo|(bar)*
+
+and to match zero-or-more "foo"'s-or-"bar"'s:
+
+     (foo|bar)*
+
+   In addition to characters and ranges of characters, character
+classes can also contain character class "expressions".  These are
+expressions enclosed inside `[:' and `:]' delimiters (which themselves
+must appear between the '[' and ']' of the character class; other
+elements may occur inside the character class, too).  The valid
+expressions are:
+
+     [:alnum:] [:alpha:] [:blank:]
+     [:cntrl:] [:digit:] [:graph:]
+     [:lower:] [:print:] [:punct:]
+     [:space:] [:upper:] [:xdigit:]
+
+   These expressions all designate a set of characters equivalent to
+the corresponding standard C `isXXX' function.  For example,
+`[:alnum:]' designates those characters for which `isalnum()' returns
+true - i.e., any alphabetic or numeric.  Some systems don't provide
+`isblank()', so flex defines `[:blank:]' as a blank or a tab.
+
+   For example, the following character classes are all equivalent:
+
+     [[:alnum:]]
+     [[:alpha:][:digit:]]
+     [[:alpha:]0-9]
+     [a-zA-Z0-9]
+
+   If your scanner is case-insensitive (the `-i' flag), then
+`[:upper:]' and `[:lower:]' are equivalent to `[:alpha:]'.
+
+   Some notes on patterns:
+
+   - A negated character class such as the example "[^A-Z]" above *will
+     match a newline* unless "\n" (or an equivalent escape sequence) is
+     one of the characters explicitly present in the negated character
+     class (e.g., "[^A-Z\n]").  This is unlike how many other regular
+     expression tools treat negated character classes, but
+     unfortunately the inconsistency is historically entrenched.
+     Matching newlines means that a pattern like [^"]* can match the
+     entire input unless there's another quote in the input.
+
+   - A rule can have at most one instance of trailing context (the '/'
+     operator or the '$' operator).  The start condition, '^', and
+     "<<EOF>>" patterns can only occur at the beginning of a pattern,
+     and, as well as with '/' and '$', cannot be grouped inside
+     parentheses.  A '^' which does not occur at the beginning of a
+     rule or a '$' which does not occur at the end of a rule loses its
+     special properties and is treated as a normal character.
+
+     The following are illegal:
+
+          foo/bar$
+          <sc1>foo<sc2>bar
+
+     Note that the first of these can be written "foo/bar\n".
+
+     The following will result in '$' or '^' being treated as a normal
+     character:
+
+          foo|(bar$)
+          foo|^bar
+
+     If what's wanted is a "foo" or a bar-followed-by-a-newline, the
+     following could be used (the special '|' action is explained
+     below):
+
+          foo      |
+          bar$     /* action goes here */
+
+     A similar trick will work for matching a foo or a
+     bar-at-the-beginning-of-a-line.
+
+
+File: flex.info,  Node: Matching,  Next: Actions,  Prev: Patterns,  Up: Top
+
+How the input is matched
+========================
+
+   When the generated scanner is run, it analyzes its input looking for
+strings which match any of its patterns.  If it finds more than one
+match, it takes the one matching the most text (for trailing context
+rules, this includes the length of the trailing part, even though it
+will then be returned to the input).  If it finds two or more matches
+of the same length, the rule listed first in the `flex' input file is
+chosen.
+
+   Once the match is determined, the text corresponding to the match
+(called the TOKEN) is made available in the global character pointer
+`yytext', and its length in the global integer `yyleng'.  The ACTION
+corresponding to the matched pattern is then executed (a more detailed
+description of actions follows), and then the remaining input is
+scanned for another match.
+
+   If no match is found, then the "default rule" is executed: the next
+character in the input is considered matched and copied to the standard
+output.  Thus, the simplest legal `flex' input is:
+
+     %%
+
+   which generates a scanner that simply copies its input (one
+character at a time) to its output.
+
+   Note that `yytext' can be defined in two different ways: either as a
+character *pointer* or as a character *array*.  You can control which
+definition `flex' uses by including one of the special directives
+`%pointer' or `%array' in the first (definitions) section of your flex
+input.  The default is `%pointer', unless you use the `-l' lex
+compatibility option, in which case `yytext' will be an array.  The
+advantage of using `%pointer' is substantially faster scanning and no
+buffer overflow when matching very large tokens (unless you run out of
+dynamic memory).  The disadvantage is that you are restricted in how
+your actions can modify `yytext' (see the next section), and calls to
+the `unput()' function destroy the present contents of `yytext', which
+can be a considerable porting headache when moving between different
+`lex' versions.
+
+   The advantage of `%array' is that you can then modify `yytext' to
+your heart's content, and calls to `unput()' do not destroy `yytext'
+(see below).  Furthermore, existing `lex' programs sometimes access
+`yytext' externally using declarations of the form:
+     extern char yytext[];
+   This definition is erroneous when used with `%pointer', but correct
+for `%array'.
+
+   `%array' defines `yytext' to be an array of `YYLMAX' characters,
+which defaults to a fairly large value.  You can change the size by
+simply #define'ing `YYLMAX' to a different value in the first section
+of your `flex' input.  As mentioned above, with `%pointer' yytext grows
+dynamically to accommodate large tokens.  While this means your
+`%pointer' scanner can accommodate very large tokens (such as matching
+entire blocks of comments), bear in mind that each time the scanner
+must resize `yytext' it also must rescan the entire token from the
+beginning, so matching such tokens can prove slow.  `yytext' presently
+does *not* dynamically grow if a call to `unput()' results in too much
+text being pushed back; instead, a run-time error results.
+
+   Also note that you cannot use `%array' with C++ scanner classes (the
+`c++' option; see below).
+
+
+File: flex.info,  Node: Actions,  Next: Generated scanner,  Prev: Matching,  Up: Top
+
+Actions
+=======
+
+   Each pattern in a rule has a corresponding action, which can be any
+arbitrary C statement.  The pattern ends at the first non-escaped
+whitespace character; the remainder of the line is its action.  If the
+action is empty, then when the pattern is matched the input token is
+simply discarded.  For example, here is the specification for a program
+which deletes all occurrences of "zap me" from its input:
+
+     %%
+     "zap me"
+
+   (It will copy all other characters in the input to the output since
+they will be matched by the default rule.)
+
+   Here is a program which compresses multiple blanks and tabs down to
+a single blank, and throws away whitespace found at the end of a line:
+
+     %%
+     [ \t]+        putchar( ' ' );
+     [ \t]+$       /* ignore this token */
+
+   If the action contains a '{', then the action spans till the
+balancing '}' is found, and the action may cross multiple lines.
+`flex' knows about C strings and comments and won't be fooled by braces
+found within them, but also allows actions to begin with `%{' and will
+consider the action to be all the text up to the next `%}' (regardless
+of ordinary braces inside the action).
+
+   An action consisting solely of a vertical bar ('|') means "same as
+the action for the next rule." See below for an illustration.
+
+   Actions can include arbitrary C code, including `return' statements
+to return a value to whatever routine called `yylex()'.  Each time
+`yylex()' is called it continues processing tokens from where it last
+left off until it either reaches the end of the file or executes a
+return.
+
+   Actions are free to modify `yytext' except for lengthening it
+(adding characters to its end; these will overwrite later characters in
+the input stream).  This however does not apply when using `%array'
+(see above); in that case, `yytext' may be freely modified in any way.
+
+   Actions are free to modify `yyleng' except they should not do so if
+the action also includes use of `yymore()' (see below).
+
+   There are a number of special directives which can be included
+within an action:
+
+   - `ECHO' copies yytext to the scanner's output.
+
+   - `BEGIN' followed by the name of a start condition places the
+     scanner in the corresponding start condition (see below).
+
+   - `REJECT' directs the scanner to proceed on to the "second best"
+     rule which matched the input (or a prefix of the input).  The rule
+     is chosen as described above in "How the Input is Matched", and
+     `yytext' and `yyleng' set up appropriately.  It may either be one
+     which matched as much text as the originally chosen rule but came
+     later in the `flex' input file, or one which matched less text.
+     For example, the following will both count the words in the input
+     and call the routine special() whenever "frob" is seen:
+
+                  int word_count = 0;
+          %%
+          
+          frob        special(); REJECT;
+          [^ \t\n]+   ++word_count;
+
+     Without the `REJECT', any "frob"'s in the input would not be
+     counted as words, since the scanner normally executes only one
+     action per token.  Multiple `REJECT' actions are allowed, each one
+     finding the next best choice to the currently active rule.  For
+     example, when the following scanner scans the token "abcd", it
+     will write "abcdabcaba" to the output:
+
+          %%
+          a        |
+          ab       |
+          abc      |
+          abcd     ECHO; REJECT;
+          .|\n     /* eat up any unmatched character */
+
+     (The first three rules share the fourth's action since they use
+     the special '|' action.)  `REJECT' is a particularly expensive
+     feature in terms of scanner performance; if it is used in *any* of
+     the scanner's actions it will slow down *all* of the scanner's
+     matching.  Furthermore, `REJECT' cannot be used with the `-Cf' or
+     `-CF' options (see below).
+
+     Note also that unlike the other special actions, `REJECT' is a
+     *branch*; code immediately following it in the action will *not*
+     be executed.
+
+   - `yymore()' tells the scanner that the next time it matches a rule,
+     the corresponding token should be *appended* onto the current
+     value of `yytext' rather than replacing it.  For example, given
+     the input "mega-kludge" the following will write
+     "mega-mega-kludge" to the output:
+
+          %%
+          mega-    ECHO; yymore();
+          kludge   ECHO;
+
+     First "mega-" is matched and echoed to the output.  Then "kludge"
+     is matched, but the previous "mega-" is still hanging around at
+     the beginning of `yytext' so the `ECHO' for the "kludge" rule will
+     actually write "mega-kludge".
+
+   Two notes regarding use of `yymore()'.  First, `yymore()' depends on
+the value of `yyleng' correctly reflecting the size of the current
+token, so you must not modify `yyleng' if you are using `yymore()'.
+Second, the presence of `yymore()' in the scanner's action entails a
+minor performance penalty in the scanner's matching speed.
+
+   - `yyless(n)' returns all but the first N characters of the current
+     token back to the input stream, where they will be rescanned when
+     the scanner looks for the next match.  `yytext' and `yyleng' are
+     adjusted appropriately (e.g., `yyleng' will now be equal to N).
+     For example, on the input "foobar" the following will write out
+     "foobarbar":
+
+          %%
+          foobar    ECHO; yyless(3);
+          [a-z]+    ECHO;
+
+     An argument of 0 to `yyless' will cause the entire current input
+     string to be scanned again.  Unless you've changed how the scanner
+     will subsequently process its input (using `BEGIN', for example),
+     this will result in an endless loop.
+
+     Note that `yyless' is a macro and can only be used in the flex
+     input file, not from other source files.
+
+   - `unput(c)' puts the character `c' back onto the input stream.  It
+     will be the next character scanned.  The following action will
+     take the current token and cause it to be rescanned enclosed in
+     parentheses.
+
+          {
+          int i;
+          /* Copy yytext because unput() trashes yytext */
+          char *yycopy = strdup( yytext );
+          unput( ')' );
+          for ( i = yyleng - 1; i >= 0; --i )
+              unput( yycopy[i] );
+          unput( '(' );
+          free( yycopy );
+          }
+
+     Note that since each `unput()' puts the given character back at
+     the *beginning* of the input stream, pushing back strings must be
+     done back-to-front.  An important potential problem when using
+     `unput()' is that if you are using `%pointer' (the default), a
+     call to `unput()' *destroys* the contents of `yytext', starting
+     with its rightmost character and devouring one character to the
+     left with each call.  If you need the value of yytext preserved
+     after a call to `unput()' (as in the above example), you must
+     either first copy it elsewhere, or build your scanner using
+     `%array' instead (see How The Input Is Matched).
+
+     Finally, note that you cannot put back `EOF' to attempt to mark
+     the input stream with an end-of-file.
+
+   - `input()' reads the next character from the input stream.  For
+     example, the following is one way to eat up C comments:
+
+          %%
+          "/*"        {
+                      register int c;
+          
+                      for ( ; ; )
+                          {
+                          while ( (c = input()) != '*' &&
+                                  c != EOF )
+                              ;    /* eat up text of comment */
+          
+                          if ( c == '*' )
+                              {
+                              while ( (c = input()) == '*' )
+                                  ;
+                              if ( c == '/' )
+                                  break;    /* found the end */
+                              }
+          
+                          if ( c == EOF )
+                              {
+                              error( "EOF in comment" );
+                              break;
+                              }
+                          }
+                      }
+
+     (Note that if the scanner is compiled using `C++', then `input()'
+     is instead referred to as `yyinput()', in order to avoid a name
+     clash with the `C++' stream by the name of `input'.)
+
+   - YY_FLUSH_BUFFER flushes the scanner's internal buffer so that the
+     next time the scanner attempts to match a token, it will first
+     refill the buffer using `YY_INPUT' (see The Generated Scanner,
+     below).  This action is a special case of the more general
+     `yy_flush_buffer()' function, described below in the section
+     Multiple Input Buffers.
+
+   - `yyterminate()' can be used in lieu of a return statement in an
+     action.  It terminates the scanner and returns a 0 to the
+     scanner's caller, indicating "all done".  By default,
+     `yyterminate()' is also called when an end-of-file is encountered.
+     It is a macro and may be redefined.
+
+
+File: flex.info,  Node: Generated scanner,  Next: Start conditions,  Prev: Actions,  Up: Top
+
+The generated scanner
+=====================
+
+   The output of `flex' is the file `lex.yy.c', which contains the
+scanning routine `yylex()', a number of tables used by it for matching
+tokens, and a number of auxiliary routines and macros.  By default,
+`yylex()' is declared as follows:
+
+     int yylex()
+         {
+         ... various definitions and the actions in here ...
+         }
+
+   (If your environment supports function prototypes, then it will be
+"int yylex( void )".)  This definition may be changed by defining the
+"YY_DECL" macro.  For example, you could use:
+
+     #define YY_DECL float lexscan( a, b ) float a, b;
+
+   to give the scanning routine the name `lexscan', returning a float,
+and taking two floats as arguments.  Note that if you give arguments to
+the scanning routine using a K&R-style/non-prototyped function
+declaration, you must terminate the definition with a semi-colon (`;').
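+
+   If your compiler accepts function prototypes, the same routine can
+presumably be declared in prototype form instead, in which case no
+trailing semi-colon is needed.  A minimal sketch, reusing the name
+`lexscan' and the two float arguments from the example above:
+
+     %{
+     /* prototype-style counterpart of the K&R form shown above */
+     #define YY_DECL float lexscan( float a, float b )
+     %}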
+
+   Whenever `yylex()' is called, it scans tokens from the global input
+file `yyin' (which defaults to stdin).  It continues until it either
+reaches an end-of-file (at which point it returns the value 0) or one
+of its actions executes a `return' statement.
+
+   If the scanner reaches an end-of-file, subsequent calls are undefined
+unless either `yyin' is pointed at a new input file (in which case
+scanning continues from that file), or `yyrestart()' is called.
+`yyrestart()' takes one argument, a `FILE *' pointer (which can be nil,
+if you've set up `YY_INPUT' to scan from a source other than `yyin'),
+and initializes `yyin' for scanning from that file.  Essentially there
+is no difference between just assigning `yyin' to a new input file or
+using `yyrestart()' to do so; the latter is available for compatibility
+with previous versions of `flex', and because it can be used to switch
+input files in the middle of scanning.  It can also be used to throw
+away the current input buffer, by calling it with an argument of
+`yyin'; but better is to use `YY_FLUSH_BUFFER' (see above).  Note that
+`yyrestart()' does *not* reset the start condition to `INITIAL' (see
+Start Conditions, below).
+
+   If `yylex()' stops scanning due to executing a `return' statement in
+one of the actions, the scanner may then be called again and it will
+resume scanning where it left off.
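+
+   A caller can rely on this to pull tokens one at a time.  The
+following is a minimal sketch of such a driver, placed in the user
+code section of the input.  It assumes that every action returns a
+non-zero token code, so that 0 can only mean end-of-file:
+
+     %%
+     ...rules whose actions return a token code...
+     %%
+     int main()
+         {
+         int tok;
+     
+         /* each call resumes where the previous one stopped */
+         while ( (tok = yylex()) != 0 )
+             printf( "token %d: %s\n", tok, yytext );
+     
+         return 0;
+         }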
+
+   By default (and for purposes of efficiency), the scanner uses
+block-reads rather than simple `getc()' calls to read characters from
+`yyin'.  The nature of how it gets its input can be controlled by
+defining the `YY_INPUT' macro.  YY_INPUT's calling sequence is
+"YY_INPUT(buf,result,max_size)".  Its action is to place up to MAX_SIZE
+characters in the character array BUF and return in the integer
+variable RESULT either the number of characters read or the constant
+YY_NULL (0 on Unix systems) to indicate EOF.  The default YY_INPUT
+reads from the global file-pointer "yyin".
+
+   A sample definition of YY_INPUT (in the definitions section of the
+input file):
+
+     %{
+     #define YY_INPUT(buf,result,max_size) \
+         { \
+         int c = getchar(); \
+         result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
+         }
+     %}
+
+   This definition will change the input processing to occur one
+character at a time.
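+
+   As a further sketch of the calling sequence, the following
+definition hands the scanner up to MAX_SIZE characters at a time from
+a string held in memory.  The variable `input_ptr' is a hypothetical
+name introduced for this illustration; it is assumed to point at a
+NUL-terminated string before `yylex()' is first called:
+
+     %{
+     #include <string.h>
+     
+     const char *input_ptr;    /* hypothetical; set by the caller */
+     
+     #define YY_INPUT(buf,result,max_size) \
+         { \
+         size_t n = strlen( input_ptr ); \
+         if ( n > (size_t) (max_size) ) \
+             n = (max_size); \
+         memcpy( buf, input_ptr, n ); \
+         input_ptr += n; \
+         result = n;    /* 0 (YY_NULL) once the string is used up */ \
+         }
+     %}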
+
+   When the scanner receives an end-of-file indication from YY_INPUT,
+it then checks the `yywrap()' function.  If `yywrap()' returns false
+(zero), then it is assumed that the function has gone ahead and set up
+`yyin' to point to another input file, and scanning continues.  If it
+returns true (non-zero), then the scanner terminates, returning 0 to
+its caller.  Note that in either case, the start condition remains
+unchanged; it does *not* revert to `INITIAL'.
+
+   If you do not supply your own version of `yywrap()', then you must
+either use `%option noyywrap' (in which case the scanner behaves as
+though `yywrap()' returned 1), or you must link with `-lfl' to obtain
+the default version of the routine, which always returns 1.
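+
+   As a sketch of a user-supplied `yywrap()', the following version
+steps through a list of file names.  The variable `filelist' is an
+assumption made for this illustration (a NULL-terminated array of
+names set up by the caller, currently pointing at the name of the
+file being scanned); it is not something `flex' provides:
+
+     %{
+     char **filelist;    /* hypothetical; set by the caller */
+     %}
+     %%
+     ...rules...
+     %%
+     int yywrap()
+         {
+         while ( *++filelist )
+             {
+             yyin = fopen( *filelist, "r" );
+             if ( yyin )
+                 return 0;    /* more input; keep scanning */
+             }
+     
+         return 1;    /* no files left; yylex() returns 0 */
+         }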
+
+   Three routines are available for scanning from in-memory buffers
+rather than files: `yy_scan_string()', `yy_scan_bytes()', and
+`yy_scan_buffer()'.  See the discussion of them below in the section
+Multiple Input Buffers.
+
+   The scanner writes its `ECHO' output to the `yyout' global (default,
+stdout), which may be redefined by the user simply by assigning it to
+some other `FILE' pointer.
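+
+   For instance, `ECHO' output could be sent to a file along the
+following lines, before the first call to `yylex()'.  The file name
+`scanner.out' is only a placeholder for this sketch:
+
+     yyout = fopen( "scanner.out", "w" );
+     if ( ! yyout )
+         yyout = stdout;    /* fall back to the default */
+     
+     yylex();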
+
+
+File: flex.info,  Node: Start conditions,  Next: Multiple buffers,  Prev: Generated scanner,  Up: Top
+
+Start conditions
+================
+
+   `flex' provides a mechanism for conditionally activating rules.  Any
+rule whose pattern is prefixed with "<sc>" will only be active when the
+scanner is in the start condition named "sc".  For example,
+
+     <STRING>[^"]*        { /* eat up the string body ... */
+                 ...
+                 }
+
+will be active only when the scanner is in the "STRING" start
+condition, and
+
+     <INITIAL,STRING,QUOTE>\.        { /* handle an escape ... */
+                 ...
+                 }
+
+will be active only when the current start condition is either
+"INITIAL", "STRING", or "QUOTE".
+
+   Start conditions are declared in the definitions (first) section of
+the input using unindented lines beginning with either `%s' or `%x'
+followed by a list of names.  The former declares *inclusive* start
+conditions, the latter *exclusive* start conditions.  A start condition
+is activated using the `BEGIN' action.  Until the next `BEGIN' action is
+executed, rules with the given start condition will be active and rules
+with other start conditions will be inactive.  If the start condition
+is *inclusive*, then rules with no start conditions at all will also be
+active.  If it is *exclusive*, then *only* rules qualified with the
+start condition will be active.  A set of rules contingent on the same
+exclusive start condition describe a scanner which is independent of
+any of the other rules in the `flex' input.  Because of this, exclusive
+start conditions make it easy to specify "mini-scanners" which scan
+portions of the input that are syntactically different from the rest
+(e.g., comments).
+
+   If the distinction between inclusive and exclusive start conditions
+is still a little vague, here's a simple example illustrating the
+connection between the two.  The set of rules:
+
+     %s example
+     %%
+     
+     <example>foo   do_something();
+     
+     bar            something_else();
+
+is equivalent to
+
+     %x example
+     %%
+     
+     <example>foo   do_something();
+     
+     <INITIAL,example>bar    something_else();
+
+   Without the `<INITIAL,example>' qualifier, the `bar' pattern in the
+second example wouldn't be active (i.e., couldn't match) when in start
+condition `example'.  If we just used `<example>' to qualify `bar',
+though, then it would only be active in `example' and not in `INITIAL',
+while in the first example it's active in both, because in the first
+example the `example' starting condition is an *inclusive* (`%s') start
+condition.
+
+   Also note that the special start-condition specifier `<*>' matches
+every start condition.  Thus, the above example could also have been
+written:
+
+     %x example
+     %%
+     
+     <example>foo   do_something();
+     
+     <*>bar    something_else();
+
+   The default rule (to `ECHO' any unmatched character) remains active
+in start conditions.  It is equivalent to:
+
+     <*>.|\n     ECHO;
+
+   `BEGIN(0)' returns to the original state where only the rules with
+no start conditions are active.  This state can also be referred to as
+the start-condition "INITIAL", so `BEGIN(INITIAL)' is equivalent to
+`BEGIN(0)'.  (The parentheses around the start condition name are not
+required but are considered good style.)
+
+   `BEGIN' actions can also be given as indented code at the beginning
+of the rules section.  For example, the following will cause the
+scanner to enter the "SPECIAL" start condition whenever `yylex()' is
+called and the global variable `enter_special' is true:
+
+             int enter_special;
+     
+     %x SPECIAL
+     %%
+             if ( enter_special )
+                 BEGIN(SPECIAL);
+     
+     <SPECIAL>blahblahblah
+     ...more rules follow...
+
+   To illustrate the uses of start conditions, here is a scanner which
+provides two different interpretations of a string like "123.456".  By
+default it will treat it as three tokens, the integer "123", a dot
+('.'), and the integer "456".  But if the string is preceded earlier in
+the line by the string "expect-floats" it will treat it as a single
+token, the floating-point number 123.456:
+
+     %{
+     #include <math.h>
+     #include <stdlib.h>    /* atof(), atoi() */
+     %}
+     %s expect
+     
+     %%
+     expect-floats        BEGIN(expect);
+     
+     <expect>[0-9]+"."[0-9]+      {
+                 printf( "found a float, = %f\n",
+                         atof( yytext ) );
+                 }
+     <expect>\n           {
+                 /* that's the end of the line, so
+                  * we need another "expect-floats"
+                  * before we'll recognize any more
+                  * numbers
+                  */
+                 BEGIN(INITIAL);
+                 }
+     
+     [0-9]+      {
+                 printf( "found an integer, = %d\n",
+                         atoi( yytext ) );
+                 }
+     
+     "."         printf( "found a dot\n" );
+
+   Here is a scanner which recognizes (and discards) C comments while
+maintaining a count of the current input line.
+
+     %x comment
+     %%
+             int line_num = 1;
+     
+     "/*"         BEGIN(comment);
+     
+     <comment>[^*\n]*        /* eat anything that's not a '*' */
+     <comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
+     <comment>\n             ++line_num;
+     <comment>"*"+"/"        BEGIN(INITIAL);
+
+   This scanner goes to a bit of trouble to match as much text as
+possible with each rule.  In general, when attempting to write a
+high-speed scanner, try to match as much as possible in each rule, as
+it's a big win.
+
+   Note that start condition names are really integer values and can
+be stored as such.  Thus, the above could be extended in the following
+fashion:
+
+     %x comment foo
+     %%
+             int line_num = 1;
+             int comment_caller;
+     
+     "/*"         {
+                  comment_caller = INITIAL;
+                  BEGIN(comment);
+                  }
+     
+     ...
+     
+     <foo>"/*"    {
+                  comment_caller = foo;
+                  BEGIN(comment);
+                  }
+     
+     <comment>[^*\n]*        /* eat anything that's not a '*' */
+     <comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
+     <comment>\n             ++line_num;
+     <comment>"*"+"/"        BEGIN(comment_caller);
+
+   Furthermore, you can access the current start condition using the
+integer-valued `YY_START' macro.  For example, the above assignments to
+`comment_caller' could instead be written
+
+     comment_caller = YY_START;
+
+   Flex provides `YYSTATE' as an alias for `YY_START' (since that is
+what's used by AT&T `lex').
+
+   Note that start conditions do not have their own name-space; %s's
+and %x's declare names in the same fashion as #define's.
+
+   Finally, here's an example of how to match C-style quoted strings
+using exclusive start conditions, including expanded escape sequences
+(but not including checking for a string that's too long):
+
+     %x str
+     
+     %%
+             char string_buf[MAX_STR_CONST];
+             char *string_buf_ptr;
+     
+     \"      string_buf_ptr = string_buf; BEGIN(str);
+     
+     <str>\"        { /* saw closing quote - all done */
+             BEGIN(INITIAL);
+             *string_buf_ptr = '\0';
+             /* return string constant token type and
+              * value to parser
+              */
+             }
+     
+     <str>\n        {
+             /* error - unterminated string constant */
+             /* generate error message */
+             }
+     
+     <str>\\[0-7]{1,3} {
+             /* octal escape sequence */
+             unsigned int result;
+     
+             (void) sscanf( yytext + 1, "%o", &result );
+     
+             if ( result > 0xff )
+                     {
+                     /* error, constant is out-of-bounds */
+                     }
+     
+             *string_buf_ptr++ = result;
+             }
+     
+     <str>\\[0-9]+ {
+             /* generate error - bad escape sequence; something
+              * like '\48' or '\0777777'
+              */
+             }
+     
+     <str>\\n  *string_buf_ptr++ = '\n';
+     <str>\\t  *string_buf_ptr++ = '\t';
+     <str>\\r  *string_buf_ptr++ = '\r';
+     <str>\\b  *string_buf_ptr++ = '\b';
+     <str>\\f  *string_buf_ptr++ = '\f';
+     
+     <str>\\(.|\n)  *string_buf_ptr++ = yytext[1];
+     
+     <str>[^\\\n\"]+        {
+             char *yptr = yytext;
+     
+             while ( *yptr )
+                     *string_buf_ptr++ = *yptr++;
+             }
+
+   Often, such as in some of the examples above, you wind up writing a
+whole bunch of rules all preceded by the same start condition(s).  Flex
+makes this a little easier and cleaner by introducing a notion of start
+condition "scope".  A start condition scope is begun with:
+
+     <SCs>{
+
+where SCs is a list of one or more start conditions.  Inside the start
+condition scope, every rule automatically has the prefix `<SCs>'
+applied to it, until a `}' which matches the initial `{'.  So, for
+example,
+
+     <ESC>{
+         "\\n"   return '\n';
+         "\\r"   return '\r';
+         "\\f"   return '\f';
+         "\\0"   return '\0';
+     }
+
+is equivalent to:
+
+     <ESC>"\\n"  return '\n';
+     <ESC>"\\r"  return '\r';
+     <ESC>"\\f"  return '\f';
+     <ESC>"\\0"  return '\0';
+
+   Start condition scopes may be nested.
+
+   Three routines are available for manipulating stacks of start
+conditions:
+
+`void yy_push_state(int new_state)'
+     pushes the current start condition onto the top of the start
+     condition stack and switches to NEW_STATE as though you had used
+     `BEGIN new_state' (recall that start condition names are also
+     integers).
+
+`void yy_pop_state()'
+     pops the top of the stack and switches to it via `BEGIN'.
+
+`int yy_top_state()'
+     returns the top of the stack without altering the stack's contents.
+
+   The start condition stack grows dynamically and so has no built-in
+size limitation.  If memory is exhausted, program execution aborts.
+
+   To use start condition stacks, your scanner must include a `%option
+stack' directive (see Options below).
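+
+   As an illustration, here is a sketch of a scanner that discards
+comments which, unlike real C comments, are allowed to nest.  The
+stack records how deeply nested the scanner currently is.  The start
+condition name `comment' is an arbitrary choice for this example:
+
+     %option stack
+     %x comment
+     %%
+     "/*"                  yy_push_state(comment);
+     
+     <comment>{
+         "/*"              yy_push_state(comment);
+         "*/"              yy_pop_state();
+         [^*/\n]+          /* eat ordinary comment text */
+         [*/]              /* eat a lone '*' or '/' */
+         \n                /* eat newlines too */
+     }
+
+   Each "/*" pushes the current start condition and each "*/" pops
+back to it, so the scanner returns to `INITIAL' only after the
+outermost comment has been closed.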
+
+
+File: flex.info,  Node: Multiple buffers,  Next: End-of-file rules,  Prev: Start conditions,  Up: Top
+
+Multiple input buffers
+======================
+
+   Some scanners (such as those which support "include" files) require
+reading from several input streams.  As `flex' scanners do a large
+amount of buffering, one cannot control where the next input will be
+read from by simply writing a `YY_INPUT' which is sensitive to the
+scanning context.  `YY_INPUT' is only called when the scanner reaches
+the end of its buffer, which may be a long time after scanning a
+statement such as an "include" which requires switching the input
+source.
+
+   To negotiate these sorts of problems, `flex' provides a mechanism
+for creating and switching between multiple input buffers.  An input
+buffer is created by using:
+
+     YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
+
+which takes a `FILE' pointer and a size and creates a buffer associated
+with the given file and large enough to hold SIZE characters (when in
+doubt, use `YY_BUF_SIZE' for the size).  It returns a `YY_BUFFER_STATE'
+handle, which may then be passed to other routines (see below).  The
+`YY_BUFFER_STATE' type is a pointer to an opaque `struct'
+`yy_buffer_state' structure, so you may safely initialize
+YY_BUFFER_STATE variables to `((YY_BUFFER_STATE) 0)' if you wish, and
+also refer to the opaque structure in order to correctly declare input
+buffers in source files other than that of your scanner.  Note that the
+`FILE' pointer in the call to `yy_create_buffer' is only used as the
+value of `yyin' seen by `YY_INPUT'; if you redefine `YY_INPUT' so it no
+longer uses `yyin', then you can safely pass a nil `FILE' pointer to
+`yy_create_buffer'.  You select a particular buffer to scan from using:
+
+     void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )
+
+   This switches the scanner's input buffer so subsequent tokens will come
+from NEW_BUFFER.  Note that `yy_switch_to_buffer()' may be used by
+`yywrap()' to set things up for continued scanning, instead of opening
+a new file and pointing `yyin' at it.  Note also that switching input
+sources via either `yy_switch_to_buffer()' or `yywrap()' does *not*
+change the start condition.
+
+     void yy_delete_buffer( YY_BUFFER_STATE buffer )
+
+is used to reclaim the storage associated with a buffer.  You can also
+clear the current contents of a buffer using:
+
+     void yy_flush_buffer( YY_BUFFER_STATE buffer )
+
+   This function discards the buffer's contents, so the next time the
+scanner attempts to match a token from the buffer, it will first fill
+the buffer anew using `YY_INPUT'.
+
+   `yy_new_buffer()' is an alias for `yy_create_buffer()', provided for
+compatibility with the C++ use of `new' and `delete' for creating and
+destroying dynamic objects.
+
+   Finally, the `YY_CURRENT_BUFFER' macro returns a `YY_BUFFER_STATE'
+handle to the current buffer.
+
+   Here is an example of using these features for writing a scanner
+which expands include files (the `<<EOF>>' feature is discussed below):
+
+     /* the "incl" state is used for picking up the name
+      * of an include file
+      */
+     %x incl
+     
+     %{
+     #define MAX_INCLUDE_DEPTH 10
+     YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
+     int include_stack_ptr = 0;
+     %}
+     
+     %%
+     include             BEGIN(incl);
+     
+     [a-z]+              ECHO;
+     [^a-z\n]*\n?        ECHO;
+     
+     <incl>[ \t]*      /* eat the whitespace */
+     <incl>[^ \t\n]+   { /* got the include file name */
+             if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
+                 {
+                 fprintf( stderr, "Includes nested too deeply\n" );
+                 exit( 1 );
+                 }
+     
+             include_stack[include_stack_ptr++] =
+                 YY_CURRENT_BUFFER;
+     
+             yyin = fopen( yytext, "r" );
+     
+             if ( ! yyin )
+                 error( ... );
+     
+             yy_switch_to_buffer(
+                 yy_create_buffer( yyin, YY_BUF_SIZE ) );
+     
+             BEGIN(INITIAL);
+             }
+     
+     <<EOF>> {
+             if ( --include_stack_ptr < 0 )
+                 {
+                 yyterminate();
+                 }
+     
+             else
+                 {
+                 yy_delete_buffer( YY_CURRENT_BUFFER );
+                 yy_switch_to_buffer(
+                      include_stack[include_stack_ptr] );
+                 }
+             }
+
+   Three routines are available for setting up input buffers for
+scanning in-memory strings instead of files.  All of them create a new
+input buffer for scanning the string, and return a corresponding
+`YY_BUFFER_STATE' handle (which you should delete with
+`yy_delete_buffer()' when done with it).  They also switch to the new
+buffer using `yy_switch_to_buffer()', so the next call to `yylex()' will
+start scanning the string.
+
+`yy_scan_string(const char *str)'
+     scans a NUL-terminated string.
+
+`yy_scan_bytes(const char *bytes, int len)'
+     scans `len' bytes (including possibly NUL's) starting at location
+     BYTES.
+
+   Note that both of these functions create and scan a *copy* of the
+string or bytes.  (This may be desirable, since `yylex()' modifies the
+contents of the buffer it is scanning.) You can avoid the copy by using:
+
+`yy_scan_buffer(char *base, yy_size_t size)'
+     which scans in place the buffer starting at BASE, consisting of
+     SIZE bytes, the last two bytes of which *must* be
+     `YY_END_OF_BUFFER_CHAR' (ASCII NUL).  These last two bytes are not
+     scanned; thus, scanning consists of `base[0]' through
+     `base[size-2]', inclusive.
+
+     If you fail to set up BASE in this manner (i.e., forget the final
+     two `YY_END_OF_BUFFER_CHAR' bytes), then `yy_scan_buffer()'
+     returns a nil pointer instead of creating a new input buffer.
+
+     The type `yy_size_t' is an integral type to which you can cast an
+     integer expression reflecting the size of the buffer.
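+
+   A minimal sketch of scanning a string held in memory, using only
+the routines described in this section.  The function name
+`scan_text' is hypothetical, the function is assumed to live in the
+user code section of the scanner, and `yywrap()' is assumed to return
+1 when the end of the string is reached:
+
+     void scan_text( const char *text )
+         {
+         /* creates a buffer holding a copy and switches to it */
+         YY_BUFFER_STATE buf = yy_scan_string( text );
+     
+         while ( yylex() != 0 )
+             ;    /* actions run as tokens are matched */
+     
+         yy_delete_buffer( buf );
+         }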
+
+
+File: flex.info,  Node: End-of-file rules,  Next: Miscellaneous,  Prev: Multiple buffers,  Up: Top
+
+End-of-file rules
+=================
+
+   The special rule "<<EOF>>" indicates actions which are to be taken
+when an end-of-file is encountered and yywrap() returns non-zero (i.e.,
+indicates no further files to process).  The action must finish by
+doing one of four things:
+
+   - assigning `yyin' to a new input file (in previous versions of
+     flex, after doing the assignment you had to call the special
+     action `YY_NEW_FILE'; this is no longer necessary);
+
+   - executing a `return' statement;
+
+   - executing the special `yyterminate()' action;
+
+   - or, switching to a new buffer using `yy_switch_to_buffer()' as
+     shown in the example above.
+
+   <<EOF>> rules may not be used with other patterns; they may only be
+qualified with a list of start conditions.  If an unqualified <<EOF>>
+rule is given, it applies to *all* start conditions which do not
+already have <<EOF>> actions.  To specify an <<EOF>> rule for only the
+initial start condition, use
+
+     <INITIAL><<EOF>>
+
+   These rules are useful for catching things like unclosed comments.
+An example:
+
+     %x quote
+     %%
+     
+     ...other rules for dealing with quotes...
+     
+     <quote><<EOF>>   {
+              error( "unterminated quote" );
+              yyterminate();
+              }
+     <<EOF>>  {
+              if ( *++filelist )
+                  yyin = fopen( *filelist, "r" );
+              else
+                 yyterminate();
+              }
+
+
+File: flex.info,  Node: Miscellaneous,  Next: User variables,  Prev: End-of-file rules,  Up: Top
+
+Miscellaneous macros
+====================
+
+   The macro `YY_USER_ACTION' can be defined to provide an action which
+is always executed prior to the matched rule's action.  For example, it
+could be #define'd to call a routine to convert yytext to lower-case.
+When `YY_USER_ACTION' is invoked, the variable `yy_act' gives the
+number of the matched rule (rules are numbered starting with 1).
+Suppose you want to profile how often each of your rules is matched.
+The following would do the trick:
+
+     #define YY_USER_ACTION ++ctr[yy_act]
+
+   where `ctr' is an array to hold the counts for the different rules.
+Note that the macro `YY_NUM_RULES' gives the total number of rules
+(including the default rule, even if you use `-s'), so a correct
+declaration for `ctr' is:
+
+     int ctr[YY_NUM_RULES];
+
+   The macro `YY_USER_INIT' may be defined to provide an action which
+is always executed before the first scan (and before the scanner's
+internal initializations are done).  For example, it could be used to
+call a routine to read in a data table or open a logging file.
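+
+   For example, the following sketch opens a logging file before the
+first scan.  The variable `logfile' and the file name `scanner.log'
+are hypothetical names chosen for this illustration:
+
+     %{
+     FILE *logfile;    /* hypothetical; used by the actions */
+     
+     #define YY_USER_INIT \
+         logfile = fopen( "scanner.log", "w" );
+     %}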
+
+   The macro `yy_set_interactive(is_interactive)' can be used to
+control whether the current buffer is considered *interactive*.  An
+interactive buffer is processed more slowly, but must be used when the
+scanner's input source is indeed interactive to avoid problems due to
+waiting to fill buffers (see the discussion of the `-I' flag below).  A
+non-zero value in the macro invocation marks the buffer as interactive,
+a zero value as non-interactive.  Note that use of this macro overrides
+`%option always-interactive' or `%option never-interactive' (see
+Options below).  `yy_set_interactive()' must be invoked prior to
+beginning to scan the buffer that is (or is not) to be considered
+interactive.
+
+   The macro `yy_set_bol(at_bol)' can be used to control whether the
+current buffer's scanning context for the next token match is done as
+though at the beginning of a line.  A non-zero macro argument makes
+rules anchored with '^' active, while a zero argument makes '^' rules
+inactive.
+
+   The macro `YY_AT_BOL()' returns true if the next token scanned from
+the current buffer will have '^' rules active, false otherwise.
+
+   In the generated scanner, the actions are all gathered in one large
+switch statement and separated using `YY_BREAK', which may be
+redefined.  By default, it is simply a "break", to separate each rule's
+action from the following rule's.  Redefining `YY_BREAK' allows, for
+example, C++ users to #define YY_BREAK to do nothing (while being very
+careful that every rule ends with a "break" or a "return"!) to avoid
+unreachable-statement warnings in cases where a rule's action ends
+with "return", making the `YY_BREAK' that follows it inaccessible.
+
+
+File: flex.info,  Node: User variables,  Next: YACC interface,  Prev: Miscellaneous,  Up: Top
+
+Values available to the user
+============================
+
+   This section summarizes the various values available to the user in
+the rule actions.
+
+   - `char *yytext' holds the text of the current token.  It may be
+     modified but not lengthened (you cannot append characters to the
+     end).
+
+     If the special directive `%array' appears in the first section of
+     the scanner description, then `yytext' is instead declared `char
+     yytext[YYLMAX]', where `YYLMAX' is a macro definition that you can
+     redefine in the first section if you don't like the default value
+     (generally 8KB).  Using `%array' results in somewhat slower
+     scanners, but the value of `yytext' becomes immune to calls to
+     `input()' and `unput()', which potentially destroy its value when
+     `yytext' is a character pointer.  The opposite of `%array' is
+     `%pointer', which is the default.
+
+     You cannot use `%array' when generating C++ scanner classes (the
+     `-+' flag).
+
+   - `int yyleng' holds the length of the current token.
+
+   - `FILE *yyin' is the file which by default `flex' reads from.  It
+     may be redefined but doing so only makes sense before scanning
+     begins or after an EOF has been encountered.  Changing it in the
+     midst of scanning will have unexpected results since `flex'
+     buffers its input; use `yyrestart()' instead.  Once scanning
+     terminates because an end-of-file has been seen, you can point
+     `yyin' at the new input file and then call the scanner again to
+     continue scanning.
+
+   - `void yyrestart( FILE *new_file )' may be called to point `yyin'
+     at the new input file.  The switch-over to the new file is
+     immediate (any previously buffered-up input is lost).  Note that
+     calling `yyrestart()' with `yyin' as an argument thus throws away
+     the current input buffer and continues scanning the same input
+     file.
+
+   - `FILE *yyout' is the file to which `ECHO' actions are done.  It
+     can be reassigned by the user.
+
+   - `YY_CURRENT_BUFFER' returns a `YY_BUFFER_STATE' handle to the
+     current buffer.
+
+   - `YY_START' returns an integer value corresponding to the current
+     start condition.  You can subsequently use this value with `BEGIN'
+     to return to that start condition.
+
+
+File: flex.info,  Node: YACC interface,  Next: Options,  Prev: User variables,  Up: Top
+
+Interfacing with `yacc'
+=======================
+
+   One of the main uses of `flex' is as a companion to the `yacc'
+parser-generator.  `yacc' parsers expect to call a routine named
+`yylex()' to find the next input token.  The routine is supposed to
+return the type of the next token as well as putting any associated
+value in the global `yylval'.  To use `flex' with `yacc', one specifies
+the `-d' option to `yacc' to instruct it to generate the file `y.tab.h'
+containing definitions of all the `%tokens' appearing in the `yacc'
+input.  This file is then included in the `flex' scanner.  For example,
+if one of the tokens is "TOK_NUMBER", part of the scanner might look
+like:
+
+     %{
+     #include "y.tab.h"
+     %}
+     
+     %%
+     
+     [0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;
+
+
+File: flex.info,  Node: Options,  Next: Performance,  Prev: YACC interface,  Up: Top
+
+Options
+=======
+
+   `flex' has the following options:
+
+`-b'
+     Generate backing-up information to `lex.backup'.  This is a list
+     of scanner states which require backing up and the input
+     characters on which they do so.  By adding rules one can remove
+     backing-up states.  If *all* backing-up states are eliminated and
+     `-Cf' or `-CF' is used, the generated scanner will run faster (see
+     the `-p' flag).  Only users who wish to squeeze every last cycle
+     out of their scanners need worry about this option.  (See the
+     section on Performance Considerations below.)
+
+`-c'
+     is a do-nothing, deprecated option included for POSIX compliance.
+
+`-d'
+     makes the generated scanner run in "debug" mode.  Whenever a
+     pattern is recognized and the global `yy_flex_debug' is non-zero
+     (which is the default), the scanner will write to `stderr' a line
+     of the form:
+
+          --accepting rule at line 53 ("the matched text")
+
+     The line number refers to the location of the rule in the file
+     defining the scanner (i.e., the file that was fed to flex).
+     Messages are also generated when the scanner backs up, accepts the
+     default rule, reaches the end of its input buffer (or encounters a
+     NUL; at this point, the two look the same as far as the scanner's
+     concerned), or reaches an end-of-file.
+
+`-f'
+     specifies "fast scanner".  No table compression is done and stdio
+     is bypassed.  The result is large but fast.  This option is
+     equivalent to `-Cfr' (see below).
+
+`-h'
+     generates a "help" summary of `flex's' options to `stdout' and
+     then exits.  `-?' and `--help' are synonyms for `-h'.
+
+`-i'
+     instructs `flex' to generate a *case-insensitive* scanner.  The
+     case of letters given in the `flex' input patterns will be
+     ignored, and tokens in the input will be matched regardless of
+     case.  The matched text given in `yytext' will retain its original
+     case (i.e., it will not be folded).
+
+`-l'
+     turns on maximum compatibility with the original AT&T `lex'
+     implementation.  Note that this does not mean *full*
+     compatibility.  Use of this option costs a considerable amount of
+     performance, and it cannot be used with the `-+, -f, -F, -Cf', or
+     `-CF' options.  For details on the compatibilities it provides, see
+     the section "Incompatibilities With Lex And POSIX" below.  This
+     option also results in the name `YY_FLEX_LEX_COMPAT' being
+     #define'd in the generated scanner.
+
+`-n'
+     is another do-nothing, deprecated option included only for POSIX
+     compliance.
+
+`-p'
+     generates a performance report to stderr.  The report consists of
+     comments regarding features of the `flex' input file which will
+     cause a serious loss of performance in the resulting scanner.  If
+     you give the flag twice, you will also get comments regarding
+     features that lead to minor performance losses.
+
+     Note that the use of `REJECT', `%option yylineno' and variable
+     trailing context (see the Deficiencies / Bugs section below)
+     entails a substantial performance penalty; use of `yymore()', the
+     `^' operator, and the `-I' flag entail minor performance penalties.
+
+`-s'
+     causes the "default rule" (that unmatched scanner input is echoed
+     to `stdout') to be suppressed.  If the scanner encounters input
+     that does not match any of its rules, it aborts with an error.
+     This option is useful for finding holes in a scanner's rule set.
+
+`-t'
+     instructs `flex' to write the scanner it generates to standard
+     output instead of `lex.yy.c'.
+
+`-v'
+     specifies that `flex' should write to `stderr' a summary of
+     statistics regarding the scanner it generates.  Most of the
+     statistics are meaningless to the casual `flex' user, but the
+     first line identifies the version of `flex' (same as reported by
+     `-V'), and the next line the flags used when generating the
+     scanner, including those that are on by default.
+
+`-w'
+     suppresses warning messages.
+
+`-B'
+     instructs `flex' to generate a *batch* scanner, the opposite of
+     *interactive* scanners generated by `-I' (see below).  In general,
+     you use `-B' when you are *certain* that your scanner will never
+     be used interactively, and you want to squeeze a *little* more
+     performance out of it.  If your goal is instead to squeeze out a
+     *lot* more performance, you should be using the `-Cf' or `-CF'
+     options (discussed below), which turn on `-B' automatically anyway.
+
+`-F'
+     specifies that the "fast" scanner table representation should be
+     used (and stdio bypassed).  This representation is about as fast
+     as the full table representation `(-f)', and for some sets of
+     patterns will be considerably smaller (and for others, larger).
+     In general, if the pattern set contains both "keywords" and a
+     catch-all, "identifier" rule, such as in the set:
+
+          "case"    return TOK_CASE;
+          "switch"  return TOK_SWITCH;
+          ...
+          "default" return TOK_DEFAULT;
+          [a-z]+    return TOK_ID;
+
+     then you're better off using the full table representation.  If
+     only the "identifier" rule is present and you then use a hash
+     table or some such to detect the keywords, you're better off using
+     `-F'.
+
+     This option is equivalent to `-CFr' (see below).  It cannot be
+     used with `-+'.
+
+`-I'
+     instructs `flex' to generate an *interactive* scanner.  An
+     interactive scanner is one that only looks ahead to decide what
+     token has been matched if it absolutely must.  It turns out that
+     always looking one extra character ahead, even if the scanner has
+     already seen enough text to disambiguate the current token, is a
+     bit faster than only looking ahead when necessary.  But scanners
+     that always look ahead give dreadful interactive performance; for
+     example, when a user types a newline, it is not recognized as a
+     newline token until they enter *another* token, which often means
+     typing in another whole line.
+
+     `Flex' scanners default to *interactive* unless you use the `-Cf'
+     or `-CF' table-compression options (see below).  That's because if
+     you're looking for high performance you should be using one of
+     these options, so if you didn't, `flex' assumes you'd rather trade
+     off a bit of run-time performance for intuitive interactive
+     behavior.  Note also that you *cannot* use `-I' in conjunction
+     with `-Cf' or `-CF'.  Thus, this option is not really needed; it
+     is on by default for all those cases in which it is allowed.
+
+     You can force a scanner to *not* be interactive by using `-B' (see
+     above).
+
+`-L'
+     instructs `flex' not to generate `#line' directives.  Without this
+     option, `flex' peppers the generated scanner with #line directives
+     so error messages in the actions will be correctly located with
+     respect to either the original `flex' input file (if the errors
+     are due to code in the input file), or `lex.yy.c' (if the errors
+     are `flex's' fault - you should report these sorts of errors to
+     the email address given below).
+
+`-T'
+     makes `flex' run in `trace' mode.  It will generate a lot of
+     messages to `stderr' concerning the form of the input and the
+     resultant non-deterministic and deterministic finite automata.
+     This option is mostly for use in maintaining `flex'.
+
+`-V'
+     prints the version number to `stdout' and exits.  `--version' is a
+     synonym for `-V'.
+
+`-7'
+     instructs `flex' to generate a 7-bit scanner, i.e., one which can
+     only recognize 7-bit characters in its input.  The advantage of
+     using `-7' is that the scanner's tables can be up to half the size
+     of those generated using the `-8' option (see below).  The
+     disadvantage is that such scanners often hang or crash if their
+     input contains an 8-bit character.
+
+     Note, however, that unless you generate your scanner using the
+     `-Cf' or `-CF' table compression options, use of `-7' will save
+     only a small amount of table space, and make your scanner
+     considerably less portable.  `Flex's' default behavior is to
+     generate an 8-bit scanner unless you use `-Cf' or `-CF', in
+     which case `flex' defaults to generating 7-bit scanners unless
+     your site was always configured to generate 8-bit scanners (as
+     will often be the case with non-USA sites).  You can tell whether
+     flex generated a 7-bit or an 8-bit scanner by inspecting the flag
+     summary in the `-v' output as described above.
+
+     Note that if you use `-Cfe' or `-CFe' (those table compression
+     options, but also using equivalence classes as discussed
+     below), flex still defaults to generating an 8-bit scanner, since
+     usually with these compression options full 8-bit tables are not
+     much more expensive than 7-bit tables.
+
+`-8'
+     instructs `flex' to generate an 8-bit scanner, i.e., one which can
+     recognize 8-bit characters.  This flag is only needed for scanners
+     generated using `-Cf' or `-CF', as otherwise flex defaults to
+     generating an 8-bit scanner anyway.
+
+     See the discussion of `-7' above for flex's default behavior and
+     the tradeoffs between 7-bit and 8-bit scanners.
+
+`-+'
+     specifies that you want flex to generate a C++ scanner class.  See
+     the section on Generating C++ Scanners below for details.
+
+`-C[aefFmr]'
+     controls the degree of table compression and, more generally,
+     trade-offs between small scanners and fast scanners.
+
+     `-Ca' ("align") instructs flex to trade off larger tables in the
+     generated scanner for faster performance because the elements of
+     the tables are better aligned for memory access and computation.
+     On some RISC architectures, fetching and manipulating long-words
+     is more efficient than with smaller-sized units such as
+     shortwords.  This option can double the size of the tables used by
+     your scanner.
+
+     `-Ce' directs `flex' to construct "equivalence classes", i.e.,
+     sets of characters which have identical lexical properties (for
+     example, if the only appearance of digits in the `flex' input is
+     in the character class "[0-9]" then the digits '0', '1', ..., '9'
+     will all be put in the same equivalence class).  Equivalence
+     classes usually give dramatic reductions in the final table/object
+     file sizes (typically a factor of 2-5) and are pretty cheap
+     performance-wise (one array look-up per character scanned).
+
+     `-Cf' specifies that the *full* scanner tables should be generated
+     - `flex' should not compress the tables by taking advantage of
+     similar transition functions for different states.
+
+     `-CF' specifies that the alternate fast scanner representation
+     (described above under the `-F' flag) should be used.  This option
+     cannot be used with `-+'.
+
+     `-Cm' directs `flex' to construct "meta-equivalence classes",
+     which are sets of equivalence classes (or characters, if
+     equivalence classes are not being used) that are commonly used
+     together.  Meta-equivalence classes are often a big win when using
+     compressed tables, but they have a moderate performance impact
+     (one or two "if" tests and one array look-up per character
+     scanned).
+
+     `-Cr' causes the generated scanner to *bypass* use of the standard
+     I/O library (stdio) for input.  Instead of calling `fread()' or
+     `getc()', the scanner will use the `read()' system call, resulting
+     in a performance gain which varies from system to system, but in
+     general is probably negligible unless you are also using `-Cf' or
+     `-CF'.  Using `-Cr' can cause strange behavior if, for example,
+     you read from `yyin' using stdio prior to calling the scanner
+     (because the scanner will miss whatever text your previous reads
+     left in the stdio input buffer).
+
+     `-Cr' has no effect if you define `YY_INPUT' (see The Generated
+     Scanner above).
+
+     A lone `-C' specifies that the scanner tables should be compressed
+     but neither equivalence classes nor meta-equivalence classes
+     should be used.
+
+     The options `-Cf' or `-CF' and `-Cm' do not make sense together -
+     there is no opportunity for meta-equivalence classes if the table
+     is not being compressed.  Otherwise the options may be freely
+     mixed, and are cumulative.
+
+     The default setting is `-Cem', which specifies that `flex' should
+     generate equivalence classes and meta-equivalence classes.  This
+     setting provides the highest degree of table compression.  You can
+     trade off faster-executing scanners at the cost of larger tables
+     with the following generally being true:
+
+          slowest & smallest
+                -Cem
+                -Cm
+                -Ce
+                -C
+                -C{f,F}e
+                -C{f,F}
+                -C{f,F}a
+          fastest & largest
+
+     Note that scanners with the smallest tables are usually generated
+     and compiled the quickest, so during development you will usually
+     want to use the default, maximal compression.
+
+     `-Cfe' is often a good compromise between speed and size for
+     production scanners.
+
+`-ooutput'
+     directs flex to write the scanner to the file `output' instead
+     of `lex.yy.c'.  If you combine `-o' with the `-t' option, then the
+     scanner is written to `stdout' but its `#line' directives (see the
+     `-L' option above) refer to the file `output'.
+
+`-Pprefix'
+     changes the default `yy' prefix used by `flex' for all
+     globally-visible variable and function names to instead be PREFIX.
+     For example, `-Pfoo' changes the name of `yytext' to `footext'.
+     It also changes the name of the default output file from
+     `lex.yy.c' to `lex.foo.c'.  Here are all of the names affected:
+
+          yy_create_buffer
+          yy_delete_buffer
+          yy_flex_debug
+          yy_init_buffer
+          yy_flush_buffer
+          yy_load_buffer_state
+          yy_switch_to_buffer
+          yyin
+          yyleng
+          yylex
+          yylineno
+          yyout
+          yyrestart
+          yytext
+          yywrap
+
+     (If you are using a C++ scanner, then only `yywrap' and
+     `yyFlexLexer' are affected.) Within your scanner itself, you can
+     still refer to the global variables and functions using either
+     version of their name; but externally, they have the modified name.
+
+     This option lets you easily link together multiple `flex' programs
+     into the same executable.  Note, though, that using this option
+     also renames `yywrap()', so you now *must* either provide your own
+     (appropriately-named) version of the routine for your scanner, or
+     use `%option noyywrap', as linking with `-lfl' no longer provides
+     one for you by default.
+
+`-Sskeleton_file'
+     overrides the default skeleton file from which `flex' constructs
+     its scanners.  You'll never need this option unless you are doing
+     `flex' maintenance or development.
+
+   `flex' also provides a mechanism for controlling options within the
+scanner specification itself, rather than from the flex command-line.
+This is done by including `%option' directives in the first section of
+the scanner specification.  You can specify multiple options with a
+single `%option' directive, and multiple directives in the first
+section of your flex input file.  Most options are given simply as
+names, optionally preceded by the word "no" (with no intervening
+whitespace) to negate their meaning.  A number are equivalent to flex
+flags or their negation:
+
+     7bit            -7 option
+     8bit            -8 option
+     align           -Ca option
+     backup          -b option
+     batch           -B option
+     c++             -+ option
+     
+     caseful or
+     case-sensitive  opposite of -i (default)
+     
+     case-insensitive or
+     caseless        -i option
+     
+     debug           -d option
+     default         opposite of -s option
+     ecs             -Ce option
+     fast            -F option
+     full            -f option
+     interactive     -I option
+     lex-compat      -l option
+     meta-ecs        -Cm option
+     perf-report     -p option
+     read            -Cr option
+     stdout          -t option
+     verbose         -v option
+     warn            opposite of -w option
+                     (use "%option nowarn" for -w)
+     
+     array           equivalent to "%array"
+     pointer         equivalent to "%pointer" (default)
+
+   Some `%option's' provide features otherwise not available:
+
+`always-interactive'
+     instructs flex to generate a scanner which always considers its
+     input "interactive".  Normally, on each new input file the scanner
+     calls `isatty()' in an attempt to determine whether the scanner's
+     input source is interactive and thus should be read a character at
+     a time.  When this option is used, however, then no such call is
+     made.
+
+`main'
+     directs flex to provide a default `main()' program for the
+     scanner, which simply calls `yylex()'.  This option implies
+     `noyywrap' (see below).
+
+`never-interactive'
+     instructs flex to generate a scanner which never considers its
+     input "interactive" (again, no call made to `isatty()').  This is
+     the opposite of `always-interactive'.
+
+`stack'
+     enables the use of start condition stacks (see Start Conditions
+     above).
+
+`stdinit'
+     if unset (i.e., `%option nostdinit') initializes `yyin' and
+     `yyout' to nil `FILE' pointers, instead of `stdin' and `stdout'.
+
+`yylineno'
+     directs `flex' to generate a scanner that maintains the number of
+     the current line read from its input in the global variable
+     `yylineno'.  This option is implied by `%option lex-compat'.
+
+`yywrap'
+     if unset (i.e., `%option noyywrap'), makes the scanner not call
+     `yywrap()' upon an end-of-file, but simply assume that there are
+     no more files to scan (until the user points `yyin' at a new file
+     and calls `yylex()' again).
+
+   `flex' scans your rule actions to determine whether you use the
+`REJECT' or `yymore()' features.  The `reject' and `yymore' options are
+available to override its decision as to whether you use the options,
+either by setting them (e.g., `%option reject') to indicate the feature
+is indeed used, or unsetting them to indicate it actually is not used
+(e.g., `%option noyymore').
+
+   Three options take string-delimited values, offset with '=':
+
+     %option outfile="ABC"
+
+is equivalent to `-oABC', and
+
+     %option prefix="XYZ"
+
+is equivalent to `-PXYZ'.
+
+   Finally,
+
+     %option yyclass="foo"
+
+only applies when generating a C++ scanner (`-+' option).  It informs
+`flex' that you have derived `foo' as a subclass of `yyFlexLexer' so
+`flex' will place your actions in the member function `foo::yylex()'
+instead of `yyFlexLexer::yylex()'.  It also generates a
+`yyFlexLexer::yylex()' member function that emits a run-time error (by
+invoking `yyFlexLexer::LexerError()') if called.  See Generating C++
+Scanners, below, for additional information.
+
+   A number of options are available for lint purists who want to
+suppress the appearance of unneeded routines in the generated scanner.
+Each of the following, if unset, results in the corresponding routine
+not appearing in the generated scanner:
+
+     input, unput
+     yy_push_state, yy_pop_state, yy_top_state
+     yy_scan_buffer, yy_scan_bytes, yy_scan_string
+
+(though `yy_push_state()' and friends won't appear anyway unless you
+use `%option stack').
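+
+   As a minimal sketch of how these directives combine (the output
+file name `scan.c' and the prefix `calc' are hypothetical, chosen only
+for illustration), the first section of a specification might begin:
+
+     %option 8bit nodefault noyywrap
+     %option outfile="scan.c" prefix="calc"
+
+   Assuming the options behave as described above, this is roughly
+equivalent to running `flex -8 -s -oscan.c -Pcalc', except that
+`noyywrap' has no command-line counterpart; it is included here because
+`prefix' renames `yywrap()' and `-lfl' then no longer supplies one.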
+
+
+File: flex.info,  Node: Performance,  Next: C++,  Prev: Options,  Up: Top
+
+Performance considerations
+==========================
+
+   The main design goal of `flex' is that it generate high-performance
+scanners.  It has been optimized for dealing well with large sets of
+rules.  Aside from the effects on scanner speed of the table
+compression `-C' options outlined above, there are a number of
+options/actions which degrade performance.  These are, from most
+expensive to least:
+
+     REJECT
+     %option yylineno
+     arbitrary trailing context
+     
+     pattern sets that require backing up
+     %array
+     %option interactive
+     %option always-interactive
+     
+     '^' beginning-of-line operator
+     yymore()
+
+   with the first three all being quite expensive and the last two
+being quite cheap.  Note also that `unput()' is implemented as a
+routine call that potentially does quite a bit of work, while
+`yyless()' is a quite-cheap macro; so if just putting back some excess
+text you scanned, use `yyless()'.
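+
+   As a brief illustration of that last point (a sketch only; the
+rules are illustrative rather than part of any real scanner), the
+first action below echoes "foobar" and then uses `yyless(3)' to return
+the trailing "bar" to the input, where the second rule rescans it:
+
+     %%
+     foobar    ECHO; yyless(3);
+     [a-z]+    ECHO;
+
+   On the input "foobar" this should write out "foobarbar", with no
+call to `unput()' needed.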
+
+   `REJECT' should be avoided at all costs when performance is
+important.  It is a particularly expensive option.
+
+   Getting rid of backing up is messy and often may be an enormous
+amount of work for a complicated scanner.  In principle, one begins by
+using the `-b' flag to generate a `lex.backup' file.  For example, on
+the input
+
+     %%
+     foo        return TOK_KEYWORD;
+     foobar     return TOK_KEYWORD;
+
+the file looks like:
+
+     State #6 is non-accepting -
+      associated rule line numbers:
+            2       3
+      out-transitions: [ o ]
+      jam-transitions: EOF [ \001-n  p-\177 ]
+     
+     State #8 is non-accepting -
+      associated rule line numbers:
+            3
+      out-transitions: [ a ]
+      jam-transitions: EOF [ \001-`  b-\177 ]
+     
+     State #9 is non-accepting -
+      associated rule line numbers:
+            3
+      out-transitions: [ r ]
+      jam-transitions: EOF [ \001-q  s-\177 ]
+     
+     Compressed tables always back up.
+
+   The first few lines tell us that there's a scanner state in which it
+can make a transition on an 'o' but not on any other character, and
+that in that state the currently scanned text does not match any rule.
+The state occurs when trying to match the rules found at lines 2 and 3
+in the input file.  If the scanner is in that state and then reads
+something other than an 'o', it will have to back up to find a rule
+which is matched.  With a bit of head-scratching one can see that this
+must be the state it's in when it has seen "fo".  When this has
+happened, if anything other than another 'o' is seen, the scanner will
+have to back up to simply match the 'f' (by the default rule).
+
+   The comment regarding State #8 indicates there's a problem when
+"foob" has been scanned.  Indeed, on any character other than an 'a',
+the scanner will have to back up to accept "foo".  Similarly, the
+comment for State #9 concerns when "fooba" has been scanned and an 'r'
+does not follow.
+
+   The final comment reminds us that there's no point going to all the
+trouble of removing backing up from the rules unless we're using `-Cf'
+or `-CF', since there's no performance gain doing so with compressed
+scanners.
+
+   The way to remove the backing up is to add "error" rules:
+
+     %%
+     foo         return TOK_KEYWORD;
+     foobar      return TOK_KEYWORD;
+     
+     fooba       |
+     foob        |
+     fo          {
+                 /* false alarm, not really a keyword */
+                 return TOK_ID;
+                 }
+
+   Eliminating backing up among a list of keywords can also be done
+using a "catch-all" rule:
+
+     %%
+     foo         return TOK_KEYWORD;
+     foobar      return TOK_KEYWORD;
+     
+     [a-z]+      return TOK_ID;
+
+   This is usually the best solution when appropriate.
+
+   Backing up messages tend to cascade.  With a complicated set of
+rules it's not uncommon to get hundreds of messages.  If one can
+decipher them, though, it often only takes a dozen or so rules to
+eliminate the backing up (though it's easy to make a mistake and have
+an error rule accidentally match a valid token.  A possible future
+`flex' feature will be to automatically add rules to eliminate backing
+up).
+
+   It's important to keep in mind that you gain the benefits of
+eliminating backing up only if you eliminate *every* instance of
+backing up.  Leaving just one means you gain nothing.
+
+   VARIABLE trailing context (where both the leading and trailing parts
+do not have a fixed length) entails almost the same performance loss as
+`REJECT' (i.e., substantial).  So when possible a rule like:
+
+     %%
+     mouse|rat/(cat|dog)   run();
+
+is better written:
+
+     %%
+     mouse/cat|dog         run();
+     rat/cat|dog           run();
+
+or as
+
+     %%
+     mouse|rat/cat         run();
+     mouse|rat/dog         run();
+
+   Note that here the special '|' action does *not* provide any
+savings, and can even make things worse (see Deficiencies / Bugs below).
+
+   Another area where the user can increase a scanner's performance
+(and one that's easier to implement) arises from the fact that the
+longer the tokens matched, the faster the scanner will run.  This is
+because with long tokens the processing of most input characters takes
+place in the (short) inner scanning loop, and does not often have to go
+through the additional work of setting up the scanning environment
+(e.g., `yytext') for the action.  Recall the scanner for C comments:
+
+     %x comment
+     %%
+             int line_num = 1;
+     
+     "/*"         BEGIN(comment);
+     
+     <comment>[^*\n]*
+     <comment>"*"+[^*/\n]*
+     <comment>\n             ++line_num;
+     <comment>"*"+"/"        BEGIN(INITIAL);
+
+   This could be sped up by writing it as:
+
+     %x comment
+     %%
+             int line_num = 1;
+     
+     "/*"         BEGIN(comment);
+     
+     <comment>[^*\n]*
+     <comment>[^*\n]*\n      ++line_num;
+     <comment>"*"+[^*/\n]*
+     <comment>"*"+[^*/\n]*\n ++line_num;
+     <comment>"*"+"/"        BEGIN(INITIAL);
+
+   Now instead of each newline requiring the processing of another
+action, recognizing the newlines is "distributed" over the other rules
+to keep the matched text as long as possible.  Note that *adding* rules
+does *not* slow down the scanner!  The speed of the scanner is
+independent of the number of rules or (modulo the considerations given
+at the beginning of this section) how complicated the rules are with
+regard to operators such as '*' and '|'.
+
+   A final example in speeding up a scanner: suppose you want to scan
+through a file containing identifiers and keywords, one per line and
+with no other extraneous characters, and recognize all the keywords.  A
+natural first approach is:
+
+     %%
+     asm      |
+     auto     |
+     break    |
+     ... etc ...
+     volatile |
+     while    /* it's a keyword */
+     
+     .|\n     /* it's not a keyword */
+
+   To eliminate the back-tracking, introduce a catch-all rule:
+
+     %%
+     asm      |
+     auto     |
+     break    |
+     ... etc ...
+     volatile |
+     while    /* it's a keyword */
+     
+     [a-z]+   |
+     .|\n     /* it's not a keyword */
+
+   Now, if it's guaranteed that there's exactly one word per line, then
+we can reduce the total number of matches by a half by merging in the
+recognition of newlines with that of the other tokens:
+
+     %%
+     asm\n    |
+     auto\n   |
+     break\n  |
+     ... etc ...
+     volatile\n |
+     while\n  /* it's a keyword */
+     
+     [a-z]+\n |
+     .|\n     /* it's not a keyword */
+
+   One has to be careful here, as we have now reintroduced backing up
+into the scanner.  In particular, while *we* know that there will never
+be any characters in the input stream other than letters or newlines,
+`flex' can't figure this out, and it will plan for possibly needing to
+back up when it has scanned a token like "auto" and then the next
+character is something other than a newline or a letter.  Previously it
+would then just match the "auto" rule and be done, but now it has no
+"auto" rule, only an "auto\n" rule.  To eliminate the possibility of
+backing up, we could either duplicate all rules but without final
+newlines, or, since we never expect to encounter such an input and
+therefore don't care how it's classified, we can introduce one more
+catch-all rule, this one which doesn't include a newline:
+
+     %%
+     asm\n    |
+     auto\n   |
+     break\n  |
+     ... etc ...
+     volatile\n |
+     while\n  /* it's a keyword */
+     
+     [a-z]+\n |
+     [a-z]+   |
+     .|\n     /* it's not a keyword */
+
+   Compiled with `-Cf', this is about as fast as one can get a `flex'
+scanner to go for this particular problem.
+
+   A final note: `flex' is slow when matching NUL's, particularly when
+a token contains multiple NUL's.  It's best to write rules which match
+*short* amounts of text if it's anticipated that the text will often
+include NUL's.
+
+   Another final note regarding performance: as mentioned above in the
+section How the Input is Matched, dynamically resizing `yytext' to
+accommodate huge tokens is a slow process because it presently requires
+that the (huge) token be rescanned from the beginning.  Thus if
+performance is vital, you should attempt to match "large" quantities of
+text but not "huge" quantities, where the cutoff between the two is at
+about 8K characters/token.
+
+
+File: flex.info,  Node: C++,  Next: Incompatibilities,  Prev: Performance,  Up: Top
+
+Generating C++ scanners
+=======================
+
+   `flex' provides two different ways to generate scanners for use with
+C++.  The first way is to simply compile a scanner generated by `flex'
+using a C++ compiler instead of a C compiler.  You should not encounter
+any compilation errors (please report any you find to the email address
+given in the Author section below).  You can then use C++ code in your
+rule actions instead of C code.  Note that the default input source for
+your scanner remains `yyin', and default echoing is still done to
+`yyout'.  Both of these remain `FILE *' variables and not C++ `streams'.
+
+   You can also use `flex' to generate a C++ scanner class, using the
+`-+' option, (or, equivalently, `%option c++'), which is automatically
+specified if the name of the flex executable ends in a `+', such as
+`flex++'.  When using this option, flex defaults to generating the
+scanner to the file `lex.yy.cc' instead of `lex.yy.c'.  The generated
+scanner includes the header file `FlexLexer.h', which defines the
+interface to two C++ classes.
+
+   The first class, `FlexLexer', provides an abstract base class
+defining the general scanner class interface.  It provides the
+following member functions:
+
+`const char* YYText()'
+     returns the text of the most recently matched token, the
+     equivalent of `yytext'.
+
+`int YYLeng()'
+     returns the length of the most recently matched token, the
+     equivalent of `yyleng'.
+
+`int lineno() const'
+     returns the current input line number (see `%option yylineno'), or
+     1 if `%option yylineno' was not used.
+
+`void set_debug( int flag )'
+     sets the debugging flag for the scanner, equivalent to assigning to
+     `yy_flex_debug' (see the Options section above).  Note that you
+     must build the scanner using `%option debug' to include debugging
+     information in it.
+
+`int debug() const'
+     returns the current setting of the debugging flag.
+
+   Also provided are member functions equivalent to
+`yy_switch_to_buffer(), yy_create_buffer()' (though the first argument
+is an `istream*' object pointer and not a `FILE*'), `yy_flush_buffer()',
+`yy_delete_buffer()', and `yyrestart()' (again, the first argument is an
+`istream*' object pointer).
+
+   The second class defined in `FlexLexer.h' is `yyFlexLexer', which is
+derived from `FlexLexer'.  It defines the following additional member
+functions:
+
+`yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )'
+     constructs a `yyFlexLexer' object using the given streams for
+     input and output.  If not specified, the streams default to `cin'
+     and `cout', respectively.
+
+`virtual int yylex()'
+     performs the same role as `yylex()' does for ordinary flex
+     scanners: it scans the input stream, consuming tokens, until a
+     rule's action returns a value.  If you derive a subclass S from
+     `yyFlexLexer' and want to access the member functions and
+     variables of S inside `yylex()', then you need to use `%option
+     yyclass="S"' to inform `flex' that you will be using that subclass
+     instead of `yyFlexLexer'.  In this case, rather than generating
+     `yyFlexLexer::yylex()', `flex' generates `S::yylex()' (and also
+     generates a dummy `yyFlexLexer::yylex()' that calls
+     `yyFlexLexer::LexerError()' if called).
+
+`virtual void switch_streams(istream* new_in = 0, ostream* new_out = 0)'
+     reassigns `yyin' to `new_in' (if non-nil) and `yyout' to `new_out'
+     (ditto), deleting the previous input buffer if `yyin' is
+     reassigned.
+
+`int yylex( istream* new_in = 0, ostream* new_out = 0 )'
+     first switches the input streams via `switch_streams( new_in,
+     new_out )' and then returns the value of `yylex()'.
+
+   In addition, `yyFlexLexer' defines the following protected virtual
+functions which you can redefine in derived classes to tailor the
+scanner:
+
+`virtual int LexerInput( char* buf, int max_size )'
+     reads up to `max_size' characters into BUF and returns the number
+     of characters read.  To indicate end-of-input, return 0
+     characters.  Note that "interactive" scanners (see the `-B' and
+     `-I' flags) define the macro `YY_INTERACTIVE'.  If you redefine
+     `LexerInput()' and need to take different actions depending on
+     whether or not the scanner might be scanning an interactive input
+     source, you can test for the presence of this name via `#ifdef'.
+
+`virtual void LexerOutput( const char* buf, int size )'
+     writes out SIZE characters from the buffer BUF, which, while
+     NUL-terminated, may also contain "internal" NUL's if the scanner's
+     rules can match text with NUL's in them.
+
+`virtual void LexerError( const char* msg )'
+     reports a fatal error message.  The default version of this
+     function writes the message to the stream `cerr' and exits.
+
+   Note that a `yyFlexLexer' object contains its *entire* scanning
+state.  Thus you can use such objects to create reentrant scanners.
+You can instantiate multiple instances of the same `yyFlexLexer' class,
+and you can also combine multiple C++ scanner classes together in the
+same program using the `-P' option discussed above.  Finally, note that
+the `%array' feature is not available to C++ scanner classes; you must
+use `%pointer' (the default).
+
+   Here is an example of a simple C++ scanner:
+
+         // An example of using the flex C++ scanner class.
+     
+     %{
+     int mylineno = 0;
+     %}
+     
+     string  \"[^\n"]+\"
+     
+     ws      [ \t]+
+     
+     alpha   [A-Za-z]
+     dig     [0-9]
+     name    ({alpha}|{dig}|\$)({alpha}|{dig}|[_.\-/$])*
+     num1    [-+]?{dig}+\.?([eE][-+]?{dig}+)?
+     num2    [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
+     number  {num1}|{num2}
+     
+     %%
+     
+     {ws}    /* skip blanks and tabs */
+     
+     "/*"    {
+             int c;
+     
+             while((c = yyinput()) != 0)
+                 {
+                 if(c == '\n')
+                     ++mylineno;
+     
+                 else if(c == '*')
+                     {
+                     if((c = yyinput()) == '/')
+                         break;
+                     else
+                         unput(c);
+                     }
+                 }
+             }
+     
+     {number}  cout << "number " << YYText() << '\n';
+     
+     \n        mylineno++;
+     
+     {name}    cout << "name " << YYText() << '\n';
+     
+     {string}  cout << "string " << YYText() << '\n';
+     
+     %%
+     
+     int main( int /* argc */, char** /* argv */ )
+         {
+         FlexLexer* lexer = new yyFlexLexer;
+         while(lexer->yylex() != 0)
+             ;
+         return 0;
+         }
+
+   If you want to create multiple (different) lexer classes, you use
+the `-P' flag (or the `prefix=' option) to rename each `yyFlexLexer' to
+some other `xxFlexLexer'.  You then can include `<FlexLexer.h>' in your
+other sources once per lexer class, first renaming `yyFlexLexer' as
+follows:
+
+     #undef yyFlexLexer
+     #define yyFlexLexer xxFlexLexer
+     #include <FlexLexer.h>
+     
+     #undef yyFlexLexer
+     #define yyFlexLexer zzFlexLexer
+     #include <FlexLexer.h>
+
+   if, for example, you used `%option prefix="xx"' for one of your
+scanners and `%option prefix="zz"' for the other.
+
+   IMPORTANT: the present form of the scanning class is *experimental*
+and may change considerably between major releases.
+
+
+File: flex.info,  Node: Incompatibilities,  Next: Diagnostics,  Prev: C++,  Up: Top
+
+Incompatibilities with `lex' and POSIX
+======================================
+
+   `flex' is a rewrite of the AT&T Unix `lex' tool (the two
+implementations do not share any code, though), with some extensions
+and incompatibilities, both of which are of concern to those who wish
+to write scanners acceptable to either implementation.  Flex is fully
+compliant with the POSIX `lex' specification, except that when using
+`%pointer' (the default), a call to `unput()' destroys the contents of
+`yytext', which is counter to the POSIX specification.
+
+   In this section we discuss all of the known areas of incompatibility
+between flex, AT&T lex, and the POSIX specification.
+
+   `flex's' `-l' option turns on maximum compatibility with the
+original AT&T `lex' implementation, at the cost of a major loss in the
+generated scanner's performance.  We note below which incompatibilities
+can be overcome using the `-l' option.
+
+   `flex' is fully compatible with `lex' with the following exceptions:
+
+   - The undocumented `lex' scanner internal variable `yylineno' is not
+     supported unless `-l' or `%option yylineno' is used.  `yylineno'
+     should be maintained on a per-buffer basis, rather than a
+     per-scanner (single global variable) basis.  `yylineno' is not
+     part of the POSIX specification.
+
+   - The `input()' routine is not redefinable, though it may be called
+     to read characters following whatever has been matched by a rule.
+     If `input()' encounters an end-of-file the normal `yywrap()'
+     processing is done.  A "real" end-of-file is returned by `input()'
+     as `EOF'.
+
+     Input is instead controlled by defining the `YY_INPUT' macro.
+
+     The `flex' restriction that `input()' cannot be redefined is in
+     accordance with the POSIX specification, which simply does not
+     specify any way of controlling the scanner's input other than by
+     making an initial assignment to `yyin'.
+
+   - The `unput()' routine is not redefinable.  This restriction is in
+     accordance with POSIX.
+
+   - `flex' scanners are not as reentrant as `lex' scanners.  In
+     particular, if you have an interactive scanner and an interrupt
+     handler which long-jumps out of the scanner, and the scanner is
+     subsequently called again, you may get the following message:
+
+          fatal flex scanner internal error--end of buffer missed
+
+     To reenter the scanner, first use
+
+          yyrestart( yyin );
+
+     Note that this call will throw away any buffered input; usually
+     this isn't a problem with an interactive scanner.
+
+     Also note that flex C++ scanner classes *are* reentrant, so if
+     using C++ is an option for you, you should use them instead.  See
+     "Generating C++ Scanners" above for details.
+
+   - `output()' is not supported.  Output from the `ECHO' macro is done
+     to the file-pointer `yyout' (default `stdout').
+
+     `output()' is not part of the POSIX specification.
+
+   - `lex' does not support exclusive start conditions (%x), though
+     they are in the POSIX specification.
+
+   - When definitions are expanded, `flex' encloses them in
+     parentheses.  With lex, the following:
+
+          NAME    [A-Z][A-Z0-9]*
+          %%
+          foo{NAME}?      printf( "Found it\n" );
+          %%
+
+     will not match the string "foo" because when the macro is expanded
+     the rule is equivalent to "foo[A-Z][A-Z0-9]*?" and the precedence
+     is such that the '?' is associated with "[A-Z0-9]*".  With `flex',
+     the rule will be expanded to "foo([A-Z][A-Z0-9]*)?" and so the
+     string "foo" will match.
+
+     Note that if the definition begins with `^' or ends with `$' then
+     it is *not* expanded with parentheses, to allow these operators to
+     appear in definitions without losing their special meanings.  But
+     the `<s>', `/', and `<<EOF>>' operators cannot be used in a `flex'
+     definition.
+
+     Using `-l' results in the `lex' behavior of no parentheses around
+     the definition.
+
+     The POSIX specification is that the definition be enclosed in
+     parentheses.
+
+   - Some implementations of `lex' allow a rule's action to begin on a
+     separate line, if the rule's pattern has trailing whitespace:
+
+          %%
+          foo|bar<space here>
+            { foobar_action(); }
+
+     `flex' does not support this feature.
+
+   - The `lex' `%r' (generate a Ratfor scanner) option is not
+     supported.  It is not part of the POSIX specification.
+
+   - After a call to `unput()', `yytext' is undefined until the next
+     token is matched, unless the scanner was built using `%array'.
+     This is not the case with `lex' or the POSIX specification.  The
+     `-l' option does away with this incompatibility.
+
+   - The precedence of the `{}' (numeric range) operator is different.
+     `lex' interprets "abc{1,3}" as "match one, two, or three
+     occurrences of 'abc'", whereas `flex' interprets it as "match 'ab'
+     followed by one, two, or three occurrences of 'c'".  The latter is
+     in agreement with the POSIX specification.
+
+   - The precedence of the `^' operator is different.  `lex' interprets
+     "^foo|bar" as "match either 'foo' at the beginning of a line, or
+     'bar' anywhere", whereas `flex' interprets it as "match either
+     'foo' or 'bar' if they come at the beginning of a line".  The
+     latter is in agreement with the POSIX specification.
+
+   - The special table-size declarations such as `%a' supported by
+     `lex' are not required by `flex' scanners; `flex' ignores them.
+
+   - The name FLEX_SCANNER is #define'd so scanners may be written for
+     use with either `flex' or `lex'.  Scanners also include
+     `YY_FLEX_MAJOR_VERSION' and `YY_FLEX_MINOR_VERSION' indicating
+     which version of `flex' generated the scanner (for example, for the
+     2.5 release, these defines would be 2 and 5 respectively).
+
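+   As a hedged illustration of two of the points above, a specification
+meant for both tools might parenthesize its definitions explicitly
+(sidestepping the expansion difference) and test `FLEX_SCANNER' where
+behavior must differ.  The macro `SCANNER_KIND' is a made-up name used
+only for this sketch:
+
+     %{
+     #include <stdio.h>
+     #ifdef FLEX_SCANNER
+     #define SCANNER_KIND "flex"
+     #else
+     #define SCANNER_KIND "lex"
+     #endif
+     %}
+     NAME    ([A-Z][A-Z0-9]*)
+     %%
+     foo{NAME}?      printf( "Found it (%s)\n", SCANNER_KIND );
+     %%
+
+   With the parentheses written out, the '?' applies to the whole
+definition under either tool, so the string "foo" should match in both.
+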
+   The following `flex' features are not included in `lex' or the POSIX
+specification:
+
+     C++ scanners
+     %option
+     start condition scopes
+     start condition stacks
+     interactive/non-interactive scanners
+     yy_scan_string() and friends
+     yyterminate()
+     yy_set_interactive()
+     yy_set_bol()
+     YY_AT_BOL()
+     <<EOF>>
+     <*>
+     YY_DECL
+     YY_START
+     YY_USER_ACTION
+     YY_USER_INIT
+     #line directives
+     %{}'s around actions
+     multiple actions on a line
+
+plus almost all of the flex flags.  The last feature in the list refers
+to the fact that with `flex' you can put multiple actions on the same
+line, separated with semicolons, while with `lex', the following
+
+     foo    handle_foo(); ++num_foos_seen;
+
+is (rather surprisingly) truncated to
+
+     foo    handle_foo();
+
+   `flex' does not truncate the action.  Actions that are not enclosed
+in braces are simply terminated at the end of the line.
+
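+   One way to sidestep the difference, sketched here with the same
+hypothetical action names, is to enclose the statements in braces so
+that the whole group forms a single action:
+
+     foo    { handle_foo(); ++num_foos_seen; }
+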
+
+File: flex.info,  Node: Diagnostics,  Next: Files,  Prev: Incompatibilities,  Up: Top
+
+Diagnostics
+===========
+
+`warning, rule cannot be matched'
+     indicates that the given rule cannot be matched because it follows
+     other rules that will always match the same text as it.  For
+     example, in the following "foo" cannot be matched because it comes
+     after an identifier "catch-all" rule:
+
+          [a-z]+    got_identifier();
+          foo       got_foo();
+
+     Using `REJECT' in a scanner suppresses this warning.
+
+`warning, -s option given but default rule can be matched'
+     means that it is possible (perhaps only in a particular start
+     condition) that the default rule (match any single character) is
+     the only one that will match a particular input.  Since `-s' was
+     given, presumably this is not intended.
+
+`reject_used_but_not_detected undefined'
+`yymore_used_but_not_detected undefined'
+     These errors can occur at compile time.  They indicate that the
+     scanner uses `REJECT' or `yymore()' but that `flex' failed to
+     notice the fact: `flex' scanned the first two sections looking
+     for occurrences of these actions and found none, yet some were
+     introduced anyway (via a #include file, for example).  Use
+     `%option reject' or `%option yymore' to indicate to flex that
+     you really do use these features.
+
+`flex scanner jammed'
+     a scanner compiled with `-s' has encountered an input string which
+     wasn't matched by any of its rules.  This error can also occur due
+     to internal problems.
+
+`token too large, exceeds YYLMAX'
+     your scanner uses `%array' and one of its rules matched a string
+     longer than the `YYLMAX' constant (8K bytes by default).  You
+     can increase the value by #define'ing `YYLMAX' in the definitions
+     section of your `flex' input.
+
+`scanner requires -8 flag to use the character 'X''
+     Your scanner specification includes recognizing the 8-bit
+     character X and you did not specify the -8 flag, and your scanner
+     defaulted to 7-bit because you used the `-Cf' or `-CF' table
+     compression options.  See the discussion of the `-7' flag for
+     details.
+
+`flex scanner push-back overflow'
+     you used `unput()' to push back so much text that the scanner's
+     buffer could not hold both the pushed-back text and the current
+     token in `yytext'.  Ideally the scanner should dynamically resize
+     the buffer in this case, but at present it does not.
+
+`input buffer overflow, can't enlarge buffer because scanner uses REJECT'
+     the scanner was working on matching an extremely large token and
+     needed to expand the input buffer.  This doesn't work with
+     scanners that use `REJECT'.
+
+`fatal flex scanner internal error--end of buffer missed'
+     This can occur in a scanner which is reentered after a long-jump
+     has jumped out (or over) the scanner's activation frame.  Before
+     reentering the scanner, use:
+
+          yyrestart( yyin );
+
+     or, as noted above, switch to using the C++ scanner class.
+
+`too many start conditions in <> construct!'
+     you listed more start conditions in a <> construct than exist (so
+     you must have listed at least one of them twice).
+
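+   To illustrate the first diagnostic above: the warning about an
+unmatchable rule normally goes away when the more specific pattern is
+listed ahead of the catch-all, since `flex' prefers the earlier rule
+when two rules match text of the same length.  A sketch reusing the
+action names from that example:
+
+     foo       got_foo();
+     [a-z]+    got_identifier();
+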
+
+File: flex.info,  Node: Files,  Next: Deficiencies,  Prev: Diagnostics,  Up: Top
+
+Files
+=====
+
+`-lfl'
+     library with which scanners must be linked.
+
+`lex.yy.c'
+     generated scanner (called `lexyy.c' on some systems).
+
+`lex.yy.cc'
+     generated C++ scanner class, when using `-+'.
+
+`<FlexLexer.h>'
+     header file defining the C++ scanner base class, `FlexLexer', and
+     its derived class, `yyFlexLexer'.
+
+`flex.skl'
+     skeleton scanner.  This file is only used when building flex, not
+     when flex executes.
+
+`lex.backup'
+     backing-up information for `-b' flag (called `lex.bck' on some
+     systems).
+
+
+File: flex.info,  Node: Deficiencies,  Next: See also,  Prev: Files,  Up: Top
+
+Deficiencies / Bugs
+===================
+
+   Some trailing context patterns cannot be properly matched and
+generate warning messages ("dangerous trailing context").  These are
+patterns where the ending of the first part of the rule matches the
+beginning of the second part, such as "zx*/xy*", where the 'x*' matches
+the 'x' at the beginning of the trailing context.  (Note that the POSIX
+draft states that the text matched by such patterns is undefined.)
+
+   For some trailing context rules, parts which are actually
+fixed-length are not recognized as such, leading to the abovementioned
+performance loss.  In particular, parts using '|' or {n} (such as
+"foo{3}") are always considered variable-length.
+
+   Combining trailing context with the special '|' action can result in
+*fixed* trailing context being turned into the more expensive *variable*
+trailing context.  This happens, for example, in the following:
+
+     %%
+     abc      |
+     xyz/def
+
+   Use of `unput()' invalidates yytext and yyleng, unless the `%array'
+directive or the `-l' option has been used.
+
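+   A common workaround, sketched below under the assumption that the
+scanner uses the default `%pointer', is to copy the token before the
+first call to `unput()'.  The names `copy' and `len' are illustrative
+only:
+
+     %{
+     #include <stdio.h>
+     #include <stdlib.h>
+     #include <string.h>
+     %}
+     %%
+     [a-z]+;   {
+               /* save the token: unput() may clobber yytext, yyleng */
+               int len = yyleng;
+               char *copy = (char *) malloc( len + 1 );
+               if ( copy != NULL )
+                   {
+                   memcpy( copy, yytext, len + 1 );
+                   unput( ';' );          /* push the ';' back */
+                   copy[len - 1] = '\0';  /* drop it from the copy */
+                   printf( "word: %s\n", copy );
+                   free( copy );
+                   }
+               }
+     ";"       /* statement terminator, rescanned after unput() */
+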
+   Pattern-matching of NUL's is substantially slower than matching
+other characters.
+
+   Dynamic resizing of the input buffer is slow, as it entails
+rescanning all the text matched so far by the current (generally huge)
+token.
+
+   Due to both buffering of input and read-ahead, you cannot intermix
+calls to <stdio.h> routines, such as, for example, `getchar()', with
+`flex' rules and expect it to work.  Call `input()' instead.
+
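+   For instance, an action that discards the rest of a line could read
+the remaining characters through `input()'.  This sketch assumes the C
+scanner behavior described in this manual, where `input()' returns
+`EOF' at the real end of the input:
+
+     "#"     {
+             int c;
+
+             /* consume up to and including the newline */
+             while ( (c = input()) != '\n' && c != EOF )
+                 ;
+             }
+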
+   The total number of table entries reported by the `-v' flag excludes
+the entries needed to determine which rule has been matched.  The
+number of such entries is equal to the number of DFA states if the
+scanner does not use `REJECT', and somewhat greater than the number of
+states if it does.
+
+   `REJECT' cannot be used with the `-f' or `-F' options.
+
+   The `flex' internal algorithms need documentation.
+
+
+File: flex.info,  Node: See also,  Next: Author,  Prev: Deficiencies,  Up: Top
+
+See also
+========
+
+   `lex'(1), `yacc'(1), `sed'(1), `awk'(1).
+
+   John Levine, Tony Mason, and Doug Brown: Lex & Yacc; O'Reilly and
+Associates.  Be sure to get the 2nd edition.
+
+   M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator.
+
+   Alfred Aho, Ravi Sethi and Jeffrey Ullman: Compilers: Principles,
+Techniques and Tools; Addison-Wesley (1986).  Describes the
+pattern-matching techniques used by `flex' (deterministic finite
+automata).
+
+
+File: flex.info,  Node: Author,  Prev: See also,  Up: Top
+
+Author
+======
+
+   Vern Paxson, with the help of many ideas and much inspiration from
+Van Jacobson.  Original version by Jef Poskanzer.  The fast table
+representation is a partial implementation of a design done by Van
+Jacobson.  The implementation was done by Kevin Gong and Vern Paxson.
+
+   Thanks to the many `flex' beta-testers, feedbackers, and
+contributors, especially Francois Pinard, Casey Leedom, Stan Adermann,
+Terry Allen, David Barker-Plummer, John Basrai, Nelson H.F. Beebe,
+`benson@odi.com', Karl Berry, Peter A. Bigot, Simon Blanchard, Keith
+Bostic, Frederic Brehm, Ian Brockbank, Kin Cho, Nick Christopher, Brian
+Clapper, J.T. Conklin, Jason Coughlin, Bill Cox, Nick Cropper, Dave
+Curtis, Scott David Daniels, Chris G. Demetriou, Theo Deraadt, Mike
+Donahue, Chuck Doucette, Tom Epperly, Leo Eskin, Chris Faylor, Chris
+Flatters, Jon Forrest, Joe Gayda, Kaveh R. Ghazi, Eric Goldman,
+Christopher M.  Gould, Ulrich Grepel, Peer Griebel, Jan Hajic, Charles
+Hemphill, NORO Hideo, Jarkko Hietaniemi, Scott Hofmann, Jeff Honig,
+Dana Hudes, Eric Hughes, John Interrante, Ceriel Jacobs, Michal
+Jaegermann, Sakari Jalovaara, Jeffrey R. Jones, Henry Juengst, Klaus
+Kaempf, Jonathan I. Kamens, Terrence O Kane, Amir Katz,
+`ken@ken.hilco.com', Kevin B. Kenny, Steve Kirsch, Winfried Koenig,
+Marq Kole, Ronald Lamprecht, Greg Lee, Rohan Lenard, Craig Leres, John
+Levine, Steve Liddle, Mike Long, Mohamed el Lozy, Brian Madsen, Malte,
+Joe Marshall, Bengt Martensson, Chris Metcalf, Luke Mewburn, Jim
+Meyering, R.  Alexander Milowski, Erik Naggum, G.T. Nicol, Landon Noll,
+James Nordby, Marc Nozell, Richard Ohnemus, Karsten Pahnke, Sven Panne,
+Roland Pesch, Walter Pelissero, Gaumond Pierre, Esmond Pitt, Jef
+Poskanzer, Joe Rahmeh, Jarmo Raiha, Frederic Raimbault, Pat Rankin,
+Rick Richardson, Kevin Rodgers, Kai Uwe Rommel, Jim Roskind, Alberto
+Santini, Andreas Scherer, Darrell Schiebel, Raf Schietekat, Doug
+Schmidt, Philippe Schnoebelen, Andreas Schwab, Alex Siegel, Eckehard
+Stolz, Jan-Erik Strvmquist, Mike Stump, Paul Stuart, Dave Tallman, Ian
+Lance Taylor, Chris Thewalt, Richard M. Timoney, Jodi Tsai, Paul
+Tuinenga, Gary Weik, Frank Whaley, Gerhard Wilhelms, Kent Williams, Ken
+Yap, Ron Zellar, Nathan Zelle, David Zuhn, and those whose names have
+slipped my marginal mail-archiving skills but whose contributions are
+appreciated all the same.
+
+   Thanks to Keith Bostic, Jon Forrest, Noah Friedman, John Gilmore,
+Craig Leres, John Levine, Bob Mulcahy, G.T.  Nicol, Francois Pinard,
+Rich Salz, and Richard Stallman for help with various distribution
+headaches.
+
+   Thanks to Esmond Pitt and Earle Horton for 8-bit character support;
+to Benson Margulies and Fred Burke for C++ support; to Kent Williams
+and Tom Epperly for C++ class support; to Ove Ewerlid for support of
+NUL's; and to Eric Hughes for support of multiple buffers.
+
+   This work was primarily done when I was with the Real Time Systems
+Group at the Lawrence Berkeley Laboratory in Berkeley, CA.  Many thanks
+to all there for the support I received.
+
+   Send comments to `vern@ee.lbl.gov'.
+
+
+
+Tag Table:
+Node: Top1430
+Node: Name2808
+Node: Synopsis2933
+Node: Overview3145
+Node: Description4986
+Node: Examples5748
+Node: Format8896
+Node: Patterns11637
+Node: Matching18138
+Node: Actions21438
+Node: Generated scanner30560
+Node: Start conditions34988
+Node: Multiple buffers45069
+Node: End-of-file rules50975
+Node: Miscellaneous52508
+Node: User variables55279
+Node: YACC interface57651
+Node: Options58542
+Node: Performance78234
+Node: C++87532
+Node: Incompatibilities94993
+Node: Diagnostics101853
+Node: Files105094
+Node: Deficiencies105715
+Node: See also107684
+Node: Author108216
+
+End Tag Table
diff --git a/MISC/texinfo/flex.texi b/MISC/texinfo/flex.texi
new file mode 100644
index 0000000..23280b1
--- /dev/null
+++ b/MISC/texinfo/flex.texi
@@ -0,0 +1,3448 @@
+\input texinfo
+@c %**start of header
+@setfilename flex.info
+@settitle Flex - a scanner generator
+@c @finalout
+@c @setchapternewpage odd
+@c %**end of header
+
+@set EDITION 2.5
+@set UPDATED March 1995
+@set VERSION 2.5
+
+@c FIXME - Reread a printed copy with a red pen and patience.
+@c FIXME - Modify all "See ..." references and replace with @xref's.
+
+@ifinfo
+@format
+START-INFO-DIR-ENTRY
+* Flex: (flex).         A fast scanner generator.
+END-INFO-DIR-ENTRY
+@end format
+@end ifinfo
+
+@c Define new indices for commands, filenames, and options.
+@c @defcodeindex cm
+@c @defcodeindex fl
+@c @defcodeindex op
+
+@c Put everything in one index (arbitrarily chosen to be the concept index).
+@c @syncodeindex cm cp
+@c @syncodeindex fl cp
+@syncodeindex fn cp
+@syncodeindex ky cp
+@c @syncodeindex op cp
+@syncodeindex pg cp
+@syncodeindex vr cp
+
+@ifinfo
+This file documents Flex.
+
+Copyright (c) 1990 The Regents of the University of California.
+All rights reserved.
+
+This code is derived from software contributed to Berkeley by
+Vern Paxson.
+
+The United States Government has rights in this work pursuant
+to contract no. DE-AC03-76SF00098 between the United States
+Department of Energy and the University of California.
+
+Redistribution and use in source and binary forms with or without
+modification are permitted provided that: (1) source distributions
+retain this entire copyright notice and comment, and (2)
+distributions including binaries display the following
+acknowledgement:  ``This product includes software developed by the
+University of California, Berkeley and its contributors'' in the
+documentation or other materials provided with the distribution and
+in all advertising materials mentioning features or use of this
+software.  Neither the name of the University nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE.
+
+@ignore
+Permission is granted to process this file through TeX and print the
+results, provided the printed document carries copying permission
+notice identical to this one except for the removal of this paragraph
+(this paragraph not being relevant to the printed manual).
+
+@end ignore
+@end ifinfo
+
+@titlepage
+@title Flex, version @value{VERSION}
+@subtitle A fast scanner generator
+@subtitle Edition @value{EDITION}, @value{UPDATED}
+@author Vern Paxson
+
+@page
+@vskip 0pt plus 1filll
+Copyright @copyright{} 1990 The Regents of the University of California.
+All rights reserved.
+
+This code is derived from software contributed to Berkeley by
+Vern Paxson.
+
+The United States Government has rights in this work pursuant
+to contract no. DE-AC03-76SF00098 between the United States
+Department of Energy and the University of California.
+
+Redistribution and use in source and binary forms with or without
+modification are permitted provided that: (1) source distributions
+retain this entire copyright notice and comment, and (2)
+distributions including binaries display the following
+acknowledgement:  ``This product includes software developed by the
+University of California, Berkeley and its contributors'' in the
+documentation or other materials provided with the distribution and
+in all advertising materials mentioning features or use of this
+software.  Neither the name of the University nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE.
+@end titlepage
+
+@ifinfo
+
+@node Top, Name, (dir), (dir)
+@top flex
+
+@cindex scanner generator
+
+This manual documents @code{flex}.  It covers release @value{VERSION}.
+
+@menu
+* Name::                        Name
+* Synopsis::                    Synopsis
+* Overview::                    Overview
+* Description::                 Description
+* Examples::                    Some simple examples
+* Format::                      Format of the input file
+* Patterns::                    Patterns
+* Matching::                    How the input is matched
+* Actions::                     Actions
+* Generated scanner::           The generated scanner
+* Start conditions::            Start conditions
+* Multiple buffers::            Multiple input buffers
+* End-of-file rules::           End-of-file rules
+* Miscellaneous::               Miscellaneous macros
+* User variables::              Values available to the user
+* YACC interface::              Interfacing with @code{yacc}
+* Options::                     Options
+* Performance::                 Performance considerations
+* C++::                         Generating C++ scanners
+* Incompatibilities::           Incompatibilities with @code{lex} and POSIX
+* Diagnostics::                 Diagnostics
+* Files::                       Files
+* Deficiencies::                Deficiencies / Bugs
+* See also::                    See also
+* Author::                      Author
+@c * Index::                       Index
+@end menu
+
+@end ifinfo
+
+@node Name, Synopsis, Top, Top
+@section Name
+
+flex - fast lexical analyzer generator
+
+@node Synopsis, Overview, Name, Top
+@section Synopsis
+
+@example
+flex [-bcdfhilnpstvwBFILTV78+? -C[aefFmr] -ooutput -Pprefix -Sskeleton]
+[--help --version] [@var{filename} @dots{}]
+@end example
+
+@node Overview, Description, Synopsis, Top
+@section Overview
+
+This manual describes @code{flex}, a tool for generating programs
+that perform pattern-matching on text.  The manual
+includes both tutorial and reference sections:
+
+@table @asis
+@item Description
+a brief overview of the tool
+
+@item Some Simple Examples
+
+@item Format Of The Input File
+
+@item Patterns
+the extended regular expressions used by flex
+
+@item How The Input Is Matched
+the rules for determining what has been matched
+
+@item Actions
+how to specify what to do when a pattern is matched
+
+@item The Generated Scanner
+details regarding the scanner that flex produces;
+how to control the input source
+
+@item Start Conditions
+introducing context into your scanners, and
+managing "mini-scanners"
+
+@item Multiple Input Buffers
+how to manipulate multiple input sources; how to
+scan from strings instead of files
+
+@item End-of-file Rules
+special rules for matching the end of the input
+
+@item Miscellaneous Macros
+a summary of macros available to the actions
+
+@item Values Available To The User
+a summary of values available to the actions
+
+@item Interfacing With Yacc
+connecting flex scanners together with yacc parsers
+
+@item Options
+flex command-line options, and the "%option"
+directive
+
+@item Performance Considerations
+how to make your scanner go as fast as possible
+
+@item Generating C++ Scanners
+the (experimental) facility for generating C++
+scanner classes
+
+@item Incompatibilities With Lex And POSIX
+how flex differs from AT&T lex and the POSIX lex
+standard
+
+@item Diagnostics
+those error messages produced by flex (or scanners
+it generates) whose meanings might not be apparent
+
+@item Files
+files used by flex
+
+@item Deficiencies / Bugs
+known problems with flex
+
+@item See Also
+other documentation, related tools
+
+@item Author
+includes contact information
+@end table
+
+@node Description, Examples, Overview, Top
+@section Description
+
+@code{flex} is a tool for generating @dfn{scanners}: programs which
+recognize lexical patterns in text.  @code{flex} reads the given
+input files, or its standard input if no file names are
+given, for a description of a scanner to generate.  The
+description is in the form of pairs of regular expressions
+and C code, called @dfn{rules}. @code{flex} generates as output a C
+source file, @file{lex.yy.c}, which defines a routine @samp{yylex()}.
+This file is compiled and linked with the @samp{-lfl} library to
+produce an executable.  When the executable is run, it
+analyzes its input for occurrences of the regular
+expressions.  Whenever it finds one, it executes the
+corresponding C code.
+
+@node Examples, Format, Description, Top
+@section Some simple examples
+
+First some simple examples to get the flavor of how one
+uses @code{flex}.  The following @code{flex} input specifies a scanner
+which, whenever it encounters the string "username", will
+replace it with the user's login name:
+
+@example
+%%
+username    printf( "%s", getlogin() );
+@end example
+
+By default, any text not matched by a @code{flex} scanner is
+copied to the output, so the net effect of this scanner is
+to copy its input file to its output with each occurrence
+of "username" expanded.  In this input, there is just one
+rule.  "username" is the @var{pattern} and the "printf" is the
+@var{action}.  The "%%" marks the beginning of the rules.
+
+Here's another simple example:
+
+@example
+        int num_lines = 0, num_chars = 0;
+
+%%
+\n      ++num_lines; ++num_chars;
+.       ++num_chars;
+
+%%
+int main()
+        @{
+        yylex();
+        printf( "# of lines = %d, # of chars = %d\n",
+                num_lines, num_chars );
+        @}
+@end example
+
+This scanner counts the number of characters and the
+number of lines in its input (it produces no output other
+than the final report on the counts).  The first line
+declares two globals, "num_lines" and "num_chars", which
+are accessible both inside @samp{yylex()} and in the @samp{main()}
+routine declared after the second "%%".  There are two rules,
+one which matches a newline ("\n") and increments both the
+line count and the character count, and one which matches
+any character other than a newline (indicated by the "."
+regular expression).
+
+A somewhat more complicated example:
+
+@example
+/* scanner for a toy Pascal-like language */
+
+%@{
+/* need this for the calls to atoi() and atof() below */
+#include <stdlib.h>
+%@}
+
+DIGIT    [0-9]
+ID       [a-z][a-z0-9]*
+
+%%
+
+@{DIGIT@}+    @{
+            printf( "An integer: %s (%d)\n", yytext,
+                    atoi( yytext ) );
+            @}
+
+@{DIGIT@}+"."@{DIGIT@}*        @{
+            printf( "A float: %s (%g)\n", yytext,
+                    atof( yytext ) );
+            @}
+
+if|then|begin|end|procedure|function        @{
+            printf( "A keyword: %s\n", yytext );
+            @}
+
+@{ID@}        printf( "An identifier: %s\n", yytext );
+
+"+"|"-"|"*"|"/"   printf( "An operator: %s\n", yytext );
+
+"@{"[^@}\n]*"@}"     /* eat up one-line comments */
+
+[ \t\n]+          /* eat up whitespace */
+
+.           printf( "Unrecognized character: %s\n", yytext );
+
+%%
+
+int main( int argc, char **argv )
+    @{
+    ++argv, --argc;  /* skip over program name */
+    if ( argc > 0 )
+            yyin = fopen( argv[0], "r" );
+    else
+            yyin = stdin;
+
+    yylex();
+    @}
+@end example
+
+This is the beginning of a simple scanner for a language
+like Pascal.  It identifies different types of @var{tokens} and
+reports on what it has seen.
+
+The details of this example will be explained in the
+following sections.
+
+@node Format, Patterns, Examples, Top
+@section Format of the input file
+
+The @code{flex} input file consists of three sections, separated
+by a line with just @samp{%%} in it:
+
+@example
+definitions
+%%
+rules
+%%
+user code
+@end example
+
+The @dfn{definitions} section contains declarations of simple
+@dfn{name} definitions to simplify the scanner specification,
+and declarations of @dfn{start conditions}, which are explained
+in a later section.
+Name definitions have the form:
+
+@example
+name definition
+@end example
+
+The "name" is a word beginning with a letter or an
+underscore ('_') followed by zero or more letters, digits, '_',
+or '-' (dash).  The definition is taken to begin at the
+first non-white-space character following the name and
+to continue to the end of the line.  The definition can
+subsequently be referred to using "@{name@}", which will
+expand to "(definition)".  For example,
+
+@example
+DIGIT    [0-9]
+ID       [a-z][a-z0-9]*
+@end example
+
+@noindent
+defines "DIGIT" to be a regular expression which matches a
+single digit, and "ID" to be a regular expression which
+matches a letter followed by zero-or-more
+letters-or-digits.  A subsequent reference to
+
+@example
+@{DIGIT@}+"."@{DIGIT@}*
+@end example
+
+@noindent
+is identical to
+
+@example
+([0-9])+"."([0-9])*
+@end example
+
+@noindent
+and matches one-or-more digits followed by a '.' followed
+by zero-or-more digits.
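+
+The substitution described above can be sketched in plain C.  This is
+only an illustration of the textual expansion, not code taken from
+@code{flex}: the names @code{name_def} and @code{expand_names} are
+hypothetical, and the sketch deliberately ignores quoted strings,
+escapes, and definitions that refer to other definitions.
+
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry of the definitions section: a name and its definition. */
struct name_def {
    const char *name;
    const char *definition;
};

/*
 * Expand every {name} in `pattern` that matches an entry in `defs`
 * into "(definition)", writing the result to `out`.  Braces that do
 * not enclose a known name (for example the repeat count in r{2,5})
 * are copied unchanged.  Returns 0 on success, -1 if `out` is too small.
 */
int expand_names(const char *pattern, const struct name_def *defs,
                 size_t ndefs, char *out, size_t out_size)
{
    size_t used = 0;

    assert(out_size > 0);
    out[0] = '\0';

    while (*pattern != '\0') {
        const char *close = (*pattern == '{') ? strchr(pattern, '}') : NULL;
        const char *replacement = NULL;
        size_t i, name_len;

        if (close != NULL) {
            /* Look the text between the braces up in the table. */
            name_len = (size_t)(close - pattern - 1);
            for (i = 0; i < ndefs; ++i) {
                if (strlen(defs[i].name) == name_len &&
                    strncmp(defs[i].name, pattern + 1, name_len) == 0) {
                    replacement = defs[i].definition;
                    break;
                }
            }
        }

        if (replacement != NULL) {
            /* Known name: emit the parenthesized definition. */
            int n = snprintf(out + used, out_size - used, "(%s)", replacement);
            if (n < 0 || (size_t)n >= out_size - used)
                return -1;
            used += (size_t)n;
            pattern = close + 1;
        } else {
            /* Anything else is copied through one character at a time. */
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *pattern++;
            out[used] = '\0';
        }
    }
    return 0;
}
```
+
+The parentheses added around each definition are what keep a trailing
+operator such as @samp{+} applied to the whole definition rather than to
+its last element.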
+
+The @dfn{rules} section of the @code{flex} input contains a series of
+rules of the form:
+
+@example
+pattern   action
+@end example
+
+@noindent
+where the pattern must be unindented and the action must
+begin on the same line.
+
+See below for a further description of patterns and
+actions.
+
+Finally, the user code section is simply copied to
+@file{lex.yy.c} verbatim.  It is used for companion routines
+which call or are called by the scanner.  The presence of
+this section is optional; if it is missing, the second @samp{%%}
+in the input file may be skipped, too.
+
+In the definitions and rules sections, any @emph{indented} text or
+text enclosed in @samp{%@{} and @samp{%@}} is copied verbatim to the
+output (with the @samp{%@{@}}'s removed).  The @samp{%@{@}}'s must
+appear unindented on lines by themselves.
+
+In the rules section, any indented or %@{@} text appearing
+before the first rule may be used to declare variables
+which are local to the scanning routine and (after the
+declarations) code which is to be executed whenever the
+scanning routine is entered.  Other indented or %@{@} text
+in the rules section is still copied to the output, but its
+meaning is not well-defined and it may well cause
+compile-time errors (this feature is present for @code{POSIX} compliance;
+see below for other such features).
+
+In the definitions section (but not in the rules section),
+an unindented comment (i.e., a line beginning with "/*")
+is also copied verbatim to the output up to the next "*/".
+
+@node Patterns, Matching, Format, Top
+@section Patterns
+
+The patterns in the input are written using an extended
+set of regular expressions.  These are:
+
+@table @samp
+@item x
+match the character @samp{x}
+@item .
+any character (byte) except newline
+@item [xyz]
+a "character class"; in this case, the pattern
+matches either an @samp{x}, a @samp{y}, or a @samp{z}
+@item [abj-oZ]
+a "character class" with a range in it; matches
+an @samp{a}, a @samp{b}, any letter from @samp{j} through @samp{o},
+or a @samp{Z}
+@item [^A-Z]
+a "negated character class", i.e., any character
+but those in the class.  In this case, any
+character EXCEPT an uppercase letter.
+@item [^A-Z\n]
+any character EXCEPT an uppercase letter or
+a newline
+@item @var{r}*
+zero or more @var{r}'s, where @var{r} is any regular expression
+@item @var{r}+
+one or more @var{r}'s
+@item @var{r}?
+zero or one @var{r}'s (that is, "an optional @var{r}")
+@item @var{r}@{2,5@}
+anywhere from two to five @var{r}'s
+@item @var{r}@{2,@}
+two or more @var{r}'s
+@item @var{r}@{4@}
+exactly 4 @var{r}'s
+@item @{@var{name}@}
+the expansion of the "@var{name}" definition
+(see above)
+@item "[xyz]\"foo"
+the literal string: @samp{[xyz]"foo}
+@item \@var{x}
+if @var{x} is an @samp{a}, @samp{b}, @samp{f}, @samp{n}, @samp{r}, @samp{t}, or @samp{v},
+then the ANSI-C interpretation of \@var{x}.
+Otherwise, a literal @samp{@var{x}} (used to escape
+operators such as @samp{*})
+@item \0
+a NUL character (ASCII code 0)
+@item \123
+the character with octal value 123
+@item \x2a
+the character with hexadecimal value @code{2a}
+@item (@var{r})
+match an @var{r}; parentheses are used to override
+precedence (see below)
+@item @var{r}@var{s}
+the regular expression @var{r} followed by the
+regular expression @var{s}; called "concatenation"
+@item @var{r}|@var{s}
+either an @var{r} or an @var{s}
+@item @var{r}/@var{s}
+an @var{r} but only if it is followed by an @var{s}.  The text
+matched by @var{s} is included when determining whether this rule is
+the @dfn{longest match}, but is then returned to the input before
+the action is executed.  So the action only sees the text matched
+by @var{r}.  This type of pattern is called @dfn{trailing context}.
+(There are some combinations of @samp{@var{r}/@var{s}} that @code{flex}
+cannot match correctly; see notes in the Deficiencies / Bugs section
+below regarding "dangerous trailing context".)
+@item ^@var{r}
+an @var{r}, but only at the beginning of a line (i.e.,
+when just starting to scan, or right after a
+newline has been scanned).
+@item @var{r}$
+an @var{r}, but only at the end of a line (i.e., just
+before a newline).  Equivalent to "@var{r}/\n".
+
+Note that flex's notion of "newline" is exactly
+whatever the C compiler used to compile flex
+interprets '\n' as; in particular, on some DOS
+systems you must either filter out \r's in the
+input yourself, or explicitly use @var{r}/\r\n for "r$".
+@item <@var{s}>@var{r}
+an @var{r}, but only in start condition @var{s} (see
+below for discussion of start conditions)
+@item <@var{s1},@var{s2},@var{s3}>@var{r}
+same, but in any of start conditions @var{s1},
+@var{s2}, or @var{s3}
+@item <*>@var{r}
+an @var{r} in any start condition, even an exclusive one.
+@item <<EOF>>
+an end-of-file
+@item <@var{s1},@var{s2}><<EOF>>
+an end-of-file when in start condition @var{s1} or @var{s2}
+@end table
+
+Note that inside of a character class, all regular
+expression operators lose their special meaning except escape
+('\') and the character class operators, '-', ']', and, at
+the beginning of the class, '^'.
+
+The regular expressions listed above are grouped according
+to precedence, from highest precedence at the top to
+lowest at the bottom.  Those grouped together have equal
+precedence.  For example,
+
+@example
+foo|bar*
+@end example
+
+@noindent
+is the same as
+
+@example
+(foo)|(ba(r*))
+@end example
+
+@noindent
+since the '*' operator has higher precedence than
+concatenation, and concatenation higher than alternation ('|').
+This pattern therefore matches @emph{either} the string "foo" @emph{or}
+the string "ba" followed by zero-or-more r's.  To match
+"foo" or zero-or-more "bar"'s, use:
+
+@example
+foo|(bar)*
+@end example
+
+@noindent
+and to match zero-or-more "foo"'s-or-"bar"'s:
+
+@example
+(foo|bar)*
+@end example
+
+In addition to characters and ranges of characters,
+character classes can also contain character class
+@dfn{expressions}.  These are expressions enclosed inside @samp{[:} and @samp{:]}
+delimiters (which themselves must appear between the '['
+and ']' of the character class; other elements may occur
+inside the character class, too).  The valid expressions
+are:
+
+@example
+[:alnum:] [:alpha:] [:blank:]
+[:cntrl:] [:digit:] [:graph:]
+[:lower:] [:print:] [:punct:]
+[:space:] [:upper:] [:xdigit:]
+@end example
+
+These expressions all designate a set of characters
+equivalent to the corresponding standard C @samp{isXXX} function.  For
+example, @samp{[:alnum:]} designates those characters for which
+@samp{isalnum()} returns true, i.e., any alphabetic or numeric character.
+Some systems don't provide @samp{isblank()}, so flex defines
+@samp{[:blank:]} as a blank or a tab.
+
+For example, the following character classes are all
+equivalent:
+
+@example
+[[:alnum:]]
+[[:alpha:][:digit:]]
+[[:alpha:]0-9]
+[a-zA-Z0-9]
+@end example
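+
+The correspondence between the class expressions and the C
+@samp{isXXX} functions can be written out as a lookup.  The following is
+a minimal sketch, not part of @code{flex}; the function name
+@code{in_class_expr} is hypothetical, and the results depend on the C
+locale in effect.  It treats @samp{[:blank:]} as a blank or a tab, as
+described above.
+
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Return nonzero if character `c` belongs to the class expression
 * named `name` (the text between "[:" and ":]"), zero otherwise.
 */
int in_class_expr(const char *name, int c)
{
    unsigned char uc = (unsigned char)c;   /* isXXX() needs a non-negative value */

    assert(name != NULL);
    if (strcmp(name, "alnum") == 0)  return isalnum(uc) != 0;
    if (strcmp(name, "alpha") == 0)  return isalpha(uc) != 0;
    if (strcmp(name, "blank") == 0)  return uc == ' ' || uc == '\t';
    if (strcmp(name, "cntrl") == 0)  return iscntrl(uc) != 0;
    if (strcmp(name, "digit") == 0)  return isdigit(uc) != 0;
    if (strcmp(name, "graph") == 0)  return isgraph(uc) != 0;
    if (strcmp(name, "lower") == 0)  return islower(uc) != 0;
    if (strcmp(name, "print") == 0)  return isprint(uc) != 0;
    if (strcmp(name, "punct") == 0)  return ispunct(uc) != 0;
    if (strcmp(name, "space") == 0)  return isspace(uc) != 0;
    if (strcmp(name, "upper") == 0)  return isupper(uc) != 0;
    if (strcmp(name, "xdigit") == 0) return isxdigit(uc) != 0;
    return 0;   /* unknown expression name: matches nothing */
}
```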
+
+If your scanner is case-insensitive (the @samp{-i} flag), then
+@samp{[:upper:]} and @samp{[:lower:]} are equivalent to @samp{[:alpha:]}.
+
+Some notes on patterns:
+
+@itemize -
+@item
+A negated character class such as the example
+"[^A-Z]" above @emph{will match a newline} unless "\n" (or an
+equivalent escape sequence) is one of the
+characters explicitly present in the negated character
+class (e.g., "[^A-Z\n]").  This is unlike how many
+other regular expression tools treat negated
+character classes, but unfortunately the inconsistency
+is historically entrenched.  Matching newlines
+means that a pattern like [^"]* can match the
+entire input unless there's another quote in the
+input.
+
+@item
+A rule can have at most one instance of trailing
+context (the '/' operator or the '$' operator).
+The start condition, '^', and "<<EOF>>" patterns
+can only occur at the beginning of a pattern and,
+like '/' and '$', cannot be grouped
+inside parentheses.  A '^' which does not occur at
+the beginning of a rule or a '$' which does not
+occur at the end of a rule loses its special
+properties and is treated as a normal character.
+
+The following are illegal:
+
+@example
+foo/bar$
+<sc1>foo<sc2>bar
+@end example
+
+Note that the first of these can be written
+"foo/bar\n".
+
+The following will result in '$' or '^' being
+treated as a normal character:
+
+@example
+foo|(bar$)
+foo|^bar
+@end example
+
+If what's wanted is a "foo" or a
+bar-followed-by-a-newline, the following could be used (the special
+'|' action is explained below):
+
+@example
+foo      |
+bar$     /* action goes here */
+@end example
+
+A similar trick will work for matching a foo or a
+bar-at-the-beginning-of-a-line.
+@end itemize
+
+@node Matching, Actions, Patterns, Top
+@section How the input is matched
+
+When the generated scanner is run, it analyzes its input
+looking for strings which match any of its patterns.  If
+it finds more than one match, it takes the one matching
+the most text (for trailing context rules, this includes
+the length of the trailing part, even though it will then
+be returned to the input).  If it finds two or more
+matches of the same length, the rule listed first in the
+@code{flex} input file is chosen.
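+
+The selection rule (longest match first, earliest rule on a tie) can be
+sketched as follows.  This is an illustration only: literal strings
+stand in for patterns, and the function name @code{choose_rule} is
+hypothetical.  A generated scanner reaches the same decision with its
+tables rather than by trying each rule in turn.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Given `nrules` literal strings standing in for patterns, return the
 * index of the rule that would be chosen at the start of `input`:
 * the longest match wins, and among equal lengths the rule listed
 * first wins.  Returns -1 if no rule matches.  The matched length is
 * stored in *match_len.
 */
int choose_rule(const char *const *rules, size_t nrules,
                const char *input, size_t *match_len)
{
    int best = -1;
    size_t best_len = 0;
    size_t i;

    assert(match_len != NULL);
    for (i = 0; i < nrules; ++i) {
        size_t len = strlen(rules[i]);

        if (len == 0 || strncmp(input, rules[i], len) != 0)
            continue;               /* this rule does not match here */
        if (len > best_len) {       /* strictly longer, so earlier rules keep ties */
            best = (int)i;
            best_len = len;
        }
    }
    *match_len = best_len;
    return best;
}
```
+
+Because the comparison is strict, a later rule replaces the current
+choice only when it matches more text, which is exactly the tie-breaking
+order described above.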
+
+Once the match is determined, the text corresponding to
+the match (called the @var{token}) is made available in the
+global character pointer @code{yytext}, and its length in the
+global integer @code{yyleng}.  The @var{action} corresponding to the
+matched pattern is then executed (a more detailed
+description of actions follows), and then the remaining input is
+scanned for another match.
+
+If no match is found, then the @dfn{default rule} is executed:
+the next character in the input is considered matched and
+copied to the standard output.  Thus, the simplest legal
+@code{flex} input is:
+
+@example
+%%
+@end example
+
+which generates a scanner that simply copies its input
+(one character at a time) to its output.
+
+Note that @code{yytext} can be defined in two different ways:
+either as a character @emph{pointer} or as a character @emph{array}.
+You can control which definition @code{flex} uses by including
+one of the special directives @samp{%pointer} or @samp{%array} in the
+first (definitions) section of your flex input.  The
+default is @samp{%pointer}, unless you use the @samp{-l} lex
+compatibility option, in which case @code{yytext} will be an array.  The
+advantage of using @samp{%pointer} is substantially faster
+scanning and no buffer overflow when matching very large
+tokens (unless you run out of dynamic memory).  The
+disadvantage is that you are restricted in how your actions can
+modify @code{yytext} (see the next section), and calls to the
+@samp{unput()} function destroy the present contents of @code{yytext},
+which can be a considerable porting headache when moving
+between different @code{lex} versions.
+
+The advantage of @samp{%array} is that you can then modify @code{yytext}
+to your heart's content, and calls to @samp{unput()} do not
+destroy @code{yytext} (see below).  Furthermore, existing @code{lex}
+programs sometimes access @code{yytext} externally using
+declarations of the form:
+@example
+extern char yytext[];
+@end example
+This definition is erroneous when used with @samp{%pointer}, but
+correct for @samp{%array}.
+
+@samp{%array} defines @code{yytext} to be an array of @code{YYLMAX} characters,
+which defaults to a fairly large value.  You can change
+the size by simply #define'ing @code{YYLMAX} to a different value
+in the first section of your @code{flex} input.  As mentioned
+above, with @samp{%pointer} @code{yytext} grows dynamically to
+accommodate large tokens.  While this means your @samp{%pointer} scanner
+can accommodate very large tokens (such as matching entire
+blocks of comments), bear in mind that each time the
+scanner must resize @code{yytext} it also must rescan the entire
+token from the beginning, so matching such tokens can
+prove slow.  @code{yytext} presently does @emph{not} dynamically grow if
+a call to @samp{unput()} results in too much text being pushed
+back; instead, a run-time error results.
+
+Also note that you cannot use @samp{%array} with C++ scanner
+classes (the @code{c++} option; see below).
+
+@node Actions, Generated scanner, Matching, Top
+@section Actions
+
+Each pattern in a rule has a corresponding action, which
+can be any arbitrary C statement.  The pattern ends at the
+first non-escaped whitespace character; the remainder of
+the line is its action.  If the action is empty, then when
+the pattern is matched the input token is simply
+discarded.  For example, here is the specification for a
+program which deletes all occurrences of "zap me" from its
+input:
+
+@example
+%%
+"zap me"
+@end example
+
+(It will copy all other characters in the input to the
+output since they will be matched by the default rule.)
+
+Here is a program which compresses multiple blanks and
+tabs down to a single blank, and throws away whitespace
+found at the end of a line:
+
+@example
+%%
+[ \t]+        putchar( ' ' );
+[ \t]+$       /* ignore this token */
+@end example
+
+If the action contains a '@{', then the action spans till
+the balancing '@}' is found, and the action may cross
+multiple lines.  @code{flex} knows about C strings and comments and
+won't be fooled by braces found within them, but also
+allows actions to begin with @samp{%@{} and will consider the
+action to be all the text up to the next @samp{%@}} (regardless of
+ordinary braces inside the action).
+
+An action consisting solely of a vertical bar ('|') means
+"same as the action for the next rule." See below for an
+illustration.
+
+Actions can include arbitrary C code, including @code{return}
+statements to return a value to whatever routine called
+@samp{yylex()}.  Each time @samp{yylex()} is called it continues
+processing tokens from where it last left off until it either
+reaches the end of the file or executes a return.
+
+Actions are free to modify @code{yytext} except for lengthening
+it (adding characters to its end--these will overwrite
+later characters in the input stream).  This however does
+not apply when using @samp{%array} (see above); in that case,
+@code{yytext} may be freely modified in any way.
+
+Actions are free to modify @code{yyleng} except they should not
+do so if the action also includes use of @samp{yymore()} (see
+below).
+
+There are a number of special directives which can be
+included within an action:
+
+@itemize -
+@item
+@samp{ECHO} copies @code{yytext} to the scanner's output.
+
+@item
+@code{BEGIN} followed by the name of a start condition
+places the scanner in the corresponding start
+condition (see below).
+
+@item
+@code{REJECT} directs the scanner to proceed on to the
+"second best" rule which matched the input (or a
+prefix of the input).  The rule is chosen as
+described above in "How the Input is Matched", and
+@code{yytext} and @code{yyleng} set up appropriately.  It may
+either be one which matched as much text as the
+originally chosen rule but came later in the @code{flex}
+input file, or one which matched less text.  For
+example, the following will both count the words in
+the input and call the routine special() whenever
+"frob" is seen:
+
+@example
+        int word_count = 0;
+%%
+
+frob        special(); REJECT;
+[^ \t\n]+   ++word_count;
+@end example
+
+Without the @code{REJECT}, any "frob"'s in the input would
+not be counted as words, since the scanner normally
+executes only one action per token.  Multiple
+@code{REJECT}s are allowed, each one finding the next
+best choice to the currently active rule.  For
+example, when the following scanner scans the token
+"abcd", it will write "abcdabcaba" to the output:
+
+@example
+%%
+a        |
+ab       |
+abc      |
+abcd     ECHO; REJECT;
+.|\n     /* eat up any unmatched character */
+@end example
+
+(The first three rules share the fourth's action
+since they use the special '|' action.)  @code{REJECT} is
+a particularly expensive feature in terms of
+scanner performance; if it is used in @emph{any} of the
+scanner's actions it will slow down @emph{all} of the
+scanner's matching.  Furthermore, @code{REJECT} cannot be used
+with the @samp{-Cf} or @samp{-CF} options (see below).
+
+Note also that unlike the other special actions,
+@code{REJECT} is a @emph{branch}; code immediately following it
+in the action will @emph{not} be executed.
+
+@item
+@samp{yymore()} tells the scanner that the next time it
+matches a rule, the corresponding token should be
+@emph{appended} onto the current value of @code{yytext} rather
+than replacing it.  For example, given the input
+"mega-kludge" the following will write
+"mega-mega-kludge" to the output:
+
+@example
+%%
+mega-    ECHO; yymore();
+kludge   ECHO;
+@end example
+
+First "mega-" is matched and echoed to the output.
+Then "kludge" is matched, but the previous "mega-"
+is still hanging around at the beginning of @code{yytext}
+so the @samp{ECHO} for the "kludge" rule will actually
+write "mega-kludge".
+@end itemize
+
+Two notes regarding use of @samp{yymore()}.  First, @samp{yymore()}
+depends on the value of @code{yyleng} correctly reflecting the
+size of the current token, so you must not modify @code{yyleng}
+if you are using @samp{yymore()}.  Second, the presence of
+@samp{yymore()} in the scanner's action entails a minor
+performance penalty in the scanner's matching speed.
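+
+To see where the output "abcdabcaba" in the @code{REJECT} example comes
+from, the order in which the candidate rules fire can be simulated in
+plain C.  The sketch below is an illustration under simplifying
+assumptions, not how @code{flex} implements @code{REJECT}: the four
+literal rules are hard-coded, and the function name
+@code{simulate_reject} is hypothetical.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simulate the four literal rules "a", "ab", "abc", "abcd" (each with
 * the action ECHO; REJECT;) followed by a catch-all rule that matches
 * any single character and discards it.  Writes what would be echoed
 * to `out` and returns 0, or -1 if `out` is too small.
 */
int simulate_reject(const char *input, char *out, size_t out_size)
{
    static const char *const literals[] = { "a", "ab", "abc", "abcd" };
    const size_t nliterals = sizeof literals / sizeof literals[0];
    size_t used = 0;

    assert(out_size > 0);
    out[0] = '\0';

    while (*input != '\0') {
        size_t len;

        /* Try candidates from longest to shortest; literals[len - 1]
         * is the literal of length len.  Each one ends in REJECT, so
         * after echoing we go on to the next-best candidate. */
        for (len = nliterals; len >= 1; --len) {
            const char *lit = literals[len - 1];

            if (strncmp(input, lit, len) == 0) {
                if (used + len >= out_size)
                    return -1;
                memcpy(out + used, lit, len);   /* ECHO */
                used += len;
                out[used] = '\0';
            }
        }

        /* Last candidate: the catch-all rule consumes one character
         * without echoing it. */
        ++input;
    }
    return 0;
}
```
+
+Called with the input "abcd", the sketch echoes "abcd", "abc", "ab", and
+"a" in turn before the catch-all rule consumes the first character, and
+the remaining characters match only the catch-all rule, giving
+"abcdabcaba".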
+
+@itemize -
+@item
+@samp{yyless(n)} returns all but the first @var{n} characters of
+the current token back to the input stream, where
+they will be rescanned when the scanner looks for
+the next match.  @code{yytext} and @code{yyleng} are adjusted
+appropriately (e.g., @code{yyleng} will now be equal to @var{n}).
+For example, on the input "foobar" the
+following will write out "foobarbar":
+
+@example
+%%
+foobar    ECHO; yyless(3);
+[a-z]+    ECHO;
+@end example
+
+An argument of 0 to @code{yyless} will cause the entire
+current input string to be scanned again.  Unless
+you've changed how the scanner will subsequently
+process its input (using @code{BEGIN}, for example), this
+will result in an endless loop.
+
+Note that @code{yyless} is a macro and can only be used in the
+flex input file, not from other source files.
+
+@item
+@samp{unput(c)} puts the character @code{c} back onto the input
+stream.  It will be the next character scanned.
+The following action will take the current token
+and cause it to be rescanned enclosed in
+parentheses.
+
+@example
+@{
+int i;
+/* Copy yytext because unput() trashes yytext */
+char *yycopy = strdup( yytext );
+unput( ')' );
+for ( i = yyleng - 1; i >= 0; --i )
+    unput( yycopy[i] );
+unput( '(' );
+free( yycopy );
+@}
+@end example
+
+Note that since each @samp{unput()} puts the given
+character back at the @emph{beginning} of the input stream,
+pushing back strings must be done back-to-front.
+An important potential problem when using @samp{unput()} is that
+if you are using @samp{%pointer} (the default), a call to @samp{unput()}
+@emph{destroys} the contents of @code{yytext}, starting with its
+rightmost character and devouring one character to the left
+with each call.  If you need the value of yytext preserved
+after a call to @samp{unput()} (as in the above example), you
+must either first copy it elsewhere, or build your scanner
+using @samp{%array} instead (see How The Input Is Matched).
+
+Finally, note that you cannot put back @code{EOF} to attempt to
+mark the input stream with an end-of-file.
+
+@item
+@samp{input()} reads the next character from the input
+stream.  For example, the following is one way to
+eat up C comments:
+
+@example
+%%
+"/*"        @{
+            register int c;
+
+            for ( ; ; )
+                @{
+                while ( (c = input()) != '*' &&
+                        c != EOF )
+                    ;    /* eat up text of comment */
+
+                if ( c == '*' )
+                    @{
+                    while ( (c = input()) == '*' )
+                        ;
+                    if ( c == '/' )
+                        break;    /* found the end */
+                    @}
+
+                if ( c == EOF )
+                    @{
+                    error( "EOF in comment" );
+                    break;
+                    @}
+                @}
+            @}
+@end example
+
+(Note that if the scanner is compiled using @samp{C++},
+then @samp{input()} is instead referred to as @samp{yyinput()},
+in order to avoid a name clash with the @samp{C++} stream
+by the name of @code{input}.)
+
+@item
+@code{YY_FLUSH_BUFFER}
+flushes the scanner's internal buffer so that the next time the scanner
+attempts to match a token, it will first refill the buffer using
+@code{YY_INPUT} (see The Generated Scanner, below).  This action is
+a special case of the more general @samp{yy_flush_buffer()} function,
+described below in the section Multiple Input Buffers.
+
+@item
+@samp{yyterminate()} can be used in lieu of a return
+statement in an action.  It terminates the scanner
+and returns a 0 to the scanner's caller, indicating
+"all done".  By default, @samp{yyterminate()} is also
+called when an end-of-file is encountered.  It is a
+macro and may be redefined.
+@end itemize
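+
+The reason pushed-back strings must be fed to @samp{unput()}
+back-to-front can be seen with a small last-in, first-out buffer.  The
+sketch below is a stand-alone illustration, not the scanner's real input
+buffer; @code{struct pushback}, @code{push_char}, @code{next_char}, and
+@code{push_parenthesized} are hypothetical names.
+
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PUSHBACK_MAX 64

/* A last-in, first-out pushback buffer standing in for the input stream. */
struct pushback {
    char data[PUSHBACK_MAX];
    size_t count;
};

/* Push one character back; it becomes the next character read. */
void push_char(struct pushback *pb, char c)
{
    assert(pb->count < PUSHBACK_MAX);
    pb->data[pb->count++] = c;
}

/* Read the next character, or return -1 when nothing is left. */
int next_char(struct pushback *pb)
{
    return pb->count > 0 ? (unsigned char)pb->data[--pb->count] : -1;
}

/*
 * Push `token` back wrapped in parentheses.  The closing parenthesis
 * goes first and the opening one last, with the token's characters
 * pushed from its end to its start, so that reading yields "(token)".
 */
void push_parenthesized(struct pushback *pb, const char *token)
{
    size_t i = strlen(token);

    push_char(pb, ')');
    while (i > 0)
        push_char(pb, token[--i]);
    push_char(pb, '(');
}
```
+
+After @code{push_parenthesized(&pb, "abc")}, successive calls to
+@code{next_char} return the characters of "(abc)" in that order.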
+
+@node Generated scanner, Start conditions, Actions, Top
+@section The generated scanner
+
+The output of @code{flex} is the file @file{lex.yy.c}, which contains
+the scanning routine @samp{yylex()}, a number of tables used by
+it for matching tokens, and a number of auxiliary routines
+and macros.  By default, @samp{yylex()} is declared as follows:
+
+@example
+int yylex()
+    @{
+    @dots{} various definitions and the actions in here @dots{}
+    @}
+@end example
+
+(If your environment supports function prototypes, then it
+will be "int yylex( void )".)  This definition may be
+changed by defining the "YY_DECL" macro.  For example, you
+could use:
+
+@example
+#define YY_DECL float lexscan( a, b ) float a, b;
+@end example
+
+to give the scanning routine the name @code{lexscan}, returning a
+float, and taking two floats as arguments.  Note that if
+you give arguments to the scanning routine using a
+K&R-style/non-prototyped function declaration, you must
+terminate the definition with a semi-colon (@samp{;}).
+
+Whenever @samp{yylex()} is called, it scans tokens from the
+global input file @code{yyin} (which defaults to stdin).  It
+continues until it either reaches an end-of-file (at which
+point it returns the value 0) or one of its actions
+executes a @code{return} statement.
+
+If the scanner reaches an end-of-file, subsequent calls are undefined
+unless either @code{yyin} is pointed at a new input file (in which case
+scanning continues from that file), or @samp{yyrestart()} is called.
+@samp{yyrestart()} takes one argument, a @samp{FILE *} pointer (which
+can be a null pointer, if you've set up @code{YY_INPUT} to scan from a source
+other than @code{yyin}), and initializes @code{yyin} for scanning from
+that file.  Essentially there is no difference between just assigning
+@code{yyin} to a new input file or using @samp{yyrestart()} to do so;
+the latter is available for compatibility with previous versions of
+@code{flex}, and because it can be used to switch input files in the
+middle of scanning.  It can also be used to throw away the current
+input buffer, by calling it with an argument of @code{yyin}; but
+better is to use @code{YY_FLUSH_BUFFER} (see above).  Note that
+@samp{yyrestart()} does @emph{not} reset the start condition to
+@code{INITIAL} (see Start Conditions, below).
+
+
+If @samp{yylex()} stops scanning due to executing a @code{return}
+statement in one of the actions, the scanner may then be called
+again and it will resume scanning where it left off.
+
+By default (and for purposes of efficiency), the scanner
+uses block-reads rather than simple @samp{getc()} calls to read
+characters from @code{yyin}.  The nature of how it gets its input
+can be controlled by defining the @code{YY_INPUT} macro.
+YY_INPUT's calling sequence is
+"YY_INPUT(buf,result,max_size)".  Its action is to place
+up to @var{max_size} characters in the character array @var{buf} and
+return in the integer variable @var{result} either the number of
+characters read or the constant YY_NULL (0 on Unix
+systems) to indicate EOF.  The default YY_INPUT reads from
+the global file-pointer "yyin".
+
+A sample definition of YY_INPUT (in the definitions
+section of the input file):
+
+@example
+%@{
+#define YY_INPUT(buf,result,max_size) \
+    @{ \
+    int c = getchar(); \
+    result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
+    @}
+%@}
+@end example
+
+This definition will change the input processing to occur
+one character at a time.
+
+When the scanner receives an end-of-file indication from
+YY_INPUT, it then checks the @samp{yywrap()} function.  If
+@samp{yywrap()} returns false (zero), then it is assumed that the
+function has gone ahead and set up @code{yyin} to point to
+another input file, and scanning continues.  If it returns
+true (non-zero), then the scanner terminates, returning 0
+to its caller.  Note that in either case, the start
+condition remains unchanged; it does @emph{not} revert to @code{INITIAL}.
+
+If you do not supply your own version of @samp{yywrap()}, then you
+must either use @samp{%option noyywrap} (in which case the scanner
+behaves as though @samp{yywrap()} returned 1), or you must link with
+@samp{-lfl} to obtain the default version of the routine, which always
+returns 1.
+
+Three routines are available for scanning from in-memory
+buffers rather than files: @samp{yy_scan_string()},
+@samp{yy_scan_bytes()}, and @samp{yy_scan_buffer()}.  See the discussion
+of them below in the section Multiple Input Buffers.
+
+The scanner writes its @samp{ECHO} output to the @code{yyout} global
+(default, stdout), which may be redefined by the user
+simply by assigning it to some other @code{FILE} pointer.
+
+@node Start conditions, Multiple buffers, Generated scanner, Top
+@section Start conditions
+
+@code{flex} provides a mechanism for conditionally activating
+rules.  Any rule whose pattern is prefixed with "<sc>"
+will only be active when the scanner is in the start
+condition named "sc".  For example,
+
+@example
+<STRING>[^"]*        @{ /* eat up the string body ... */
+            @dots{}
+            @}
+@end example
+
+@noindent
+will be active only when the scanner is in the "STRING"
+start condition, and
+
+@example
+<INITIAL,STRING,QUOTE>\.        @{ /* handle an escape ... */
+            @dots{}
+            @}
+@end example
+
+@noindent
+will be active only when the current start condition is
+either "INITIAL", "STRING", or "QUOTE".
+
+Start conditions are declared in the definitions (first)
+section of the input using unindented lines beginning with
+either @samp{%s} or @samp{%x} followed by a list of names.  The former
+declares @emph{inclusive} start conditions, the latter @emph{exclusive}
+start conditions.  A start condition is activated using
+the @code{BEGIN} action.  Until the next @code{BEGIN} action is
+executed, rules with the given start condition will be active
+and rules with other start conditions will be inactive.
+If the start condition is @emph{inclusive}, then rules with no
+start conditions at all will also be active.  If it is
+@emph{exclusive}, then @emph{only} rules qualified with the start
+condition will be active.  A set of rules contingent on the
+same exclusive start condition describes a scanner which is
+independent of any of the other rules in the @code{flex} input.
+Because of this, exclusive start conditions make it easy
+to specify "mini-scanners" which scan portions of the
+input that are syntactically different from the rest
+(e.g., comments).
+
+If the distinction between inclusive and exclusive start
+conditions is still a little vague, here's a simple
+example illustrating the connection between the two.  The set
+of rules:
+
+@example
+%s example
+%%
+
+<example>foo   do_something();
+
+bar            something_else();
+@end example
+
+@noindent
+is equivalent to
+
+@example
+%x example
+%%
+
+<example>foo   do_something();
+
+<INITIAL,example>bar    something_else();
+@end example
+
+Without the @samp{<INITIAL,example>} qualifier, the @samp{bar} pattern
+in the second example wouldn't be active (i.e., couldn't match) when
+in start condition @samp{example}.  If we just used @samp{<example>}
+to qualify @samp{bar}, though, then it would only be active in
+@samp{example} and not in @code{INITIAL}, while in the first example
+it's active in both, because in the first example the @samp{example}
+starting condition is an @emph{inclusive} (@samp{%s}) start condition.
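+
+The difference between the two kinds of start condition comes down to
+one test: whether a rule with no start-condition prefix is active.  The
+following plain C sketch spells that test out.  It is an illustration
+only; the enumeration values, @code{struct rule}, and
+@code{rule_is_active} are hypothetical and do not correspond to anything
+in the generated scanner.
+
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical numbering for this sketch; flex assigns its own values. */
enum { SC_INITIAL = 0, SC_EXAMPLE = 1, SC_COUNT = 2 };

struct rule {
    const int *conds;   /* start conditions qualifying the rule */
    size_t nconds;      /* 0 means the rule has no <...> prefix */
};

/*
 * Return nonzero if `r` is active while the scanner is in start
 * condition `current`.  exclusive[sc] is nonzero for conditions
 * declared with %x and zero for those declared with %s and for INITIAL.
 */
int rule_is_active(const struct rule *r, int current, const int *exclusive)
{
    size_t i;

    assert(current >= 0 && current < SC_COUNT);
    if (r->nconds == 0)
        return !exclusive[current];   /* unqualified: active unless exclusive */
    for (i = 0; i < r->nconds; ++i)
        if (r->conds[i] == current)
            return 1;                 /* explicitly qualified for this condition */
    return 0;
}
```
+
+With @samp{example} declared inclusive, an unqualified @samp{bar} rule
+is active in both @code{INITIAL} and @samp{example}; with it declared
+exclusive, the same rule is active only in @code{INITIAL} unless it is
+explicitly qualified.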
+
+Also note that the special start-condition specifier @samp{<*>}
+matches every start condition.  Thus, the above example
+could also have been written:
+
+@example
+%x example
+%%
+
+<example>foo   do_something();
+
+<*>bar    something_else();
+@end example
+
+The default rule (to @samp{ECHO} any unmatched character) remains
+active in start conditions.  It is equivalent to:
+
+@example
+<*>.|\n     ECHO;
+@end example
+
+@samp{BEGIN(0)} returns to the original state where only the
+rules with no start conditions are active.  This state can
+also be referred to as the start-condition "INITIAL", so
+@samp{BEGIN(INITIAL)} is equivalent to @samp{BEGIN(0)}.  (The
+parentheses around the start condition name are not required but
+are considered good style.)
+
+@code{BEGIN} actions can also be given as indented code at the
+beginning of the rules section.  For example, the
+following will cause the scanner to enter the "SPECIAL" start
+condition whenever @samp{yylex()} is called and the global
+variable @code{enter_special} is true:
+
+@example
+        int enter_special;
+
+%x SPECIAL
+%%
+        if ( enter_special )
+            BEGIN(SPECIAL);
+
+<SPECIAL>blahblahblah
+@dots{}more rules follow@dots{}
+@end example
+
+To illustrate the uses of start conditions, here is a
+scanner which provides two different interpretations of a
+string like "123.456".  By default it will treat it as
+three tokens, the integer "123", a dot ('.'), and the
+integer "456".  But if the string is preceded earlier in
+the line by the string "expect-floats" it will treat it as
+a single token, the floating-point number 123.456:
+
+@example
+%@{
+#include <stdlib.h>
+%@}
+%s expect
+
+%%
+expect-floats        BEGIN(expect);
+
+<expect>[0-9]+"."[0-9]+      @{
+            printf( "found a float, = %f\n",
+                    atof( yytext ) );
+            @}
+<expect>\n           @{
+            /* that's the end of the line, so
+             * we need another "expect-floats"
+             * before we'll recognize any more
+             * numbers
+             */
+            BEGIN(INITIAL);
+            @}
+
+[0-9]+      @{
+            printf( "found an integer, = %d\n",
+                    atoi( yytext ) );
+            @}
+
+"."         printf( "found a dot\n" );
+@end example
+
+Here is a scanner which recognizes (and discards) C
+comments while maintaining a count of the current input line.
+
+@example
+%x comment
+%%
+        int line_num = 1;
+
+"/*"         BEGIN(comment);
+
+<comment>[^*\n]*        /* eat anything that's not a '*' */
+<comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
+<comment>\n             ++line_num;
+<comment>"*"+"/"        BEGIN(INITIAL);
+@end example
+
+This scanner goes to a bit of trouble to match as much
+text as possible with each rule.  In general, when
+attempting to write a high-speed scanner try to match as
+much as possible in each rule, as it's a big win.
+
+Note that start condition names are really integer values
+and can be stored as such.  Thus, the above could be
+extended in the following fashion:
+
+@example
+%x comment foo
+%%
+        int line_num = 1;
+        int comment_caller;
+
+"/*"         @{
+             comment_caller = INITIAL;
+             BEGIN(comment);
+             @}
+
+@dots{}
+
+<foo>"/*"    @{
+             comment_caller = foo;
+             BEGIN(comment);
+             @}
+
+<comment>[^*\n]*        /* eat anything that's not a '*' */
+<comment>"*"+[^*/\n]*   /* eat up '*'s not followed by '/'s */
+<comment>\n             ++line_num;
+<comment>"*"+"/"        BEGIN(comment_caller);
+@end example
+
+Furthermore, you can access the current start condition
+using the integer-valued @code{YY_START} macro.  For example, the
+above assignments to @code{comment_caller} could instead be
+written
+
+@example
+comment_caller = YY_START;
+@end example
+
+Flex provides @code{YYSTATE} as an alias for @code{YY_START} (since that
+is what's used by AT&T @code{lex}).
+
+Note that start conditions do not have their own
+name-space; %s's and %x's declare names in the same fashion as
+#define's.
+
+Finally, here's an example of how to match C-style quoted
+strings using exclusive start conditions, including
+expanded escape sequences (but not including checking for
+a string that's too long):
+
+@example
+%x str
+
+%%
+        char string_buf[MAX_STR_CONST];
+        char *string_buf_ptr;
+
+\"      string_buf_ptr = string_buf; BEGIN(str);
+
+<str>\"        @{ /* saw closing quote - all done */
+        BEGIN(INITIAL);
+        *string_buf_ptr = '\0';
+        /* return string constant token type and
+         * value to parser
+         */
+        @}
+
+<str>\n        @{
+        /* error - unterminated string constant */
+        /* generate error message */
+        @}
+
+<str>\\[0-7]@{1,3@} @{
+        /* octal escape sequence */
+        unsigned int result;    /* "%o" expects an unsigned int */
+
+        (void) sscanf( yytext + 1, "%o", &result );
+
+        if ( result > 0xff )
+                @{
+                /* error, constant is out-of-bounds */
+                @}
+        else
+                *string_buf_ptr++ = result;
+        @}
+
+<str>\\[0-9]+ @{
+        /* generate error - bad escape sequence; something
+         * like '\48' or '\0777777'
+         */
+        @}
+
+<str>\\n  *string_buf_ptr++ = '\n';
+<str>\\t  *string_buf_ptr++ = '\t';
+<str>\\r  *string_buf_ptr++ = '\r';
+<str>\\b  *string_buf_ptr++ = '\b';
+<str>\\f  *string_buf_ptr++ = '\f';
+
+<str>\\(.|\n)  *string_buf_ptr++ = yytext[1];
+
+<str>[^\\\n\"]+        @{
+        char *yptr = yytext;
+
+        while ( *yptr )
+                *string_buf_ptr++ = *yptr++;
+        @}
+@end example
+
+Often, such as in some of the examples above, you wind up
+writing a whole bunch of rules all preceded by the same
+start condition(s).  Flex makes this a little easier and
+cleaner by introducing a notion of start condition @dfn{scope}.
+A start condition scope is begun with:
+
+@example
+<SCs>@{
+@end example
+
+@noindent
+where SCs is a list of one or more start conditions.
+Inside the start condition scope, every rule automatically
+has the prefix @samp{<SCs>} applied to it, until a @samp{@}} which
+matches the initial @samp{@{}.  So, for example,
+
+@example
+<ESC>@{
+    "\\n"   return '\n';
+    "\\r"   return '\r';
+    "\\f"   return '\f';
+    "\\0"   return '\0';
+@}
+@end example
+
+@noindent
+is equivalent to:
+
+@example
+<ESC>"\\n"  return '\n';
+<ESC>"\\r"  return '\r';
+<ESC>"\\f"  return '\f';
+<ESC>"\\0"  return '\0';
+@end example
+
+Start condition scopes may be nested.
+
+Three routines are available for manipulating stacks of
+start conditions:
+
+@table @samp
+@item void yy_push_state(int new_state)
+pushes the current start condition onto the top of
+the start condition stack and switches to @var{new_state}
+as though you had used @samp{BEGIN new_state} (recall that
+start condition names are also integers).
+
+@item void yy_pop_state()
+pops the top of the stack and switches to it via
+@code{BEGIN}.
+
+@item int yy_top_state()
+returns the top of the stack without altering the
+stack's contents.
+@end table
+
+The start condition stack grows dynamically and so has no
+built-in size limitation.  If memory is exhausted, program
+execution aborts.
+
+To use start condition stacks, your scanner must include a
+@samp{%option stack} directive (see Options below).
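+
+As a rough sketch of how the stack routines fit together, the
+following fragment discards comments that are allowed to nest
+(unlike real C comments).  The start condition name @code{comment}
+and the counter @code{line_num} are simply carried over from the
+earlier examples for illustration:
+
+@example
+%option stack
+%x comment
+
+%@{
+int line_num = 1;
+%@}
+
+%%
+"/*"            yy_push_state(comment);
+
+<comment>@{
+    "/*"        yy_push_state(comment);  /* one level deeper */
+    "*/"        yy_pop_state();          /* back to the caller */
+    [^*/\n]+    /* eat anything that's not a '*' or '/' */
+    [*/]        /* eat a lone '*' or '/' */
+    \n          ++line_num;
+@}
+@end example
+
+Each @samp{yy_pop_state()} returns to whichever start condition was
+active when the matching @samp{yy_push_state()} was made, so the
+outermost @samp{*/} lands back in @code{INITIAL}.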
+
+@node Multiple buffers, End-of-file rules, Start conditions, Top
+@section Multiple input buffers
+
+Some scanners (such as those which support "include"
+files) require reading from several input streams.  As
+@code{flex} scanners do a large amount of buffering, one cannot
+control where the next input will be read from by simply
+writing a @code{YY_INPUT} which is sensitive to the scanning
+context.  @code{YY_INPUT} is only called when the scanner reaches
+the end of its buffer, which may be a long time after
+scanning a statement such as an "include" which requires
+switching the input source.
+
+To negotiate these sorts of problems, @code{flex} provides a
+mechanism for creating and switching between multiple
+input buffers.  An input buffer is created by using:
+
+@example
+YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
+@end example
+
+@noindent
+which takes a @code{FILE} pointer and a size and creates a buffer
+associated with the given file and large enough to hold
+@var{size} characters (when in doubt, use @code{YY_BUF_SIZE} for the
+size).  It returns a @code{YY_BUFFER_STATE} handle, which may
+then be passed to other routines (see below).  The
+@code{YY_BUFFER_STATE} type is a pointer to an opaque @code{struct}
+@code{yy_buffer_state} structure, so you may safely initialize
+YY_BUFFER_STATE variables to @samp{((YY_BUFFER_STATE) 0)} if you
+wish, and also refer to the opaque structure in order to
+correctly declare input buffers in source files other than
+that of your scanner.  Note that the @code{FILE} pointer in the
+call to @code{yy_create_buffer} is only used as the value of @code{yyin}
+seen by @code{YY_INPUT}; if you redefine @code{YY_INPUT} so it no longer
+uses @code{yyin}, then you can safely pass a nil @code{FILE} pointer to
+@code{yy_create_buffer}.  You select a particular buffer to scan
+from using:
+
+@example
+void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )
+@end example
+
+@noindent
+which switches the scanner's input buffer so subsequent tokens
+will come from @var{new_buffer}.  Note that
+@samp{yy_switch_to_buffer()} may be used by @samp{yywrap()} to set
+things up for continued scanning, instead of opening a new
+file and pointing @code{yyin} at it.  Note also that switching
+input sources via either @samp{yy_switch_to_buffer()} or @samp{yywrap()}
+does @emph{not} change the start condition.
+
+@example
+void yy_delete_buffer( YY_BUFFER_STATE buffer )
+@end example
+
+@noindent
+is used to reclaim the storage associated with a buffer.
+You can also clear the current contents of a buffer using:
+
+@example
+void yy_flush_buffer( YY_BUFFER_STATE buffer )
+@end example
+
+This function discards the buffer's contents, so the next time the
+scanner attempts to match a token from the buffer, it will first fill
+the buffer anew using @code{YY_INPUT}.
+
+@samp{yy_new_buffer()} is an alias for @samp{yy_create_buffer()},
+provided for compatibility with the C++ use of @code{new} and @code{delete}
+for creating and destroying dynamic objects.
+
+Finally, the @code{YY_CURRENT_BUFFER} macro returns a
+@code{YY_BUFFER_STATE} handle to the current buffer.
+
+Here is an example of using these features for writing a
+scanner which expands include files (the @samp{<<EOF>>} feature
+is discussed below):
+
+@example
+/* the "incl" state is used for picking up the name
+ * of an include file
+ */
+%x incl
+
+%@{
+#define MAX_INCLUDE_DEPTH 10
+YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
+int include_stack_ptr = 0;
+%@}
+
+%%
+include             BEGIN(incl);
+
+[a-z]+              ECHO;
+[^a-z\n]*\n?        ECHO;
+
+<incl>[ \t]*      /* eat the whitespace */
+<incl>[^ \t\n]+   @{ /* got the include file name */
+        if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
+            @{
+            fprintf( stderr, "Includes nested too deeply\n" );
+            exit( 1 );
+            @}
+
+        include_stack[include_stack_ptr++] =
+            YY_CURRENT_BUFFER;
+
+        yyin = fopen( yytext, "r" );
+
+        if ( ! yyin )
+            error( @dots{} );
+
+        yy_switch_to_buffer(
+            yy_create_buffer( yyin, YY_BUF_SIZE ) );
+
+        BEGIN(INITIAL);
+        @}
+
+<<EOF>> @{
+        if ( --include_stack_ptr < 0 )
+            @{
+            yyterminate();
+            @}
+
+        else
+            @{
+            yy_delete_buffer( YY_CURRENT_BUFFER );
+            yy_switch_to_buffer(
+                 include_stack[include_stack_ptr] );
+            @}
+        @}
+@end example
+
+Three routines are available for setting up input buffers
+for scanning in-memory strings instead of files.  All of
+them create a new input buffer for scanning the string,
+and return a corresponding @code{YY_BUFFER_STATE} handle (which
+you should delete with @samp{yy_delete_buffer()} when done with
+it).  They also switch to the new buffer using
+@samp{yy_switch_to_buffer()}, so the next call to @samp{yylex()} will
+start scanning the string.
+
+@table @samp
+@item yy_scan_string(const char *str)
+scans a NUL-terminated string.
+
+@item yy_scan_bytes(const char *bytes, int len)
+scans @code{len} bytes (including possibly NUL's) starting
+at location @var{bytes}.
+@end table
+
+Note that both of these functions create and scan a @emph{copy}
+of the string or bytes.  (This may be desirable, since
+@samp{yylex()} modifies the contents of the buffer it is
+scanning.) You can avoid the copy by using:
+
+@table @samp
+@item yy_scan_buffer(char *base, yy_size_t size)
+which scans in place the buffer starting at @var{base},
+consisting of @var{size} bytes, the last two bytes of
+which @emph{must} be @code{YY_END_OF_BUFFER_CHAR} (ASCII NUL).
+These last two bytes are not scanned; thus,
+scanning consists of @samp{base[0]} through @samp{base[size-2]},
+inclusive.
+
+If you fail to set up @var{base} in this manner (i.e.,
+forget the final two @code{YY_END_OF_BUFFER_CHAR} bytes),
+then @samp{yy_scan_buffer()} returns a nil pointer instead
+of creating a new input buffer.
+
+The type @code{yy_size_t} is an integral type to which you
+can cast an integer expression reflecting the size
+of the buffer.
+@end table
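+
+As an illustration of the string-scanning routines, the functions
+below could be placed in the user code section of a scanner.  The
+names @code{scan_text} and @code{scan_in_place} are made up for this
+sketch, and it assumes the rules return zero only at the end of
+the input:
+
+@example
+/* scan a copy of a NUL-terminated string */
+void scan_text( const char *text )
+    @{
+    YY_BUFFER_STATE buf = yy_scan_string( text );
+
+    while ( yylex() != 0 )
+        ;       /* the rule actions do the work */
+
+    yy_delete_buffer( buf );
+    @}
+
+/* scan a caller-supplied buffer without copying it;
+ * base[size-2] and base[size-1] must already be NUL
+ */
+int scan_in_place( char *base, yy_size_t size )
+    @{
+    YY_BUFFER_STATE buf = yy_scan_buffer( base, size );
+
+    if ( ! buf )
+        return -1;      /* the two trailing NUL's are missing */
+
+    while ( yylex() != 0 )
+        ;
+
+    yy_delete_buffer( buf );
+    return 0;
+    @}
+@end example
+
+A buffer declared as @samp{char raw[] = "1 + 2\0";} meets the
+requirement, since the explicit @samp{\0} and the terminator the
+compiler appends supply the two final NUL bytes; it would be
+passed as @samp{scan_in_place( raw, sizeof( raw ) )}.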
+
+@node End-of-file rules, Miscellaneous, Multiple buffers, Top
+@section End-of-file rules
+
+The special rule "<<EOF>>" indicates actions which are to
+be taken when an end-of-file is encountered and yywrap()
+returns non-zero (i.e., indicates no further files to
+process).  The action must finish by doing one of four
+things:
+
+@itemize -
+@item
+assigning @code{yyin} to a new input file (in previous
+versions of flex, after doing the assignment you
+had to call the special action @code{YY_NEW_FILE}; this is
+no longer necessary);
+
+@item
+executing a @code{return} statement;
+
+@item
+executing the special @samp{yyterminate()} action;
+
+@item
+or, switching to a new buffer using
+@samp{yy_switch_to_buffer()} as shown in the example
+above.
+@end itemize
+
+<<EOF>> rules may not be used with other patterns; they
+may only be qualified with a list of start conditions.  If
+an unqualified <<EOF>> rule is given, it applies to @emph{all}
+start conditions which do not already have <<EOF>>
+actions.  To specify an <<EOF>> rule for only the initial
+start condition, use
+
+@example
+<INITIAL><<EOF>>
+@end example
+
+These rules are useful for catching things like unclosed
+comments.  An example:
+
+@example
+%x quote
+%%
+
+@dots{}other rules for dealing with quotes@dots{}
+
+<quote><<EOF>>   @{
+         error( "unterminated quote" );
+         yyterminate();
+         @}
+<<EOF>>  @{
+         if ( *++filelist )
+             yyin = fopen( *filelist, "r" );
+         else
+             yyterminate();
+         @}
+@end example
+
+@node Miscellaneous, User variables, End-of-file rules, Top
+@section Miscellaneous macros
+
+The macro @code{YY_USER_ACTION} can be defined to provide an
+action which is always executed prior to the matched
+rule's action.  For example, it could be #define'd to call
+a routine to convert yytext to lower-case.  When
+@code{YY_USER_ACTION} is invoked, the variable @code{yy_act} gives the
+number of the matched rule (rules are numbered starting
+with 1).  Suppose you want to profile how often each of
+your rules is matched.  The following would do the trick:
+
+@example
+#define YY_USER_ACTION ++ctr[yy_act]
+@end example
+
+where @code{ctr} is an array to hold the counts for the different
+rules.  Note that the macro @code{YY_NUM_RULES} gives the total number
+of rules (including the default rule, even if you use @samp{-s}), so
+a correct declaration for @code{ctr} is:
+
+@example
+int ctr[YY_NUM_RULES];
+@end example
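+
+A fuller sketch of such profiling might look like the following.
+The function name @code{report_rule_counts} is hypothetical, the
+trailing semicolon in the macro is a precaution in case the macro
+is expanded without one, and the array is given one spare element
+because rule numbers start at 1:
+
+@example
+%@{
+#include <stdio.h>
+extern int ctr[];       /* defined in the user code section */
+#define YY_USER_ACTION ++ctr[yy_act];
+%@}
+
+%%
+@dots{}rules@dots{}
+%%
+
+int ctr[YY_NUM_RULES + 1];
+
+void report_rule_counts()
+    @{
+    int i;
+
+    for ( i = 1; i <= YY_NUM_RULES; ++i )
+        fprintf( stderr, "rule %d matched %d times\n",
+                 i, ctr[i] );
+    @}
+@end example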
+
+The macro @code{YY_USER_INIT} may be defined to provide an action
+which is always executed before the first scan (and before
+the scanner's internal initializations are done).  For
+example, it could be used to call a routine to read in a
+data table or open a logging file.
+
+The macro @samp{yy_set_interactive(is_interactive)} can be used
+to control whether the current buffer is considered
+@emph{interactive}.  An interactive buffer is processed more slowly,
+but must be used when the scanner's input source is indeed
+interactive to avoid problems due to waiting to fill
+buffers (see the discussion of the @samp{-I} flag below).  A
+non-zero value in the macro invocation marks the buffer as
+interactive, a zero value as non-interactive.  Note that
+use of this macro overrides @samp{%option always-interactive} or
+@samp{%option never-interactive} (see Options below).
+@samp{yy_set_interactive()} must be invoked prior to beginning to
+scan the buffer that is (or is not) to be considered
+interactive.
+
+The macro @samp{yy_set_bol(at_bol)} can be used to control
+whether the current buffer's scanning context for the next
+token match is done as though at the beginning of a line.
+A non-zero macro argument makes rules anchored with '^' active,
+while a zero argument makes '^' rules inactive.
+
+The macro @samp{YY_AT_BOL()} returns true if the next token
+scanned from the current buffer will have '^' rules
+active, false otherwise.
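+
+As one possible use, a scanner that joins continued lines may not
+want the text after a backslash-newline to count as the start of a
+line.  In this sketch the token name @code{TOK_DIRECTIVE} is
+hypothetical:
+
+@example
+^"#"[a-z]+      return TOK_DIRECTIVE;  /* only at a true line start */
+\\\n            yy_set_bol( 0 );       /* continuation: not a new line */
+@end example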
+
+In the generated scanner, the actions are all gathered in
+one large switch statement and separated using @code{YY_BREAK},
+which may be redefined.  By default, it is simply a
+"break", to separate each rule's action from the following
+rule's.  Redefining @code{YY_BREAK} allows, for example, C++
+users to #define YY_BREAK to do nothing (while being very
+careful that every rule ends with a "break" or a
+"return"!) to avoid unreachable-statement warnings, which arise
+because a rule's action that ends with "return" makes the
+@code{YY_BREAK} after it inaccessible.
+
+@node User variables, YACC interface, Miscellaneous, Top
+@section Values available to the user
+
+This section summarizes the various values available to
+the user in the rule actions.
+
+@itemize -
+@item
+@samp{char *yytext} holds the text of the current token.
+It may be modified but not lengthened (you cannot
+append characters to the end).
+
+If the special directive @samp{%array} appears in the
+first section of the scanner description, then
+@code{yytext} is instead declared @samp{char yytext[YYLMAX]},
+where @code{YYLMAX} is a macro definition that you can
+redefine in the first section if you don't like the
+default value (generally 8KB).  Using @samp{%array}
+results in somewhat slower scanners, but the value
+of @code{yytext} becomes immune to calls to @samp{input()} and
+@samp{unput()}, which potentially destroy its value when
+@code{yytext} is a character pointer.  The opposite of
+@samp{%array} is @samp{%pointer}, which is the default.
+
+You cannot use @samp{%array} when generating C++ scanner
+classes (the @samp{-+} flag).
+
+@item
+@samp{int yyleng} holds the length of the current token.
+
+@item
+@samp{FILE *yyin} is the file which by default @code{flex} reads
+from.  It may be redefined but doing so only makes
+sense before scanning begins or after an EOF has
+been encountered.  Changing it in the midst of
+scanning will have unexpected results since @code{flex}
+buffers its input; use @samp{yyrestart()} instead.  Once
+scanning terminates because an end-of-file has been
+seen, you can point @code{yyin} at the new input file and
+then call the scanner again to continue scanning.
+
+@item
+@samp{void yyrestart( FILE *new_file )} may be called to
+point @code{yyin} at the new input file.  The switch-over
+to the new file is immediate (any previously
+buffered-up input is lost).  Note that calling
+@samp{yyrestart()} with @code{yyin} as an argument thus throws
+away the current input buffer and continues
+scanning the same input file.
+
+@item
+@samp{FILE *yyout} is the file to which @samp{ECHO} actions are
+done.  It can be reassigned by the user.
+
+@item
+@code{YY_CURRENT_BUFFER} returns a @code{YY_BUFFER_STATE} handle
+to the current buffer.
+
+@item
+@code{YY_START} returns an integer value corresponding to
+the current start condition.  You can subsequently
+use this value with @code{BEGIN} to return to that start
+condition.
+@end itemize
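+
+As a sketch of how @code{yyin}, @samp{yyrestart()} and @samp{yylex()} fit
+together, the following @code{main}, placed in the user code section,
+scans each file named on the command line in turn.  It assumes
+the rules return zero only at end-of-file:
+
+@example
+int main( int argc, char **argv )
+    @{
+    int i;
+
+    for ( i = 1; i < argc; ++i )
+        @{
+        FILE *fp = fopen( argv[i], "r" );
+
+        if ( ! fp )
+            @{
+            perror( argv[i] );
+            continue;
+            @}
+
+        yyrestart( fp );        /* points yyin at fp */
+
+        while ( yylex() != 0 )
+            ;   /* the rule actions do the work */
+
+        fclose( fp );
+        @}
+
+    return 0;
+    @}
+@end example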
+
+@node YACC interface, Options, User variables, Top
+@section Interfacing with @code{yacc}
+
+One of the main uses of @code{flex} is as a companion to the @code{yacc}
+parser-generator.  @code{yacc} parsers expect to call a routine
+named @samp{yylex()} to find the next input token.  The routine
+is supposed to return the type of the next token as well
+as putting any associated value in the global @code{yylval}.  To
+use @code{flex} with @code{yacc}, one specifies the @samp{-d} option to @code{yacc} to
+instruct it to generate the file @file{y.tab.h} containing
+definitions of all the @samp{%tokens} appearing in the @code{yacc} input.
+This file is then included in the @code{flex} scanner.  For
+example, if one of the tokens is "TOK_NUMBER", part of the
+scanner might look like:
+
+@example
+%@{
+#include "y.tab.h"
+%@}
+
+%%
+
+[0-9]+        yylval = atoi( yytext ); return TOK_NUMBER;
+@end example
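+
+When tokens carry values of more than one type, the scanner sets
+the matching member of @code{yylval}.  The sketch below assumes a
+grammar that declares @samp{%union @{ int ival; char *sval; @}} and the
+tokens @code{TOK_NUMBER} and @code{TOK_ID}; the member names and @code{TOK_ID}
+are invented for illustration:
+
+@example
+%@{
+#include <stdlib.h>
+#include <string.h>     /* strdup() is POSIX rather than ANSI C */
+#include "y.tab.h"
+%@}
+
+%%
+
+[0-9]+        yylval.ival = atoi( yytext ); return TOK_NUMBER;
+[a-z]+        @{
+              /* copy the text: yytext is overwritten by the next match */
+              yylval.sval = strdup( yytext );
+              return TOK_ID;
+              @}
+[ \t\n]+      /* skip whitespace */
+.             return yytext[0];
+@end example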
+
+@node Options, Performance, YACC interface, Top
+@section Options
+@code{flex} has the following options:
+
+@table @samp
+@item -b
+Generate backing-up information to @file{lex.backup}.
+This is a list of scanner states which require
+backing up and the input characters on which they
+do so.  By adding rules one can remove backing-up
+states.  If @emph{all} backing-up states are eliminated
+and @samp{-Cf} or @samp{-CF} is used, the generated scanner will
+run faster (see the @samp{-p} flag).  Only users who wish
+to squeeze every last cycle out of their scanners
+need worry about this option.  (See the section on
+Performance Considerations below.)
+
+@item -c
+is a do-nothing, deprecated option included for
+POSIX compliance.
+
+@item -d
+makes the generated scanner run in @dfn{debug} mode.
+Whenever a pattern is recognized and the global
+@code{yy_flex_debug} is non-zero (which is the default),
+the scanner will write to @code{stderr} a line of the
+form:
+
+@example
+--accepting rule at line 53 ("the matched text")
+@end example
+
+The line number refers to the location of the rule
+in the file defining the scanner (i.e., the file
+that was fed to flex).  Messages are also generated
+when the scanner backs up, accepts the default
+rule, reaches the end of its input buffer (or
+encounters a NUL; at this point, the two look the
+same as far as the scanner's concerned), or reaches
+an end-of-file.
+
+@item -f
+specifies @dfn{fast scanner}.  No table compression is
+done and stdio is bypassed.  The result is large
+but fast.  This option is equivalent to @samp{-Cfr} (see
+below).
+
+@item -h
+generates a "help" summary of @code{flex}'s options to
+@code{stdout} and then exits.  @samp{-?} and @samp{--help} are synonyms
+for @samp{-h}.
+
+@item -i
+instructs @code{flex} to generate a @emph{case-insensitive}
+scanner.  The case of letters given in the @code{flex} input
+patterns will be ignored, and tokens in the input
+will be matched regardless of case.  The matched
+text given in @code{yytext} will have the preserved case
+(i.e., it will not be folded).
+
+@item -l
+turns on maximum compatibility with the original
+AT&T @code{lex} implementation.  Note that this does not
+mean @emph{full} compatibility.  Use of this option costs
+a considerable amount of performance, and it cannot
+be used with the @samp{-+}, @samp{-f}, @samp{-F}, @samp{-Cf}, or @samp{-CF} options.
+For details on the compatibilities it provides, see
+the section "Incompatibilities With Lex And POSIX"
+below.  This option also results in the name
+@code{YY_FLEX_LEX_COMPAT} being #define'd in the generated
+scanner.
+
+@item -n
+is another do-nothing, deprecated option included
+only for POSIX compliance.
+
+@item -p
+generates a performance report to stderr.  The
+report consists of comments regarding features of
+the @code{flex} input file which will cause a serious loss
+of performance in the resulting scanner.  If you
+give the flag twice, you will also get comments
+regarding features that lead to minor performance
+losses.
+
+Note that the use of @code{REJECT}, @samp{%option yylineno} and
+variable trailing context (see the Deficiencies / Bugs section below)
+entails a substantial performance penalty; use of @samp{yymore()},
+the @samp{^} operator, and the @samp{-I} flag entail minor performance
+penalties.
+
+@item -s
+causes the @dfn{default rule} (that unmatched scanner
+input is echoed to @code{stdout}) to be suppressed.  If
+the scanner encounters input that does not match
+any of its rules, it aborts with an error.  This
+option is useful for finding holes in a scanner's
+rule set.
+
+@item -t
+instructs @code{flex} to write the scanner it generates to
+standard output instead of @file{lex.yy.c}.
+
+@item -v
+specifies that @code{flex} should write to @code{stderr} a
+summary of statistics regarding the scanner it
+generates.  Most of the statistics are meaningless to
+the casual @code{flex} user, but the first line identifies
+the version of @code{flex} (same as reported by @samp{-V}), and
+the next line the flags used when generating the
+scanner, including those that are on by default.
+
+@item -w
+suppresses warning messages.
+
+@item -B
+instructs @code{flex} to generate a @emph{batch} scanner, the
+opposite of @emph{interactive} scanners generated by @samp{-I}
+(see below).  In general, you use @samp{-B} when you are
+@emph{certain} that your scanner will never be used
+interactively, and you want to squeeze a @emph{little} more
+performance out of it.  If your goal is instead to
+squeeze out a @emph{lot} more performance, you should be
+using the @samp{-Cf} or @samp{-CF} options (discussed below),
+which turn on @samp{-B} automatically anyway.
+
+@item -F
+specifies that the @dfn{fast} scanner table
+representation should be used (and stdio bypassed).  This
+representation is about as fast as the full table
+representation (@samp{-f}), and for some sets of patterns
+will be considerably smaller (and for others,
+larger).  In general, if the pattern set contains
+both "keywords" and a catch-all, "identifier" rule,
+such as in the set:
+
+@example
+"case"    return TOK_CASE;
+"switch"  return TOK_SWITCH;
+...
+"default" return TOK_DEFAULT;
+[a-z]+    return TOK_ID;
+@end example
+
+@noindent
+then you're better off using the full table
+representation.  If only the "identifier" rule is
+present and you then use a hash table or some such to
+detect the keywords, you're better off using @samp{-F}.
+
+This option is equivalent to @samp{-CFr} (see below).  It
+cannot be used with @samp{-+}.
+
+@item -I
+instructs @code{flex} to generate an @emph{interactive} scanner.
+An interactive scanner is one that only looks ahead
+to decide what token has been matched if it
+absolutely must.  It turns out that always looking one
+extra character ahead, even if the scanner has
+already seen enough text to disambiguate the
+current token, is a bit faster than only looking ahead
+when necessary.  But scanners that always look
+ahead give dreadful interactive performance; for
+example, when a user types a newline, it is not
+recognized as a newline token until they enter
+@emph{another} token, which often means typing in another
+whole line.
+
+@code{Flex} scanners default to @emph{interactive} unless you use
+the @samp{-Cf} or @samp{-CF} table-compression options (see
+below).  That's because if you're looking for
+high-performance you should be using one of these
+options, so if you didn't, @code{flex} assumes you'd
+rather trade off a bit of run-time performance for
+intuitive interactive behavior.  Note also that you
+@emph{cannot} use @samp{-I} in conjunction with @samp{-Cf} or @samp{-CF}.
+Thus, this option is not really needed; it is on by
+default for all those cases in which it is allowed.
+
+You can force a scanner to @emph{not} be interactive by
+using @samp{-B} (see above).
+
+@item -L
+instructs @code{flex} not to generate @samp{#line} directives.
+Without this option, @code{flex} peppers the generated
+scanner with #line directives so error messages in
+the actions will be correctly located with respect
+to either the original @code{flex} input file (if the
+errors are due to code in the input file), or
+@file{lex.yy.c} (if the errors are @code{flex}'s fault -- you
+should report these sorts of errors to the email
+address given below).
+
+@item -T
+makes @code{flex} run in @code{trace} mode.  It will generate a
+lot of messages to @code{stderr} concerning the form of
+the input and the resultant non-deterministic and
+deterministic finite automata.  This option is
+mostly for use in maintaining @code{flex}.
+
+@item -V
+prints the version number to @code{stdout} and exits.
+@samp{--version} is a synonym for @samp{-V}.
+
+@item -7
+instructs @code{flex} to generate a 7-bit scanner, i.e.,
+one which can only recognize 7-bit characters in
+its input.  The advantage of using @samp{-7} is that the
+scanner's tables can be up to half the size of
+those generated using the @samp{-8} option (see below).
+The disadvantage is that such scanners often hang
+or crash if their input contains an 8-bit
+character.
+
+Note, however, that unless you generate your
+scanner using the @samp{-Cf} or @samp{-CF} table compression options,
+use of @samp{-7} will save only a small amount of table
+space, and make your scanner considerably less
+portable.  @code{Flex}'s default behavior is to generate
+an 8-bit scanner unless you use @samp{-Cf} or @samp{-CF}, in
+which case @code{flex} defaults to generating 7-bit
+scanners unless your site was always configured to
+generate 8-bit scanners (as will often be the case
+with non-USA sites).  You can tell whether flex
+generated a 7-bit or an 8-bit scanner by inspecting
+the flag summary in the @samp{-v} output as described
+above.
+
+Note that if you use @samp{-Cfe} or @samp{-CFe} (those table
+compression options, but also using equivalence
+classes as discussed below), flex still
+defaults to generating an 8-bit scanner, since
+usually with these compression options full 8-bit
+tables are not much more expensive than 7-bit
+tables.
+
+@item -8
+instructs @code{flex} to generate an 8-bit scanner, i.e.,
+one which can recognize 8-bit characters.  This
+flag is only needed for scanners generated using
+@samp{-Cf} or @samp{-CF}, as otherwise flex defaults to
+generating an 8-bit scanner anyway.
+
+See the discussion of @samp{-7} above for flex's default
+behavior and the tradeoffs between 7-bit and 8-bit
+scanners.
+
+@item -+
+specifies that you want flex to generate a C++
+scanner class.  See the section on Generating C++
+Scanners below for details.
+
+@item -C[aefFmr]
+controls the degree of table compression and, more
+generally, trade-offs between small scanners and
+fast scanners.
+
+@samp{-Ca} ("align") instructs flex to trade off larger
+tables in the generated scanner for faster
+performance because the elements of the tables are better
+aligned for memory access and computation.  On some
+RISC architectures, fetching and manipulating
+long-words is more efficient than with smaller-sized
+units such as shortwords.  This option can double
+the size of the tables used by your scanner.
+
+@samp{-Ce} directs @code{flex} to construct @dfn{equivalence classes},
+i.e., sets of characters which have identical
+lexical properties (for example, if the only appearance
+of digits in the @code{flex} input is in the character
+class "[0-9]" then the digits '0', '1', @dots{}, '9'
+will all be put in the same equivalence class).
+Equivalence classes usually give dramatic
+reductions in the final table/object file sizes
+(typically a factor of 2-5) and are pretty cheap
+performance-wise (one array look-up per character
+scanned).
+
+@samp{-Cf} specifies that the @emph{full} scanner tables should
+be generated - @code{flex} should not compress the tables
+by taking advantage of similar transition
+functions for different states.
+
+@samp{-CF} specifies that the alternate fast scanner
+representation (described above under the @samp{-F} flag)
+should be used.  This option cannot be used with
+@samp{-+}.
+
+@samp{-Cm} directs @code{flex} to construct @dfn{meta-equivalence
+classes}, which are sets of equivalence classes (or
+characters, if equivalence classes are not being
+used) that are commonly used together.
+Meta-equivalence classes are often a big win when using
+compressed tables, but they have a moderate
+performance impact (one or two "if" tests and one array
+look-up per character scanned).
+
+@samp{-Cr} causes the generated scanner to @emph{bypass} use of
+the standard I/O library (stdio) for input.
+Instead of calling @samp{fread()} or @samp{getc()}, the scanner
+will use the @samp{read()} system call, resulting in a
+performance gain which varies from system to
+system, but in general is probably negligible unless
+you are also using @samp{-Cf} or @samp{-CF}.  Using @samp{-Cr} can cause
+strange behavior if, for example, you read from
+@code{yyin} using stdio prior to calling the scanner
+(because the scanner will miss whatever text your
+previous reads left in the stdio input buffer).
+
+@samp{-Cr} has no effect if you define @code{YY_INPUT} (see The
+Generated Scanner above).
+
+A lone @samp{-C} specifies that the scanner tables should
+be compressed but neither equivalence classes nor
+meta-equivalence classes should be used.
+
+The options @samp{-Cf} or @samp{-CF} and @samp{-Cm} do not make sense
+together - there is no opportunity for
+meta-equivalence classes if the table is not being
+compressed.  Otherwise the options may be freely
+mixed, and are cumulative.
+
+The default setting is @samp{-Cem}, which specifies that
+@code{flex} should generate equivalence classes and
+meta-equivalence classes.  This setting provides the
+highest degree of table compression.  You can trade
+off faster-executing scanners at the cost of larger
+tables with the following generally being true:
+
+@example
+slowest & smallest
+      -Cem
+      -Cm
+      -Ce
+      -C
+      -C@{f,F@}e
+      -C@{f,F@}
+      -C@{f,F@}a
+fastest & largest
+@end example
+
+Note that scanners with the smallest tables are
+usually generated and compiled the quickest, so
+during development you will usually want to use the
+default, maximal compression.
+
+@samp{-Cfe} is often a good compromise between speed and
+size for production scanners.
+
+@item -ooutput
+directs flex to write the scanner to the file @file{output}
+instead of @file{lex.yy.c}.  If you combine @samp{-o} with
+the @samp{-t} option, then the scanner is written to
+@code{stdout} but its @samp{#line} directives (see the @samp{-L} option
+above) refer to the file @file{output}.
+
+@item -Pprefix
+changes the default @samp{yy} prefix used by @code{flex} for all
+globally-visible variable and function names to
+instead be @var{prefix}.  For example, @samp{-Pfoo} changes the
+name of @code{yytext} to @code{footext}.  It also changes the
+name of the default output file from @file{lex.yy.c} to
+@file{lex.foo.c}.  Here are all of the names affected:
+
+@example
+yy_create_buffer
+yy_delete_buffer
+yy_flex_debug
+yy_init_buffer
+yy_flush_buffer
+yy_load_buffer_state
+yy_switch_to_buffer
+yyin
+yyleng
+yylex
+yylineno
+yyout
+yyrestart
+yytext
+yywrap
+@end example
+
+(If you are using a C++ scanner, then only @code{yywrap}
+and @code{yyFlexLexer} are affected.) Within your scanner
+itself, you can still refer to the global variables
+and functions using either version of their name;
+but externally, they have the modified name.
+
+This option lets you easily link together multiple
+@code{flex} programs into the same executable.  Note,
+though, that using this option also renames
+@samp{yywrap()}, so you now @emph{must} either provide your own
+(appropriately-named) version of the routine for
+your scanner, or use @samp{%option noyywrap}, as linking
+with @samp{-lfl} no longer provides one for you by
+default.
+
+@item -Sskeleton_file
+overrides the default skeleton file from which @code{flex}
+constructs its scanners.  You'll never need this
+option unless you are doing @code{flex} maintenance or
+development.
+@end table
+
+@code{flex} also provides a mechanism for controlling options
+within the scanner specification itself, rather than from
+the flex command-line.  This is done by including @samp{%option}
+directives in the first section of the scanner
+specification.  You can specify multiple options with a single
+@samp{%option} directive, and multiple directives in the first
+section of your flex input file.  Most options are given
+simply as names, optionally preceded by the word "no"
+(with no intervening whitespace) to negate their meaning.
+A number are equivalent to flex flags or their negation:
+
+@example
+7bit            -7 option
+8bit            -8 option
+align           -Ca option
+backup          -b option
+batch           -B option
+c++             -+ option
+
+caseful or
+case-sensitive  opposite of -i (default)
+
+case-insensitive or
+caseless        -i option
+
+debug           -d option
+default         opposite of -s option
+ecs             -Ce option
+fast            -F option
+full            -f option
+interactive     -I option
+lex-compat      -l option
+meta-ecs        -Cm option
+perf-report     -p option
+read            -Cr option
+stdout          -t option
+verbose         -v option
+warn            opposite of -w option
+                (use "%option nowarn" for -w)
+
+array           equivalent to "%array"
+pointer         equivalent to "%pointer" (default)
+@end example
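+
+As a sketch of how several of these options combine, the following
+hypothetical scanner asks for case-insensitive matching, no
+@samp{yywrap()} call and no default rule.  The keyword list and the
+messages are illustrative only:
+
+@example
+%option caseless noyywrap
+%option nodefault
+
+%%
+select|from|where   printf( "keyword: %s\n", yytext );
+[a-z_]+             printf( "identifier: %s\n", yytext );
+.|\n                /* discard anything else */
+%%
+
+int main( void )
+    @{
+    yylex();
+    return 0;
+    @}
+@end example
+
+@noindent
+Because @samp{nodefault} corresponds to @samp{-s}, the final rule is
+there so that every input character is matched by some rule.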
+
+Some @samp{%option}s provide features otherwise not available:
+
+@table @samp
+@item always-interactive
+instructs flex to generate a scanner which always
+considers its input "interactive".  Normally, on
+each new input file the scanner calls @samp{isatty()} in
+an attempt to determine whether the scanner's input
+source is interactive and thus should be read a
+character at a time.  When this option is used,
+however, then no such call is made.
+
+@item main
+directs flex to provide a default @samp{main()} program
+for the scanner, which simply calls @samp{yylex()}.  This
+option implies @code{noyywrap} (see below).
+
+@item never-interactive
+instructs flex to generate a scanner which never
+considers its input "interactive" (again, no call
+made to @samp{isatty()}).  This is the opposite of
+@samp{always-interactive}.
+
+@item stack
+enables the use of start condition stacks (see
+Start Conditions above).
+
+@item stdinit
+if unset (i.e., @samp{%option nostdinit}) initializes @code{yyin}
+and @code{yyout} to nil @code{FILE} pointers, instead of @code{stdin}
+and @code{stdout}.
+
+@item yylineno
+directs @code{flex} to generate a scanner that maintains the number
+of the current line read from its input in the global variable
+@code{yylineno}.  This option is implied by @samp{%option lex-compat}.
+
+@item yywrap
+if unset (i.e., @samp{%option noyywrap}), makes the
+scanner not call @samp{yywrap()} upon an end-of-file, but
+simply assume that there are no more files to scan
+(until the user points @code{yyin} at a new file and calls
+@samp{yylex()} again).
+@end table
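+
+A minimal sketch combining two of these options (the @samp{TODO}
+pattern and the message are only an illustration):
+
+@example
+%option main yylineno
+
+%%
+TODO    printf( "line %d: TODO\n", yylineno );
+.|\n    /* skip everything else */
+@end example
+
+@noindent
+Here @samp{%option main} supplies the @samp{main()} routine and, since
+it implies @code{noyywrap}, no @samp{yywrap()} has to be provided.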
+
+@code{flex} scans your rule actions to determine whether you use
+the @code{REJECT} or @samp{yymore()} features.  The @code{reject} and @code{yymore}
+options are available to override its decision as to
+whether you use these features, either by setting them (e.g.,
+@samp{%option reject}) to indicate the feature is indeed used, or
+unsetting them to indicate it actually is not used (e.g.,
+@samp{%option noyymore}).
+
+Three options take string-delimited values, offset with '=':
+
+@example
+%option outfile="ABC"
+@end example
+
+@noindent
+is equivalent to @samp{-oABC}, and
+
+@example
+%option prefix="XYZ"
+@end example
+
+@noindent
+is equivalent to @samp{-PXYZ}.
+
+Finally,
+
+@example
+%option yyclass="foo"
+@end example
+
+@noindent
+only applies when generating a C++ scanner (@samp{-+} option).  It
+informs @code{flex} that you have derived @samp{foo} as a subclass of
+@code{yyFlexLexer} so @code{flex} will place your actions in the member
+function @samp{foo::yylex()} instead of @samp{yyFlexLexer::yylex()}.
+It also generates a @samp{yyFlexLexer::yylex()} member function that
+emits a run-time error (by invoking @samp{yyFlexLexer::LexerError()})
+if called.  See Generating C++ Scanners, below, for additional
+information.
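+
+The first two of these options might be used together as follows.
+This is only a sketch: the file name @file{cfgscan.c} and the prefix
+@samp{cfg} are invented for the example.
+
+@example
+%option outfile="cfgscan.c" prefix="cfg"
+%option noyywrap
+
+%%
+[0-9]+    printf( "number: %s\n", yytext );
+.|\n      /* ignore */
+%%
+
+/* Other source files must call cfglex(); inside this file the
+   yy names remain usable as well. */
+int main( void )
+    @{
+    cfglex();
+    return 0;
+    @}
+@end example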
+
+A number of options are available for lint purists who
+want to suppress the appearance of unneeded routines in
+the generated scanner.  Each of the following, if unset,
+results in the corresponding routine not appearing in the
+generated scanner:
+
+@example
+input, unput
+yy_push_state, yy_pop_state, yy_top_state
+yy_scan_buffer, yy_scan_bytes, yy_scan_string
+@end example
+
+@noindent
+(though @samp{yy_push_state()} and friends won't appear anyway
+unless you use @samp{%option stack}).
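+
+For instance, a scanner that never calls any of these routines could
+begin as follows (a sketch; the rules are placeholders):
+
+@example
+%option noyywrap
+%option nounput noinput
+%option noyy_scan_buffer noyy_scan_bytes noyy_scan_string
+
+%%
+[0-9]+    printf( "number\n" );
+.|\n      /* ignore */
+%%
+
+int main( void )
+    @{
+    yylex();
+    return 0;
+    @}
+@end example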
+
+@node Performance, C++, Options, Top
+@section Performance considerations
+
+The main design goal of @code{flex} is that it generate
+high-performance scanners.  It has been optimized for dealing
+well with large sets of rules.  Aside from the effects on
+scanner speed of the table compression @samp{-C} options outlined
+above, there are a number of options/actions which degrade
+performance.  These are, from most expensive to least:
+
+@example
+REJECT
+%option yylineno
+arbitrary trailing context
+
+pattern sets that require backing up
+%array
+%option interactive
+%option always-interactive
+
+'^' beginning-of-line operator
+yymore()
+@end example
+
+with the first three all being quite expensive and the
+last two being quite cheap.  Note also that @samp{unput()} is
+implemented as a routine call that potentially does quite
+a bit of work, while @samp{yyless()} is a quite-cheap macro; so
+if just putting back some excess text you scanned, use
+@samp{yyless()}.
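+
+As an illustration of returning excess text with @samp{yyless()}, the
+sketch below (the messages are hypothetical) matches a number
+together with a following @samp{..} and then gives the two dots back so
+that they are rescanned as a separate token:
+
+@example
+%option noyywrap
+
+%%
+[0-9]+".."   @{
+             /* keep the digits, return the ".." to the input */
+             yyless( yyleng - 2 );
+             printf( "number: %s\n", yytext );
+             @}
+".."         printf( "range operator\n" );
+[0-9]+       printf( "number: %s\n", yytext );
+.|\n         /* ignore */
+%%
+
+int main( void )
+    @{
+    yylex();
+    return 0;
+    @}
+@end example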
+
+@code{REJECT} should be avoided at all costs when performance is
+important.  It is a particularly expensive option.
+
+Getting rid of backing up is messy and often may be an
+enormous amount of work for a complicated scanner.  In
+principle, one begins by using the @samp{-b} flag to generate a
+@file{lex.backup} file.  For example, on the input
+
+@example
+%%
+foo        return TOK_KEYWORD;
+foobar     return TOK_KEYWORD;
+@end example
+
+@noindent
+the file looks like:
+
+@example
+State #6 is non-accepting -
+ associated rule line numbers:
+       2       3
+ out-transitions: [ o ]
+ jam-transitions: EOF [ \001-n  p-\177 ]
+
+State #8 is non-accepting -
+ associated rule line numbers:
+       3
+ out-transitions: [ a ]
+ jam-transitions: EOF [ \001-`  b-\177 ]
+
+State #9 is non-accepting -
+ associated rule line numbers:
+       3
+ out-transitions: [ r ]
+ jam-transitions: EOF [ \001-q  s-\177 ]
+
+Compressed tables always back up.
+@end example
+
+The first few lines tell us that there's a scanner state
+in which it can make a transition on an 'o' but not on any
+other character, and that in that state the currently
+scanned text does not match any rule.  The state occurs
+when trying to match the rules found at lines 2 and 3 in
+the input file.  If the scanner is in that state and then
+reads something other than an 'o', it will have to back up
+to find a rule which is matched.  With a bit of
+head-scratching one can see that this must be the state it's in
+when it has seen "fo".  When this has happened, if
+anything other than another 'o' is seen, the scanner will
+have to back up to simply match the 'f' (by the default
+rule).
+
+The comment regarding State #8 indicates there's a problem
+when "foob" has been scanned.  Indeed, on any character
+other than an 'a', the scanner will have to back up to
+accept "foo".  Similarly, the comment for State #9
+concerns when "fooba" has been scanned and an 'r' does not
+follow.
+
+The final comment reminds us that there's no point going
+to all the trouble of removing backing up from the rules
+unless we're using @samp{-Cf} or @samp{-CF}, since there's no
+performance gain doing so with compressed scanners.
+
+The way to remove the backing up is to add "error" rules:
+
+@example
+%%
+foo         return TOK_KEYWORD;
+foobar      return TOK_KEYWORD;
+
+fooba       |
+foob        |
+fo          @{
+            /* false alarm, not really a keyword */
+            return TOK_ID;
+            @}
+@end example
+
+Eliminating backing up among a list of keywords can also
+be done using a "catch-all" rule:
+
+@example
+%%
+foo         return TOK_KEYWORD;
+foobar      return TOK_KEYWORD;
+
+[a-z]+      return TOK_ID;
+@end example
+
+This is usually the best solution when appropriate.
+
+Backing up messages tend to cascade.  With a complicated
+set of rules it's not uncommon to get hundreds of
+messages.  If one can decipher them, though, it often only
+takes a dozen or so rules to eliminate the backing up
+(though it's easy to make a mistake and have an error rule
+accidentally match a valid token.  A possible future @code{flex}
+feature will be to automatically add rules to eliminate
+backing up).
+
+It's important to keep in mind that you gain the benefits
+of eliminating backing up only if you eliminate @emph{every}
+instance of backing up.  Leaving just one means you gain
+nothing.
+
+@emph{Variable} trailing context (where both the leading and
+trailing parts do not have a fixed length) entails almost
+the same performance loss as @code{REJECT} (i.e., substantial).
+So when possible a rule like:
+
+@example
+%%
+mouse|rat/(cat|dog)   run();
+@end example
+
+@noindent
+is better written:
+
+@example
+%%
+mouse/cat|dog         run();
+rat/cat|dog           run();
+@end example
+
+@noindent
+or as
+
+@example
+%%
+mouse|rat/cat         run();
+mouse|rat/dog         run();
+@end example
+
+Note that here the special '|' action does @emph{not} provide any
+savings, and can even make things worse (see Deficiencies
+/ Bugs below).
+
+Another area where the user can increase a scanner's
+performance (and one that's easier to implement) arises from
+the fact that the longer the tokens matched, the faster
+the scanner will run.  This is because with long tokens
+the processing of most input characters takes place in the
+(short) inner scanning loop, and does not often have to go
+through the additional work of setting up the scanning
+environment (e.g., @code{yytext}) for the action.  Recall the
+scanner for C comments:
+
+@example
+%x comment
+%%
+        int line_num = 1;
+
+"/*"         BEGIN(comment);
+
+<comment>[^*\n]*
+<comment>"*"+[^*/\n]*
+<comment>\n             ++line_num;
+<comment>"*"+"/"        BEGIN(INITIAL);
+@end example
+
+This could be sped up by writing it as:
+
+@example
+%x comment
+%%
+        int line_num = 1;
+
+"/*"         BEGIN(comment);
+
+<comment>[^*\n]*
+<comment>[^*\n]*\n      ++line_num;
+<comment>"*"+[^*/\n]*
+<comment>"*"+[^*/\n]*\n ++line_num;
+<comment>"*"+"/"        BEGIN(INITIAL);
+@end example
+
+Now instead of each newline requiring the processing of
+another action, recognizing the newlines is "distributed"
+over the other rules to keep the matched text as long as
+possible.  Note that @emph{adding} rules does @emph{not} slow down the
+scanner!  The speed of the scanner is independent of the
+number of rules or (modulo the considerations given at the
+beginning of this section) how complicated the rules are
+with regard to operators such as '*' and '|'.
+
+A final example in speeding up a scanner: suppose you want
+to scan through a file containing identifiers and
+keywords, one per line and with no other extraneous
+characters, and recognize all the keywords.  A natural first
+approach is:
+
+@example
+%%
+asm      |
+auto     |
+break    |
+@dots{} etc @dots{}
+volatile |
+while    /* it's a keyword */
+
+.|\n     /* it's not a keyword */
+@end example
+
+To eliminate the back-tracking, introduce a catch-all
+rule:
+
+@example
+%%
+asm      |
+auto     |
+break    |
+@dots{} etc @dots{}
+volatile |
+while    /* it's a keyword */
+
+[a-z]+   |
+.|\n     /* it's not a keyword */
+@end example
+
+Now, if it's guaranteed that there's exactly one word per
+line, then we can reduce the total number of matches by a
+half by merging in the recognition of newlines with that
+of the other tokens:
+
+@example
+%%
+asm\n    |
+auto\n   |
+break\n  |
+@dots{} etc @dots{}
+volatile\n |
+while\n  /* it's a keyword */
+
+[a-z]+\n |
+.|\n     /* it's not a keyword */
+@end example
+
+One has to be careful here, as we have now reintroduced
+backing up into the scanner.  In particular, while @emph{we} know
+that there will never be any characters in the input
+stream other than letters or newlines, @code{flex} can't figure
+this out, and it will plan for possibly needing to back up
+when it has scanned a token like "auto" and then the next
+character is something other than a newline or a letter.
+Previously it would then just match the "auto" rule and be
+done, but now it has no "auto" rule, only an "auto\n" rule.
+To eliminate the possibility of backing up, we could
+either duplicate all rules but without final newlines, or,
+since we never expect to encounter such an input and
+therefore don't care how it's classified, we can introduce one
+more catch-all rule, this one which doesn't include a
+newline:
+
+@example
+%%
+asm\n    |
+auto\n   |
+break\n  |
+@dots{} etc @dots{}
+volatile\n |
+while\n  /* it's a keyword */
+
+[a-z]+\n |
+[a-z]+   |
+.|\n     /* it's not a keyword */
+@end example
+
+Compiled with @samp{-Cf}, this is about as fast as one can get a
+@code{flex} scanner to go for this particular problem.
+
+A final note: @code{flex} is slow when matching NUL's,
+particularly when a token contains multiple NUL's.  It's best to
+write rules which match @emph{short} amounts of text if it's
+anticipated that the text will often include NUL's.
+
+Another final note regarding performance: as mentioned
+above in the section How the Input is Matched, dynamically
+resizing @code{yytext} to accommodate huge tokens is a slow
+process because it presently requires that the (huge) token
+be rescanned from the beginning.  Thus if performance is
+vital, you should attempt to match "large" quantities of
+text but not "huge" quantities, where the cutoff between
+the two is at about 8K characters/token.
+
+@node C++, Incompatibilities, Performance, Top
+@section Generating C++ scanners
+
+@code{flex} provides two different ways to generate scanners for
+use with C++.  The first way is to simply compile a
+scanner generated by @code{flex} using a C++ compiler instead of a C
+compiler.  You should not encounter any compilation
+errors (please report any you find to the email address
+given in the Author section below).  You can then use C++
+code in your rule actions instead of C code.  Note that
+the default input source for your scanner remains @code{yyin},
+and default echoing is still done to @code{yyout}.  Both of these
+remain @samp{FILE *} variables and not C++ @code{streams}.
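+
+A sketch of this first approach, assuming a reasonably modern C++
+compiler: the specification below is processed by @code{flex} in the
+usual way and the resulting @file{lex.yy.c} is compiled as C++.  The
+variable @code{longest} is invented for the example.
+
+@example
+%option noyywrap
+
+%@{
+#include <cstdio>
+#include <string>
+
+static std::string longest;   // longest word seen so far
+%@}
+
+%%
+[A-Za-z]+   @{
+            if ( longest.size() < (std::string::size_type) yyleng )
+                longest = yytext;
+            @}
+.|\n        /* ignore */
+%%
+
+int main()
+    @{
+    yylex();                  // still reads from the FILE* yyin
+    std::printf( "%s\n", longest.c_str() );
+    return 0;
+    @}
+@end example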
+
+You can also use @code{flex} to generate a C++ scanner class, using
+the @samp{-+} option, (or, equivalently, @samp{%option c++}), which
+is automatically specified if the name of the flex executable ends
+in a @samp{+}, such as @code{flex++}.  When using this option, flex
+defaults to generating the scanner to the file @file{lex.yy.cc} instead
+of @file{lex.yy.c}.  The generated scanner includes the header file
+@file{FlexLexer.h}, which defines the interface to two C++ classes.
+
+The first class, @code{FlexLexer}, provides an abstract base
+class defining the general scanner class interface.  It
+provides the following member functions:
+
+@table @samp
+@item const char* YYText()
+returns the text of the most recently matched
+token, the equivalent of @code{yytext}.
+
+@item int YYLeng()
+returns the length of the most recently matched
+token, the equivalent of @code{yyleng}.
+
+@item int lineno() const
+returns the current input line number (see @samp{%option yylineno}),
+or 1 if @samp{%option yylineno} was not used.
+
+@item void set_debug( int flag )
+sets the debugging flag for the scanner, equivalent to assigning to
+@code{yy_flex_debug} (see the Options section above).  Note that you
+must build the scanner using @samp{%option debug} to include debugging
+information in it.
+
+@item int debug() const
+returns the current setting of the debugging flag.
+@end table
+
+Also provided are member functions equivalent to
+@samp{yy_switch_to_buffer()}, @samp{yy_create_buffer()} (though the
+first argument is an @samp{istream*} object pointer and not a
+@samp{FILE*}), @samp{yy_flush_buffer()}, @samp{yy_delete_buffer()},
+and @samp{yyrestart()} (again, the first argument is an @samp{istream*}
+object pointer).
+
+The second class defined in @file{FlexLexer.h} is @code{yyFlexLexer},
+which is derived from @code{FlexLexer}.  It defines the following
+additional member functions:
+
+@table @samp
+@item yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )
+constructs a @code{yyFlexLexer} object using the given
+streams for input and output.  If not specified,
+the streams default to @code{cin} and @code{cout}, respectively.
+
+@item virtual int yylex()
+performs the same role as @samp{yylex()} does for ordinary
+flex scanners: it scans the input stream, consuming
+tokens, until a rule's action returns a value.  If you derive a subclass
+@var{S}
+from @code{yyFlexLexer}
+and want to access the member functions and variables of
+@var{S}
+inside @samp{yylex()},
+then you need to use @samp{%option yyclass="@var{S}"}
+to inform @code{flex}
+that you will be using that subclass instead of @code{yyFlexLexer}.
+In this case, rather than generating @samp{yyFlexLexer::yylex()},
+@code{flex} generates @samp{@var{S}::yylex()}
+(and also generates a dummy @samp{yyFlexLexer::yylex()}
+that calls @samp{yyFlexLexer::LexerError()}
+if called).
+
+@item virtual void switch_streams(istream* new_in = 0, ostream* new_out = 0)
+reassigns @code{yyin} to @code{new_in}
+(if non-nil)
+and @code{yyout} to @code{new_out}
+(ditto), deleting the previous input buffer if @code{yyin}
+is reassigned.
+
+@item int yylex( istream* new_in = 0, ostream* new_out = 0 )
+first switches the streams via @samp{switch_streams( new_in, new_out )}
+and then returns the value of @samp{yylex()}.
+@end table
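+
+As a sketch of how the constructor is used, the following scanner reads
+from a string stream rather than from @code{cin}.  It assumes a
+@code{flex} release whose @file{FlexLexer.h} is built on the standard
+@samp{std::istream} and @samp{std::ostream} classes; the input text is
+invented for the example.
+
+@example
+%option c++ noyywrap
+
+%@{
+#include <iostream>
+#include <sstream>
+%@}
+
+%%
+[0-9]+   std::cout << "int " << YYText() << '\n';
+.|\n     /* ignore */
+%%
+
+int main()
+    @{
+    std::istringstream input( "12 abc 345" );
+    yyFlexLexer lexer( &input );   // output stream defaults to cout
+    lexer.yylex();
+    return 0;
+    @}
+@end example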
+
+In addition, @code{yyFlexLexer} defines the following protected
+virtual functions which you can redefine in derived
+classes to tailor the scanner:
+
+@table @samp
+@item virtual int LexerInput( char* buf, int max_size )
+reads up to @samp{max_size} characters into @var{buf} and
+returns the number of characters read.  To indicate
+end-of-input, return 0 characters.  Note that
+"interactive" scanners (see the @samp{-B} and @samp{-I} flags)
+define the macro @code{YY_INTERACTIVE}.  If you redefine
+@code{LexerInput()} and need to take different actions
+depending on whether or not the scanner might be
+scanning an interactive input source, you can test
+for the presence of this name via @samp{#ifdef}.
+
+@item virtual void LexerOutput( const char* buf, int size )
+writes out @var{size} characters from the buffer @var{buf},
+which, while NUL-terminated, may also contain
+"internal" NUL's if the scanner's rules can match
+text with NUL's in them.
+
+@item virtual void LexerError( const char* msg )
+reports a fatal error message.  The default version
+of this function writes the message to the stream
+@code{cerr} and exits.
+@end table
+
+Note that a @code{yyFlexLexer} object contains its @emph{entire}
+scanning state.  Thus you can use such objects to create
+reentrant scanners.  You can instantiate multiple instances of
+the same @code{yyFlexLexer} class, and you can also combine
+multiple C++ scanner classes together in the same program
+using the @samp{-P} option discussed above.
+Finally, note that the @samp{%array} feature is not available to
+C++ scanner classes; you must use @samp{%pointer} (the default).
+
+Here is an example of a simple C++ scanner:
+
+@example
+    // An example of using the flex C++ scanner class.
+
+%@{
+int mylineno = 0;
+%@}
+
+string  \"[^\n"]+\"
+
+ws      [ \t]+
+
+alpha   [A-Za-z]
+dig     [0-9]
+name    (@{alpha@}|@{dig@}|\$)(@{alpha@}|@{dig@}|[_.\-/$])*
+num1    [-+]?@{dig@}+\.?([eE][-+]?@{dig@}+)?
+num2    [-+]?@{dig@}*\.@{dig@}+([eE][-+]?@{dig@}+)?
+number  @{num1@}|@{num2@}
+
+%%
+
+@{ws@}    /* skip blanks and tabs */
+
+"/*"    @{
+        int c;
+
+        while((c = yyinput()) != 0)
+            @{
+            if(c == '\n')
+                ++mylineno;
+
+            else if(c == '*')
+                @{
+                if((c = yyinput()) == '/')
+                    break;
+                else
+                    unput(c);
+                @}
+            @}
+        @}
+
+@{number@}  cout << "number " << YYText() << '\n';
+
+\n        mylineno++;
+
+@{name@}    cout << "name " << YYText() << '\n';
+
+@{string@}  cout << "string " << YYText() << '\n';
+
+%%
+
+int main( int /* argc */, char** /* argv */ )
+    @{
+    FlexLexer* lexer = new yyFlexLexer;
+    while(lexer->yylex() != 0)
+        ;
+    return 0;
+    @}
+@end example
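+
+The @samp{%option yyclass} mechanism described earlier can be sketched
+in the same style.  The subclass name @code{WordCounter} and its member
+@code{words} are hypothetical, and the example again assumes a
+@file{FlexLexer.h} that works with the standard @samp{<iostream>}
+header:
+
+@example
+%option c++ noyywrap yyclass="WordCounter"
+
+%@{
+#include <iostream>
+
+// Hypothetical subclass; the rule actions below become the body of
+// WordCounter::yylex() and so can use its members directly.
+class WordCounter : public yyFlexLexer
+    @{
+public:
+    WordCounter() : words( 0 ) @{@}
+    virtual int yylex();
+    int words;
+    @};
+%@}
+
+%%
+[A-Za-z]+   ++words;
+.|\n        /* ignore */
+%%
+
+int main()
+    @{
+    WordCounter lexer;
+    lexer.yylex();
+    std::cout << lexer.words << " words\n";
+    return 0;
+    @}
+@end example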
+
+If you want to create multiple (different) lexer classes,
+you use the @samp{-P} flag (or the @samp{prefix=} option) to rename each
+@code{yyFlexLexer} to some other @code{xxFlexLexer}.  You then can
+include @samp{<FlexLexer.h>} in your other sources once per lexer
+class, first renaming @code{yyFlexLexer} as follows:
+
+@example
+#undef yyFlexLexer
+#define yyFlexLexer xxFlexLexer
+#include <FlexLexer.h>
+
+#undef yyFlexLexer
+#define yyFlexLexer zzFlexLexer
+#include <FlexLexer.h>
+@end example
+
+if, for example, you used @samp{%option prefix="xx"} for one of
+your scanners and @samp{%option prefix="zz"} for the other.
+
+IMPORTANT: the present form of the scanning class is
+@emph{experimental} and may change considerably between major
+releases.
+
+@node Incompatibilities, Diagnostics, C++, Top
+@section Incompatibilities with @code{lex} and POSIX
+
+@code{flex} is a rewrite of the AT&T Unix @code{lex} tool (the two
+implementations do not share any code, though), with some
+extensions and incompatibilities, both of which are of
+concern to those who wish to write scanners acceptable to
+either implementation.  Flex is fully compliant with the
+POSIX @code{lex} specification, except that when using @samp{%pointer}
+(the default), a call to @samp{unput()} destroys the contents of
+@code{yytext}, which is counter to the POSIX specification.
+
+In this section we discuss all of the known areas of
+incompatibility between flex, AT&T lex, and the POSIX
+specification.
+
+@code{flex}'s @samp{-l} option turns on maximum compatibility with the
+original AT&T @code{lex} implementation, at the cost of a major
+loss in the generated scanner's performance.  We note
+below which incompatibilities can be overcome using the @samp{-l}
+option.
+
+@code{flex} is fully compatible with @code{lex} with the following
+exceptions:
+
+@itemize -
+@item
+The undocumented @code{lex} scanner internal variable @code{yylineno}
+is not supported unless @samp{-l} or @samp{%option yylineno} is used.
+@code{yylineno} should be maintained on a per-buffer basis, rather
+than a per-scanner (single global variable) basis.  @code{yylineno} is
+not part of the POSIX specification.
+
+@item
+The @samp{input()} routine is not redefinable, though it
+may be called to read characters following whatever
+has been matched by a rule.  If @samp{input()} encounters
+an end-of-file the normal @samp{yywrap()} processing is
+done.  A ``real'' end-of-file is returned by
+@samp{input()} as @code{EOF}.
+
+Input is instead controlled by defining the
+@code{YY_INPUT} macro.
+
+The @code{flex} restriction that @samp{input()} cannot be
+redefined is in accordance with the POSIX
+specification, which simply does not specify any way of
+controlling the scanner's input other than by making
+an initial assignment to @code{yyin}.
+
+@item
+The @samp{unput()} routine is not redefinable.  This
+restriction is in accordance with POSIX.
+
+@item
+@code{flex} scanners are not as reentrant as @code{lex} scanners.
+In particular, if you have an interactive scanner
+and an interrupt handler which long-jumps out of
+the scanner, and the scanner is subsequently called
+again, you may get the following message:
+
+@example
+fatal flex scanner internal error--end of buffer missed
+@end example
+
+To reenter the scanner, first use
+
+@example
+yyrestart( yyin );
+@end example
+
+Note that this call will throw away any buffered
+input; usually this isn't a problem with an
+interactive scanner.
+
+Also note that flex C++ scanner classes @emph{are}
+reentrant, so if using C++ is an option for you, you
+should use them instead.  See "Generating C++
+Scanners" above for details.
+
+@item
+@samp{output()} is not supported.  Output from the @samp{ECHO}
+macro is done to the file-pointer @code{yyout} (default
+@code{stdout}).
+
+@samp{output()} is not part of the POSIX specification.
+
+@item
+@code{lex} does not support exclusive start conditions
+(%x), though they are in the POSIX specification.
+
+@item
+When definitions are expanded, @code{flex} encloses them
+in parentheses.  With lex, the following:
+
+@example
+NAME    [A-Z][A-Z0-9]*
+%%
+foo@{NAME@}?      printf( "Found it\n" );
+%%
+@end example
+
+will not match the string "foo" because when the
+macro is expanded the rule is equivalent to
+"foo[A-Z][A-Z0-9]*?" and the precedence is such that the
+'?' is associated with "[A-Z0-9]*".  With @code{flex}, the
+rule will be expanded to "foo([A-Z][A-Z0-9]*)?" and
+so the string "foo" will match.
+
+Note that if the definition begins with @samp{^} or ends
+with @samp{$} then it is @emph{not} expanded with parentheses, to
+allow these operators to appear in definitions
+without losing their special meanings.  But the
+@samp{<s>}, @samp{/}, and @samp{<<EOF>>} operators cannot be used in a
+@code{flex} definition.
+
+Using @samp{-l} results in the @code{lex} behavior of no
+parentheses around the definition.
+
+The POSIX specification is that the definition be enclosed in
+parentheses.
+
+@item
+Some implementations of @code{lex} allow a rule's action to begin on
+a separate line, if the rule's pattern has trailing whitespace:
+
+@example
+%%
+foo|bar<space here>
+  @{ foobar_action(); @}
+@end example
+
+@code{flex} does not support this feature.
+
+@item
+The @code{lex} @samp{%r} (generate a Ratfor scanner) option is
+not supported.  It is not part of the POSIX
+specification.
+
+@item
+After a call to @samp{unput()}, @code{yytext} is undefined until
+the next token is matched, unless the scanner was
+built using @samp{%array}.  This is not the case with @code{lex}
+or the POSIX specification.  The @samp{-l} option does
+away with this incompatibility.
+
+@item
+The precedence of the @samp{@{@}} (numeric range) operator
+is different.  @code{lex} interprets "abc@{1,3@}" as "match
+one, two, or three occurrences of 'abc'", whereas
+@code{flex} interprets it as "match 'ab' followed by one,
+two, or three occurrences of 'c'".  The latter is
+in agreement with the POSIX specification.
+
+@item
+The precedence of the @samp{^} operator is different.  @code{lex}
+interprets "^foo|bar" as "match either 'foo' at the
+beginning of a line, or 'bar' anywhere", whereas
+@code{flex} interprets it as "match either 'foo' or 'bar'
+if they come at the beginning of a line".  The
+latter is in agreement with the POSIX specification.
+
+@item
+The special table-size declarations such as @samp{%a}
+supported by @code{lex} are not required by @code{flex} scanners;
+@code{flex} ignores them.
+
+@item
+The name @code{FLEX_SCANNER} is #define'd so scanners may
+be written for use with either @code{flex} or @code{lex}.
+Scanners also include @code{YY_FLEX_MAJOR_VERSION} and
+@code{YY_FLEX_MINOR_VERSION} indicating which version of
+@code{flex} generated the scanner (for example, for the
+2.5 release, these defines would be 2 and 5
+respectively).
+@end itemize
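+
+A sketch of how the @code{FLEX_SCANNER} and version names might be
+tested in a specification meant for both tools (the @samp{version}
+pattern and the messages are only an example):
+
+@example
+%%
+version   @{
+#ifdef FLEX_SCANNER
+          printf( "generated by flex %d.%d\n",
+                  YY_FLEX_MAJOR_VERSION, YY_FLEX_MINOR_VERSION );
+#else
+          printf( "generated by another lex\n" );
+#endif
+          @}
+.|\n      ;
+%%
+
+int yywrap( void )
+    @{
+    return 1;
+    @}
+
+int main( void )
+    @{
+    yylex();
+    return 0;
+    @}
+@end example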
+
+The following @code{flex} features are not included in @code{lex} or the
+POSIX specification:
+
+@example
+C++ scanners
+%option
+start condition scopes
+start condition stacks
+interactive/non-interactive scanners
+yy_scan_string() and friends
+yyterminate()
+yy_set_interactive()
+yy_set_bol()
+YY_AT_BOL()
+<<EOF>>
+<*>
+YY_DECL
+YY_START
+YY_USER_ACTION
+YY_USER_INIT
+#line directives
+%@{@}'s around actions
+multiple actions on a line
+@end example
+
+@noindent
+plus almost all of the flex flags.  The last feature in
+the list refers to the fact that with @code{flex} you can put
+multiple actions on the same line, separated with
+semicolons, while with @code{lex}, the following
+
+@example
+foo    handle_foo(); ++num_foos_seen;
+@end example
+
+@noindent
+is (rather surprisingly) truncated to
+
+@example
+foo    handle_foo();
+@end example
+
+@code{flex} does not truncate the action.  Actions that are not
+enclosed in braces are simply terminated at the end of the
+line.
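+
+One way to write such a rule so that both statements are kept,
+sketched here with hypothetical helper names, is to enclose the
+action in braces:
+
+@example
+%@{
+#include <stdio.h>
+
+static int num_foos_seen = 0;
+
+static void handle_foo( void )
+    @{
+    puts( "saw foo" );
+    @}
+%@}
+
+%%
+foo     @{ handle_foo(); ++num_foos_seen; @}
+.|\n    ;
+%%
+
+int yywrap( void )
+    @{
+    return 1;
+    @}
+
+int main( void )
+    @{
+    yylex();
+    printf( "%d\n", num_foos_seen );
+    return 0;
+    @}
+@end example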
+
+@node Diagnostics, Files, Incompatibilities, Top
+@section Diagnostics
+
+@table @samp
+@item warning, rule cannot be matched
+indicates that the given
+rule cannot be matched because it follows other rules that
+will always match the same text as it.  For example, in
+the following "foo" cannot be matched because it comes
+after an identifier "catch-all" rule:
+
+@example
+[a-z]+    got_identifier();
+foo       got_foo();
+@end example
+
+Using @code{REJECT} in a scanner suppresses this warning.
+
+@item warning, -s option given but default rule can be matched
+means that it is possible (perhaps only in a particular
+start condition) that the default rule (match any single
+character) is the only one that will match a particular
+input.  Since @samp{-s} was given, presumably this is not
+intended.
+
+@item reject_used_but_not_detected undefined
+@itemx yymore_used_but_not_detected undefined
+These errors can
+occur at compile time.  They indicate that the scanner
+uses @code{REJECT} or @samp{yymore()} but that @code{flex} failed to notice the
+fact, meaning that @code{flex} scanned the first two sections
+looking for occurrences of these actions and failed to
+find any, but somehow you snuck some in (via a #include
+file, for example).  Use @samp{%option reject} or @samp{%option yymore}
+to indicate to flex that you really do use these features.
+
+@item flex scanner jammed
+a scanner compiled with @samp{-s} has
+encountered an input string which wasn't matched by any of
+its rules.  This error can also occur due to internal
+problems.
+
+@item token too large, exceeds YYLMAX
+your scanner uses @samp{%array}
+and one of its rules matched a string longer than the
+@code{YYLMAX} constant (8K bytes by default).  You can increase the
+value by #define'ing @code{YYLMAX} in the definitions section of
+your @code{flex} input.
+
+@item scanner requires -8 flag to use the character '@var{x}'
+Your
+scanner specification includes recognizing the 8-bit
+character @var{x}, but you did not specify the @samp{-8} flag and your
+scanner defaulted to 7-bit because you used the @samp{-Cf} or @samp{-CF}
+table compression options.  See the discussion of the @samp{-7}
+flag for details.
+
+@item flex scanner push-back overflow
+you used @samp{unput()} to push
+back so much text that the scanner's buffer could not hold
+both the pushed-back text and the current token in @code{yytext}.
+Ideally the scanner should dynamically resize the buffer
+in this case, but at present it does not.
+
+@item input buffer overflow, can't enlarge buffer because scanner uses REJECT
+the scanner was working on matching an
+extremely large token and needed to expand the input
+buffer.  This doesn't work with scanners that use @code{REJECT}.
+
+@item fatal flex scanner internal error--end of buffer missed
+This can occur in a scanner which is reentered after a
+long-jump has jumped out (or over) the scanner's
+activation frame.  Before reentering the scanner, use:
+
+@example
+yyrestart( yyin );
+@end example
+
+@noindent
+or, as noted above, switch to using the C++ scanner class.
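+
+The following is only a sketch of the @samp{yyrestart()} approach.  It
+assumes that some action calls @samp{longjmp()} on the hypothetical
+@code{recover} buffer and that the scanner uses the default
+@samp{yylex()} declaration; @code{scan_all} is likewise an illustrative
+name:
+
+@example
+#include <setjmp.h>
+#include <stdio.h>
+
+extern FILE *yyin;
+extern int yylex( void );
+extern void yyrestart( FILE * );
+
+jmp_buf recover;    /* hypothetical: target of a longjmp() in an action */
+
+int scan_all( void )
+    @{
+    if ( setjmp( recover ) != 0 )
+        /* Arrived via longjmp(): reset the scanner before reentry. */
+        yyrestart( yyin );
+
+    return yylex();
+    @}
+@end example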
+
+@item too many start conditions in <> construct!
+you listed
+more start conditions in a <> construct than exist (so you
+must have listed at least one of them twice).
+@end table
+
+@node Files, Deficiencies, Diagnostics, Top
+@section Files
+
+@table @file
+@item -lfl
+library with which scanners must be linked.
+
+@item lex.yy.c
+generated scanner (called @file{lexyy.c} on some systems).
+
+@item lex.yy.cc
+generated C++ scanner class, when using @samp{-+}.
+
+@item <FlexLexer.h>
+header file defining the C++ scanner base class,
+@code{FlexLexer}, and its derived class, @code{yyFlexLexer}.
+
+@item flex.skl
+skeleton scanner.  This file is only used when
+building flex, not when flex executes.
+
+@item lex.backup
+backing-up information for @samp{-b} flag (called @file{lex.bck}
+on some systems).
+@end table
+
+@node Deficiencies, See also, Files, Top
+@section Deficiencies / Bugs
+
+Some trailing context patterns cannot be properly matched
+and generate warning messages ("dangerous trailing
+context").  These are patterns where the ending of the first
+part of the rule matches the beginning of the second part,
+such as "zx*/xy*", where the 'x*' matches the 'x' at the
+beginning of the trailing context.  (Note that the POSIX
+draft states that the text matched by such patterns is
+undefined.)
+
+For some trailing context rules, parts which are actually
+fixed-length are not recognized as such, leading to the
+abovementioned performance loss.  In particular, parts
+using '|' or @{n@} (such as "foo@{3@}") are always considered
+variable-length.
+
+Combining trailing context with the special '|' action can
+result in @emph{fixed} trailing context being turned into the
+more expensive @emph{variable} trailing context.  This happens,
+for example, in the following:
+
+@example
+%%
+abc      |
+xyz/def
+@end example
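+
+One possible workaround, offered as a sketch rather than an official
+recommendation, is to avoid the '|' action and repeat the action on
+each rule, so that the trailing context rule stands alone;
+@code{found()} is a hypothetical action:
+
+@example
+%%
+abc      found();
+xyz/def  found();
+@end example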
+
+Use of @samp{unput()} invalidates @code{yytext} and @code{yyleng}, unless the
+@samp{%array} directive or the @samp{-l} option has been used.
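+
+When neither applies, copy the token before pushing anything back.
+The sketch below assumes that @code{<stdlib.h>} and @code{<string.h>}
+are included in the definitions section; the pattern is arbitrary and
+@code{got_name()} is a hypothetical consumer of the saved text:
+
+@example
+[a-z]+"="   @{
+            int len = yyleng;
+            char *copy = malloc( len + 1 );   /* private copy of the token */
+
+            if ( copy )
+                @{
+                memcpy( copy, yytext, len + 1 );
+                unput( '=' );           /* yytext, yyleng now invalid */
+                copy[len - 1] = '\0';   /* drop the '=' from the copy */
+                got_name( copy );
+                free( copy );
+                @}
+            @}
+@end example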
+
+Pattern-matching of NUL's is substantially slower than
+matching other characters.
+
+Dynamic resizing of the input buffer is slow, as it
+entails rescanning all the text matched so far by the
+current (generally huge) token.
+
+Due to both buffering of input and read-ahead, you cannot
+intermix calls to <stdio.h> routines, such as
+@samp{getchar()}, with @code{flex} rules and expect it to work.
+Call @samp{input()} instead.
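+
+For instance, an action that discards the rest of a line might be
+sketched as follows; the @samp{#} comment syntax is an assumption made
+for illustration:
+
+@example
+"#"       @{
+          int c;
+
+          /* Read through the scanner's own buffer, not <stdio.h>. */
+          while ( (c = input()) != '\n' && c != EOF )
+              ;
+          @}
+@end example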
+
+The total number of table entries listed by the @samp{-v} flag
+excludes the table entries needed to determine which rule has
+been matched.  The number of such entries is equal to the
+number of DFA states if the scanner does not use @code{REJECT}, and
+somewhat greater than the number of states if it does.
+
+@code{REJECT} cannot be used with the @samp{-f} or @samp{-F} options.
+
+The @code{flex} internal algorithms need documentation.
+
+@node See also, Author, Deficiencies, Top
+@section See also
+
+@code{lex}(1), @code{yacc}(1), @code{sed}(1), @code{awk}(1).
+
+John Levine, Tony Mason, and Doug Brown: Lex & Yacc;
+O'Reilly and Associates.  Be sure to get the 2nd edition.
+
+M. E. Lesk and E. Schmidt, LEX - Lexical Analyzer Generator.
+
+Alfred Aho, Ravi Sethi and Jeffrey Ullman: Compilers:
+Principles, Techniques and Tools; Addison-Wesley (1986).
+Describes the pattern-matching techniques used by @code{flex}
+(deterministic finite automata).
+
+@node Author,  , See also, Top
+@section Author
+
+Vern Paxson, with the help of many ideas and much inspiration from
+Van Jacobson.  Original version by Jef Poskanzer.  The fast table
+representation is a partial implementation of a design done by Van
+Jacobson.  The implementation was done by Kevin Gong and Vern Paxson.
+
+Thanks to the many @code{flex} beta-testers, feedbackers, and
+contributors, especially Francois Pinard, Casey Leedom, Stan
+Adermann, Terry Allen, David Barker-Plummer, John Basrai, Nelson
+H.F. Beebe, @samp{benson@@odi.com}, Karl Berry, Peter A. Bigot,
+Simon Blanchard, Keith Bostic, Frederic Brehm, Ian Brockbank, Kin
+Cho, Nick Christopher, Brian Clapper, J.T. Conklin, Jason Coughlin,
+Bill Cox, Nick Cropper, Dave Curtis, Scott David Daniels, Chris
+G. Demetriou, Theo Deraadt, Mike Donahue, Chuck Doucette, Tom Epperly,
+Leo Eskin, Chris Faylor, Chris Flatters, Jon Forrest, Joe Gayda, Kaveh
+R. Ghazi, Eric Goldman, Christopher M.  Gould, Ulrich Grepel, Peer
+Griebel, Jan Hajic, Charles Hemphill, NORO Hideo, Jarkko Hietaniemi,
+Scott Hofmann, Jeff Honig, Dana Hudes, Eric Hughes, John Interrante,
+Ceriel Jacobs, Michal Jaegermann, Sakari Jalovaara, Jeffrey R. Jones,
+Henry Juengst, Klaus Kaempf, Jonathan I. Kamens, Terrence O Kane,
+Amir Katz, @samp{ken@@ken.hilco.com}, Kevin B. Kenny, Steve Kirsch,
+Winfried Koenig, Marq Kole, Ronald Lamprecht, Greg Lee, Rohan Lenard,
+Craig Leres, John Levine, Steve Liddle, Mike Long, Mohamed el Lozy,
+Brian Madsen, Malte, Joe Marshall, Bengt Martensson, Chris Metcalf,
+Luke Mewburn, Jim Meyering, R.  Alexander Milowski, Erik Naggum,
+G.T. Nicol, Landon Noll, James Nordby, Marc Nozell, Richard Ohnemus,
+Karsten Pahnke, Sven Panne, Roland Pesch, Walter Pelissero, Gaumond
+Pierre, Esmond Pitt, Jef Poskanzer, Joe Rahmeh, Jarmo Raiha, Frederic
+Raimbault, Pat Rankin, Rick Richardson, Kevin Rodgers, Kai Uwe Rommel,
+Jim Roskind, Alberto Santini, Andreas Scherer, Darrell Schiebel, Raf
+Schietekat, Doug Schmidt, Philippe Schnoebelen, Andreas Schwab, Alex
+Siegel, Eckehard Stolz, Jan-Erik Str@"omquist, Mike Stump, Paul Stuart,
+Dave Tallman, Ian Lance Taylor, Chris Thewalt, Richard M. Timoney,
+Jodi Tsai, Paul Tuinenga, Gary Weik, Frank Whaley, Gerhard Wilhelms,
+Kent Williams, Ken Yap, Ron Zellar, Nathan Zelle, David Zuhn, and
+those whose names have slipped my marginal mail-archiving skills but
+whose contributions are appreciated all the same.
+
+Thanks to Keith Bostic, Jon Forrest, Noah Friedman, John Gilmore,
+Craig Leres, John Levine, Bob Mulcahy, G.T.  Nicol, Francois Pinard,
+Rich Salz, and Richard Stallman for help with various distribution
+headaches.
+
+Thanks to Esmond Pitt and Earle Horton for 8-bit character support;
+to Benson Margulies and Fred Burke for C++ support; to Kent Williams
+and Tom Epperly for C++ class support; to Ove Ewerlid for support of
+NUL's; and to Eric Hughes for support of multiple buffers.
+
+This work was primarily done when I was with the Real Time Systems
+Group at the Lawrence Berkeley Laboratory in Berkeley, CA.  Many thanks
+to all there for the support I received.
+
+Send comments to @samp{vern@@ee.lbl.gov}.
+
+@c @node Index,  , Top, Top
+@c @unnumbered Index
+@c
+@c @printindex cp
+
+@contents
+@bye
+
+@c Local variables:
+@c texinfo-column-for-description: 32
+@c End: