author Vern Paxson <vern@ee.lbl.gov> 1995-03-06 15:53:22 +0000
committer Vern Paxson <vern@ee.lbl.gov> 1995-03-06 15:53:22 +0000
commit a564f393fa20da7ed88f0e2bd69ef5fff253db46 (patch)
tree 4bda63b8e3b5db7078ea810329b2610db555fd1f
parent eeb098f88409fbe0e4e921cf3c8be43b0d87909c (diff)
2.5.0.7
-rw-r--r-- NEWS 104
-rw-r--r-- flex.1 211
2 files changed, 287 insertions, 28 deletions
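
Reviewer note on the yy_init entry in the NEWS hunk of this patch: the entry says the generated scanner clears yy_init before running YY_USER_INIT, so user code can set it again to request another initialization pass. The following is only a hand-written mock of that ordering, not flex output. Apart from yy_init, every name in it (mock_yylex, mock_user_init, want_reinit, mock_user_init_runs) is hypothetical.

```c
/* Mock of the initialization order described in the NEWS entry.
 * This is NOT generated scanner code; only the name yy_init is
 * borrowed from flex, everything else is made up for illustration. */

static int yy_init = 1;        /* non-zero: scanner still needs initializing */
static int user_init_runs = 0; /* how often the mock init hook has run */
static int want_reinit = 1;    /* ask for exactly one extra init pass */

/* Plays the role of YY_USER_INIT. */
static void mock_user_init(void)
{
    ++user_init_runs;
    if (want_reinit) {
        /* Because yy_init was already cleared, setting it here survives
         * and forces initialization again on the next call. */
        yy_init = 1;
        want_reinit = 0;
    }
}

/* Plays the role of yylex(): clear the flag first, then run the hook. */
int mock_yylex(void)
{
    if (yy_init) {
        yy_init = 0;
        mock_user_init();
    }
    return 0; /* the mock never produces a token */
}

/* Accessor so callers can observe how many init passes happened. */
int mock_user_init_runs(void)
{
    return user_init_runs;
}
```

With this mock, three consecutive calls to mock_yylex() run the init hook twice: once on the first call, and once more on the second call because the hook re-armed yy_init. Had the flag been cleared after the hook instead, the re-arm would have been overwritten.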
diff --git a/NEWS b/NEWS
index 8a03bb1..035e045 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
+Changes between release 2.5.0.7 (06Mar95) and release 2.4.7:
- A new concept of "start condition" scope has been introduced.
A start condition scope is begun with:
@@ -92,10 +92,12 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
reject
yymore
- Two %option's take string-delimited values, offset with '=':
+ Three %option's take string-delimited values, offset with '=':
outfile="<name>" equivalent to -o<name>
prefix="<name>" equivalent to -P<name>
+ yyclass="<name>" set the name of the C++ scanning class
+ (see below)
A number of %option's are available for lint purists who
want to suppress the appearance of unneeded routines in
@@ -205,7 +207,7 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
"interactive". An interactive buffer is processed more slowly,
but must be used when the scanner's input source is indeed
interactive to avoid problems due to waiting to fill buffers
- (see the discussion of the -I flag in flexdoc). A non-zero value
+ (see the discussion of the -I flag in flex.1). A non-zero value
in the macro invocation marks the buffer as interactive, a zero
value as non-interactive. Note that use of this macro overrides
"%option always-interactive" or "%option never-interactive".
@@ -239,19 +241,68 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
- The macro YY_AT_BOL() returns true if the next token scanned from
the current buffer will have '^' rules active, false otherwise.
- - Flex now generates #line directives relating the code it
- produces to the output file; this means that error messages
- in the flex-generated code should be correctly pinpointed.
+ - The new function
+
+ void yy_flush_buffer( struct yy_buffer_state* b )
+
+ flushes the contents of the current buffer (i.e., next time
+ the scanner attempts to match a token using b as the current
+ buffer, it will begin by invoking YY_INPUT to fill the buffer).
+ This routine is also available to C++ scanners (unlike some
+ of the other new routines).
+
+ The related macro
+
+ YY_FLUSH_BUFFER
+
+ flushes the contents of the current buffer.
- A new "-ooutput" option writes the generated scanner to "output".
If used with -t, the scanner is still written to stdout, but
its internal #line directives (see previous item) use "output".
+ - Flex now generates #line directives relating the code it
+ produces to the output file; this means that error messages
+ in the flex-generated code should be correctly pinpointed.
+
- When generating #line directives, filenames with embedded '\'s
have those characters escaped (i.e., turned into '\\'). This
feature helps with reporting filenames for some MS-DOS and OS/2
systems.
+ - The FlexLexer class includes two new public member functions:
+
+ virtual void switch_streams( istream* new_in = 0,
+ ostream* new_out = 0 )
+
+ reassigns yyin to new_in (if non-nil) and yyout to new_out
+ (ditto), deleting the previous input buffer if yyin is
+ reassigned. It is used by:
+
+ int yylex( istream* new_in = 0, ostream* new_out = 0 )
+
+ which first calls switch_streams() and then returns the value
+ of calling yylex().
+
+ - C++ scanners now have yy_flex_debug as a member variable of
+ FlexLexer rather than a global.
+
+ - When generating a C++ scanning class, you can now use
+
+ %option yyclass="foo"
+
+ to inform flex that you have derived "foo" as a subclass of
+ yyFlexLexer, so flex will place your actions in the member
+ function foo::yylex() instead of yyFlexLexer::yylex(). It also
+ generates a yyFlexLexer::yylex() member function that generates a
+ run-time error if called (by invoking yyFlexLexer::LexerError()).
+ This feature is necessary if your subclass "foo" introduces some
+ additional member functions or variables that you need to access
+ from yylex().
+
+ - Current texinfo files in MISC/texinfo, contributed by Francois
+ Pinard.
+
- You can now change the name "flex" to something else (e.g., "lex")
by redefining $(FLEX) in the Makefile.
@@ -270,7 +321,7 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
- C++ scanner objects now work with the -P option. You include
<FlexLexer.h> once per scanner - see comments in <FlexLexer.h>
- (or flexdoc) for details.
+ (or flex.1) for details.
- C++ FlexLexer objects now use the "cerr" stream to report -d output
instead of stdio.
@@ -286,10 +337,39 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
- Scanners generated using -l lex compatibility now have the symbol
YY_FLEX_LEX_COMPAT #define'd.
+ - When initializing (i.e., yy_init is non-zero on entry to yylex()),
+ generated scanners now set yy_init to zero before executing
+ YY_USER_INIT. This means that you can set yy_init back to a
+ non-zero value in YY_USER_INIT if you need the scanner to be
+ reinitialized on the next call.
+
- You can now use "#line" directives in the first section of your
scanner specification.
- - Improved support for MS-DOS.
+ - Improved support for MS-DOS. The flex sources have been successfully
+ built, unmodified, for Borland 4.02 (all that's required is a
+ Borland Makefile and config.h file, which are supplied in
+ MISC/Borland - contributed by Terrence O Kane).
+
+ - Improved support for Macintosh using Think C - the sources should
+ build for this platform "out of the box". Contributed by Scott
+ Hofmann.
+
+ - Improved support for VMS, in MISC/VMS/, contributed by Pat Rankin.
+
+ - Support for the Amiga, in MISC/Amiga/, contributed by Andreas
+ Scherer. Note that the contributed files were developed for
+ flex 2.4 and have not been tested with flex 2.5.
+
+ - Some notes on support for the NeXT, in MISC/NeXT, contributed
+ by Raf Schietekat.
+
+ - The MISC/ directory now includes a preformatted version of flex.1
+ in flex.man, and pre-yacc'd versions of parse.y in parse.{c,h}.
+
+ - The flex.1 and flexdoc.1 manual pages have been merged. There
+ is now just one document, flex.1, which includes an overview
+ at the beginning to help you find the section you need.
- Documentation now clarifies that start conditions persist across
switches to new input files or different input buffers. If you
@@ -313,6 +393,14 @@ Changes between release 2.5.0.5 (10Jan95) and release 2.4.7:
- Documentation now stresses that you gain the benefits of removing
backing-up states only if you remove *all* of them.
+ - Documentation now points out that traditional lex allows you
+ to put the action on a separate line from the rule pattern if
+ the pattern has trailing whitespace (ugh!), but flex doesn't
+ support this.
+
+ - A broken example in documentation of the difference between
+ inclusive and exclusive start conditions is now fixed.
+
- Usage (-h) report now goes to stdout.
- Version (-V) info now goes to stdout.
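
Reviewer note on the r/s hunk in the flex.1 diff of this patch: the reworded text says the part matched by s counts toward the "longest match" decision but is handed back to the input before the action runs. The following is a minimal sketch of that rule in plain C, assuming two hard-coded rules, foo/bar followed by foob. It does not use flex at all, and struct match, prefix_len and scan_once are hypothetical names.

```c
#include <string.h>

/* Hypothetical result of one scanning step; not part of flex. */
struct match {
    int rule;      /* 1 = foo/bar, 2 = foob, 0 = neither rule matched */
    size_t yyleng; /* length of the text the action would see */
};

/* Length of lit if input starts with it, otherwise 0. */
static size_t prefix_len(const char *input, const char *lit)
{
    size_t n = strlen(lit);
    return strncmp(input, lit, n) == 0 ? n : 0;
}

/* Emulates two rules, in this order:  foo/bar  then  foob */
struct match scan_once(const char *input)
{
    struct match m = {0, 0};
    size_t r = prefix_len(input, "foo");
    size_t s = r ? prefix_len(input + r, "bar") : 0;
    /* The trailing context counts toward the competing match length. */
    size_t rule1 = (r && s) ? r + s : 0;
    size_t rule2 = prefix_len(input, "foob");

    if (rule1 && rule1 >= rule2) { /* a tie goes to the earlier rule */
        m.rule = 1;
        m.yyleng = r; /* text matched by s goes back to the input */
    } else if (rule2) {
        m.rule = 2;
        m.yyleng = rule2;
    }
    return m;
}
```

On the input "foobar" the first rule competes with length 6 against 4 for foob, so it wins, yet its action sees only the 3 characters of "foo". On "foobaz" the trailing context fails and foob matches 4 characters instead.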
diff --git a/flex.1 b/flex.1
index 68461c0..636c1c7 100644
--- a/flex.1
+++ b/flex.1
@@ -380,8 +380,16 @@ expressions. These are:
r/s an r but only if it is followed by an s. The
- s is not part of the matched text. This type
- of pattern is called as "trailing context".
+ text matched by s is included when determining
+ whether this rule is the "longest match",
+ but is then returned to the input before
+ the action is executed. So the action only
+ sees the text matched by r. This type
+ of pattern is called "trailing context".
+ (There are some combinations of r/s that flex
+ cannot match correctly; see notes in the
+ Deficiencies / Bugs section below regarding
+ "dangerous trailing context".)
^r an r, but only at the beginning of a line (i.e.,
which just starting to scan, or right after a
newline has been scanned).
@@ -970,6 +978,16 @@ in order to avoid a name clash with the
stream by the name of
.I input.)
.IP -
+.B YY_FLUSH_BUFFER
+flushes the scanner's internal buffer
+so that the next time the scanner attempts to match a token, it will
+first refill the buffer using
+.B YY_INPUT
+(see The Generated Scanner, below). This action is a special case
+of the more general
+.B yy_flush_buffer()
+function, described below in the section Multiple Input Buffers.
+.IP -
.B yyterminate()
can be used in lieu of a return statement in an action. It terminates
the scanner and returns a 0 to the scanner's caller, indicating "all done".
@@ -1048,7 +1066,17 @@ of
and because it can be used to switch input files in the middle of scanning.
It can also be used to throw away the current input buffer, by calling
it with an argument of
-.I yyin.
+.I yyin;
+but better is to use
+.B YY_FLUSH_BUFFER
+(see above).
+Note that
+.B yyrestart()
+does
+.I not
+reset the start condition to
+.B INITIAL
+(see Start Conditions, below).
.PP
If
.B yylex()
@@ -1109,9 +1137,15 @@ it does
revert to
.B INITIAL.
.PP
-The default
+If you do not supply your own version of
+.B yywrap(),
+then you must either use
+.B %option noyywrap
+(in which case the scanner behaves as though
.B yywrap()
-always returns 1.
+returned 1), or you must link with
+.B \-lfl
+to obtain the default version of the routine, which always returns 1.
.PP
Three routines are available for scanning from in-memory buffers rather
than files:
@@ -1193,7 +1227,10 @@ connection between the two. The set of rules:
%s example
%%
- <example>foo /* do something */
+
+ <example>foo do_something();
+
+ bar something_else();
.fi
is equivalent to
@@ -1201,9 +1238,34 @@ is equivalent to
%x example
%%
- <INITIAL,example>foo /* do something */
+
+ <example>foo do_something();
+
+ <INITIAL,example>bar something_else();
.fi
+Without the
+.B <INITIAL,example>
+qualifier, the
+.I bar
+pattern in the second example wouldn't be active (i.e., couldn't match)
+when in start condition
+.B example.
+If we just used
+.B <example>
+to qualify
+.I bar,
+though, then it would only be active in
+.B example
+and not in
+.B INITIAL,
+while in the first example it's active in both, because in the first
+example the
+.B example
+start condition is an
+.I inclusive
+.B (%s)
+start condition.
.PP
Also note that the special start-condition specifier
.B <*>
@@ -1213,13 +1275,22 @@ have been written;
%x example
%%
- <*>foo /* do something */
+
+ <example>foo do_something();
+
+ <*>bar something_else();
.fi
.PP
The default rule (to
.B ECHO
-any unmatched character) remains active in start conditions.
+any unmatched character) remains active in start conditions. It
+is equivalent to:
+.nf
+
+ <*>.|\\n ECHO;
+
+.fi
.PP
.B BEGIN(0)
returns to the original state where only the rules with
@@ -1571,6 +1642,16 @@ change the start condition.
.fi
is used to reclaim the storage associated with a buffer.
+You can also clear the current contents of a buffer using:
+.nf
+
+ void yy_flush_buffer( YY_BUFFER_STATE buffer )
+
+.fi
+This function discards the buffer's contents,
+so the next time the scanner attempts to match a token from the
+buffer, it will first fill the buffer anew using
+.B YY_INPUT.
.PP
.B yy_new_buffer()
is an alias for
@@ -2036,7 +2117,7 @@ The result is large but fast. This option is equivalent to
generates a "help" summary of
.I flex's
options to
-.I stderr
+.I stdout
and then exits.
.B \-?
and
@@ -2263,7 +2344,7 @@ finite automata. This option is mostly for use in maintaining
.TP
.B \-V
prints the version number to
-.I stderr
+.I stdout
and exits.
.B \-\-version
is a synonym for
@@ -2502,6 +2583,7 @@ Here are all of the names affected:
yy_delete_buffer
yy_flex_debug
yy_init_buffer
+ yy_flush_buffer
yy_load_buffer_state
yy_switch_to_buffer
yyin
@@ -2666,7 +2748,7 @@ unsetting them to indicate it actually is not used
(e.g.,
.B %option noyymore).
.PP
-Two options take string-delimited values, offset with '=':
+Three options take string-delimited values, offset with '=':
.nf
%option outfile="ABC"
@@ -2682,6 +2764,32 @@ and
.fi
is equivalent to
.B -PXYZ.
+Finally,
+.nf
+
+ %option yyclass="foo"
+
+.fi
+only applies when generating a C++ scanner (
+.B \-+
+option). It informs
+.I flex
+that you have derived
+.B foo
+as a subclass of
+.B yyFlexLexer,
+so
+.I flex
+will place your actions in the member function
+.B foo::yylex()
+instead of
+.B yyFlexLexer::yylex().
+It also generates a
+.B yyFlexLexer::yylex()
+member function that emits a run-time error (by invoking
+.B yyFlexLexer::LexerError())
+if called.
+See Generating C++ Scanners, below, for additional information.
.PP
A number of options are available for lint purists who want to suppress
the appearance of unneeded routines in the generated scanner. Each of the
@@ -3078,6 +3186,7 @@ Also provided are member functions equivalent to
.B istream*
object pointer and not a
.B FILE*),
+.B yy_flush_buffer(),
.B yy_delete_buffer(),
and
.B yyrestart()
@@ -3108,7 +3217,54 @@ respectively.
performs the same role is
.B yylex()
does for ordinary flex scanners: it scans the input stream, consuming
-tokens, until a rule's action returns a value.
+tokens, until a rule's action returns a value. If you derive a subclass
+.B S
+from
+.B yyFlexLexer
+and want to access the member functions and variables of
+.B S
+inside
+.B yylex(),
+then you need to use
+.B %option yyclass="S"
+to inform
+.I flex
+that you will be using that subclass instead of
+.B yyFlexLexer.
+In this case, rather than generating
+.B yyFlexLexer::yylex(),
+.I flex
+generates
+.B S::yylex()
+(and also generates a dummy
+.B yyFlexLexer::yylex()
+that calls
+.B yyFlexLexer::LexerError()
+if called).
+.TP
+.B
+virtual void switch_streams(istream* new_in = 0,
+.B
+ostream* new_out = 0)
+reassigns
+.B yyin
+to
+.B new_in
+(if non-nil)
+and
+.B yyout
+to
+.B new_out
+(ditto), deleting the previous input buffer if
+.B yyin
+is reassigned.
+.TP
+.B
+int yylex( istream* new_in = 0, ostream* new_out = 0 )
+first switches the input streams via
+.B switch_streams( new_in, new_out )
+and then returns the value of
+.B yylex().
.PP
In addition,
.B yyFlexLexer
@@ -3421,6 +3577,20 @@ behavior of no parentheses around the definition.
.IP
The POSIX specification is that the definition be enclosed in parentheses.
.IP -
+Some implementations of
+.I lex
+allow a rule's action to begin on a separate line, if the rule's pattern
+has trailing whitespace:
+.nf
+
+ %%
+ foo|bar<space here>
+ { foobar_action(); }
+
+.fi
+.I flex
+does not support this feature.
+.IP -
The
.I lex
.B %r
@@ -3773,20 +3943,20 @@ Thanks to the many
.I flex
beta-testers, feedbackers, and contributors, especially Francois Pinard,
Casey Leedom,
-David Barker-Plummer, Nelson H.F. Beebe, benson@odi.com,
+Terry Allen, David Barker-Plummer, Nelson H.F. Beebe, benson@odi.com,
Karl Berry, Peter A. Bigot, Simon Blanchard,
-Keith Bostic, Frederic Brehm, Kin Cho, Nick Christopher,
+Keith Bostic, Frederic Brehm, Ian Brockbank, Kin Cho, Nick Christopher,
Brian Clapper, J.T. Conklin,
-Jason Coughlin, Bill Cox, Dave Curtis, Scott David
+Jason Coughlin, Bill Cox, Nick Cropper, Dave Curtis, Scott David
Daniels, Chris G. Demetriou, Theo Deraadt,
Mike Donahue, Chuck Doucette, Tom Epperly, Leo Eskin,
-Chris Faylor, Chris Flatters, Jon Forrest, Kaveh R. Ghazi,
+Chris Faylor, Chris Flatters, Jon Forrest, Joe Gayda, Kaveh R. Ghazi,
Eric Goldman, Christopher M. Gould, Ulrich Grepel, Peer Griebel,
Jan Hajic, Charles Hemphill, NORO Hideo,
Jarkko Hietaniemi, Scott Hofmann,
Jeff Honig, Dana Hudes, Eric Hughes, John Interrante,
Ceriel Jacobs, Michal Jaegermann, Sakari Jalovaara, Jeffrey R. Jones,
-Henry Juengst, Klaus Kaempf, Jonathan I. Kamens,
+Henry Juengst, Klaus Kaempf, Jonathan I. Kamens, Terrence O Kane,
Amir Katz, ken@ken.hilco.com, Kevin B. Kenny,
Steve Kirsch, Winfried Koenig, Marq Kole, Ronald Lamprecht,
Greg Lee, Rohan Lenard, Craig Leres, John Levine, Steve Liddle, Mike Long,
@@ -3794,11 +3964,12 @@ Mohamed el Lozy, Brian Madsen, Malte, Joe Marshall,
Bengt Martensson, Chris Metcalf,
Luke Mewburn, Jim Meyering, R. Alexander Milowski, Erik Naggum,
G.T. Nicol, Landon Noll, Marc Nozell,
-Richard Ohnemus, Sven Panne, Roland Pesch, Walter Pelissero, Gaumond
+Richard Ohnemus, Karsten Pahnke,
+Sven Panne, Roland Pesch, Walter Pelissero, Gaumond
Pierre, Esmond Pitt, Jef Poskanzer, Joe Rahmeh, Jarmo Raiha,
Frederic Raimbault, Pat Rankin, Rick Richardson,
Kevin Rodgers, Kai Uwe Rommel, Jim Roskind, Alberto Santini,
-Darrell Schiebel, Raf Schietekat,
+Andreas Scherer, Darrell Schiebel, Raf Schietekat,
Doug Schmidt, Philippe Schnoebelen, Andreas Schwab,
Alex Siegel, Eckehard Stolz, Jan-Erik Strvmquist,
Mike Stump, Paul Stuart, Dave Tallman, Ian Lance Taylor,