The various flex options are categorized by function in the following menu. If you want to look up a particular option by name, see section Index of Scanner Options.
16.1 Options for Specifying Filenames
16.2 Options Affecting Scanner Behavior
16.3 Code-Level And API Options
16.4 Options for Scanner Speed and Size
16.5 Debugging Options
16.6 Miscellaneous Options
Even though there are many scanner options, a typical scanner might only specify the following options:
    %option 8bit reentrant bison-bridge
    %option warn nodefault
    %option yylineno
    %option outfile="scanner.c" header-file="scanner.h"
The first line specifies the general type of scanner we want. The second line specifies that we are being careful. The third line asks flex to track line numbers. The last line tells flex what to name the files. (The options can be specified in any order. We just divided them.)
flex also provides a mechanism for controlling options within the scanner specification itself, rather than from the flex command line. This is done by including %option directives in the first section of the scanner specification. You can specify multiple options with a single %option directive, and multiple directives in the first section of your flex input file.
Most options are given simply as names, optionally preceded by the word `no' (with no intervening whitespace) to negate their meaning. The names are the same as their long-option equivalents (but without the leading `--' ).
flex scans your rule actions to determine whether you use the REJECT or yymore() features. The reject and yymore options are available to override its decision as to whether you use these features, either by setting them (e.g., %option reject) to indicate the feature is indeed used, or unsetting them to indicate it actually is not used (e.g., %option noyymore).
A number of options are available for lint purists who want to suppress the appearance of unneeded routines in the generated scanner. Each of the following, if unset (e.g., %option nounput), results in the corresponding routine not appearing in the generated scanner:

    input, unput
    yy_push_state, yy_pop_state, yy_top_state
    yy_scan_buffer, yy_scan_bytes, yy_scan_string
    yyget_extra, yyset_extra, yyget_leng, yyget_text,
    yyget_lineno, yyset_lineno, yyget_in, yyset_in,
    yyget_out, yyset_out, yyget_lval, yyset_lval,
    yyget_lloc, yyset_lloc, yyget_debug, yyset_debug

(though yy_push_state() and friends won't appear anyway unless you use %option stack).
16.1 Options for Specifying Filenames
%option header-file="FILE"
Instructs flex to write a C header to `FILE'. While in the header, the macro yyIN_HEADER is defined, where `yy' is substituted with the appropriate prefix.
The `--header-file' option is not compatible with the `--c++' option, since the C++ scanner provides its own header in `yyFlexLexer.h'.
%option outfile="FILE"
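Directs flex to write the scanner to the file `FILE' instead of `lex.yy.c'. If you combine `--outfile' with the `--stdout' option, then the scanner is written to `stdout' but its #line directives (see `%option noline') refer to the file `FILE'.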
%option stdout
Instructs flex to write the scanner it generates to standard output instead of `lex.yy.c'.
`--skel=FILE'
Overrides the default skeleton file from which flex constructs its scanners. You'll never need this option unless you are doing flex maintenance or development.
16.2 Options Affecting Scanner Behavior
%option case-insensitive
Instructs flex to generate a case-insensitive scanner. The case of letters given in the flex input patterns will be ignored, and tokens in the input will be matched regardless of case. The matched text given in yytext will have the preserved case (i.e., it will not be folded). For tricky behavior, see case and character ranges.
%option lex-compat
Turns on maximum compatibility with the original AT&T lex implementation. Note that this does not mean full compatibility. Use of this option costs a considerable amount of performance, and it cannot be used with the `--c++', `--full', `--fast', `-Cf', or `-CF' options. For details on the compatibilities it provides, see section 20. Incompatibilities with Lex and Posix. This option also results in the name YY_FLEX_LEX_COMPAT being #define'd in the generated scanner.
%option batch
'
Instructs flex to generate a batch scanner, the opposite of interactive scanners generated by `--interactive' (see below). In general, you use `-B' when you are certain that your scanner will never be used interactively, and you want to squeeze a little more performance out of it. If your goal is instead to squeeze out a lot more performance, you should be using the `-Cf' or `-CF' options, which turn on `--batch' automatically anyway.
%option interactive
Instructs flex to generate an interactive scanner. An interactive scanner is one that only looks ahead to decide what token has been matched if it absolutely must. It turns out that always looking one extra character ahead, even if the scanner has already seen enough text to disambiguate the current token, is a bit faster than only looking ahead when necessary. But scanners that always look ahead give dreadful interactive performance; for example, when a user types a newline, it is not recognized as a newline token until they enter another token, which often means typing in another whole line.

flex scanners default to interactive unless you use the `-Cf' or `-CF' table-compression options (see section 17. Performance Considerations). That's because if you're looking for high performance you should be using one of these options, so if you didn't, flex assumes you'd rather trade off a bit of run-time performance for intuitive interactive behavior. Note also that you cannot use `--interactive' in conjunction with `-Cf' or `-CF'. Thus, this option is not really needed; it is on by default for all those cases in which it is allowed.

You can force a scanner to not be interactive by using `--batch'.
%option 7bit
Instructs flex to generate a 7-bit scanner, i.e., one which can only recognize 7-bit characters in its input. The advantage of using `--7bit' is that the scanner's tables can be up to half the size of those generated using the `--8bit' option. The disadvantage is that such scanners often hang or crash if their input contains an 8-bit character.

Note, however, that unless you generate your scanner using the `-Cf' or `-CF' table compression options, use of `--7bit' will save only a small amount of table space, and make your scanner considerably less portable. flex's default behavior is to generate an 8-bit scanner unless you use the `-Cf' or `-CF' options, in which case flex defaults to generating 7-bit scanners unless your site was always configured to generate 8-bit scanners (as will often be the case with non-USA sites). You can tell whether flex generated a 7-bit or an 8-bit scanner by inspecting the flag summary in the `--verbose' output.

Note that if you use `-Cfe' or `-CFe', flex still defaults to generating an 8-bit scanner, since usually with these compression options full 8-bit tables are not much more expensive than 7-bit tables.
%option 8bit
Instructs flex to generate an 8-bit scanner, i.e., one which can recognize 8-bit characters. This flag is only needed for scanners generated using `-Cf' or `-CF', as otherwise flex defaults to generating an 8-bit scanner anyway.

See the discussion of `--7bit' above for flex's default behavior and the tradeoffs between 7-bit and 8-bit scanners.
%option default
Generates the default rule.
%option always-interactive
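Instructs flex to generate a scanner which always considers its input interactive. Normally, on each new input file the scanner calls isatty() in an attempt to determine whether the scanner's input source is interactive and thus should be read a character at a time. When this option is used, however, no such call is made.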
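%option never-interactive
Instructs flex to generate a scanner which never considers its input interactive. This is the opposite of always-interactive.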
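The isatty() check mentioned under always-interactive can be sketched in plain C. This is only an illustration of the idea, not the code flex generates; the helper name input_is_interactive is hypothetical, and the sketch assumes a POSIX system that provides isatty() and fileno().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 when the stream is attached to a
   terminal (so input should be consumed a character at a time),
   0 otherwise. */
int input_is_interactive(FILE *in)
{
    assert(in != NULL);
    return isatty(fileno(in)) > 0;
}
```

A scanner built with always-interactive or never-interactive skips this kind of test and uses a fixed answer instead.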
%option posix
Turns on maximum compatibility with the POSIX definition of lex. Since flex was originally designed to implement the POSIX definition of lex, this generally involves very few changes in behavior. At the current writing the known differences between flex and the POSIX standard are:

In POSIX and AT&T lex, the repeat operator, `{}', has lower precedence than concatenation (thus `ab{3}' yields `ababab'). Most POSIX utilities use an Extended Regular Expression (ERE) precedence that has the precedence of the repeat operator higher than concatenation (which causes `ab{3}' to yield `abbb'). By default, flex places the precedence of the repeat operator higher than concatenation, which matches the ERE processing of other POSIX utilities. When either `--posix' or `-l' is specified, flex will use the traditional AT&T and POSIX-compliant precedence for the repeat operator, where concatenation has higher precedence than the repeat operator.
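The ERE precedence that flex follows by default can be observed with the POSIX regex library. The sketch below does not involve flex at all; it assumes a system providing <regex.h>, and the helper name ere_matches is hypothetical. It shows that under ERE rules `ab{3}' means `abbb', and that `(ab){3}' is needed to get `ababab'.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches the POSIX extended
   regular expression `pattern`, 0 if it does not, and -1 if the pattern
   does not compile. Callers anchor the pattern with ^ and $ themselves. */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this helper, ere_matches("^ab{3}$", "abbb") reports a match and ere_matches("^ab{3}$", "ababab") does not, which is the default flex reading of the pattern. With `--posix' or `-l', flex reads the same pattern the other way around.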
%option stack
Enables the use of start condition stacks (yy_push_state() and friends).
%option stdinit
If set (i.e., %option stdinit), initializes yyin and yyout to `stdin' and `stdout', instead of the default of `NULL'. Some existing lex programs depend on this behavior, even though it is not compliant with ANSI C, which does not require `stdin' and `stdout' to be compile-time constant. In a reentrant scanner, however, this is not a problem since initialization is performed in yylex_init at runtime.
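The portability concern behind stdinit can be illustrated with a small C sketch. The names scan_in and current_input are hypothetical stand-ins, not part of the flex API; the point is only that `NULL' is always a valid file-scope initializer, so the choice of `stdin' can be deferred to run time.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for yyin. NULL is a constant expression, whereas
   "= stdin" at file scope is not guaranteed to compile, because ANSI C
   does not require stdin to be a compile-time constant. */
static FILE *scan_in = NULL;

/* Resolve the default input stream on first use, at run time. */
FILE *current_input(void)
{
    if (scan_in == NULL)
        scan_in = stdin;
    assert(scan_in != NULL);
    return scan_in;
}
```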
%option yylineno
Directs flex to generate a scanner that maintains the number of the current line read from its input in the global variable yylineno. This option is implied by %option lex-compat. In a reentrant C scanner, the macro yylineno is accessible regardless of the value of %option yylineno; however, its value is not modified by flex unless %option yylineno is enabled.
%option yywrap
If unset (i.e., `--noyywrap'), makes the scanner not call yywrap() upon an end-of-file, but simply assume that there are no more files to scan (until the user points `yyin' at a new file and calls yylex() again).
16.3 Code-Level And API Options
%option ansi-definitions
Instructs flex to generate ANSI C definitions for functions. This option is enabled by default. If %option noansi-definitions is specified, then the obsolete style is generated.
%option ansi-prototypes
Instructs flex to generate ANSI C prototypes for functions. This option is enabled by default. If noansi-prototypes is specified, then prototypes will have empty parameter lists.
%option bison-bridge
Instructs flex to generate a C scanner that is meant to be called by a GNU bison parser. The scanner has minor API changes for bison compatibility. In particular, the declaration of yylex is modified to take an additional parameter, yylval. See section A.2 C Scanners with Bison Parsers.
%option bison-locations
Instructs flex that GNU bison %locations are being used. This means yylex will be passed an additional parameter, yylloc. This option implies %option bison-bridge. See section A.2 C Scanners with Bison Parsers.
%option noline
Instructs flex not to generate #line directives. Without this option, flex peppers the generated scanner with #line directives so error messages in the actions will be correctly located with respect to either the original flex input file (if the errors are due to code in the input file), or `lex.yy.c' (if the errors are flex's fault; you should report these sorts of errors to the email address given in section 2. Reporting Bugs).
%option reentrant
Instructs flex to generate a reentrant C scanner. Because the API differs between reentrant and non-reentrant flex scanners, non-reentrant flex code must be modified before it is suitable for use with this option. This option is not compatible with the `--c++' option.

The option `--reentrant' does not affect the performance of the scanner.
%option c++
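Specifies that you want flex to generate a C++ scanner class. See section 18. Generating C++ Scanners.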
%option array
Specifies that you want yytext to be an array instead of a char *.
%option pointer
Specifies that yytext should be a char *, not an array. This is the default.
%option prefix="PREFIX"
Changes the default `yy' prefix used by flex for all globally-visible variable and function names to instead be `PREFIX'. For example, `--prefix=foo' changes the name of yytext to footext. It also changes the name of the default output file from `lex.yy.c' to `lex.foo.c'. Here is a partial list of the names affected:

    yy_create_buffer
    yy_delete_buffer
    yy_flex_debug
    yy_init_buffer
    yy_flush_buffer
    yy_load_buffer_state
    yy_switch_to_buffer
    yyin
    yyleng
    yylex
    yylineno
    yyout
    yyrestart
    yytext
    yywrap
    yyalloc
    yyrealloc
    yyfree

(If you are using a C++ scanner, then only yywrap and yyFlexLexer are affected.) Within your scanner itself, you can still refer to the global variables and functions using either version of their name; but externally, they have the modified name.

This option lets you easily link together multiple flex programs into the same executable. Note, though, that using this option also renames yywrap(), so you now must either provide your own (appropriately-named) version of the routine for your scanner, or use %option noyywrap, as linking with `-lfl' no longer provides one for you by default.
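As a minimal sketch of the "provide your own" route, assume a scanner built with %option prefix="foo" (the prefix is only an example). The wrap routine is then expected under the name foowrap, and the simplest possible version reports that there is no further input:

```c
/* Replacement for yywrap() in a scanner generated with the example
   prefix "foo". Returning a nonzero value tells the scanner that no
   more input files follow, so it finishes at end-of-file. */
int foowrap(void)
{
    return 1;
}
```

If the scanner never needs to continue onto another file, %option noyywrap achieves the same effect without any extra function.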
%option main
Directs flex to provide a default main() program for the scanner, which simply calls yylex(). This option implies noyywrap (see %option yywrap).
%option nounistd
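Suppresses inclusion of the non-ANSI header file `unistd.h'. This option is meant to target environments in which `unistd.h' does not exist. Be aware that certain options may cause flex to generate code that relies on functions normally found in `unistd.h' (e.g., isatty(), read()). If you wish to use these functions, you will have to inform your compiler where to find them. See option-always-interactive. See option-read.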
%option yyclass="NAME"
Only applies when generating a C++ scanner (the `--c++' option). It informs flex that you have derived foo (the NAME given to the option) as a subclass of yyFlexLexer, so flex will place your actions in the member function foo::yylex() instead of yyFlexLexer::yylex(). It also generates a yyFlexLexer::yylex() member function that emits a run-time error (by invoking yyFlexLexer::LexerError()) if called. See section 18. Generating C++ Scanners.
16.4 Options for Scanner Speed and Size
%option align
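Instructs flex to trade off larger tables in the generated scanner for faster performance, because the elements of the tables are better aligned for memory access and computation.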
%option ecs
Directs flex to construct equivalence classes, i.e., sets of characters which have identical lexical properties (for example, if the only appearance of digits in the flex input is in the character class "[0-9]" then the digits '0', '1', ..., '9' will all be put in the same equivalence class). Equivalence classes usually give dramatic reductions in the final table/object file sizes (typically a factor of 2-5) and are pretty cheap performance-wise (one array look-up per character scanned).
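The idea of an equivalence class can be sketched in C. This is a simplified illustration under stated assumptions, not flex's actual algorithm: the character sets are handed in directly as strings rather than derived from patterns, and the names build_ecs, NCHARS, and MAX_SETS are hypothetical. Two bytes share a class when they belong to exactly the same sets, and class_of is the one array look-up per character mentioned above.

```c
#include <assert.h>

#define NCHARS   256
#define MAX_SETS 32

/* Partition the 256 byte values into equivalence classes.
   sets[i] is a NUL-terminated string listing the members of set i.
   On return, class_of[c] holds the class number of byte c.
   Returns the number of distinct classes. */
int build_ecs(const char *const sets[], int nsets, int class_of[NCHARS])
{
    unsigned long sig[NCHARS] = {0};  /* bit i set => byte is in sets[i] */
    unsigned long seen[NCHARS];       /* distinct signatures found so far */
    int nclasses = 0;
    int i, c, k;

    assert(nsets >= 0 && nsets <= MAX_SETS);

    /* Record, for every byte, which sets it belongs to. */
    for (i = 0; i < nsets; i++) {
        const unsigned char *p = (const unsigned char *)sets[i];
        for (; *p != '\0'; p++)
            sig[*p] |= 1UL << i;
    }

    /* Bytes with identical membership signatures get the same class. */
    for (c = 0; c < NCHARS; c++) {
        for (k = 0; k < nclasses; k++)
            if (seen[k] == sig[c])
                break;
        if (k == nclasses)
            seen[nclasses++] = sig[c];
        class_of[c] = k;
    }
    return nclasses;
}
```

Given the two sets "0123456789" and "abcdef", all digits map to one class, the six letters to another, and every remaining byte to a third, so a transition table needs three columns instead of 256.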
`-Cf'
Specifies that the full scanner tables should be generated: flex should not compress the tables by taking advantage of similar transition functions for different states.
%option meta-ecs
Directs flex to construct meta-equivalence classes, which are sets of equivalence classes (or characters, if equivalence classes are not being used) that are commonly used together. Meta-equivalence classes are often a big win when using compressed tables, but they have a moderate performance impact (one or two if tests and one array look-up per character scanned).
%option read
Causes the generated scanner to bypass use of the standard I/O library (stdio) for input. Instead of calling fread() or getc(), the scanner will use the read() system call, resulting in a performance gain which varies from system to system, but in general is probably negligible unless you are also using `-Cf' or `-CF'. Using `-Cr' can cause strange behavior if, for example, you read from `yyin' using stdio prior to calling the scanner (because the scanner will miss whatever text your previous reads left in the stdio input buffer). `-Cr' has no effect if you define YY_INPUT() (see section 9. The Generated Scanner).
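The "strange behavior" from mixing stdio with read() can be reproduced without flex. The sketch below assumes a POSIX system and a stdio implementation that buffers regular files; the function name bytes_left_for_read is hypothetical. It consumes one byte through stdio and then asks read() for the rest of the same file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Writes `text` to a temporary file, consumes one byte through stdio,
   then calls read() on the underlying descriptor. Returns the number of
   bytes read() delivered, or -1 on error. */
long bytes_left_for_read(const char *text)
{
    char buf[256];
    FILE *f = tmpfile();
    long n;

    assert(text != NULL);
    if (f == NULL)
        return -1;
    if (fputs(text, f) == EOF) {
        fclose(f);
        return -1;
    }
    rewind(f);

    /* stdio typically fills its whole buffer here, not just one byte. */
    if (fgetc(f) == EOF) {
        fclose(f);
        return -1;
    }

    /* read() bypasses the stdio buffer and starts at the descriptor's
       current offset, which stdio has already advanced. */
    n = (long)read(fileno(f), buf, sizeof buf);
    fclose(f);
    return n;
}
```

On a buffered implementation the call returns fewer bytes than are still unread from the stdio point of view (often none at all for a short file), which is exactly the text a `-Cr' scanner would miss.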
The options `-Cf' or `-CF' and `-Cm' do not make sense together - there is no opportunity for meta-equivalence classes if the table is not being compressed. Otherwise the options may be freely mixed, and are cumulative.
The default setting is `-Cem', which specifies that flex should generate equivalence classes and meta-equivalence classes. This setting provides the highest degree of table compression. You can trade larger tables for faster-executing scanners, with the following generally being true:
    slowest & smallest
          -Cem
          -Cm
          -Ce
          -C
          -C{f,F}e
          -C{f,F}
          -C{f,F}a
    fastest & largest
Note that scanners with the smallest tables are usually generated and compiled the quickest, so during development you will usually want to use the default, maximal compression.
`-Cfe' is often a good compromise between speed and size for production scanners.
%option full
Specifies a fast scanner. No table compression is done and stdio is bypassed. The result is large but fast. This option is equivalent to `-Cfr'.
%option fast
Specifies that the fast scanner table representation should be used (and stdio bypassed). This representation is about as fast as the full table representation `--full', and for some sets of patterns will be considerably smaller (and for others, larger). In general, if the pattern set contains both keywords and a catch-all, identifier rule, such as in the set:
    "case"    return TOK_CASE;
    "switch"  return TOK_SWITCH;
    ...
    "default" return TOK_DEFAULT;
    [a-z]+    return TOK_ID;
then you're better off using the full table representation. If only the identifier rule is present and you then use a hash table or some such to detect the keywords, you're better off using `--fast'.
This option is equivalent to `-CFr' (see below). It cannot be used with `--c++'.
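The alternative mentioned above, a single identifier rule followed by a separate keyword look-up, might look like the following sketch. It uses a sorted table with bsearch() rather than a hash table (the "or some such" case). The token names come from the example rules, but their numeric values, the table layout, and the function name classify_identifier are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Token codes; the values are arbitrary for this sketch. */
enum { TOK_ID, TOK_CASE, TOK_DEFAULT, TOK_SWITCH };

struct keyword {
    const char *name;
    int token;
};

/* Must stay sorted by name, because bsearch() relies on the order. */
static const struct keyword keywords[] = {
    { "case",    TOK_CASE    },
    { "default", TOK_DEFAULT },
    { "switch",  TOK_SWITCH  },
};

static int cmp_keyword(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct keyword *)elem)->name);
}

/* Called with the text matched by the identifier rule; returns the
   keyword's token if the text is a keyword, TOK_ID otherwise. */
int classify_identifier(const char *text)
{
    const struct keyword *kw;

    assert(text != NULL);
    kw = bsearch(text, keywords, sizeof keywords / sizeof keywords[0],
                 sizeof keywords[0], cmp_keyword);
    return kw ? kw->token : TOK_ID;
}
```

In a scanner, the action of the `[a-z]+' rule would then be something like `return classify_identifier(yytext);', leaving the pattern set with only the identifier rule.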
16.5 Debugging Options
%option backup
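Generates backing-up information to `lex.backup'. This is a list of scanner states which require backing up and the input characters on which they do so. By adding rules one can remove backing-up states. If all backing-up states are eliminated and `-Cf' or `-CF' is used, the generated scanner will run faster (see the `--perf-report' flag). Only users who wish to squeeze every last cycle out of their scanners need worry about this option (see section 17. Performance Considerations).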
%option debug
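Makes the generated scanner run in debug mode. Whenever a pattern is recognized and the global variable yy_flex_debug is non-zero (which is the default), the scanner will write to `stderr' a line of the form: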
    --accepting rule at line 53 ("the matched text")
The line number refers to the location of the rule in the file defining the scanner (i.e., the file that was fed to flex). Messages are also generated when the scanner backs up, accepts the default rule, reaches the end of its input buffer (or encounters a NUL; at this point, the two look the same as far as the scanner's concerned), or reaches an end-of-file.
%option perf-report
Generates a performance report to `stderr'. The report consists of comments regarding features of the flex input file which will cause a serious loss of performance in the resulting scanner. If you give the flag twice, you will also get comments regarding features that lead to minor performance losses.

Note that the use of REJECT and variable trailing context (see section 24. Limitations) entails a substantial performance penalty; use of yymore(), the `^' operator, and the `--interactive' flag entail minor performance penalties.
%option nodefault
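Causes the default rule (that unmatched scanner input is echoed to `stdout') to be suppressed. If the scanner encounters input that does not match any of its rules, it aborts with an error. This option is useful for finding holes in a scanner's rule set.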
%option trace
Makes flex run in trace mode. It will generate a lot of messages to `stderr' concerning the form of the input and the resultant non-deterministic and deterministic finite automata. This option is mostly for use in maintaining flex.
%option nowarn
Suppresses warning messages.
%option verbose
Specifies that flex should write to `stderr' a summary of statistics regarding the scanner it generates. Most of the statistics are meaningless to the casual flex user, but the first line identifies the version of flex (same as reported by `--version'), and the next line the flags used when generating the scanner, including those that are on by default.
%option warn
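Warns about certain things. In particular, if the default rule can be matched but no default rule has been given, flex will warn you.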
16.6 Miscellaneous Options
`--help'
Generates a "help" summary of flex's options to `stdout' and then exits.