Title: CS 2130
1 CS 2130
- Presentation 18
- Tools
- Lex
2 Tools
- An important skill for a computer scientist is knowing when to code the solution to a problem and when to use a tool
- It wasn't always this way
- A key to using tools is getting past the learning curve
- Some of the most useful tools are those which produce usable code
3 lex
- lex is a lexical analyzer generator
- It doesn't do lexical analysis. It writes code that will perform lexical analysis for you
- A program which can perform lexical analysis is useful.
- A program that can generate code for a custom lexical analyzer that can be embedded into an application you are creating is a gem
4 Why learn lex?
- For some it will be a useful program that they will use over and over
- Others may use it but infrequently
- It appears in want ads!!!
- http://www.appforge.com/corp/careers/sfw_engineer.html
- The concepts involved with learning lex get at the core material for this course. Even if you never use lex again, knowledge of its operation will help you to better understand the translation process
5 More Info?
lex & yacc, 2nd Edition
By John Levine, Tony Mason & Doug Brown
2nd Edition, October 1992
1-56592-000-7, Order Number: 0007
386 pages, $29.95
http://www.oreilly.com/catalog/lex/
Note: Some code taken from O'Reilly website
6 lex process
- Create a specification in a file called scan.l
- lex processes scan.l and produces lex.yy.c
- The C compiler can turn lex.yy.c into a.out
    lex scan.l
    cc lex.yy.c -ll        (or gcc)
- (-ll means link with the lex library; use -lfl if using flex)
- lex.yy.c contains a function yylex( ) which does the actual lexical analysis
7 lex file format
    <Definitions>
    ...
    ...
    %%
    <Rules>
    ...
    ...
    %%
    <Supplementary code>
    ...
    ...
Definitions: includes, defines, RegExps
Rules: Pattern/Action Pairs
    <pattern1> { <action1> }
    <pattern2> { <action2> }
Supplementary code: Additional code (Not always needed)
8 The Simplest Lex Program
Put this code in a file called scan.l
Run lex: lex scan.l
Compile: gcc lex.yy.c -ll
Run by typing a.out or a.out < somefile.txt
The first form (a.out alone) will read from stdin. To terminate, type ctrl/d
10 The Simplest Lex Program
We got this by default!
11 lex example
    wspc    [ \t\n]+
    %%
    {wspc}  { output( ' ' ); }
A definition involving a regular expression: wspc is one or more spaces, tabs, or newlines.
Reduce all whitespace to a single space.
Note: This works on acme. On linux boxes, e.g. helsinki, substitute putc(' ', yyout) for output( ' ' ).
Note the curlies around wspc in the rule, meaning "substitute the definition of wspc here".
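For comparison, here is a minimal hand-written C sketch of what the {wspc} rule does. It is not the code lex generates; collapse_whitespace is a hypothetical helper name, and it works on a string rather than on yyin so that it is easy to try out.

```c
#include <stddef.h>

/* Hypothetical helper (not part of lex): copy src to dst, replacing each
   run of spaces, tabs, and newlines with a single space.
   dst must have room for strlen(src) + 1 bytes. */
void collapse_whitespace(const char *src, char *dst)
{
    size_t out = 0;
    int in_run = 0;                 /* nonzero while inside a whitespace run */

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        if (c == ' ' || c == '\t' || c == '\n') {
            if (!in_run)
                dst[out++] = ' ';   /* one space per run, like the {wspc} action */
            in_run = 1;
        } else {
            dst[out++] = c;         /* unmatched text is copied through unchanged */
            in_run = 0;
        }
    }
    dst[out] = '\0';
}
```

The three-line lex specification replaces the loop, the state flag, and the character tests.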
12 lex example
    %{
    #include <stdio.h>
    #include <ctype.h>
    %}
    word    [-'A-Za-z]+
    %%
    {word}  { printf( "%c%s", toupper( yytext[0] ), yytext + 1 ); }
How to include C code: place it between %{ and %} in the definitions section.
13 lex example
(Same specification as slide 12.)
yytext is an internal variable containing the text of the word matched.
Capitalize the first letter of each word, leaving the remainder of the text unchanged: "the 777 hits" becomes "The 777 Hits".
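The same transformation can be sketched in plain C to show what the {word} rule saves you from writing. This is an illustration only: capitalize_words and is_word_char are hypothetical names, and the function edits a string in place instead of reading yyin.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if c is in the word class [-'A-Za-z]
   (assuming the default "C" locale). */
static int is_word_char(char c)
{
    return c == '-' || c == '\'' || isalpha((unsigned char)c);
}

/* Capitalize the first character of every word in s, in place.
   Characters outside the word class (digits, spaces) are left alone,
   just as unmatched text is echoed unchanged by lex. */
void capitalize_words(char *s)
{
    int in_word = 0;                /* nonzero while inside a word */

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (is_word_char(s[i])) {
            if (!in_word)
                s[i] = (char)toupper((unsigned char)s[i]);
            in_word = 1;
        } else {
            in_word = 0;
        }
    }
}
```

Calling capitalize_words on a buffer holding "the 777 hits" leaves "The 777 Hits" in the buffer, matching the slide.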
14 lex
- Text not matched is echoed as read
- Thus, there is an implied ECHO
- Which can be suppressed. How?
- Lex patterns only match a given input character or string once
- Lex executes the action for the longest possible match for the current input.
15 To be more specific...
- If you don't specify a main you get one for free!!!
- If you call yylex it will start scanning the appropriate input and, as it recognizes rules, do the specified action
- Example
    AAA   printf("<Found 3 A's>");
    AA    printf("<Found 2 A's>");
- Given AAAAAAAA
- Will print
    <Found 3 A's><Found 3 A's><Found 2 A's>
- The scanning continues unless a value is returned!
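The longest-match behaviour in the AAA/AA example can be imitated with a small C sketch. This is a simplified model with fixed strings, not how lex works internally (lex builds a state machine from regular expressions); scan_longest is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of lex matching with two fixed "rules", AAA and AA.
   At each position the longest matching rule wins; on a tie the earlier
   rule wins. Characters that match no rule are copied through, like the
   implied ECHO. out must be large enough for the result. */
void scan_longest(const char *input, char *out)
{
    static const char *patterns[] = { "AAA", "AA" };
    static const char *messages[] = { "<Found 3 A's>", "<Found 2 A's>" };
    const size_t npat = sizeof patterns / sizeof patterns[0];

    out[0] = '\0';
    while (*input != '\0') {
        size_t best = npat;         /* npat means "no rule matched" */
        size_t best_len = 0;

        for (size_t p = 0; p < npat; p++) {
            size_t len = strlen(patterns[p]);
            if (len > best_len && strncmp(input, patterns[p], len) == 0) {
                best = p;
                best_len = len;
            }
        }
        if (best < npat) {
            strcat(out, messages[best]);   /* run the rule's "action" */
            input += best_len;             /* matched text is consumed once */
        } else {
            strncat(out, input, 1);        /* default: echo one character */
            input++;
        }
    }
}
```

With the input AAAAAAAA the first two matches take three characters each, leaving two for the AA rule, which gives the output shown on the slide.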
16 lex example
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    %%
    [^\n]*\n   { printf( "%5d ", ++lineno ); ECHO; }
Print out file with line numbers
17 another way
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    [^\n]*\n
    %%
    {line}  { printf( "%5d %s", ++lineno, yytext ); }
Print out file with line numbers
18 Or
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    .*\n
    %%
    {line}  { printf( "%5d %s", ++lineno, yytext ); }
Print out file with line numbers
19 Or even
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    .*\n
    %%
    {line}  { printf( "/* %5d */ %s", ++lineno, yytext ); }
Print out file with line numbers commented for C
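As a point of comparison, the line-numbering action can be sketched in plain C. number_lines is a hypothetical helper that writes into a caller-supplied buffer. Unlike the lex versions, which match only lines ending in \n, this sketch also numbers a final line that has no trailing newline.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy text to out, prefixing each line with its
   number in a 5-character field, like printf("%5d %s", ++lineno, yytext).
   out_size must be at least 1; output stops early if out is too small. */
void number_lines(const char *text, char *out, size_t out_size)
{
    int lineno = 0;
    size_t used = 0;

    out[0] = '\0';
    while (*text != '\0') {
        const char *nl = strchr(text, '\n');
        /* Length of the current line, including its newline if present. */
        size_t len = nl ? (size_t)(nl - text) + 1 : strlen(text);
        int n = snprintf(out + used, out_size - used, "%5d %.*s",
                         ++lineno, (int)len, text);

        if (n < 0 || (size_t)n >= out_size - used)
            break;                  /* buffer full: stop */
        used += (size_t)n;
        text += len;
    }
}
```

Changing the format string to "/* %5d */ %.*s" gives the commented variant.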
20 another example
    %{
    #include "defs.h"
    static char *BigLine = NULL;
    static int BigLineLen = -1;
    %}
    line    [^\n]*\n
    %%
    {line}  { if( yyleng > BigLineLen )
              {
                free( BigLine );
                BigLineLen =
                  ( BigLine = strdup( yytext ) )
                    == NULL ? -1 : yyleng;
              }
            }
    %%
    int yywrap( void )
    {
      PRINTF(
        ("%s",( BigLine == NULL ) ? "" : BigLine ));
      return 1;
    }
yywrap gets called at the end of input
21 count chars, words, lines
    %{
    #include <stdio.h>
    static int words = 0, lines = 0, chars = 0;
    %}
    word    [-'A-Za-z]+
    %%
    {word}  { words++; chars += yyleng; }
    \n      { lines++; chars++; }
    .       { chars++; }
    %%
    int yywrap( void )
    {
      printf( "%8d%8d%8d\n", lines, words, chars );
      return 1;
    }
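A hand-written C sketch of the same counting logic is given here for comparison. It assumes the word class [-'A-Za-z] from the slide; struct counts and count_text are hypothetical names, and the input is a string rather than yyin.

```c
#include <ctype.h>

/* Hypothetical result type for the three counters. */
struct counts {
    unsigned lines;
    unsigned words;
    unsigned chars;
};

/* Count lines, words, and characters in s. A word is a maximal run of
   characters from [-'A-Za-z], as in the {word} rule. */
struct counts count_text(const char *s)
{
    struct counts c = { 0, 0, 0 };
    int in_word = 0;                /* nonzero while inside a word */

    for (; *s != '\0'; s++) {
        c.chars++;                  /* every rule adds to chars */
        if (*s == '\n')
            c.lines++;
        if (*s == '-' || *s == '\'' || isalpha((unsigned char)*s)) {
            if (!in_word)
                c.words++;          /* count a word once, at its first character */
            in_word = 1;
        } else {
            in_word = 0;
        }
    }
    return c;
}
```

For the text "it's a test\nof lex\n" this yields 2 lines, 5 words, and 19 characters.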
22 Distribution of word lengths
23
    %{
    #include "defs.h"
    #define MAX_WORD_LEN 100
    static unsigned int WrdLengArr[ MAX_WORD_LEN ];
    static unsigned int WrdLengSum, NumWords;
    %}
    word    [^ \t\n]+
    %%
    {word}  { if( yyleng < MAX_WORD_LEN )
              {
                WrdLengArr[ yyleng ]++;
                WrdLengSum += yyleng;
                NumWords++;
              }
            }
    .|\n    { /* do nothing */ }
24
    %%
    int yywrap( void )
    {
      int i;
      PRINTF(( "Length\tFrqncy\n" ));
      for( i = 0; i < MAX_WORD_LEN; i++ )
        if( WrdLengArr[ i ] != 0 )
        {
          PRINTF(( "%4d\t%4u\n", i, WrdLengArr[ i ] ));
        }
      PRINTF(( " Avg\t%0.2f\n", (float)WrdLengSum / NumWords ));
      return 1;
    }
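The histogram logic can also be sketched in plain C, without defs.h or the PRINTF macro (whose definitions are not shown in these slides). word_length_histogram is a hypothetical name; words are maximal runs of characters other than space, tab, and newline, as in the {word} definition.

```c
#include <stddef.h>

#define MAX_WORD_LEN 100

/* Hypothetical helper: fill hist[n] with the number of words of length n
   found in s, and return the number of words counted. Words of
   MAX_WORD_LEN characters or more are skipped, as in the lex rule. */
unsigned word_length_histogram(const char *s, unsigned hist[MAX_WORD_LEN])
{
    unsigned num_words = 0;
    size_t len = 0;                 /* length of the word being scanned */

    for (size_t i = 0; i < MAX_WORD_LEN; i++)
        hist[i] = 0;

    for (;; s++) {
        int is_space = (*s == ' ' || *s == '\t' || *s == '\n');

        if (*s != '\0' && !is_space) {
            len++;                  /* still inside a word */
            continue;
        }
        /* Reached whitespace or end of string: close the current word. */
        if (len > 0 && len < MAX_WORD_LEN) {
            hist[len]++;
            num_words++;
        }
        len = 0;
        if (*s == '\0')
            break;
    }
    return num_words;
}
```

For "to be or not" the function returns 4, with three words of length 2 and one of length 3.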
25 Word Replace
26
    %{
    #include "defs.h"
    #define ARG( n ) ( argc <= (n) ? "" : argv[ (n) ] )
    static char *SearchWord;
    static char *InsertWord;
    %}
    word    [-a-zA-Z]+
    num     [0-9]+
    punct   [!.,()]
27
    %%
    {punct} |
    {num}   |
    {word}  {
              if( strcmp( yytext, SearchWord ) == 0 )
              {
                PRINTF(( "%s", InsertWord ));
              }
              else
              {
                PRINTF(( "%s", yytext ));
              }
            }
28
    %%
    int main( int argc, char *argv[] )
    {
      const char *OutFile = "output.txt";
      char *InFile = ARG( 1 );
      SearchWord = ARG( 2 );
      InsertWord = ARG( 3 );
      if( ( yyin = freopen( InFile, "r", stdin ) ) == NULL ||
          ( yyout = freopen( OutFile, "w", stdout ) ) == NULL )
      {
        ERR_MSG( freopen );
        return EXIT_FAILURE;
      }
      return ( yylex( ) == 0 ) ? EXIT_SUCCESS : EXIT_FAILURE;
    }
29 Questions?