JLex - PowerPoint PPT Presentation

About This Presentation
Title:

JLex

Description:

JLex, Lecture 4, Mon, Jan 24, 2005. JLex is a lexical analyzer generator in Java. It is based on the well-known lex, which is a lexical analyzer generator in C.

Slides: 20
Provided by: RobbKo5
Learn more at: https://people.hsc.edu

Transcript and Presenter's Notes

Title: JLex


1
JLex
  • Lecture 4
  • Mon, Jan 24, 2005

2
JLex
  • JLex is a lexical analyzer generator in Java.
  • It is based on the well-known lex, which is a
    lexical analyzer generator in C.
  • The GNU lexical analyzer generator flex is also based on
    lex.
  • JLex reads a description of a set of tokens and
    outputs a Java program that will process those
    tokens.

3
The JLex Input File
  • The input file to JLex uses the extension .lex.
  • The file is divided into three parts.
  • User code
  • JLex directives
  • Regular expression rules
  • These three sections are separated by %%.
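
As a rough illustration of the three-part layout, here is a minimal sketch of a .lex file. The token codes 1 and 2, the field name tokenCount, and the choice of rules are assumptions made for this example; the directives it uses are the ones described on the following slides.

```lex
import java.io.IOException;   // user code: copied ahead of the generated class

%%

%integer
%{
  // Copied into the Yylex class body (hypothetical counter field).
  private int tokenCount = 0;
%}

%%

[0-9]+  { tokenCount++; return 1; }
[a-z]+  { tokenCount++; return 2; }
\n      { /* ignore newlines */ }
.       { /* ignore any other character */ }
```

The first %% ends the user code and the second %% ends the directives, so everything after it is read as regular expression rules.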

4
JLex User Code
  • See Section 2.1 of the JLex Users Manual.
  • Any code written in the user-code section is
    copied directly into the Java source file created
    by JLex.
  • JLex creates a class named Yylex, which is at the
    heart of the lexer. The user code is not
    incorporated into this class.

5
JLex Directives
  • See Section 2.2 of the JLex Users Manual.
  • Any code bracketed within %{ and %} is copied
    directly into the Yylex class, at the beginning.
  • Although this code is incorporated into the Yylex
    class, it is not incorporated into any Yylex
    member function.
  • Thus, we may define Yylex class variables or
    additional member functions.

6
The init Directive
  • Code bracketed within %init{ and %init} is copied
    into the Yylex default constructor, which is
    called by the other constructors.

%init{
  System.out.println("In the constructor");
%init}

7
The eof Directive
  • Code bracketed within %eof{ and %eof} is copied
    into the Yylex function yy_do_eof(), which is
    called once upon end of file.

%eof{
  System.out.println("In yy_do_eof()");
%eof}

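
To make the placement of user, class, init, and eof code concrete, here is a simplified, hand-written stand-in for the generated class. It is not real JLex output: the class name YylexShape, the log field, and the eofDone guard are assumptions chosen only to show where each piece of spec code is said to land.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated Yylex class (not actual JLex output).
class YylexShape {
    // Code from %{ ... %} lands here, as fields and extra methods of the class.
    final List<String> log = new ArrayList<>();

    // Assumed guard so that the end-of-file code runs only once.
    private boolean eofDone = false;

    // Default constructor: code from %init{ ... %init} lands here.
    YylexShape() {
        log.add("In the constructor");
    }

    // Other constructors delegate to the default one, so the init code always runs.
    YylexShape(Reader in) {
        this();
    }

    // Code from %eof{ ... %eof} lands here.
    void yy_do_eof() {
        if (!eofDone) {
            log.add("In yy_do_eof()");
            eofDone = true;
        }
    }

    public static void main(String[] args) {
        YylexShape lexer = new YylexShape(new StringReader(""));
        lexer.yy_do_eof();
        lexer.yy_do_eof();   // the second call adds nothing to the log
        System.out.println(lexer.log);
    }
}
```

The log ends up with one entry from the constructor and one from yy_do_eof(), however many times the end-of-file method is called.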
8
JLex Token Types
  • Unless we specify otherwise, the data type of the
    returned tokens is Yytoken.
  • This class is not created automatically.
  • We may change the return type to int by typing
    the directive %integer.
  • We may change the return type to Integer by
    typing the directive %intwrap.
  • We may set the return type to any other type by
    using the directive %type.
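
Because the Yytoken class is not created automatically, it has to be supplied by hand when the default return type is used. The following is a minimal sketch of such a class; the fields kind and text are assumptions for illustration, since JLex leaves the contents of the class up to the user.

```java
// Hypothetical token class. JLex does not generate Yytoken, so the
// fields below are an assumption made for illustration.
class Yytoken {
    final int kind;      // token category; the numbering is up to the spec author
    final String text;   // the characters that were matched

    Yytoken(int kind, String text) {
        this.kind = kind;
        this.text = text;
    }

    @Override
    public String toString() {
        return "Yytoken(" + kind + ", " + text + ")";
    }

    public static void main(String[] args) {
        System.out.println(new Yytoken(1, "42"));
    }
}
```

An action in the rules section could then return, for instance, new Yytoken(1, yytext()), where yytext() is the lexer method that yields the matched text.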

9
JLex Token Types
  • If the return type is Yytoken or Integer, then
    the EOF token is null.
  • If the return type is int, then the EOF token is
    -1.
  • For any other type, we need to specify the EOF
    value.

10
JLex EOF Value
  • By using the %eofval{ and %eofval} directive, we may
    indicate what value to return upon EOF.
  • We write

%eofval{
  return new type(value);
%eofval}

11
JLex Regular Expression Rules
  • Each regular expression rule consists of a
    regular expression followed by an associated
    action.
  • The associated action is a segment of Java code,
    enclosed in braces { }.
  • Typically, the action will be to return the
    appropriate token.

12
JLex Regular Expressions
  • Regular expressions are expressed using ASCII
    characters (0-127).
  • The following characters are metacharacters.
  • ? * + | ( ) ^ $ . [ ] { } " \
  • Metacharacters have special meaning; they do not
    represent themselves.
  • All other characters represent themselves.

13
JLex Regular Expressions
  • Let r and s be regular expressions.
  • r? matches zero or one occurrences of r.
  • r* matches zero or more occurrences of r.
  • r+ matches one or more occurrences of r.
  • r|s matches r or s.
  • rs matches r concatenated with s.
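
These five operators can be tried out without running JLex. The sketch below uses java.util.regex, which is a different engine from the one JLex generates, but for plain letters the optional, repetition, alternation, and concatenation operators are written the same way. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class RegexOperators {
    // True when the whole input string matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("ab?", "a"));    // b is optional
        System.out.println(matches("ab*", "abbb")); // zero or more b
        System.out.println(matches("ab+", "a"));    // needs at least one b
        System.out.println(matches("a|b", "b"));    // either a or b
        System.out.println(matches("ab", "ab"));    // a followed by b
    }
}
```

Note that ?, * and + bind to the single preceding item, so ab? means a followed by an optional b, not an optional ab.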

14
JLex Regular Expressions
  • Parentheses are used for grouping.
  • ("+"|"-")?
  • If a regular expression begins with ^, then it is
    matched only at the beginning of a line.
  • If a regular expression ends with $, then it is
    matched only at the end of a line.
  • The dot . matches any non-newline character.

15
JLex Regular Expressions
  • Brackets [ ] match any single character listed
    within the brackets.
  • [abc] matches a or b or c.
  • [A-Za-z] matches any letter.
  • If the first character after [ is ^, then the
    brackets match any character except those listed.
  • [^A-Za-z] matches any nonletter.
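
The grouping, anchoring, dot, and bracket rules from the last two slides can be checked with a short sketch. It again relies on java.util.regex as a stand-in, which is an assumption: Java writes the literal plus sign as \+ where JLex would write "+", and Java needs the MULTILINE flag for ^ and $ to work line by line. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class BracketsAndAnchors {
    // True when the whole input string matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if some line of the input consists of exactly the given word.
    static boolean hasLine(String word, String input) {
        return Pattern.compile("^" + Pattern.quote(word) + "$", Pattern.MULTILINE)
                      .matcher(input)
                      .find();
    }

    public static void main(String[] args) {
        System.out.println(matches("(\\+|-)?[0-9]+", "-42")); // optional sign, then digits
        System.out.println(matches("[abc]", "b"));            // one listed character
        System.out.println(matches("[^A-Za-z]", "5"));        // anything but a letter
        System.out.println(matches(".", "\n"));               // dot rejects newline
        System.out.println(hasLine("end", "start\nend\nfinish"));
    }
}
```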

16
JLex Regular Expressions
  • A single character within double quotes (") or
    after \ represents itself.
  • Metacharacters lose their special meaning and
    represent themselves when they stand alone within
    double quotes or follow \.
  • "?" and \? match ?.
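
The same idea can be sketched in Java, with the caveat that java.util.regex has no double-quote syntax: the closest counterpart is Pattern.quote, used here as an assumed analogue of JLex quoting, while the backslash form carries over directly. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class QuotedMetachars {
    // True when the regular expression matches the one-character string "?".
    static boolean matchesQuestionMark(String regex) {
        return Pattern.matches(regex, "?");
    }

    public static void main(String[] args) {
        // Backslash form: the regex \? written as a Java string literal.
        System.out.println(matchesQuestionMark("\\?"));
        // Quoted form: Pattern.quote makes every character literal.
        System.out.println(matchesQuestionMark(Pattern.quote("?")));
    }
}
```

Both forms treat ? as an ordinary character, so each call reports a match.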

17
JLex Escape Sequences
  • Some escape sequences.
  • \n matches newline.
  • \b matches backspace.
  • \r matches carriage return.
  • \t matches tab.
  • \f matches formfeed.
  • If c is not a special escape-sequence character,
    then \c matches c.

18
Running JLex
  • The lexical analyzer generator is the Main class
    in the JLex folder.
  • To create a lexical analyzer from the file
    filename.lex, type
  • java JLex.Main filename.lex
  • This produces a file filename.lex.java, which
    must be compiled to create the lexical analyzer.

19
Running the Lexical Analyzer
  • To run the lexical analyzer, a Yylex object must
    first be created.
  • The Yylex constructor has one parameter
    specifying an input stream.
  • For example,
  • Yylex lexer = new Yylex(System.in);
  • Then, calls to the yylex() member function will
    return tokens.
  • token = lexer.yylex();
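
A typical driver calls yylex() in a loop until the EOF value comes back. The sketch below shows that loop against a hand-written stand-in, StubYylex, so that it can run without JLex. The stand-in is an assumption: it takes a Reader, splits the input on whitespace, and returns null at end of file, as a lexer with a Yytoken or Integer return type would.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LexerDriver {
    // Calls yylex() repeatedly until the EOF value (null here) is returned.
    static List<String> readAll(Reader source) {
        StubYylex lexer = new StubYylex(source);
        List<String> tokens = new ArrayList<>();
        try {
            String token;
            while ((token = lexer.yylex()) != null) {
                tokens.add(token);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(readAll(new StringReader("x = 42")));
    }
}

// Hypothetical stand-in for a generated lexer; it mimics only the constructor
// and the yylex() call used above.
class StubYylex {
    private final Reader in;

    StubYylex(Reader in) {
        this.in = in;
    }

    // Returns the next whitespace-separated word, or null at end of input.
    String yylex() throws IOException {
        StringBuilder word = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (word.length() > 0) {
                    return word.toString();
                }
            } else {
                word.append((char) c);
            }
        }
        return word.length() > 0 ? word.toString() : null;
    }
}
```

With a lexer that returns int instead, the loop condition would compare against -1 rather than null.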