Provided by: jiang79
Title: JavaCUP


1
JavaCUP
  • JavaCUP (Constructor of Useful Parsers) is a parser
    generator
  • It produces a parser written in Java, and is itself
    written in Java
  • There are many parser generators
  • YACC (Yet Another Compiler-Compiler) for the C
    programming language (Dragon Book, chapter 4.9)
  • There are also many parser generators written in
    Java
  • JavaCC
  • ANTLR

2
More on the classification of Java parser generators
  • Bottom-up parser generator tools
  • JavaCUP
  • jay, YACC for Java: www.inf.uos.de/bernd/jay
  • SableCC, the Sable Compiler Compiler:
    www.sablecc.org
  • Top-down parser generator tools
  • ANTLR, ANother Tool for Language Recognition:
    www.antlr.org
  • JavaCC, Java Compiler Compiler:
    www.webgain.com/java_cc

3
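What is a parser generator?
[Figure: a context-free grammar is the input to the parser generator (JavaCup), which produces the Parser. The Scanner supplies tokens to the Parser, which builds a tree with nodes such as assignment, Expr, and id.]
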
4
Steps to use JavaCup
  • Write a javaCup specification (cup file)
  • Define the grammar and actions in a file (say,
    calc.cup)
  • Run javaCup to generate a parser
  • java java_cup.Main calc.cup
  • Notice the package prefix java_cup before Main
  • This generates parser.java and sym.java (default
    class names, which can be changed)
  • Write your program that uses the parser
  • For example, UseParser.java
  • Compile and run your program

5
Example 1: parse an expression and evaluate it
  • Grammar for arithmetic expressions
  • expr → expr + expr | expr - expr | expr * expr
    | expr / expr | (expr) | number
  • Example
  • (2+4)*3
  • Our tasks
  • Tell whether an expression like (2+4)*3 is
    syntactically correct
  • Evaluate the expression (we are actually
    producing an interpreter for the expression
    language)

6
The overall picture
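public interface Scanner { public Symbol
next_token() throws java.lang.Exception; }
[Figure: the package java_cup.runtime provides Symbol, Scanner, and lr_parser. CalcScanner (generated by JLex from calc.lex) implements Scanner; CalcParser (generated by javaCup from calc.cup) extends lr_parser. The expression (2+4)*3 goes into CalcScanner, which passes tokens to CalcParser; CalcParserUser obtains the result.]
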
7
Calculator javaCup specification (calc.cup)
  • terminal PLUS, MINUS, TIMES, DIVIDE, LPAREN,
    RPAREN;
  • terminal Integer NUMBER;
  • non terminal Integer expr;
  • precedence left PLUS, MINUS;
  • precedence left TIMES, DIVIDE;
  • expr ::= expr PLUS expr
  • | expr MINUS expr
  • | expr TIMES expr
  • | expr DIVIDE expr
  • | LPAREN expr RPAREN
  • | NUMBER;
  • Is the grammar ambiguous?
  • Add precedence and associativity
  • left means that a + b + c is parsed as (a + b)
    + c
  • Lowest precedence comes first, so a + b * c is
    parsed as a + (b * c)
  • How can we get PLUS, NUMBER, ...?
  • They are the terminals returned by the scanner.
  • How do we connect with the scanner?

8
Ambiguous grammar error
  • If we enter the grammar as below
  • Expression ::= Expression PLUS Expression;
  • Without precedence, JavaCUP will tell us
  • Shift/Reduce conflict found in state 4
  • between Expression ::= Expression PLUS Expression
    (*)
  • and Expression ::= Expression (*) PLUS Expression
  • under symbol PLUS
  • Resolved in favor of shifting.
  • The grammar is ambiguous!
  • Telling JavaCUP that PLUS is left associative
    helps.
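
The conflict above is a choice between two groupings. Reducing first gives the left-associative tree, while shifting first (the default resolution) gives the right-associative one. For PLUS both trees happen to evaluate to the same number, but for MINUS, or when PLUS and TIMES are mixed, the grouping changes the result. The sketch below is plain Java, not CUP output, and the class and method names are made up for illustration.

```java
// Illustrative only: evaluates the alternative groupings that an
// ambiguous expression grammar allows.
class AssociativityDemo {
    // Left-associative grouping: (a - b) - c
    static int leftAssoc(int a, int b, int c) { return (a - b) - c; }

    // Right-associative grouping: a - (b - c)
    static int rightAssoc(int a, int b, int c) { return a - (b - c); }

    // TIMES binds tighter than PLUS: a + (b * c)
    static int timesFirst(int a, int b, int c) { return a + (b * c); }

    // The grouping we would get if PLUS were reduced first: (a + b) * c
    static int plusFirst(int a, int b, int c) { return (a + b) * c; }

    public static void main(String[] args) {
        System.out.println(leftAssoc(10, 4, 3));   // 3
        System.out.println(rightAssoc(10, 4, 3));  // 9
        System.out.println(timesFirst(2, 3, 4));   // 14
        System.out.println(plusFirst(2, 3, 4));    // 20
    }
}
```

A precedence left declaration asks the generator to pick the first grouping in each pair.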

9
Corresponding scanner specification (calc.lex)
  • import java_cup.runtime.Symbol;
  • import java_cup.runtime.Scanner;
  • %%
  • %implements java_cup.runtime.Scanner
  • %type Symbol
  • %function next_token
  • %class CalcScanner
  • %eofval{
  • return null;
  • %eofval}
  • NUMBER = [0-9]+
  • %%
  • "+" { return new Symbol(CalcSymbol.PLUS); }
  • "-" { return new Symbol(CalcSymbol.MINUS); }
  • "*" { return new Symbol(CalcSymbol.TIMES); }
  • "/" { return new Symbol(CalcSymbol.DIVIDE); }
  • {NUMBER} { return new Symbol(CalcSymbol.NUMBER,
    new Integer(yytext())); }
  • \r|\n|. { }
  • Connection with the parser

10
Run JLex
  • D:\214>java JLex.Main calc.lex
  • Note the package prefix JLex
  • Program text generated: calc.lex.java
  • D:\214>javac calc.lex.java
  • Classes generated: CalcScanner.class

11
Generated CalcScanner class
  • import java_cup.runtime.Symbol;
  • import java_cup.runtime.Scanner;
  • class CalcScanner implements
    java_cup.runtime.Scanner {
  • ... ....
  • public Symbol next_token() {
  • ... ...
  • case 3: return new Symbol(CalcSymbol.MINUS);
  • case 6: return new Symbol(CalcSymbol.NUMBER,
    new Integer(yytext()));
  • ... ...
  • Interface Scanner is defined in the java_cup.runtime
    package
  • public interface Scanner {
  • public Symbol next_token() throws
    java.lang.Exception; }
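
To make the scanner contract concrete, here is a minimal hand-written sketch of the same idea: a class that hands out one token per call. It is self-contained, so Tok and TokScanner are stand-ins invented here for java_cup.runtime.Symbol and java_cup.runtime.Scanner, and the token codes are arbitrary constants playing the role of CalcSymbol. Unlike the lex file above, this sketch returns an EOF token instead of null and rejects unknown characters instead of skipping them.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for java_cup.runtime.Symbol: a token code plus an optional value.
class Tok {
    final int sym;
    final Object value;
    Tok(int sym, Object value) { this.sym = sym; this.value = value; }
    Tok(int sym) { this(sym, null); }
}

// Stand-in for the java_cup.runtime.Scanner interface.
interface TokScanner {
    Tok nextToken() throws Exception;
}

// Hypothetical hand-written scanner for the calculator tokens.
class HandScanner implements TokScanner {
    // Token codes, playing the role of the generated CalcSymbol constants.
    static final int EOF = 0, PLUS = 2, MINUS = 3, TIMES = 4,
                     DIVIDE = 5, LPAREN = 6, RPAREN = 7, NUMBER = 8;

    private final String input;
    private int pos = 0;

    HandScanner(String input) { this.input = input; }

    public Tok nextToken() {
        // Skip whitespace between tokens.
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        if (pos >= input.length()) return new Tok(EOF);
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            // Collect a run of digits and attach its Integer value to the token.
            int start = pos;
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
            return new Tok(NUMBER, Integer.valueOf(input.substring(start, pos)));
        }
        pos++;
        switch (c) {
            case '+': return new Tok(PLUS);
            case '-': return new Tok(MINUS);
            case '*': return new Tok(TIMES);
            case '/': return new Tok(DIVIDE);
            case '(': return new Tok(LPAREN);
            case ')': return new Tok(RPAREN);
            default: throw new IllegalArgumentException("unexpected character: " + c);
        }
    }

    // Convenience helper: the token codes of a whole input, EOF excluded.
    static List<Integer> codes(String s) {
        HandScanner sc = new HandScanner(s);
        List<Integer> out = new ArrayList<>();
        for (Tok t = sc.nextToken(); t.sym != EOF; t = sc.nextToken()) out.add(t.sym);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(codes("(2+4)*3")); // [6, 8, 2, 8, 7, 4, 8]
    }
}
```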

12
Run javaCup
  • Run javaCup to generate the parser
  • D:\214>java java_cup.Main -parser CalcParser
    -symbols CalcSymbol calc.cup
  • Classes generated
  • CalcParser
  • CalcSymbol
  • Compile the parser and relevant classes
  • D:\214>javac CalcParser.java CalcSymbol.java
    CalcParserUser.java
  • Use the parser
  • D:\214>java CalcParserUser

13
The token class Symbol.java
  • public class Symbol {
  • public int sym, left, right;
  • public Object value;
  • public Symbol(int id, int l, int r, Object o) {
  • this(id); left = l; right = r; value = o; }
  • ... ...
  • public Symbol(int id, Object o) { this(id, -1,
    -1, o); }
  • public String toString() { return "#" + sym; }
  • }
  • Instance variables
  • sym: the symbol type
  • left: left position in the original input file
  • right: right position in the original input file
  • value: the lexical value
  • Recall the action in the lex file
  • return new Symbol(CalcSymbol.NUMBER, new Integer
    (yytext()));

14
CalcSymbol.java (default name is sym.java)
  • public class CalcSymbol {
  • public static final int MINUS = 3;
  • public static final int DIVIDE = 5;
  • public static final int NUMBER = 8;
  • public static final int EOF = 0;
  • public static final int PLUS = 2;
  • public static final int error = 1;
  • public static final int RPAREN = 7;
  • public static final int TIMES = 4;
  • public static final int LPAREN = 6;
  • }
  • Contains the token declarations, one for each token
    (terminal), generated from the terminal list in the
    cup file
  • terminal PLUS, MINUS, TIMES, DIVIDE, LPAREN,
    RPAREN;
  • terminal Integer NUMBER;
  • Used by the scanner to refer to symbol types, e.g.,
  • return new Symbol(CalcSymbol.PLUS);
  • The class name comes from the -symbols option.
  • java java_cup.Main -parser CalcParser -symbols
    CalcSymbol calc.cup

15
The program that uses the CalcParser
  • import java.io.*;
  • class CalcParserUser {
  • public static void main(String[] args) throws
    Exception {
  • File inputFile = new File("d:/214/calc.input");
  • CalcParser parser = new CalcParser
  • (new CalcScanner(new FileInputStream(inputFile)));
  • parser.parse();
  • } }
  • The input text to be parsed can be any input
    stream (in this example it is a FileInputStream)
  • The first step is to construct a parser object. A
    parser can be constructed using a scanner.
  • This is how the scanner and the parser get connected.
  • If no error is reported, the expression in
    the input file is correct.

16
Recap
  • To write a parser, how many things do you need to
    write?
  • cup file
  • lex file
  • a program that uses the parser
  • To run a parser, how many things do you need to do?
  • Run javaCup to generate the parser
  • Run JLex to generate the scanner
  • Compile the scanner, the parser, the relevant
    classes, and the class using the parser
  • Relevant classes: CalcSymbol, Symbol
  • Run the class that uses the parser

17
Recap (cont.)
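[Figure: the package java_cup.runtime provides Symbol, Scanner, and lr_parser. CalcScanner (generated by JLex from calc.lex) implements Scanner; CalcParser (generated by javaCup from calc.cup) extends lr_parser. The expression 2*(3+5) goes into CalcScanner, which passes tokens to CalcParser; CalcParserUser obtains the result.]
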
18
Evaluate the expression
  • The previous specification only indicates whether
    parsing succeeds or fails. No semantic
    action is associated with the grammar rules.
  • To calculate the expression, we must add Java
    code to the grammar to carry out actions at
    various points.
  • Form of the semantic action
  • expr:e1 PLUS expr:e2
  • {: RESULT = new Integer(e1.intValue() +
    e2.intValue()); :}
  • Actions (Java code) are enclosed within a pair
    {: :}
  • Labels e1, e2: the objects that represent the
    corresponding terminal or non-terminal
  • RESULT: the type of RESULT should be the same as
    the type of the corresponding non-terminal,
    e.g., expr is of type Integer, so RESULT is of
    type Integer.
  • In the cup file, you need to specify that expr is of
    Integer type.
  • non terminal Integer expr;
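
One way to picture what such an action does: when the parser reduces by a rule, the values of the right-hand-side symbols are taken off a stack and RESULT is pushed in their place. The sketch below imitates that for the PLUS rule with an ordinary Deque. It is a simplified illustration, not the code CUP generates, and ReduceDemo and its methods are hypothetical names. It uses Integer.valueOf where the slides use new Integer, which newer Java versions deprecate.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ReduceDemo {
    // Mimics reducing "expr:e1 PLUS expr:e2": the three right-hand-side
    // entries are popped and the RESULT is pushed in their place.
    static void reducePlus(Deque<Object> stack) {
        Integer e2 = (Integer) stack.pop();  // value labelled e2 (topmost)
        stack.pop();                         // the PLUS entry, whose value is unused
        Integer e1 = (Integer) stack.pop();  // value labelled e1
        Integer result = Integer.valueOf(e1.intValue() + e2.intValue());
        stack.push(result);                  // plays the role of RESULT
    }

    // Pushes "a PLUS b" the way shifts would, then performs the reduction.
    static int addViaStack(int a, int b) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(Integer.valueOf(a));
        stack.push("PLUS");
        stack.push(Integer.valueOf(b));
        reducePlus(stack);
        return (Integer) stack.peek();
    }

    public static void main(String[] args) {
        System.out.println(addViaStack(2, 4)); // prints 6
    }
}
```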

19
Change the calc.cup
  • terminal PLUS, MINUS, TIMES, DIVIDE, LPAREN,
    RPAREN;
  • terminal Integer NUMBER;
  • non terminal Integer expr;
  • precedence left PLUS, MINUS;
  • precedence left TIMES, DIVIDE;
  • expr ::= expr:e1 PLUS expr:e2
  • {: RESULT = new Integer(e1.intValue() +
    e2.intValue()); :}
  • | expr:e1 MINUS expr:e2
  • {: RESULT = new Integer(e1.intValue() -
    e2.intValue()); :}
  • | expr:e1 TIMES expr:e2
  • {: RESULT = new Integer(e1.intValue() *
    e2.intValue()); :}
  • | expr:e1 DIVIDE expr:e2
  • {: RESULT = new Integer(e1.intValue() /
    e2.intValue()); :}
  • | LPAREN expr:e RPAREN {: RESULT = e; :}
  • | NUMBER:e {: RESULT = e; :};
  • How do you guarantee that NUMBER is of Integer type?
  • {NUMBER} { return new Symbol(CalcSymbol.NUMBER,
    new Integer(yytext())); }

20
Change CalcParserUser
  • import java.io.*;
  • class CalcParserUser {
  • public static void main(String[] a) throws
    Exception {
  • CalcParser parser = new CalcParser(
  • new CalcScanner(new FileReader("calc.input")));
  • Integer result = (Integer) parser.parse().value;
  • System.out.println("result is " + result);
  • } }
  • Why can the value of parser.parse() be cast
    to an Integer? Can we cast it to other
    types?
  • This is determined by the type of expr, which is
    the head of the first production in the javaCup
    specification
  • non terminal Integer expr;

21
Calc second round
  • Calc program syntax
  • program → statement | statement program
  • statement → assignment SEMI
  • assignment → ID EQUAL expr
  • expr → expr PLUS expr
  • | expr MULTI expr
  • | LPAREN expr RPAREN
  • | NUMBER
  • | ID
  • Example program
  • x=1; y=2; z=x+y*2;
  • Task: generate and display the parse tree in XML

22
Abstract syntax tree
[Figure: abstract syntax tree for the program x=1; y=2; z=x+y*2;]

23
OO Design Rationale
  • Write a class for every non-terminal
  • Program, Statement, Assignment, Expr
  • Write an abstract class for a non-terminal that
    has alternatives
  • Given a rule statement → assignment |
    ifStatement
  • Statement should be an abstract class
  • Assignment should extend Statement
  • The semantic part of the CUP file constructs the
    object
  • assignment ::= ID:e1 EQUAL expr:e2
  • {: RESULT = new Assignment(e1, e2); :}
  • The first rule returns the top-level object
    (the Program object)
  • The result of parsing is a Program object
  • This is similar to an XML DOM parser.

24
Calc2.cup
  • terminal String ID, LPAREN, RPAREN, EQUAL, SEMI,
    PLUS, MULTI;
  • terminal Integer NUMBER;
  • non terminal Expr expr;
  • non terminal Statement statement;
  • non terminal Program program;
  • non terminal Assignment assignment;
  • precedence left PLUS;
  • precedence left MULTI;
  • program ::= statement:e {: RESULT = new
    Program(e); :}
  • | statement:e1 program:e2 {: RESULT = new
    Program(e1, e2); :};
  • statement ::= assignment:e SEMI {: RESULT = e; :};
  • assignment ::= ID:e1 EQUAL expr:e2
  • {: RESULT = new Assignment(e1, e2); :};
  • expr ::= expr:e1 PLUS:e expr:e2 {: RESULT = new
    Expr(e1, e2, e); :}
  • | expr:e1 MULTI:e expr:e2 {: RESULT = new
    Expr(e1, e2, e); :}
  • | LPAREN expr:e RPAREN {: RESULT = e; :}
  • | NUMBER:e {: RESULT = new Expr(e); :}
  • | ID:e {: RESULT = new Expr(e); :};

25
Program class
  • import java.util.*;
  • public class Program {
  • private Vector statements;
  • public Program(Statement s) {
  • statements = new Vector();
  • statements.add(s); }
  • public Program(Statement s, Program p) {
  • statements = p.getStatements();
  • statements.add(0, s); } // s precedes the statements already in p
  • public Vector getStatements() { return
    statements; }
  • public String toXML() { ... ... }
  • }
  • program ::= statement:e {: RESULT = new
    Program(e); :}
  • | statement:e1 program:e2 {: RESULT = new
    Program(e1, e2); :}
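
Because the rule program ::= statement program is right-recursive, the last statement of the input is reduced first, and each earlier statement is then combined with the Program built so far. Whether the two-argument constructor appends or prepends therefore decides the final order. The sketch below simulates that sequence of constructor calls on plain strings. ProgramOrderDemo is a hypothetical helper written for this illustration, not part of the example project.

```java
import java.util.ArrayList;
import java.util.List;

class ProgramOrderDemo {
    // Replays the reductions for "program ::= statement | statement program":
    // the innermost Program holds the last statement, and every earlier
    // statement is then added to the existing list.
    static List<String> build(String[] stmts, boolean prepend) {
        List<String> list = new ArrayList<>();
        for (int i = stmts.length - 1; i >= 0; i--) {
            if (prepend) {
                list.add(0, stmts[i]);  // keeps source order
            } else {
                list.add(stmts[i]);     // ends up in reverse source order
            }
        }
        return list;
    }

    public static void main(String[] args) {
        String[] stmts = { "s1", "s2", "s3" };
        System.out.println(build(stmts, false)); // [s3, s2, s1]
        System.out.println(build(stmts, true));  // [s1, s2, s3]
    }
}
```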

26
Assignment class
  • class Assignment extends Statement {
  • private String lhs;
  • private Expr rhs;
  • public Assignment(String l, Expr r) {
  • lhs = l;
  • rhs = r; }
  • String toXML() {
  • String result = "<Assignment>";
  • result += "<lhs>" + lhs + "</lhs>";
  • result += rhs.toXML();
  • result += "</Assignment>";
  • return result; }
  • }
  • assignment ::= ID:e1 EQUAL expr:e2
  • {: RESULT = new Assignment(e1, e2); :}

27
Expr class
  • public class Expr {
  • private int value;
  • private String id;
  • private Expr left;
  • private Expr right;
  • private String op;
  • public Expr(Expr l, Expr r, String o) {
    left = l; right = r; op = o; }
  • public Expr(Integer i) {
    value = i.intValue(); }
  • public Expr(String i) { id = i; }
  • public String toXML() { ... }
  • }
  • expr ::= expr:e1 PLUS:e expr:e2
  • {: RESULT = new Expr(e1, e2, e); :}
  • | expr:e1 MULTI:e expr:e2 {: RESULT = new
    Expr(e1, e2, e); :}
  • | LPAREN expr:e RPAREN {: RESULT = e; :}
  • | NUMBER:e {: RESULT = new Expr(e); :}
  • | ID:e {: RESULT = new Expr(e); :}
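
The slides leave the body of Expr.toXML() elided. As a rough sketch of how the pieces could fit together, the block below fills it in and builds the tree for a statement such as z = x + y * 2 by hand, calling the constructors in the order the CUP actions would. The element names Expr, id, and number and the op attribute are assumptions made for this example; only the Assignment and lhs tags come from the slides.

```java
// Abstract base for the alternatives of the statement non-terminal.
abstract class Statement {
    abstract String toXML();
}

class Expr {
    private int value;
    private String id;
    private Expr left, right;
    private String op;

    Expr(Expr l, Expr r, String o) { left = l; right = r; op = o; }
    Expr(Integer i) { value = i.intValue(); }
    Expr(String i) { id = i; }

    // Assumed XML layout: binary nodes carry the operator as an attribute.
    String toXML() {
        if (op != null) {
            return "<Expr op=\"" + op + "\">" + left.toXML() + right.toXML() + "</Expr>";
        }
        if (id != null) {
            return "<id>" + id + "</id>";
        }
        return "<number>" + value + "</number>";
    }
}

class Assignment extends Statement {
    private final String lhs;
    private final Expr rhs;

    Assignment(String l, Expr r) { lhs = l; rhs = r; }

    String toXML() {
        return "<Assignment><lhs>" + lhs + "</lhs>" + rhs.toXML() + "</Assignment>";
    }
}

class AstDemo {
    // Builds the tree for z = x + y * 2, with MULTI grouped below PLUS.
    static Statement sample() {
        Expr product = new Expr(new Expr("y"), new Expr(Integer.valueOf(2)), "*");
        Expr sum = new Expr(new Expr("x"), product, "+");
        return new Assignment("z", sum);
    }

    public static void main(String[] args) {
        System.out.println(sample().toXML());
    }
}
```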

28
Calc2.lex
  • import java_cup.runtime.*;
  • %%
  • %implements java_cup.runtime.Scanner
  • %type Symbol
  • %function next_token
  • %class Calc2Scanner
  • %eofval{
  • return null;
  • %eofval}
  • IDENTIFIER = [a-zA-Z][a-zA-Z0-9_]*
  • NUMBER = [0-9]+
  • %%
  • "+" { return new Symbol(Calc2Symbol.PLUS,
    yytext()); }
  • "*" { return new Symbol(Calc2Symbol.MULTI,
    yytext()); }
  • "=" { return new Symbol(Calc2Symbol.EQUAL,
    yytext()); }
  • ";" { return new Symbol(Calc2Symbol.SEMI,
    yytext()); }
  • "(" { return new Symbol(Calc2Symbol.LPAREN,
    yytext()); }
  • ")" { return new Symbol(Calc2Symbol.RPAREN,
    yytext()); }
  • {IDENTIFIER} { return new Symbol(Calc2Symbol.ID,
    yytext()); }
  • {NUMBER} { return new Symbol(Calc2Symbol.NUMBER,
    new Integer(yytext())); }

29
Calc2Parser user
  • import java.io.*;
  • class ProgramProcessor {
  • public static void main(String[] args) throws
    Exception {
  • File inputFile = new File("d:/214/calc2.input");
  • Calc2Parser parser = new Calc2Parser(
  • new Calc2Scanner(new
    FileInputStream(inputFile)));
  • Program pm = (Program) parser.debug_parse().value;
  • String xml = pm.toXML();
  • System.out.println("result is " + xml);
  • } }
  • debug_parse() prints debug information, such as
    the current token being processed and the rule being
    applied.
  • Useful for debugging a javaCup specification.
  • The parsing result value is of type Program; this is
    decided by the type of the program rule
  • program ::= statement:e {: RESULT = new
    Program(e); :}
  • | statement:e1 program:e2 {: RESULT = new
    Program(e1, e2); :}

30
Another way to define the expression syntax
  • terminal PLUS, MINUS, TIMES, DIV, LPAREN, RPAREN;
  • terminal NUMLIT;
  • non terminal Expression, Term, Factor;
  • start with Expression;
  • Expression ::= Expression PLUS Term
  • | Expression MINUS Term
  • | Term;
  • Term ::= Term TIMES Factor
  • | Term DIV Factor
  • | Factor;
  • Factor ::= NUMLIT
  • | LPAREN Expression RPAREN;
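
This layered grammar needs no precedence declarations because the nesting of Expression, Term, and Factor already encodes them: TIMES and DIV are handled one level deeper than PLUS and MINUS, and the left recursion gives left associativity. The hand-written evaluator below follows the same three layers to show the effect. It is a sketch only: CUP generates a table-driven LALR parser, not code like this, and the left-recursive rules are replaced here by loops. The class name LayeredCalc is hypothetical, and the input is assumed to contain non-negative integer literals.

```java
// Hand-written evaluator with one method per non-terminal of the grammar.
class LayeredCalc {
    private final String src;
    private int pos = 0;

    LayeredCalc(String src) { this.src = src.replaceAll("\\s+", ""); }

    static int eval(String s) {
        LayeredCalc c = new LayeredCalc(s);
        int v = c.expression();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + c.pos);
        }
        return v;
    }

    // Expression ::= Expression PLUS Term | Expression MINUS Term | Term
    private int expression() {
        int v = term();
        while (pos < src.length() && (peek() == '+' || peek() == '-')) {
            char op = src.charAt(pos++);
            int rhs = term();
            v = (op == '+') ? v + rhs : v - rhs;  // folds left to right
        }
        return v;
    }

    // Term ::= Term TIMES Factor | Term DIV Factor | Factor
    private int term() {
        int v = factor();
        while (pos < src.length() && (peek() == '*' || peek() == '/')) {
            char op = src.charAt(pos++);
            int rhs = factor();
            v = (op == '*') ? v * rhs : v / rhs;  // integer division
        }
        return v;
    }

    // Factor ::= NUMLIT | LPAREN Expression RPAREN
    private int factor() {
        if (pos < src.length() && peek() == '(') {
            pos++;
            int v = expression();
            if (pos >= src.length() || peek() != ')') {
                throw new IllegalArgumentException("')' expected at position " + pos);
            }
            pos++;
            return v;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(peek())) pos++;
        if (start == pos) {
            throw new IllegalArgumentException("number expected at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    private char peek() { return src.charAt(pos); }

    public static void main(String[] args) {
        System.out.println(eval("(2+4)*3")); // 18
        System.out.println(eval("2+3*4"));   // 14
        System.out.println(eval("10-4-3"));  // 3
    }
}
```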