Top Down Parsing - PowerPoint PPT Presentation

Provided by: drakbar

Transcript and Presenter's Notes


1
Top Down Parsing
  • Recursive Descent Parsing
  • Top-down parsing
  • Build tree from root symbol
  • Each production corresponds to one recursive
    procedure
  • Each procedure recognizes an instance of a
    non-terminal, returns tree fragment for the
    non-terminal

2
General model
  • Each right-hand side of a production provides
    body for a function
  • Each non-terminal on the right hand side is
    translated into a call to the function that
    recognizes that non-terminal
  • Each terminal on the right-hand side is
    translated into a call to the lexical scanner. If
    the resulting token is not the expected terminal,
    an error occurs.
  • Each recognizing function returns a tree fragment.

3
Example parsing a declaration
  • FULL_TYPE_DECLARATION ::=
    type DEFINING_IDENTIFIER is TYPE_DEFINITION ;
  • Translates into
  • get token type
  • find a defining_identifier -- function call
  • get token is
  • recognize a type_definition -- function call
  • get token semicolon
  • In practice, we already know that the first token
    is type; that's why this routine was called in
    the first place! Predictive parsing is guided by
    the next token.
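
The translation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Parser` class, the toy scanner, and the stand-in rule that a type definition is a single name are all hypothetical and not part of the slides.

```python
import re

KEYWORDS = {"type", "is"}

class ParseError(Exception):
    pass

class Parser:
    """Hypothetical minimal parser: a token list plus a cursor."""

    def __init__(self, text):
        # Toy scanner (assumption): identifiers, keywords and ';' only.
        self.tokens = re.findall(r"[A-Za-z_]\w*|;", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, terminal):
        # A terminal on the right-hand side: check the next token, then scan past it.
        tok = self.peek()
        if tok != terminal:
            raise ParseError(f"expected {terminal!r}, got {tok!r}")
        self.pos += 1

    def name(self, kind):
        # Shared helper for the two stand-in non-terminals below.
        tok = self.peek()
        if tok is None or tok in KEYWORDS or tok == ";":
            raise ParseError(f"expected a name, got {tok!r}")
        self.pos += 1
        return (kind, tok)

    def defining_identifier(self):
        return self.name("identifier")

    def type_definition(self):
        # Assumption: a type definition is just one name in this sketch.
        return self.name("type_name")

    def full_type_declaration(self):
        # One statement per symbol of the right-hand side.
        self.expect("type")
        ident = self.defining_identifier()
        self.expect("is")
        definition = self.type_definition()
        self.expect(";")
        return ("type_declaration", ident, definition)

    def declaration(self):
        # Predictive step: the next token selects the routine to call.
        if self.peek() == "type":
            return self.full_type_declaration()
        raise ParseError(f"no declaration starts with {self.peek()!r}")

tree = Parser("type Colour is Integer;").declaration()
```

Each recognizing function returns a tuple as its tree fragment, and `declaration` dispatches on the next token before `full_type_declaration` is ever entered.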

4
Example parsing a loop
  • FOR_STATEMENT ::=
    ITERATION_SCHEME loop STATEMENTS end loop ;
  • Node1 := find_iteration_scheme -- call function
  • get token loop
  • List1 := sequence of statements -- call function
  • get token end
  • get token loop
  • get token semicolon
  • Result := build loop_node with Node1 and List1
  • return Result

5
Problem
  • If there are multiple productions for a
    non-terminal, a mechanism is required to determine
    which production to use
  • IF_STAT ::= if COND then Stats end if
  • IF_STAT ::= if COND then Stats ELSIF_PART
    end if
  • When the next token is if, which production
    should be used?

6
One solution: factorize the grammar
  • If several productions have the same prefix,
    rewrite them as a single production
  • IF_STAT ::= if COND then STATS [ELSIF_PART] end
    if
  • Problem now reduces to recognizing whether an
    optional component (ELSIF_PART) is present
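
A minimal Python sketch of the factored production follows. It parses the common prefix once and then uses one token of lookahead to decide whether the optional part is present. The names are hypothetical, and the sketch assumes that COND is a single token, that STATS is a run of tokens ending at `elsif` or `end`, and that ELSIF_PART is one or more `elsif COND then STATS` groups.

```python
STOP_TOKENS = {"elsif", "end"}
RESERVED = {"if", "then", "elsif", "end"}

def parse_if_stat(tokens):
    """IF_STAT ::= if COND then STATS [ELSIF_PART] end if (toy COND and STATS)."""
    pos = 0

    def expect(terminal):
        nonlocal pos
        found = tokens[pos] if pos < len(tokens) else None
        if found != terminal:
            raise SyntaxError(f"expected {terminal!r}, got {found!r}")
        pos += 1

    def cond():
        nonlocal pos
        found = tokens[pos] if pos < len(tokens) else None
        if found is None or found in RESERVED:
            raise SyntaxError(f"expected a condition, got {found!r}")
        pos += 1
        return found

    def stats():
        nonlocal pos
        found = []
        while pos < len(tokens) and tokens[pos] not in STOP_TOKENS:
            found.append(tokens[pos])
            pos += 1
        return found

    def branch():
        condition = cond()
        expect("then")
        return (condition, stats())

    # Common prefix shared by both original productions.
    expect("if")
    branches = [branch()]
    # Optional component: present only if the next token is elsif.
    while pos < len(tokens) and tokens[pos] == "elsif":
        pos += 1
        branches.append(branch())
    expect("end")
    expect("if")
    return ("if", branches)

plain = parse_if_stat("if a then x y end if".split())
with_elsif = parse_if_stat("if a then x elsif b then y end if".split())
```

Both inputs go through the same function. No choice between productions is needed when the first `if` is seen.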

7
Second Problem of Recursion
  • Grammar should not be left-recursive
  • E ::= E + T | T
  • Problem: to find an E, start by finding an E
  • Original scheme leads to infinite loop
  • Grammar is inappropriate for recursive-descent

8
Solution to left-recursion
  • E ::= E + T | T means that eventually E expands
    into
  • T + T + T ...
  • Rewrite as
  • E ::= T E'
  • E' ::= + T E' | epsilon
  • Informally: E' is a possibly empty sequence of
    terms separated by an operator

9
Recursion can involve multiple productions
  • A ::= B C | D
  • B ::= A E | F
  • Can be rewritten as
  • A ::= A E C | F C | D
  • Now apply previous method
  • A general algorithm exists to detect and remove
    left-recursion
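
The two steps on this slide, substitution followed by removal of the now immediate left recursion, can be sketched in Python. The representation is an assumption made for illustration: each right-hand side is a tuple of symbols, the empty tuple stands for epsilon, and the function names are hypothetical.

```python
def substitute(productions, nt, nt_productions):
    """Replace a leading `nt` in each right-hand side by each alternative of `nt`."""
    result = []
    for rhs in productions:
        if rhs and rhs[0] == nt:
            result.extend(alt + rhs[1:] for alt in nt_productions)
        else:
            result.append(rhs)
    return result

def remove_immediate_left_recursion(nt, productions):
    """Rewrite A ::= A x | y as A ::= y A' and A' ::= x A' | epsilon."""
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    others = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: list(productions)}
    new_nt = nt + "'"
    return {
        nt: [rhs + (new_nt,) for rhs in others],
        new_nt: [tail + (new_nt,) for tail in recursive] + [()],
    }

a_rules = [("B", "C"), ("D",)]   # A ::= B C | D
b_rules = [("A", "E"), ("F",)]   # B ::= A E | F

# Step 1: expand the leading B, exposing the hidden left recursion in A.
a_expanded = substitute(a_rules, "B", b_rules)
# Step 2: apply the previous method to the expanded rules.
a_fixed = remove_immediate_left_recursion("A", a_expanded)
```

After step 1, `a_expanded` corresponds to A ::= A E C | F C | D. After step 2, A no longer begins with itself.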

10
Further Problem
  • Transformation does not preserve associativity
  • E ::= E + T | T
  • Parses a + b + c as (a + b) + c
  • E ::= T E', E' ::= + T E' | epsilon
  • Parses a + b + c as a + (b + c)
  • Incorrect for a - b - c: must rewrite tree
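
The effect can be seen in a small Python sketch of the rewritten grammar E ::= T E', E' ::= op T E' | epsilon, with one function per non-terminal. The assumptions are that a term is a single operand token and that the tree is built in the most direct way from the grammar. All names are hypothetical.

```python
ADDOPS = {"+", "-"}

def parse_right_recursive(tokens):
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ADDOPS:
            raise SyntaxError("expected a term")
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        # E ::= T E'
        return expr_tail(term())

    def expr_tail(left):
        # E' ::= op T E' | epsilon
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ADDOPS:
            op = tokens[pos]
            pos += 1
            right = expr_tail(term())   # the rest of the sequence nests to the right
            return (op, left, right)
        return left                      # epsilon: nothing more to consume

    return expr()

def evaluate(tree):
    if isinstance(tree, tuple):
        op, left, right = tree
        lhs, rhs = evaluate(left), evaluate(right)
        return lhs + rhs if op == "+" else lhs - rhs
    return int(tree)

tree = parse_right_recursive(["10", "-", "4", "-", "3"])
value = evaluate(tree)
```

The parser terminates, but `tree` is nested to the right, so `value` is 10 - (4 - 3) = 9 instead of the conventional (10 - 4) - 3 = 3.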

11
In practice: use a loop to find a sequence of terms
  • Node1 := P_Term; -- call function that
    recognizes a term
  • loop
  •   exit when Token not in
      Token_Class_Binary_Addop;
  •   Node2 := New_Node (P_Binary_Adding_Operator);
  •   Scan; -- past operator
  •   Set_Left_Opnd (Node2, Node1);
  •   Set_Right_Opnd (Node2, P_Term); -- find next term
  •   Set_Op_Name (Node2);
  •   Node1 := Node2; -- operand for next operation
  • end loop;
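
The same loop can be sketched in Python. Tuples stand in for the tree nodes, and a term is assumed to be a single operand token. The function and variable names are hypothetical and only mirror the roles of `Node1` and `P_Term` above.

```python
ADDOPS = {"+", "-"}

def parse_terms(tokens):
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ADDOPS:
            raise SyntaxError("expected a term")
        tok = tokens[pos]
        pos += 1
        return tok

    node1 = term()                        # recognize the first term
    while pos < len(tokens) and tokens[pos] in ADDOPS:
        op = tokens[pos]
        pos += 1                          # scan past the operator
        node1 = (op, node1, term())       # old tree becomes the left operand
    return node1

tree = parse_terms(["a", "-", "b", "-", "c"])
```

Because the tree built so far always becomes the left operand of the next node, the result is left-associative, (a - b) - c, with no left recursion in the parser.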

12
LL (1) Parsing
  • LL (1) grammars
  • If table construction is successful, the grammar is
    LL (1): left-to-right scan, leftmost derivation,
    with one-token lookahead.
  • If construction fails, one can conceive of LL (2),
    etc.
  • Ambiguous grammars are never LL (k)
  • If a terminal is in First for two different
    productions of A, the grammar cannot be LL (1).
  • Grammars with left-recursion are never LL (k)
  • Some useful constructs are not LL (k)

13
Building LL (1) parse tables
  • Table indexed by non-terminal and token. Table
    entry is a production
  • for each production P: A ::= α loop
  •   for each terminal a in First (α) loop
  •     T (A, a) := P
  •   end loop
  •   if ε in First (α) then
  •     for each terminal b in Follow (A) loop
  •       T (A, b) := P
  •     end loop
  •   end if
  • end loop
  • All other entries are errors.
  • If two assignments conflict, parse table cannot
    be built.
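
A compact Python sketch of this construction follows. The slide does not show how First and Follow are obtained, so this sketch computes them by a standard fixed-point iteration. The representation is an assumption: a grammar is a dict from non-terminal to a list of right-hand-side tuples, the empty tuple is the epsilon production, and `$` marks end of input. All names are hypothetical.

```python
EPS = "ε"
END = "$"

def first_of_sequence(symbols, first):
    """First set of a symbol string; contains EPS if the whole string can vanish."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result

def build_ll1_table(grammar, start):
    nonterminals = set(grammar)
    terminals = {s for alts in grammar.values() for rhs in alts for s in rhs} - nonterminals
    first = {t: {t} for t in terminals}
    first.update({nt: set() for nt in nonterminals})
    follow = {nt: set() for nt in nonterminals}
    follow[start].add(END)

    changed = True
    while changed:                        # iterate until no set grows any more
        changed = False
        for nt, alts in grammar.items():
            for rhs in alts:
                new_first = first_of_sequence(rhs, first)
                if not new_first <= first[nt]:
                    first[nt] |= new_first
                    changed = True
                for i, sym in enumerate(rhs):
                    if sym in nonterminals:
                        tail = first_of_sequence(rhs[i + 1:], first)
                        new_follow = tail - {EPS}
                        if EPS in tail:
                            new_follow |= follow[nt]
                        if not new_follow <= follow[sym]:
                            follow[sym] |= new_follow
                            changed = True

    table = {}
    for nt, alts in grammar.items():      # for each production P: A ::= α
        for rhs in alts:
            firsts = first_of_sequence(rhs, first)
            lookaheads = firsts - {EPS}
            if EPS in firsts:             # α can vanish: use Follow (A) as well
                lookaheads |= follow[nt]
            for tok in lookaheads:
                if (nt, tok) in table:
                    raise ValueError(f"conflict at ({nt}, {tok}): grammar is not LL(1)")
                table[(nt, tok)] = rhs
    return table

grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("id",)],
}
table = build_ll1_table(grammar, "E")
```

Entries missing from `table` are the error entries. For this grammar the epsilon production of E' is selected on `$`, which is the only member of Follow (E'). A left-recursive grammar such as E ::= E + T | T makes the function raise, since both productions claim the same table cell.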

14
Left Recursion Removal and Left Factoring
  • Left Recursion Removal
  • Left Factoring

15
Syntax Tree Construction in LL(1)
  • First and Follow Sets
  • LL(k) Parsers (Extending the Lookahead)
  • Error Recovery in Top Down Parsers
  • Error Recovery in LL(1) Parsers