Compiler Design Chapter 3 - PowerPoint PPT Presentation

Provided by: jian9
Transcript and Presenter's Notes

1
Compiler Design - Chapter 3
Parsing Context-Free Grammars
2
Syntax
3
Symbols representing regular expressions
  • Abbreviations:
  • digits = [0-9]+
  • sum = (digits "+")* digits
  • Defines sums of the form
  • 28+301+9

4
Balanced Parentheses
  • digits = [0-9]+
  • sum = expr "+" expr
  • expr = "(" sum ")" | digits
  • defines
  • (109+23)
  • 61
  • (1+(250+3))
  • A finite automaton cannot recognize balanced parentheses:
    a machine with N states cannot remember a
    parenthesis-nesting depth greater than N
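A recognizer for this language needs recursion rather than a fixed number of states. The following is a minimal sketch, assuming the grammar reads sum = expr "+" expr and expr = "(" sum ")" | digits; the function name recognize and its helper names are illustrative only.

```python
def recognize(text: str) -> bool:
    """Return True if text is an expr of the (assumed) grammar:
         sum  = expr "+" expr
         expr = "(" sum ")" | digits
    """
    pos = 0  # index of the next unread character

    def digits() -> bool:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] in "0123456789":
            pos += 1
        return pos > start  # at least one digit was consumed

    def expr() -> bool:
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1
            if not sum_():
                return False
            if pos < len(text) and text[pos] == ")":
                pos += 1
                return True
            return False
        return digits()

    def sum_() -> bool:
        nonlocal pos
        if not expr():
            return False
        if pos < len(text) and text[pos] == "+":
            pos += 1
            return expr()
        return False

    # The whole input must be consumed by a single expr.
    return expr() and pos == len(text)
```

The call stack plays the role of the unbounded memory that a finite automaton lacks: each open parenthesis adds one pending call to expr. Under this grammar a bare 1+2 is a sum but not an expr, so recognize rejects it.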

5
How does a lexical analyzer implement regular
expression abbreviations?
digits = [0-9]+
The RHS [0-9]+ is substituted for digits before
translation to a finite automaton. This is not possible
for sum and expr, because they are mutually recursive.
Recursion is the important feature.
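The substitution step can be sketched as plain text expansion. This is an illustration only: the {name} placeholder syntax, the abbreviations table, and the rsum/rexpr entries are assumptions made for the example, not how any particular lexer generator stores its definitions.

```python
import re

# Hypothetical abbreviation table; {name} refers to another abbreviation.
abbreviations = {
    "digits": r"[0-9]+",
    "sum": r"({digits}\+)*{digits}",
    # Mutually recursive pair, in the style of the sum/expr definitions.
    "rsum": r"{rexpr}\+{rexpr}",
    "rexpr": r"\({rsum}\)|{digits}",
}


def expand(name: str, seen: tuple = ()) -> str:
    """Replace every {abbrev} by its right-hand side until none remain."""
    if name in seen:
        # Expansion would never terminate for recursive definitions.
        raise ValueError(f"recursive abbreviation: {name}")
    body = abbreviations[name]
    return re.sub(
        r"\{(\w+)\}",
        lambda m: "(?:" + expand(m.group(1), seen + (name,)) + ")",
        body,
    )


pattern = expand("sum")
print(bool(re.fullmatch(pattern, "28+301+9")))
```

Calling expand("rsum") raises ValueError, which is the textual counterpart of the point above: a recursive definition has no finite expansion into a regular expression.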
6
Alternation at top level
can be re-written as
7
Elimination of alternation
can be re-written as
8
Kleene closure unnecessary
can be re-written as
9
Grammars
  • Regular expressions describe the structure of
    lexical tokens
  • Grammars define syntactic structure declaratively
  • Need a more powerful tool than finite automata to
    parse languages described by grammars
  • Grammars can be used to describe the structure of
    lexical tokens but regular expressions are
    adequate and more concise

10
Context-free Grammars
  • Language: a set of strings
  • Each string: a finite sequence of symbols taken from a
    finite alphabet
  • For parsing:
  • Strings: source programs
  • Symbols: lexical tokens
  • Alphabet: the set of token types returned by the
    lexical analyzer

11
Context-free Grammars
  • A context-free grammar describes a language
  • A grammar has a set of productions of the form
    symbol → symbol symbol ... symbol
    with zero or more symbols on the RHS
  • A symbol can be
  • Terminal: a token from the alphabet of strings in
    the language
  • a terminal cannot appear on the LHS of a production
  • Non-terminal: appears on the LHS of some production
  • can also appear on the RHS
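The terminal/non-terminal distinction can be computed directly from a list of productions. The sketch below uses a small expression grammar chosen for illustration (it is an assumption, not one of the numbered grammars from the slides).

```python
# Each production is (LHS, RHS); the RHS is a tuple of zero or more symbols.
productions = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("id",)),
    ("F", ("num",)),
    ("F", ("(", "E", ")")),
]

# A non-terminal appears on the LHS of some production.
nonterminals = {lhs for lhs, _ in productions}

# A terminal appears only on right-hand sides, never on an LHS.
terminals = {sym for _, rhs in productions for sym in rhs} - nonterminals

print(sorted(nonterminals))
print(sorted(terminals))
```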

12
Context-free grammar example
A syntax for straight-line programs
13
Context-free grammar example
Sentence (statement) in this language
From the source language (before lexical analysis)
Token types (terminal symbols): id, num, etc.
Names (a, b, c, d) and numbers (7, 5, 6) are
semantic values associated with tokens
14
Derivation
  • A derivation shows that this sentence is in the
    language of the grammar
  • Start with the start symbol
  • Repeatedly replace any non-terminal by one of its RHSs
  • Leftmost: always expand the leftmost non-terminal
    first
  • Rightmost: always expand the rightmost non-terminal
    first
  • A derivation can also be neither leftmost nor rightmost
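The replacement procedure can be sketched as list rewriting. The grammar below is a small expression grammar assumed for illustration, and derive and its choices argument (the index of the production to apply at each step) are hypothetical names.

```python
# Non-terminal -> list of alternative right-hand sides (assumed grammar).
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["num"]],
}


def derive(start, choices, leftmost=True):
    """Apply one production per entry of `choices` to the leftmost
    (or rightmost) non-terminal; return every sentential form."""
    form = [start]
    steps = [list(form)]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = GRAMMAR[form[i]][choice]  # replace by the chosen RHS
        steps.append(list(form))
    return steps


left = derive("E", [0, 1, 1, 0, 0, 1, 0, 0], leftmost=True)
right = derive("E", [0, 0, 0, 1, 0, 1, 1, 0], leftmost=False)
for form in left:
    print(" ".join(form))
```

Both calls end in the sentence num + num * num; they differ only in the order in which non-terminals are expanded.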

15
Left-most derivation
16
Parse tree
  • Parse tree
  • Connecting each symbol in derivation to the
    one from which it was derived
  • Two different derivations can have the same
    parse tree

17
Ambiguous Grammars
A grammar is ambiguous if the same sentence can be
derived with two different parse trees
Two parse trees for the same sentence
18
Another Ambiguous Grammar
Grammar 3.5
[Figure: two parse trees for the same sentence, with values -4 and 2]
19
Another Ambiguous Grammar
Grammar 3.5
[Figure: two parse trees for the same sentence, with values 9 and 7]
Different semantics!
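The effect of ambiguity on meaning can be checked by enumerating every parse tree and evaluating it. This sketch assumes a simplified ambiguous grammar E → E op E | num (the full text of Grammar 3.5 is not reproduced in this transcript), and all_values is a hypothetical helper.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def all_values(tokens):
    """Values of every parse tree of `tokens` under E -> E op E | num.

    `tokens` alternates numbers and operator strings, e.g. [1, "-", 2].
    """
    if len(tokens) == 1:
        return [tokens[0]]
    values = []
    # In an ambiguous grammar any operator may be the root of the tree.
    for i in range(1, len(tokens), 2):
        op = OPS[tokens[i]]
        for left in all_values(tokens[:i]):
            for right in all_values(tokens[i + 1:]):
                values.append(op(left, right))
    return values


print(all_values([1, "-", 2, "-", 3]))  # [2, -4]
print(all_values([1, "+", 2, "*", 3]))  # [7, 9]
```

The inputs 1-2-3 and 1+2*3 are chosen because their two groupings give the values -4 and 2, and 9 and 7, respectively.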
20
Ambiguous Grammars
  • Problematic for compiling
  • Need unambiguous grammars
  • Ambiguous grammars can often be transformed to
    unambiguous grammars

21
Transforming an Ambiguous Grammar
  • * has higher precedence (binds tighter) than +
  • Each operator associates to the left: (1-2)-3, not
    1-(2-3)
  • Introduce new non-terminal symbols
  • E: expression
  • T: term (things you add)
  • F: factor (things you multiply)
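The E/T/F layering can be sketched as a small evaluator with one function per non-terminal. The grammar in the comments is the usual textbook form and is an assumption here, and the left-recursive rules are implemented as loops, which is a common hand-written substitute rather than anything the slides prescribe.

```python
import re


def evaluate(text: str) -> int:
    tokens = re.findall(r"\d+|[-+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():  # F -> num | ( E )
        tok = advance()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)

    def term():  # T -> T * F | F  (loop gives left associativity)
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def expr():  # E -> E + T | E - T | T
        value = term()
        while peek() in ("+", "-"):
            op = advance()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return result


print(evaluate("1-2-3"))  # -4
print(evaluate("1+2*3"))  # 7
```

Because term is called from expr, multiplication is grouped before addition, and because each loop folds its operands from the left, 1-2-3 is read as (1-2)-3.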

22
Unambiguous Grammar
23
Associate to the right
24
Ambiguous Languages
  • Ambiguity can usually be eliminated
  • Some languages, however, have only ambiguous grammars
  • These languages are problematic as programming
    languages
  • Syntactic ambiguity causes problems in
    writing and understanding programs

25
EOF Marker
Start symbol
Augmented Grammar Added
EOF Marker