1
Approximating Context-Free Grammar Ambiguity
  • Claus Brabrand
  • brabrand_at_brics.dk
  • BRICS, Department of Computer Science
  • University of Aarhus, Denmark

2
// Abstract
Approximating Context-Free Grammar Ambiguity
Context-free grammar ambiguity is
undecidable. However, just because it's
undecidable doesn't mean there aren't (good)
approximations! Indeed, the whole area of static
analysis works on side-stepping
undecidability. We exhibit a characterization
of context-free ambiguity which induces a whole
framework for approximating the problem. In
particular, we give an approximation, A_MN, based
on the [Mohri-Nederhof, 2000] regular
approximation of context-free grammars and show
how to boost the precision even further.
3
// Outline
  • Introduction
  • Vertical / Horizontal Ambiguity
  • Characterization of Ambiguity
  • (Over-)Approximation Framework
  • Approximation (A_MN)
  • Assessment
  • Related Work
  • Conclusion

4
// Context-Free Grammar
  • N: finite set of nonterminals
  • Σ: finite set of terminals
  • s ∈ N: start nonterminal
  • π : N → P(E*): production function, where E = N ∪ Σ

G = ⟨N, Σ, s, π⟩
  • Assume
  • All n ∈ N reachable (from s)
  • All n ∈ N derive some (finite) string

L : G → P(Σ*), the language of G, written L(G)
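
The two assumptions on this slide (every nonterminal is reachable from s and derives some finite string) can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding that is not from the talk: a grammar is a Python dict from nonterminal names to lists of alternatives, each alternative is a tuple of symbols, and any symbol that is not a key is treated as a terminal. The example grammar and the names `reachable`, `productive` and `EXAMPLE` are illustrative only.

```python
def reachable(grammar, start):
    """Nonterminals reachable from `start` (worklist search)."""
    seen, todo = {start}, [start]
    while todo:
        for alt in grammar[todo.pop()]:
            for symbol in alt:
                if symbol in grammar and symbol not in seen:
                    seen.add(symbol)
                    todo.append(symbol)
    return seen


def productive(grammar):
    """Nonterminals that derive at least one finite terminal string (fixed point)."""
    done = set()
    changed = True
    while changed:
        changed = False
        for n, alts in grammar.items():
            if n in done:
                continue
            # n is productive if some alternative uses only terminals
            # and nonterminals already known to be productive
            if any(all(s not in grammar or s in done for s in alt) for alt in alts):
                done.add(n)
                changed = True
    return done


# Hypothetical example: U is neither reachable from S nor productive.
EXAMPLE = {
    "S": [("A", "A")],
    "A": [("x", "A", "x"), ("y",)],
    "U": [("U", "x")],
}
well_formed = reachable(EXAMPLE, "S") == productive(EXAMPLE) == set(EXAMPLE)
```

Here `well_formed` is false because of U; dropping U from the dict would make the grammar satisfy both assumptions.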
5
// Relevant CFG Decision Problems
  • Decidable
  • Membership: ω ∈ L(G_CFG)
  • Emptiness: L(G_CFG) = ∅
  • Intersection (w/ REG): L(G_CFG) ∩ L(R_REG) = L(C_CFG)
  • constructively
  • Undecidable
  • Intersection (w/ CFG): L(G_CFG) ∩ L(G'_CFG) = ∅
  • Ambiguity: ∃ω ∈ Σ* : 2 derivation trees for ω

6
// Ambiguity Undecidable!
  • Algorithms
  • Undecidable!
  • However
  • Ambiguity: ∃ω ∈ Σ* : 2 derivation trees for ω

[Figure: two distinct derivation trees from s for the same string ω; grammars divided into "ambiguous" and "unambiguous"]
7
// Side-Stepping Undecidability
However, just because it's undecidable doesn't
mean there aren't (good) approximations! Indeed,
the whole area of static analysis works on
side-stepping undecidability.
  • Unsafe approximation
  • Safe approximation

[Figure: three diagrams over the regions "ambiguous" / "unambiguous": an unsafe approximation, a safe (over-)approximation, and a safe (under-)approximation]
8
// Motivation
  • Use safe (over-)approximation
  • Yes! ⇒ G guaranteed unambiguous!!!
  • Safely use any GLR parser on G
  • Because never two parses at runtime!
  • Hence
  • dynamic parse ambiguity → static parse ambiguity

[Figure: a grammar answered "Yes!" lies in the "unambiguous" region, outside the over-approximation of "ambiguous"]
9
// Motivation (contd)
  • Undecidability means there'll always be some
    slack
  • However, still useful!
  • Possible interpretations of "No?"
  • Treat as error (reject grammar)
  • Please redesign your grammar (as in LALR(k))
  • Treat as warning
  • Here are some potential problems

[Figure: the answer "No?" is given both to ambiguous grammars and to some unambiguous ones (the slack)]
10
// Vertical Ambiguity
  • Vertical ambiguity
      G is vertically unambiguous iff
      ∀n ∈ N: ∀α, α' ∈ π(n): α ≠ α' ⇒ L(α) ∩ L(α') = ∅
  • Example
      Z → x A y | x B y
      A → a
      B → a
  • Ambiguous string: xay
    (reduce/reduce conflict in Yacc)
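
The vertical check on this slide can be tried out directly on the example grammar (Z with the alternatives x A y and x B y, where A and B both derive a). The sketch below is only an illustration under a strong assumption: it enumerates L(α) exhaustively, which works solely for non-recursive grammars with finite languages. The dict encoding and the names `language` and `vertical_conflicts` are hypothetical, not part of the talk.

```python
from itertools import product

# Hypothetical encoding: dict keys are nonterminals, everything else is a terminal.
GRAMMAR = {
    "Z": [("x", "A", "y"), ("x", "B", "y")],
    "A": [("a",)],
    "B": [("a",)],
}


def language(form, grammar):
    """Return L(form) as a set of strings; only valid for non-recursive grammars."""
    parts = []
    for symbol in form:
        if symbol in grammar:
            # union of the languages of all alternatives of this nonterminal
            parts.append(set().union(*(language(alt, grammar) for alt in grammar[symbol])))
        else:
            parts.append({symbol})
    return {"".join(choice) for choice in product(*parts)}


def vertical_conflicts(grammar):
    """Yield (n, alt, alt', shared strings) for each pair of overlapping alternatives."""
    for n, alts in grammar.items():
        for i in range(len(alts)):
            for j in range(i + 1, len(alts)):
                shared = language(alts[i], grammar) & language(alts[j], grammar)
                if shared:
                    yield n, alts[i], alts[j], shared


conflicts = list(vertical_conflicts(GRAMMAR))
```

For this grammar the only conflict reported is between the two alternatives of Z, with the shared string xay.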
11
// Horizontal Ambiguity
  • Horizontal ambiguity
      G is horizontally unambiguous iff
      ∀n ∈ N: ∀α ∈ π(n): ∀i ∈ 1..|α|-1: L(α_0 .. α_{i-1}) ⩀ L(α_i .. α_{|α|-1}) = ∅
  • where ⩀ is the overlap operator
      ⩀ : P(Σ*) × P(Σ*) → P(Σ*)
      X ⩀ Y = { xay | x,y ∈ Σ* ∧ a ∈ Σ+ ∧ x,xa ∈ L(X) ∧ y,ay ∈ L(Y) }
  • Example
      Z → A B
      A → x a | x
      B → a y | y
  • Ambiguous string: xay
    (shift/reduce conflict in Yacc)
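
The overlap of two languages, as defined on this slide (all strings xay with a non-empty, where both x and xa belong to the first language and both y and ay belong to the second), can be computed by brute force when both languages are finite. The function below is a sketch under that finiteness assumption; the name `overlap` is made up for illustration, and the talk itself works with automata rather than finite sets.

```python
def overlap(X, Y):
    """All strings x+a+y with a non-empty, x and x+a in X, y and a+y in Y.

    X and Y are finite sets of strings.
    """
    result = set()
    for xa in X:
        for cut in range(len(xa)):        # cut < len(xa) keeps a non-empty
            x, a = xa[:cut], xa[cut:]
            if x not in X:
                continue
            for ay in Y:
                if ay.startswith(a) and ay[len(a):] in Y:
                    result.add(x + ay)
    return result


# Languages of A and B in the example Z -> A B, A -> x a | x, B -> a y | y
example_overlap = overlap({"xa", "x"}, {"ay", "y"})
```

For the example, the overlap is exactly the ambiguous string xay: it splits as x·ay and as xa·y.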
12
// Characterization of Ambiguity
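  • Theorem 1
      G vertically unambiguous ∧ G horizontally unambiguous ⟺ G unambiguous
  • Lemma 1a (⇒)
      G vertically unambiguous ∧ G horizontally unambiguous ⇒ G unambiguous
  • Lemma 1b (⇐)
      G vertically unambiguous ∧ G horizontally unambiguous ⇐ G unambiguous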
13
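// Proof (Lemma 1a) ⇒
G vertically unambiguous ∧ G horizontally unambiguous ⇒ G unambiguous
  • or contrapositively
      G ambiguous ⇒ G vertically ambiguous ∨ G horizontally ambiguous
  • Proof
  • Assume G ambiguous (i.e. ∃ 2 derivation trees for ω)
  • Show
      G vertically ambiguous ∨ G horizontally ambiguous
  • by induction on the max height of the 2 derivation
    trees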
14
// Proof (Lemma 1a) ⇒ (Base)
  • Base case (height ≤ 1)
  • The ambiguity means that (for p ≠ p')
      N derives ω in 1 step using p, and N derives ω in 1 step using p'
  • Which means (for p = N → α and p' = N → α')
      L(α) ∩ L(α') ⊇ {ω} ≠ ∅
  • i.e., we have a vertical ambiguity in G
15
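// Proof (Lemma 1a) ⇒ (I.H.)
  • Induction step (height ≤ n)
  • Assume induction hypothesis (for height ≤ n-1)
  • The ambiguity means

[Figure: two derivation trees for ω from N, the first starting with production p and the second with p'; below the top step, each symbol α_0 .. α_{|α|-1} roots a subtree of height ≤ n-1 deriving a substring ω_i (ω'_i in the second tree)]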
16
// Proof (Lemma 1a) ⇒ (p ≠ p')
  • Case p ≠ p' (different production)
  • but then
      L(α) ∩ L(α') ⊇ {ω} ≠ ∅
  • i.e., we have a vertical ambiguity in G
17
// Proof (Lemma 1a) ⇒ (p = p', 1)
  • Case p = p' (same production, N → α)
  • i.e. the tops of the trees are the same
  • Case ∀i: ω_i = ω'_i
  • ⇒ ambiguity in subtree i (deriving the same ω_i)
  • Induction hypothesis (this subtree) ⇒
      G vertically ambiguous ∨ G horizontally ambiguous
18
// Proof (Lemma 1a) ⇒ (p = p', 2)
  • Case p = p' (same production, N → α)
  • Case ∃i: ω_i ≠ ω'_i
  • but then (assume WLOG |ω_i| < |ω'_i|)
  • least such i
  • 2nd least such j (∃j ≠ i: ω_j ≠ ω'_j)
  • Now pick any k with i ≤ k < j
  • ...then
      L(α_0 .. α_k) ⩀ L(α_{k+1} .. α_{|α|-1}) ≠ ∅
      (⩀: overlap, i.e. a horizontal ambiguity in G)

[Figure: the two derivation trees with the split position k marked between the subtrees for α_i and α_j]
19
// Proof (Lemma 1b) ⇐
G vertically unambiguous ∧ G horizontally unambiguous ⇐ G unambiguous
  • Contrapositively
      G ambiguous ⇐ G vertically ambiguous ∨ G horizontally ambiguous
  • Assume G vertically ambiguous (vertical conflict)
  • Then for some N ∈ N
      N → α ⇒* a,  N → α' ⇒* a,  L(α) ∩ L(α') ⊇ {a} ≠ ∅
  • But then derive (using reachability and
    derivability of N)
      s ⇒* x N ω ⇒ x α ω ⇒* x a ω ⇒* x a y
      s ⇒* x N ω ⇒ x α' ω ⇒* x a ω ⇒* x a y
20
// Proof (Lemma 1b) ⇐ (contd)
  • Assume G horizontally ambiguous (horizontal conflict)
  • Then for some N ∈ N
      N → α β,  L(α) ⩀ L(β) ≠ ∅
      i.e. ∃x,y ∈ Σ*, ∃a ∈ Σ+: x,xa ∈ L(α) ∧ y,ay ∈ L(β)
  • But then derive (using reachability and
    derivability of N)
      s ⇒* v N ω ⇒ v α β ω ⇒* v x β ω ⇒* v x a y ω ⇒* v x a y w
      s ⇒* v N ω ⇒ v α β ω ⇒* v x a β ω ⇒* v x a y ω ⇒* v x a y w
21
// (Over-)Approximation (A)
  • (Over-)Approximation A : E* → P(Σ*)
      ∀α ∈ E*: L(α) ⊆ A(α)
  • A decidable: ∩ and ⩀ (overlap) decidable on
    co-dom(A)
  • Approximated vertical ambiguity
      G is vertically unambiguous w.r.t. A iff
      ∀n ∈ N: ∀α, α' ∈ π(n): α ≠ α' ⇒ A(α) ∩ A(α') = ∅
  • Approximated horizontal ambiguity
      G is horizontally unambiguous w.r.t. A iff
      ∀n ∈ N: ∀α ∈ π(n): ∀i ∈ 1..|α|-1: A(α_0 .. α_{i-1}) ⩀ A(α_i .. α_{|α|-1}) = ∅
22
// Ambiguity Approximation
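  • Theorem 2
      G vertically unambiguous w.r.t. A ∧ G horizontally unambiguous w.r.t. A ⇒ G unambiguous
  • Proof
      Unambiguity w.r.t. A (vertical ∧ horizontal) ⇒ unambiguity (vertical ∧ horizontal), because
      A(α) ∩ A(α') = ∅ ⇒ L(α) ∩ L(α') = ∅
      A(α) ⩀ A(β) = ∅ ⇒ L(α) ⩀ L(β) = ∅   (⩀: overlap)
  • Conflicts w/ smaller sets ⇒ conflicts w/ larger
    sets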
23
// Compositionality (of As)
  • Corollary 3
      A, A' decidable (over-)approximations
      ⇒ A ∩ A' decidable (over-)approximation
  • Proof
  • Follows from the definitions (omitted)
  • i.e. approximations are compositional!

[Figure: the "ambiguous" / "unambiguous" regions as approximated by A, by A', and by the tighter A ∩ A']
24
// Choice(s) of A?
  • A_Σ*(α) = Σ* (constant)
  • Worst approximation
  • but safe approximation!
  • Useless
  • Cannot determine that any grammars are
    unambiguous

[Figure: the worst approximation covers both the "ambiguous" and the "unambiguous" region]
25
// Choice(s) of A? (contd)
  • A_MN(α) = Mohri-Nederhof(α)
  • CFG → DFA (NFA) approximation
  • Properties of this Black-box
  • Good (over-)approximation!
  • Works on language, L(G)
  • not on grammatical structure, G
  • Approximation parameterizable
  • E.g. unfold nonterminals n times

Black box: "Regular Approximation of Context-Free Grammars
through Transformation" [Mohri-Nederhof, 2000]
26
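// Decidability (of A_MN)
  • ∩ decidable (using DFAs): X ∩ Y = ∅
  • O(|X_NFA| |Y_NFA|)
  • ⩀ (overlap) decidable (using DFAs): X ⩀ Y = ∅
  • O(|X_NFA| |Y_NFA|)
  • A_MN decidable
  • With potential counterexamples (using DFAs)

G vertically unambiguous w.r.t. A_MN ∧ G horizontally unambiguous w.r.t. A_MN ⇒ G unambiguous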
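
The intersection test on regular approximations, including the extraction of a potential counterexample, can be sketched as a breadth-first search over the product of two DFAs, which visits at most |X| · |Y| state pairs. This is a generic textbook construction shown under assumptions, not the talk's implementation: the dict-based DFA encoding, the function name `intersection_witness` and the three toy automata are all hypothetical.

```python
from collections import deque


def intersection_witness(dfa_x, dfa_y, alphabet):
    """Search the product automaton breadth-first.

    Returns a shortest string accepted by both DFAs,
    or None if the intersection is empty.
    """
    start = (dfa_x["start"], dfa_y["start"])
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (p, q), word = queue.popleft()
        if p in dfa_x["accept"] and q in dfa_y["accept"]:
            return word
        for symbol in alphabet:
            p2 = dfa_x["delta"].get((p, symbol))
            q2 = dfa_y["delta"].get((q, symbol))
            if p2 is None or q2 is None:   # missing transition = dead state
                continue
            if (p2, q2) not in seen:
                seen.add((p2, q2))
                queue.append(((p2, q2), word + symbol))
    return None


# Toy automata (hypothetical): x a*, exactly "xa", exactly "y"
XA_STAR = {"start": 0, "accept": {1}, "delta": {(0, "x"): 1, (1, "a"): 1}}
XA = {"start": 0, "accept": {2}, "delta": {(0, "x"): 1, (1, "a"): 2}}
Y = {"start": 0, "accept": {1}, "delta": {(0, "y"): 1}}

witness = intersection_witness(XA_STAR, XA, "xay")
no_witness = intersection_witness(XA_STAR, Y, "xay")
```

A returned string plays the role of a potential counterexample; `None` corresponds to an empty intersection, i.e. no vertical conflict between the two approximated alternatives.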
27
// Decision Algorithm for (X ⩀ Y)
  • For X, Y regular languages
  • All overlappings, xay, as DFAs (variant of the ∩
    construction!)

[Figure: automata X_NFA and Y_NFA and their combination XY_NFA, with paths labelled x, a and y showing an overlap xay]
28
// Three Approximation Answers
  • Y!
  • G definitely not ambiguous!
  • ? / D?
  • ?: Don't know
  • could not find any potential counterexamples.
  • D?: Don't know; look at the over-approximation, D
  • and here are all potential counterexamples
  • Note: some strings do not even parse!
  • Improve: parse S ⊆_FIN D ⇒ subset of real
    counterexamples

[Figure: the approximation answers compared with the true answer]
29
// Regaining Lost Precision!
  • Now parse all counterexamples!
  • i.e. parse the DFA, D_DFA
  • 1) i.e. construct
      L(C_CFG) = L(D_DFA) ∩ L(G_CFG)
  • Decidable in O(|D| |G|)
  • 2) Decide emptiness on C
      L(C_CFG) = ∅ ?
  • Decidable in O(|C|) = O(|D| |G|)
  • Only potential counterexamples that parse!
30
// Three Approximation Answers
  • Y!
  • G definitely not ambiguous!
  • ? / C?
  • ?: Don't know
  • could not find any counterexamples.
  • C?: Don't know; look at the over-approximation, C
  • and here are all potential counterexamples
  • Note: all strings actually parse (maybe not
    ambiguously)!
  • Improve: extract finite under-approximation...?

[Figure: the approximation answers compared with the true answer]
31
// Asymptotic (Time) Complexity
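  • Mohri-Nederhof: O(n²vh)
  • Vertical Amb: O(n³v⁴h⁴)
  • Horizontal Amb: O(n³v³h⁵)
  • Total: O(n³v³h⁴(v+h)) ⊆ O(g⁵)

  • n = |N|
  • v = max{ |π(N)| : N ∈ N }
  • h = max{ |α| : α ∈ π(N), N ∈ N }
  • g = nvh = |G|

[Figure: a grammar laid out as n nonterminals N_1 .. , each with up to v alternatives of up to h symbols e_{1,1} .. e_{a,p}]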
32
// Related Work (Dynamic)
  • Dynamic disambiguation
  • Disambiguation-by-convention
  • Longest match, most specific match, ...
  • Customizable
  • Bison v. 1.5: %dprec, %merge
  • ASF+SDF: disambiguation filters
  • Dynamic ambiguity interception
  • GLR (Tomita, Earley, Bison, ASF+SDF, ...)

33
// Related Work (Static)
  • Static disambiguation
  • Disambiguation-by-convention
  • First match, most specific match, ...
  • Customizable
  • Yacc: %left, %right, %nonassoc, %prec
  • Static ambiguity interception
  • LL(k), LALR(k), ...
  • Our work goes here (but for GLR)!

34
// Implementation
  • disamb (Java)

In progress!
35
// Assessment
  • Quality of approximation
    Quantity of false-positives
  • Precision
  • Our \ LR(k) ?
  • LR(k) \ Our ?
  • False-positives ?
  • Characterize ? / N?
  • In terms of grammatical structure ?
  • Efficiency (in practice)

In progress!
36
// Example Expression chains
  • !?

E → E + T | T
T → T * F | F
F → ( E ) | x
37
// Example Balancing Structures
  • Nasty
  • Requires
  • Unbounded memory (number of x'es)
  • i.e. CFG structure
  • Unbounded lookahead
  • i.e. any finite k is insufficient
  • ⇒ False positives!

S → A A
A → x A x | y

Example string: xxyxxxyx
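
That the balancing grammar (S → A A, A → x A x | y) really derives the example string in only one way can be confirmed by counting derivation trees. The counter below is a minimal sketch, assuming grammars without ε-productions and without unit cycles so that the recursion terminates; the dict encoding and the name `count_trees` are hypothetical and not taken from the talk.

```python
from functools import lru_cache

BALANCED = {"S": [("A", "A")], "A": [("x", "A", "x"), ("y",)]}
# Horizontal-ambiguity example: Z -> A B, A -> x a | x, B -> a y | y
HORIZONTAL = {"Z": [("A", "B")], "A": [("x", "a"), ("x",)], "B": [("a", "y"), ("y",)]}


def count_trees(grammar, start, text):
    """Count derivation trees of `text` from `start`.

    Assumes no epsilon-productions and no unit cycles.
    """

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # number of ways `symbol` derives text[i:j]
        if symbol not in grammar:
            return 1 if text[i:j] == symbol else 0
        return sum(forms(alt, i, j) for alt in grammar[symbol])

    @lru_cache(maxsize=None)
    def forms(form, i, j):
        # number of ways the sentential form `form` derives text[i:j]
        if not form:
            return 1 if i == j else 0
        if len(form) == 1:
            return trees(form[0], i, j)
        # every remaining symbol must consume at least one character
        return sum(trees(form[0], i, k) * forms(form[1:], k, j)
                   for k in range(i + 1, j - len(form) + 2))

    return trees(start, 0, len(text))


balanced_count = count_trees(BALANCED, "S", "xxyxxxyx")
horizontal_count = count_trees(HORIZONTAL, "Z", "xay")
```

The example string has exactly one tree (it splits as xxyxx · xyx), whereas xay in the horizontal-ambiguity example has two. A count above one for any string witnesses ambiguity, but no finite amount of such testing can prove unambiguity, which is why the static approximation is needed.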
38
// Future Work
  • Permit
  • With disambiguating conventions for
  • Associativity
  • Precedence
  • Parsing optimization
  • Exploit compile-time analysis information at
    runtime

E → E ⊕ E
39
// Conclusion
Approximating Context-Free Grammar Ambiguity
Context-free grammar ambiguity is
undecidable. However, just because it's
undecidable doesn't mean there aren't (good)
approximations! Indeed, the whole area of static
analysis works on side-stepping
undecidability. We exhibit a characterization
of context-free ambiguity which induces a whole
framework for (over-)approximation. In
particular, we give an approximation based on the
[Mohri-Nederhof, 2000] regular approximation of
context-free grammars and show how to boost the
precision even further.
But wait, there's more...
40
// Lessons Learned
  • Framework
  • Plug in your favorite (over-)approximation of
    L(?)
  • Even take the intersection of them: A = ∩_i A_i
  • Approximation closed under intersection
  • Methodology
  • Just because it's undecidable doesn't mean there
    aren't (good) approximations
  • Quantity of false positives (practically
    motivated)
  • What to do with false positives (practically
    motivated)
  • Don't be scared of undecidability

41
bonus slides
42
// Membership Decidable!
  • Membership (aka. parsing)
  • Given ω ∈ Σ*
  • Is the string, ω, in the language of G?
  • Algorithms
  • LL(k): O(|ω|)
  • LALR(k): O(|ω|)
  • GLR: O(|ω|³)

ω ∈ L(G) ?
43
// Parsing Greedily Left-to-Right
  • The ambiguity problem for XY...
      ... may occur in 2 cases
      - x takes too little: not possible (due to greediness)
      - x takes too much: only this is a problem!
  • In fact, already a problem if x goes too far
  • Thus, we only have a problem if X eats into Y
  • Essentially disambiguation by picking the longest
    match

X ⩀ Y = ∅
⟺ X ∩ X·(prefix(Y) \ {ε}) = ∅