Title: A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser
1 A Monadic-Memoized Solution for Left-Recursion Problem of Combinatory Parser
Rahmatullah Hafiz, 60-520, Fall 2005
2 Outline
- Part 1: Basic Concepts
- Part 2: Related Works
- Part 3: Our Solution
3 Parsing
- Parsing: the process of determining whether an input string can be recognized by a set of rules for a specific language
- Parser: a program that does parsing
- Input
  - 256
- Rules
  - exp -> digit op exp | digit
  - digit -> 0 | 1 | .. | 9
  - op ->
[Figure: Top-Down Parsing. Parse tree with root exp expanding into digit, op, and exp nodes, down to the leaves 2, 5, 6]
4 Top-Down Parsing
- Recognition of the input starts at the root and proceeds towards the leaves
- Left-to-right recognition
- Recursive-descent (backtracking) parsing
  - If one rule fails, then try another rule, recursively
- Comparatively easy to construct
- Exponential in the worst case
5 Combinatory Parser
- Parsers written in functional languages
  - Lazy functional languages (Miranda, Haskell)
- Can be used to parse
  - Natural language (English)
  - Formal language (Haskell)
- Need to follow some rules
  - Context-free grammar
6 Why Lazy-Functional Language
- Modular code and easy to implement
- Higher-order functions can represent BNF notations of CFG
- Functions are first-class citizens
- Suitable for top-down, recursive-descent, fully backtracking parsing
- Higher-order functions
  - Input/output arguments could be other functions
7
- Frost and Launchbury (The Computer Journal, 1989): NLI in Miranda
- Hutton (JFP, 1992): uses of combinatory parsers
- Example
  - BNF notation of CFG
    - s ::= a s | empty
  - We can read it like
    - s is either a then s, or empty
    - => s = a then s or empty
- Possible to write parsers exactly like this in an LFL, using higher-order functions
8 Example (cont.)
    -- s ::= a s | empty
    empty input = [input]
    a (x:xs) = if x == 'a' then [xs] else []
    (p or q) input = p input ++ q input
    (p then q) input = if r == [] then [] else concat (map q r)
        where r = p input
    s input = (a then s or empty) input
    -- Main> s "aaa"
    -- ["","a","aa","aaa"]
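The recognizers on this slide can be sketched in Haskell roughly as follows. This is only an illustration, not the code from the talk: the names `Recognizer`, `emptyR`, `term`, `orelse`, and `thenS` are assumptions chosen because `or` and `then` are already taken in Haskell, and a catch-all case is added so that the terminal recognizer fails cleanly on empty input.

```haskell
-- A recognizer maps an input string to the list of remaining inputs,
-- one entry for each way of recognizing a prefix of the input.
type Recognizer = String -> [String]

-- Succeeds without consuming any input.
emptyR :: Recognizer
emptyR input = [input]

-- Recognizes one given character (generalizes the slide's `a`).
term :: Char -> Recognizer
term c (x:xs) | x == c = [xs]
term _ _               = []

-- Alternation: collect the results of both alternatives.
orelse :: Recognizer -> Recognizer -> Recognizer
(p `orelse` q) input = p input ++ q input

-- Sequencing: run q on every remainder left by p.
thenS :: Recognizer -> Recognizer -> Recognizer
(p `thenS` q) input = concatMap q (p input)

infixr 3 `thenS`
infixr 2 `orelse`

-- s ::= a s | empty
s :: Recognizer
s = (term 'a' `thenS` s) `orelse` emptyR
```

Under these assumptions, evaluating `s "aaa"` in GHCi gives `["","a","aa","aaa"]`, the same list of remaining inputs as on the slide.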
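9 Problem with Left-Recursive Grammar
- Right-recursive grammar: s ::= a s | a, Input: aaa
- Left-recursive grammar: s ::= s a | a, Input: aaa
[Figure: parse of aaa with the right-recursive grammar. Each level of s recognizes one more a, giving (a, aa), (aa, a), (aaa, ), and the parse terminates]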
10 Example
    -- s ::= s a | empty
    empty input = [input]
    a (x:xs) = if x == 'a' then [xs] else []
    (p or q) input = p input ++ q input
    (p then q) input = if r == [] then [] else concat (map q r)
        where r = p input
    s input = (s then a or empty) input
    -- Main> s "aaa"
    -- Exception: stack overflow
Never terminates
11 Why bother with Left-Recursive Parsing?
- Easy to implement
- Very modular
- In the case of natural language processing
  - Expected ambiguity can be achieved easily
  - Non-left-recursive grammars might not generate all possible parses
  - As the root of a left-recursive grammar grows all the way to the bottom level, it ensures all possible parses
12 Related Approaches
- Lickman (1995): fixed-point solution
  - Approach leads to exponential time complexity
- Most of the approaches deal with transforming a left-recursive grammar into a non-left-recursive grammar
  - Frost (1992): guarding left productions with non-left productions
  - Hutton (1992, 1996): grammar transformation
13 Related Approaches
- Transforming a left-recursive grammar into a non-left-recursive grammar
  - violates semantic rules
  - the structures of the parse trees are completely different
- Example
  - Left-recursive: s ::= s a (rule 1) | b (rule 2)
  - Transformed: s ::= b s' (rule ??), s' ::= a s' | empty (rule ??)
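[Figure: two parse trees, one for the left-recursive grammar and one for the transformed grammar, showing their different structures]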
14 Our Approach: Monadic-Memoization
- Frost and Hafiz (2005)
- The idea is simple
  - Do not let the parse tree grow infinitely
- Left-recursive grammar: s ::= s a | a, Input: aaa
[Figure: the left-recursive parse tree for aaa growing through Depth 1, Depth 2, and Depth 3, with Length 3 at each level]
- Parser fails when depth > length
- Wadler (1985): How to replace failure by a list of successes
15 Monadic-Memoization (cont.)
- Left-recursive grammar: s ::= s a | a, Input: aaa
[Figure: parse tree for aaa at Depth 1, Depth 2, and Depth 3, with Length 3 at each level]
- As the recognizer fails, the production rule s ::= s a fails too
- Control goes to the upper level with failure
  - Backtracking: the parser tries alternatives
- If the alternative rule succeeds
  - Control goes to the upper level with the remaining unrecognized input
- Recursive procedure
16 Monadic-Memoization (cont.)
- Left-recursive grammar: s ::= s a | a, Input: aaa
[Figure: parse tree at Depth 1, Depth 2, and Depth 3 (Length 3), with the deepest branches marked fail]
- s ::= s a fails but s ::= a succeeds
  - Backtracking
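The depth check described on the last three slides can be illustrated with a small Haskell sketch. It is a simplification under stated assumptions, not the implementation from the talk: the depth counter is passed as an explicit argument (the names `sBounded`, `term`, and `Recognizer` are hypothetical), and no memo table is involved yet.

```haskell
-- A recognizer returns the list of remaining inputs, one per success.
type Recognizer = String -> [String]

-- Recognizes one given character.
term :: Char -> Recognizer
term c (x:xs) | x == c = [xs]
term _ _               = []

-- s ::= s a | a, with an explicit depth counter.
-- The left-recursive call does not consume input, so the input length
-- stays fixed while the depth grows; once depth > length the call fails.
sBounded :: Int -> Recognizer
sBounded depth input
  | depth > length input = []                        -- curtail the recursion
  | otherwise            = viaLeftRec ++ viaTerminal
  where
    viaLeftRec  = concatMap (term 'a') (sBounded (depth + 1) input)  -- s a
    viaTerminal = term 'a' input                                     -- a

-- Entry point: the top-level call starts at depth 1.
s :: Recognizer
s = sBounded 1
```

With these definitions, `s "aaa"` evaluates to `["","a","aa"]`: the call at depth 4 fails, the alternative s ::= a succeeds at depth 3, and each upper level then recognizes one more a.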
17 Monadic-Memoization (cont.)
- This approach is applicable in a mixed environment
- Grammar may contain
  - Left-recursive productions
  - Non-left-recursive productions
    - s ::= s a | a
    - a ::= b a | b
- During parsing, one rule may be executed multiple times for the same input
- Also need to keep track of input length and depth
Top-down backtracking is exponential
18 Monadic-Memoization (cont.)
- Memoization is helpful
- The idea is
  - Checking a memo table, passed along with the input, at each recursive call
- Memo table contains
  - List of previously parsed outputs for any input, paired with the appropriate production
  - Length and depth of the current parse
- Before parsing any input
  - if the lookup in the memo table fails
  - then perform parsing and update the memo table
  - else return the result from the table
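The lookup-then-update step can be sketched as below. This is a minimal illustration under assumptions, not the table layout used in the talk: the memo table is a `Data.Map` keyed by a hypothetical (production name, input) pair, the length and depth bookkeeping is left out, and the table is threaded by hand through every call.

```haskell
import qualified Data.Map as Map

-- Hypothetical memo table: (production name, input) -> remaining inputs.
type Memo = Map.Map (String, String) [String]

-- A memo-aware recognizer takes the table and returns its results
-- together with the (possibly updated) table.
type MRecognizer = String -> Memo -> ([String], Memo)

-- Wrap a recognizer so that it consults the table before parsing.
memoize :: String -> MRecognizer -> MRecognizer
memoize name p input memo =
  case Map.lookup (name, input) memo of
    Just results -> (results, memo)            -- lookup succeeded: reuse
    Nothing      ->                            -- lookup failed: parse, then update
      let (results, memo') = p input memo
      in  (results, Map.insert (name, input) results memo')

-- A terminal recognizer for the character 'a' that leaves the table as is.
termA :: MRecognizer
termA ('a':xs) memo = ([xs], memo)
termA _        memo = ([], memo)
```

Calling `memoize "a" termA "aa" Map.empty` parses and stores one entry; calling it again with the returned table gives back the stored result without running the recognizer.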
19 Monadic-Memoization (cont.)
[Figure: parse tree annotated with three notes: the LR production keeps track of length and depth; production a looks up the memo table; production a updates the memo table]
20 Monadic-Memoization (cont.)
- Memoization reduces worst-case time complexity
  - from exponential to O(n^3)
- The problem is
  - Lazy functional languages do not allow variable updating or global storage for the whole program
- Need to pass the memo table around so that
  - all recursive parsing calls can access the memo table
- If the memo table is passed as a function argument
  - code gets messy and error-prone
21 Monadic-Memoization (cont.)
- Or we can use a monad
  - Derived from category theory: Moggi (1989)
  - S.E. approaches for LFLs: Wadler (1990)
    - State, exception, I/O, etc. of LFLs
- Monadic framework for parsing: Frost (2003)
  - Reusable
  - Complex tasks could be achieved by adding to or modifying existing monadic objects
  - Structured computation
22 Monadic-Memoization (cont.)
- A monad is a triple (M, unit, bind)
- M: type constructor
  - memo = (Char, (Char, Char), (Int, Int))
  - M inp = memo -> (inp, memo)
- unit :: a -> M a
  - takes a value and returns the computation of the value
  - works as a container
- bind :: M a -> (a -> M b) -> M b
  - applies the computation a -> M b to the computation M a and returns a computation M b
  - ensures sequential computation
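For a state-passing monad of the shape M a = memo -> (a, memo), unit and bind can be written out directly. The sketch below assumes a deliberately simplified memo table (a list of key and result pairs) and the names `unitS` and `bindS`; the helper `note` and the value `example` are hypothetical and exist only to show the table being threaded in sequence.

```haskell
-- Hypothetical, simplified memo table: production name -> results.
type Memo = [(String, [String])]

-- The type constructor: a computation takes the table and returns
-- a value together with the possibly updated table.
type M a = Memo -> (a, Memo)

-- unit wraps a value without touching the table.
unitS :: a -> M a
unitS x memo = (x, memo)

-- bind runs m, then feeds its value and its table to f.
bindS :: M a -> (a -> M b) -> M b
(m `bindS` f) memo = let (x, memo') = m memo in f x memo'

-- A computation that records one entry in the table.
note :: String -> [String] -> M ()
note key val memo = ((), (key, val) : memo)

-- Two updates in sequence, then a plain value.
example :: M Int
example = note "a" ["aa"] `bindS` \_ ->
          note "s" ["a"]  `bindS` \_ ->
          unitS 2
```

Running `example []` yields `(2,[("s",["a"]),("a",["aa"])])`: the second update sees the table left by the first, which is the sequencing that bind ensures, and no table argument appears in the definition of `example` itself.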
23 Monadic-Memoization (cont.)
- The mental picture of how a monad works
- A monad is a triple (M, unit, bind)
  - M: loader, unit: tray, bind: combiner
- Picture source: Newbern (2003)
24 Monadic-Memoization (cont.)
- Transform combinatory parsers into monadic objects
- Example
    Original or recognizer:
        (p or q) inp = p inp ++ q inp
    Monadic version:
        (p or q) inp = p inp `bindS` f
            where f m = q inp `bindS` g
                where g n = unitS (nub (m ++ n))
[Figure: the monadic or object for s ::= a or b on input ab, with stages labeled check LR, lookup memo, and update memo, the tables Memo1 and Memo2, and the parsed output]
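A runnable Haskell rendering of the monadic or recognizer might look like the following. It is a sketch under assumptions: the memo table is reduced to a simple log of (production, results) pairs, `orS` stands in for the slide's `or`, and `termS` is a hypothetical terminal recognizer that records its result so the threading of the table is visible. The left-recursion check and the memo lookup are omitted.

```haskell
import Data.List (nub)

-- Hypothetical, simplified memo table: production name -> results.
type Memo = [(String, [String])]
type M a = Memo -> (a, Memo)

-- A monadic recognizer returns remaining inputs inside the state monad.
type MRecognizer = String -> M [String]

unitS :: a -> M a
unitS x memo = (x, memo)

bindS :: M a -> (a -> M b) -> M b
(m `bindS` f) memo = let (x, memo') = m memo in f x memo'

-- Monadic alternation: run p, then q on the same input with the table
-- left by p, and merge the two result lists without duplicates.
orS :: MRecognizer -> MRecognizer -> MRecognizer
(p `orS` q) inp = p inp `bindS` f
  where f m = q inp `bindS` g
          where g n = unitS (nub (m ++ n))

-- Terminal recognizer that also records its result in the table.
termS :: Char -> MRecognizer
termS c inp memo = (result, ([c], result) : memo)
  where result = case inp of
          (x:xs) | x == c -> [xs]
          _               -> []
```

For the slide's case s ::= a or b on input ab, `(termS 'a' `orS` termS 'b') "ab" []` evaluates to `(["b"],[("b",[]),("a",["b"])])`: the table produced by a is the one handed to b.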
25 Monadic-Memoization (cont.)
[Figure: parse tree annotated with three notes: the LR production keeps track of length and depth; production a looks up the memo table; production a updates the memo table]
- Memo table propagation is ensured correctly => O(n^3)
("","a","aa",(a",("aa","","a","aa"),("a","","a"),(10,11)))
26 References
- Frost, R. A. and Launchbury, E. J. (1989) Constructing natural language interpreter in a lazy functional language. The Computer Journal, Special edition on Lazy Functional Programming, 32(2) 3-4
- Wadler, P. (1992) Monads for functional programming. Computer and Systems Sciences, Volume 118
- Hutton, G. (1992) Higher-order functions for parsing. Journal of Functional Programming, 2(3) 323-343, Cambridge University Press
- Frost, R. A. (1992) Guarded attribute grammars: top down parsing and left recursive productions. SIGPLAN Notices 27(6) 72-75
- Lickman, P. (1995) Parsing With Fixed Points. Master's thesis, Oxford University
- Frost, R. A. and Szydlowski, B. (1996) Memoizing Purely Functional Top-Down Backtracking Language Processors. Sci. Comput. Program. 27(3) 263-288
- Hutton, G. (1998) Monadic parsing in Haskell. Journal of Functional Programming, 8(4) 437-444, Cambridge University Press
- Frost, R. A. (2003) Monadic Memoization towards Correctness-Preserving Reduction of Search. Canadian Conference on AI 2003, 66-80