Title: Monadic Compositional Parsing with Context Using Maltese as a Case Study
1 Monadic Compositional Parsing with Context Using Maltese as a Case Study
- Gordon J. Pace
- September 2004
2 Monadic Compositional Parsing with Context Using Maltese as a Case Study
- A User's Perspective.
- For the technical details, read the paper!
3 Combinator-Based Programming
- A Recipe
- Provide a class/type of objects you will be talking about
- Provide a few basic objects of that type
- Provide a few combinators to combine objects of the type into more complex ones.
- Mix well and put on low heat for 35 min.
4 Combinator-Based Programming
A basic type, Parser a: a parser which consumes a text stream, returning an object of type a
5 Combinator-Based Programming
Some basic parsers
- fail :: Parser a
  A parser which always fails
- return :: a -> Parser a
  Given a value, this parser leaves the input stream untouched, returning the given value
- one :: Parser Char
  A parser which returns the first character on the input stream
And some combinators which produce new parsers out of old ones.
- ... :: Parser a -> Parser a -> Parser a
  Non-deterministic choice between two parsers
- mthen :: Parser a -> Parser a -> Parser a
  Sequential composition of two parsers. Actually, I'm cheating by somewhat simplifying the type.
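The basic parsers and combinators above can be sketched as follows. This is a minimal sketch, assuming a list-of-successes representation (the slides do not show how `Parser` is defined). The names `failP` and `orElse` are hypothetical stand-ins: `failP` avoids a clash with the Prelude's `fail`, and `orElse` stands for the choice combinator. `mthen` is given the simplified type from the slide and simply discards the first result, which is also an assumption.

```haskell
-- A list-of-successes parser: every result pairs a value with the unread input.
-- (Assumed representation; the slides do not define Parser.)
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f p = Parser $ \s -> [ (f x, rest) | (x, rest) <- runParser p s ]

instance Applicative Parser where
  pure x = Parser $ \s -> [(x, s)]   -- `return`: leave the input untouched
  pf <*> px = Parser $ \s ->
    [ (f x, s2) | (f, s1) <- runParser pf s, (x, s2) <- runParser px s1 ]

instance Monad Parser where
  p >>= f = Parser $ \s ->
    concat [ runParser (f x) rest | (x, rest) <- runParser p s ]

-- A parser which always fails (hypothetical name for the slides' `fail`).
failP :: Parser a
failP = Parser (const [])

-- Returns the first character on the input stream.
one :: Parser Char
one = Parser $ \s -> case s of
  []       -> []
  (c : cs) -> [(c, cs)]

-- Non-deterministic choice: keep the results of both parsers (hypothetical name).
orElse :: Parser a -> Parser a -> Parser a
orElse p q = Parser $ \s -> runParser p s ++ runParser q s

-- Sequential composition with the simplified type: run p, drop its result, run q.
mthen :: Parser a -> Parser a -> Parser a
mthen p q = p >>= \_ -> q

main :: IO ()
main = do
  print (runParser one "il")                        -- [('i',"l")]
  print (runParser (return 'x' `orElse` one) "il")  -- [('x',"il"),('i',"l")]
  print (runParser (one `mthen` one) "il")          -- [('l',"")]
  print (runParser (failP :: Parser Char) "il")     -- []
```

Returning a list of results is one simple way to model non-deterministic choice: an empty list means failure, and several entries mean several possible parses.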
12 New parsers are easy to derive
- p1 ... p2 =
-   do
-     x1 <- p1
-     x2 <- p2
-     return (x1,x2)
Parse a pair of objects in sequence
13 New parsers are easy to derive
- matchSat ::
-   (Char -> Bool) -> Parser Char
- matchSat cond =
-   do
-     x <- one
-     if cond x
-       then return x
-       else fail
Parse a character, if it satisfies a given condition
14 New parsers are easy to derive
Parse a particular character
- matchChar c = matchSat (c ==)
Parse a particular string
- matchString [] = return []
- matchString (c:cs) =
-   do
-     matchChar c
-     matchString cs
-     return (c:cs)
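The derived parsers `matchSat`, `matchChar` and `matchString` can be tried out with a small self-contained sketch. It assumes a list-of-successes `Parser` (the slides do not give the representation) and uses the hypothetical name `failP` for the always-failing parser, to avoid clashing with the Prelude's `fail`.

```haskell
-- Assumed representation: all possible (result, remaining input) pairs.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f p = Parser $ \s -> [ (f x, r) | (x, r) <- runParser p s ]

instance Applicative Parser where
  pure x = Parser $ \s -> [(x, s)]
  pf <*> px = Parser $ \s ->
    [ (f x, r2) | (f, r1) <- runParser pf s, (x, r2) <- runParser px r1 ]

instance Monad Parser where
  p >>= f = Parser $ \s -> concat [ runParser (f x) r | (x, r) <- runParser p s ]

-- Always fails (hypothetical name for the slides' `fail`).
failP :: Parser a
failP = Parser (const [])

-- Returns the first character on the input stream.
one :: Parser Char
one = Parser $ \s -> case s of
  []       -> []
  (c : cs) -> [(c, cs)]

-- Parse a character, if it satisfies the given condition.
matchSat :: (Char -> Bool) -> Parser Char
matchSat cond = do
  x <- one
  if cond x then return x else failP

-- Parse a particular character.
matchChar :: Char -> Parser Char
matchChar c = matchSat (c ==)

-- Parse a particular string, one character at a time.
matchString :: String -> Parser String
matchString []       = return []
matchString (c : cs) = do
  _ <- matchChar c
  _ <- matchString cs
  return (c : cs)

main :: IO ()
main = do
  print (runParser (matchString "il") "il-kelb")  -- [("il","-kelb")]
  print (runParser (matchString "il") "kelb")     -- []
```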
15 New parsers are easy to derive
- star :: Parser a -> Parser [a]
- star p =
-   return []
-   ...
-   do
-     (x,xs) <- p ... star p
-     return (x:xs)
Kleene star
16 New parsers are easy to derive
- p1 ... p2 =
-   do
-     w1 <- p1
-     plus (parseSat isSpace)
-     w2 <- p2
-     return (w1,w2)
Space separated words
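Kleene star and the space-separated combinator can be sketched together. This is a minimal sketch under stated assumptions: `Parser` is a list-of-successes type, and the names `failP`, `orElse` (choice), `pairP` (pairing in sequence), `spaced`, `word` and `parseAll` are hypothetical, since the slides' operator symbols are not recoverable here. `plus` is assumed to mean "one or more".

```haskell
import Data.Char (isAlpha, isSpace)

-- Assumed representation: all possible (result, remaining input) pairs.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f p = Parser $ \s -> [ (f x, r) | (x, r) <- runParser p s ]

instance Applicative Parser where
  pure x = Parser $ \s -> [(x, s)]
  pf <*> px = Parser $ \s ->
    [ (f x, r2) | (f, r1) <- runParser pf s, (x, r2) <- runParser px r1 ]

instance Monad Parser where
  p >>= f = Parser $ \s -> concat [ runParser (f x) r | (x, r) <- runParser p s ]

failP :: Parser a
failP = Parser (const [])

one :: Parser Char
one = Parser $ \s -> case s of
  []       -> []
  (c : cs) -> [(c, cs)]

matchSat :: (Char -> Bool) -> Parser Char
matchSat cond = one >>= \x -> if cond x then return x else failP

-- Non-deterministic choice (hypothetical name).
orElse :: Parser a -> Parser a -> Parser a
orElse p q = Parser $ \s -> runParser p s ++ runParser q s

-- Parse a pair of objects in sequence (hypothetical name).
pairP :: Parser a -> Parser b -> Parser (a, b)
pairP p q = do { x <- p; y <- q; return (x, y) }

-- Kleene star: zero or more repetitions; p must consume input to terminate.
star :: Parser a -> Parser [a]
star p = return [] `orElse` more
  where
    more = do
      (x, xs) <- p `pairP` star p
      return (x : xs)

-- One or more repetitions (assumed meaning of the slides' `plus`).
plus :: Parser a -> Parser [a]
plus p = do
  x  <- p
  xs <- star p
  return (x : xs)

-- Two parsers separated by at least one space (hypothetical name).
spaced :: Parser a -> Parser b -> Parser (a, b)
spaced p1 p2 = do
  w1 <- p1
  _  <- plus (matchSat isSpace)
  w2 <- p2
  return (w1, w2)

word :: Parser String
word = plus (matchSat isAlpha)

-- Keep only the parses that consume the whole input.
parseAll :: Parser a -> String -> [a]
parseAll p s = [ x | (x, "") <- runParser p s ]

main :: IO ()
main = do
  print (runParser (star (matchSat isAlpha)) "ab")  -- [("","ab"),("a","b"),("ab","")]
  print (parseAll (word `spaced` word) "il kelb")   -- [("il","kelb")]
```

Because choice is non-deterministic, `star` returns every prefix it can match, which is why `parseAll` filters for complete parses.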
17 Context: The Maltese Article
- Basic form of the definite article is "il".
- Nouns starting with a sun letter (xemxin) assimilate the "l" of the article to the first letter of the noun.
- Nouns starting with 2 or 3 consonants, the first being "x" or "s", drop the initial "i" of the article and add an initial "i" to the noun.
- Nouns starting with a vowel drop the initial "i" from the article.
- The initial "i" is also dropped when the preceding word ends with a vowel.
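To make the rules concrete, they can be written out as a plain function from a preceding word and a noun to the definite form. This is only an illustrative sketch, not the parser the talk builds. The function name `definite` is hypothetical. The letter inventories are assumptions: sun letters are limited to the ASCII letters d, n, r, s, t, x, z, and vowels to a, e, i, o, u. The last rule is read here as "the preceding word ends with a vowel".

```haskell
-- Simplified vowel test (assumption: plain a, e, i, o, u only).
isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

-- Assumed ASCII-only subset of the sun letters.
isSunLetter :: Char -> Bool
isSunLetter c = c `elem` "dnrstxz"

-- definite prev noun: the article attached to the noun, given the preceding
-- word (use "" when there is none).
definite :: String -> String -> String
definite prev noun = initialI ++ body
  where
    -- The initial "i" is dropped after a word ending in a vowel.
    vowelBefore = not (null prev) && isVowel (last prev)
    -- "x" or "s" followed by another consonant.
    cluster = case noun of
      (c1 : c2 : _) -> c1 `elem` "sx" && not (isVowel c2)
      _             -> False
    -- needsI records whether the article keeps its initial "i" by default.
    (needsI, body) = case noun of
      (c : _)
        | isVowel c     -> (False, "l-" ++ noun)    -- vowel-initial noun
        | cluster       -> (False, "l-i" ++ noun)   -- the "i" moves to the noun
        | isSunLetter c -> (True, c : '-' : noun)   -- "l" assimilates
      _                 -> (True, "l-" ++ noun)     -- basic form
    initialI = if needsI && not vowelBefore then "i" else ""

main :: IO ()
main = mapM_ putStrLn
  [ definite "" "kelb"       -- il-kelb
  , definite "" "xemx"       -- ix-xemx
  , definite "" "skola"      -- l-iskola
  , definite "" "arja"       -- l-arja
  , definite "rajna" "kelb"  -- l-kelb
  ]
```

Every branch inspects either the noun that follows or the word that precedes, which is exactly the contextual information a stand-alone article parser does not have.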
19 Context: The Maltese Article
- Every rule after the basic form depends on context: the first letters of the following noun, or the preceding word.
- How can we write a compositional parser, one that parses a stand-alone article?
20 Looking at the Context
- We would like to write
- definiteNoun =
-   article ... noun
21 Looking at the Context
- We can look at local context using other parsers
- article =
-   do
-     initial_i
-     c <- ...
-     parseChar '-'
-     matchFuture (parseChar c)
22 Looking at the Context
- We can look at local context using other parsers
- initial_i =
-   matchPast (
-     (parseSat isVowel ... wordSep)
-   ) `ifnot`
-   parseChar 'i'
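One way to picture `matchFuture` and `matchPast` is a parser whose state holds both the text already consumed and the text still to come. The sketch below is an assumption-laden illustration, not the implementation from the paper: the `(past, future)` state, the semantics chosen for `matchPast` (some suffix of the consumed text is parsed exactly) and for `ifnot` (fall back to the second parser only when the first has no results), and the names `failP`, `sunArticle` and `runFrom` are all hypothetical. Only the sun-letter form of the article is covered.

```haskell
import Data.Char (isSpace)
import Data.List (tails)

-- Assumed state: text already consumed (in reading order) and text still to read.
type Ctx = (String, String)
newtype Parser a = Parser { runParser :: Ctx -> [(a, Ctx)] }

instance Functor Parser where
  fmap f p = Parser $ \c -> [ (f x, r) | (x, r) <- runParser p c ]

instance Applicative Parser where
  pure x = Parser $ \c -> [(x, c)]
  pf <*> px = Parser $ \c ->
    [ (f x, r2) | (f, r1) <- runParser pf c, (x, r2) <- runParser px r1 ]

instance Monad Parser where
  p >>= f = Parser $ \c -> concat [ runParser (f x) r | (x, r) <- runParser p c ]

failP :: Parser a
failP = Parser (const [])

-- Consume one character and remember it as part of the past.
one :: Parser Char
one = Parser $ \(past, future) -> case future of
  []       -> []
  (c : cs) -> [(c, (past ++ [c], cs))]

matchSat :: (Char -> Bool) -> Parser Char
matchSat cond = one >>= \x -> if cond x then return x else failP

matchChar :: Char -> Parser Char
matchChar c = matchSat (c ==)

-- Lookahead: succeed, consuming nothing, if q can parse what follows.
matchFuture :: Parser a -> Parser ()
matchFuture q = Parser $ \ctx -> [ ((), ctx) | not (null (runParser q ctx)) ]

-- Lookbehind: succeed, consuming nothing, if q parses exactly some suffix
-- of the text consumed so far.
matchPast :: Parser a -> Parser ()
matchPast q = Parser $ \ctx -> [ ((), ctx) | any endsHere (tails (fst ctx)) ]
  where
    endsHere suffix = any (null . snd . snd) (runParser q ("", suffix))

-- Use p if it has results, otherwise q (assumed meaning of `ifnot`).
ifnot :: Parser a -> Parser a -> Parser a
ifnot p q = Parser $ \ctx -> case runParser p ctx of
  [] -> runParser q ctx
  rs -> rs

isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

wordSep :: Parser Char
wordSep = matchSat isSpace

-- The article's initial "i", absent after a word ending in a vowel.
initialI :: Parser ()
initialI = matchPast (matchSat isVowel >> wordSep) `ifnot` (matchChar 'i' >> return ())

-- Sun-letter form only: the letter before the hyphen must start the noun.
sunArticle :: Parser Char
sunArticle = do
  initialI
  c <- one
  _ <- matchChar '-'
  matchFuture (matchChar c)
  return c

-- Run a parser with the given past and future text, keeping only the values.
runFrom :: String -> String -> Parser a -> [a]
runFrom past future p = [ x | (x, _) <- runParser p (past, future) ]

main :: IO ()
main = do
  print (runFrom "" "ix-xemx" sunArticle)        -- "x"
  print (runFrom "kienu " "x-xemx" sunArticle)   -- "x"
  print (runFrom "kienu " "ix-xemx" sunArticle)  -- ""
```

The article parser never consumes the noun or the preceding word. It only inspects them, so it can still be composed with a separate noun parser.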
23 More Context: First Form Verbs
24 More Context: First Form Verbs
Context
25 Setting the Context
- We would like to write something like
- sentence =
-   noun ... verb ... noun
26 Setting the Context
- We set a context by setting attributes
- setAttribute (name, value)
- getAttribute name
- renameAttribute (name, name)
27 Setting the Context
- Clashing values cause the parser to fail
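The attribute operations can be sketched by threading an attribute store through the parser state. This is a minimal sketch under stated assumptions: the store is an association list, `getAttribute` fails when the attribute is unset, and `renameAttribute` moves a value to the new name and re-checks for clashes. None of these details are given in the slides. The names `start`, `agree`, `clash` and `moved` are hypothetical demo values.

```haskell
-- Assumed state: attributes set so far, plus the remaining input.
type Attrs = [(String, String)]
newtype Parser a = Parser { runParser :: (Attrs, String) -> [(a, (Attrs, String))] }

instance Functor Parser where
  fmap f p = Parser $ \st -> [ (f x, r) | (x, r) <- runParser p st ]

instance Applicative Parser where
  pure x = Parser $ \st -> [(x, st)]
  pf <*> px = Parser $ \st ->
    [ (f x, r2) | (f, r1) <- runParser pf st, (x, r2) <- runParser px r1 ]

instance Monad Parser where
  p >>= f = Parser $ \st -> concat [ runParser (f x) r | (x, r) <- runParser p st ]

-- Set an attribute; a clashing value makes the parser fail.
setAttribute :: (String, String) -> Parser ()
setAttribute (name, value) = Parser $ \(attrs, s) ->
  case lookup name attrs of
    Nothing -> [((), ((name, value) : attrs, s))]
    Just v
      | v == value -> [((), (attrs, s))]
      | otherwise  -> []

-- Read an attribute; fails if it has not been set (assumption).
getAttribute :: String -> Parser String
getAttribute name = Parser $ \(attrs, s) ->
  [ (v, (attrs, s)) | Just v <- [lookup name attrs] ]

-- Move a value from one attribute name to another, checking for clashes.
renameAttribute :: (String, String) -> Parser ()
renameAttribute (old, new) = Parser $ \(attrs, s) ->
  case lookup old attrs of
    Nothing -> [((), (attrs, s))]
    Just v  -> runParser (setAttribute (new, v)) (filter ((/= old) . fst) attrs, s)

start :: (Attrs, String)
start = ([], "")

agree, moved :: Parser String
agree = setAttribute ("SubjectNumber", "Singular")
          >> setAttribute ("SubjectNumber", "Singular")
          >> getAttribute "SubjectNumber"
moved = setAttribute ("NounNumber", "Plural")
          >> renameAttribute ("NounNumber", "SubjectNumber")
          >> getAttribute "SubjectNumber"

clash :: Parser ()
clash = setAttribute ("SubjectNumber", "Singular")
          >> setAttribute ("SubjectNumber", "Plural")

main :: IO ()
main = do
  print (map fst (runParser agree start))  -- ["Singular"]
  print (null (runParser clash start))     -- True
  print (map fst (runParser moved start))  -- ["Plural"]
```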
28 Setting the Context
- parsePersonSingular =
-   do
-     setAttribute ("SubjectNumber", "Singular")
-     c <- ...
-     case c of
-       'n' -> setAttribute ("SubjectPerson", "1")
-       'j' -> setAttributes [("SubjectPerson", "3"), ...]
-       't' -> setAttribute ("SubjectPerson", "2")
-              ...
-              setAttributes [("SubjectPerson", "3"), ...]
29 Setting the Context
- parseSubject =
-   do
-     n <- ...
-     renameAttributes
-       [ ("Noun"++x, "Subject"++x)
-       | x <- ... ]
- parseObject = ...
- parseSentence =
-   parseSubject ... parseVerb ... parseObject
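Subject and verb agreement through attributes can be illustrated with a toy grammar. This is a sketch under stated assumptions, not the parser from the paper. The lexicon (two nouns, two verb forms) is a small illustrative sample. The helpers `word`, `lexeme` and `orElse` are hypothetical. The single attribute "NounNumber" is renamed to "SubjectNumber" directly rather than through a list of suffixes. The object is left out to keep the example short.

```haskell
-- Assumed state: attributes set so far, plus the remaining input.
type Attrs = [(String, String)]
newtype Parser a = Parser { runParser :: (Attrs, String) -> [(a, (Attrs, String))] }

instance Functor Parser where
  fmap f p = Parser $ \st -> [ (f x, r) | (x, r) <- runParser p st ]

instance Applicative Parser where
  pure x = Parser $ \st -> [(x, st)]
  pf <*> px = Parser $ \st ->
    [ (f x, r2) | (f, r1) <- runParser pf st, (x, r2) <- runParser px r1 ]

instance Monad Parser where
  p >>= f = Parser $ \st -> concat [ runParser (f x) r | (x, r) <- runParser p st ]

-- Non-deterministic choice (hypothetical name).
orElse :: Parser a -> Parser a -> Parser a
orElse p q = Parser $ \st -> runParser p st ++ runParser q st

-- Match a literal piece of text.
word :: String -> Parser String
word w = Parser $ \(attrs, s) ->
  [ (w, (attrs, drop (length w) s)) | take (length w) s == w ]

-- Set an attribute; a clashing value makes the parser fail.
setAttribute :: (String, String) -> Parser ()
setAttribute (name, value) = Parser $ \(attrs, s) ->
  case lookup name attrs of
    Nothing -> [((), ((name, value) : attrs, s))]
    Just v
      | v == value -> [((), (attrs, s))]
      | otherwise  -> []

-- Move a value from one attribute name to another, checking for clashes.
renameAttribute :: (String, String) -> Parser ()
renameAttribute (old, new) = Parser $ \(attrs, s) ->
  case lookup old attrs of
    Nothing -> [((), (attrs, s))]
    Just v  -> runParser (setAttribute (new, v)) (filter ((/= old) . fst) attrs, s)

-- Match a word and record what it tells us (hypothetical helper).
lexeme :: String -> (String, String) -> Parser String
lexeme w attr = do
  _ <- word w
  setAttribute attr
  return w

-- Toy lexicon, for illustration only.
parseNoun, parseVerb :: Parser String
parseNoun = lexeme "kelb"  ("NounNumber", "Singular")
   `orElse` lexeme "klieb" ("NounNumber", "Plural")
parseVerb = lexeme "jiekol" ("SubjectNumber", "Singular")
   `orElse` lexeme "jieklu" ("SubjectNumber", "Plural")

-- A noun in subject position: its number becomes the subject's number.
parseSubject :: Parser String
parseSubject = do
  n <- parseNoun
  renameAttribute ("NounNumber", "SubjectNumber")
  return n

parseSentence :: Parser (String, String)
parseSentence = do
  s <- parseSubject
  _ <- word " "
  v <- parseVerb
  return (s, v)

main :: IO ()
main = do
  print (map fst (runParser parseSentence ([], "kelb jiekol")))   -- [("kelb","jiekol")]
  print (map fst (runParser parseSentence ([], "klieb jieklu")))  -- [("klieb","jieklu")]
  print (null (runParser parseSentence ([], "klieb jiekol")))     -- True
```

The noun parser knows nothing about verbs, and the verb parser knows nothing about nouns. Agreement is enforced only because both end up setting the same attribute.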
30 Contribution
- Local context parsed using the same parser combinators
- Global information shared via attributes
- All this done compositionally, and
- By enriching standard parser combinators.
31 Conclusions
- Optimise it, apply it to a larger NL subset, compare with other, more traditional approaches.
- Optimise the new context-aware combinators.
- Extend it to work both as a language generator and parser.