Title: Local Parallel Rewriting: Theory and Applications
1 Local Parallel Rewriting: Theory and Applications
- Giorgio Satta
- University of Padua
Joint work with D. Melamed, O. Rambow, B. Wellington
2 Summary
- Part I: Introduction
  - Parallel rewriting
  - Locality
- Part II: Theory
  - Descriptional complexity
  - Normal forms and language hierarchy
- Part III: Applications
  - Synchronous rewriting
  - Translation algorithms
3 In Computational Linguistics, formal grammars are used to model the syntactic structure of natural language sentences. Syntactic modeling is fundamental to applications such as natural language understanding and machine translation.
We can abstractly view a grammar as a set of elementary objects (productions) that describe basic syntactic relations. These objects are then combined to obtain syntactic structures.
4 Example: Context-Free Grammars (CFGs)
S_went → NP_pat VP_went
VP_went → VP_went Adv_early
VP_went → V_went NP_home
V_went → went
Derivation tree: [S_went NP_pat [VP_went [VP_went [V_went went] NP_home] Adv_early]]
5 So-called long-distance relations cannot be directly represented by CFGs.
- Example: extraction
- what did Alice eat ?
- S_eat → NP_what S_eat
- VP_eat → V_eat NP_e
- There is no direct dependence between the two productions above
6 Several other constructions are problematic for CFGs:
- Topicalization: I enjoyed the soup, but the main course I thought was awful
- Clitic climbing: Mari lo quería terminar de hacer
- Scrambling: dass den Kühlschrank niemand zu reparieren versprochen hat
7 A related problem arises in machine translation, where we want to relate phrases that correspond under the translation.
Example: damoy Pat rano pashol, "Pat went home early"
8 Several solutions have been proposed in the literature:
- Enriching CFGs with feature structures (HPSG, LFG, ...)
- Use of logical constraints (ID/LP, linearization grammars)
- Use of special-purpose operators (domain union, HPSG)
- Local parallel rewriting (TAG, LCFRS, ...)
9 A parallel rewriting system defines elementary objects that allow simultaneous rewriting of a sentential form at several places.
Example: TAG
10 The simplest way to implement parallel rewriting is to use tuples of context-free productions
(A_1 → α_1, ..., A_r → α_r)
and to define rewriting as the simultaneous application of all the production components:
γ_0 A_1 γ_1 ... γ_{r-1} A_r γ_r ⇒ γ_0 α_1 γ_1 ... γ_{r-1} α_r γ_r
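The simultaneous application of a tuple of productions can be sketched in Python. This is only a minimal illustration, assuming a sentential form is a list of symbols and that the caller names the positions to rewrite; the function name `apply_parallel` and the sample production (A → a A, B → b B) are hypothetical and do not come from the slides.

```python
def apply_parallel(form, production, positions):
    """Apply a tuple of context-free productions in one step.

    form       -- sentential form, a list of symbols
    production -- list of (lhs, rhs) pairs, one per component
    positions  -- strictly increasing positions in `form`, one per component
    """
    if len(positions) != len(production):
        raise ValueError("one position per component is required")
    if list(positions) != sorted(set(positions)):
        raise ValueError("positions must be strictly increasing")
    for pos, (lhs, _) in zip(positions, production):
        if form[pos] != lhs:
            raise ValueError(f"expected {lhs} at position {pos}")
    result = list(form)
    # Rewrite from right to left so that earlier positions stay valid.
    for pos, (_, rhs) in sorted(zip(positions, production), reverse=True):
        result[pos:pos + 1] = rhs
    return result


# Hypothetical production (A -> a A, B -> b B) applied to the form "A c B".
prod = [("A", ["a", "A"]), ("B", ["b", "B"])]
print(apply_parallel(["A", "c", "B"], prod, [0, 2]))
```

Nothing in this sketch restricts which occurrences are rewritten together; that restriction is what locality adds.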
11 Several parallel rewriting systems have been defined in the literature, for instance Matrix Grammars, Vector Grammars, and Scattered Context Grammars.
These systems are too powerful. They can generate:
- NP-complete languages
- Non-semilinear languages
12 Local rewriting is a restriction that requires each application of a production to rewrite only symbols that were introduced into the sentential form in a single step.
Example: (A → A B A, A → B A)
A C A ⇒ A B A C B A ⇒ A B A B A C B B A
13 Local rewriting can be implemented using superscript indices to distinguish among different occurrences of the same symbol.
Example: (A → A(1) B(1) A(1), A → B(1) A(1))
A(1) C(1) A(1)
⇒ A(2) B(2) A(2) C(1) B(2) A(2)
⇒ A(3) B(3) A(3) B(2) A(2) C(1) B(2) B(3) A(3)
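The indexed derivation can be replayed with a short Python sketch. It assumes that a sentential form is a list of (symbol, index) pairs, that the index records the step in which a symbol was introduced, and that the caller picks the positions to rewrite; `apply_local` and `show` are hypothetical helper names.

```python
def apply_local(form, production, positions, fresh):
    """Rewrite symbols that all carry the same index, in one step.

    form       -- list of (symbol, index) pairs
    production -- list of (lhs, rhs) pairs, rhs being a list of symbols
    positions  -- strictly increasing positions in `form`, one per component
    fresh      -- index given to every newly introduced symbol
    """
    if len(positions) != len(production):
        raise ValueError("one position per component is required")
    if len({form[pos][1] for pos in positions}) != 1:
        raise ValueError("locality violated: indices differ")
    for pos, (lhs, _) in zip(positions, production):
        if form[pos][0] != lhs:
            raise ValueError(f"expected {lhs} at position {pos}")
    result = list(form)
    # Rewrite from right to left so that earlier positions stay valid.
    for pos, (_, rhs) in sorted(zip(positions, production), reverse=True):
        result[pos:pos + 1] = [(symbol, fresh) for symbol in rhs]
    return result


def show(form):
    return " ".join(f"{symbol}({index})" for symbol, index in form)


prod = [("A", ["A", "B", "A"]), ("A", ["B", "A"])]
step0 = [("A", 1), ("C", 1), ("A", 1)]
step1 = apply_local(step0, prod, [0, 2], fresh=2)
step2 = apply_local(step1, prod, [0, 5], fresh=3)
for step in (step0, step1, step2):
    print(show(step))
```

The second step rewrites the first and the last A(2); the middle A(2) is left untouched, which is allowed because the restriction only constrains the symbols that are rewritten together.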
14 By combining parallel rewriting and local rewriting we obtain a local parallel grammar.
Example: damoy Pat rano pashol (lit. home Pat early went), "Pat went home early"
(S_pashol → VP_pashol(1) NP_pat(2) VP_pashol(1))
(VP_pashol → VP_pashol(1), VP_pashol → Adv_rano(2) VP_pashol(1))
(VP_pashol → NP_damoy(1), VP_pashol → V_pashol(2))
15 Example (contd)
S_pashol(1)
⇒ VP_pashol(2) NP_pat(3) VP_pashol(2)
⇒ VP_pashol(4) NP_pat(3) Adv_rano(5) VP_pashol(4)
⇒ NP_damoy(6) NP_pat(3) Adv_rano(5) V_pashol(7)
⇒ damoy Pat rano pashol
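The derivation of the example can be reproduced with a small Python sketch. It is one possible encoding, not the formalism's definition: a production is a tuple of (lhs, rhs) components, nonterminals on a right-hand side carry a link, and all symbols sharing an index in the sentential form are rewritten by one production. The rule labels ("s", "adv", "obj", ...) and the function `rewrite` are hypothetical, and three of the four lexical productions are assumed from the last derivation step.

```python
from itertools import count

# rhs items are terminal strings or (nonterminal, link) pairs.
GRAMMAR = {
    "s": (("S_pashol", [("VP_pashol", 1), ("NP_pat", 2), ("VP_pashol", 1)]),),
    "adv": (("VP_pashol", [("VP_pashol", 1)]),
            ("VP_pashol", [("Adv_rano", 2), ("VP_pashol", 1)])),
    "obj": (("VP_pashol", [("NP_damoy", 1)]),
            ("VP_pashol", [("V_pashol", 2)])),
    # Lexical productions (assumed for this sketch).
    "pat": (("NP_pat", ["Pat"]),),
    "rano": (("Adv_rano", ["rano"]),),
    "damoy": (("NP_damoy", ["damoy"]),),
    "pashol": (("V_pashol", ["pashol"]),),
}


def rewrite(form, index, production, fresh):
    """Rewrite, in one step, every symbol of `form` that carries `index`."""
    group = [pos for pos, item in enumerate(form)
             if isinstance(item, tuple) and item[1] == index]
    if [form[pos][0] for pos in group] != [lhs for lhs, _ in production]:
        raise ValueError("production does not match the local domain")
    links = {}  # link used in the production -> fresh index in the form
    parts = []
    for _, rhs in production:
        part = []
        for item in rhs:
            if isinstance(item, tuple):
                name, link = item
                if link not in links:
                    links[link] = next(fresh)
                part.append((name, links[link]))
            else:
                part.append(item)
        parts.append(part)
    result = list(form)
    # Rewrite from right to left so that earlier positions stay valid.
    for pos, part in sorted(zip(group, parts), reverse=True):
        result[pos:pos + 1] = part
    return result


fresh = count(2)  # index 1 is taken by the start symbol
form = [("S_pashol", 1)]
for index, name in [(1, "s"), (2, "adv"), (4, "obj"),
                    (6, "damoy"), (3, "pat"), (5, "rano"), (7, "pashol")]:
    form = rewrite(form, index, GRAMMAR[name], fresh)
print(" ".join(form))
```

With this numbering scheme the intermediate forms carry the same indices as the derivation on the slide.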
16 Parallelism and locality have been exploited in many rewriting systems that were defined independently in the literature, motivated by different application domains.
All these superficially different systems were later shown to be generatively equivalent.
This provides evidence that parallel rewriting and local rewriting are natural notions.
17 Known local parallel rewriting systems:
- Syntax-directed compilers
  - Deterministic Tree-Walking Transducers (Aho and Ullman, 1971)
- Visual and relational languages, syntactic pattern matching and biological data modeling
  - String-generating Context-Free Hypergraph Grammars (Bauderon and Courcelle, 1987; Habel and Kreowski, 1987)
- Formal language and translation theory
  - Finite Copying Top-Down Tree-to-String Transducers (Engelfriet et al., 1980)
  - Local Unordered Scattered Grammars (Rambow and Satta, 1998)
18 Known local parallel rewriting systems (contd):
- Natural language processing
  - Linear Context-Free Rewriting Systems (Vijay-Shanker, Weir and Joshi, 1987)
  - Multiple Context-Free Grammars (Kasami et al., 1987)
  - Multi-Component Tree Adjoining Grammars (Weir, 1988)
  - Finite-copying Lexical Functional Grammars (Seki et al., 1993)
  - Minimalist Grammars (Stabler, 1997)
  - Simple Range Concatenation Grammars (Boullier, 1998)
20 Languages generated by local parallel rewriting systems have the following important properties:
- They are included in the class of Context-Sensitive Languages
- They can be parsed in deterministic polynomial time
- They are semilinear
These languages belong to the class of Mildly Context-Sensitive Languages (Joshi, 1985; Joshi, Vijay-Shanker and Weir, 1991).
21 In a local parallel rewriting system, derivations can be described by the trees generated by a context-free grammar.
(S_pashol → VP_pashol(1) NP_pat(2) VP_pashol(1))
  (NP_pat → Pat)
  (VP_pashol → VP_pashol(1), VP_pashol → Adv_rano(2) VP_pashol(1))
    (VP_pashol → NP_damoy(1), VP_pashol → V_pashol(2))
      . . .
    . . .
22 Parallelism and locality can be viewed as resources and can therefore be measured:
- Degree of parallelism: fan-out
  - max number of components in productions
- Degree of locality: rank
  - max number of productions that can rewrite a local domain
  - max branching in the underlying derivation
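Both measures can be read off a grammar directly. The Python sketch below assumes a hypothetical encoding in which a production is a tuple of (lhs, rhs) components and each right-hand-side nonterminal is a (name, link) pair; under that reading, fan-out is the largest number of components, and rank is the largest number of distinct links in one production, since each link names one local domain to be rewritten by one production. The grammar is the damoy/pashol example, with one lexical production added.

```python
GRAMMAR = [
    (("S_pashol", [("VP_pashol", 1), ("NP_pat", 2), ("VP_pashol", 1)]),),
    (("VP_pashol", [("VP_pashol", 1)]),
     ("VP_pashol", [("Adv_rano", 2), ("VP_pashol", 1)])),
    (("VP_pashol", [("NP_damoy", 1)]),
     ("VP_pashol", [("V_pashol", 2)])),
    (("NP_pat", ["Pat"]),),
]


def fan_out(grammar):
    """Largest number of components in any production."""
    return max(len(production) for production in grammar)


def rank(grammar):
    """Largest number of distinct links used by any production."""
    def links(production):
        return {item[1]
                for _, rhs in production
                for item in rhs
                if isinstance(item, tuple)}
    return max(len(links(production)) for production in grammar)


print("fan-out:", fan_out(GRAMMAR), "rank:", rank(GRAMMAR))
```

For this grammar both values are 2: the VP productions have two components, and no production introduces more than two linked groups.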
24 Examples:
- Linear Context-Free Languages: f = 1, r = 1
- Context-Free Languages: f = 1, r = 2
- Tree-Adjoining Languages: f = 2, r = 2
Questions:
- Are there normal forms with bounded fan-out?
- Are there normal forms with bounded rank?
- Can we trade the two resources?
25 Figure: language hierarchy over rank r (r = 1, 2, 3, 4, ...) and fan-out f (f = 1, 2, 3, 4), with the classes LCFL, nl-CFL, ET0Lf2, TAL and CCL3 placed in the grid.
27 A synchronous grammar is a formal model of translation.
- The model combines grammars by pairing productions that represent corresponding phrases
- Rewriting is carried out synchronously
- Examples:
  - Finite Transducers
  - Syntax Directed Translation Schemata
  - Synchronous Tree Adjoining Grammars
28 Example: ongaku wo kiku (lit. music to listening), "listening to music"
(VP_listening → V_listening(1) PP_music(2)), (VP_kiku → PP_ongaku(2) V_kiku(1))
(PP_music → P_to(1) NP_music(2)), (PP_ongaku → NP_ongaku(2) P_wo(1))
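Synchronous rewriting with these paired productions can be sketched in Python. The encoding is an assumption made for illustration: each pair of linked nonterminals maps to a pair of right-hand sides, nonterminals are (name, link) pairs, and equal links across the two sides mark corresponding phrases. The three lexical pairs and the function name `expand` are hypothetical additions.

```python
RULES = {
    ("VP_listening", "VP_kiku"): (
        [("V_listening", 1), ("PP_music", 2)],
        [("PP_ongaku", 2), ("V_kiku", 1)],
    ),
    ("PP_music", "PP_ongaku"): (
        [("P_to", 1), ("NP_music", 2)],
        [("NP_ongaku", 2), ("P_wo", 1)],
    ),
    # Lexical pairs (assumed for this sketch).
    ("V_listening", "V_kiku"): (["listening"], ["kiku"]),
    ("P_to", "P_wo"): (["to"], ["wo"]),
    ("NP_music", "NP_ongaku"): (["music"], ["ongaku"]),
}


def expand(pair):
    """Expand a pair of linked nonterminals into two word lists."""
    rhs1, rhs2 = RULES[pair]
    name1 = {item[1]: item[0] for item in rhs1 if isinstance(item, tuple)}
    name2 = {item[1]: item[0] for item in rhs2 if isinstance(item, tuple)}
    # Nonterminals sharing a link are rewritten together by one paired rule.
    sub = {link: expand((name1[link], name2[link])) for link in name1}

    def realize(rhs, side):
        words = []
        for item in rhs:
            if isinstance(item, tuple):
                words.extend(sub[item[1]][side])
            else:
                words.append(item)
        return words

    return realize(rhs1, 0), realize(rhs2, 1)


english, japanese = expand(("VP_listening", "VP_kiku"))
print(" ".join(english), "|", " ".join(japanese))
```

The sketch is deterministic because every pair of nonterminals has exactly one paired production; a realistic grammar would have several and would need search.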
29 Synchronous grammars can be viewed as specific applications of local parallel rewriting.
- Parallelism is exploited to rewrite on several dimensions
- Locality is exploited to enforce synchronous rewriting
- In this way we have a common and well-understood framework for investigating synchronous rewriting and for comparing already known synchronous formalisms
30 Example: damoy Pat rano pashol (lit. home Pat early went), "Pat went home early"
(S_pashol → VP_pashol(1) NP_pat(2) VP_pashol(1)), (S_went → NP_pat(2) VP_went(1))
. . .
(VP_pashol → NP_damoy(1), VP_pashol → V_pashol(2)), (VP_went → V_went(2) NP_home(1))
31 Example (contd)
S_pashol(1) , S_went(1)
⇒ VP_pashol(2) NP_pat(3) VP_pashol(2) , NP_pat(3) VP_went(2)
⇒ VP_pashol(4) NP_pat(3) Adv_rano(5) VP_pashol(4) , NP_pat(3) VP_went(4) Adv_early(5)
⇒ NP_damoy(6) NP_pat(3) Adv_rano(5) V_pashol(7) , NP_pat(3) V_went(7) NP_home(6) Adv_early(5)
⇒ damoy Pat rano pashol , Pat went home early
32 Translation problem for a synchronous grammar G = (G1, G2):
- Input: string u
- Output: set T(G, u) of all strings that translate u (and their derivations)
- We encode T(G, u) as a local parallel grammar
33 Assume a synchronous grammar G = (G1, G2).
- L(Gj): language freely generated by grammar component Gj
- Tj: projection on the j-th dimension of translation T(G)
- In general we have L(Gj) ≠ Tj
- We can construct a local parallel grammar auto-proj(G, j) that generates Tj
- Weak language preservation property
34 We can intersect a synchronous translation T(G) with the relation L(M1) × L(M2), where M1 and M2 are finite automata.
- This is done through a generalization of a construction for CFGs due to Bar-Hillel et al. (1964)
- The result is a synchronous grammar
35 Algorithm:
- Construct G′ by intersecting G with finite automata M1 and M2, where M1 generates u and M2 generates Σ*
- Construct the local parallel grammar auto-proj(G′, 2)
36 Local parallel rewriting systems have been used in several fields of Computer Science.
The languages they generate form a two-dimensional non-collapsing hierarchy on the fan-out and rank parameters.
Synchronous rewriting can be viewed as a specific application of local parallel rewriting.
Translation algorithms can be developed on the basis of already known parsing techniques.