Title: Relation Alignment for Textual Entailment Recognition
Cognitive Computation Group,
University of Illinois
Recognizing Textual Entailment
Experimental Results
The RATER System
The RATER system first annotates entailment pairs
with a suite of NLP analytics, generating a
multi-view representation mapping each analysis
to the original text. Resource-specific metrics
are then used to compare constituents in each
(T,H) paired view (e.g., NE metrics are used to
compare constituents in the T, H Named Entity
views) to build a match graph. An Aligner then
selects edges from these graphs (see panel
below). Features are then extracted over the
resulting set of alignments, and used to train a
classifier which is used to label examples.
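The match-graph step described above can be sketched as follows. This is a minimal illustration, not the RATER implementation: the `Constituent` type, the `toy_ne_metric` function, and the 0.9 acronym score are all hypothetical names and values chosen for the example, and the Named Entity spans are entered by hand rather than produced by an NE tagger.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Constituent:
    """A span produced by one analysis (view) of the text."""
    view: str   # name of the view, e.g. "NE"
    text: str   # surface string of the span
    start: int  # index of the span's first whitespace token in the text


Metric = Callable[[Constituent, Constituent], float]
Edge = Tuple[Constituent, Constituent, float]  # (H constituent, T constituent, score)


def acronym(phrase: str) -> str:
    """Upper-cased initials of the whitespace-separated words in a phrase."""
    return "".join(word[0] for word in phrase.split()).upper()


def toy_ne_metric(h: Constituent, t: Constituent) -> float:
    """Hypothetical NE metric: exact match, else acronym match, else 0."""
    if h.text.lower() == t.text.lower():
        return 1.0
    if h.text.upper() == acronym(t.text) or t.text.upper() == acronym(h.text):
        return 0.9  # assumed score for an acronym match
    return 0.0


def build_match_graph(h_view: List[Constituent],
                      t_view: List[Constituent],
                      metric: Metric) -> List[Edge]:
    """Compare every H constituent with every T constituent in one view."""
    edges = []
    for h in h_view:
        for t in t_view:
            score = metric(h, t)
            if score > 0.0:
                edges.append((h, t, score))
    return edges


# Hand-built NE views for the SCO example pair used on this poster.
T_NE = [
    Constituent("NE", "Shanghai Co-operation Organization", 1),
    Constituent("NE", "SCO", 4),
    Constituent("NE", "Russia", 11),
    Constituent("NE", "China", 12),
]
H_NE = [Constituent("NE", "China", 0), Constituent("NE", "SCO", 5)]

NE_GRAPH = build_match_graph(H_NE, T_NE, toy_ne_metric)
```

Keeping one metric per view means a new analysis source only needs its own comparison function; the aligner then works on the resulting edge lists.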
The RATER system was trained using the RTE5
Development corpus and evaluated on the RTE5 Test
corpus. We compare the system's performance
against a smart lexical baseline that uses
WordNet-based similarity resources. In addition,
we carried out an ablation study with three
versions of the system: without WordNet-based
resources (No WN), without Named Entity
resources (No NE), and with simple Named Entity
similarity (Basic NE). After the submission
deadline, we augmented the shallow semantic
predicates in the full system using Coreference
information to create predicates spanning
multiple sentences (Coref). Table 1 shows
the performance of these variants of the system
on the Development corpus, while Table 2 shows
the results on the Test corpus. Performance is
consistent with the expected behavior of the
system: as semantic resources are removed, system
performance declines. WordNet (Miller et al.,
1990), Named Entity (Ratinov and Roth, 2009), and
Coreference (Bengtson and Roth, 2008) each make a
significant contribution to overall performance.
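To make the size of each resource's contribution concrete, the short script below computes the accuracy difference of every variant relative to the submitted system, using the "All" column of Table 2 (Test corpus). The function name `deltas_vs_reference` is ours, introduced only for this illustration.

```python
# "All" column of Table 2 (RTE5 Test corpus, 2-way task).
TEST_ACCURACY_ALL = {
    "Baseline": 0.600,
    "Submtd": 0.644,
    "No NE": 0.629,
    "Basic NE": 0.633,
    "No WN": 0.603,
    "Coref": 0.666,
}


def deltas_vs_reference(scores, reference="Submtd"):
    """Return each system's accuracy minus the reference system's accuracy."""
    base = scores[reference]
    return {
        name: round(accuracy - base, 3)
        for name, accuracy in scores.items()
        if name != reference
    }


if __name__ == "__main__":
    deltas = deltas_vs_reference(TEST_ACCURACY_ALL)
    # Print variants from largest drop to largest gain.
    for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
        print(f"{name:<9} {delta:+.3f}")
```

On these figures, removing WordNet costs 0.041 in accuracy and removing Named Entity resources costs 0.015, while adding Coreference gains 0.022 over the submitted system.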
The task of Recognizing Textual Entailment frames
Natural Language Text understanding as
recognizing when two text spans express the same
meaning. In the example below, the text span
T contains the meaning of the text span H,
so a successful RTE system would say that T
entails H.
T: The Shanghai Co-operation Organization
(SCO), is a fledgling association that binds
Russia, China and four other nations.
H: China is a member of SCO.
- Most successful systems share a basic assumption
that semantics is largely compositional, meaning
that we can combine the results of local
entailment decisions to reach a global decision.
Many systems share the same basic architecture:
  - Preprocess the TE pair with a range of NLP tools.
  - Determine some structure over each sentence in
    the Entailment pair.
  - Align some level of structure in the Hypothesis
    with structure in the Text.
  - Either directly compute the entailment result based
    on the alignment (either online or in batch mode), OR
    extract features using the alignment (and possibly
    other resources) and determine the label of the
    TE pair based on this feature representation.
- (Zanzotto et al. 2006) take the first approach,
computing the best alignment for each pair,
then learning a classifier over all aligned pairs
in a corpus, thereby using alignment directly to
determine the entailment label.
- Others, such as (Hickl et al. 2007; de Marneffe
et al. 2008), use alignment as a filtering step to
select among possible feature sources. (Zanzotto
and Moschitti 2006) explain their alignment as
capturing valid and invalid syntactic
transformations across many entailment pairs.
(de Marneffe et al.) propose an alignment task
that is separate from the entailment decision, in
which elements in the Hypothesis are paired with
the most relevant elements of the Text.
- We believe that Alignment is a valuable inference
framework in RTE, but found problems with
existing approaches when we tried to incorporate
new analysis and comparison resources. In the
present work, we share our insights about the
Alignment process and its relation to Textual
Entailment decisions.
Contributions
- Identify clear roles for Alignment in Textual
Entailment systems: filter and decider.
- Propose an alignment framework to leverage
focused knowledge resources and avoid canonization.
Figure 1 Architecture of the RATER system
Alignment over Multiple Views
In the alignment step, instead of aligning only a
single shallow or unified representation (as
previous alignment systems have done), RATER
divides the set of views into groups and
computes a separate alignment for each group
(groups contain analysis sources for which the
comparison metrics share a common output scale).
Within each alignment, RATER selects the edges
that maximize match score while minimizing the
distance of mapped constituents in the text from
each other; the objective function is given in
Figure 2. The selected constituents of H must
respect the constraint that each token in H may
be mapped to at most one token in T.
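The selection step for one group of views can be illustrated with a small brute-force search. The poster's objective function survives in this text only as a caption, so the form used here (total match score minus a weighted mean pairwise distance between the mapped T positions) is an assumption made for illustration, as are the function name `align` and the `distance_weight` parameter. The constraint that each H token maps to at most one T token is enforced by construction.

```python
from itertools import combinations, product


def align(candidates, distance_weight=0.01):
    """Pick at most one T position per H token.

    candidates: dict mapping an H token index to a list of
                (T token index, match score) pairs.
    Assumed objective: sum of match scores minus distance_weight times
    the mean pairwise distance between the chosen T positions.
    """
    h_tokens = sorted(candidates)
    # Each H token may stay unaligned (None) or take one of its candidates.
    options = [[None] + list(candidates[h]) for h in h_tokens]

    best_value, best_alignment = float("-inf"), {}
    for choice in product(*options):
        picked = [(h, c) for h, c in zip(h_tokens, choice) if c is not None]
        match = sum(score for _, (_, score) in picked)
        positions = [t for _, (t, _) in picked]
        pairs = list(combinations(positions, 2))
        spread = sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0
        value = match - distance_weight * spread
        if value > best_value:
            best_value = value
            best_alignment = {h: t for h, (t, _) in picked}
    return best_alignment, best_value


# Candidate edges for the SCO example, with whitespace token indices:
# H "China" (0) can match T "China" (12); H "SCO" (5) can match
# T "SCO" (4) or the full organization name starting at token 1.
EXAMPLE_CANDIDATES = {
    0: [(12, 1.0)],
    5: [(4, 1.0), (1, 0.9)],
}
EXAMPLE_ALIGNMENT, EXAMPLE_VALUE = align(EXAMPLE_CANDIDATES)
```

Exhaustive enumeration is exponential in the number of H tokens, so it is only practical for short hypotheses like this one; it is used here because it makes the objective and the constraint easy to read.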
Figure 3 Example showing multiple alignments
over different views of the entailment pair
Table 1 RTE5 2-way Task Results (Dev. Corpus)
RTE5 Development
System     All    QA     IE     IR
Baseline   0.628  0.641  0.557  0.683
Submtd     0.648  0.647  0.552  0.744
No NE      0.640  0.631  0.577  0.708
Basic NE   0.623  0.655  0.543  0.670
No WN      0.647  0.650  0.533  0.755
Coref      0.663  0.665  0.559  0.765
Table 2 RTE5 2-way Task Results (Test Corpus)
RTE5 Test
System     All    QA     IE     IR
Baseline   0.600  0.550  0.500  0.750
Submtd     0.644  0.580  0.576  0.775
No NE      0.629  0.580  0.530  0.775
Basic NE   0.633  0.580  0.605  0.715
No WN      0.603  0.565  0.535  0.710
Coref      0.666  0.596  0.615  0.785
Figure 2 Objective function for Alignment
Selected References
Marie-Catherine de Marneffe, Trond Grenager, Bill MacCartney, Daniel Cer, Daniel Ramage, Chloe Kiddon, and Christopher D. Manning. Aligning semantic graphs for textual inference and machine reading. In AAAI Spring Symposium at Stanford, 2007.
Andrew Hickl, John Williams, Jeremy Bensley, Kirk Roberts, Bryan Rink, and Ying Shi. Recognizing textual entailment with LCC's Groundhog system. In Proc. of the 2nd PASCAL Challenges Workshop on Recognizing Textual Entailment, 2006.
Fabio Massimo Zanzotto and Alessandro Moschitti. Automatic learning of textual entailments with cross-pair similarities. In Proceedings of the 21st Intl. Conf. on Computational Linguistics and 44th Annual Meeting of the ACL, 2006.
L. Ratinov and D. Roth. Design challenges and misconceptions in named entity recognition. In Proc. of CoNLL, 2009.
E. Bengtson and D. Roth. Understanding the value of features for coreference resolution. In Proc. of EMNLP, 2008.
Mark Sammons, V.G.Vinod Vydiswaran, Tim Vieira,
Nikhil Johri, Ming-Wei Chang, Dan Goldwasser,
Vivek Srikumar, Gourab Kundu, Yuancheng Tu, Kevin
Small, Joshua Rule, Quang Do, Dan Roth