Temporal Information Extraction - PowerPoint PPT Presentation

Title:

Temporal Information Extraction

Description:

Title: Previous Results Author: Valued PowerSpec Customer Last modified by: MITRE Employee Created Date: 12/12/2001 4:37:17 PM Document presentation format – PowerPoint PPT presentation

Slides: 101
Provided by: Value56

Transcript and Presenter's Notes

Title: Temporal Information Extraction


1
Temporal Information Extraction
  • Inderjeet Mani
  • imani_at_mitre.org

2
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

3
Motivation: Question-Answering
  • When is Ramadan this year?
  • What was the largest U.S. military operation
    since Vietnam?
  • Tell me the best time of the year to go
    cherry-picking.
  • How often do you feed a pet gerbil?
  • Is Gates currently CEO of Microsoft?
  • Did the Enron merger with Dynegy take place?
  • How long did the hostage situation in Beirut
    last?
  • What is the current unemployment rate?
  • How many Iraqi civilian casualties were there in
    the first week of the U.S. invasion of Iraq?
  • Who was Secretary of Defense during the Gulf War?

4
Motivation: Coherent and Faithful Summaries
..worked in recent summers.. ..was the source of
the virus last week.. where Morris was a
computer science undergraduate until
June.. ..whose virus program three years ago
disrupted
  • Single-document sentence extraction summarizers
    are plagued by dangling references
  • especially temporal ones
  • Multi-Document summarizers can be misled by the
    weakness of vocabulary overlap methods
  • leads to inappropriate merging of distinct events

5
An Example Story
  • Feb. 18, 2004
  • Yesterday Holly was running a marathon when she
    twisted her ankle. David had pushed her.

  1. When did the running occur? Yesterday.
  2. When did the twisting occur? Yesterday, during the
    running.
  3. Did the pushing occur before the twisting? Yes.
  4. Did Holly keep running after twisting her ankle?
    Probably not.
6
Temporal Information Extraction Problem
  • Feb. 18, 2004
  • Yesterday Holly was running a marathon when she
    twisted her ankle. David had pushed her.
  • Input: a natural language discourse
  • Output: a representation of events and their
    temporal relations
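
One possible shape for such an output, sketched in Python for the Holly story. The class names, identifiers (e1, t1, ...) and relation labels are illustrative assumptions, not a format defined in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    eid: str   # hypothetical identifier
    text: str  # the event mention in the discourse


@dataclass(frozen=True)
class TemporalLink:
    source: str    # id of an event or time
    relation: str  # e.g. "BEFORE", "DURING", "IS_INCLUDED"
    target: str    # id of an event or time


# t0 is the document date (Feb. 18, 2004); t1 is the resolved "yesterday".
times = {"t0": "2004-02-18", "t1": "2004-02-17"}

events = [Event("e1", "running"), Event("e2", "twisted"), Event("e3", "pushed")]

links = [
    TemporalLink("e1", "IS_INCLUDED", "t1"),  # the running occurred yesterday
    TemporalLink("e2", "DURING", "e1"),       # the twisting occurred during the running
    TemporalLink("e3", "BEFORE", "e2"),       # the pushing preceded the twisting
]


def relations_of(eid):
    """Return (relation, target) pairs whose source is the given id."""
    return [(link.relation, link.target) for link in links if link.source == eid]
```

Questions such as "Did the pushing occur before the twisting?" then reduce to looking up, or inferring, a link between two ids.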

7
IE Methodology
[Figure: IE methodology pipeline. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply]
Idea for Temporal IE: Make progress by focusing
on a particular top-down slice (i.e., time),
using its rich structure
8
Theories
AI logic
Formal Linguistics
9
Linguistic Theories
  • Events
  • Event Structure (event subclasses and parts)
  • Tense (indicates location of event in time, via
    verb inflections, modals, auxiliaries, etc.)
  • Grammatical Aspect (indicates whether event is
    ongoing, finished, completed)
  • Time Adverbials
  • Relations between events and/or times
  • temporal relations
  • we will also need discourse relations

10
Tense
  • All languages that have tense (in the semantic
    sense of locating events in time) can express
    location in time
  • Location can be expressed relative to a deictic
    center that is the current moment of speech, or
    speech time, or speech point
  • e.g., tomorrow, yesterday, etc.
  • Languages can also express temporal locations
    relative to a coordinate system
  • a calendar, e.g., 1991 (A.D.),
  • a cyclically occurring event, e.g., morning,
    spring,
  • an arbitrary event, e.g., the day after he
    married her.
  • A language may have tense in the above semantic
    sense, without expressing it using tense
    morphemes
  • Instead, aspectual morphemes and/or modals and
    auxiliaries may be used.

11
Mandarin Chinese
  • Has semantic tense
  • Lacks tense morphemes
  • Instead, it uses aspect markers to indicate
    whether an event is ongoing (-zai, -le),
    completed (-wan), terminated (-le, -guo), or in a
    result state (-zhe)
  • But aspect markers are often absent

我 看 电视 (wo kan dianshi): I watch / will watch / watched TV
Example from Congmin Min, MS Thesis, Georgetown, 2005.
12
Burmese
  • No semantic tense, but all languages that lack
    semantic tense have a realis/irrealis
    distinction.
  • Events that are ongoing or that were observed in
    the past are expressed by the sentence-final realis
    particles -te, -tha, -ta, and -hta.
  • For unreal or hypothetical events (including
    future and present and hypothetical past events),
    the sentence-final irrealis particles -me, -ma,
    and -hma are used.

Comrie, B. Tense. Cambridge, 1985.
13
Tense as Anaphor: Reichenbach
  • A formal method for representing tense, based on
    which one can locate events in time
  • Tensed utterances introduce references to 3 time
    points
  • Speech Time S
  • Event Time E
  • Reference Time R
  • (S) I had mailed the letter (E) when John came &
    told me the news (R)
  • E < R < S
  • Three temporal relations are defined on these
    time points
  • at, before, after
  • 13 different relations are possible
  • N.B. the concept of time point is an
    abstraction - it can map to an interval
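
The count of 13 can be checked by enumerating every way of ordering the three points E, R and S when ties are allowed. The following is a minimal sketch; the function names are assumptions made for illustration.

```python
from itertools import product


def orderings(points=("E", "R", "S")):
    """Enumerate all weak orderings (ties allowed) of the given points."""
    seen = set()
    for ranks in product(range(len(points)), repeat=len(points)):
        # Canonicalise so that e.g. ranks (0, 2, 2) and (0, 1, 1) coincide.
        levels = sorted(set(ranks))
        seen.add(tuple(levels.index(r) for r in ranks))
    return sorted(seen)


def render(canon, points=("E", "R", "S")):
    """Write one ordering in the slide's notation, e.g. 'E < R < S'."""
    groups = {}
    for point, rank in zip(points, canon):
        groups.setdefault(rank, []).append(point)
    return " < ".join(" = ".join(groups[r]) for r in sorted(groups))


if __name__ == "__main__":
    for canon in orderings():
        print(render(canon))
```

The past perfect example above corresponds to the ordering rendered as `E < R < S`.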

14
Reichenbachian Tense Analysis
  • Tense is determined by the relation between R and S
  • R = S, R < S, R > S
  • Aspect is determined by the relation between E and R
  • E = R, E < R, E > R
  • Relation of E relative to S not crucial
  • Represent R < S = E as E > R < S
  • Only 7 out of 13 relations are realized in
    English
  • 6 different forms, simple future being ambiguous
  • Progressive no different from simple tenses
  • But: I was eating a peach does not entail I ate a peach

E > R < S
E < R > S
15

Priorean Tense Logic
  • Gφ: It is always going to be the case that φ.
    Hφ: It always has been the case that φ.
    Fφ: It will at some point in the future be the case that φ.
    Pφ: It was at some point in the past the case that φ.
    Fφ ≡ ¬G¬φ, Pφ ≡ ¬H¬φ
  • System Kt:
    (a) φ → HFφ: What is, has always been going to be
    (b) φ → GPφ: What is, will always have been
    (c) H(φ → ψ) → (Hφ → Hψ): Whatever always follows
    from what always has been, always has been
    (d) G(φ → ψ) → (Gφ → Gψ): Whatever always follows
    from what always will be, always will be.
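
A minimal sketch of the four operators, evaluated over a finite linear timeline 0..n-1. The finite timeline and the helper names (`atom`, `valid`) are assumptions made for illustration; Prior's logic itself is not restricted to finite models.

```python
# A formula is a function (t, n) -> bool: true or false at time t
# on a timeline with instants 0 .. n-1.

def atom(true_at):
    """Atomic proposition that holds exactly at the listed instants."""
    instants = set(true_at)
    return lambda t, n: t in instants


def F(phi):  # at some point in the future
    return lambda t, n: any(phi(u, n) for u in range(t + 1, n))


def P(phi):  # at some point in the past
    return lambda t, n: any(phi(u, n) for u in range(0, t))


def G(phi):  # always in the future
    return lambda t, n: all(phi(u, n) for u in range(t + 1, n))


def H(phi):  # always in the past
    return lambda t, n: all(phi(u, n) for u in range(0, t))


def Not(phi):
    return lambda t, n: not phi(t, n)


def Implies(a, b):
    return lambda t, n: (not a(t, n)) or b(t, n)


def valid(formula, n):
    """True if the formula holds at every instant of the timeline."""
    return all(formula(t, n) for t in range(n))
```

For example, with `sleep = atom({1, 3})` on a five-instant timeline, `valid(Implies(sleep, H(F(sleep))), 5)` checks axiom (a) on that model, and operators can be nested freely, as in `P(F(P(sleep)))`.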

16
Tense as Operator: Prior
  • Free iteration captures many more tenses,
  • I would have slept: PFPφ
  • But also expresses many non-NL tenses
  • PPPPφ: (It was the case)^4 John had slept

17
Event Classes (Lexical Aspect)
  • STATIVES: know, sit, be clever, be happy,
    killing, accident
  • can refer to the state itself (ingressive), John knows,
    or to entry into a state (inceptive), John
    realizes
  • *John is knowing Bill, *Know the answer, *What
    John did was know the answer
  • ACTIVITIES: walk, run, talk, march, paint
  • if it occurs in period t, a part of it (the same
    activity) must occur for most sub-periods of t
  • X is Ving entails that X has Ved
  • John ran for an hour, *John ran in an hour
  • ACCOMPLISHMENTS: build, cook, destroy
  • culminate (telic)
  • X is Ving does not entail that X has Ved.
  • John booked a flight in an hour, John stopped
    building a house
  • ACHIEVEMENTS: notice, win, blink, find, reach
  • instantaneous accomplishments
  • *John dies for an hour, *John wins for an hour,
    *John stopped reaching New York

18
Aspectual Composition
  • Expressions of one class can be transformed into
    one of another class by combining with another
    expression.
  • e.g., an activity can be changed into an
    accomplishment by adding an adverbial phrase
    expressing temporal or spatial extent
  • I walked (activity)
  • I walked to the station / a mile / home
    (accomplishment)
  • I built my house (accomplishment).
  • I built my house for an hour (activity).
  • Moens & Steedman (1988) implement aspectual
    composition in a transition network
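
The idea of class-changing composition can be sketched as a lookup table over (class, modifier) pairs. This toy table covers only the two transitions shown in the examples above; the modifier labels are assumptions, and it is not the Moens & Steedman network.

```python
# Toy transition table: (event class, modifier type) -> resulting class.
# Only the transitions illustrated in the slide's examples are included.
TRANSITIONS = {
    ("activity", "extent"): "accomplishment",        # I walked -> I walked a mile
    ("accomplishment", "for-duration"): "activity",  # I built my house -> ... for an hour
}


def compose(event_class, modifier):
    """Return the class after combining with the modifier (unchanged if no rule applies)."""
    return TRANSITIONS.get((event_class, modifier), event_class)
```

A fuller network would add more classes, more modifier types, and would chain transitions as further expressions are combined.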

19
Example: Classifying Question Verbs
  • Androutsopoulos's (2002) NLITDB system allows
    users to pose temporal questions in English to an
    airport database that uses a temporal extension
    of SQL
  • Verbs in single-clause questions with non-future
    meanings are treated as states
  • Does any tank contain oil?
  • Some verbs may be ambiguous between a (habitual)
    state and an accomplishment
  • Which flight lands on runway 2?
  • Does flight BA737 land on runway 2 this afternoon?
  • Activities are distinguished using the
    imperfective paradox
  • Were any flights taxiing? implies that they
    taxied
  • Were any flights taxiing to gate 2? does not
    imply that they taxied.
  • So, taxi will be given
  • an activity verb sense, one that doesn't expect a
    destination argument, and
  • an accomplishment verb sense, one that expects a
    destination argument.

20
Grammatical Aspect
  • Perfective focus on situation as a whole
  • John built a house
  • Imperfective focus on internal phases of
    situation
  • John was building a house

Perfective
  English: verbal tense and aspect morphemes, e.g., for present and past perfect
  French: tense (passé composé)
  Mandarin: morphemes le and guo
Imperfective
  English: progressive verbal inflection -ing
  French: tense (imparfait)
  Mandarin: progressive morpheme zai and resultative morpheme zhe
21
Inferring Temporal Relations
  1. Yesterday Holly was running a marathon when she
    twisted her ankle. FINISHES David had pushed her.
    BEFORE
  2. I had mailed the letter when John came & told me
    the news AFTER
  3. Simpson made the call at 3. Later, he was spotted
    driving towards Westwood. AFTER
  4. Max entered the room. Mary stood up/was seated on
    the desk. AFTER/OVERLAP
  5. Max stood up. John greeted him. AFTER
  6. Max fell. John pushed him. BEFORE
  7. Boutros-Ghali Sunday opened a meeting in Nairobi
    of ....He arrived in Nairobi from South Africa
    BEFORE
  8. John bought Mary some flowers. He picked out
    three red roses. DURING

22
Linguistic Information Needed for Temporal IE
  • Events
  • Tense
  • Aspect
  • Time adverbials
  • Explicit temporal signals (before, since, at,
    etc.)
  • Discourse Modeling
  • For disambiguation of time expressions based on
    context
  • For tracking sequences of events (tense/aspect
    shifts)
  • For computing Discourse Relations
  • Commonsense Knowledge
  • For inferring Discourse Relations
  • For inferring event durations

23
Narrative Ordering
  • Temporal Discourse Interpretation Principle
    (Dowty 1979)
  • Reference time for the current sentence is a time
    consistent with its time adverbials if any, or
    else it immediately follows reference time of the
    previous sentence.
  • The overlap of statives is a pragmatic
    inference,(hinting at a theory of defaults)
  • A man entered the White Hart. He was wearing a
    black jacket. Bill served him a beer.
  • Discourse Representation Theory (Kamp and Reyle
    1993)
  • In successive past tense sentences which lack
    temporal adverbials, events advance the narrative
    forward, while states do not.
  • Overlapping statives come out of semantic
    inference rules
  • Neither theory explicitly represents discourse
    relations, though they are needed (e.g., 6-8
    above)

24
Discourse Representation Theory (example)
  • A man entered the White Hart. He was wearing a
    black jacket. Bill served him a beer.

Rpt := {}
e1, t1, x, y
enter(e1, x, y), man(x), y = theWhiteHart
t1 < n, e1 ⊆ t1
Rpt := e1
----------------------------------------------------------
e2, t2, x1, y1
PROG(wear(e2, x1, y1)), black-jacket(y1), x1 = x
t2 < n, e2 ○ t2, e1 ⊆ e2
----------------------------------------------------------
e3, t3, x2, y2, z
serve(e3, x2, y2, z), beer(z), x2 = Bill, y2 = x
t3 < n, e3 ⊆ t3
Rpt := e3
e1 < e3
25
Overriding Defaults
  • Lascarides and Asher (1993): temporal ordering
    is derived entirely from discourse relations
    (that link together DRSs, based on the SDRT
    formalism).
  • Example
  • Max switched off the light. The room was pitch
    dark.
  • Default inference OVERLAP
  • Use an inference rule that if the room is dark
    and the light was just switched off, the
    switching off caused the room to become dark.
  • Inference AFTER
  • Problem requires large doses of world knowledge

LP 1993
26
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

27
Time and Events in Logic
Events
Time
Time
Time
Events
Events
Instants
Intervals
Intervals
Intervals
Instants
Instants
28
Instant Ontology
  • Consider the event of John's reading the book
  • Decompose into an infinite set of infinitesimal
    instants
  • Let T be a set of temporal instants.
  • Let < (BEFORE) be a temporal ordering relation
    between instants
  • Properties: irreflexive, antisymmetric,
    transitive, and complete
  • Antisymmetric => time has only one direction of
    movement
  • Irreflexive and Transitive => time is
    non-cyclical
  • Complete => < is a total ordering

29
Instants -- Problem Where Truth Values Change
  • P: The race is on
  • T-R: the time of running the race
  • T-AR: the time after running the race
  • T-R and T-AR have to meet somewhere
  • If we choose instants, there is some instant x
    where T-R and T-AR meet
  • Either we have P and not P both true at x, or
    there is a truth value gap at x
  • This is called the Divided Instant Problem
    (D.I.P.)

30
Ordering Relations on Intervals
  • Unlike instants, where we have only <, we can
    have at least 3 ordering relations on intervals
  • Precedence <: I1 < I2 iff ∀t1 ∈ I1, ∀t2 ∈ I2, t1
    < t2 (where < is defined over instants)
  • Temporal Overlap O: I1 O I2 iff I1 ∩ I2 ≠ ∅
  • Temporal Inclusion ⊑: I1 ⊑ I2 iff I1 ⊆ I2
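
These three definitions translate directly into code if an interval is modelled as a finite set of integer instants. That modelling choice, and the function names, are assumptions made only to keep the sketch small.

```python
def precedes(i1, i2):
    """I1 < I2: every instant of I1 is before every instant of I2."""
    return all(t1 < t2 for t1 in i1 for t2 in i2)


def overlaps(i1, i2):
    """I1 O I2: the two intervals share at least one instant."""
    return len(set(i1) & set(i2)) > 0


def included_in(i1, i2):
    """I1 is temporally included in I2: every instant of I1 is in I2."""
    return set(i1) <= set(i2)


# Example: the running covers instants 0..9, the ankle twist happens at 7.
running = set(range(0, 10))
twisting = {7}
```

With these values, `included_in(twisting, running)` and `overlaps(twisting, running)` both hold, while `precedes(twisting, running)` does not.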

31
Instants versus Intervals
  • Instants
  • We understand the idea of truth at an instant
  • In cases of continuous change, e.g., a tossed
    ball, we need a notion of a durationless event in
    order to explain the trajectory of the ball just
    before it falls
  • Intervals
  • We often conceive of time as broken up in terms
    of events which have a certain duration, rather
    than as a (infinite) sequence of durationless
    instants.
  • Many verbs do not describe instantaneous events,
    e.g., has read, ripened
  • Duration expressions like yesterday afternoon
    aren't construed as instants

32
Allen's Interval-Based Ontology
  • Instants are banished
  • So, avoids the divided instant problem
  • Short duration intervals will be instant-like
  • Uses 13 relations
  • Relations are mutually exclusive
  • All 13 relations can be expressed using meet
  • ∀X ∀Y Before(X, Y) ≡ ∃Z meet(X, Z) ∧ meet(Z, Y)

James F. Allen, Towards a General Theory of
Action and Time, Artificial Intelligence 23
(1984) 123-154.
33
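Allen's 13 Temporal Relations
<, > (before, after)
m, mi (meets, met-by)
o, oi (overlaps, overlapped-by)
s, si (starts, started-by)
f, fi (finishes, finished-by)
d, di (during, contains)
= (equal)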
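
A minimal sketch of how the relation between two intervals can be computed from their endpoints. It assumes each interval is a (start, end) pair of numbers with start < end, which is an illustrative encoding rather than Allen's own axiomatisation.

```python
def allen_relation(x, y):
    """Return the Allen relation of interval x to interval y.

    Each interval is a (start, end) pair with start < end.
    """
    (xs, xe), (ys, ye) = x, y
    if xe < ys:
        return "<"    # x before y
    if ye < xs:
        return ">"    # x after y
    if xe == ys:
        return "m"    # x meets y
    if ye == xs:
        return "mi"   # x is met by y
    if xs == ys and xe == ye:
        return "="    # equal
    if xs == ys:
        return "s" if xe < ye else "si"   # starts / started-by
    if xe == ye:
        return "f" if xs > ys else "fi"   # finishes / finished-by
    if ys < xs and xe < ye:
        return "d"    # x during y
    if xs < ys and ye < xe:
        return "di"   # x contains y
    return "o" if xs < ys else "oi"       # overlaps / overlapped-by
```

Because the tests are tried in order and cover every endpoint configuration, exactly one relation is returned for any pair, matching the mutual exclusivity of the 13 relations.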
34
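Composition table: row = relation between A and B, column = relation between B and C, entry = possible relations between A and C (? = no information)

r1\r2 | <  | >  | d           | di | o           | oi          | m           | mi          | s           | si          | f           | fi
<     | <  | ?  | < o m d s   | <  | <           | < o m d s   | <           | < o m d s   | <           | <           | < o m d s   | <
>     | ?  | >  | > oi mi d f | >  | > oi mi d f | >           | > oi mi d f | >           | > oi mi d f | >           | >           | >
d     | <  | >  | d           | ?  | < o m d s   | > oi mi d f | <           | >           | d           | > oi mi d f | d           | < o m d s
(remaining rows: di, o, oi, m, mi, s, si, f, fi)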
35
Temporal Closure: SputLink in TANGO
Verhagen (2005)
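
Temporal closure adds the relations that follow from those already annotated. The sketch below handles only BEFORE, using transitivity; it is a simplified illustration and not the SputLink algorithm, which works over the full set of interval relations.

```python
def before_closure(pairs):
    """Transitive closure of a set of (a, b) pairs meaning 'a BEFORE b'."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                # a BEFORE b and b BEFORE d  =>  a BEFORE d
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


def is_consistent(closed):
    """An event inferred to be before itself signals contradictory input."""
    return all(a != b for a, b in closed)
```

Given the annotations push BEFORE twist and twist BEFORE stop, the closure also contains push BEFORE stop, so annotators do not need to mark that link by hand.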
36
AI Reasoning about Events
  • John gave a book to Mary
  • Situation Calculus
  • Holds(Have(John, book), t1)
  • Holds(Have(Mary, book), t2)
  • Holds(Have(Z, Y), Result(give(X, Y, Z), t))
  • t-i are states
  • Concurrent actions cannot be represented
  • No duration of actions or delayed effects
  • Event Calculus
  • HoldsAt(Have(J, B), t1)
  • HoldsAt(Have(M, B), t2)
  • Terminates(e1, Have(J, B))
  • Initiates(e1, Have(M, B))
  • Happens(e, t)
  • t is a time point
  • Involves non-monotonic reasoning
  • Handles frame problem using circumscription
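
The Event Calculus predicates above can be illustrated with a tiny procedural sketch. The time point 5 for the giving event and the encoding of fluents as strings are assumptions, and the loop below only mimics inertia for this one narrative; it is not a circumscription-based reasoner.

```python
# Narrative: one 'give' event at an assumed time point 5.
happens = [("give", 5)]
initiates = {"give": {"Have(Mary, book)"}}
terminates = {"give": {"Have(John, book)"}}
initially = {"Have(John, book)"}


def holds_at(fluent, t):
    """HoldsAt(fluent, t): a fluent persists until an earlier event changes it."""
    holds = fluent in initially
    for event, when in sorted(happens, key=lambda h: h[1]):
        if when >= t:
            break  # effects only hold strictly after the event
        if fluent in initiates.get(event, ()):
            holds = True
        if fluent in terminates.get(event, ()):
            holds = False
    return holds
```

Here John has the book at any time up to and including 5, and Mary has it at any later time.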

37
Temporal Question-Answering using IE Event
Calculus
  • Mueller (2004): Takes instantiated MUC terrorist
    event templates and represents the information in EC
  • Adds commonsense knowledge about terrorist domain
  • e.g., if a bomb explodes, it's no longer
    activated
  • Commonsense knowledge includes frame axioms
  • e.g., if an object starts falling, then its
    height will be released from the commonsense law
    of inertia
  • Example temporal questions
  • Was the car dealership damaged before the
    high-power bombs exploded? Ans: No.
  • Requires reasoning that the damage did not occur
    at all times t prior to the explosion
  • Problem requires large doses of world knowledge

Mueller, Erik T. (2004). Understanding
script-based stories using commonsense
reasoning. Cognitive Systems Research, 5(4),
307-340.
38
Temporal Question Answering using IE Temporal
Databases
  • In NLITDB, semantic relation between a question
    event and the adverbial it combines with is
    inferred by a variety of inference rules.
  • State + point adverbial
  • Which flight was queueing for runway 2 at 5:00
    pm?
  • state coerced to an achievement, viewed as
    holding at the time specified by the adverbial.
  • Activity + point adverbial
  • can mean that the activity holds at that time, or
    that the activity starts at that time, e.g.,
    Which flight queued for runway 2 at 5:00 pm?
  • An accomplishment may indicate inception or
    termination
  • Which flight taxied to gate 4 at 5:00 pm? can
    mean the taxiing starts or ends at 5 pm.

39
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

40
IE Methodology
[Figure: IE methodology pipeline. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply]
41
Events in NLP
  • Topic: well-defined subject for searching
  • document- or collection-level
  • Template: structure with slots for participant
    named entities
  • document-level
  • Mention: linguistic expression that expresses an
    underlying event
  • phrase-level (verb/noun)

42
Event Characteristics
  • Can have temporal a/o spatial locations
  • Can have types
  • assassinations, bombings, joint ventures, etc.
  • Can have members
  • Can have parts
  • Can have people a/o other objects as participants
  • Can be hypothetical
  • Can have not happened

43
MUC Event Templates
Wall Street Journal, 06/15/88 MAXICARE HEALTH
PLANS INC and UNIVERSAL HEALTH SERVICES INC have
dissolved a joint venture which provided health
services.
44
ACE Event Templates
Type         Subtype
Life         Be-Born, Marry, Divorce, Injure, Die
Movement     Transport
Transaction  Transfer-Ownership, Transfer-Money
Business     Start-Org, Merge-Org, Declare-Bankruptcy, End-Org
Conflict     Attack, Demonstrate
Contact      Meet, Phone-Write
Personnel    Start-Position, End-Position, Nominate, Elect
Justice      Arrest-Jail, Release-Parole, Trial-Hearing, Charge-Indict, Sue, Convict, Sentence, Fine, Execute, Extradite, Acquit, Appeal, Pardon
  • Four additional attributes for each event mention
  • Polarity (it did or did not occur)
  • Tense (past, present, future)
  • Modality (real vs. hypothetical)
  • Genericity (specific vs. generic)
  • Argument slots (4-7) specific to each event
  • E.g., Trial-Hearing event has slots for the
    Defendant, Prosecutor, Adjudicator, Crime, Time,
    and Place.

From Lisa Ferro _at_MITRE
45
Mention-Level Events
  • Event expressions
  • tensed verbs: has left, was captured, will
    resign
  • stative adjectives: sunken, stalled, on board
  • event nominals: merger, Military Operation, war
  • Dependencies between events and times
  • Anchoring: John left on Monday.
  • Orderings: The party happened after midnight.
  • Embedding: John said Mary left.

46
TIMEX2 (TIDES/ACE) Annotation Scheme
  • Time Points: <TIMEX2 VAL="2000-W42">the third week
    of October</TIMEX2>
  • Durations: <TIMEX2 VAL="PT30M">half an hour
    long</TIMEX2>
  • Indexicality: <TIMEX2 VAL="2000-10-04">tomorrow</TIMEX2>
  • He wrapped up a <TIMEX2 VAL="PT3H"
    ANCHOR_DIR="WITHIN" ANCHOR_VAL="1999-07-15">three-hour</TIMEX2>
    meeting with the Iraqi president in
    Baghdad <TIMEX2 VAL="1999-07-15">today</TIMEX2>.
  • Sets: <TIMEX2 VAL="XXXX-WXX-2" SET="YES"
    PERIODICITY="F1W" GRANULARITY="G1D">every
    Tuesday</TIMEX2>
  • Fuzziness: <TIMEX2 VAL="1990-SU">Summer of 1990</TIMEX2>
  • <TIMEX2 VAL="1999-07-15TMO">This
    morning</TIMEX2>
  • <TIMEX2 VAL="2000-10-31TNI" MOD="START">early
    last night</TIMEX2>
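
Indexical expressions such as tomorrow can only be given a VAL relative to a reference date, typically the document date. A minimal sketch, assuming a reference date is supplied by the caller; the function names and the three-word lexicon are illustrative and do not come from the TIMEX2 guidelines.

```python
from datetime import date, timedelta

# Hypothetical lexicon: offset in days from the reference date.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def normalize(expr, reference):
    """Resolve an indexical day expression to an ISO date string."""
    days = OFFSETS[expr.lower()]
    return (reference + timedelta(days=days)).isoformat()


def tag(expr, reference):
    """Wrap the expression in a TIMEX2 tag carrying the resolved VAL."""
    return f'<TIMEX2 VAL="{normalize(expr, reference)}">{expr}</TIMEX2>'
```

For instance, `tag("tomorrow", date(2000, 10, 3))` yields the Indexicality annotation shown above, under the assumption that the document was dated 3 October 2000.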

47
TIMEX2 Inter-annotator Agreement
  • Georgetown/MITRE (2001)
  • 193 English docs, .79 F Extent, .86 F VAL
  • 5 annotators
  • Annotators deviate from guidelines, and produce
    systematic errors (fatigue?)
  • several years ago: PXY instead of PAST_REF
  • all day: P1D instead of YYYY-MM-DD
  • LDC (2004)
  • 49 English docs, .85 F Extent, .80F VAL
  • 19 Chinese docs, .83 Extent
  • 2 annotators

48
Example of Annotator Difficulties (TERN 2004)
Time Expression Recognition and Normalization
Competition (timex2.mitre.org)
49
TIMEX2: A Mature Standard
  • Extensively debugged
  • Detailed guidelines for English and Chinese
  • Evaluated for English, Arabic, Chinese, Korean,
    Spanish, French, Swedish, and Hindi
  • Applied to news, scheduling dialogues, other
    types of data
  • Corpora available through ACE, MITRE

50
Temporal Relations in ACE
  • Restricted to verbal events (verbs of scheduling,
    occurrence, aspect etc.)
  • The event and the timex must be in the same
    sentence
  • Eight temporal relations
  • Within
  • The bombing occurred during the night.
  • Holds
  • They were meeting all night.
  • Starting, Ending
  • The talks ended (on) Monday.
  • Before, After
  • The initial briefs have to be filed by 4 p.m.
    Tuesday
  • At-Beginning, At-End
  • Sharon met with Bill at the start of the
    three-day conference

From Lisa Ferro _at_MITRE
51
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

52
TimeML Annotation Scheme
  • A Proposed Metadata Standard for Markup of
    events, their temporal anchoring, and how they
    are related to each other
  • Marks up mention-level events, time expressions,
    and links between events (and events and times)
  • Developer: James Pustejovsky (& co.)

53
An Example Story
  • Feb. 18, 2004
  • Yesterday Holly was running a marathon when she
    twisted her ankle. David had pushed her.

  1. When did the running occur? Yesterday.
  2. When did the twisting occur? Yesterday, during the
    running.
  3. Did the pushing occur before the twisting? Yes.
  4. Did Holly keep running after twisting her ankle?
    Probably not.
54
An Attested Story
  • AP-NR-08-15-90 1337EDT
  • Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi
    border, today sought peace on another front by promising to
    withdraw from Iranian territory and release soldiers captured
    during the Iran-Iraq war. Also today, King Hussein of Jordan
    arrived in Washington seeking to mediate the Persian Gulf
    crisis. President Bush on Tuesday said the United States may
    extend its naval quarantine to Jordan's Red Sea port of Aqaba
    to shut off Iraq's last unhindered trade route.
  • Past < Tuesday < Today < Indef Future
  • Past: war, captured
  • Tuesday: said
  • Today: sought, arrived
  • Indef Future: withdraw, release, extend, quarantine

55
TimeML Events
  • AP-NR-08-15-90 1337EDT
  • Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi
    border, today sought peace on another front by promising to
    withdraw from Iranian territory and release soldiers captured
    during the Iran-Iraq war. Also today, King Hussein of Jordan
    arrived in Washington seeking to mediate the Persian Gulf
    crisis. President Bush on Tuesday said the United States may
    extend its naval quarantine to Jordan's Red Sea port of Aqaba
    to shut off Iraq's last unhindered trade route.
  • In another mediation effort, the Soviet Union said today it had
    sent an envoy to the Middle East on a series of stops to include
    Baghdad. Soviet officials also said Soviet women, children and
    invalids would be allowed to leave Iraq.

56
TimeML Event Classes
  • Occurrence
  • die, crash, build, merge, sell, take advantage
    of, ..
  • State
  • Be on board, kidnapped, recovering, love, ..
  • Reporting
  • Say, report, announce,
  • I-Action
  • Attempt, try,promise, offer
  • I-State
  • Believe, intend, want,
  • Aspectual
  • begin, start, finish, stop, continue.
  • Perception
  • See, hear, watch, feel.

57
Temporal Anchoring Links
  • AP-NR-08-15-90 1337EDT
  • Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi
    border, today sought peace on another front by promising to
    withdraw from Iranian territory and release soldiers captured
    during the Iran-Iraq war. Also today, King Hussein of Jordan
    arrived in Washington seeking to mediate the Persian Gulf
    crisis. President Bush on Tuesday said the United States may
    extend its naval quarantine to Jordan's Red Sea port of Aqaba
    to shut off Iraq's last unhindered trade route.
  • In another mediation effort, the Soviet Union said today it had
    sent an envoy to the Middle East on a series of stops to include
    Baghdad. Soviet officials also said Soviet women, children and
    invalids would be allowed to leave Iraq.

58
TLINK Types
  • Simultaneous (happening at the same time)
  • Identical (referring to the same event)
  • John drove to Boston. During his drive he ate a
    donut.
  • Before the other
  • In six of the cases suspects have already been
    arrested.
  • Immediately before the other
  • All passengers died when the plane crashed into
    the mountain.
  • Including the other
  • John arrived in Boston last Thursday.
  • Exhaustively during the duration of the other
  • John taught for 20 minutes.
  • Beginning of the other
  • John was in the gym between 6:00 p.m. and 7:00
    p.m.
  • Ending of the other
  • John was in the gym between 6:00 p.m. and 7:00
    p.m.

59
TLINK Example
John taught 20 minutes every Monday.

John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">20 minutes</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">every Monday</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
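
Once such markup is well-formed XML, the links can be read off with a standard parser. The sketch below wraps the example in a root element named TimeML and writes the 20-minute duration in ISO 8601 form (PT20M); both choices, and the function name `tlinks`, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SNIPPET = """<TimeML>
John <EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">20 minutes</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">every Monday</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
</TimeML>"""


def tlinks(xml_text):
    """Return (source, relType, target) triples for every TLINK element."""
    root = ET.fromstring(xml_text)
    triples = []
    for link in root.iter("TLINK"):
        # A TLINK's source is an event instance or a time; likewise its target.
        source = link.get("eventInstanceID") or link.get("timeID")
        target = link.get("relatedToTime") or link.get("relatedToEventInstance")
        triples.append((source, link.get("relType"), target))
    return triples
```

Calling `tlinks(SNIPPET)` gives the two links of the example in document order, ready to feed into closure or question answering.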
60
Subordinated Links
  • AP-NR-08-15-90 1337EDT
  • Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi
    border, today sought peace on another front by promising to
    withdraw from Iranian territory and release soldiers captured
    during the Iran-Iraq war. Also today, King Hussein of Jordan
    arrived in Washington seeking to mediate the Persian Gulf
    crisis. President Bush on Tuesday said the United States may
    extend its naval quarantine to Jordan's Red Sea port of Aqaba
    to shut off Iraq's last unhindered trade route.
  • In another mediation effort, the Soviet Union said today it had
    sent an envoy to the Middle East on a series of stops to include
    Baghdad. Soviet officials also said Soviet women, children and
    invalids would be allowed to leave Iraq.

61
SLINK Types
SLINK, or Subordination Link, is used for contexts
introducing relations between two events, or an
event and a signal, of the following sorts:
  • Modal: Relation introduced mostly by modal verbs
    (should, could, would, etc.) and events that
    introduce a reference to a possible world,
    mainly I_STATEs
  • John should have bought some wine.
  • Mary wanted John to buy some wine.
  • Factive: Certain verbs introduce an entailment
    (or presupposition) of the argument's veracity.
    They include forget in the tensed complement,
    regret, manage
  • John forgot that he was in Boston last year.
  • Mary regrets that she didn't marry John.
  • Counterfactive: The event introduces a
    presupposition about the non-veracity of its
    argument: forget (to), unable to (in past tense),
    prevent, cancel, avoid, decline, etc.
  • John forgot to buy some wine.
  • John prevented the divorce.
  • Evidential: Evidential relations are introduced by
    REPORTING or PERCEPTION
  • John said he bought some wine.
  • Mary saw John carrying only beer.
  • Negative evidential: Introduced by REPORTING (and
    PERCEPTION?) events conveying negative polarity
  • John denied he bought only beer.
  • Negative: Introduced only by negative particles
    (not, nor, neither, etc.), which will be marked as
    SIGNALs, with respect to the events they are
    modifying
  • John didn't forget to buy some wine.
  • John did not want to marry Mary.
62
Aspectual Links
  • The U.S. military buildup in Saudi Arabia
    continued at fever pace, with Syrian troops now
    part of a multinational force camped out in the
    desert to guard the Saudi kingdom from any new
    threat by Iraq.
  • In a letter to President Hashemi Rafsanjani of
    Iran, read by a broadcaster over Baghdad radio,
    Saddam said he will begin withdrawing troops from
    Iranian territory a week from tomorrow and
    release Iranian prisoners of war.

63
Towards TIMEX3
  • Decompose more
  • Smaller tag extents compared to TIMEX2
  • <TIMEX2 ID="t28" VAL="2000-10-02">just days after
    another court dismissed other corruption charges
    against his father</TIMEX2>.
  • N.B.: extent marking was a source of inter-annotator
    disagreements in the ACE TERN 2004 evaluation
  • Avoid tag embedding
  • <TIMEX2 VAL="1999-08-03">two weeks from <TIMEX2
    VAL="1999-07-20">next Tuesday</TIMEX2></TIMEX2>
  • Include temporal functions for delayed evaluation
  • Allow non-consuming tags
  • Put relationships in Links

64
TIMEX3 Annotation
  • Time Points
  • <TIMEX3 tid="t1" type="TIME" value="T24:00">midnight</TIMEX3>
  • <TIMEX3 tid="t2" type="DATE" value="2005-02-15"
    temporalFunction="TRUE" anchorTimeID="t0">tomorrow</TIMEX3>
  • Durations
  • <TIMEX3 tid="t6" type="DURATION" value="P2W"
    beginPoint="t61" endPoint="t62">two
    weeks</TIMEX3> from <TIMEX3 tid="t61" type="DATE"
    value="2003-06-07">June 7, 2003</TIMEX3>
  • <TIMEX3 tid="t62" type="DATE" value="2003-06-21"
    temporalFunction="true" anchorTimeID="t6"/>
  • Sets
  • <TIMEX3 tid="t1" type="SET" value="P1M"
    quant="EVERY" freq="P3D">three days every month</TIMEX3>
  • <TIMEX3 tid="t1" type="SET" value="P1M"
    freq="P2X">twice a month</TIMEX3>
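
The non-consuming end point in the Durations example (t62) is the kind of value a temporal function computes from a begin point and a duration. A minimal sketch, assuming only week durations of the form PnW need to be handled; the function name is hypothetical.

```python
import re
from datetime import date, timedelta


def end_point(begin_iso, duration):
    """Add a week-based ISO 8601 duration (e.g. 'P2W') to an ISO date."""
    match = re.fullmatch(r"P(\d+)W", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration}")
    begin = date.fromisoformat(begin_iso)
    return (begin + timedelta(weeks=int(match.group(1)))).isoformat()
```

Applied to the begin point 2003-06-07 and the duration P2W, this reproduces the value 2003-06-21 carried by the end-point tag.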

65
TimeML and DAML-Time Ontology
  • We [ship]e1 [2 days]t1 after the [purchase]e2
  • TimeML
  • <TLINK eventInstanceID="e1" relatedToTime="t1"
    relType="BEGINS"/>
  • <TLINK eventInstanceID="e1" relatedToEventInstance="e2"
    relType="AFTER"/>
  • DAML-OWL
  • atTime(e1, t1) ∧ atTime(e2, t2) ∧ after(t1, t2) ∧
    timeBetween(T, t1, t2) ∧ duration(T, Days) = 2

Hobbs & Pustejovsky, in I. Mani et al., eds.,
The Language of Time
66
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

67
IE Methodology
[Figure: IE methodology pipeline. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply]
68
Callisto Annotation Tool
69
Tabular Annotation of Links
70
TANGO Graphical Annotator
71
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

72
IE Methodology
[Figure: IE methodology pipeline. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply]
73
Timex2/3 Extraction
  • Accuracy
  • Best systems (TIMEX2): .95F Extent, .8XF VAL
    (TERN 2004, English)
  • GUTime: .85F Extent, .82F VAL (TERN 2004 training
    data, English)
  • KTX: .87F Extent, .86F VAL (100 Korean
    documents)
  • Machine Learning
  • Tagging Extent easily trained
  • Normalizing Values harder to train

74
TimeML Event Extraction
  • Easier than MUC template events (those were .6F)
  • Part-of-speech tagging to find verbs
  • Lexical patterns to detect tense and lexical and
    grammatical aspect
  • Syntactic rules to determine subordination
    relations
  • Recognition and Disambiguation of event nominals,
    e.g., war, building, construction, etc.
  • Evita (Brandeis)
  • 0.8 F on verbal events (overgenerates generic
    events, which weren't marked in TimeBank)
  • 0.64 F on event nominals (WordNet-derived,
    disambiguated via SemCor training)

75
TempEx in Qanda
76
Extracting Temporal Relations based on Tense
Sequences
  • Song & Cohen (1991): adopt a Reichenbachian tense
    representation
  • Use rules for permissible tense sequences
  • When the tense moves from simple present to
    simple past, the event time moves backward, and
    from simple present to simple future, it moves
    forward.
  • When the tense of two successive sentences is the
    same, they argue that the event time moves
    forward, except for statives and unbounded
    processes, which keep the same time.
  • Won't work in cases of discourse moves
  • When the tense moves from present perfect to
    simple past, or present prospective (John is
    going to run) to simple future, the event time of
    the second sentence is less than or equal to the
    event time of the first sentence.
  • However, this incorrectly rules out, among others,
    present tense to past perfect transitions.

Song & Cohen, AAAI'91
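
The tense-sequence rules on this slide can be written as a small lookup function. This is a minimal sketch of the rules as summarized here, not Song and Cohen's implementation; the tense labels, the situation-type labels and the function name are assumptions made for illustration.

```python
# Situation types that keep the same event time when the tense repeats.
STATIVE_LIKE = {"state", "unbounded_process"}

def event_time_shift(prev_tense, next_tense, next_type="event"):
    """Relate the event time of the second sentence to that of the first.

    Returns 'backward', 'forward', 'same', 'before_or_equal', or None
    when the slide's rules do not license the tense sequence.
    """
    if prev_tense == "simple_present" and next_tense == "simple_past":
        return "backward"
    if prev_tense == "simple_present" and next_tense == "simple_future":
        return "forward"
    if (prev_tense, next_tense) in {
        ("present_perfect", "simple_past"),
        ("present_prospective", "simple_future"),
    }:
        return "before_or_equal"
    if prev_tense == next_tense:
        return "same" if next_type in STATIVE_LIKE else "forward"
    return None  # sequence not covered by the rules
```

A present tense to past perfect transition returns None here, which mirrors the limitation noted in the last bullet.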
77
Extracting Temporal Relations by Heuristic Rule
Weighting
  • Approach assigns weights to different ordering
    possibilities based on the knowledge sources
    involved.
  • Temporal adverbials and discourse cues are tried
    first; if neither is present, then default rules
    based on tense and aspect are used.
  • Given a sentence describing a past tense activity
    followed by one describing a past tense
    accomplishment or achievement, the second event
    can only occur just after the activity; it can't
    precede, overlap, or be identical to it.
  • If the ordering is still ambiguous at the end of
    this, semantic rules are used based on modeling
    the discourse in terms of threads.
  • Assumes there is one thread that the discourse
    is currently following.
  • a. John went into the florist shop.
  • b. He had promised Mary some flowers.
  • c. She said she wouldn't forgive him if he
    forgot.
  • d. So he picked out three red roses.
  • Each utterance is associated with exactly one of
    two threads:
  • (i) going into the florist's shop and
  • (ii) interacting with Mary.
  • Prefer an utterance to continue a current thread
    which has the same tense or is semantically
    related to it
  • (i) would be continued by d. based on tense

Janet Hitzeman, Marc Moens, and Claire Grover,
Algorithms for Analysing the Temporal Structure
of Discourse, EACL 1995, 253-60.
78
Heuristic Rules (Georgetown GTag)
  • Uses 187 hand-coded rules
  • LHS: tests based on TimeML-related features and
    POS tags
  • RHS: TimeML TLINK classes (13 Allen relations)
  • Ordered into classes
  • R1-2: event anchored w/o signal to time in same
    clause
  • R3 (28): main clause event in 2 successive
    sentences
  • R4: reporting verb and document time
  • R5 (54): reporting verb and event in same
    sentence
  • R6 (87): events in same sentence
  • R7: timex linked to document time
  • Rules can have confidence
  • ruleNum=6-6
  • If sameSentence=YES
  • sentenceType=ANY
  • conjBetweenEvents=YES
  • arg1.class=EVENT
  • arg2.class=EVENT
  • arg1.tense=PAST
  • arg2.tense=PAST
  • arg1.aspect=NONE
  • arg2.aspect=NONE
  • arg1.pos=VB
  • arg2.pos=VB
  • arg1.firstVbEvent=ANY
  • arg2.firstVbEvent=ANY
  • then infer relation=BEFORE
  • Confidence = 1.0
  • Comment: they traveled far and slept the night
    in a rustic inn
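
Rule 6-6 can be read as a set of attribute tests followed by a conclusion. The sketch below encodes it as a dictionary and applies it to a feature description of an event pair. The data layout and the function name are assumptions for illustration; the actual GTag rule format is not shown on the slide.

```python
ANY = "ANY"  # wildcard: the condition accepts every value

RULE_6_6 = {
    "conditions": {
        "sameSentence": "YES", "sentenceType": ANY,
        "conjBetweenEvents": "YES",
        "arg1.class": "EVENT", "arg2.class": "EVENT",
        "arg1.tense": "PAST", "arg2.tense": "PAST",
        "arg1.aspect": "NONE", "arg2.aspect": "NONE",
        "arg1.pos": "VB", "arg2.pos": "VB",
        "arg1.firstVbEvent": ANY, "arg2.firstVbEvent": ANY,
    },
    "relation": "BEFORE",
    "confidence": 1.0,
}

def apply_rule(rule, features):
    """Return (relation, confidence) if every LHS test passes, else None."""
    for key, wanted in rule["conditions"].items():
        if wanted != ANY and features.get(key) != wanted:
            return None
    return rule["relation"], rule["confidence"]

# Hand-built features for "they traveled far and slept the night in a rustic inn".
pair = {
    "sameSentence": "YES", "sentenceType": "DECLARATIVE",
    "conjBetweenEvents": "YES",
    "arg1.class": "EVENT", "arg2.class": "EVENT",
    "arg1.tense": "PAST", "arg2.tense": "PAST",
    "arg1.aspect": "NONE", "arg2.aspect": "NONE",
    "arg1.pos": "VB", "arg2.pos": "VB",
}
result = apply_rule(RULE_6_6, pair)
```

With these hand-built features the rule fires and yields BEFORE with confidence 1.0; changing either tense makes it fail.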

79
Using Web-Mined Rules
  • Lexical relations (capturing causal and other
    relations, etc.)
  • kill => die (always)
  • push => fall (sometimes: Max fell. John pushed
    him.)
  • Idea: leverage the distributions found in large
    corpora
  • VerbOcean: database from ISI that contains
    lexical relations mined from Google searches
  • E.g., X happens before Y, where X and Y are
    WordNet verbs highly associated in a corpus
  • Converted to GUTenLink format
  • Yields 4199 rules!
  • ruleNum=8-3991
  • If arg1.class=EVENT
  • arg2.class=EVENT
  • arg1.word=learn (uses morph normalization)
  • arg2.word=forget
  • then infer relation=BEFORE
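
A lexical rule such as 8-3991 amounts to a lookup on normalized verb forms. The following sketch shows one way a happens-before pair might be turned into a rule and applied. The tiny lemma table stands in for real morphological normalization, and all names are hypothetical.

```python
# Toy lemma table standing in for morphological normalization (assumption).
LEMMAS = {"learned": "learn", "learnt": "learn",
          "forgot": "forget", "forgotten": "forget"}

def lemma(word):
    """Map an inflected verb form to its lemma (toy lookup only)."""
    w = word.lower()
    return LEMMAS.get(w, w)

def make_rule(rule_num, verb1, verb2):
    """Build a rule from a 'verb1 happens before verb2' pair."""
    return {"ruleNum": rule_num, "arg1.word": verb1,
            "arg2.word": verb2, "relation": "BEFORE"}

def infer(rule, word1, word2):
    """Return the rule's relation if both event words match after lemmatizing."""
    if lemma(word1) == rule["arg1.word"] and lemma(word2) == rule["arg2.word"]:
        return rule["relation"]
    return None

rule_8_3991 = make_rule("8-3991", "learn", "forget")
relation = infer(rule_8_3991, "learned", "forgot")
```

The rule matches only in the stated argument order, so swapping the two verbs yields no inference.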

80
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

81
IE Methodology
(Flow diagram. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply)
82
Related Machine Learning Work
  • Li et al. (ACL 2004) obtained 78-88% accuracy on
    ordering within-sentence temporal relations in
    Chinese texts.
  • Mani et al. (HLT 2003, short paper) obtained an 80.2%
    F-measure training a decision tree on 2069
    clauses in anchoring events to reference times
    that were inferred for each clause.
  • Lapata and Lascarides (NAACL 2004) used found
    data to successfully learn which (possibly
    ambiguous) temporal markers connect a main and
    subordinate clause, without inferring underlying
    temporal relations.

83
Car Sim: Text to Accident Simulation System
  • Carries out TimeML annotation of Swedish accident
    reports
  • Builds an event ordering graph using machine
    learning, with separate decision trees for local
    and global TLINKS
  • Generates, based on domain knowledge, a
    simulation of the accident

Anders Berglund. Extracting Temporal Information
and Ordering Events for Swedish. MS Thesis. Lund
University. 2004.
84
Prior Machine Learning from TimeBank
  • Mani (p.c., 2004)
  • TLINKs converted into feature vectors from
    TimeBank 1.0 tags
  • TLINK relType converted to feature vector class
    label, after collapsing
  • Accuracy of C5.0.1 decision rules: .55 F
  • majority class
  • Boguraev & Ando (IJCAI 2005)
  • Uses features based on local syntactic context
    (chunks and clause structure)
  • Trained a classifier for within-sentence TLINKs
    on TimeBank 1.1: .53 F
  • Bottom line: TimeBank corpus doesn't provide
    enough data for training learners?

85
Insight: TLINK Annotation (Humans)
  • Inter-annotator reliability is .55 F
  • But agreement on LINK labels is 77%
  • So, the problem is largely which events to link
  • Within sentence, adjacent sentences, across
    document?
  • Guidelines aren't that helpful
  • Conclusion: global TLINKing is too fatiguing
  • 0.84 TLINKs/event in corpus

86
Temporal Reasoning to the Rescue
  • Earlier experiments with SputLINK in TANGO
    (interactive, text-segmented closure) indicated
    that without closure, annotators cover 4% of all
    possible links.
  • With closure, an annotator could cover about 65%
    of all possible links in a document.
  • Of those links, 84% were derived by the algorithm

Initial Links: 36 (4%)
User Prompts: 109 (12%)
Derived Links: 775 (84%)
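
The effect of closure can be illustrated with the simplest axiom, transitivity of BEFORE. The sketch below is a naive fixed-point computation over a toy link set; it is not SputLINK, whose axiom set is not given on this slide, and the event names are made up.

```python
def close_before(links):
    """Naive transitive closure: if a BEFORE b and b BEFORE c, add a BEFORE c."""
    closed = set(links)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

# Three annotated links forming a chain of four events (toy data).
initial = {("e1", "e2"), ("e2", "e3"), ("e3", "e4")}
closed = close_before(initial)
derived = closed - initial
```

Here three annotated links yield three further derived links, which is the kind of gain the counts above report on a much larger scale.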
87
IE Methodology
(Flow diagram. Components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply, Axioms)
88
Temporal Closure as an Oversampling Method
Corpus: 186 TimeBank 1.2.1 + 73 Opinion Corpus
  • Closing the corpus (with 745 axioms)
  • Number of TLINKs goes up more than 11 times!
  • BEFORE links go up from 3170 Event-Event and 1229
    Event-Time TLINKs to 68,585 Event-Event and
    18,665 Event-Time TLINKs
  • Before closure: 0.84 TLINKs/event
  • After closure: 9.49 TLINKs/event

12750 Events, 2114 Times
Relation        Event-Event      Event-Time
IBEFORE         131              15
BEGINS          160              112
ENDS            208              159
SIMULTANEOUS    1528             77
INCLUDES        950              3001 (65.3%)
BEFORE          3170 (51.6%)     1229
TOTAL           6147             4593
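
The densities and percentages on this slide follow directly from the table. The short calculation below recomputes them from the counts shown; the variable names are just for illustration.

```python
# Counts copied from the table (before closure).
event_event = {"IBEFORE": 131, "BEGINS": 160, "ENDS": 208,
               "SIMULTANEOUS": 1528, "INCLUDES": 950, "BEFORE": 3170}
event_time = {"IBEFORE": 15, "BEGINS": 112, "ENDS": 159,
              "SIMULTANEOUS": 77, "INCLUDES": 3001, "BEFORE": 1229}
events = 12750

ee_total = sum(event_event.values())              # Event-Event TLINKs
et_total = sum(event_time.values())               # Event-Time TLINKs
density_before = (ee_total + et_total) / events   # TLINKs per event
before_share = event_event["BEFORE"] / ee_total   # majority class, Event-Event
includes_share = event_time["INCLUDES"] / et_total  # majority class, Event-Time
growth = 9.49 / density_before                    # density after / before closure
```

The column sums reproduce the TOTAL row, the density rounds to 0.84 TLINKs/event, and the ratio to the post-closure density of 9.49 is a little over 11.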
89
ML Results
  • Features: each TLINK is a feature vector
  • For each event in the pair:
  • event-class: occurrence, state, reporting,
    i-action, i-state, aspectual, perception
  • aspect: progressive, perfective,
    progressive_perfective
  • modality: nominal
  • negation: nominal
  • string: string
  • tense: present, past, future
  • signal: string
  • shiftAspect: boolean
  • shiftTense: boolean
  • class: SIMULTANEOUS, IBEFORE, BEFORE, BEGINS,
    ENDS, INCLUDES
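
The feature list above can be pictured as a flat dictionary per TLINK. The sketch below builds one from two event descriptions. The key names, the "NONE" default, and the reading of shiftAspect and shiftTense as "the two events differ in aspect or tense" are assumptions; the slide does not define them.

```python
PER_EVENT_ATTRS = ("eventClass", "aspect", "modality",
                   "negation", "string", "tense")

def tlink_features(e1, e2, signal="", label=None):
    """Build one feature vector (as a dict) for a TLINK between e1 and e2."""
    feats = {}
    for prefix, ev in (("e1", e1), ("e2", e2)):
        for attr in PER_EVENT_ATTRS:
            feats[f"{prefix}.{attr}"] = ev.get(attr, "NONE")
    feats["signal"] = signal
    # Assumed meaning: True when the two events differ on that attribute.
    feats["shiftAspect"] = e1.get("aspect", "NONE") != e2.get("aspect", "NONE")
    feats["shiftTense"] = e1.get("tense", "NONE") != e2.get("tense", "NONE")
    feats["class"] = label  # e.g. BEFORE; None when unlabeled
    return feats

ev1 = {"eventClass": "occurrence", "tense": "past", "string": "traveled"}
ev2 = {"eventClass": "occurrence", "tense": "past",
       "aspect": "progressive", "string": "sleeping"}
vector = tlink_features(ev1, ev2, label="BEFORE")
```

A learner such as a decision tree would then be trained on many such vectors, with class as the target.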

Link Labeling Accuracy
90
TLINK Extraction Conclusion
  • Annotated TimeML Corpus provides insufficient
    examples for training machine learners
  • Significant Result
  • number of examples expanded 11 times by Closure
  • Training learners on the expanded corpus yields
    excellent performance
  • Performance exceeds human intuitions, even when
    augmented with lexical rules
  • Next steps
  • Integrate GUTenLink and VerbOcean rules into machine
    learning framework
  • Integrate with s2tlink and a2tlink
  • Feature engineering

91
Challenges: Temporal Reasoning
  • Temporal reasoning for IE has used qualitative
    temporal relations
  • Trivial metric relations (distances in time) can
    be extracted from anchored durations and sorted
    time expressions
  • But commonsense metric constraints are missing
  • Time(Haircut) << Time(fly Boston2Sydney)
  • First steps
  • Hobbs et al. at ACL'06
  • Mani & Wellner at ARTE'06 workshop

92
Challenges: Integrating Reasoning and Learning
93
Difficulties in Annotation
  • In an interview with Barbara Walters to be shown
    on ABC's Friday nights, Shapiro said he tried
    on the gloves and realized they would never fit
    Simpson's larger hands.
  • BEFORE or MEET?
  • More coarse-grained annotation may suffice

94
Discourse Relations
  • Lexical Rules from VerbOcean are still very
    sparse, even though they are less brittle
  • But need to match arguments when applying lexical
    rules (e.g., subj/obj of push/fall)
  • A discourse model should in fact be used

95
Temporal Relations as Surrogates for Rhetorical
Relations
a. John went into the florist shop. b. He had
promised Mary some flowers. c. She said she
wouldn't forgive him if he forgot. d. So he
picked out three red roses.
  • When E1 is left-sibling of E2 and E1 < E2, then
    typically, Narration(E1, E2)
  • When E1 is right-sibling of E2 and E1 < E2, then
    typically Explanation(E2, E1)
  • When E2 is a child node of E1, then typically
    Elaboration(E1, E2)

(Discourse tree diagram; node labels: Expl, Elab, Narr)
Constraints: Eb < Ec, Ec < Ea, Ea < Ed
96
TLINKS as a measure of fluency in Second Language
Learning
  • Analyzed English oral and written proficiency
    samples elicited from 16 speakers of English
  • 8 native speakers and 8 students in Advanced
    courses in an Intensive English Program.
  • Corpus includes 5888 words elicited from subjects
    via a written narrative retelling task
  • Chaplin's Hard Times
  • On average, native speakers (NSs) use
    significantly fewer words to create TLINKs
    (8.2/TLINK vs. 10.1 for non-native speakers, NNSs).
  • Number of closed TLINKs for NSs far exceeds the
    number for NNSs (12,330 vs. 4924).
  • This means NSs have, on average, longer chains
    of TLINKs

Joint work with Jeff Connor-Linton at AAAL'05.
97
Outline
  • Introduction
  • Linguistic Theories
  • AI Theories
  • Annotation Schemes
  • Rule-based and machine-learning methods.
  • Challenges
  • Links

98
Corpora
  • News (newswire and broadcast)
  • TimeML: TimeBank, AQUAINT Corpus (all English)
  • TIMEX2: TIDES and TERN English Corpora, Korean
    Corpus (200 docs), TERN Chinese and Arabic news
    data (extents only)
  • Weblogs
  • TIMEX2: TERN corpus (English, Chinese, Arabic;
    the latter with extents only)
  • Dialogues
  • TIMEX2: 95 Spanish Enthusiast dialogs, and their
    translations
  • Meetings
  • TIMEX2: Spanish portions of UN Parallel corpus
    (23,000 words)
  • Children's Stories
  • Reading Comprehension Exams from MITRE: Remedia
    (120 stories, 20K words), CBC (259 stories, 1/3
    tagged, 50K)

99
Links
  • TimeBank (April 17, 2006)
  • http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T08
  • TimeML
  • www.timeml.org
  • TIMEX2/TERN ACE data (English, Chinese, Arabic)
  • timex2.mitre.org
  • TIMEX2/3 Tagger
  • http://complingone.georgetown.edu/linguist/GU_TIME_DOWNLOAD.HTML
  • Korean and Spanish data: imani_at_mitre.org
  • Callisto: callisto.mitre.org

100
References
  • Mani, I., Pustejovsky, J., and Gaizauskas, R.
    (eds.). (2005). The Language of Time: A Reader.
    Oxford University Press.
  • Mani, I., and Schiffman, B. (2004). Temporally
    Anchoring and Ordering Events in News. In
    Pustejovsky, J. and Gaizauskas, R. (eds.), Time
    and Event Recognition in Natural Language. John
    Benjamins, to appear.
  • Mani, I. (2004). Recent Developments in Temporal
    Information Extraction. In Nicolov, N., and
    Mitkov, R. (eds.), Proceedings of RANLP'03. John
    Benjamins, to appear.
  • Jang, S., Baldwin, J., and Mani, I. (2004).
    Automatic TIMEX2 Tagging of Korean News. In Mani,
    I., Pustejovsky, J., and Sundheim, B. (eds.), ACM
    Transactions on Asian Language Processing:
    Special Issue on Temporal Information Processing.
  • Mani, I., Schiffman, B., and Zhang, J. (2003).
    Inferring Temporal Ordering of Events in News.
    Short paper. In Proceedings of the Human Language
    Technology Conference (HLT-NAACL'03).
  • Ferro, L., Mani, I., Sundheim, B., and Wilson, G.
    (2001). TIDES Temporal Annotation Guidelines,
    Draft Version 1.02. MITRE Technical Report
    MTR 01W000004. McLean, Virginia: The MITRE
    Corporation.
  • Mani, I. and Wilson, G. (2000). Robust Temporal
    Processing of News. In Proceedings of the 38th
    Annual Meeting of the Association for
    Computational Linguistics (ACL'2000), 69-76.