Title: Arabic Event Detection and Tracking
1 Arabic Event Detection and Tracking
2 Outline
- Arabic Event Detection
  - Detection System
  - Experiments
- Arabic Event Tracking
  - Data
  - Approaches
  - Experiments
3 Arabic Event Detection
- Goal
  - Group together stories that discuss the same event.
- CMU Detection System
  - GAC: group average clustering algorithm
  - INCR: a single-pass incremental clustering algorithm
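The single-pass incremental idea behind INCR can be illustrated with a minimal sketch. This is not the CMU system: the whitespace tokenizer, the raw term-frequency vectors, the cosine similarity, and the `threshold` value are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def incremental_cluster(stories, threshold=0.3):
    """Assign each story, in arrival order, to the most similar existing
    cluster, or start a new cluster if no similarity reaches the threshold.
    The threshold value is an illustrative assumption."""
    centroids = []  # one summed term-frequency Counter per cluster
    labels = []
    for text in stories:
        vec = Counter(text.lower().split())
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)    # fold the story into the cluster
            labels.append(best)
        else:
            centroids.append(Counter(vec))  # first story of a new event
            labels.append(len(centroids) - 1)
    return labels
```

Each story is seen once and never reassigned, which is what distinguishes a single-pass method from an agglomerative one such as GAC, where clusters are merged repeatedly based on their average pairwise similarity.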
4 Arabic Detection Experiments
- Data
  - TDT3
    - Arabic news stories from 1998/10 to 1998/12
  - TDT4
    - Arabic news stories from 2000/10 to 2001/01
- Detection Tasks
  - Arabic, English Translations
  - Arabic, Native Orthography
5 Results -- Arabic Event Detection
- TDT 2002 dry-run evaluation result
Condition    Cost
TEArb,Eng    0.1770
TEArb,Nat    0.1732
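For readers unfamiliar with the Cost column, the sketch below shows how a normalized detection cost is commonly computed in TDT evaluations from the miss and false-alarm rates. The slides do not state which parameters were used, so the defaults here (`c_miss=1.0`, `c_fa=0.1`, `p_target=0.02`) are an assumption based on the usual TDT settings, and the function name is hypothetical.

```python
def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Normalized detection cost (lower is better).

    Default parameters are assumed, not taken from the slides.
    """
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of the better trivial system
    # (reject everything vs. accept everything).
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return cost / norm
```

Under these assumptions a system that rejects every story scores exactly 1.0, so costs well below 1.0 indicate a system that beats the trivial baseline.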
6 Results -- Arabic Event Detection
- TDT 2002 formal run evaluation result
Condition    Cost
TEArb,Eng    0.2450
TEArb,Nat    0.2437
7 Arabic Event Tracking
- Data
- Approaches
- Evaluation
  - DET curve
- Experiments
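A DET (detection error tradeoff) curve plots the miss rate against the false-alarm rate as the decision threshold varies. The following minimal sketch computes those points from system scores and reference labels; the function name and the convention that a story is accepted when its score is at or above the threshold are assumptions, and the input must contain at least one on-topic and one off-topic story.

```python
def det_points(scores, is_target):
    """Return (threshold, p_miss, p_fa) for each distinct score threshold.

    A story is accepted as on-topic when its score >= threshold.
    """
    n_target = sum(is_target)
    n_nontarget = len(is_target) - n_target
    points = []
    for thr in sorted(set(scores)):
        misses = sum(1 for s, t in zip(scores, is_target) if t and s < thr)
        false_alarms = sum(
            1 for s, t in zip(scores, is_target) if not t and s >= thr
        )
        points.append((thr, misses / n_target, false_alarms / n_nontarget))
    return points
```

DET plots usually draw both axes on a normal-deviate scale, which makes the curves of well-behaved systems close to straight lines; that plotting step is omitted here.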
8 Data -- Arabic Event Tracking
- Data
  - TDT3
- Task
  - Training set: English
  - Test set: Arabic
- Events
  - 23 events
9 Approaches -- Arabic Event Tracking
- Translating test set
  - IBM MT system (released by NIST)
- Translating training set
  - English-Arabic bilingual dictionary (statMT), automatically extracted from the UN English-Arabic parallel corpus
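The second approach, translating the English training stories with a bilingual dictionary, can be sketched as a word-by-word lookup. The dictionary layout (English word mapped to Arabic candidates with probabilities), the `top_k` parameter, and the toy entries are hypothetical; the slides do not describe the actual dictionary format or how candidates were selected.

```python
def translate_story(tokens, dictionary, top_k=1):
    """Replace each English token by its top_k most probable Arabic
    translations; tokens missing from the dictionary are kept as-is."""
    out = []
    for tok in tokens:
        candidates = dictionary.get(tok.lower())
        if not candidates:
            out.append(tok)
            continue
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        out.extend(word for word, _ in ranked[:top_k])
    return out

# Toy entries for illustration only (not taken from the UN corpus).
toy_dict = {
    "peace": {"سلام": 0.9, "صلح": 0.1},
    "war": {"حرب": 1.0},
}
```

Calling `translate_story(["Peace", "and", "war"], toy_dict)` keeps the out-of-dictionary token "and" unchanged and maps the other two words to their highest-probability entries.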
10 Experimental Results -- Arabic Event Tracking
Condition                                             Cost
TDT 2002 dry-run best result for multilingual task    0.1307
CMU Arabic task (after adaptation)                    0.1364
11 Experimental Results -- Arabic Event Tracking
Condition             Cost      Reduction Ratio (%)
Without Adaptation    0.1429    --
With Adaptation       0.1364    4.5
12 Experimental Results -- Arabic Event Tracking
Condition                                        Cost      Reduction Ratio (%)
Translating training set by DICT                 0.1772    --
Translating training set by DICT + Adaptation    0.1633    7.8
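The reduction ratios in the two adaptation tables are consistent with a relative cost reduction expressed in percent, which can be checked with a few lines of arithmetic. The function name is hypothetical.

```python
def relative_reduction(baseline, improved):
    """Relative cost reduction in percent."""
    return 100.0 * (baseline - improved) / baseline

print(round(relative_reduction(0.1429, 0.1364), 1))  # 4.5
print(round(relative_reduction(0.1772, 0.1633), 1))  # 7.8
```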
13 Experimental Results -- Arabic Event Tracking