Text segmentation in Informedia

Provided by: cjc2
Learn more at: http://www.cs.cmu.edu

1
Text segmentation in Informedia
Faculty Mentor: Alex Hauptmann
TA Mentor: Vandi Verma
Students: Zhirong Wang, Ningning Hu, Jichuan Chang
2
Data and Methods
  • Data
    • CNN WorldView (01/1999-10/2000)
    • Stemming, merging, stop word removal
  • Methods
    • Classification
      • Artificial Neural Network (sentence)
      • Naive Bayes (sentence/fixed-length window)
      • SVM (sentence)
    • Topic change detection
      • EM clustering
      • topics, block size
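
To make the classification setup above concrete, here is a minimal sketch of a Naive Bayes boundary classifier over token windows. It is an illustration under stated assumptions, not the project's implementation: the function names, the add-one smoothing, and the toy training windows are all hypothetical and are not taken from the CNN WorldView data.

```python
import math
from collections import Counter


def train_naive_bayes(examples):
    """examples: list of (tokens, label); label is True at a story boundary."""
    word_counts = {True: Counter(), False: Counter()}
    label_counts = Counter()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, label_counts, vocab


def log_posterior(tokens, label, model):
    """Unnormalized log P(label | tokens) under the bag-of-words assumption."""
    word_counts, label_counts, vocab = model
    total = sum(word_counts[label].values())
    score = math.log(label_counts[label] / sum(label_counts.values()))
    for tok in tokens:
        if tok in vocab:  # words never seen in training are ignored
            # Add-one (Laplace) smoothing; an assumption of this sketch.
            score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
    return score


def is_boundary(tokens, model):
    """Classify one sentence or fixed-length window of tokens."""
    return log_posterior(tokens, True, model) > log_posterior(tokens, False, model)


# Made-up training windows, for illustration only.
training = [
    (["cnn", "reports", "from", "london"], True),
    (["i'm", "reporting", "cnn"], True),
    (["the", "tree", "is", "very", "old"], False),
    (["people", "worry", "a", "lot"], False),
]
model = train_naive_bayes(training)
print(is_boundary(["cnn", "reports"], model))
print(is_boundary(["tree", "worry"], model))
```

The same functions work for either granularity listed above: pass the tokens of one sentence, or the tokens of one fixed-length window.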

001630 CENTURY >>> WE PEOPLE TEND TO 001631 PUT
THINGS LIKE THE PASSING OF A 001633 MILLENIUM IN
SHARP FOCUS. WE 001633 CELEBRATE, CONTEMPLATE,
EVEN 001635 WORRY A BIT, SOMETIMES WORRY A
001636 LOT. AFTER ALL, IT'S SOMETHING 001638
THAT HAPPENS ONLY ONCE EVERY ONE 001641 THOUSAND
YEARS. A BIG DEAL? 001641 PERHAPS NOT TO ALL
LIVING THINGS, 001642 AS CNN'S RICHARD BLYSTONE
001643 FOUND OUT WHEN HE CONSIDERED ONE 001654
VERY OLD TREE. >>> HO HUM. 001654 ANOTHER
MILLENNIUM. THE GREAT YEW
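
A rough sketch of how caption text like the excerpt above might be cleaned before classification. It assumes the six-digit numbers are time codes to be discarded and uses a tiny, hypothetical stop word list; stemming and merging are not shown, and the function name is illustrative.

```python
import re

# Tiny illustrative stop word list; a real list would be much longer.
STOP_WORDS = {"a", "the", "of", "to", "in", "we", "it", "that", "as", "and"}


def clean_caption(text):
    """Return lowercase content tokens from raw closed-caption text."""
    # Drop six-digit time codes (assumed to be timestamps).
    text = re.sub(r"\b\d{6}\b", " ", text)
    # Keep only alphabetic tokens; punctuation and other symbols are dropped.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(clean_caption("001630 CENTURY WE PEOPLE TEND TO 001631 PUT"))
```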
3
Experimental Results
[Figure: a sequence of sentences with identified boundaries marked against reference boundaries; each boundary is labeled OK, Miss, or False Alarm.]
Recall = OK / (OK + Miss)
Precision = OK / (OK + False Alarm)
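
The recall and precision definitions above can be sketched in a few lines. This assumes boundaries are given as sentence indices and that an identified boundary counts as OK only on an exact match with a reference boundary; the function name and the example positions are hypothetical.

```python
def boundary_scores(identified, reference):
    """Return (recall, precision) for identified vs. reference boundaries."""
    identified, reference = set(identified), set(reference)
    ok = len(identified & reference)            # found and correct
    miss = len(reference - identified)          # reference boundary not found
    false_alarm = len(identified - reference)   # found, but not in reference
    recall = ok / (ok + miss) if reference else 0.0
    precision = ok / (ok + false_alarm) if identified else 0.0
    return recall, precision


# Made-up positions giving three OK, one Miss, and one False Alarm.
recall, precision = boundary_scores({3, 7, 12, 15}, {3, 9, 12, 15})
print(recall, precision)
```

With three OK, one Miss, and one False Alarm, both scores come out to 3/4.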
  • Feature selection
  • Block size
  • Best classifier
    • Naive Bayes classifier
    • Fixed-length block

4
Discussion
  • Impact of data set
    • Good recall, lower precision
    • Noisy closed-captioning text
    • Ratio of positive to negative examples
  • Combining different classifiers
    • Different granularity
    • Voting
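
One simple way to combine classifiers by voting is a per-sentence majority vote. The sketch below is only one possible scheme, since the slides do not say which was intended; the function name and the per-classifier outputs are made up for illustration.

```python
def majority_vote(predictions):
    """predictions: one list of booleans per classifier, aligned by sentence.

    A sentence is marked as a boundary when more than half of the
    classifiers mark it as one.
    """
    n = len(predictions)
    return [sum(votes) * 2 > n for votes in zip(*predictions)]


# Hypothetical boundary decisions from three classifiers on four sentences.
ann = [True, False, False, True]
naive_bayes = [True, True, False, False]
svm = [False, True, False, True]
print(majority_vote([ann, naive_bayes, svm]))
```

With an odd number of classifiers there are no ties; with an even number, this version treats a tie as "no boundary".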