Identifying Fragmented Words in Spoken Dialogue


1
Identifying Fragmented Words in Spoken Dialogue
  • Piroska Lendvai
  • Induction of Linguistic Knowledge Research Group
  • Dept. of Computational Linguistics
  • Tilburg University, NL

2
Outline
  • Incomplete words pose problems for NLP
  • Spoken Dutch Corpus
  • Task: classify each lexical item in a sentence as
    a completely or incompletely uttered word
  • Memory-based vs. rule-inducing learner
  • Extensive optimization strategy to
    • find optimal parameter settings
    • select informative features
  • Findings

3
Introduction
  • Spoken input contains disfluent speech
  • Disfluent speech involves fragmented words
    • he did not c-- call
    • the safe usage of interne-- sorry of electronic
      commerce
  • NLP tools often take the marking of fragments for
    granted, but
  • automatic identification of a fragment is not
    straightforward (see the sketch below):
    • a fragment is not always a single letter
    • it is not always absent from the lexicon
    • it may be identical to an existing word
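A minimal sketch of these naive cues, with a toy lexicon of our own; it illustrates why neither the 1-letter test nor the lexicon test is reliable on its own:

LEXICON = {"he", "did", "not", "call", "u", "ik", "in"}   # toy stand-in lexicon

def naive_fragment_cues(token):
    """Return the two testable naive cues for a candidate token."""
    return {
        "one_letter": len(token) == 1,           # 'u' is a real Dutch word
        "not_in_lexicon": token not in LEXICON,  # names and foreign words also fail this
    }

print(naive_fragment_cues("c"))   # true fragment: both cues fire
print(naive_fragment_cues("u"))   # real word: the 1-letter cue misfires
print(naive_fragment_cues("in"))  # fragment of 'interne(t)' identical to a real
                                  # word: neither cue fires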

4
Data
  • Spoken Dutch Corpus, 203 transcribed discourses
  • Various genres, 1-7 speakers
  • 40,840 lexical tokens
  • 44,939 sentences
  • 3,137 tagged fragmented words (0.9%)
  • Instance generation: automatically extracted
    feature values plus a binary class label (see the
    sketch below)
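A hedged sketch of such instance generation; the '--' fragment mark and the feature names here are illustrative conventions of ours, not the Spoken Dutch Corpus annotation format:

def make_instances(tokens):
    """Turn each token into (feature dict, binary class label)."""
    instances = []
    for i, tok in enumerate(tokens):
        label = "FRAGMENT" if tok.endswith("--") else "COMPLETE"
        word = tok.rstrip("-")
        features = {
            "word": word,
            "left": tokens[i - 1].rstrip("-") if i > 0 else "_",
            "right": tokens[i + 1].rstrip("-") if i + 1 < len(tokens) else "_",
            "position": i,                     # a sentence-based numeric cue
        }
        instances.append((features, label))
    return instances

for features, label in make_instances("he did not c-- call".split()):
    print(label, features)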

5
Cues in learning incomplete words
  • Vector of 22 features based on the corpus and the
    literature
  • Readily available, word-based cues:
    • lexical window: the 2 neighbouring unigram words
      left/right of the focus word (string)
    • overlap in letters / matching words (binary; see
      the sketch below)
    • sentence-based, general cues (numeric)
    • context-type (binary)
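A sketch of the letter-overlap cue: a fragment often shares its letters with the repair word that follows ('c--' is a prefix of 'call'). The function name is ours:

def prefix_overlap(focus, right_neighbour):
    """Binary cue: is the focus word a proper prefix of its right neighbour?"""
    return bool(focus) and len(focus) < len(right_neighbour) \
        and right_neighbour.startswith(focus)

print(prefix_overlap("c", "call"))            # True: fragment + repair
print(prefix_overlap("interne", "internet"))  # True
print(prefix_overlap("did", "not"))           # False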

6
22 features
7
Feature vector
8
Learning process
  • Memory-based (lazy) learner: IB1 in TiMBL
  • Rule-inducing (eager) learner: RIPPER
  • Discourse-based partitioning: 10-fold CV (see the
    sketch below)
  • Optimization with iterative deepening on the
    training data
  • The optimal learner per fold classifies the
    held-out test data
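A sketch of this evaluation setup with scikit-learn stand-ins: a 1-nearest-neighbour classifier approximates the memory-based IB1, and GroupKFold enforces discourse-based partitioning so no discourse is split across train and test. The original work used TiMBL and RIPPER, not scikit-learn, and the random data below is a placeholder:

import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 22))             # 22 features per token instance
y = rng.integers(0, 2, 200)           # binary class: fragment or not
discourse = rng.integers(0, 20, 200)  # discourse id of each instance

for train_idx, test_idx in GroupKFold(n_splits=10).split(X, y, groups=discourse):
    clf = KNeighborsClassifier(n_neighbors=1)   # IB1-like 1-NN
    clf.fit(X[train_idx], y[train_idx])
    print("fold accuracy:", clf.score(X[test_idx], y[test_idx]))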

9
Task baselines
  • In-lexicon baseline
  • 1-letter baseline
  • Evaluative measures: Accuracy, Precision, Recall,
    Fβ=1 (see the scoring sketch below)

                 Accuracy   Prec    Rec   Fβ=1
    In-lexicon       91.4    2.4   53.6    4.6
    1-letter         97.4   54.3   43.9   48.5
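A sketch of how such a baseline can be scored on the Fragment class; the toy tokens and gold labels below are illustrative, not corpus data:

def precision_recall_f(gold, pred, positive="FRAG"):
    tp = sum(g == p == positive for g, p in zip(gold, pred))
    fp = sum(p == positive and g != positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f

# 1-letter baseline: predict FRAG for every single-character token
tokens = ["he", "did", "not", "c", "call", "u"]
gold = ["OK", "OK", "OK", "FRAG", "OK", "OK"]
pred = ["FRAG" if len(t) == 1 else "OK" for t in tokens]
print(precision_recall_f(gold, pred))   # 'u' costs the baseline precision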

10
Optimization by iterative deepening
  • Construct a large number of different learners by
    • varying algorithm parameters
    • varying the employed feature groups
  • Iterate (see the sketch below):
    • learners are trained on a portion of the 90%
      training set
    • tested on held-out data from the same 90%
      training set
    • ranked on the F-score of the Fragment class
    • good learners are re-trained on more data
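A rough sketch of the loop just described; the data fractions and the survivor ratio are illustrative choices of ours, not values from the slides:

def iterative_deepening(configs, train_data, score,
                        fractions=(0.1, 0.3, 1.0), keep_ratio=0.25):
    """Score candidate configs on growing data slices, keep the best each round."""
    survivors = list(configs)
    for frac in fractions:
        subset = train_data[: int(len(train_data) * frac)]
        survivors.sort(key=lambda cfg: score(cfg, subset), reverse=True)
        survivors = survivors[: max(1, int(len(survivors) * keep_ratio))]
    return survivors[0]     # best configuration after the final round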

11
Optimizing IB1
  • 4,301 learning experiments
  • Algorithm settings tested (optimal value found;
    see the sketch below):
    • number of nearest neighbours used for
      extrapolating the class: 1
    • distance weighting metric of the NNs: none
    • feature importance metric: χ²
    • metric for computing instance similarity:
      overlap
    • frequency threshold in the feature similarity
      metric: none
    • feature representation, varied over all possible
      combinations of feature groups: use all groups
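A scikit-learn stand-in configured to mirror this optimum; the parameter names are sklearn's rather than TiMBL's, the χ² feature weighting has no direct sklearn equivalent, and symbolic features are assumed to be integer-encoded:

from sklearn.neighbors import KNeighborsClassifier

ib1_like = KNeighborsClassifier(
    n_neighbors=1,       # class extrapolated from a single nearest neighbour
    weights="uniform",   # no distance weighting ('-' on the slide)
    metric="hamming",    # overlap-style similarity on symbolic features
)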

12
Optimizing RIPPER
  • 3,187 learning experiments
  • Algorithm settings tested (optimal value found;
    see the sketch below):
    • negative tests on the feature attributes:
      dis/allowed
    • number of optimization rounds on the ruleset: 3
    • minimum number of examples to be covered: 1
    • simplification/complication of hypotheses:
      complicate
    • loss ratio of costs: 0.5
    • adding redundant features: not allowed
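A hedged sketch using the wittgenstein Python RIPPER implementation (not the original C RIPPER); its k parameter is the number of optimization rounds, mirroring the optimum of 3 above, while the toy data and column names are ours and the other slide settings have no direct equivalent in this library:

import pandas as pd
import wittgenstein as lw

rows = [
    {"word_len": 1, "prefix_of_next": True,  "fragment": True},
    {"word_len": 4, "prefix_of_next": False, "fragment": False},
    {"word_len": 3, "prefix_of_next": False, "fragment": False},
    {"word_len": 7, "prefix_of_next": True,  "fragment": True},
]
df = pd.DataFrame(rows * 5)          # replicate so the grow/prune split works

clf = lw.RIPPER(k=3)                 # 3 optimization rounds, as on the slide
clf.fit(df, class_feat="fragment", pos_class=True)
print(clf.ruleset_)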

13
Results of default vs optimized learners
14
Results

15
Error analysis
  • Frequent false negatives: fragmented items that
    resemble true words
  • False positives: short true lexical items
  • Named entities, foreign words, and acronyms cause
    confusion
  • Annotation errors

16
Discussion
  • Readily available, word-based features
  • Context, letter overlap, and word identity are
    reliable cues
  • The iterative deepening search method is
    beneficial: the optimal settings uniformly differ
    from the defaults
  • The example-based learning approach is more
    successful than abstract rule induction
  • RIPPER overfits when trained on less than the
    total data; IB1 does not
  • Beneficial instance-specific behaviour observed
    (<100 rules, 1-7 conditions)

17
Future Work
  • Use ASR lexical output
  • Generate syntactic info from ASR output
  • Extract prosodic properties
  • Extend research to other disfluency types