Multilingual Conversational Systems - PowerPoint PPT Presentation

Slides: 44
Provided by: sen7
Title: Multilingual Conversational Systems


1
Multilingual Conversational Systems
2
Steps to Develop Language Learning System
  • Begin with existing mature system in English
  • Develop English-to-Mandarin translation
    capability
  • Induce Mandarin corpus from English corpus
  • Train LM statistics for both recognizers from
    corpora
  • Develop parsing grammar for Mandarin queries and
    generation rules for Mandarin responses
  • Not yet completed
  • Develop domain-specific user simulation
    capability
  • Generate thousands of dialogues in both languages
  • Train recognizers and users from simulated
    dialogues

3
Activities over the Last Nine Months
  • Translation from English to Mandarin
  • Mainly focused on user queries (as contrasted
    with responses)
  • Integrating generation-based translation with
    example-based approach
  • Exploring the use of statistical machine
    translation
  • Use phrase-based statistical translation
    framework developed by Philipp Koehn
  • Utilized the formal methods to generate
    domain-specific parallel corpus in weather query
    domain
  • Implemented a finite-state transducer version of
    the decoder and integrated with Galaxy
  • Translation from Mandarin to English
  • Use statistical method to obtain Chinese to
    English translation capability
  • Explore grammar induction techniques to create
    parsing grammar for Mandarin queries, towards
    developing formal methods for Mandarin to English
    translation

4
Activities over the Last Nine Months, Cont'd
  • System Development
  • Upgraded weather harvesting process
  • Upgraded database server to support Postgres in
    addition to Oracle
  • Improved dialogue management
  • Better handling of meta queries
  • Developed a new GUI interface overcoming firewall
    limitations
  • Support automatic checking and correction of
    typed tone errors
  • Better display of tones as diacritics
  • Developed a new concatenative speech synthesis
    capability for high quality translation of user
    queries spoken in English using Envoice
  • Developed a batch-mode capability to process
    synthetic speech through dialogue interaction to
    aid system development

5
Activities over the Last Nine Months, Cont'd
  • Presentations
  • Three talks at InSTIL Workshop in Venice
  • Wang and Seneff: Translation
  • Seneff et al.: LL Systems
  • Peabody et al.: Web-based interface for tone
    acquisition
  • ISCSLP
  • Seneff et al.: Focused on MuXing system overall
  • SigDial Demo Session
  • Wang and Seneff: Presentation and live
    demonstration
  • One hour seminar at Microsoft China's Speech
    Group
  • One hour seminar at Defense Language Institute in
    Monterey
  • Demonstrated system to Julian Wheatley, head of
    Chinese department at MIT and to Henry Jenkins,
    director of MIT Comparative Media Studies

6
Activities over the Last Nine Months, Cont'd
  • Data collection initiatives
  • Eight subjects have completed Web-based exercise
    at MIT
  • Two visits by Stephanie Seneff to Defense
    Language Institute in Monterey California
  • One successful class participation exercise
  • Another attempted but aborted due to power outage
  • Installed Web-based exercise system on computers
    at MIT Language Lab
  • Julian Wheatley has agreed to support data
    collection initiatives with students in the MIT
    Chinese classes

7
Bilingual Recognizer Construction
English corpus
  • Two languages compete in common search space
  • Automatically translate existing English corpus
    into Mandarin
  • Use NL grammar to automatically induce language
    model for both English and Mandarin recognizers

8
Automatic Grammar Induction
Once translation ability exists from English to
target language, can create reverse system almost
effortlessly
English Sentence
Corpus Pairs
Utilizes English parse tree and Mandarin
generation lexicon to induce Mandarin parse tree
9
Multilingual Spoken Translation Framework
  • Common meaning representation: semantic frame

Semantic Frame
10
Challenges in Cross-language Generation for
Translation
  • Some expressions have very different syntactic
    structures in different languages

What is your name? 你(you) 叫(call) 什么(what) 名字(name)?
I like her. Ella me gusta.
  • Syntactic features are expressed in many
    different ways
  • Determiners (English but not Chinese)

附近(vicinity) ??(where) 有(have) 银行(bank)? Where is
a bank nearby?
  • Particles (Chinese but not English)

that hotel: 那(that) ?() ??(hotel)
I lost my key. 我(I) ?(lose) 了(tense) 我的(my) 钥匙(key).
  • Gender (extensive in Spanish)

11
An Example: English/Chinese
How long does it take to take a taxi there
How long take take taxi there
How long need take taxi there
How long need take taxi go there
( take taxi go there need how long )
? ??? ? ?? ? ??
  • Function words disappear in Chinese
  • Two instances of "take" have different
    translations
  • Verb "go" omitted in English
  • Sentence structure is very different

12
Semantic Frame for Example
  • Semantic frame is identical for both inputs,
    except for missing function words in Mandarin
  • Where necessary, constituent movement is invoked
    to render the same hierarchical structure
  • English generation predicts missing function
    words
  • Mandarin generation infers "go" from
    destination predicate

13
Strategies for Achieving High Quality and
Robustness
  • Interlingua-based translation
  • Maintain consistency of semantic frame
    representation across different languages
    whenever possible
  • Seed grammar rules for each new language on
    English grammar rules
  • Target language dependent generation rules
    specify constituent order
  • Word sense disambiguation achieved through
    semantic features
  • Restricted conversational domains (lesson plans)
  • Emphasis on mechanisms to enable rapid porting to
    new domains and languages
  • Use parsability to assess quality of translation
    outputs
  • Back off to example-based method when parse fails

14
Schematic of Generation into Mandarin
c verify
  aux: will
  subject: it
  pred: p rain
    pred: p locative
      prep: in
      topic: q city
        name: boston
    pred: p temporal
      topic: q weekday
        quantifier: this
        name: weekend
15
Generation-based Translation
  • Semantic frame serves as interlingua
  • Translation achieved by parsing and generation
  • Use Mandarin grammar to detect potential problems
  • Rejected sentences routed to example-based
    translation for a second chance
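
The control flow described above can be sketched in Python. This is a minimal sketch, not the system's actual code: the four callables (`parse_english`, `generate_mandarin`, `parses_in_mandarin`, `example_lookup`) are hypothetical stand-ins for the English parser, the Mandarin generator, the Mandarin grammar check, and the example-based retrieval step.

```python
def translate(sentence, parse_english, generate_mandarin,
              parses_in_mandarin, example_lookup):
    """Generation-based translation with an example-based second chance.

    All four callables are hypothetical placeholders supplied by the caller.
    Returns a Mandarin string, or None when every method fails.
    """
    frame = parse_english(sentence)       # semantic frame (interlingua)
    if frame is None:
        return None                       # input not parsable in English
    candidate = generate_mandarin(frame)
    # Accept the generated string only if the Mandarin grammar parses it.
    if candidate is not None and parses_in_mandarin(candidate):
        return candidate
    # Rejected sentences get a second chance via example-based lookup.
    return example_lookup(frame)
```

Passing the components in as functions keeps the sketch independent of any particular parser or generator.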

16
Example-based Translation
  • Requires translation pairs and a retrieval
    mechanism
  • Corpus automatically obtained via the
    generation-based approach
  • Retrieval based on lean semantic information
  • Encoded as key-value pairs
  • Obtained from semantic frame via simple
    generation rules
  • Generalizes words to classes (e.g., city name,
    weekday, etc.) to overcome data sparseness

17
Example-based Translation Procedure
English input: Is there any chance of rain in San Francisco?
Key-value string: WEATHER rain CITY San Francisco
KV-Chinese table lookup, with the city name recovered as jiu4 jin1 shan1
  • Key-value string serves as interlingua
  • Translation achieved by parsing and table lookup
  • City name masked during retrieval and recovered
    in final surface string
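
A minimal Python sketch of the lookup procedure above, under stated assumptions: keyword spotting stands in for real parsing, and the table entry, the `<CITY>` placeholder, and the Boston entry are hypothetical illustrations. Only the San Francisco pinyin comes from the slide.

```python
# Hypothetical city lexicon; "jiu4 jin1 shan1" is the slide's example.
CITY_PINYIN = {
    "san francisco": "jiu4 jin1 shan1",
    "boston": "bo1 shi4 dun4",
}

# Hypothetical KV-Chinese table: masked key-value string -> pinyin template.
KV_TABLE = {
    "WEATHER rain CITY <CITY>": "<CITY> hui4 xia4 yu3 ma5?",
}


def to_key_values(sentence):
    """Toy stand-in for parsing: spot a weather keyword and a city name."""
    text = sentence.lower()
    pairs = []
    if "rain" in text:
        pairs.append(("WEATHER", "rain"))
    for city in CITY_PINYIN:
        if city in text:
            pairs.append(("CITY", city))
            break
    return pairs


def translate_by_example(sentence):
    """Mask the city, look up the template, then restore the city."""
    pairs = to_key_values(sentence)
    city = dict(pairs).get("CITY")
    # Generalize the city to its class so one entry covers all cities.
    key = " ".join(
        f"{k} {'<CITY>' if k == 'CITY' else v}" for k, v in pairs
    )
    template = KV_TABLE.get(key)
    if template is None:
        return None
    return template.replace("<CITY>", CITY_PINYIN[city]) if city else template
```

For the slide's input, `translate_by_example("Is there any chance of rain in San Francisco?")` builds the key `WEATHER rain CITY <CITY>` and fills the retrieved template with `jiu4 jin1 shan1`.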

18
Evaluation: English to Mandarin, Weather Domain
  • Evaluation data
  • Drawn from the publicly available Jupiter weather
    system
  • Telephone recordings of conversational speech
  • Unparsable utterances (English grammar) were
    excluded
  • Total of 695 utterances, with 6.5 words per
    utterance on average
  • System configuration
  • Text input or speech input
  • Recognizer achieved 6.9% word error rate and
    19.0% sentence error rate
  • Generation-based method preferred over
    example-based method
  • NULL output if both failed
  • Evaluation criteria
  • Yield of each translation method
  • Human judgment of translation quality
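
For reference, word error rate and sentence error rate can be computed as in the Python sketch below. It uses the standard definition (word-level edit distance divided by the number of reference words), which is an assumption about how the figures above were scored; the function names are illustrative.

```python
def word_errors(reference, hypothesis):
    """Return (edit distance in words, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1], len(ref)


def error_rates(pairs):
    """pairs: list of (reference, hypothesis). Returns (WER, SER)."""
    errors = words = wrong_sentences = 0
    for ref, hyp in pairs:
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
        wrong_sentences += e > 0
    return errors / words, wrong_sentences / len(pairs)
```

A sentence counts as wrong if it contains at least one word error, which is why the sentence error rate is always at least as high as the word error rate on single-word-error data.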

19
Spoken Language Translation Evaluation Results
  • Recognizer WER was 6.9%
  • Bilingual judge rated translations
  • Example-based translation increased yield by 6%
  • Incorrect translation provided only 2% of the
    time
  • Often due to recognition errors
  • English paraphrase provides context for errors

20
Multilingual Weather Responses
English source: Some thunderstorms may be
accompanied by gusty winds and hail
Semantic frame: clause weather_event, topic precip_act,
name thunderstorm, num pl, quantifier some, pred
accompanied_by, adverb possibly, topic
wind, num pl, pred gusty, and precip_act,
name hail
Frame indexed under wind, rain, storm, and hail
21
Stage 1 Drill Exercises
  • Web-based Interface to provide practice in typing
    queries in the weather domain
  • 10 weather scenarios to be solved using typed
    pinyin (e.g., Boston, rain, tomorrow)
  • Student given feedback on both query completeness
    and tone accuracy
  • Separate recording sessions allow user to
    practice both read and spontaneous spoken queries
  • Recordings will be used to train the system on
    accented speech
  • Recordings will also be assessed for tone quality
  • The Defense Language Institute in Monterey
    conducted a successful experiment using this
    Web-based interface in a class of 30 students
  • We are planning to introduce the exercise in the
    language laboratory at MIT

22
Lexical Tone Correction
  • Character representation does not explicitly
    encode tone
  • 洛杉矶星期一刮风吗?
  • Exploit pinyin to help student acquire tonal
    knowledge
  • Diacritic: luò shān jī xīng qī yī guā fēng ma?
  • Numeric: luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1
    ma5?
  • Hypothesis: Errors in typed pinyin reflect
    inaccurate knowledge of tones
  • luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2?
  • Provide explicit feedback about typed tone errors
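
The numeric and diacritic pinyin forms above are mechanically related. The Python sketch below converts numeric tones to diacritics using the usual placement convention (mark "a" or "e" if present, the "o" in "ou", otherwise the last vowel). It is an illustration, not the system's display code, and it assumes "ü" is typed directly rather than as "v" or "u:".

```python
import unicodedata

# Combining marks for tones 1-4; tone 5 (neutral) carries no mark.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}


def mark_syllable(syllable):
    """Convert one numeric-tone syllable, e.g. 'luo4', to diacritic form."""
    base, tone = syllable[:-1], syllable[-1]
    if tone == "5":
        return base
    if tone not in TONE_MARKS:
        return syllable  # no tone digit: leave unchanged
    lower = base.lower()
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in "iouü"]
        if not vowels:
            return syllable
        idx = vowels[-1]
    marked = base[: idx + 1] + TONE_MARKS[tone] + base[idx + 1:]
    # Compose letter + combining mark into a single precomposed character.
    return unicodedata.normalize("NFC", marked)


def numeric_to_diacritic(text):
    """Convert a space-separated string of numeric-tone syllables."""
    return " ".join(mark_syllable(s) for s in text.split())
```

Syllables that do not end in a tone digit (for example, one with trailing punctuation) are passed through unchanged, so punctuation should be stripped before conversion.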

23
Lexical Tone Correction
  • Exploit some features of Chinese
  • Syllable lexicon is small, approximately 420
    unique syllables
  • 5 tones (including neutral tone)
  • Exploit some abilities of TINA NL system
  • Ability to parse weighted word FST using
    probabilistic models
  • FST normally represents a list of recognizer
    hypotheses
  • A path through the FST represents the most likely
    correct parse
  • Given some input
  • Generate FST of single sentence
  • Expand the tones on each syllable
  • Attempt to parse FST
  • Selected path through FST represents corrected
    tones

24
FST Example: Step 1
  • Step 1: Generate simple FST

Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
25
FST Example: Step 2
  • Step 2: Assign benefit of doubt to items that
    appear in lexicon

Items that do not appear in lexicon are removed.
Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
26
FST Example: Step 3
  • Step 3: Expand each syllable to alternate tones.
    More compact than specifying each possible
    sentence variant.

Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
27
FST Example: Step 4
  • Step 4: Remaining probability is uniformly
    distributed among alternate tones

Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
28
FST Example: Step 5
  • Step 5: Parsing reveals the correct tones

Given: luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2
Correct: luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5
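
Steps 1 to 5 can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the TINA implementation: a toy word lexicon stands in for the parsing grammar, the `trust` weight of 0.6 is a hypothetical value for the "benefit of doubt", and a Viterbi search over a per-syllable lattice replaces parsing a weighted FST.

```python
import math

TONES = "12345"

# Hypothetical toy lexicon standing in for the grammar: each entry is a
# word written as toned pinyin syllables.
LEXICON = [
    ("luo4", "shan1", "ji1"),  # Los Angeles
    ("xing1", "qi1", "yi1"),   # Monday
    ("gua1", "feng1"),         # windy
    ("ma5",),                  # question particle
]
KNOWN_SYLLABLES = {syl for word in LEXICON for syl in word}


def tone_distribution(typed, trust=0.6):
    """Steps 2-4: weight the tone variants of one typed syllable."""
    base = typed[:-1]
    others = [base + t for t in TONES if base + t != typed]
    if typed in KNOWN_SYLLABLES:
        # Benefit of the doubt for the typed form; rest spread uniformly.
        dist = {typed: trust}
        dist.update({alt: (1 - trust) / len(others) for alt in others})
    else:
        # Typed form is not in the lexicon: remove it, share mass uniformly.
        dist = {alt: 1 / len(others) for alt in others}
    return dist


def correct_tones(typed_syllables):
    """Step 5: best-scoring path that segments into lexicon words."""
    lattice = [tone_distribution(s) for s in typed_syllables]
    n = len(lattice)
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for start in range(n):
        score, path = best[start]
        if score == -math.inf:
            continue
        for word in LEXICON:
            end = start + len(word)
            if end > n:
                continue
            probs = [lattice[start + i].get(s) for i, s in enumerate(word)]
            if None in probs:
                continue  # word does not match these syllables
            total = score + sum(math.log(p) for p in probs)
            if total > best[end][0]:
                best[end] = (total, path + list(word))
    return best[n][1]  # empty list when no full path exists
```

Applied to the slide's input, `correct_tones("luo3 shan1 ji3 xing1 qi2 yi1 gua4 feng2 ma2".split())` returns the syllables of `luo4 shan1 ji1 xing1 qi1 yi1 gua1 feng1 ma5`. With this toy lexicon each word has only one tone pattern, so the weights matter only when several lexicon entries compete for the same syllables.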
29
Web interface Practice Exercise
Student is prompted for city, time, and event
30
Web interface Practice Exercise
Xing1 qi1 er3 jiu3 jin3 shan1 hui4 bu2 hui4 re1
  • Student types in
  • A question concerning this topic in Mandarin
    using pinyin
  • OR
  • An English word or phrase for a translation

31
Web interface Practice Exercise
Student is given feedback
32
Web interface
33
Spoken Conversational Interaction
  • Weather information domain (rain, snow, wind,
    temperature, etc.)
  • Initial version configured for Americans learning
    Mandarin
  • Recognizer supports both English and Mandarin
  • Seamless language switching
  • English queries are translated into Mandarin
  • Mandarin queries are answered in Mandarin
  • User can ask for a translation into English of
    the response at any time
  • Uses Mandarin synthesizer provided by DELTA
    Electronics for responses, Envoice concatenative
    synthesizer for query translations
  • System can be configured as telephone-only or as
    telephone augmented with a Web-based GUI

34
Illustration of Dialogue Interaction
  • User: Bo1 Shi4 Dun4 ming2 tian1 hui4 xia4 yu3
    ma5? (Is it going to rain tomorrow in
    Boston?)
  • System: Tian1 qi4 yu4 bao4 ming2 tian1 Bo1 shi4
    dun4 mei2 you3 yu3. (The forecast calls for
    no rain tomorrow in Boston)
  • User (in English): What is the
    temperature?
  • System (translates): Qi4 wen1 shi4 duo1
    shao3?
  • User (emulates): Qi4 wen1 shi4 duo1 shao3?
  • System: Bo1 Shi4 Dun4 ming2 tian1 zui4 gao1 qi4
    wen1 er4 she4 shi4 du4, ming2 tian1 ye4 jian1,
    zui4 di4 qi4 wen1 ling2 xia4 wu3 she4 shi4 du4.
  • User: Could you translate that?
  • System: In Boston tomorrow, high 2 degrees
    Celsius. Tomorrow night, low -5 Celsius.

35
Example Dialogue in Weather Domain
  • What is the forecast for San Francisco
    tomorrow?
  • System paraphrases request, then answers
  • Please translate
  • High quality synthesis for translation using
    MIT's Envoice concatenative synthesis framework
  • "Could you repeat that?": system provides
    translation
  • User emulates in Mandarin and system repeats
    previous response
  • Will it rain in London?
  • "I'm sorry, I didn't understand you.": response
    given when it fails to recognize or parse the
    user query

36
Video Clip
Demo
37
Assessment
  • Phonetic aspects
  • Expand phonological rules to support non-native
    realizations (e.g., /dh/ → /d/ or schwa
    insertion)
  • Allow realizations of selected phones from native
    language to compete in recognizer search
  • Tonal aspects (Mandarin)
  • Use tone recognition system (Wang et al., 1998)
    to score tone productions; highlight
    worst-scoring words
  • Tabulate frequencies of tone errors in typed
    inputs (pinyin)
  • Use phase-vocoder techniques (Tang et al., 2001)
    to repair users' tone productions by replacing
    prosodic contour with native speech patterns
  • Fluency measures
  • Word-by-word speaking rate (Chung and Seneff, 1999)
  • Percentage of utterance containing pauses and
    disfluencies

38
Tone analysis: Native vs Non-Native Mandarin
  • Creating pitch contours
  • F0 extracted using algorithm in (Wang and Seneff,
    2000)
  • Statistics of each pitch contour over each
    syllable considered without regard for left or
    right contexts
  • Normalization
  • Duration normalized by sampling at 10 intervals
  • Pitch normalized according to
  • Comparisons based on (Wang et al., 2003)
  • Include normalized F0 value, peak, valley, range,
    peak position, valley position, falling range,
    and rising range
  • Corpus (from the Defense Language Institute)
  • 2065 utterances from 4 native speakers
  • 4657 utterances from 20 non-native speakers

39
Tonal averages over all syllables: Native Example
40

Tonal averages over all syllables: Non-Native Example
41
Capturing Phonological Errors
  • Leverage phonological modeling capabilities of
    SUMMIT
  • Model typical pronunciation errors explicitly
  • Direct and intuitive mapping from linguistic
    rules
  • Support both within-language and cross-language
    substitutions
  • Initial experiments completed on Koreans learning
    English (Kim et al.,
    ICSLP 2004)
  • Phonological rules capture typical problems such
    as schwa insertion and /dh/-/d/ confusions
  • Best path in alignment used to detect errors
  • Verbal feedback given to student
  • Current research to apply to Americans learning
    Mandarin
  • Build single recognizer to support both languages
  • Use data-driven approaches to discover most
    likely cross-language phone substitution errors
  • Explicitly encode such errors in formal
    phonological rules
  • Side benefit may be improved recognition for
    English-accented Mandarin

42
Detecting Phonological Errors
CONSONANT td CONSONANT  tcl t  tcl t ax  // No CCC allowed in Korean
dd  dcl d ax  // A vowel may be inserted after a coda consonant (Staccato Rhythm)
dh  dh  dcl d  // Becomes an onset stop as in 'they'. No dh in Korean phonemes.
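
The idea of encoding errors as alternative realizations and reading them off the matching path can be illustrated with a small Python sketch. It rests on several assumptions: the rule table is one possible reading of two of the rules above (the rule operators were lost in this transcript), context conditions are ignored, and an exact match against enumerated variants stands in for the recognizer's best-path alignment in SUMMIT.

```python
from itertools import product

# Assumed reading of the rules: canonical phone -> list of
# (realization, feedback) pairs; feedback None means native-like.
RULES = {
    "dh": [
        (("dh",), None),
        (("dcl", "d"), "/dh/ realized as the stop /d/"),
    ],
    "dd": [
        (("dcl", "d"), None),
        (("dcl", "d", "ax"), "schwa inserted after coda /d/"),
    ],
}


def detect_errors(canonical, observed):
    """Return feedback messages for the variant matching `observed`.

    Returns [] for a native-like production and None when no variant
    licensed by the rules matches the observed phone sequence.
    """
    options = [RULES.get(p, [((p,), None)]) for p in canonical]
    for choice in product(*options):
        phones = [ph for realization, _ in choice for ph in realization]
        if phones == list(observed):
            return [msg for _, msg in choice if msg]
    return None
```

For example, with canonical phones `["dh", "ey"]` and observed phones `["dcl", "d", "ey"]`, the sketch reports the /dh/ substitution, which could then drive verbal feedback to the student.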
43
Future Plans
  • Develop tools to rapidly port to new domains and
    languages
  • Automatic grammar induction
  • Generic dialogue modeling
  • Simulated dialogue interactions
  • Develop various scoring algorithms for quality
    assessment of students' speech
  • Develop high quality synthesis capability for
    Mandarin translations, for multiple domains of
    knowledge
  • Collect and transcribe data from language
    learners and evaluate both system and students
  • Begin with weather domain, our most mature system
  • Extend to other domains once they are better
    developed
  • Refine all aspects of systems based on collected
    data