Meaningful Intonational Variation

1
Meaningful Intonational Variation
2
Today
  • Assigning variation for TTS, CTS
  • Contours
  • Accent
  • Phrasing
  • Pitch Range
  • Amplitude and timing

3
TTS Production Pipeline
  • Orthographic input: Dr. Smith lives on Elm Dr.
  • Text normalization: abbreviation expansion
  • Pronunciation modeling: POS id, WS disambiguation
  • Intonation assignment: parsing, POS id, robust
    semantics
  • Phonetic/phonological realization: phonological
    parsing, phonetic analysis
  • Unit selection: acoustic analysis
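
The text normalization step can be sketched for the slide's own example, where "Dr." is a title in one place and a street suffix in another. This is a minimal sketch under an assumed rule pair (title before a capitalized word, street suffix after one); the function name expand_dr is hypothetical and not taken from any TTS system.

```python
import re

def expand_dr(text: str) -> str:
    """Expand "Dr." as a title or a street suffix (illustrative rules only)."""
    # Title reading: "Dr." directly before a capitalized word, e.g. "Dr. Smith".
    text = re.sub(r"\bDr\.(?=\s+[A-Z])", "Doctor", text)

    def street(match: re.Match) -> str:
        # The abbreviation's period doubles as the sentence period when
        # "Dr." ends the text, so put it back in that case.
        tail = "." if match.end() == len(match.string) else ""
        return match.group(1) + "Drive" + tail

    # Street reading: "Dr." directly after a capitalized word, e.g. "Elm Dr."
    return re.sub(r"\b([A-Z][a-z]+\s+)Dr\.", street, text)

print(expand_dr("Dr. Smith lives on Elm Dr."))
```

Two ordered rules are enough for this sentence, but real input would need far more context than a neighboring capital letter.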

4
Intonation Assignment: Phrasing
  • Traditional: hand-built rules
  • Punctuation: 234-5682
  • Context/function word: no breaks after function
    word (He went to dinner)
  • Parse? She favors the nuts and bolts approach
  • Current: statistical analysis of large labeled
    corpus
  • Punctuation, POS window, utterance length, ...
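
The traditional hand-built rules above can be sketched as follows: break at punctuation, and otherwise break after a run of words unless the current word is a function word. The function-word list, the max_len threshold, and the name phrase_breaks are all assumptions made for illustration.

```python
# Illustrative, deliberately tiny function-word list (an assumption).
FUNCTION_WORDS = {"a", "an", "the", "to", "of", "in", "on", "at", "and", "or", "but"}
PUNCTUATION = ",.;:?!"

def phrase_breaks(tokens, max_len=4):
    """Return the indices of tokens that are followed by a phrase break."""
    breaks, run = [], 0
    for i, tok in enumerate(tokens):
        run += 1
        is_last = i == len(tokens) - 1
        if is_last or tok[-1] in PUNCTUATION:
            # Rule 1: always break at punctuation and at the end.
            breaks.append(i)
            run = 0
        elif run >= max_len and tok.lower() not in FUNCTION_WORDS:
            # Rule 2: break up long runs, but never after a function word.
            breaks.append(i)
            run = 0
    return breaks

print(phrase_breaks("He went to dinner".split(), max_len=3))
print(phrase_breaks("She favors the nuts and bolts approach".split(), max_len=3))
```

On the second sentence this sketch places a break after "nuts", splitting "nuts and bolts". That is the kind of error that motivates the "Parse?" bullet.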

5
Functions of Phrasing
  • Disambiguates syntactic constructions, e.g. PP
    attachment
  • S: You should buy the ticket with the discount
    coupon.
  • Disambiguates scope ambiguities, e.g. negation
  • S: You aren't booked through Rome because of the
    fare.
  • Or modifier scope
  • S: This fare is restricted to retired politicians
    and civil servants.

6
Intonation Assignment: Accent
  • Hand-built rules
  • Function/content distinction: He went out the back
    door / He threw out the trash
  • Complex nominals
  • Main Street/Park Avenue
  • city hall parking lot
  • Statistical procedures trained on large corpora
  • Contrastive stress, given/new distinction?
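
The function/content rule can be sketched as a lookup on part-of-speech tags. The Penn Treebank-style tags below are assigned by hand for the slide's two sentences, and the CONTENT_TAGS set is an assumption. Note that "out" is a preposition (IN) in the first sentence and a particle (RP) in the second, so only the second gets an accent.

```python
# Tags treated as "content" for accenting purposes (an illustrative choice).
CONTENT_TAGS = {"NN", "NNS", "NNP", "VB", "VBD", "VBZ", "JJ", "RB", "RP"}

def assign_accents(tagged_words):
    """Mark each (word, tag) pair as accented if its tag is a content tag."""
    return [(word, tag in CONTENT_TAGS) for word, tag in tagged_words]

# Hand-tagged examples; a real system would take these from a POS tagger.
back_door = [("He", "PRP"), ("went", "VBD"), ("out", "IN"),
             ("the", "DT"), ("back", "JJ"), ("door", "NN")]
trash = [("He", "PRP"), ("threw", "VBD"), ("out", "RP"),
         ("the", "DT"), ("trash", "NN")]

for sentence in (back_door, trash):
    # Print accented words in capitals.
    print(" ".join(w.upper() if accented else w
                   for w, accented in assign_accents(sentence)))
```

A rule like this says nothing about complex nominals, contrast, or given/new status, which is where the slide's remaining bullets come in.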

7
Functions of Pitch Accent
  • Given/new information
  • S: Do you need a return ticket?
  • U: No, thanks, I don't need a return.
  • Contrast (narrow focus)
  • U: No, thanks, I don't need a RETURN. (I need a
    time schedule, receipt, ...)
  • Disambiguation of discourse markers
  • S: Now let me get you the train information.
  • U: Okay (thanks) vs. Okay... (but I really want ...)

8
Intonation Assignment: Contours
  • Simple rules
  • ".": declarative contour
  • "?": yes-no-question contour, unless wh-word
    present at/near front of sentence
  • Well, how did he do it? And what do you know?
  • What else might we do?
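
The simple punctuation rules can be sketched as below. The wh-word list, the three-word window standing in for "at/near front", and the choice to give wh-questions the declarative contour are all assumptions; the slide only says they do not get the yes-no-question contour.

```python
WH_WORDS = {"who", "whom", "whose", "what", "which", "when", "where", "why", "how"}

def assign_contour(sentence, window=3):
    """Pick a contour label from final punctuation and sentence-initial words."""
    s = sentence.strip()
    if s.endswith("?"):
        # "At/near front" is approximated by the first `window` words.
        front = [w.strip(",.;:?!").lower() for w in s.split()[:window]]
        if any(w in WH_WORDS for w in front):
            return "declarative"  # assumption: wh-questions fall like statements
        return "yes-no-question"
    return "declarative"

for example in ("Well, how did he do it?", "And what do you know?",
                "What else might we do?", "Can you open the door?"):
    print(example, "->", assign_contour(example))
```

The rule cannot tell whether "And what do you know?" is a real question or an exclamation, so it is at best a default.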

9
Contours: Accent + Phrasing
  • What do intonational contours mean (Ladd 80,
    Bolinger 89)?
  • Speech acts (statements, questions, requests)
  • S: That'll be credit card? (L* H- H%)
  • Propositional attitude (uncertainty, incredulity)
  • S: You'd like an evening flight. (L*+H L- H%)
  • Speaker affect (anger, happiness, love)
  • U: I said four SEVEN one! (L+H* L- L%)
  • Personality
  • S: Welcome to the Sunshine Travel System.

10
  • Propositional attitude (uncertainty)
  • Did you feed the animals?
  • I fed the L*+H goldfish L-H%
  • Distinguish direct/indirect speech acts
  • Can you open the door?

11
The TTS Front End Today
  • Corpus-based statistical methods instead of
    hand-built rule-sets
  • Dictionaries instead of rules (but fall-back to
    rules)
  • Modest attempts to infer contrast, given/new
  • Text analysis tools: POS tagger, morphological
    analyzer, little parsing

12
TTS: Where are we now?
  • Natural sounding speech for some utterances
  • Where good match between input and database
  • Still hard to vary prosodic features and retain
    naturalness
  • Yes-no questions: Do you want to fly first class?
  • Context-dependent variation still hard to infer
    from text and hard to realize naturally

13
  • Appropriate contours from text
  • Emphasis, de-emphasis to convey focus, given/new
    distinction: I own a cat. Or, rather, my cat
    owns me.
  • Variation in pitch range, rate, pausal duration
    to convey topic structure
  • Characteristics of emotional speech little
    understood, so hard to convey a voice that
    sounds friendly, sympathetic, authoritative.
  • How to mimic real voices?

14
TTS vs. CTS
  • Decisions in Text-to-Speech (TTS) depend on
    syntax, information status, topic structure
    (information explicitly available to NLG)
  • Concept-to-Speech (CTS) systems should be able to
    specify better prosody: the system knows what
    it wants to say and can specify how
  • But... generating prosody for CTS isn't so easy

15
To(nes and) B(reak) I(ndices)
  • Developed by prosody researchers in four meetings
    over 1991-94
  • Goals
  • devise common labeling scheme for Standard
    American English that is robust and reliable
  • promote collection of large, prosodically
    labeled, shareable corpora
  • ToBI standards also proposed for Japanese,
    German, Italian, Spanish, British and Australian
    English,....

16
  • Minimal ToBI transcription
  • recording of speech
  • f0 contour
  • ToBI tiers
  • orthographic tier: words
  • break-index tier: degrees of junction (Price et
    al. 89)
  • tonal tier: pitch accents, phrase accents,
    boundary tones (Pierrehumbert 80)
  • miscellaneous tier: disfluencies, non-speech
    sounds, etc.
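
The four tiers can be pictured as one record with parallel, time-stamped label lists. This is a sketch of a possible in-memory layout, not the ToBI file format. The class name, field names, and all time stamps are made up for illustration, and the break indices on the example are illustrative too.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ToBITranscription:
    words: List[Tuple[float, str]]    # orthographic tier: (end time in s, word)
    breaks: List[Tuple[float, int]]   # break-index tier: (time in s, index 0-4)
    tones: List[Tuple[float, str]]    # tonal tier: (time in s, tone label)
    misc: List[Tuple[float, str]] = field(default_factory=list)  # disfluencies etc.

    def is_aligned(self) -> bool:
        """One break index per word, each within the 0-4 range."""
        return (len(self.words) == len(self.breaks)
                and all(0 <= index <= 4 for _, index in self.breaks))

# Invented time stamps for the utterance "I fed the goldfish".
example = ToBITranscription(
    words=[(0.15, "I"), (0.40, "fed"), (0.52, "the"), (1.10, "goldfish")],
    breaks=[(0.15, 1), (0.40, 1), (0.52, 1), (1.10, 4)],
    tones=[(0.70, "L*+H"), (1.10, "L-H%")],
)
print(example.is_aligned())
```

Keeping the tiers separate mirrors the labeling practice, where each tier is annotated against the same recording and f0 contour.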

17
Sample ToBI Labeling
18
  • Online training material, available at
  • http://www.ling.ohio-state.edu/phonetics/ToBI/
  • Evaluation
  • Good inter-labeler reliability for expert and
    naive labelers: 88% agreement on presence/absence
    of tonal category, 81% agreement on category
    label, 91% agreement on break indices to within 1
    level (Silverman et al. 92, Pitrelli et al. 94)
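
Agreement figures of this kind can be computed as the share of labeler pairs that agree, pooled over all labeled items. The sketch below assumes that pairwise definition. The function name and the toy labels are invented for illustration and are not data from the cited studies.

```python
from itertools import combinations

def pairwise_agreement(labels_per_item, same=lambda a, b: a == b):
    """Fraction of labeler pairs that agree, pooled over all items."""
    agree = total = 0
    for labels in labels_per_item:
        for a, b in combinations(labels, 2):
            total += 1
            agree += bool(same(a, b))
    return agree / total if total else 0.0

# Toy data: three labelers, two words each.
tone_labels = [["H*", "H*", "L*"], ["H*", "H*", "H*"]]
break_labels = [[3, 4, 4], [1, 3, 1]]

print(pairwise_agreement(tone_labels))
# "Within 1 level" agreement for break indices.
print(pairwise_agreement(break_labels, same=lambda a, b: abs(a - b) <= 1))
```

Passing the comparison as a function lets the same routine cover exact category match and the looser "within 1 level" criterion used for break indices.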

19
Pitch Accent/Prominence in ToBI
  • Which items are made intonationally prominent and
    how?
  • Accent type
  • H*: simple high (declarative)
  • L*: simple low (yes-no question)
  • L*+H: scooped, late rise (uncertainty/
    incredulity)
  • L+H*: early rise to stress (contrastive focus)
  • H+!H*: fall onto stress (implied familiarity)

20
  • Downstepped accents
  • !H*,
  • L+!H*,
  • L*+!H
  • Degree of prominence
  • within a phrase: HiF0
  • across phrases
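
A tonal-tier string such as "L*+H L-H%" can be split into its pitch accents, phrase accent, and boundary tone by checking each label against the ToBI inventory. This is a minimal sketch for a single phrase. The function name parse_tones is hypothetical, and the inventories are limited to the labels that appear in this presentation.

```python
PITCH_ACCENTS = {"H*", "L*", "L*+H", "L+H*", "H+!H*", "!H*", "L+!H*", "L*+!H"}
PHRASE_ACCENTS = {"H-", "L-"}
BOUNDARY_TONES = {"H%", "L%"}

def parse_tones(label_string):
    """Sort the labels of one phrase into accent, phrase accent, boundary tone."""
    parsed = {"pitch_accents": [], "phrase_accent": None, "boundary_tone": None}
    for token in label_string.split():
        # A fused label such as "L-H%" is a phrase accent plus a boundary tone.
        if token.endswith("%") and "-" in token:
            head, _, tail = token.partition("-")
            parts = [head + "-", tail]
        else:
            parts = [token]
        for part in parts:
            if part in PITCH_ACCENTS:
                parsed["pitch_accents"].append(part)
            elif part in PHRASE_ACCENTS:
                parsed["phrase_accent"] = part
            elif part in BOUNDARY_TONES:
                parsed["boundary_tone"] = part
            else:
                raise ValueError(f"unknown tone label: {part!r}")
    return parsed

print(parse_tones("L*+H L-H%"))
```

Rejecting unknown labels catches transcription slips such as a missing "*" before they reach later processing.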

21
Prosodic Phrasing in ToBI
  • Levels of phrasing
  • intermediate phrase: one or more pitch accents
    plus a phrase accent (H- or L-)
  • intonational phrase: one or more intermediate
    phrases plus a boundary tone (H% or L%)
  • ToBI break-index tier
  • 0: no word boundary
  • 1: word boundary
  • 2: strong juncture with no tonal markings
  • 3: intermediate phrase boundary
  • 4: intonational phrase boundary
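
The two levels of phrasing can be recovered from the break-index tier: an index of 3 or 4 closes an intermediate phrase, and an index of 4 also closes the intonational phrase. The sketch below assumes one break index per word. The function name and the example indices are invented for illustration.

```python
def group_phrases(words, breaks):
    """Nest words into intonational phrases made of intermediate phrases.

    `breaks[i]` is the break index that follows `words[i]`.
    """
    intonational, intermediate, current = [], [], []
    for word, index in zip(words, breaks):
        current.append(word)
        if index >= 3:  # 3 or 4 ends the current intermediate phrase
            intermediate.append(current)
            current = []
        if index == 4:  # 4 also ends the intonational phrase
            intonational.append(intermediate)
            intermediate = []
    # Flush anything left open if the final index was below 4.
    if current:
        intermediate.append(current)
    if intermediate:
        intonational.append(intermediate)
    return intonational

words = "I don't want cereal I want toast".split()
print(group_phrases(words, [1, 1, 1, 3, 1, 1, 4]))
print(group_phrases(words, [1, 1, 1, 4, 1, 1, 4]))
```

With a 3 after "cereal" the utterance is one intonational phrase containing two intermediate phrases; with a 4 it becomes two intonational phrases.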

22
(No Transcript)
23
(No Transcript)
24
Contour Examples
  • http://www.cs.columbia.edu/julia/cs6998/cards/examples.html

25
And Other Things Contribute: Pitch Range and
Timing (Rate, Pause)
  • Level of speaker engagement
  • Hello vs. HELLO
  • Contour interpretation
  • Rise/fall/rise (L*+H L-H%): Elephantiasis isn't
    incurable
  • Discourse/topic structure: paratones

26
Corpus-Based Research
  • Predicting accent, phrasing, contours from large
    ToBI-labeled corpora
  • Features
  • Word position, POS window, word co-occurrence,
    punctuation, capitalization, sentence length,
    paragraph position, ...
  • Results
  • 80-85% correct accent prediction
  • 92-96% correct phrase boundary prediction
  • Contours????
  • Reality

27
  • This is my version of a rather long sentence
    which ideally should be broken into several
    phrases automatically by a smart system but we
    don't know if this will actually happen do we?
  • Is a yes-no question uttered with falling
    intonation? Does that sound delightful?
    Mellifluous?
  • I don't want cereal; I want toast.
  • ...

28
Next
  • Story analysis and generation (readings will be
    available later this week; we'll send mail)