Phonetic details in prosodic phenomena


1
Phonetic details in prosodic phenomena

Oliver Niebuhr
Presentation at the Laboratoire de Phonétique et Phonologie, Paris 3
January 30th, 2009
oliver.niebuhr_at_lpl-aix.fr

2
Summary of German intonation (in terms of KIM)

A number of attitudinal meanings are known to be signalled in German by prosodic means.
The speaker can convey that a piece of information is
(a) settled, concluding
(b) presenting, open(ing)
(c) astonishing
Furthermore, the speaker can
(d) superordinate or
(e) subordinate her/himself to the dialogue partner.
The specific interpretations of these attitudinal meanings may vary depending on the semantic composition and the linguistic structure of the utterance.
In negative contexts, (a) may be interpreted as resignation and (c) as disbelief / being taken aback.
(d) and (e) can convey statement and question.

3
Summary of German intonation (in terms of KIM)

In the Kiel Intonation Model (KIM, Kohler 1991), these attitudinal meanings have been assigned to pitch-accent categories.
The KIM distinguishes between 2 basic phonological classes of pitch movements:
rising-falling peaks
(falling)-rising valleys
Both co-occur with accented (i.e. perceptually salient) syllables.
Timing is phonological, not phonetic (≠ alignment) → synchronization
Relevant according to Kohler (1987, 1991): F0 maximum (peaks) or minimum (valleys) relative to the accented-vowel boundaries (onsets)

4
Summary of German intonation (in terms of KIM)

early peaks → F0 max. before acc. V onset: settled, concluding
medial peaks → F0 max. after acc. V onset: presenting
late peaks → F0 max. after acc. V offset: astonishing
early/late valleys → F0 min. before/after acc. V onset: subordination / questioning

Did you hear me?
Answer: hmm, not exactly.
Answer: yes, sure.
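
The peak categories listed on this slide can be read as a simple decision rule on where the F0 maximum falls relative to the accented vowel. The following is only a minimal sketch of that rule as stated here; the names `AccentedVowel` and `classify_peak` and the example boundary times are hypothetical, and the later slides argue that the real category boundaries are not this crisp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccentedVowel:
    """Boundaries of the accented vowel, in seconds (hypothetical container)."""
    onset: float
    offset: float


def classify_peak(f0_max_time: float, vowel: AccentedVowel) -> str:
    """Map the time of the F0 maximum onto the three peak categories."""
    if f0_max_time < vowel.onset:
        return "early"    # F0 max. before the accented-vowel onset
    if f0_max_time <= vowel.offset:
        return "medial"   # F0 max. after the onset, still inside the vowel
    return "late"         # F0 max. after the accented-vowel offset


# Hypothetical vowel boundaries and three hypothetical F0-maximum times.
vowel = AccentedVowel(onset=1.00, offset=1.15)
labels = [classify_peak(t, vowel) for t in (0.95, 1.05, 1.20)]
```

The hard thresholds are a simplification chosen for illustration only.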
5
The signalling of early, medial, and late peaks

The reference to the accented-vowel boundaries in the KIM originates from peak-shift experiments by Kohler (1987, 1991).
On the other hand, his experiments showed that the location of the category boundary is shifted for different stimulus utterances (Kohler 1991):
Sie hat ja gelogen: lateral + vowel
Sie ist ja geritten: fricative + vowel
Sie hat ja gejodelt: approximant + vowel
Why?

→ earlier boundary
→ later boundary
6
The signalling of early, medial, and late peaks

Niebuhr (2006, 2007c): maybe it is not the segment boundary between C and V in terms of a spectral change (e.g., formant transitions) that matters, but the increasing / decreasing intensity into and out of the accented vowel.
The intensity change is more abrupt for nasal + vowel or lateral + vowel sequences than for approximant + vowel.
It also depends on the vowel quality itself.
Starting from this idea:
Two F0 peak-shift series were resynthesized.
One uses the stimulus utterance "Sie war mal Malerin".
The other keeps exactly the F0 and intensity patterns of the "Malerin" series, but on a constant schwa-like vowel quality ("HUM" in Praat).
So, basically, the two stimulus series ("Malerin" and "HUM") differ just with regard to the presence / absence of the segmental string.
Two parallel perception experiments with two separate groups of subjects:
Indirect identification for the "Malerin" series
AXB test for the "HUM" series

7
The signalling of early, medial, and late peaks
Indirect identification of intonation categories via meaning
[Figure: context stimulus and test stimuli]
11
The signalling of early, medial, and late peaks
Malerin series

The dynamics of the perceptual change from early to medial decrease with decreasing dynamics of the underlying intensity change, i.e. the less abrupt the intensity change, the more gradual the perceptual change.
The same effect shows up if a less pointed F0 peak is shifted.
A comparable effect of the dynamics of the F0 and intensity courses on pitch-accent perception can be found for peak-shift series from medial to late, based on a manipulation of the decreasing intensity at the VC boundary.

12
The signalling of early, medial, and late peaks

Conclusions of Niebuhr (2006, 2007c): The picture sketched by Kohler (1987, 1991) must be refined.
The abruptness of the perceptual changes between early, medial, and late is not determined by the categories themselves.
→ The change from early to medial can be turned into a gradual one.
→ The change from medial to late can be made categorical.
The signalling of early, medial, and late is based on an interplay between F0 and intensity changes (or levels).
→ The findings support the central claim of the KIM that the synchronization of the (rising-falling) F0-peak contour relative to the vowel boundaries is decisive for the pitch-accent identification.
The findings can explain to some extent why different alignment patterns of the F0 peaks are found for different structures and segmental compositions of the accent syllable and the adjacent syllables. They also make sense in terms of articulatory anchoring.

15
The signalling of early, medial, and late peaks

→ Implication: Synchronization itself is not phonological; it is an effective, economic tool that speakers can use to highlight different parts of the pitch pattern by making use of the intensity pattern that results from the segmental string.
Alternative, complementary strategies: changing the overall peak shape, changing intensity levels, changing segment durations and articulations (e.g., openness of the vowel, sibilant / intrinsic pitch).

16
The signalling of early, medial, and late peaks

Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
Segments (phonemes) in focus:
Phrase-final /ʃ/
Like /t/ aspiration, it can create sibilant pitch and is therefore particularly likely to vary systematically according to the peak-valley difference.
Moreover, /ʃ/ is realized in German with rounding; this rounding typically already characterizes the preceding vowel, e.g., in Tisch.
So, is /ʃ/ in the context of a high-rising valley lighter than in the context of a terminal falling peak? Is rounding involved in this effect? If so, then the effect should already be noticeable in the vowel preceding /ʃ/.

17
The signalling of early, medial, and late peaks

Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
Segments (phonemes) in focus:
Phrase-final /x/
It is also able to convey pitch by means of changes in the energy pattern of the noise spectrum.
Moreover, following /u/, /x/ is realized as a rounded velar fricative [xʷ].
So, is /x/ in the context of a high-rising valley lighter than in the context of a terminal falling peak? Is rounding involved in this effect? If so, then the effect should already be noticeable in the vowel /u/ preceding /x/.

18
The signalling of early, medial, and late peaks

Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
Segments (phonemes) in focus:
Phrase-final <-er> word endings
It is realized as a vocoid sound [ɐ], which is known to show considerable dialectal variation.
So, is it also influenced by the coinciding intonation categories? If so, is it lighter (more fronted and/or open) in connection with high-rising valleys?
Phrase-final /ə/
Its phonetic quality is known to be strongly context-dependent in German.
So, is /ə/ lighter (more fronted and/or open) in connection with high-rising valleys?

19
The signalling of early, medial, and late peaks

Pairs of target words (increases the n):
Tisch, Fisch → /ʃ/ with preceding /ɪ/
Buch, Tuch → /x/ with preceding /u/
lecker, Bäcker → <-er> realized as [ɐ]
Tage, Schramme → /ə/
Placed sentence- and hence phrase-finally in contexts of high-rising valleys and terminal falling peaks
Acoustic analysis, including:
F2 (based on LPC LTAS) at three points in the vowels: 20 ms after onset, centre, 20 ms before offset (intrinsic vowel pitch)
Centre-of-Gravity (CoG), determined every 7 ms in the fricatives and then averaged across the whole fricative segment (sibilant pitch)
Segment durations of vowels and fricatives
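
The CoG measurement described above (one value every 7 ms, then averaged across the fricative) can be sketched as follows. This is an illustration under stated assumptions, not the analysis script used for the talk. CoG is taken here as the power-weighted mean frequency of a Hann-windowed frame. The 20 ms window length, the function names `frame_cog` and `mean_cog`, and the synthetic test tones are all assumptions. A naive DFT keeps the sketch dependency-free; in practice an FFT library would be used.

```python
import cmath
import math


def frame_cog(frame, sr):
    """Power-weighted mean frequency (Hz) of one Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    weighted = total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        power = abs(coeff) ** 2
        weighted += (k * sr / n) * power
        total += power
    return weighted / total  # undefined for an all-zero frame


def mean_cog(samples, sr, hop_s=0.007, win_s=0.020):
    """Average the frame-wise CoG over a segment, one frame every hop_s seconds."""
    win, hop = round(win_s * sr), round(hop_s * sr)
    starts = range(0, len(samples) - win + 1, hop)
    cogs = [frame_cog(samples[s:s + win], sr) for s in starts]
    return sum(cogs) / len(cogs)


def tone(freq, sr, n=800):
    """Synthetic sine tone standing in for a fricative segment."""
    return [math.sin(2 * math.pi * freq * i / sr) for i in range(n)]


sr = 8000
low = mean_cog(tone(1000, sr), sr)
high = mean_cog(tone(3000, sr), sr)
```

A segment with more energy at higher frequencies yields a higher mean CoG, which is the sense in which a fricative is called "lighter" on the results slides.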
20
The signalling of early, medial, and late peaks

Corpus recorded with quasi-spontaneous, informal-sounding speech, using an improved method of Kohler and Niebuhr (2007).
This means:
Written dialogue texts with informal, everyday contents/situations.
Target words are integrated sentence-finally without highlighting.
The high-rising valleys and terminal falling peaks as well as the corresponding accented syllables are elicited solely by creating appropriate semantic-pragmatic contexts.
Dialogues were produced by good friends.
They were allowed to modify the texts according to their own way of speaking (e.g., by introducing or exchanging words and particles).
One of the speakers was the experimenter (me); he tried to guide the subject with regard to speaking style, and his productions were part of the pragmatic context.
Every dialogue was produced 4 times in a row, and only the last two productions were used for the acoustic analysis.
So far, 5 subjects have been recorded (→ n = 20); 10 are planned.

21
The signalling of early, medial, and late peaks

Results for Tisch and Fisch
The sibilant /ʃ/ is considerably lighter in the contexts of the high-rising valley. This is reflected in significantly different mean CoG values. The fricative durations, however, do not differ significantly.
The productions of /ɪ/ also differ depending on the intonation context. That is, F2 (middle) is significantly higher in connection with the high-rising valley.
Supported by perceptual analysis, this effect involves de-rounding.

22
The signalling of early, medial, and late peaks

Results for Buch and Tuch
The fricative /x/ is considerably lighter in the contexts of the high-rising valley. This is reflected in significantly different mean CoG values. The fricative durations, however, do not differ significantly.
The productions of /u/ also differ slightly, in that F2 (middle) tends to be higher in connection with the high-rising valley.
Supported by perceptual analysis, this is again due to de-rounding.

[Figure: peak vs. valley]
23
The signalling of early, medial, and late peaks

Results for lecker and Bäcker
The vocoid realizations of the word ending <-er> are not consistently lighter in connection with the high-rising valley. Such an effect, i.e. a higher F2, can only be observed as a tendency towards the end of the sound.
→ In connection with the high-rising valley, <-er> becomes diphthongized and also tends to be longer than in connection with the terminal falling peak.

[Figure: peak vs. valley; values shown: 650, 1650 and 700, 1300]
24
The signalling of early, medial, and late peaks

Results for Tage and Schramme
The productions of /ə/ are lighter, i.e. they show a significantly higher F2 at the centre and towards the end of the vowel in connection with the high-rising valley.
However, the vowel durations do not differ significantly depending on the intonational context.

[Figure: peak vs. valley]
25
Summary of the intonational part

We know that the F0 course contributes to the coding of segments:
F0 rise or fall before/after obstruents is a fortis-lenis cue (Kohler 1979)
F0 relative to F1 determines the vowel quality (Traunmüller 1985)
Position of F0 turning points cues word boundaries (D'Imperio 2000, Petrone 2008)
(...)

26
Summary of the intonational part
Intonation
Lexeme, Phoneme, Phone

We must not forget that the segmental string is not (just) a troublemaker for the coding of intonational units.
The segmental contribution can be of two different kinds:
direct, e.g., by intensity-based highlighting of parts of the F0 course, by conveying different sibilant or intrinsic (i.e. vowel-based) pitches
indirect, e.g., by articulatory metaphors of attitudinal meanings, i.e. long, soft, and light articulations for astonishment vs. short, loud, dark articulations for conclusions or superordinations, etc.

27
Place assimilation in French sibilant sequences

Based on a recent investigation of Niebuhr et al. (2008) within the S2S research network
Starting from a corpus of read speech that comprised 72 sentences, which may be subdivided into three subsets:
(1) the 8 possible sibilant sequences across word boundaries that result from the cross-combination of the features 'alveolar', 'postalveolar' and 'voiced', 'voiceless', i.e. (a) /sʃ/, (b) /ʃs/, (c) /zʒ/, (d) /ʒz/, (e) /sʒ/, (f) /ʒs/, (g) /ʃz/, and (h) /zʃ/, placed in the symmetrical vowel contexts /i/___/i/, /a/___/a/, and /u/___/u/ (= 24).
(2) (a) /sʃ/, (b) /ʃs/, (c) /zʒ/, (d) /ʒz/ framed by the 6 asymmetrical vowel contexts (= 24).
(3) each of the 4 individual sibilants /s/, /ʃ/, /z/, and /ʒ/ paired across word boundaries with a labial consonant (C) like /p/ or /v/ in the two possible orders __C and C__ and framed by symmetrical vowel contexts → reference qualities for the sibilants (= 24).
All 72 sentences were read 4 times in a randomized order by 4 female native speakers of French.
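
The arithmetic behind the three subsets (3 × 24 = 72 sentences) can be checked by enumerating the design. This is a sketch only: the IPA symbols for the postalveolar sibilants, the tuple layout, and all variable names are assumptions made for illustration, and the actual sentence materials are not reproduced here.

```python
from itertools import permutations, product

VOWELS = ("i", "a", "u")
ALVEOLAR = ("s", "z")       # voiceless, voiced
POSTALVEOLAR = ("ʃ", "ʒ")   # voiceless, voiced (symbols assumed)

# Every alveolar/postalveolar pairing in both orders: 8 sequences.
sequences = []
for alv, post in product(ALVEOLAR, POSTALVEOLAR):
    sequences += [alv + post, post + alv]

# Subset 1: all 8 sequences in the 3 symmetrical vowel contexts.
subset1 = [(v, seq, v) for seq in sequences for v in VOWELS]

# Subset 2: sequences (a)-(d) in the 6 asymmetrical vowel contexts.
sequences_a_to_d = ["sʃ", "ʃs", "zʒ", "ʒz"]
subset2 = [(v1, seq, v2)
           for seq in sequences_a_to_d
           for v1, v2 in permutations(VOWELS, 2)]

# Subset 3: each single sibilant next to a labial consonant C, in both
# orders, framed by the 3 symmetrical vowel contexts.
subset3 = [(v, order, sib, v)
           for sib in ALVEOLAR + POSTALVEOLAR
           for order in ("sib+C", "C+sib")
           for v in VOWELS]

total_sentences = len(subset1) + len(subset2) + len(subset3)
total_tokens = total_sentences * 4 * 4  # 4 readings by 4 speakers
```

Under these assumptions each subset contains 24 items, matching the counts given on the slide.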

28
Place assimilation in French sibilant sequences

Based on a recent investigation of Niebuhr et al.
(2008) within the S2S research network
Starting from a corpus of read speech that
comprised 72 sentences that may be subdivided
into three subsets
(1) /uʃsu/ Tu te couches sous l'drap
(2) /asʃa/ C'est une classe chargée
(3) /uzʒu/ J'ai vendu douze journaux
(4) /azʒa/ C'est une phrase japonaise
(5) /aʃCa/ Tu te caches facilement
(6) /izCi/ Il a une devise vitale
(7) /aCsa/ Elle tape sa soeur

29
Place assimilation in French sibilant sequences

Measurements
Spectral: range and mean of the CoG across the whole section
Duration: of the whole section and, where possible, of the individual sibilants

30
Place assimilation in French sibilant sequences

Results in the temporal domain
If there were two spectrally separable sibilant sections, the postalveolar was always the longer one.
Overall, the sequences were around twice as long as the single references, alv.-postalv. in tendency even more than postalv.-alv.

31
Place assimilation in French sibilant sequences

Results in the frequency domain
The alveolar-postalveolar as well as the postalveolar-alveolar sequences were both spectrally shifted in a comparable way towards the postalveolar references.

32
Place assimilation in French sibilant sequences

Conclusions
Place assimilation in French sibilant sequences is a gradual rather than a categorical phenomenon.
It is feature-determined, not direction-determined. The target is postalveolar. Consequently, it is regressive in alveolar-postalveolar and progressive in postalveolar-alveolar sequences.
→ Elle remâche sa viande: /ʃs/ shifted towards postalveolar (progressive)
→ C'est une classe chargée: /sʃ/ shifted towards postalveolar (regressive)
The assimilated alveolar-postalveolar and postalveolar-alveolar sequences can have phonetically identical manifestations in terms of both temporal and spectral values.
But are these sequences really ambiguous?

33
Place assimilation in French sibilant sequences

Nolan (1992) found for English that word-final /d/s which were completely assimilated to following word-initial /g/s (in terms of EPG patterns) were still identified as /d/s by his subjects.
He ascribed this effect to differences in the preceding vowel.
Starting from this interesting observation, which has not been pursued further so far, we investigated whether the vowels that preceded the assimilated /sʃ/ and /ʃs/ sibilant sequences in French show differences in phonetic details that listeners can use to identify the following sibilant sequences as /sʃ/ or /ʃs/, even when these sequences are realized ambiguously.

34
Place assimilation in French sibilant sequences

This is what we found for vowel duration and vowel intensity:
The vowels /a, i, u/ were significantly longer when they preceded /ʃs/ (on average 15-20 ms, up to 60 ms).
The vowels /a, i, u/ were significantly louder when they preceded /sʃ/ (on average 2-3 dB, up to 5 dB).

35
Place assimilation in French sibilant sequences

And in addition we found for voice quality:
The vowels /a, i, u/ were significantly breathier when they preceded /ʃs/.
The voice quality was represented by the harmonic ratios H1/H2, based on narrow-band DFT spectra at three points in the vowel: 20 ms after onset, centre, 20 ms before offset.
On the other hand, the vowels before /sʃ/ sometimes show a short section of /h/-like friction before the actual sibilant sets in (different timing of breath?).
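
The H1/H2 measure mentioned above compares the amplitude of the first harmonic (at F0) with that of the second (at 2 × F0); a larger value is commonly associated with breathier phonation. The sketch below shows one way to compute it in dB from a Hann-windowed DFT evaluated at the two harmonic frequencies. It assumes that F0 is already known; the 50 ms frame, the function names, and the synthetic two-harmonic signal are hypothetical and are not the procedure or data of the study.

```python
import cmath
import math


def harmonic_amplitude(frame, sr, freq):
    """Magnitude of a Hann-windowed DFT evaluated at one frequency (Hz)."""
    n = len(frame)
    total = 0j
    for i, x in enumerate(frame):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        total += x * w * cmath.exp(-2j * math.pi * freq * i / sr)
    return abs(total)


def h1_h2_db(frame, sr, f0):
    """Level difference in dB between the first and second harmonic."""
    h1 = harmonic_amplitude(frame, sr, f0)
    h2 = harmonic_amplitude(frame, sr, 2 * f0)
    return 20 * math.log10(h1 / h2)


# Synthetic 50 ms frame: second harmonic at half the amplitude of the first.
sr, f0 = 8000, 200.0
frame = [math.sin(2 * math.pi * f0 * i / sr)
         + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sr)
         for i in range(400)]
ratio_db = h1_h2_db(frame, sr, f0)
```

For this synthetic frame the result is roughly 6 dB, since halving an amplitude corresponds to about 6 dB. In a real analysis the same function would be applied at the three measurement points in each vowel.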

36
Place assimilation in French sibilant sequences
[Figure panels: /sʃ/ and /ʃs/]
37
Place assimilation in French sibilant sequences
[Figure panels: /sʃ/ and /ʃs/]
38
Summary of the assimilation part

Initial perceptual tests were done in which (a) just the CV part of the first target syllable and (b) just the sibilant section itself was played to non-naïve native speakers of French. The preliminary results show that only for (a) were the listeners able to predict beyond chance level whether the upcoming sibilant sequence was /sʃ/ or /ʃs/.
So, the multi-parametric phonetic details in the vowels preceding the sibilant sequences might already be acoustic cues to the phonological make-up of the sibilant sequence, even if the phonetic realization of the latter is by itself ambiguous.
This suggests that the term gradual has a temporal implication beyond the sibilant sequence itself.
Moreover, the fine phonetic differences in the vowels were not found in the same way before single, non-assimilated sibilants. This might be taken as evidence that place assimilation within French sibilant sequences represents a re-organization rather than a (gradual) substitution of phonological features, in line with previous findings on many other assimilation processes across languages.
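
Whether listeners predict the upcoming sequence "beyond chance level", as in the perceptual pre-tests above, is typically checked against the guessing rate. The following is a minimal sketch of a one-sided exact binomial test. The two-alternative setup with a chance rate of 0.5 is an assumption based on the /sʃ/ vs. /ʃs/ choice, the function name is hypothetical, and the counts are invented purely for illustration; no response data are reported in the talk.

```python
from math import comb


def p_at_least(k: int, n: int, chance: float = 0.5) -> float:
    """Probability of k or more correct answers out of n by guessing alone."""
    return sum(comb(n, i) * chance ** i * (1 - chance) ** (n - i)
               for i in range(k, n + 1))


# Invented counts, for illustration only.
p_guessing = p_at_least(5, 10)      # exactly half correct
p_all_correct = p_at_least(10, 10)  # every answer correct
```

A small probability means that the observed number of correct answers would be unlikely under pure guessing.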