Title: Phonetic details in prosodic phenomena
1 Phonetic details in prosodic phenomena
- Oliver Niebuhr
- Presentation at the Laboratoire de Phonétique et Phonologie, Paris 3, January 30th, 2009
- oliver.niebuhr_at_lpl-aix.fr
2 Summary of German intonation (in terms of KIM)
- A number of attitudinal meanings are known to be signalled in German by prosodic means.
- The speaker can convey that a piece of information is
- (a) settled, concluding
- (b) presenting, open(ing)
- (c) astonishing
- Furthermore, the speaker can
- (d) superordinate or
- (e) subordinate her/himself to the dialogue partner
- The specific interpretations of these attitudinal meanings may vary depending on the semantic composition and the linguistic structure of the utterance.
- In negative contexts, (a) may be interpreted as resignation and (c) as disbelieving / taken aback.
- (d) and (e) can convey statement and question.
3 Summary of German intonation (in terms of KIM)
- In the Kiel Intonation Model (KIM, Kohler 1991), these attitudinal meanings have been assigned to pitch-accent categories.
- The KIM distinguishes between 2 basic phonological classes of pitch movements
- rising-falling peaks
- (falling)-rising valleys
- Both co-occur with accented (i.e. perceptually salient) syllables.
- Timing is phonological, not phonetic: synchronization rather than alignment.
- Relevant according to Kohler (1987, 1991): F0 maximum (peaks) or minimum (valleys) relative to the accented-vowel boundaries (onsets)
4 Summary of German intonation (in terms of KIM)
- early peaks → F0 max. before accented-vowel onset: settled, concluding
- medial peaks → F0 max. after accented-vowel onset: presenting
- late peaks → F0 max. after accented-vowel offset: astonishing
- early/late valleys → F0 min. before/after accented-vowel onset: subordination → questioning
- Example: "Did you hear me?"
- Answer: "Hmm, not exactly."
- Answer: "Yes, sure."
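The timing criteria listed on this slide can be sketched as a small decision rule. This is only a literal reading of the slide, not Kohler's procedure: the function name `classify_peak`, the `MEANINGS` table, and the use of hard time thresholds are assumptions made for illustration, and the later slides argue that the perceptual boundaries are not tied to the segment boundaries in such a simple way.

```python
# Hypothetical helper: labels a rising-falling F0 peak by the position of its
# maximum relative to the accented vowel, following the slide's wording.
MEANINGS = {
    "early": "settled, concluding",
    "medial": "presenting",
    "late": "astonishing",
}


def classify_peak(f0_max_time, vowel_onset, vowel_offset):
    """Return 'early', 'medial', or 'late' for an F0 maximum (times in seconds)."""
    if vowel_offset <= vowel_onset:
        raise ValueError("vowel_offset must be later than vowel_onset")
    if f0_max_time < vowel_onset:
        return "early"   # F0 maximum before the accented-vowel onset
    if f0_max_time > vowel_offset:
        return "late"    # F0 maximum after the accented-vowel offset
    return "medial"      # F0 maximum inside the accented vowel


if __name__ == "__main__":
    for t in (0.08, 0.15, 0.30):
        label = classify_peak(t, vowel_onset=0.10, vowel_offset=0.25)
        print(t, label, MEANINGS[label])
```

The hard thresholds are the simplest possible choice; a perceptually motivated version would have to take the intensity course into account, as the following slides suggest.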
5 The signalling of early, medial, and late peaks
- The reference to the accented-vowel boundaries in the KIM originates from peak-shift experiments by Kohler (1987, 1991).
- On the other hand, his experiments showed that the location of the category boundary is shifted for different stimulus utterances (Kohler 1991)
- Sie hat ja gelogen (lateral + vowel)
- Sie ist ja geritten (fricative + vowel)
- Sie hat ja gejodelt (approximant + vowel)
- → earlier boundary in some utterances, later boundary in others. Why?
6 The signalling of early, medial, and late peaks
- Niebuhr (2006, 2007c): maybe it is not the segment boundary between C and V in terms of a spectral change (e.g., formant transitions) that matters, but the increasing / decreasing intensity into and out of the accented vowel.
- The intensity change is more abrupt for nasal + vowel or lateral + vowel sequences than for approximant + vowel sequences.
- It also depends on the vowel quality itself.
- Starting from this idea,
- Two F0 peak-shift series were resynthesized
- One uses the stimulus utterance Sie war mal Malerin
- The other keeps exactly the F0 and intensity patterns of the Malerin series, but on a constant schwa-like vowel quality ("HUM" in Praat)
- So, basically, the two stimulus series (Malerin and HUM) differ just with regard to the presence / absence of the segmental string.
- Two parallel perception experiments with two separate groups of subjects
- Indirect identification for the Malerin series
- AXB test for the HUM series
7 The signalling of early, medial, and late peaks
Indirect identification of intonation categories via meaning
[Figure: context stimulus and test stimuli]
8 The signalling of early, medial, and late peaks
9 The signalling of early, medial, and late peaks
10 The signalling of early, medial, and late peaks
11 The signalling of early, medial, and late peaks
Malerin series
- The perceptual change from early to medial becomes less abrupt as the underlying intensity change becomes less abrupt.
- The same effect shows up if a less pointed F0 peak is shifted.
- A comparable effect of the dynamics of the F0 and intensity courses on pitch-accent perception can be found for peak-shift series from medial to late, based on a manipulation of the decreasing intensity at the VC boundary.
12 The signalling of early, medial, and late peaks
- Conclusions of Niebuhr (2006, 2007c): the picture sketched by Kohler (1987, 1991) must be refined.
- The abruptness of the perceptual changes between early, medial, and late is not determined by the categories themselves.
- The change from early to medial can be turned into a gradual one.
- The change from medial to late can be made categorical.
- The signalling of early, medial, and late is based on an interplay between F0 and intensity changes (or levels).
- → The findings support the central claim of the KIM that the synchronization of the (rising-falling) F0-peak contour relative to the vowel boundaries is decisive for pitch-accent identification.
- The findings can explain, to some extent, why different alignment patterns of the F0 peaks are found for different structures and segmental compositions of the accented syllable and the adjacent syllables. They also make sense in terms of articulatory anchoring.
13 The signalling of early, medial, and late peaks
14 The signalling of early, medial, and late peaks
[Figure label: "mmm"]
15 The signalling of early, medial, and late peaks
- → Implication: synchronization itself is not phonological; it is an effective, economic tool that speakers can use to highlight different parts of the pitch pattern by making use of the intensity pattern that results from the segmental string.
- Alternative, complementary strategies: changing the overall peak shape, changing intensity levels, changing segment durations and articulations (e.g., openness of the vowel, sibilant / intrinsic pitch)
16 The signalling of early, medial, and late peaks
- Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
- Segments (phonemes) in focus
- Phrase-final /ʃ/
- Like /t/ aspiration, it can create sibilant pitch and is therefore particularly likely to vary systematically according to the peak-valley difference.
- Moreover, /ʃ/ is realized in German with rounding → this rounding typically characterizes the preceding vowel already, e.g., in Tisch.
- So, is /ʃ/ in the context of a high-rising valley lighter than in the context of a terminal falling peak? Is rounding involved in this effect? If so, then the effect should already be noticeable in the vowel preceding /ʃ/.
17 The signalling of early, medial, and late peaks
- Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
- Segments (phonemes) in focus
- Phrase-final /x/
- It is also able to convey pitch by means of changes in the energy pattern of the noise spectrum.
- Moreover, following /u/, /x/ is realized as a rounded velar fricative [xʷ].
- So, is /x/ in the context of a high-rising valley lighter than in the context of a terminal falling peak? Is rounding involved in this effect? If so, then the effect should already be noticeable in the vowel /u/ preceding /x/.
18 The signalling of early, medial, and late peaks
- Intonational difference in focus: phrase-final (nuclear) high-rising valleys vs. terminal falling peaks
- Segments (phonemes) in focus
- Phrase-final <-er> word endings
- The ending is realized as a vocoid sound [ɐ], which is known to show considerable dialectal variation between ?? and ?
- So, is it also influenced by the coinciding intonation categories? If so, is it lighter (more fronted and/or open) in connection with high-rising valleys?
- Phrase-final /ə/
- Its phonetic quality is known to be strongly context-dependent in German.
- So, is /ə/ lighter (more fronted and/or open) in connection with high-rising valleys?
19 The signalling of early, medial, and late peaks
- Pairs of target words (increases the n)
- Tisch, Fisch → /ʃ/ with preceding /ɪ/
- Buch, Tuch → /x/ with preceding /u/
- lecker, Bäcker → <-er> realized as [ɐ]
- Tage, Schramme → /ə/
- Placed sentence-finally, and hence phrase-finally, in contexts of high-rising valleys and terminal falling peaks
- Acoustic analysis, including
- F2 (based on LPC LTAS) at three points in the vowels: 20 ms after onset, centre, 20 ms before offset
- Centre of gravity (CoG), determined every 7 ms in the fricatives and then averaged across the whole fricative segment
- Segment durations of vowels and fricatives
[Slide labels: sibilant pitch, intrinsic vowel pitch]
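The CoG measure described above (one value every 7 ms, averaged across the fricative) might be computed roughly as follows. This is a minimal sketch, not the analysis script used for the study: the function names, the non-overlapping rectangular 7 ms frames, and the power weighting of the spectrum are all assumptions. A plain DFT is used so that the example needs only the standard library.

```python
import cmath
import math


def power_spectrum(frame, sample_rate):
    """Naive DFT of one frame; returns bin frequencies (Hz) and power values."""
    n = len(frame)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        powers.append(abs(coeff) ** 2)
    return freqs, powers


def centre_of_gravity(frame, sample_rate):
    """Power-weighted mean frequency of one frame (assumed CoG definition)."""
    freqs, powers = power_spectrum(frame, sample_rate)
    total = sum(powers)
    if total == 0:
        return None  # silent frame: CoG is undefined
    return sum(f * p for f, p in zip(freqs, powers)) / total


def mean_cog(samples, sample_rate, step_s=0.007):
    """Average the CoG over consecutive frames of step_s seconds."""
    step = max(1, round(step_s * sample_rate))
    cogs = []
    for start in range(0, len(samples) - step + 1, step):
        cog = centre_of_gravity(samples[start:start + step], sample_rate)
        if cog is not None:
            cogs.append(cog)
    return sum(cogs) / len(cogs) if cogs else None


if __name__ == "__main__":
    sr = 16000
    # Synthetic stand-in for a fricative: a 4 kHz tone lasting 70 ms.
    tone = [math.sin(2 * math.pi * 4000 * i / sr) for i in range(1120)]
    print(mean_cog(tone, sr))
```

For real recordings an FFT library and a tapered window would be the more usual choice; the structure of the computation stays the same.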
20 The signalling of early, medial, and late peaks
- Corpus recorded with quasi-spontaneous, informal-sounding speech, using an improved version of the method of Kohler and Niebuhr (2007)
- This means
- Written dialogue texts with informal, everyday contents/situations.
- Target words are integrated sentence-finally without highlighting.
- The high-rising valleys and terminal falling peaks as well as the corresponding accented syllables are elicited solely by creating appropriate semantic-pragmatic contexts.
- Dialogues were produced by good friends.
- They were allowed to modify the texts according to their own way of speaking (e.g., by introducing or exchanging words and particles).
- One of the speakers was the experimenter (me); he tried to guide the subject with regard to speaking style, and his productions were part of the pragmatic context.
- Every dialogue was produced 4 times in a row, and only the last two productions were used for the acoustic analysis.
- So far, 5 subjects have been recorded (→ n = 20); 10 are planned.
21 The signalling of early, medial, and late peaks
- Results for Tisch and Fisch
- The sibilant /ʃ/ is considerably lighter in the contexts of the high-rising valley. This is reflected in significantly different mean CoG values. The fricative durations, however, do not differ significantly.
- The productions of /ɪ/ also differ depending on the intonation context. That is, F2 (middle) is significantly higher in the context of the high-rising valley.
- Supported by perceptual analysis, this effect involves de-rounding.
22 The signalling of early, medial, and late peaks
- Results for Buch and Tuch
- The fricative /x/ is considerably lighter in the contexts of the high-rising valley. This is reflected in significantly different mean CoG values. The fricative durations, however, do not differ significantly.
- The productions of /u/ also differ slightly, in that F2 (middle) tends to be higher in the context of the high-rising valley.
- Supported by perceptual analysis, this is again due to de-rounding.
[Figure: peak vs. valley]
23 The signalling of early, medial, and late peaks
- Results for lecker and Bäcker
- The vocoid realizations of the word ending <-er> are not consistently lighter in connection with the high-rising valley. Such an effect, i.e. a higher F2, can only be observed as a tendency towards the end of the sound.
- → In connection with the high-rising valley, <-er> becomes a diphthongized ???, which also tends to be longer than in connection with the terminal falling peak.
[Figure: peak vs. valley; values shown: 650, 1650 and 700, 1300]
24 The signalling of early, medial, and late peaks
- Results for Tage and Schramme
- The productions of /ə/ are lighter, i.e. they show a significantly higher F2 at the centre and towards the end of the vowel in connection with the high-rising valley.
- However, the vowel durations do not differ significantly depending on the intonational context.
[Figure: peak vs. valley]
25 Summary of the intonational part
- We know that the F0 course contributes to the coding of segments
- F0 rise or fall before/after obstruents is a fortis-lenis cue (Kohler 1979)
- F0 relative to F1 determines the vowel quality (Traunmüller 1985)
- Position of F0 turning points cues word boundaries (D'Imperio 2000, Petrone 2008)
- (...)
26 Summary of the intonational part
[Diagram labels: Intonation; Lexeme, Phoneme, Phone]
- We must not forget that the segmental string is not (just) a troublemaker for the coding of intonational units.
- The segmental contribution can be of two different kinds
- direct: e.g., by intensity-based highlighting of parts of the F0 course, or by conveying different sibilant or intrinsic (i.e. vowel-based) pitches
- indirect: e.g., by articulatory metaphors of attitudinal meanings, i.e. long, soft, and light articulations for astonishment vs. short, loud, dark articulations for conclusions or superordinations, etc.
27 Place assimilation in French sibilant sequences
- Based on a recent investigation by Niebuhr et al. (2008) within the S2S research network
- Starting from a corpus of read speech that comprised 72 sentences, which may be subdivided into three subsets
- (1) the 8 possible sibilant sequences across word boundaries that result from the cross-combination of the features 'alveolar', 'postalveolar' and 'voiced', 'voiceless', i.e. (a) /sʃ/, (b) /ʃs/, (c) /zʒ/, (d) /ʒz/, (e) /sʒ/, (f) /ʒs/, (g) /ʃz/, and (h) /zʃ/ → placed in the symmetrical vowel contexts /i/___/i/, /a/___/a/, and /u/___/u/ (= 24).
- (2) (a) /sʃ/, (b) /ʃs/, (c) /zʒ/, (d) /ʒz/ framed by the 6 asymmetrical vowel contexts (= 24).
- (3) each of the 4 individual sibilants /s/, /ʃ/, /z/, and /ʒ/ paired across word boundaries with a labial consonant (C) like /p/ or /v/ in the two possible orders __C and C__ and framed by symmetrical vowel contexts → reference qualities for the sibilants (= 24).
- All 72 sentences were read 4 times in a randomized order by 4 female native speakers of French
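The arithmetic behind the three subsets (24 + 24 + 24 = 72 sentences) can be checked by enumerating the design. The sketch below is illustrative only: the function names are hypothetical, `"C"` stands for any labial consonant, and the choice of the four voicing-matched sequences for subset (2) is an assumption based on the reading of items (a) to (d) above.

```python
from itertools import permutations, product

ALVEOLAR = ["s", "z"]
POSTALVEOLAR = ["ʃ", "ʒ"]
VOWELS = ["i", "a", "u"]
LABIAL = "C"  # placeholder for a labial consonant such as /p/ or /v/


def all_sequences():
    """Every alveolar/postalveolar pairing in both orders (2 x 2 x 2 = 8)."""
    seqs = []
    for alv, post in product(ALVEOLAR, POSTALVEOLAR):
        seqs.append(alv + post)
        seqs.append(post + alv)
    return seqs


def build_corpus():
    """Return the three subsets as lists of (vowel, consonants, vowel) frames."""
    symmetric = [(v, v) for v in VOWELS]          # 3 contexts
    asymmetric = list(permutations(VOWELS, 2))    # 6 contexts
    subset1 = [(v1, seq, v2)
               for seq in all_sequences() for v1, v2 in symmetric]
    matched = ["sʃ", "ʃs", "zʒ", "ʒz"]  # assumption: items (a) to (d)
    subset2 = [(v1, seq, v2)
               for seq in matched for v1, v2 in asymmetric]
    subset3 = [(v1, pair, v2)
               for sib in ALVEOLAR + POSTALVEOLAR
               for pair in (sib + LABIAL, LABIAL + sib)
               for v1, v2 in symmetric]
    return subset1, subset2, subset3


if __name__ == "__main__":
    s1, s2, s3 = build_corpus()
    sentences = len(s1) + len(s2) + len(s3)
    # 4 readings by each of 4 speakers
    print(len(s1), len(s2), len(s3), sentences, sentences * 4 * 4)
```

With 4 readings by each of 4 speakers, the 72 sentences yield 72 x 4 x 4 = 1152 recorded tokens.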
28 Place assimilation in French sibilant sequences
- Based on a recent investigation by Niebuhr et al. (2008) within the S2S research network
- Starting from a corpus of read speech that comprised 72 sentences, which may be subdivided into three subsets
- (1) /uʃsu/ Tu te couches sous l'drap
- (2) /asʃa/ C'est une classe chargée
- (3) /uzʒu/ J'ai vendu douze journaux
- (4) /azʒa/ C'est une phrase japonaise
- (5) /aʃCa/ Tu te caches facilement
- (6) /izCi/ Il a une devise vitale
- (7) /aCsa/ Elle tape sa soeur
29 Place assimilation in French sibilant sequences
- Measurements
- Spectral: range and mean of the CoG across the whole section
- Duration: of the whole section and, where possible, of the individual sibilants
30 Place assimilation in French sibilant sequences
- Results in the temporal domain
- If there were two spectrally separable sibilant sections, the postalveolar was always the longer one.
- Overall, the sequences were around twice as long as the single references, with alveolar-postalveolar sequences tending to be even longer than postalveolar-alveolar sequences.
31 Place assimilation in French sibilant sequences
- Results in the frequency domain
- The alveolar-postalveolar and the postalveolar-alveolar sequences were both spectrally shifted in a comparable way towards the postalveolar references.
32 Place assimilation in French sibilant sequences
- Conclusions
- Place assimilation in French sibilant sequences is a gradual rather than a categorical phenomenon.
- It is feature-determined, not direction-determined. The target is postalveolar. Consequently, it is regressive in alveolar-postalveolar and progressive in postalveolar-alveolar sequences.
- → Elle remâche sa viande: /ʃs/ → [ʃ]
- → C'est une classe chargée: /sʃ/ → [ʃ]
- The assimilated alveolar-postalveolar and postalveolar-alveolar sequences can have phonetically identical manifestations in terms of both temporal and spectral values.
- But are these sequences really ambiguous?
33 Place assimilation in French sibilant sequences
- Nolan (1992) found for English that word-final /d/s which were completely assimilated to following word-initial /g/s (in terms of EPG patterns) were still identified as /d/s by his subjects.
- He ascribed this effect to differences in the preceding vowel.
- Starting from this interesting observation, which has not been pursued further so far, we investigated whether the vowels that preceded the assimilated /sʃ/ and /ʃs/ sibilant sequences in French show differences in phonetic details. Listeners could use such details to identify the following sibilant sequence as /sʃ/ or /ʃs/ even when it is ambiguously realized as [ʃ].
34 Place assimilation in French sibilant sequences
- This is what we found for vowel duration and vowel intensity
- The vowels /a, i, u/ were significantly longer when they preceded /ʃs/ (on average 15-20 ms, up to 60 ms).
- The vowels /a, i, u/ were significantly louder when they preceded /sʃ/ (on average 2-3 dB, up to 5 dB).
35 Place assimilation in French sibilant sequences
- And in addition we found for voice quality
- The vowels /a, i, u/ were significantly breathier when they preceded /ʃs/.
- The voice quality was represented by the harmonic ratio H1/H2, based on narrow-band DFT spectra at three points in the vowel: 20 ms after onset, centre, 20 ms before offset.
- On the other hand, the vowels before /sʃ/ sometimes show a short section of /h/-like friction before the actual sibilant sets in (different timing of breath?)
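An H1/H2 measurement of the kind described above might look roughly like the sketch below. It is not the procedure used in the study: the function names, the Hann window, the 10 % search range around each harmonic, and the requirement that F0 is already known are assumptions. The ratio is expressed as a level difference in dB (H1 minus H2), and the three measurement points follow the slide.

```python
import cmath
import math


def measurement_points(onset, offset, margin=0.020):
    """Times (s) 20 ms after onset, at the centre, and 20 ms before offset."""
    if offset - onset <= 2 * margin:
        raise ValueError("vowel too short for the requested margin")
    return [onset + margin, (onset + offset) / 2, offset - margin]


def dft_magnitude(frame, k):
    """Magnitude of DFT bin k, computed directly (no FFT library needed)."""
    n = len(frame)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(frame)))


def harmonic_level_db(frame, sample_rate, target_hz, tolerance=0.1):
    """Level (dB) of the strongest bin within +/- tolerance of target_hz."""
    n = len(frame)
    # Periodic Hann window (assumed; any tapered window would do).
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    k_lo = math.ceil(target_hz * (1 - tolerance) * n / sample_rate)
    k_hi = math.floor(target_hz * (1 + tolerance) * n / sample_rate)
    if k_lo > k_hi:  # search range narrower than one bin
        k_lo = k_hi = round(target_hz * n / sample_rate)
    peak = max(dft_magnitude(windowed, k) for k in range(k_lo, k_hi + 1))
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def h1_minus_h2(frame, sample_rate, f0_hz):
    """Level difference between the first and second harmonic, in dB."""
    return (harmonic_level_db(frame, sample_rate, f0_hz)
            - harmonic_level_db(frame, sample_rate, 2 * f0_hz))


if __name__ == "__main__":
    sr, n = 16000, 640  # a 40 ms frame gives 25 Hz resolution (narrow band)
    frame = [math.sin(2 * math.pi * 200 * i / sr)
             + 0.5 * math.sin(2 * math.pi * 400 * i / sr) for i in range(n)]
    print(h1_minus_h2(frame, sr, 200.0))
    print(measurement_points(0.100, 0.200))
```

A larger H1 minus H2 value is commonly read as a breathier voice quality, which is the direction of the effect reported for vowels before /ʃs/.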
36 Place assimilation in French sibilant sequences
[Figure: /sʃ/ vs. /ʃs/]
37 Place assimilation in French sibilant sequences
[Figure: /sʃ/ vs. /ʃs/]
38 Summary of the assimilation part
- Initial perceptual tests were done in which (a) just the CV part of the first target syllable and (b) just the sibilant section itself was played to non-naïve native speakers of French. The preliminary results show that only for (a) were the listeners able to predict above chance level whether the upcoming sibilant sequence was /sʃ/ or /ʃs/.
- So, the multi-parametric phonetic details in the vowels preceding the sibilant sequences might already be acoustic cues to the phonological make-up of the sibilant sequence, even if the phonetic realization of the latter is by itself ambiguous.
- This suggests that the term gradual has a temporal implication beyond the sibilant sequence itself.
- Moreover, the fine phonetic differences in the vowels were not found in the same way before single, non-assimilated sibilants. This might be taken as evidence that place assimilation within French sibilant sequences represents a re-organization rather than a (gradual) substitution of phonological features, in line with previous findings on many other assimilation processes across languages.