Title: UNDERSTANDING SPOKEN DISCOURSE: THE CONTRIBUTION OF THREE INFORMATION CHANNELS
Andrej A. Kibrik (Institute of Linguistics RAN)
Ekaterina M. Elbert (CJSC GLITNIR Securities)
Background
- Traditional point of view in linguistics (as well as in general psychology):
  - Language is ultimately segmental, or verbal.
  - Hierarchy of segmental units: phonemes, morphemes, words, phrases, clauses, sentences.
- However:
  - There are other kinds of encoding, or information channels, relevant for linguistic communication.
  - Prosody, including (i)-(vii), conveys much information (imagine listening to a conversation behind the wall):
    - Pausing
    - Accents
    - Tone
    - Tempo
    - Register
    - Reduction
    - Phonations
    - Loudness
    - ...
  - Body language, including (i)-(iv), also constitutes an important visual channel.
Experiment
Material: Russian TV serial Tajny sledstvija ("Mysteries of the investigation"). The experimental excerpt is 3 min. 20 sec. long and is preceded by an 8-minute context (that starts from the beginning of the series). The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general.
Method: The three channels (verbal, prosodic, and visual) have been isolated from each other and presented in all possible combinations (8 altogether).
- The visual channel by itself is video alone (without sound)
- The verbal channel is subtitles running in temporal alignment with the original film
- The prosodic channel is the original audio component with a superimposed filter creating the effect of a conversation behind the wall
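The count of 8 conditions follows from switching each of the three channels on or off independently (2 x 2 x 2). A minimal sketch of that enumeration is given here; the names `CHANNELS` and `channel_combinations` are illustrative, and the order produced is not the group numbering used on the poster.

```python
from itertools import product

# The three information channels named in the Method description.
CHANNELS = ("verbal", "prosodic", "visual")


def channel_combinations(channels=CHANNELS):
    """Return every on/off combination of the channels as a list of tuples."""
    combos = []
    for mask in product((True, False), repeat=len(channels)):
        # Keep only the channels that are switched on in this condition.
        combos.append(tuple(ch for ch, on in zip(channels, mask) if on))
    return combos


combos = channel_combinations()
for combo in combos:
    # The empty combination corresponds to the context-only condition.
    print(" + ".join(combo) if combo else "none (context only)")
```

The empty tuple stands for the control condition in which subjects see the context but none of the experimental excerpt.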
Subjects: 99 participants, all native speakers of Russian (three quarters women, one quarter men), divided into 8 groups of 10 to 17 subjects each.
Procedure: Every subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerning the experimental excerpt alone. For each question, a subject had to choose exactly one answer out of four listed variants.
- 29 questions were originally included in the questionnaire
- Six were later discarded as either trivial or prone to guessing
- 23 questions were kept after the testing phase
Directions for subjects:
- Please watch an extract of a film (11 minutes) and answer a set of questions related to the last portion of the extract
- When answering questions, you need to choose only one of the four answers
- Questions are printed on separate sheets of paper. After answering a question, please turn to the next sheet. Correcting previous answers is not allowed
- Please consider one question at a time, beginning from the first question
Example of a question
What does Tamara Stepanovna offer Masha before the beginning of the conversation?
a. to take off her coat
b. to have a cup of tea
c. to have a seat
d. to have a drink
Examples of experimental material: Full variant; Verbal + Visual; Verbal
Results: the mean percentage of correct answers in each of the eight experimental groups

Group  Experimental material   Information channels       Mean of correct answers (%)
1      Original                verbal, prosodic, visual   87.4
2      Sound                   verbal, prosodic           70.4
3      Subtitles + video       verbal, visual             73.9
4      Prosody + video         prosodic, visual           51.2
5      Subtitles               verbal                     72.0
6      Prosody                 prosodic                   51.1
7      Video                   visual                     61.7
8      Nothing (context only)  none                       38.3
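For readers who want to compare groups directly, the table above can be held in a small lookup structure. The sketch below is illustrative only: the names `GROUPS`, `difference`, and `gain_over_baseline` are assumptions, and the raw differences it prints say nothing about statistical significance, which would require the per-subject data.

```python
# Mean percentage of correct answers per group, copied from the results table.
# Each entry: group number -> (channels available to the group, mean score).
GROUPS = {
    1: (("verbal", "prosodic", "visual"), 87.4),
    2: (("verbal", "prosodic"), 70.4),
    3: (("verbal", "visual"), 73.9),
    4: (("prosodic", "visual"), 51.2),
    5: (("verbal",), 72.0),
    6: (("prosodic",), 51.1),
    7: (("visual",), 61.7),
    8: ((), 38.3),
}


def difference(group_a, group_b):
    """Score of group_a minus score of group_b, in percentage points."""
    return round(GROUPS[group_a][1] - GROUPS[group_b][1], 1)


def gain_over_baseline(group, baseline=8):
    """Gain of a group over the context-only group, in percentage points."""
    return difference(group, baseline)


for g in sorted(GROUPS):
    channels, score = GROUPS[g]
    label = " + ".join(channels) or "none"
    print(f"group {g}: {label:<28} {score:5.1f}  (+{gain_over_baseline(g):.1f} over group 8)")
```

For instance, `difference(4, 6)` gives the raw change when video is added to prosody, and `difference(1, 5)` the change when prosody and video are both added to subtitles.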
Discussion
- Each of the three information channels, taken in isolation, is quite informative. The percentages in groups 5 through 7 are significantly higher than the percentage in group 8.
- The hierarchy of informativeness can be represented as follows: verbal > visual > prosodic.
- Combining the verbal channel with one additional channel does not increase the percentage of correct answers (compare group 5 with groups 2 and 3). Only inclusion of all three channels (group 1) yields significantly better results than the verbal channel by itself (group 5).
- Adding the visual channel to the prosodic channel does not result in an increase in correct answers (compare group 6 with group 4).
- The combination prosodic plus visual (group 4) yields a significantly lower result than the other pairs of channels (groups 2 and 3). Evidently, this combination is not customary for subjects, and they have trouble integrating information from prosody and video.
- The relative contribution of the three channels
  - Assuming, for the sake of simplicity, that all three channels are independent, the single-channel scores (groups 5 through 7) are scaled so that they sum to 100%: (72 + 51 + 62 = 185) / 100 = 1.85
  - Results:
    - Verbal channel: 39% (72 / 1.85 ≈ 39)
    - Prosodic channel: 28% (51.1 / 1.85 ≈ 28)
    - Visual channel: 33% (61.7 / 1.85 ≈ 33)
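The normalization behind the relative contributions can be reproduced in a few lines. This is a minimal sketch under the same independence assumption; the function name `relative_contribution` is illustrative, and it divides by the unrounded sum (184.8) rather than the rounded 185 used above, which yields the same whole-number shares.

```python
def relative_contribution(scores):
    """Scale single-channel scores so that the resulting shares sum to 100%."""
    total = sum(scores.values())
    return {channel: 100 * score / total for channel, score in scores.items()}


# Single-channel means from groups 5 (subtitles), 6 (prosody) and 7 (video).
single_channel = {"verbal": 72.0, "prosodic": 51.1, "visual": 61.7}

shares = relative_contribution(single_channel)
for channel, share in shares.items():
    print(f"{channel}: {share:.0f}%")
```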
Conclusions
- All information channels are highly significant, so the traditional linguistic viewpoint is erroneous.
- The verbal channel is the leading one, so the viewpoint popular in applied psychology is erroneous.
- Information from the prosodic and the visual channels is primarily used through integration with the verbal channel.
Further Directions
- More natural discourse material, such as ordinary conversation
- Further improvement of the questionnaire
  - Main criterion: increase the delta of correct answers between groups 1 and 8; the current difference between 38% and 87% is insufficient
- Construction of a statistical model assessing the relative contribution of channels, including possible relationships between channels
- Search for correlation between type of question and information channel
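As one possible starting point for such a model (not the authors' model, and only a descriptive sketch), the eight group means can be treated as a 2 x 2 x 2 factorial design. The code below computes, for each channel, the average difference between groups with and without it, and for each pair of channels a simple difference of differences averaged over the third channel. The names `MEANS`, `main_effect`, and `interaction` are assumptions; a real model would need the per-subject answers rather than group means.

```python
from itertools import combinations

CHANNELS = ("verbal", "prosodic", "visual")

# Group means from the results, keyed by the set of channels available.
MEANS = {
    frozenset({"verbal", "prosodic", "visual"}): 87.4,
    frozenset({"verbal", "prosodic"}): 70.4,
    frozenset({"verbal", "visual"}): 73.9,
    frozenset({"prosodic", "visual"}): 51.2,
    frozenset({"verbal"}): 72.0,
    frozenset({"prosodic"}): 51.1,
    frozenset({"visual"}): 61.7,
    frozenset(): 38.3,
}


def main_effect(channel):
    """Mean score with the channel present minus mean score with it absent."""
    present = [m for combo, m in MEANS.items() if channel in combo]
    absent = [m for combo, m in MEANS.items() if channel not in combo]
    return sum(present) / len(present) - sum(absent) / len(absent)


def interaction(a, b):
    """Change in the effect of channel a when channel b is added,
    averaged over the presence and absence of the remaining channel."""
    others = [c for c in CHANNELS if c not in (a, b)]
    diffs = []
    for r in range(len(others) + 1):
        for rest in combinations(others, r):
            base = frozenset(rest)
            effect_with_b = MEANS[base | {a, b}] - MEANS[base | {b}]
            effect_without_b = MEANS[base | {a}] - MEANS[base]
            diffs.append(effect_with_b - effect_without_b)
    return sum(diffs) / len(diffs)


for ch in CHANNELS:
    print(f"main effect of {ch}: {main_effect(ch):+.2f}")
for a, b in combinations(CHANNELS, 2):
    print(f"interaction {a} x {b}: {interaction(a, b):+.2f}")
```

With eight means and no per-subject variance, these numbers are purely descriptive; they indicate where relationships between channels might be looked for, not whether they are reliable.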
Acknowledgements: O.V. Fedorova, S.A. Krejchi, O.F. Krivnova, A.V. Proxorov, E.A. Iljushina, S. Lando
Ekaterina_elbert_at_inbox.ru
Poster presented at CogSci 2008, Moscow, Russia, June 20-25