The Design of Multidimensional Sound Interfaces
Michael Cohen and Elizabeth M. Wenzel (Chapter 8)
- Presented by
- Andrew Snyder and Thor Castillo
February 3, 2000
HFE760 - Dr. Gallimore
Table of Contents
- Introduction: How we localize sound
- Chapter 8
- Research
- Conclusion
Introduction
- Ear Structure
- Binaural Beats - Demo
- Why are they important?
- Localization Cues
Introduction
- Binaural Beats Demo
- Why are they Important?
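The demo effect above can be sketched in a few lines of NumPy: two sine tones whose frequencies differ by the beat rate, one per ear. The carrier and beat frequencies below are illustrative choices, not values from the slides.

```python
import numpy as np

SAMPLE_RATE = 44_100  # samples per second

def binaural_beat(carrier_hz=220.0, beat_hz=10.0, dur_s=5.0, sr=SAMPLE_RATE):
    """Stereo signal whose left/right channels differ by beat_hz.

    Over headphones the listener perceives an illusory beating at
    the difference frequency, even though neither ear alone
    receives a beating signal.
    """
    t = np.arange(int(sr * dur_s)) / sr
    left = np.sin(2 * np.pi * carrier_hz * t)
    right = np.sin(2 * np.pi * (carrier_hz + beat_hz) * t)
    return np.column_stack([left, right])  # shape (n_samples, 2)

stereo = binaural_beat()
```

Writing `stereo` to a WAV file and listening over headphones (not loudspeakers, where the channels mix in the air) reproduces the classic demo.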
Introduction
- Localization Cues
- Humans use auditory localization cues to help
locate the position in space of a sound source.
There are eight sources of localization cues:
- interaural time difference
- head shadow
- pinna response
- shoulder echo
- head motion
- early echo response
- reverberation
- vision
Introduction
- Localization Cues
- Interaural time difference describes the time
delay between sounds arriving at the left and
right ears.
- This is a primary localization cue for
interpreting the lateral position of a sound
source.
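A common back-of-the-envelope model for the interaural time difference (not given on the slide, but standard in the spatial-hearing literature) is Woodworth's spherical-head approximation; the head radius used below is an assumed typical value.

```python
import numpy as np

HEAD_RADIUS_M = 0.0875   # typical adult head radius (assumed value)
SPEED_OF_SOUND = 343.0   # m/s in air at ~20 degrees C

def itd_seconds(azimuth_deg):
    """Woodworth spherical-head approximation of the interaural
    time difference for a distant source at the given azimuth
    (0 deg = straight ahead, 90 deg = directly to one side)."""
    theta = np.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + np.sin(theta))
```

The model predicts zero ITD straight ahead and a maximum of roughly 0.65 ms at 90 degrees, which matches the magnitudes usually quoted for human listeners.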
Introduction
- Localization Cues
- Head shadow describes a sound having to pass
through or around the head in order to reach the
far ear.
- The filtering effects of head shadowing make it
harder to perceive the linear distance and
direction of a sound source.
Introduction
- Localization Cues
- Pinna response describes the effect that the
external ear, or pinna, has on sound.
- Higher frequencies are filtered by the pinna in
such a way as to affect the perceived lateral
position (azimuth) and elevation of a sound
source.
Introduction
- Localization Cues
- Shoulder echo: frequencies in the range of
1-3 kHz are reflected from the upper torso of the
human body.
Introduction
- Localization Cues
- Head motion: moving the head to help determine
the location of a sound source is a key factor
in human hearing, and quite natural.
Introduction
- Localization Cues
- Early echo response and reverberation: sounds in
the real world are the combination of the
original sound source plus its reflections from
surfaces in the world (floors, walls, tables,
etc.).
- Early echo response occurs in the first 50-100 ms
of a sound's life.
Introduction
- Localization Cues
- Vision helps us quickly locate the physical
location of a sound and confirm the direction
that we perceive.
Chapter 8 Contents
- Introduction
- Characterization and Control of Acoustic Objects
- Research Applications
- Interface Control via Audio Windows
- Interface Design Issues: Case Studies
Introduction
- I/O generations and dimensions
- Exploring the audio design space
Introduction
- I/O generations and dimensions
- First generation: early computer terminals
allowed only textual I/O, the character-based
user interface (CUI).
- Second generation: as terminal technology
improved, users could manipulate graphical
objects, the graphical user interface (GUI).
- Third generation: 3D graphical devices.
- 3D audio: the sound has a spatial attribute,
originating, virtually or exactly, from an
arbitrary point with respect to the listener.
This chapter focuses on the third generation of
the aural sector.
Introduction
- Exploring the audio design space
- Most people think that it would be easier to be
hearing-impaired than sight-impaired, even though
the incidence of disability-related cultural
isolation is higher among the deaf than the
blind.
- The development of user interfaces has
historically been focused more on visual modes
than aural.
- Sound is frequently included and utilized to the
limits of its availability and affordability in
PCs; however, computer-aided exploitation of
audio bandwidth is only now beginning to rival
that of graphics.
- Because of the cognitive overload that results
from overburdening other systems (perhaps
especially the visual), there are strong
motivations for exploiting sound to its full
potential.
Introduction
- Exploring the audio design space
- This chapter reviews the evolving state of the
art of non-speech audio interfaces, driving both
spatial and non-spatial attributes.
- This chapter will focus primarily on the
integration of these new technologies, crafting
effective matches between projected user desires
and emerging technological capabilities.
Characterization and Control of Acoustic Objects
Part of listening to a mixture of conversations
or music is being able to hear the individual
voices or musical instruments. This
synthesis/decomposition duality is the opposite
effect of masking: instead of sounds hiding each
other, they are complementary and individually
perceivable. Audio imaging is the creation of
sonic illusions by manipulation of stereo
channels. In a stereo system, sound comes from
only the left and right transducers, whether
headphones or loudspeakers. Spatial sound
involves technology that allows sound to emanate
from any direction (left-right, up-down,
back-forth, and everything in between).
Characterization and Control of Acoustic Objects
- The cocktail party effect: we can filter sound
according to
- speaker voice
- subject matter
- tone/timbre
- melodic line and rhythm
Characterization and Control of Acoustic Objects
- Spatial dimensions of sound
- Implementing spatial sound
- Non-spatial dimensions and auditory symbology
Characterization and Control of Acoustic Objects
- Spatial dimensions of sound
- The goal of spatial sound synthesis is to project
audio media into space by manipulating sound
sources so that they assume virtual positions,
mapping the source channel into three-dimensional
space. These virtual positions enable auditory
localization.
- Duplex Theory (Lord Rayleigh, 1907): human sound
localization is based on two primary cues to
location, interaural differences in time of
arrival and interaural differences in intensity.
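The two duplex cues can be illustrated with a toy "panner" that lateralizes a mono signal by delaying and attenuating the far-ear copy. The ITD and ILD values below are illustrative assumptions, not numbers from the chapter.

```python
import numpy as np

SR = 44_100  # sample rate, Hz

def lateralize(mono, itd_s=0.0006, ild_db=6.0, sr=SR):
    """Toy duplex-theory panner: move a mono sound toward the left
    ear by delaying (interaural time difference) and attenuating
    (interaural intensity difference) the right-ear copy."""
    delay = int(round(itd_s * sr))
    gain = 10.0 ** (-ild_db / 20.0)  # far-ear attenuation
    left = np.concatenate([mono, np.zeros(delay)])
    right = gain * np.concatenate([np.zeros(delay), mono])
    return np.column_stack([left, right])

tone = np.sin(2 * np.pi * 440 * np.arange(SR // 10) / SR)  # 0.1 s, 440 Hz
stereo = lateralize(tone)
```

Played over headphones this produces exactly the in-head lateralization the slide that follows complains about: the image moves left-right but never externalizes, which is one of the listed failures of the duplex theory.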
Characterization and Control of Acoustic Objects
- Spatial dimensions of sound
- There are several problems with the duplex
theory:
- It cannot account for the ability of subjects to
localize many types of sounds coming from many
different regions (e.g., sounds along the median
plane).
- When duplex cues are used to generate sound in
headphones, the sound is perceived inside the
head.
- Most of the deficiencies of the duplex theory
are linked to the interaction of sound waves with
the pinnae (outer ears).
Characterization and Control of Acoustic Objects
- Spatial dimensions of sound
- Peaks and valleys in the auditory spectrum can be
used as localization cues for the elevation of
the sound source. Other cues are also necessary
to locate the vertical position of a sound
source. This remains important to researchers
since it has never been fully understood.
Characterization and Control of Acoustic Objects
- Spatial dimensions of sound
- Localization errors in current sound-generating
technologies are very common; some of the
problems that persist are:
- locating sound on the vertical plane
- some systems can cause front-back reversals
- some systems can cause up-down reversals
- judging distance from the sound source (we're
generally terrible at doing this anyway!)
- Sound localization can be dramatically improved
with a dynamic stimulus (which can reduce the
number of reversals):
- allowing head motion
- moving the location of the sound
- Researchers suggest that this can help
externalize sound.
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Physically locating loudspeakers in the place
where each source is located, relative to the
listener (the most direct approach).
- Not portable; cumbersome.
- Other approaches use analytic mathematical models
of the pinnae and other body structures in order
to directly calculate acoustic responses.
- A third approach to accurate real-time
spatialization concentrates on digital signal
processing (DSP) techniques for synthesizing cues
from direct measurements of head-related transfer
functions. (The author focuses on this type of
approach.)
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- DSP: the goal is to make sound spatializers
that give the impression that the sound is coming
from different sources and different locations.
- Why? A display that builds on this technology
can exploit the human ability to quickly and
subconsciously locate sound sources.
- Convolution: hardware- and/or software-based
engines perform the convolution that filters the
sound in some DSPs.
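The signal flow these convolution engines implement can be sketched in a few lines: convolve one mono source with a left-ear and a right-ear head-related impulse response (HRIR). Real systems use measured HRIRs; the random arrays below are placeholders that only demonstrate the data flow.

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Place a mono source at the position where the HRIR pair was
    measured by convolving it with the left- and right-ear
    head-related impulse responses."""
    return np.column_stack([np.convolve(mono, hrir_left),
                            np.convolve(mono, hrir_right)])

rng = np.random.default_rng(0)
source = rng.standard_normal(1_000)   # stand-in for an audio frame
hrir_l = rng.standard_normal(128)     # placeholder impulse responses
hrir_r = rng.standard_normal(128)
binaural = spatialize(source, hrir_l, hrir_r)
```

A real-time spatializer does the same operation block by block (usually via FFT-based fast convolution), swapping HRIR pairs as the virtual source or the listener's head moves; that continuous filtering is where the Convolvotron-class hardware earns its 30-40 ms latency.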
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Crystal River Engineering Convolvotron
- Gehring Research Focal Point
- AKG CAP (Creative Audio Processor)
- Head Acoustics
- Roland Sound Space (RSS) Processor
- Mixels
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Crystal River Engineering Convolvotron
- What is it? A convolution engine that
spatializes sound by filtering audio channels
with transfer functions that simulate positional
effects.
- Alphatron, Acoustetron II
- The technology is good except for time delays due
to computation of 30-40 ms (which can be picked
up by the ear if used with visual inputs).
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Gehring Research Focal Point
- What is it? Focal Point comprises two binaural
localization technologies, Focal Point Type 1 and
Type 2.
- Focal Point 1: the original Focal Point
technology, utilizing time-domain convolution
with impulse responses based on head-related
transfer functions for anechoic simulation.
- Focal Point 2: a Focal Point implementation in
which sounds are preprocessed offline, creating
interleaved sound files which can then be
positioned in 3D in real time upon playback.
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- AKG CAP (Creative Audio Processor)
- What is it? A kind of binaural mixing console.
The system is used to create audio recordings
with integrated head-related transfer functions
and other 3D audio filters.
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Head Acoustics
- What is it? A research company in Germany that
has developed a spatial audio system with an
eight-channel binaural mixing console using
anechoic simulations, as well as a new version of
an artificial head.
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Roland Sound Space (RSS) Processor
- What is it? Roland has developed a system which
attempts to provide real-time spatialization
capabilities for both headphone and stereo
loudspeaker presentation. The basic RSS system
allows independent placement of up to four
sources using convolution.
- What makes this system special is that it
incorporates a technique known as transaural
processing, or crosstalk cancellation between the
stereo speakers. This technique seems to allow an
adequate spatial impression to be achieved.
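The core of crosstalk cancellation is a matrix inversion: with symmetric speaker placement, each ear receives a same-side path S and a cross path C, so the speaker-to-ear system is the 2x2 matrix [[S, C], [C, S]] at every frequency, and pre-filtering the binaural feed with its inverse lets each ear hear mostly its own channel. This is a minimal frequency-domain sketch of that idea (not Roland's actual algorithm); the regularization constant is an assumption to keep near-singular bins stable.

```python
import numpy as np

def transaural_filters(h_same, h_cross, n_fft=512, eps=1e-3):
    """Derive crosstalk-cancellation filters by inverting the
    symmetric 2x2 speaker-to-ear matrix [[S, C], [C, S]] in each
    frequency bin (S = same-side path, C = cross path)."""
    S = np.fft.rfft(h_same, n_fft)
    C = np.fft.rfft(h_cross, n_fft)
    det = S * S - C * C
    det = np.where(np.abs(det) < eps, eps, det)  # regularize weak bins
    # inverse of [[S, C], [C, S]] is [[S, -C], [-C, S]] / det
    return np.fft.irfft(S / det, n_fft), np.fft.irfft(-C / det, n_fft)

# Sanity check: with no crosstalk at all, the canceller reduces
# to a pass-through (impulse on the same-side filter, zero cross).
impulse = np.zeros(64)
impulse[0] = 1.0
f_same, f_cross = transaural_filters(impulse, np.zeros(64))
```

The fragility the literature reports follows directly from this sketch: the inverse is only valid for the head position S and C were measured at, so the "sweet spot" is small.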
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Mixels
- The number of channels in a system corresponds to
the degree of spatial polyphony (simultaneous
spatialized sound sources) the system can
generate. On the assumption that systems will
increase their capabilities enormously via the
number of channels, we count channels in mixels.
- By way of analogy to pixels and voxels, the
atomic level of sound is sometimes called the
mixel, an acronym for sound mixing element.
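Concretely, one mixel corresponds to one independently spatialized source channel, and a renderer's mixel count is how many it can sum in real time. A minimal sketch of that mixing stage, under the assumption that each source gets its own HRIR pair (placeholders below):

```python
import numpy as np

def render_mixels(sources, hrir_pairs):
    """Sum N independently spatialized sources into one binaural
    output; the number of (source, HRIR-pair) channels a system
    can render simultaneously is its mixel count."""
    length = max(len(s) + len(hl) - 1
                 for s, (hl, hr) in zip(sources, hrir_pairs))
    out = np.zeros((length, 2))
    for src, (hl, hr) in zip(sources, hrir_pairs):
        out[:len(src) + len(hl) - 1, 0] += np.convolve(src, hl)
        out[:len(src) + len(hr) - 1, 1] += np.convolve(src, hr)
    return out

rng = np.random.default_rng(1)
sources = [rng.standard_normal(500) for _ in range(3)]   # 3 mixels
pairs = [(rng.standard_normal(64), rng.standard_normal(64))
         for _ in range(3)]
mix = render_mixels(sources, pairs)
```

Offline this is trivial; the hardware challenge the chapter alludes to is doing every per-mixel convolution inside one audio block's deadline.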
Characterization and Control of Acoustic Objects
- Implementing spatial sound
- Mixels
- Rather than diving deeply into more spatial
audio systems, the rest of the chapter will
concentrate on the nature of the control
interfaces that will need to be developed to take
full advantage of these new capabilities.
Characterization and Control of Acoustic Objects
- Non-spatial dimensions and auditory symbology
- Auditory icons: acoustic representations of
naturally occurring events that caricature the
action being represented.
- Earcons: elaborated auditory symbols which
compose motifs into an artificial non-speech
language, with phrases distinguished by rhythmic
and tonal patterns.
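Since an earcon's identity lies in its rhythmic and tonal pattern rather than its timbre, a motif can be represented simply as a list of (frequency, duration) notes. The "message received" motif below is a hypothetical example, not one from the chapter.

```python
import numpy as np

SR = 22_050  # sample rate, Hz

def earcon(motif, sr=SR):
    """Render an earcon from a list of (frequency_hz, duration_s)
    notes; a decaying envelope keeps successive notes distinct so
    the rhythm stays audible."""
    notes = []
    for freq, dur in motif:
        t = np.arange(int(sr * dur)) / sr
        envelope = 1.0 - t / dur  # simple linear decay per note
        notes.append(envelope * np.sin(2 * np.pi * freq * t))
    return np.concatenate(notes)

# A hypothetical "message received" motif: two rising notes.
signal = earcon([(660.0, 0.1), (880.0, 0.2)])
```

Families of related earcons are then built by systematically varying the motif: transposing it, inverting its rhythm, or extending it with extra notes.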
Characterization and Control of Acoustic Objects
- Non-spatial dimensions and auditory symbology
- Filtears: a class of cues that are independent of
distance and direction. They are used to attempt
to expand the spectrum of how we use sound, and
to create sounds with attributes attached to
them. Think of it as sonic typography: placing
sound in space can be likened to putting written
information on a page. Filtears are dependent on
source and sink.
- Example: imagine you're telenegotiating with many
people. You can select attributes of each person's
voice (distance from you, direction,
indoors/outdoors, whispers behind your ear, etc.).
Research Applications
- Virtual acoustic displays featuring spatial sound
can be thought of as enabling two performance
advantages:
- Situation awareness: omnidirectional monitoring
via direct representation of spatial information
reinforces or replaces information in other
modalities, enhancing one's sense of presence or
realism.
- Multiple channel segregation: can improve
intelligibility, discrimination, and selective
attention among audio sources.
Research Applications
- Sonification
- Teleconferencing
- Music
- Virtual Reality and Architectural Acoustics
- Telerobotics and Augmented Audio Reality
Research Applications
- Sonification
- Sonification can be thought of as auditory
visualization and can be used as a tool for
analysis, for example, presenting multivariate
data as auditory patterns. Because the visual and
auditory channels can be independent of each
other, data can be mapped differently to each
mode of perception, and auditory mappings can be
used to discover relationships that are hidden in
the visual display.
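The simplest such auditory mapping is pitch: scale each data value onto a frequency range and play the series as a melody, so trends become rising or falling contours. The frequency range and note length below are illustrative assumptions.

```python
import numpy as np

SR = 22_050  # sample rate, Hz

def sonify(values, f_lo=220.0, f_hi=880.0, note_s=0.25, sr=SR):
    """Map each data value linearly onto a pitch between f_lo and
    f_hi and render the series as a sequence of short tones, so
    trends in the data become melodic contours."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    norm = (v - v.min()) / (span if span else 1.0)  # scale to [0, 1]
    freqs = f_lo + norm * (f_hi - f_lo)
    t = np.arange(int(sr * note_s)) / sr
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])

audio = sonify([1.0, 4.0, 2.0, 8.0, 5.0])
```

For multivariate data, additional variables can be mapped to independent auditory dimensions (loudness, timbre, rhythm, or spatial position), which is exactly where sonification and the spatial sound techniques above meet.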
Interface Control via Audio Windows
- Audio Windows is an auditory-object manager.
- The general idea is to permit multiple
simultaneous audio sources, such as the channels
of a teleconference, to coexist in a modifiable
display without clutter or user stress.
Interface Design Issues: Case Studies
- Veos and Mercury (written with Brian Karr)
- Handy Sound
- Maw
Interface Design Issues: Case Studies
- Veos and Mercury (written with Brian Karr)
- Veos: the Virtual Environment Operating System
- Sound Render implementation: a software package
that interfaces with a VR system (like Veos).
- The Audio Browser: a hierarchical sound-file
navigation and audition tool.
Interface Design Issues: Case Studies
- Handy Sound
- Handy Sound explores gestural control of an audio
window system.
- Manipulating source position in Handy Sound
- Manipulating source quality in Handy Sound
- Manipulating sound volume in Handy Sound
- Summary: Handy Sound demonstrates the general
possibilities of gesture recognition and spatial
sound in a multichannel conferencing environment.
Interface Design Issues: Case Studies
- Maw
- Developed as an interactive front end for
teleconferencing, Maw allows the user to arrange
sources and sinks in a horizontal plane.
- Manipulating source and sink positions in Maw
- Organizing acoustic objects in Maw
- Manipulating sound volume in Maw
- Summary
Conclusion
Sound authoring tools for future multimedia
systems
- Bezzi, Marco; De Poli, Giovanni; Rocchesso,
Davide - Univ. di Padova, Padova, Italy
- Summary
- A framework for authoring non-speech sound
objects in the context of multimedia systems is
proposed. The goal is to design specific sounds
and their dynamic behavior in such a way that
they convey dynamic and multidimensional
information. Sounds are designed using a
three-layer abstraction model: physically-based
description of sound identity, signal-based
description of sound quality, and perception- and
geometry-based description of sound projection in
space. The model is validated with the aid of an
experimental tool where manipulation of sound
objects can be performed in three ways: handling
a set of parameter control sliders, editing the
evolution in time of compound parameter settings,
or via client applications sending their requests
to the sounding engine. (Author abstract, 26 refs.
In English. Proceedings of the 1999 6th
International Conference on Multimedia Computing
and Systems, IEEE ICMCS'99, Jun 7-11, 1999,
Florence, Italy. Sponsored by IEEE CS and the
IEEE Circuits and Systems Society.)
Interactive 3D sound hyperstories for blind
children
- Lumbreras, Mauricio; Sanchez, Jaime
- Univ. of Chile, Santiago, Chile
- Summary
- Interactive software is currently used for
learning and entertainment purposes. This type of
software is not very common among blind children
because most computer games and electronic toys
do not have appropriate interfaces to be
accessible without visual cues. This study
introduces the idea of interactive hyperstories
carried out in a 3D acoustic virtual world for
blind children. We have conceptualized a model to
design hyperstories. Through AudioDoom we have an
application that enables testing cognitive tasks
with blind children. The main research question
underlying this work explores how audio-based
entertainment and spatial sound navigable
experiences can create cognitive spatial
structures in the minds of blind children.
AudioDoom presents first person experiences
through exploration of interactive virtual worlds
by using only 3D aural representations of the
space. (Author abstract, 21 refs. In English.
Proceedings of the CHI 99 Conference, "CHI is the
Limit": Human Factors in Computing Systems, May
15-20, 1999, Pittsburgh, PA, USA. Sponsored by
ACM SIGCHI.)
Any questions?
References
- Modeling Realistic 3-D Sound Turbulence
- http://www-engr.sjsu.edu/duda/Duda.Reports.html
- 3D Sound Aids for Fighter Pilots
- http://www.dsto.defence.gov.au/corporate/history/jubilee/sixtyyears18.html
- 3D Sound Synthesis
- http://www.ee.ualberta.ca/khalili/3Dnew.html
- Binaural Beat Demo
- http://www.monroeinstitute.org/programs/bbapplet.html
- Begault, Durand R. "Challenges to the Successful Implementation of 3-D Sound", NASA-Ames Research Center, Moffett Field, CA, 1990.
- Begault, Durand R. "An Introduction to 3-D Sound for Virtual Reality", NASA-Ames Research Center, Moffett Field, CA, 1992.
References
- Burgess, David A. "Techniques for Low Cost Spatial Audio", UIST 1992.
- Foster, Wenzel, and Taylor. "Real-Time Synthesis of Complex Acoustic Environments", Crystal River Engineering, Groveland, CA.
- Smith, Stuart. "Auditory Representation of Scientific Data", Focus on Scientific Visualization, H. Hagen, H. Muller, G.M. Nielson, eds., Springer-Verlag, 1993.
- Stuart, Rory. "Virtual Auditory Worlds: An Overview", VR Becomes a Business, Proceedings of Virtual Reality 92, San Jose, CA, 1992.
- Takala, Tapio and James Hahn. "Sound Rendering", Computer Graphics, 26, 2, July 1992.
One last thing
- For those who want to have a little fun, try
these:
- http://www.cs.indiana.edu/picons/javoice/index.html
- http://ourworld.compuserve.com/homepages/Peter_Meijer/javoice.htm