1
The Design of Multidimensional Sound Interfaces
Michael Cohen and Elizabeth M. Wenzel
  • Presented by Andrew Snyder and Thor Castillo

February 3, 2000
HFE760 - Dr. Gallimore
2
Table of Contents
  • Introduction - How we localize sound
  • Chapter 8
  • Research
  • Conclusion

3
Introduction
  • Ear Structure
  • Binaural Beats - Demo
  • Why are they important?
  • Localization Cues

4
Introduction
  • Ear Structure

5
Introduction
  • Binaural Beats - Demo
  • Why are they Important?

6
Introduction
  • Localization Cues
  • Humans use auditory localization cues to help
    locate the position in space of a sound source.
    There are eight sources of localization cues:
  • interaural time difference
  • head shadow
  • pinna response
  • shoulder echo
  • head motion
  • early echo response
  • reverberation
  • vision

7
Introduction
  • Localization Cues
  • Interaural time difference describes the time
    delay between sounds arriving at the left and
    right ears.
  • This is a primary localization cue for
    interpreting the lateral position of a sound
    source.
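    A rough feel for the magnitudes involved, using
    Woodworth's classic spherical-head approximation
    (a sketch, not from this chapter; the head radius
    is an assumed average):

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average adult head radius
SPEED_OF_SOUND = 343.0    # m/s in air at roughly 20 degrees C

def itd_seconds(azimuth_deg: float) -> float:
    """Woodworth's spherical-head approximation of interaural time difference."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (math.sin(theta) + theta)

# A source directly to one side (90 degrees) gives the maximum ITD, ~0.66 ms.
print(f"{itd_seconds(90.0) * 1e3:.2f} ms")
```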

8
Introduction
  • Localization Cues
  • Head shadow refers to the attenuation a sound
    undergoes when it must pass through or around the
    head in order to reach the far ear.
  • This filtering makes it harder to judge the
    linear distance and direction of a sound source.

9
Introduction
  • Localization Cues
  • Pinna response describes the effect that the
    external ear, or pinna, has on sound.
  • Higher frequencies are filtered by the pinna in
    such a way as to affect the perceived lateral
    position, or azimuth, and elevation of a sound
    source.

10
Introduction
  • Localization Cues
  • Shoulder echo - Frequencies in the range of
    1-3 kHz are reflected from the upper torso of the
    human body.

11
Introduction
  • Localization Cues
  • Head motion - Moving the head to help determine
    the location of a sound source is a natural and
    key part of human hearing.

12
Introduction
  • Localization Cues
  • Early echo response and reverberation - Sounds in
    the real world are the combination of the
    original sound source plus its reflections from
    surfaces in the world (floors, walls, tables,
    etc.).
  • Early echo response occurs in the first 50-100 ms
    of a sound's life.

13
Introduction
  • Localization Cues
  • Vision helps us quickly locate the physical
    position of a sound source and confirm the
    direction that we perceive.

14
Chapter 8 Contents
  • Introduction
  • Characterization and Control of Acoustic Objects
  • Research Applications
  • Interface Control via Audio Windows
  • Interface Design Issues: Case Studies

15
Introduction
  • I/O generations and dimensions
  • Exploring the audio design space

16
Introduction
  • I/O generations and dimensions
  • First Generation - Early computer terminals
    allowed only textual I/O: the character-based
    user interface (CUI).
  • Second Generation - As terminal technology
    improved, users could manipulate graphical
    objects: the graphical user interface (GUI).
  • Third Generation - 3D graphical devices.
  • 3D audio - The sound has a spatial attribute,
    originating, virtually or actually, from an
    arbitrary point with respect to the listener.
    This chapter focuses on the third generation in
    the aural domain.

17
Introduction
  • Exploring the audio design space
  • Most people assume it would be easier to be
    hearing-impaired than sight-impaired, even though
    the incidence of disability-related cultural
    isolation is higher among the deaf than the
    blind.
  • The development of user interfaces has
    historically been focused more on visual modes
    than aural.
  • Sound is frequently included and utilized to the
    limits of its availability and affordability in
    PCs. However, computer-aided exploitation of
    audio bandwidth is only now beginning to rival
    that of graphics.
  • Because of the cognitive overload that results
    from overburdening other systems (perhaps
    especially the visual), there are strong
    motivations for exploiting sound to its full
    potential.

18
Introduction
  • Exploring the audio design space
  • This chapter reviews the evolving state of the
    art of non-speech audio interfaces, driving both
    spatial and non-spatial attributes.
  • This chapter will focus primarily on the
    integration of these new technologies, crafting
    effective matches between projected user desires
    and emerging technological capabilities.

19
Characterization and Control of Acoustic Objects
Part of listening to a mixture of conversations
or music is being able to hear the individual
voices or musical instruments. This
synthesis/decomposition duality is the opposite
effect of masking instead of sounds hiding each
other, they are complementary and individually
perceivable. Audio imaging the creation of
sonic illusions by manipulation of stereo
channels. Stereo system sound comes from only
left and right transducers, whether headphones or
loudspeakers. Spatial sound involves technology
that allows sound to emanate from any direction.
(left-right, up-down, back-forth, and everything
in between)
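    As a minimal illustration of audio imaging with
    only two channels, a constant-power pan law places
    a mono source anywhere between the left and right
    transducers (a generic sketch, not a technique
    from the chapter):

```python
import numpy as np

def constant_power_pan(mono: np.ndarray, pan: float) -> np.ndarray:
    """Place a mono signal between stereo channels.

    pan: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    Cosine/sine gains keep total power constant across positions.
    """
    angle = (pan + 1.0) * np.pi / 4.0        # map [-1, 1] onto [0, pi/2]
    return np.stack([np.cos(angle) * mono,   # left channel
                     np.sin(angle) * mono],  # right channel
                    axis=-1)

# Example: a one-second 440 Hz tone imaged halfway to the right.
sr = 44100
t = np.linspace(0.0, 1.0, sr, endpoint=False)
stereo = constant_power_pan(0.5 * np.sin(2 * np.pi * 440.0 * t), pan=0.5)
```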
20
Characterization and Control of Acoustic Objects
  • The cocktail party effect - we can filter sound
    according to:
  • position
  • speaker voice
  • subject matter
  • tone/timbre
  • melodic line and rhythm

21
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Implementing spatial sound
  • Non-spatial dimensions and auditory symbology

22
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • The goal of spatial sound synthesis is to project
    audio media into space by manipulating sound
    sources so that they assume virtual positions,
    mapping the source channel into three-dimensional
    space. These virtual positions enable auditory
    localization.
  • Duplex Theory (Lord Rayleigh, 1907) - human sound
    localization is based on two primary cues to
    location: interaural differences in time of
    arrival and interaural differences in intensity.

23
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • There are several problems with the duplex
    theory:
  • It cannot account for the ability of subjects to
    localize many types of sounds coming from many
    different regions (e.g., sounds along the median
    plane).
  • When duplex cues alone are used to generate sound
    in headphones, the sound is perceived as inside
    the head.
  • Most of the deficiencies of the duplex theory are
    linked to the interaction of sound waves with the
    pinnae (outer ears).

24
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Peaks and valleys in the auditory spectrum can be
    used as localization cues for the elevation of a
    sound source, but other cues are also necessary
    to locate vertical position. This remains very
    important to researchers, since elevation
    perception has never been fully understood.

25
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Localization errors are very common in current
    sound-generating technologies; some of the
    problems that persist are:
  • Locating sound on the vertical plane
  • Some systems can cause front-back reversals
  • Some systems can cause up-down reversals
  • Judging distance from the sound source (we're
    generally terrible at doing this anyway!)
  • Sound localization can be dramatically improved
    with a dynamic stimulus (which can reduce the
    number of reversals):
  • Allowing head motion
  • Moving the location of the sound
  • Researchers suggest that this can also help
    externalize sound.

26
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Physically locating loudspeakers where each
    source should appear, relative to the listener
    (the most straightforward approach).
  • Not portable; cumbersome.
  • Other approaches use analytic mathematical models
    of the pinnae and other body structures in order
    to directly calculate acoustic responses.
  • A third approach to accurate real-time
    spatialization concentrates on digital signal
    processing (DSP) techniques for synthesizing cues
    from direct measurements of head-related transfer
    functions. (The author focuses on this type of
    approach.)

27
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • DSP - The goal is to make sound spatializers
    that give the impression that the sound is coming
    from different sources and different locations.
  • Why? - A display built on this technology can
    exploit the human ability to quickly and
    subconsciously locate sound sources.
  • Convolution - hardware- and/or software-based
    engines perform the convolution that filters the
    sound in some DSP systems, as sketched below.
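    A minimal sketch of the filtering step such
    convolution engines perform; the HRIR arrays stand
    in for measured head-related impulse responses,
    which are assumed inputs here:

```python
import numpy as np
from scipy.signal import fftconvolve

def spatialize(dry: np.ndarray, hrir_left: np.ndarray,
               hrir_right: np.ndarray) -> np.ndarray:
    """Render a mono source at the position encoded by one HRIR pair.

    Each output channel is the dry signal convolved with the measured
    impulse response for that ear; moving the source means swapping in
    the HRIR pair measured at the new direction.
    """
    left = fftconvolve(dry, hrir_left)     # left-ear filtering
    right = fftconvolve(dry, hrir_right)   # right-ear filtering
    return np.stack([left, right], axis=-1)
```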

28
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Crystal River Engineering Convolvotron 
  • Gehring Research Focal Point
  • AKG CAP (Creative Audio Processor)
  • Head Acoustics
  • Roland Sound Space (RSS) Processor
  • Mixels

29
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Crystal River Engineering Convolvotron 
  • What is it? It is a convolution engine that
    spatializes sound by filtering audio channels
    with transfer functions that simulate positional
    effects.
  • Related products: Alphatron, Acoustetron II.
  • The technology is good except for computation
    delays of 30-40 ms (which can be picked up by the
    ear if used with visual inputs).

30
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Gehring Research Focal Point
  • What is it? Focal Point comprises two binaural
    localization technologies, Focal Point Type 1 and
    2.
  • Focal Point 1 - the original Focal Point
    technology, utilizing time-domain convolution
    with head-related transfer function based impulse
    responses for anechoic simulation.
  • Focal Point 2 - a Focal Point implementation in
    which sounds are preprocessed offline, creating
    interleaved sound files which can then be
    positioned in 3D in real time upon playback.

31
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • AKG CAP (Creative Audio Processor)
  • What is it? A kind of binaural mixing console.
    The system is used to create audio recordings
    with integrated head-related transfer functions
    and other 3D audio filters.

32
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Head Acoustics
  • What is it? A research company in Germany that
    has developed a spatial audio system with an
    eight-channel binaural mixing console, using
    anechoic simulations as well as a new version of
    an artificial head.

33
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Roland Sound Space (RSS) Processor
  • What is it? Roland has developed a system which
    attempts to provide real-time spatialization
    capabilities for both headphone and stereo
    loudspeaker presentation. The basic RSS system
    allows independent placement of up to four
    sources using time-domain convolution.
  • What makes this system special is that it
    incorporates a technique known as transaural
    processing, or crosstalk cancellation between the
    stereo speakers. This technique seems to allow an
    adequate spatial impression to be achieved, as
    sketched below.
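    Not Roland's actual implementation, but the
    standard symmetric-geometry formulation of
    crosstalk cancellation can be sketched per
    frequency bin; H_ipsi and H_contra are assumed
    measured speaker-to-ear responses:

```python
import numpy as np

def crosstalk_cancellation_filters(H_ipsi: np.ndarray, H_contra: np.ndarray):
    """Invert the 2x2 speaker-to-ear transfer matrix at each frequency bin.

    For a listener seated symmetrically, the acoustic path matrix is
    [[H_ipsi, H_contra], [H_contra, H_ipsi]]; applying its inverse ahead
    of the speakers cancels the cross-paths so each ear hears one channel.
    """
    det = H_ipsi**2 - H_contra**2      # determinant of the symmetric matrix
    C_direct = H_ipsi / det            # filter on each channel's own path
    C_cross = -H_contra / det          # cross-feed term that cancels leakage
    return C_direct, C_cross
```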

34
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Mixels
  • The number of channels in a system corresponds to
    its degree of spatial polyphony: the number of
    simultaneous spatialized sound sources the system
    can generate. Assuming that systems will increase
    their capabilities enormously via the number of
    channels, we need a name for those channels.
  • By way of analogy to pixels and voxels, these
    atomic sound elements are called mixels,
    acronymic for sound mixing elements.
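    A hypothetical data-structure reading of the term:
    each mixel pairs a source channel with a virtual
    position, and the channel count caps the spatial
    polyphony (names here are illustrative, not from
    the chapter):

```python
from dataclasses import dataclass, field

@dataclass
class Mixel:
    """One sound mixing element: a source channel plus its virtual position."""
    source_id: str
    azimuth_deg: float
    elevation_deg: float
    distance_m: float

@dataclass
class SpatialMixer:
    max_mixels: int                      # channel count = degree of spatial polyphony
    mixels: list = field(default_factory=list)

    def add(self, mixel: Mixel) -> bool:
        """Claim a channel; fails once the polyphony budget is exhausted."""
        if len(self.mixels) >= self.max_mixels:
            return False                 # caller must drop a source or mix down
        self.mixels.append(mixel)
        return True
```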

35
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Mixels
  • Rather than diving more deeply into spatial audio
    systems, the rest of the chapter will concentrate
    on the nature of the control interfaces that will
    need to be developed to take full advantage of
    these new capabilities.

36
Characterization and Control of Acoustic Objects
  • Non-spatial dimensions and auditory symbology
  • Auditory icons - acoustic representations of
    naturally occurring events that caricature the
    action being represented
  • Earcons - elaborated auditory symbols which
    compose motifs into an artificial non-speech
    language, with phrases distinguished by rhythmic
    and tonal patterns.

37
Characterization and Control of Acoustic Objects
  • Non-spatial dimensions and auditory symbology
  • Filtears - a class of cues that are independent
    of distance and direction. They attempt to expand
    the spectrum of how we use sound, creating sounds
    with attributes attached to them. Think of it as
    sonic typography: placing sound in space can be
    likened to putting written information on a page.
    Filtears are dependent on source and sink.
  • Example - imagine you're telenegotiating with
    many people. You can select attributes of each
    person's voice (distance from you, direction,
    indoors/outdoors, whispers behind your ear, etc.),
    as in the sketch below.
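    One way such a cue might be realized in code; this
    "highlight" filtear that brightens the currently
    selected voice is a hypothetical example, not one
    from the chapter:

```python
import numpy as np
from scipy.signal import butter, lfilter

def highlight_filtear(voice: np.ndarray, sr: int, selected: bool) -> np.ndarray:
    """Brighten a selected speaker's voice, like setting text in bold.

    Mixes a boosted copy of the band above an assumed 3 kHz split back
    into the signal, making that voice stand out without moving it.
    """
    if not selected:
        return voice
    b, a = butter(2, 3000.0 / (sr / 2.0), btype="high")
    return voice + 0.5 * lfilter(b, a, voice)   # modest high-band emphasis
```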

38
Research Applications
  • Virtual acoustic displays featuring spatial sound
    can be thought of as enabling two performance
    advantages:
  • Situation Awareness - omnidirectional monitoring
    via direct representation of spatial information
    reinforces or replaces information in other
    modalities, enhancing one's sense of presence or
    realism.
  • Multiple Channel Segregation - can improve
    intelligibility, discrimination, and selective
    attention among audio sources.

39
Research Applications
  • Sonification
  • Teleconferencing
  • Music
  • Virtual Reality and Architectural Acoustics
  • Telerobotics and Augmented Audio Reality

40
Research Applications
  • Sonification
  • Sonification can be thought of as auditory
    visualization and can be used as a tool for
    analysis, for example, presenting multivariate
    data as auditory patterns. Because the visual and
    auditory channels can be independent of each
    other, data can be mapped differently to each
    mode of perception, and auditory mappings can be
    used to discover relationships that are hidden in
    the visual display.
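    A toy sketch of the idea, mapping one variable of
    a data series to pitch (the mapping constants are
    arbitrary assumptions):

```python
import numpy as np

def sonify(series: np.ndarray, sr: int = 44100, note_s: float = 0.2) -> np.ndarray:
    """Render a data series as a tone sequence: higher values, higher pitch.

    A second variable could independently drive loudness or timbre,
    which is what lets an auditory mapping expose structure that a
    single visual display hides.
    """
    lo, hi = float(series.min()), float(series.max())
    norm = (series - lo) / (hi - lo + 1e-12)      # rescale data to [0, 1]
    freqs = 220.0 * 2.0 ** (2.0 * norm)           # spread over two octaves above A3
    t = np.linspace(0.0, note_s, int(sr * note_s), endpoint=False)
    return np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
```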

41
Interface Control via Audio Windows
  • Audio Windows is an auditory-object manager.
  • The general idea is to permit multiple
    simultaneous audio sources, such as the parties
    in a teleconference, to coexist in a modifiable
    display without clutter or user stress.
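    A minimal sketch of what such an auditory-object
    manager might look like (class and method names
    are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class AudioWindow:
    """One auditory object: a source channel with user-adjustable placement."""
    name: str
    azimuth_deg: float
    gain: float = 1.0
    muted: bool = False

class AudioWindowManager:
    """Keeps simultaneous sources, e.g. conferees, individually adjustable."""

    def __init__(self) -> None:
        self.windows: dict[str, AudioWindow] = {}

    def open(self, name: str, azimuth_deg: float) -> AudioWindow:
        self.windows[name] = AudioWindow(name, azimuth_deg)
        return self.windows[name]

    def move(self, name: str, azimuth_deg: float) -> None:
        self.windows[name].azimuth_deg = azimuth_deg  # re-spatialized on next render
```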

42
Interface Design Issues: Case Studies
  • Veos and Mercury (written with Brian Karr)
  • Handy Sound
  • Maw

43
Interface Design Issues: Case Studies
  • Veos and Mercury (written with Brian Karr)
  • Veos - Virtual Environment Operating System
  • Sound Render Implementation - A software package
    that interfaces with a VR system (like Veos).
  • The Audio Browser - A hierarchical sound file
    navigation and audition tool.

44
Interface Design Issues: Case Studies
  • Handy Sound
  • Handy Sound explores gestural control of an audio
    window system.
  • Manipulating source position in Handy Sound
  • Manipulating source quality in Handy Sound
  • Manipulating sound volume in Handy Sound
  • Summary - Handy Sound demonstrates the general
    possibilities of gesture recognition and spatial
    sound in a multichannel conferencing environment.

45
Interface Design Issues: Case Studies
  • Maw
  • Developed as an interactive frontend for
    teleconferencing, Maw allows the user to arrange
    sources and sinks in a horizontal plane.
  • Manipulating source and sink positions in Maw
  • Organizing acoustic objects in Maw
  • Manipulating sound volume in Maw
  • Summary

46
Conclusion
  • Real-world examples

47
Sound authoring tools for future multimedia
systems
  • Bezzi, Marco; De Poli, Giovanni; Rocchesso,
    Davide
  • Univ di Padova, Padova, Italy
  • Summary
  • A framework for authoring non-speech sound
    objects in the context of multimedia systems is
    proposed. The goal is to design specific sounds
    and their dynamic behavior in such a way that
    they convey dynamic and multidimensional
    information. Sounds are designed using a
    three-layer abstraction model: a physically-based
    description of sound identity, a signal-based
    description of sound quality, and a perception-
    and geometry-based description of sound
    projection in space. The model is validated with
    the aid of an experimental tool where
    manipulation of sound objects can be performed in
    three ways: handling a set of parameter control
    sliders, editing the evolution in time of
    compound parameter settings, or via client
    applications sending their requests to the
    sounding engine. (Author abstract, 26 refs, in
    English. Proceedings of the 1999 6th
    International Conference on Multimedia Computing
    and Systems, IEEE ICMCS'99, June 7-11, 1999,
    Florence, Italy. Sponsored by IEEE CS and the
    IEEE Circuits and Systems Society.)

48
Interactive 3D sound hyperstories for blind
children
  • Lumbreras, Mauricio; Sanchez, Jaime
  • Univ of Chile, Santiago, Chile
  • Summary
  • Interactive software is currently used for
    learning and entertainment purposes. This type of
    software is not very common among blind children
    because most computer games and electronic toys
    do not have appropriate interfaces to be
    accessible without visual cues. This study
    introduces the idea of interactive hyperstories
    carried out in a 3D acoustic virtual world for
    blind children. We have conceptualized a model to
    design hyperstories. Through AudioDoom we have an
    application that enables testing cognitive tasks
    with blind children. The main research question
    underlying this work explores how audio-based
    entertainment and spatial sound navigable
    experiences can create cognitive spatial
    structures in the minds of blind children.
    AudioDoom presents first person experiences
    through exploration of interactive virtual worlds
    by using only 3D aural representations of the
    space. (Author abstract, 21 refs, in English.
    Proceedings of the CHI 99 Conference: CHI is the
    Limit - Human Factors in Computing Systems, May
    15-20, 1999, Pittsburgh, PA, USA. Sponsored by
    ACM SIGCHI.)

49
Any questions?
50
References
  • Modeling Realistic 3-D Sound Turbulence
  • http://www-engr.sjsu.edu/duda/Duda.Reports.html#R1
  • 3D Sound Aids for Fighter Pilots
  • http://www.dsto.defence.gov.au/corporate/history/jubilee/sixtyyears18.html
  • 3D Sound Synthesis
  • http://www.ee.ualberta.ca/khalili/3Dnew.html
  • Binaural Beat Demo
  • http://www.monroeinstitute.org/programs/bbapplet.html
  • Begault, Durand R. "Challenges to the Successful
    Implementation of 3-D Sound", NASA-Ames Research
    Center, Moffett Field, CA, 1990.
  • Begault, Durand R. "An Introduction to 3-D Sound
    for Virtual Reality", NASA-Ames Research Center,
    Moffett Field, CA, 1992.

51
References
  • Burgess, David A. "Techniques for Low Cost
    Spatial Audio", UIST 1992.
  • Foster, Wenzel, and Taylor. "Real-Time Synthesis
    of Complex Acoustic Environments", Crystal River
    Engineering, Groveland, CA.
  • Smith, Stuart. "Auditory Representation of
    Scientific Data", Focus on Scientific
    Visualization, H. Hagen, H. Muller, G.M. Nielson,
    eds., Springer-Verlag, 1993.
  • Stuart, Rory. "Virtual Auditory Worlds: An
    Overview", VR Becomes a Business, Proceedings of
    Virtual Reality 92, San Jose, CA, 1992.
  • Takala, Tapio and James Hahn. "Sound Rendering",
    Computer Graphics, 26(2), July 1992.

52
One last thing
  • For those who want to have a little fun, try
    these:
  • http://www.cs.indiana.edu/picons/javoice/index.html
  • http://ourworld.compuserve.com/homepages/Peter_Meijer/javoice.htm

53
The Design of Multidimensional Sound Interfaces
Michael Cohen and Elizabeth M. Wenzel
  • Presented by Andrew Snyder and Thor Castillo

February 3, 2000
HFE760 - Dr. Gallimore