Title: KIPA Game Engine Seminars
1. KIPA Game Engine Seminars
Day 13
- Jonathan Blow
- Ajou University
- December 10, 2002
2. Audio Overview
- Systems tend to have a separate audio processor
- PCs: an audio card like SoundBlaster, or on-board audio
- Xbox: Nvidia's sound chip
- GameCube: an audio processor with ARAM
3. Analogy to Graphics Hardware
- Off-CPU graphics processing
- Texture memory
- Upload things to texture memory and tell the processor to use them
- Some sound hardware supports this model
- Creative's cards with SoundFont
- Some APIs are built around this model
- OpenAL
4. However
- Support for SoundFont-style hardware is not widespread
- OpenAL is not widely adopted
- I suggest that this is because this method introduces unneeded complexity, while not solving real problems
- I don't like OpenAL
5. Let's think about audio
- CD-quality audio (44100 Hz), stereo, 16 bits per channel, is pretty good
- That's 176400 bytes per second
- For a 60fps game, that's 2940 bytes per frame (not much!)
- If outputting Dolby 5.1, of course there will be more data
- PCI bus bandwidth is 133MB-1066MB/sec
- Audio is 0.2% of bus bandwidth or less
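The arithmetic above can be checked directly; a minimal sketch, taking the 133MB/sec figure as the baseline PCI bandwidth mentioned on the slide:

```cpp
#include <cassert>

// CD-quality stereo: 44100 samples/sec * 2 channels * 2 bytes per sample.
constexpr int kBytesPerSecond = 44100 * 2 * 2;        // 176400 bytes/sec
constexpr int kBytesPerFrame  = kBytesPerSecond / 60; // 2940 bytes per 60fps frame

// Fraction of a 133MB/sec PCI bus consumed by streaming this audio.
constexpr double kBusBytesPerSecond = 133.0e6;
constexpr double kBusFraction = kBytesPerSecond / kBusBytesPerSecond; // ~0.0013
```

So the audio stream uses roughly 0.13% of the slowest PCI configuration, which is why streaming it every frame is cheap.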
6. Let's think about graphics
- 32MB of textures at 60fps is almost 2 gigabytes of texture data per second
- So it's obvious why we don't stream texture data to the hardware every frame
- But nobody likes having to upload textures to the hardware on PCs
- It only creates code complications and bugs
- So we should avoid those complications in audio, where we don't need them
7. The problem with OpenAL
- They want to be too much like OpenGL
- They're extending the "texture on video hardware" analogy to a place where it's not appropriate
- This makes their API harder to use than it ought to be
8. GameCube with ARAM
- Means you basically must follow the "upload data to graphics hardware" analogy
- Sound is harder to code here than on the PC
- You need to make a system that streams from the DVD drive into ARAM, and that is annoying
9. Audio Processing
10. 3D Sound Processing
- Doesn't seem to work very well on consumer hardware
- Overview of HRTF (Head-Related Transfer Function)
- General overview of distance-based processing methods
- So far, 3D sound has been mostly marketing hype
- Though maybe sometime in the future it will be done well
- Techniques like Dolby 5.1 have a much more concrete benefit
11. FIR and IIR filters
- Finite Impulse Response
- Infinite Impulse Response
- Diagram on whiteboard of what these mean
- FIR filters can produce linear phase
- If the filter kernel is symmetric
- IIR filters do not produce linear phase
- Since linear phase is not so important in audio, IIR filters find much more use than in graphics
12. Specific Effects
13. Low-Pass Filter, High-Pass Filter
- Convolve sound with a symmetric filter kernel (FIR)
- Kernel can be fairly small
- Since the kernel is symmetric we can save some CPU time (show on whiteboard)
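The whiteboard trick can be sketched as follows; the function and variable names are illustrative, not from the talk. For a symmetric kernel, h[k] == h[n-1-k], so each pair of taps shares one coefficient and roughly half the multiplies disappear:

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

// Convolve `input` with a symmetric, odd-length FIR kernel, exploiting
// h[k] == h[n-1-k] to do about half the multiplies. Samples outside the
// input are treated as zero.
std::vector<float> convolve_symmetric(const std::vector<float> &input,
                                      const std::vector<float> &kernel) {
    const long n = (long)kernel.size();   // odd, symmetric
    const long half = n / 2;
    std::vector<float> output(input.size(), 0.0f);

    auto sample = [&](long i) -> float {
        return (i >= 0 && i < (long)input.size()) ? input[i] : 0.0f;
    };

    for (long i = 0; i < (long)input.size(); i++) {
        // Center tap, multiplied once.
        float acc = kernel[half] * sample(i);
        // Paired taps: one multiply covers both x[i-k] and x[i+k].
        for (long k = 1; k <= half; k++)
            acc += kernel[half - k] * (sample(i - k) + sample(i + k));
        output[i] = acc;
    }
    return output;
}
```

With a 3-tap moving-average kernel {1/3, 1/3, 1/3}, for example, the inner loop does one multiply per sample instead of two, plus the center tap.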
14. Slow down, Speed up sound
- This is a resampling problem
- (Discuss what happens with low-quality resampling)
- To speed up, low-pass filter and then downsample
- To slow down by an even multiple, fill the signal with zeroes then low-pass filter
- To slow down by some other multiple, you can write a generalized resampler
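A generalized resampler can be sketched with linear interpolation; this is a crude version for illustration (a production resampler would band-limit with a windowed-sinc filter first to avoid aliasing, as the slide's low-pass steps imply):

```cpp
#include <vector>
#include <cstddef>

// Crude generalized resampler using linear interpolation.
// ratio > 1 speeds the sound up (fewer output samples);
// ratio < 1 slows it down (more output samples).
std::vector<float> resample_linear(const std::vector<float> &input, double ratio) {
    std::vector<float> output;
    if (input.empty() || ratio <= 0.0) return output;

    // Walk the input at `ratio` samples per output sample, interpolating
    // linearly between the two nearest input samples.
    for (double pos = 0.0; pos < (double)(input.size() - 1); pos += ratio) {
        size_t i = (size_t)pos;
        double frac = pos - (double)i;
        output.push_back((float)((1.0 - frac) * input[i] + frac * input[i + 1]));
    }
    return output;
}
```

Resampling {0, 1, 2, 3} at ratio 0.5 yields twice as many samples, each halfway between its neighbors.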
15. Delay sound (like for HRTFs)
- Integer sample delay is easy; you just change the offset you use to mix
- For non-integer delay, you need some kind of filter
- Either explicitly resample, or use an all-pass filter
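One concrete form of the all-pass option is the first-order all-pass fractional delay; the coefficient formula below is the standard textbook one, not something specified in the talk:

```cpp
#include <cmath>

// First-order all-pass fractional delay. For a desired delay d in roughly
// [0, 1] samples, a = (1 - d) / (1 + d) gives a filter whose phase delay
// approximates d at low frequencies while passing all frequencies at
// unit magnitude (hence "all-pass").
struct AllpassDelay {
    float a;           // all-pass coefficient
    float x1 = 0.0f;   // previous input
    float y1 = 0.0f;   // previous output

    explicit AllpassDelay(float fractional_delay)
        : a((1.0f - fractional_delay) / (1.0f + fractional_delay)) {}

    float process(float x) {
        // y[n] = a*x[n] + x[n-1] - a*y[n-1]
        float y = a * x + x1 - a * y1;
        x1 = x;
        y1 = y;
        return y;
    }
};
```

As a sanity check, a delay of exactly 1.0 makes a = 0 and the filter degenerates to a pure one-sample delay.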
16. Per-Frequency Pitch Shifting (Doppler, etc)
- Fourier version (move the spectrum)
- Time-domain version (low-pass filter, multiply by a complex exponential)
- But our signal becomes no longer real-valued!
- What exactly does that mean?
17. Sound effect shifting
- Shifting a sound effect is more complicated than shifting individual frequencies
- Natural sounds contain harmonics
- Multiples of a fundamental frequency
- If the fundamental frequency is shifted by Δ, the harmonics need to be shifted by 2Δ, 3Δ, etc
18. My preferred sound architecture
- Central mixer that streams audio to the hardware
- Declare streaming channels
- None of this static circular buffer stuff
- Samples can be added to these channels at any time
- Chain filters onto channels to add various effects
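A minimal sketch of the streaming-channel idea; all names here are hypothetical, not from the talk's engine. Samples are pushed onto a channel at any time, and the mixer drains whatever is queued, running each channel's filter chain as the samples flow through:

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>

// A streaming channel: a queue of samples not yet mixed, plus a chain
// of per-channel effects applied at mix time. (Hypothetical sketch.)
struct Channel {
    std::vector<int16_t> pending;           // samples pushed but not yet mixed
    std::vector<float (*)(float)> filters;  // effect chain, applied in order

    void push_samples(const int16_t *samples, size_t count) {
        pending.insert(pending.end(), samples, samples + count);
    }
};

// Drain every channel into a 32-bit mix buffer (wide type gives headroom
// when summing many channels; see the mixing slide later in the talk).
void mix(std::vector<Channel> &channels, std::vector<int32_t> &mix_buffer) {
    for (Channel &ch : channels) {
        size_t n = ch.pending.size() < mix_buffer.size()
                 ? ch.pending.size() : mix_buffer.size();
        for (size_t i = 0; i < n; i++) {
            float s = (float)ch.pending[i];
            for (auto f : ch.filters) s = f(s);  // apply effect chain
            mix_buffer[i] += (int32_t)s;
        }
        ch.pending.erase(ch.pending.begin(), ch.pending.begin() + n);
    }
}
```

The key property is that the game only ever pushes data in; the channel never calls back into game code.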
19. Preferred Architecture (2)
- Sound channels are immediate-mode as much as possible
- You don't give them pointers to objects in the game; you have to push position and orientation data onto them in order to spatialize, they don't pull
- Two reasons for this
- Orthogonality
- Control over when changes happen
20. Coding Issues: Dealing with volume, pan, etc
- Can't let them change discontinuously
- This would cause a pop!
- Need to interpolate them over time
- This adds latency! (whiteboard discussion)
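The interpolation above can be sketched as a per-sample ramp; the struct and field names are illustrative. Instead of jumping to the new volume, we move toward it a bounded step per sample, so there is no discontinuity in the output:

```cpp
#include <cstddef>

// Ramp the applied volume toward its target across a block of samples
// instead of jumping to it, avoiding a pop. The ramp length (and hence
// the latency the slide mentions) is controlled by `step`.
struct VolumeRamp {
    float current;  // volume applied right now
    float target;   // volume most recently requested by the game
    float step;     // maximum change per sample

    void apply(float *samples, size_t count) {
        for (size_t i = 0; i < count; i++) {
            if (current < target) {
                current += step;
                if (current > target) current = target;
            } else if (current > target) {
                current -= step;
                if (current < target) current = target;
            }
            samples[i] *= current;  // scale by the smoothed volume
        }
    }
};
```

The same pattern applies to pan or any other mixer parameter that must not change discontinuously.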
21. Coding Issues: Sending sound to the hardware
- Usually need to pre-fill far enough ahead on a circular buffer
- This adds latency!
22. Coding Issues: Sound system responsiveness
- Don't want the audio to skip when the frame rate drops
- Audio probably needs to be in its own thread
- Unfortunate, since I don't like multiple threads
- Synchronization between audio and main thread
23. Sound system responsiveness (2)
- Also, when we are spooling sound from disk, we want that to be responsive
- Should be in its own thread also
- Unclear whether it's a good idea to pack this into the same thread as the low-level audio (application-dependent)
24. Audio Mixing
- Even if output is 16-bit, you want to use 32-bit integers while mixing the inputs
- This reduces the amount of added noise
- For maximum accuracy we want sounds stored so that they use the whole 16 bits
- Example of overflow that is saved by 32-bit numbers
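The overflow example can be sketched like this (function name is illustrative). Two full-scale 16-bit inputs would wrap around if summed in int16_t; a 32-bit accumulator keeps the intermediate sum exact, and we clamp only once on output:

```cpp
#include <cstdint>
#include <cstddef>

// Mix several 16-bit sources into a 16-bit output through a 32-bit
// accumulator. The wide accumulator never overflows for any realistic
// channel count, so clamping happens exactly once, at the end.
void mix_16bit(const int16_t *const *sources, size_t num_sources,
               int16_t *output, size_t num_samples) {
    for (size_t i = 0; i < num_samples; i++) {
        int32_t acc = 0;
        for (size_t s = 0; s < num_sources; s++)
            acc += sources[s][i];  // exact in 32 bits

        // Saturate to the 16-bit range only on output.
        if (acc > 32767)  acc = 32767;
        if (acc < -32768) acc = -32768;
        output[i] = (int16_t)acc;
    }
}
```

For example, 30000 + 30000 saturates cleanly to 32767 instead of wrapping to a large negative value.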
25. How sound volume is controlled
- 1/r² attenuation
- Min volume / max volume distances
- Paradox of infinite loudness
- Modeling sound as pressure (in Pascals), not loudness (in Decibels)
- Emitter of finite size radiating a certain pressure density
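One common shape for the attenuation curve described above (the exact curve used in the talk's engine is not specified): full volume inside the min distance, inverse-square falloff beyond it, and silence past the max distance. Giving the emitter finite size (min_dist > 0) is what avoids the infinite-loudness paradox at r = 0:

```cpp
// Distance-based gain: 1/r^2 falloff clamped by min/max distances.
// Illustrative sketch; a real engine might also fade smoothly to zero
// near max_dist instead of cutting off.
float distance_gain(float r, float min_dist, float max_dist) {
    if (r <= min_dist) return 1.0f;  // inside the emitter: full volume
    if (r >= max_dist) return 0.0f;  // beyond the audible range
    return (min_dist * min_dist) / (r * r);  // inverse-square, relative to min_dist
}
```

Doubling the distance beyond min_dist quarters the gain, matching the 1/r² law on the slide.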
26. Overview of engine sound system implementation
- Class hierarchy
- Win32-specific piece does the talking to DirectSound
- Cross-platform piece does most of the control
27. Audio Control Flow
- No callbacks (buffer notifications, etc)
- Absolutely everything is a push model
- If you want sound B to happen after sound A, you are responsible for noticing that A is almost done and that it's time to play B
- Or else you can decide far ahead of time to just append B and forget about it
28. In-Engine Audio Format Support
- .wav loading
- .ogg loading / streaming
- Discussion of .ogg vs .mp3
- Can use compressed audio for even small samples, to reduce download time
- Though maybe you uncompress them to a simple format the first time the game runs
29. Supported sampling rates
- 44100 Hz only
- This is unlike most engines / audio APIs (which will try to resample for you if you provide things at a different rate)
- I wanted things as simple as possible
- Sometimes this might cause development bottlenecks, since you might have to resample any acquired audio (like off the internet) before using it
- So I might change this decision in the future