Frequency response adaptation in binaural hearing - PowerPoint PPT Presentation

Title: Why do concert halls sound different and how can we design them to sound better? Author: david griesinger Last modified by: david griesinger – PowerPoint PPT presentation

1
Frequency response adaptation in binaural hearing
  • David Griesinger
  • Cambridge MA USA
  • www.DavidGriesinger.com

2
Introduction
  • This paper poses fundamental questions about
    the properties of human hearing (the topic of
    this conference)
  • How do we localize sounds in the up/down and
    front/back planes?
  • Are the methods used different for different
    individuals?
  • Can binaural recordings made for one individual
    be made to work for another individual without
    head-tracking?
  • Given the extremely non-uniform transfer of sound
    pressure from the soundfield to a human eardrum,
    how can we accurately perceive a frequency
    balance as flat?
  • Does a frequency balanced pink noise from a
    frontal loudspeaker sound balanced in frequency?
  • If not, are commercial recordings, which are
    equalized using loudspeakers, actually frequency
    balanced?
  • If not, in what ways are they biased?

3
Better binaural technology
To answer these questions the author constructed
an accurate physical model of his own hearing,
all the way to the eardrum. The pinna compliance
is modeled by cutting away the inside of the
casting. The eardrum impedance is modeled with a
resistance tube. The ear canal is an accurate
silicone cast all the way to the eardrum.
Tiny probe microphones were also built with a
very soft tip. This allows binaural recording of
performances at the author's eardrums, correct
headphone calibration, and verification of the
accuracy of the dummy head model.
4
A perplexing discrepancy
  • Recordings made with this technology provide
    excellent localization accuracy.
  • But at least initially the timbre of the playback
    through carefully calibrated headphones seems
    incorrect.
  • The frequencies around 3kHz seem too strong, and
    the bass is usually weaker than my memory of the
    performance.
  • Checking and re-checking the calibrations has
    convinced me the recordings and the playback are
    correct.
  • It is my memory of the performance that is
    flawed.
  • The most reasonable explanation is that we
    continuously adapt to the frequency balance of
    sounds around us. We remember the timbre after
    such adaptation has taken place.

5
A simple model of human hearing
Over a long period of time the brain builds
spectral maps for the features that define
up/down and back/front information in HRTFs.
When a sound is heard these features are compared
to the maps, and a localization is found.
6
A simple model of human hearing-2
When a match has been found, the perceptible
features of the particular HRTF are removed,
again from a fixed spectral map. But this
spectrum is altered by a relatively short time
constant adaptive equalizer, which acts to make
all frequency bands equally perceived. The time
constant of this mechanism for the author is
about 5 minutes. It may be shorter for some
individuals.
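
The adaptive equalizer described above can be sketched as a per-band leaky integrator. This is only an illustrative model: the slide gives the time constant (about 5 minutes for the author), but the first-order update rule, the choice of the mean band level as the target, and the names adapt_band_gains, levels and gains are assumptions made for this example.

```python
import math

def adapt_band_gains(band_levels_db, gains_db, dt_s, tau_s=300.0):
    """One update step of a per-band leaky integrator (assumed model).

    band_levels_db: current input level of each frequency band, in dB.
    gains_db: current correction gain of each band, in dB.
    Each gain drifts toward the value that would bring its band to the
    mean level across bands, with time constant tau_s (300 s = 5 min).
    """
    alpha = 1.0 - math.exp(-dt_s / tau_s)
    mean_level = sum(band_levels_db) / len(band_levels_db)
    new_gains = []
    for level, gain in zip(band_levels_db, gains_db):
        target = mean_level - level          # gain that would flatten this band
        new_gains.append(gain + alpha * (target - gain))
    return new_gains

# Hypothetical input: the middle band is 6 dB stronger than its neighbours.
levels = [0.0, 6.0, 0.0]
gains = [0.0, 0.0, 0.0]
for _ in range(300):                         # 300 steps of 1 s = one time constant
    gains = adapt_band_gains(levels, gains, dt_s=1.0)

# Perceived levels after adaptation so far (input level plus correction gain).
perceived = [lvl + g for lvl, g in zip(levels, gains)]
```

After one time constant each gain has covered about 63% of the distance to its target, so the 6 dB imbalance in this toy input has shrunk but not vanished. A shorter tau_s would model the faster adaptation the slide suggests for some individuals.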
7
An example
  • The author once noticed a gliding whistle while
    walking under an overhead ventilator slot that
    emitted broadband noise.
  • Walking rapidly (3.5 mph) under that noise source
    produced a gliding whistle, somewhat like a
    Doppler shift.
  • This is the uncorrected sound of the vertical
    HRTFs.
  • In spite of the lack of timbre correction, the
    sound was correctly localized even at much
    higher speeds.
  • No timbre shift was perceived when walking slowly
    under the slot (<2 mph).
  • When there is sufficient time our brains correct
    the timbre, but this correction takes time: in
    this case, a fraction of a second.

8
Headphone listening
When we listen to binaural recordings with
headphones the whole process is broken.
Headphones match individuals very poorly (as we
will see). None of the spectral features match
the fixed HRTF maps. The brain is confused, and
the subject perceives the sound inside the
head. But the adaptive equalizer is still active
and after a time period the sound is perceived
as frequency balanced.
9
Consequences of adaptation for sound engineers.
  • Tonmeisters talk about being familiar with a
    particular loudspeaker or studio.
  • They claim they can make an accurately balanced
    recording with these tools.
  • A logical conclusion is that the timbre of
    loudspeakers or playback equipment is irrelevant.
  • As long as you are familiar with it everything
    is fine.
  • But the conclusion is clearly false.
  • A recent book by Floyd Toole details the changes
    in the frequency content of popular records as
    fashion in monitor loudspeakers changed.
  • All sound reinforcement engineers are aware of
    how much intelligibility can increase when a
    sound system is equalized. This typically
    involves a treble boost above 1000Hz.
  • Absolute frequency balance matters.

10
Upward Masking
Sound enters the cochlea at the oval window.
High frequencies excite the basilar membrane near
the entrance, passing through it and exiting through
the second window below. Low frequencies travel
further down the spiral, until they excite the
membrane and pass through. Strong low frequencies
disturb the high frequency portion of the
membrane, causing the well-known phenomenon of
upward masking.
Upward masking is a purely mechanical effect, and
it cannot be compensated by adaptive
equalization. The high frequencies are simply
not detected. Intelligibility is frequently low
in acoustic spaces because there is little low
frequency absorption, and the LF acoustic power
is boosted. We adapt to the frequency imbalance,
and say the sound is OK but unintelligible.
11
Upward masking and mixing
  • A consequence of upward masking is that elements
    in a mix that are audible in one studio or set of
    loudspeakers may be masked in another.
  • Recordings mixed over headphones can be seriously
    in error.
  • Most headphones boost the treble, raising the
    apparent clarity.
  • As an engineer I learned early to mistrust the
    balance between direct and reverberation over
    headphones.
  • The best I could do was make the recording much
    dryer than I, or my clients, preferred and hope
    for the best.
  • One can always make the recording more reverberant.
  • Making it dryer is much more difficult!
  • Can we find a way to correct headphone errors?

12
Accurate binaural recordings
If safe, comfortable probe microphones are
available, it is possible to make accurate
binaural recordings. First we measure the
headphone response at the eardrum: response H.
We can then record with the same probe
microphones. If we equalize the recording with
the inverse of H, 1/H, the recording will play
back over those headphones with perfect fidelity.
13
Playback of binaural over speakers
If we want to play back the binaural recording
over speakers, or if we want to play loudspeaker
music over headphones, we need to measure the
spectrum of a carefully equalized loudspeaker at
the eardrums of the listener. This is the
spectrum S. We then equalize the binaural
recording with the inverse of S, 1/S, and we can
play it over speakers. Equalizing the phones with
S/H allows playback of both binaural and
loudspeaker mixed music. S/H is the inverse of the
free-field earphone response.
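
The bookkeeping in this slide and the previous one can be checked by working in decibels, where equalizing with an inverse is a subtraction. The sketch below assumes the reading that speaker playback uses the inverse of S and that the headphone equalizer is S divided by H. All response values are made-up illustrative numbers, not measurements.

```python
# Hypothetical responses in dB at four frequencies (illustrative values only).
freqs_hz = [250.0, 1000.0, 3000.0, 8000.0]
H_db = [1.0, 0.0, 9.0, -4.0]     # headphone response measured at the eardrum
S_db = [0.0, 2.0, 12.0, 3.0]     # equalized frontal loudspeaker, at the eardrum
source_db = [0.0, 0.0, 0.0, 0.0] # spectrum of a frontal source in the soundfield

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# A probe recording of a frontal source carries the eardrum spectrum S.
recording_db = add(source_db, S_db)

# Case 1: equalize the recording with 1/H, then play through the headphones (H).
case1 = add(sub(recording_db, H_db), H_db)

# Case 2: equalize the recording with 1/S for speakers; the speaker path adds S.
speaker_version_db = sub(recording_db, S_db)
case2 = add(speaker_version_db, S_db)

# Case 3: headphones equalized with S/H play the speaker version.
phone_eq_db = sub(S_db, H_db)
case3 = add(add(speaker_version_db, phone_eq_db), H_db)
```

In all three cases the spectrum arriving at the eardrum equals the original eardrum recording, which is the sense in which the probe response and the playback chain cancel.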
14
Binaural equalization in practice
  • Note the two previous slides made no attempt to
    equalize the probe microphone(s).
  • With those schemes, the response of the probe
    cancels in the final result.
  • In practice, the probe response is complicated
    and difficult to invert.
  • The author carefully measures the impulse
    response of the probes with a B&K 4133 as a
    reference.
  • The responses are inverted in the frequency
    domain with Matlab. With care minimal pre-echo
    is produced.
  • All measurements with the probes are first
    convolved with this inverse function.
  • Second order parametric filters are combined to
    produce the other equalization filters.
  • Parametric filters can be easily inverted, and
    sound better than mathematical inverse filters to
    the author.
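
The claim that parametric filters are easily inverted can be illustrated with a second-order peaking section. The sketch below uses the widely published Audio EQ Cookbook peaking formulas, which is an assumption: the slides do not say which filter design the author used. The centre frequency, gain and Q are arbitrary example values.

```python
import cmath
import math

def peaking_biquad(f0_hz, gain_db, q, fs_hz):
    """Peaking EQ coefficients (b, a) in the Audio EQ Cookbook form, a[0] = 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a = [1.0 + alpha / A, -2.0 * math.cos(w0), 1.0 - alpha / A]
    return [c / a[0] for c in b], [c / a[0] for c in a]

def response(b, a, f_hz, fs_hz):
    """Complex frequency response of the biquad at f_hz."""
    z1 = cmath.exp(-2j * math.pi * f_hz / fs_hz)   # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den

fs = 48000.0
boost = peaking_biquad(3000.0, +6.0, 1.4, fs)   # hypothetical 6 dB peak at 3 kHz
cut = peaking_biquad(3000.0, -6.0, 1.4, fs)     # same section with the gain negated

# Cascade of the two sections, in dB, at a few test frequencies.
combined_db = []
for f in (100.0, 1000.0, 3000.0, 10000.0):
    h = response(*boost, f, fs) * response(*cut, f, fs)
    combined_db.append(20.0 * math.log10(abs(h)))
```

Negating the gain swaps the numerator and denominator of this filter form, so the cascade is flat to within rounding error. No long inverse impulse response is needed, which fits the slide's remark that such inverses avoid the artifacts of a mathematical inverse filter.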

15
Probe Equalization
This graph shows the frequency response and time
response of the digital inverse of the two probes
as measured against a B&K 4133 microphone. Matlab
is used to construct the precise digital inverse
of the probe response, both in frequency and in
time. The resulting probe response is flat from
25Hz to 17kHz. In general, I prefer NOT to use
a mathematical inverse response, as these
frequently contain audible artifacts. I
minimized these artifacts here by carefully
truncating the measured response as a function of
frequency.
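
A frequency-domain inverse of a measured impulse response can be sketched as follows. Note that this example substitutes a simple regularized inverse for the author's frequency-dependent truncation, whose details the slides do not give. The two-tap impulse response, the regularization constant eps and the transform length are hypothetical, and a plain DFT is used only to keep the example free of dependencies (an FFT library would normally be used).

```python
import cmath

def dft(x, inverse=False):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def regularized_inverse(h, n_fft, eps=1e-6):
    """Inverse filter conj(H) / (|H|^2 + eps), returned as a real impulse response."""
    padded = list(h) + [0.0] * (n_fft - len(h))
    H = dft(padded)
    H_inv = [Hk.conjugate() / (abs(Hk) ** 2 + eps) for Hk in H]
    return [v.real for v in dft(H_inv, inverse=True)]

def circular_convolve(a, b):
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]

h = [1.0, 0.5]                    # toy stand-in for a measured probe response
n_fft = 32
h_inv = regularized_inverse(h, n_fft)
padded_h = h + [0.0] * (n_fft - len(h))

# Probe response followed by its inverse: ideally a single unit impulse.
result = circular_convolve(padded_h, h_inv)
```

The constant eps limits the gain of the inverse where the measured response is weak. That is one common way to keep an inverse filter from amplifying noise, at the cost of a slightly imperfect correction.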
16
Adaptive timbre: how do we perceive pink noise
as flat?
  • Pink noise sounds plausibly pink even on this
    sound system.
  • Let's add a single reflection and listen for a
    few minutes without other sounds.
  • The result at first sounds colored, with an
    identifiable pitch component.
  • The pitch component gradually reduces its
    loudness.
  • But now play the unaltered noise again.
  • The unaltered noise now has a pitch,
    complementary to the pitch from the reflection.
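
The pitch heard in this demonstration follows from the comb-filter spectrum that a single reflection imposes on the noise. The sketch below computes that spectrum and the complementary coloration expected once hearing has fully adapted. The 2 ms delay and the reflection amplitude of 0.7 are hypothetical values, and complete adaptation is an idealization.

```python
import cmath
import math

def comb_magnitude_db(f_hz, delay_s, reflection_gain):
    """Magnitude in dB of direct sound plus one delayed copy: 1 + g*exp(-j*2*pi*f*T)."""
    h = 1.0 + reflection_gain * cmath.exp(-2j * math.pi * f_hz * delay_s)
    return 20.0 * math.log10(abs(h))

delay_s = 0.002             # hypothetical 2 ms reflection delay
g = 0.7                     # hypothetical relative amplitude of the reflection
pitch_hz = 1.0 / delay_s    # spacing of the spectral peaks, heard as a pitch

peak_db = comb_magnitude_db(pitch_hz, delay_s, g)          # reflection in phase
notch_db = comb_magnitude_db(pitch_hz / 2.0, delay_s, g)   # reflection out of phase

# If hearing has adapted by applying the negative of the comb response, the
# unaltered (flat) noise is then perceived with the inverted ripple.
aftereffect_at_old_peak_db = 0.0 - peak_db
aftereffect_at_old_notch_db = 0.0 - notch_db
```

With these numbers the peaks fall at multiples of 500 Hz. After adaptation the former notches become the emphasized frequencies, which matches the complementary pitch described in the last bullet.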

17
Some demos of eardrum recordings
  • These recordings have been equalized for
    loudspeaker reproduction. You may be able to
    judge clarity and intelligibility over near-field
    loudspeakers.
  • Accurate headphone reproduction requires
    headphone equalization.
  • If probes are available, the method described here
    will work.
  • A method which uses equal loudness curves will be
    described later in this paper.
  • Opera, balcony 2, seat 11: moderate
    intelligibility, reverberant sound.
  • Opera, balcony 3, seat 12: poor intelligibility,
    very reverberant.
  • Opera, standing room, deep under balcony 2: good
    intelligibility.
  • A concert hall, row 8 (quite close): very good
    sound. Not so good further back.

18
The need for eardrum measurements
  • Almost all current binaural research uses HRTF
    and headphones with a blocked or partially
    blocked ear canal.
  • There is an assumption (without proof) that such
    measurements accurately reproduce the sound
    pressure at the eardrum.
  • The assumption is blatantly false. To quote
    Hammershøi and Møller:
  • "The most immediate observation is that the
    variation in sound transmission from the
    entrance of the ear canal to the eardrum from
    subject to subject is rather high ... The presence of
    individual differences has the consequence that
    for a certain frequency the transmission differs
    as much as 20dB between subjects."
  • 20dB is a significant difference in response!
  • In spite of the data, Hammershøi and Møller
    recommend using measurements at the entrance to
    the ear canal!
  • The recommendation can be disproved by a single
    subject.

19
HRTFs from blocked ear canals
Here are pictures of a partially blocked canal
and a fully blocked canal. The following data
applies to the fully blocked measurements, but
the partially blocked measurements are similar.
20
Blocked measurements vs eardrum
  • To compare the two measurement methods, I
    equalize the blocked measurement of a single HRTF
    to the same HRTF measured at the eardrum. I
    chose the HRTF at azimuth 15 degrees left, and 0
    degrees elevation.
  • The needed equalization requires at least 3
    parametric sections.
  • Red is the right ear, blue is the left ear

21
HRTF differences blocked to eardrum
Twenty different HRTFs were measured with a
blocked canal, equalized by the above EQ, and the
differences between them and the open ear canal
measurements are plotted. This data supports
Hammershøi and Møller's contention that the directional
properties of the measured HRTFs are preserved by
the blocked measurement, at least to a frequency
of 7kHz. Note the vertical scale is ±30dB. The
errors at 7-10kHz are significant.
22
Headphone response differences
Using the same method, I measured three
headphones. Blue is the AKG 701, red is the AKG
240, and cyan is the Sennheiser 250. The curves
plot the difference between the blocked and
unblocked measurement, with the measured HRTF at
azimuth 15, elevation 0 as a reference. The
vertical scale is ±30dB. Errors of at least
10dB exist at midband.
23
More headphones
Blue: an old but excellent noise protection
earphone by Sharp. Red: iPod earbuds. The
errors in the blocked measurements are large
enough to prevent accurate localization of
binaural recordings.
24
Analysis
  • The previous curves are NOT the frequency
    response of the headphones under test. They show
    the ERRORs that occur when a blocked ear canal
    measurement is used instead of the eardrum
    pressure.
  • Because the scale of the plots is ±30dB, the
    difference curves look better than they really
    are. Errors of 10dB in frequency ranges vital
    for timbre are present for almost all the
    examples shown.
  • We can conclude that it is possible to use
    recordings from dummy heads that lack accurate
    ear canals IF AND ONLY IF it is possible to
    equalize them, either by comparison to a
    reference with ear canals, or by equalizing them
    to sort-of flat for a frontal sound source. If
    this is done, we must also equalize the
    headphones at the eardrum for the same source.
  • We can with more assurance conclude that it is
    NOT possible to equalize headphones with a
    measurement system that does NOT include an
    accurate ear canal model.
  • Neither KEMAR nor HATS qualifies.
  • Measurement systems with true ear canals are a
    very good thing.
  • In addition, I have found that for many earphones
    it is vital to have a pinna model with identical
    compliance to a human ear.
  • On-ear headphones in particular alter the concha
    volume, and drastic changes in the frequency
    response can result if the compliance is not
    accurate.
  • Pinnae are complex structures with variable
    compliance, so this is tricky!

25
Headphone calibration through equal loudness
contours
  • There is a non-invasive method of headphone
    calibration to an individual.
  • IEC publication 268-7 and German Standard DIN
    45-619 recommend loudness comparison using 1/3
    octave noise instead of physical measurement for
    headphones.
  • These recommendations were superseded by diffuse
    field measurements as suggested by Theile.
  • Should these methods be revived? I believe the
    answer is yes.

26
Equal Loudness
Top: ISO equal loudness curves for 80dB and 60dB
SPL. These are the average from many individuals,
so features in them are broadened. Bottom
(blue/red): averaged frontal response over a ±5
degree cone in front of the author, measured at
the eardrums. The loudspeaker was equalized to
200Hz. Bottom (black/cyan): the same
measurement for the author's dummy head with no
equalization. The difference in eardrum impedance
above 8kHz boosts the response of the dummy, but
this can be removed by equalization.
27
Equal Loudness 2
  • We can measure equal loudness curves because the
    ear does not adapt when the stimulus is narrow
    band, either noise or tone.
  • The differences between the top and bottom curves
    in the previous slide can be attributed to the
    properties of the middle ear and the inner ear.
  • Thus equal loudness curves are a method of
    measuring the effective frequency response of an
    individual's hearing system in the absence of
    short-term adaptation to the environment.
  • They represent our sensitivity to timbre in a
    quiet environment, or before adaptation takes
    place.
  • Their extreme lack of flatness is proof of the
    existence, and effectiveness, of adaptation.

28
Loudness matching experiments
  • The author wrote a Windows program that presents
    a subject with alternating bands of 1/3 octave
    noise, one at 500Hz, and the other at a test
    frequency.
  • The subject matches the loudness of the two bands
    by adjusting the test band up and down.
  • In use, the equal loudness curves from 500Hz to
    12kHz for a carefully equalized frontal
    loudspeaker are obtained for this subject.
  • The subject then repeats the experiment with a
    pair of headphones over a frequency range of 30Hz
    to 12kHz.
  • In this case the balance between the two ears is
    also tested and corrected.
  • The difference of the loudspeaker and headphone
    measurements becomes the ideal headphone
    correction for this individual.
  • This program can be used to test the variation in
    response of a particular headphone over a wide
    range of individuals.
  • Subjects report that the resulting equalization
    is very pleasant, and binaural recordings made
    with the author's ears reproduce well without
    head tracking.
  • Music recorded for loudspeakers is judged
    identical in timbre in both the headphones and
    the loudspeaker.
  • The equalization is also identical in timbre to a
    large high-quality stereo sound system.
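
The final step of the procedure, taking the difference of the loudspeaker and headphone loudness matches, can be sketched as below. The sign convention is an assumption: each value is taken to be the level change in dB that the subject applied to the test band to match the 500 Hz reference, so a band needing more boost on headphones than on the loudspeaker receives a positive correction. The function name and all data are hypothetical.

```python
def headphone_correction_db(speaker_match_db, headphone_match_db):
    """Per-band headphone correction: headphone match minus loudspeaker match.

    Both arguments map a band centre frequency (Hz) to the level change (dB)
    the subject needed to match the 500 Hz reference band. Only bands that
    were measured in both conditions receive a correction.
    """
    shared = sorted(set(speaker_match_db) & set(headphone_match_db))
    return {f: headphone_match_db[f] - speaker_match_db[f] for f in shared}

# Hypothetical matches for one subject (not measured data).
speaker = {500: 0.0, 1000: 1.0, 3150: -6.0, 8000: 4.0}
phones = {63: 10.0, 500: 0.0, 1000: 2.5, 3150: -2.0, 8000: 9.0}

correction = headphone_correction_db(speaker, phones)
```

Because the loudspeaker curves on the slide start at 500Hz while the headphone curves extend down to 30Hz, this sketch leaves bands below 500Hz uncorrected. How the author handled that range is not stated.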

29
Results for 10 individuals
About 10 students from Helsinki University
participated in the test. The top left graph
shows the equal loudness contours from the
loudspeaker for each subject. The other curves
show the difference between this curve and the
equal loudness curves for four different
headphones. It was hoped that the Stax 303 phones
would show less individual variation. This was
not the case. (Blue: left ear; red: right ear; cyan:
author's left ear.)
The Philips phones were an insert type. These
also showed large variation among individuals.
30
The dip at 3kHz for all subjects
  • All subjects show a dip in the loudspeaker equal
    loudness curve at 3kHz.
  • This corresponds to a universal peak in the
    response of the concha and ear canal at this
    frequency.
  • It is this ear sensitivity peak that causes the
    most trouble with our memory of timbre.
  • When we first play an accurately calibrated
    binaural recording, particularly of a speaking
    voice or a chorus, this peak in the loudness is
    highly noticeable and unpleasant.
  • Once we adapt, everything is OK again.

31
Comments on these results.
  • The experiment is equivalent to equalizing
    headphones for a frontal, free-field response.
  • This is at variance with the current standard for
    diffuse field equalization.
  • In the author's experience the free field
    equalization is far more useful than the diffuse
    field equalization, and gives better results on
    loudspeaker recorded music.
  • These recordings are intended to be heard in a
    room where the direct sound is frontal, and
    dominant.
  • After doing the experiment the subjects were
    given the opportunity to listen to music both
    with the frontal equalization and with their own
    equal loudness equalization. (The speaker curves
    were not subtracted.)
  • The author's binaural recordings were perceived
    with better localization with the free-field
    equalization. (These recordings were equalized
    for free-field reproduction.)
  • Many subjects preferred their own equal loudness
    equalization for other material.
  • This equalization requires no adaptation to a
    recording that has an accurately flat frequency
    response.
  • The sound can be quite seductive.

32
Some Speculation
  • Equal loudness curves have two prominent
    features: the increase in sensitivity around 3kHz,
    and the decrease in sensitivity at low
    frequencies.
  • Music that has been recorded with frequency
    linear microphones and not post-processed often
    seems lacking in bass and harsh in the midrange
    both on loudspeakers and on eardrum-equalized
    headphones.
  • The author speculates that an unconscious
    collusion between loudspeaker designers and
    recording engineers routinely boosts the bass,
    and tweaks the 3kHz region on commonly available
    recordings.
  • It is common to boost the bass 10dB at 60Hz in
    automobiles.
  • Floyd Toole's findings that the loudspeakers that
    are closest to frequency linear are preferred in
    blind listening tests may be biased by the choice
    of recordings used in the tests.
  • The spectrum of choral music in the author's
    unprocessed recordings shows a 3dB peak around
    3kHz.
  • This peak is generally absent in vocalists on pop
    music. Perhaps they use a different singing
    technique, and perhaps the equalization has been
    adjusted closer to an equal-loudness curve.

33
Conclusions
  • Experiments and observation suggest that human
    hearing uses a combination of fixed spectral maps
    to perceive the localization of a sound, and then
    corrects the HRTF timbre with a similar map.
  • These fixed maps are combined with a relatively
    rapid AGC system that tends to equalize loudness
    across frequency bands.
  • The existence of equal loudness curves shows that
    for narrow band signals adaptation does not take
    place. When a new, unknown broadband signal is
    first heard, the ear hears the timbre that
    reflects the equal loudness calibration. But
    this timbre is replaced in a short time with a
    more balanced timbre, and this balanced timbre is
    remembered.
  • It is likely that given the opportunity to
    equalize a recording to their own taste using
    loudspeakers with a flat frequency response,
    recording engineers will be sorely tempted to
    move toward their own equal loudness curve.
  • The temptation is dangerous but probably
    harmless. We can see that individual loudness
    curves can be rather different, particularly at
    low frequencies.
  • But adaptation will continue to work when the
    recording is played back, and if the response
    does not match that of the listener, they will
    soon not notice the difference.