A1261526405xlEHr - PowerPoint PPT Presentation

Provided by: hos1
1
CS 551/651 Structure of Spoken Language
Lecture 11: Overview of Sound Perception, Part II
John-Paul Hosom, Fall 2008
2
Equal-Loudness Contours
  • The subjective loudness of a sine wave may be
    determined as a function of frequency, creating
    an equal-loudness contour
  • Instead of measuring loudness directly, what is
    measured is how intense a 1000-Hz tone must be to
    sound equally loud as the test frequency.
  • Or, how intense a test frequency must be in order
    to sound as loud as a 1000-Hz tone
  • The perceived equal loudness at each frequency
    can be plotted
  • Results are easily influenced by test conditions;
    exact shapes should not be taken too
    seriously (Moore, p. 54).
  • These contours are only for steady sounds at a
    single frequency; such sounds are almost never of
    practical interest.

3
Equal-Loudness Contours
[Figure: equal-loudness contours, including the minimum audible field; from Moore, p. 55]
4
Critical Bands
  • Fletcher (1940) conducted an experiment in which
    there was bandpass noise and a single sine wave.
    The frequency of the sine wave was always at the
    center frequency of the noise, and the power
    density of the noise was fixed. The bandwidth of
    the noise was varied, and for each bandwidth the
    minimum intensity at which the sine wave could be
    perceived was determined. (With increasing
    bandwidth, the total energy of the noise
    increased).
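
The parenthetical remark above can be checked numerically. The following is a minimal sketch, assuming flat (white) noise inside the band, so that the total level is the spectral density level plus 10·log10 of the bandwidth; the function name and the 30 dB density value are illustrative, not from the lecture.

```python
import math

def total_noise_level_db(density_db_per_hz: float, bandwidth_hz: float) -> float:
    """Total level (dB) of flat noise with a fixed spectral density.

    Assumes the noise power is spread evenly over the band, so the total
    power is density * bandwidth (added in dB as 10*log10(bandwidth)).
    """
    return density_db_per_hz + 10.0 * math.log10(bandwidth_hz)

if __name__ == "__main__":
    # Illustrative density of 30 dB per Hz; each doubling of the
    # bandwidth adds about 3 dB of total noise power.
    for bandwidth in (50, 100, 200, 400):
        print(bandwidth, "Hz ->", round(total_noise_level_db(30.0, bandwidth), 2), "dB")
```

With the power density held fixed, the only thing that changes from one condition to the next is how much of the noise falls inside the listener's auditory filter.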

[Figure: two power (dB) spectra from 0 Hz to Nyquist, each showing a noise band and a sine wave]
5
Critical Bands
6
Critical Bands
  • The result was that the threshold for minimum
    detectability increased with bandwidth up to a
    limit; after that limit, the noise could increase
    in bandwidth and power without affecting the
    perception of the sine wave.
  • This implies a band-pass filter structure to
    sound perception: sound is filtered by a bank of
    overlapping band-pass filters called auditory
    filters. This filter structure is based on the
    structure of the basilar membrane (BM).
  • The bandwidth of the filter is related to the
    bandwidth of the noise at which the sine-wave
    threshold no longer increases. This bandwidth is
    called the critical bandwidth (CB).
  • Within one band, if the signal-to-noise ratio
    exceeds some fixed threshold, the signal (sine
    wave) will be heard, independent of what's
    happening in other bands. This threshold may
    vary from person to person; it is typically 0.4.
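
The plateau described above can be sketched with an idealized rectangular auditory filter. This is an illustration under stated assumptions, not the lecture's own model: only noise inside the filter counts, the tone is heard once its intensity reaches K times that noise power, and the function name, the 160 Hz critical bandwidth, and the unit noise density are chosen for the example.

```python
def masked_threshold(noise_bandwidth_hz: float,
                     critical_bandwidth_hz: float,
                     noise_density: float,
                     k: float = 0.4) -> float:
    """Tone intensity at threshold for a rectangular auditory filter.

    Noise outside the critical band is ignored, so widening the noise
    beyond the critical bandwidth no longer raises the threshold.
    """
    effective_bandwidth = min(noise_bandwidth_hz, critical_bandwidth_hz)
    return k * noise_density * effective_bandwidth

if __name__ == "__main__":
    # The threshold grows with bandwidth up to 160 Hz, then stays flat.
    for bandwidth in (40, 80, 160, 320, 640):
        print(bandwidth, "Hz ->", masked_threshold(bandwidth, 160.0, 1.0))
```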

7
Critical Bands
  • The auditory filters can be approximated by
    rectangular filters, but better determination of
    the filter shape is possible.
  • The critical bandwidth at a particular frequency
    can be estimated using the formula
    $P / (W \cdot N_0) = K$, where $P$ is
    the intensity of the signal, $N_0$ is the intensity
    of the noise over a 1-Hz range, $K$ is the
    threshold of detectability (usually 0.4), and $W$
    is the CB.
  • For example, the CB at 1000 Hz is 160 Hz;
    however, in reality rectangular filters are not
    accurate, because the shape changes with frequency
    and amplitude.
  • Better approximations of the auditory filters
    look like this:
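
As a rough worked example of the rectangular-filter estimate, the sketch below assumes the signal-to-noise criterion P / (W · N0) = K, solved for W, with the tone threshold and the noise density both given in dB. The function name and the input levels are hypothetical; they were picked so that the result lands near the 160 Hz figure quoted for 1000 Hz.

```python
import math

def critical_bandwidth_hz(tone_threshold_db: float,
                          noise_density_db: float,
                          k: float = 0.4) -> float:
    """Estimate the critical bandwidth W from W = P / (K * N0).

    Both levels are in dB re the same reference; their difference gives
    the intensity ratio P / N0 (which has units of Hz).
    """
    p_over_n0 = 10.0 ** ((tone_threshold_db - noise_density_db) / 10.0)
    return p_over_n0 / k

if __name__ == "__main__":
    # Hypothetical measurement: tone detected about 18 dB above the
    # per-Hz noise level, which corresponds to P / N0 = 64.
    level_difference_db = 10.0 * math.log10(64.0)
    print(round(critical_bandwidth_hz(level_difference_db, 0.0), 1), "Hz")
```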

8
Critical Bands
[Figure: the shape of auditory filters as a function of
frequency (as determined from masking experiments) and
amplitude (as determined from notched-noise experiments);
from Moore, pp. 105, 110]
9
Critical Bands
  • In general, CBs are nearly symmetric below levels
    of 45 dB; at higher energies, bandwidth increases
    and the cutoff skirt is sharper above the center
    frequency.
  • The cutoff skirt ranges from 65 dB/oct. at 500 Hz
    to 100 dB/oct. at 8 kHz.
  • For frequencies < 500 Hz, the CB is about 100 Hz; for
    frequencies > 500 Hz, the CB increases with
    frequency, with a bandwidth of 700 Hz near 4 kHz.
  • For frequencies above 1000 Hz, the ratio of frequency
    to bandwidth is roughly constant, Q = 5 or 6.
  • If two sounds occur within one critical band, the
    sound with much higher energy dominates
    perception and masks the other sound.
  • From critical bands, models of perceptual
    non-linear warpings of the frequency scale have
    been developed, namely the Bark and Mel scales
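
The lecture does not give formulas for these scales. The sketch below uses two widely quoted analytic approximations (a logarithmic Mel formula and an arctangent-based Bark formula); they are assumptions brought in for illustration, and other published variants exist.

```python
import math

def hz_to_mel(frequency_hz: float) -> float:
    """Common Mel approximation: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)

def hz_to_bark(frequency_hz: float) -> float:
    """Common Bark approximation: 13*atan(0.00076 f) + 3.5*atan((f/7500)^2)."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))

if __name__ == "__main__":
    # Both scales are close to linear at low frequencies and compress
    # the high-frequency range, mirroring the widening critical bands.
    for f in (100, 500, 1000, 4000, 8000):
        print(f, "Hz ->", round(hz_to_mel(f), 1), "mel,", round(hz_to_bark(f), 2), "Bark")
```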

10
Mel Scale and Bark Scale
[Figure: plots of the Mel scale and the Bark scale]
11
Masking
  • Masking is a phenomenon in which perception of
    one sound is obscured by the presence of another
    sound
  • Masking can occur in both the time and frequency
    domains, and prevents us from computing perceived
    loudness of a complex tone as the sum of its
    components.
  • Frequency masking can be shown by fixing one
    sound (sine wave at a given frequency and
    intensity) and varying a second sine wave's
    intensity to determine at what intensity it can
    be perceived.
  • For example, given one tone at 1200 Hz and 80 dB,
    how loud does a second tone at X Hz have to be in
    order to be heard? This threshold can be
    plotted, as well as what the perceived sound is
    like when above the threshold

12
Masking
[Figure: masking results (1929); from O'Shaughnessy, p. 126]
13
Masking
  • Masking occurs in both time and frequency
    domains; two successive signals within the same
    CB may show masking effects.
  • For example, noise will mask a following tone if
    the noise is sufficiently loud and the delay
    between noise and tone is short (forward masking).
  • The energy needed to mask the tone increases with
    the delay and with the duration of the tone; beyond
    100-200 msec, no masking occurs.
  • Backward masking occurs (a tone may be masked by
    a subsequent noise), but only within a 20 msec
    window.
  • Backward masking may not be purely involuntary;
    trained subjects may show no backward masking
    (Moore, p. 129).

14
Masking Complex Stimuli
  • These CB and masking experiments typically use
    simple stimuli: sine waves, clicks, and/or
    band-pass white noise.
  • Processing of complex sounds is not as simple as
    extrapolating the effects of simple sounds.
  • For example, phase-locking is suppressed in
    regions between speech formants, enhancing the
    effect of formants.
  • Complex sounds are louder if they occur over more
    than one critical band
  • Two tones, both individually below the threshold
    of hearing, may be perceived when played
    simultaneously; roughly speaking, the total
    energy of all tones within a CB determines the
    threshold.
  • In general, most phenomena can be understood in
    terms of bands of overlapping band-pass filters
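
The two-tone observation above can be illustrated by adding intensities inside one critical band. This is a simplified sketch that assumes the tones add as independent powers (no phase effects) and that levels are expressed in dB relative to the detection threshold; the function name and the example levels are illustrative.

```python
import math

def combined_level_db(levels_db):
    """Level (dB) of several components whose intensities add within one CB."""
    total_intensity = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_intensity)

if __name__ == "__main__":
    # Two tones, each 2 dB below threshold (0 dB = threshold here).
    # Together they sit about 1 dB above threshold.
    print(round(combined_level_db([-2.0, -2.0]), 2), "dB re threshold")
```

Two equal components raise the total by about 3 dB, so tones that are each slightly below threshold can be detectable together.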

15
Temporal Integration
  • Absolute thresholds and loudness depend on
    duration of the stimulus
  • For durations < 200 msec, the intensity necessary for
    detection increases with shorter duration.
  • For detecting short-duration tones, the threshold
    for detection is the product of the tone
    intensity relative to an intensity-detection
    threshold for longer sounds, and a
    time-integration constant:
    $(I - I_L) \times \tau = \text{detection threshold (energy)}$,
    where $I$ is the intensity of the short stimulus, $I_L$ is the
    intensity at which the stimulus is detected with
    duration > 200 msec, and $\tau$ is the integration
    time of the auditory system.
  • Some experiments show that $\tau$ varies with
    frequency; other experiments show that it
    doesn't. At any rate, $\tau$ has a value between 150
    and 375 msec.
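
To see how the threshold might rise for short tones, the sketch below assumes one classical way of writing the duration dependence, $(I - I_L) \times t = I_L \times \tau$, where $t$ is the stimulus duration. This explicit form is an assumption for illustration and is not spelled out in the slide; the 200 msec value for the integration time is likewise just a value inside the quoted 150-375 msec range.

```python
import math

def threshold_elevation_db(duration_ms: float, tau_ms: float = 200.0) -> float:
    """Threshold of a short tone relative to the long-duration threshold I_L.

    Assumes (I - I_L) * t = I_L * tau, i.e. I / I_L = 1 + tau / t.
    """
    return 10.0 * math.log10(1.0 + tau_ms / duration_ms)

if __name__ == "__main__":
    # Shorter tones need more intensity to be detected.
    for duration in (10, 20, 50, 100, 200):
        print(duration, "msec ->", round(threshold_elevation_db(duration), 2), "dB above I_L")
```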

16
Temporal Integration
  • Detecting change in intensity is slightly
    different
  • Usually a two-alternative forced-choice (2AFC)
    test is used: (a) two successive stimuli are
    presented, which differ in one target
    aspect; (b) the subject indicates which of the
    two stimuli has more of the target aspect
    (e.g. loudness, amplitude modulation). The point
    at which 75% of responses are correct is the
    threshold for detecting the change.
  • The threshold for detecting a change in intensity
    has the following model:
  • For band-pass filtered noise, the ratio
    $\Delta I / I$ is constant; this is an example of Weber's law,
    which states that the smallest detectable change
    is proportional to the magnitude of the stimulus.
    Usually, $\Delta L$ is 0.5 to 1 dB.
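
The 75%-correct criterion can be read off measured 2AFC data by interpolation. The sketch below is one simple way to do that, assuming the proportion correct rises with the size of the change; the function name and the data points are made up for illustration and are not results from the lecture.

```python
def threshold_at(levels, proportion_correct, target=0.75):
    """Linearly interpolate the stimulus level giving `target` proportion correct.

    `levels` must be sorted in increasing order. Returns None when the
    target is not bracketed by the data.
    """
    points = list(zip(levels, proportion_correct))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1 and p1 > p0:
            return x0 + (target - p0) * (x1 - x0) / (p1 - p0)
    return None

if __name__ == "__main__":
    # Hypothetical data: intensity increment (dB) vs. proportion correct.
    increments_db = [0.25, 0.5, 1.0, 2.0]
    correct = [0.55, 0.65, 0.85, 0.98]
    print(threshold_at(increments_db, correct))
```

Chance performance in a 2AFC task is 50%, so 75% sits halfway between guessing and perfect performance.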

17
Temporal Integration
  • For pure tones, Weber's law doesn't quite hold: a
    plot of $\Delta I$ against $I$ (on log-log axes) yields
    a line with slope 0.9 instead of 1.0 (discrimination
    improves slightly at higher intensity levels).
  • The time-domain window sizes for computing $I$ and
    $\Delta I$ are not specified by the model; within one
    frequency band, a large window for computing $I$
    and a smaller window for computing $\Delta I$ can show
    relative energy changes within the band.
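
The relation between a just-detectable level change in dB and the Weber fraction, and the effect of a slope of 0.9, can be sketched as follows. The conversion $\Delta L = 10 \log_{10}(1 + \Delta I / I)$ is the standard dB definition; the power-law form for pure tones and the constant `c` are assumptions made for illustration.

```python
import math

def weber_fraction_from_delta_l(delta_l_db: float) -> float:
    """Convert a just-detectable level change (dB) to the ratio delta_I / I."""
    return 10.0 ** (delta_l_db / 10.0) - 1.0

def near_miss_weber_fraction(intensity: float, c: float = 0.2, slope: float = 0.9) -> float:
    """delta_I / I when delta_I = c * I**slope (a straight line on log-log axes).

    `c` is a hypothetical constant. With slope 1.0 the fraction is
    constant (Weber's law); with slope 0.9 it shrinks as I grows.
    """
    return c * intensity ** (slope - 1.0)

if __name__ == "__main__":
    for delta_l in (0.5, 1.0):
        print(delta_l, "dB ->", round(weber_fraction_from_delta_l(delta_l), 3))
    for intensity in (1e2, 1e4, 1e6):
        print(intensity, "->", round(near_miss_weber_fraction(intensity), 4))
```

A level change of 0.5 to 1 dB corresponds to an intensity change of roughly 12% to 26%.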

18
Pitch Perception
  • Pitch is the perceived main frequency of a sound;
    it is closely related to the objective measure of F0.
  • Timing Theory of Pitch: lower frequencies have
    pitch estimated based on phase-locking
    (time-synchronous firing of neurons).
  • Place Theory of Pitch: all frequencies have pitch
    estimated based on the location on the BM at which
    neurons fire most frequently.
  • Pitch can be perceived even when the fundamental
    frequency is not present; pitch determination is
    based on higher harmonics (e.g. telephone-band
    speech, which has energy only > 300 Hz).
  • Harmonics in the F1 region are especially important
    for pitch.
  • Both timing and place are important: timing
    allows fine resolution, place allows pitch
    perception in higher frequencies.

19
Pitch Perception
  • Frequency discrimination of pure tones: the
    smallest detectable change in frequency increases
    with frequency.
  • Tests are done using two-alternative forced choice
    (2AFC): which tone has the higher pitch?

20
Pitch Perception
  • Zwicker's place model: detection of a change in the
    pitch of pure tones is equivalent to detecting a
    change in excitation level on the low-frequency side
    of the excitation pattern.
  • However, this model doesn't account for all
    observed phenomena, lending support to the
    phase-lock time theory in which pitch is measured
    directly by the inverse of the time between
    neural firings (at least for frequencies < 4 kHz).

21
Pitch Perception
  • Pitch perception of complex tones depends on
    harmonics of the fundamental: if clicks occur
    with harmonics every 200 Hz, and if all harmonics
    other than those at 1800, 2000, and 2200 Hz are
    filtered out (removed), the perceived pitch is
    still 200 Hz.
  • Pattern recognition models. Basic idea: find a
    fundamental frequency which has harmonics that
    match the existing harmonics. Details can become
    very complex (Moore, pp. 189-192).
  • Temporal models. Basic idea: the pitch value is
    determined by the periodicity of the total
    waveform containing harmonics; in other words,
    it is determined by constructive interference of
    several harmonics in the time domain.
  • No model alone accounts for all phenomena of
    perceived pitch
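
Both basic ideas can be tried on the 1800/2000/2200 Hz example. The sketch below is a toy illustration, not one of the published models: the pattern-matching function simply searches for the largest integer F0 whose harmonics line up with every component, and the temporal function picks the lag that maximizes the autocorrelation of the summed waveform. The function names, search range (50-500 Hz), tolerance, sampling rate, and duration are all assumptions.

```python
import math

def pitch_by_harmonic_matching(components_hz, f0_min=50, f0_max=500, tol=0.02):
    """Largest integer F0 (Hz) for which every component is near a harmonic."""
    for f0 in range(f0_max, f0_min - 1, -1):
        ratios = [component / f0 for component in components_hz]
        if all(round(r) >= 1 and abs(r - round(r)) <= tol for r in ratios):
            return f0
    return None

def pitch_by_autocorrelation(components_hz, fs=16000, duration_s=0.05,
                             f0_min=50, f0_max=500):
    """Pitch (Hz) from the autocorrelation peak of the summed cosines."""
    n_samples = round(fs * duration_s)
    waveform = [sum(math.cos(2.0 * math.pi * f * n / fs) for f in components_hz)
                for n in range(n_samples)]
    best_lag, best_value = None, float("-inf")
    # Only lags corresponding to pitches between f0_max and f0_min are tried.
    for lag in range(fs // f0_max, fs // f0_min + 1):
        value = sum(waveform[n] * waveform[n + lag] for n in range(n_samples - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

if __name__ == "__main__":
    harmonics = [1800, 2000, 2200]
    print(pitch_by_harmonic_matching(harmonics))
    print(pitch_by_autocorrelation(harmonics))
```

For these three components both functions return 200 Hz, even though there is no energy at 200 Hz itself. Restricting the search range keeps the toy versions from picking a subharmonic such as 100 Hz, a choice that real models have to handle more carefully.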