Overview of Real-Time Pitch Tracking Approaches

1
Overview of Real-Time Pitch Tracking Approaches
  • Music information retrieval seminar
  • McGill University
  • Francois Thibault

2
Presentation Goals
  • Describe the requirements of real-time (RT) pitch
    tracking algorithms for musical applications
  • Briefly introduce key developments in RT pitch
    tracking algorithms
  • Provide insight into which techniques might be
    more suitable for a given application

3
Pitch tracking requirements in a musical context
  • Must often function in real-time
  • Minimal output latency
  • Accuracy in the presence of noise
  • Frequency resolution
  • Flexibility and adaptability to various musical
    requirements
  • Pitch range
  • Dynamic range

4
Overview of techniques
  • Time-domain methods
  • Autocorrelation Function (Rabiner 77)
  • Average Magnitude Difference Function (AMDF)
  • Fundamental Period Measurement (Kuhn 90)
  • Frequency-domain methods
  • Cepstrum (Noll 66)
  • Harmonic Product Spectrum (Schroeder 68)
  • Constant-Q transform (Brown 92)
  • Least-Squares fitting (Choi 97)
  • Maximum Likelihood (McAulay 86, Puckette 98)
  • Other approaches

5
Autocorrelation method
  • Based on the fact that a periodic signal will
    correlate strongly with itself when offset by the
    fundamental period
  • Measures the extent to which a signal correlates
    with a time-shifted version of itself
  • The time shifts at which the ACF displays peaks
    correspond to likely period estimates
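
The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the formulation from (Rabiner 77): the function name `acf_pitch`, the search range and the 0.9 peak threshold are assumptions, and no preprocessing (such as spectral flattening) is applied.

```python
import math

def acf_pitch(x, fs, fmin=50.0, fmax=1000.0, rel_threshold=0.9):
    """Return the pitch (Hz) at the first strong ACF peak, or None."""
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    window = len(x) - lag_max  # same number of products for every lag
    if window <= 0:
        raise ValueError("frame too short for the requested fmin")
    acf = {
        lag: sum(x[i] * x[i + lag] for i in range(window))
        for lag in range(lag_min, lag_max + 1)
    }
    strongest = max(acf.values())
    # Take the shortest lag that is a local peak and nearly as high as the
    # global maximum (assumed heuristic): this favours the period itself
    # over its integer multiples.
    for lag in range(lag_min + 1, lag_max):
        is_peak = acf[lag] >= acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] >= rel_threshold * strongest:
            return fs / lag
    return None

# Synthetic test frame: 200 Hz fundamental plus its second harmonic.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 400 * n / fs) for n in range(560)]
f0 = acf_pitch(frame, fs)
print(f0)
```

Because the estimate is `fs / lag` with an integer lag, neighbouring candidates are far apart when the period is only a few samples long, which is one way to see the poor resolution at high frequencies.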

6
Autocorrelation Pros/Cons
  • Simple implementation (good for hardware)
  • Can handle poor quality signals (phase
    insensitive)
  • Often requires preprocessing (spectral
    flattening)
  • Poor resolution for high frequencies
  • Analysis parameters hard to tune
  • Uncertainty between peaks generated by formants
    and periodicity of sound can lead to wrong
    estimation

7
AMDF
  • Again based on the idea that a periodic signal
    will be similar to itself when shifted by the
    fundamental period
  • Similar in concept to ACF, but measures the
    difference between the signal and a time-shifted
    version of itself
  • The time shifts which display valleys correspond
    to likely period estimates
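
A minimal AMDF sketch, under the same caveats as any toy example: `amdf_pitch`, the lag range and the valley-depth threshold are illustrative choices, not taken from the presentation.

```python
import math

def amdf_pitch(x, fs, fmin=50.0, fmax=1000.0, rel_threshold=0.1):
    """Return the pitch (Hz) at the first deep AMDF valley, or None."""
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    window = len(x) - lag_max  # same number of differences for every lag
    if window <= 0:
        raise ValueError("frame too short for the requested fmin")
    amdf = {
        lag: sum(abs(x[i] - x[i + lag]) for i in range(window)) / window
        for lag in range(lag_min, lag_max + 1)
    }
    lowest, highest = min(amdf.values()), max(amdf.values())
    limit = lowest + rel_threshold * (highest - lowest)
    # The shortest lag that is a local valley and close to the global
    # minimum is taken as the period (assumed heuristic).
    for lag in range(lag_min + 1, lag_max):
        is_valley = amdf[lag] <= amdf[lag - 1] and amdf[lag] <= amdf[lag + 1]
        if is_valley and amdf[lag] <= limit:
            return fs / lag
    return None

# Synthetic test frame: 250 Hz fundamental plus its second harmonic.
fs = 8000
frame = [math.sin(2 * math.pi * 250 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 500 * n / fs) for n in range(512)]
f0 = amdf_pitch(frame, fs)
print(f0)
```

Only subtractions and absolute values are needed per lag, with no multiplications, which is the property that makes the function attractive for hardware.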

8
AMDF Pros/Cons
  • Poor frequency resolution
  • Even simpler implementation than ACF (good for
    hardware)
  • Less computationally expensive than ACF
  • Combination of AMDF and ACF yields results more
    robust to noise (Kobayashi 95)

9
Fundamental Period Measurement approach
  • Signal is first run through a bank of half-octave
    bandpass filters
  • If filters are sharp enough, the output of one
    filter should display the input waveform freed
    of its upper partials (nearly sinusoidal)
  • It is up to a decision algorithm to decide which
    filter output corresponds to fundamental
    frequency
  • Time between zero crossings of that filter output
    determines period
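
The last step above (period from zero crossings) can be sketched as follows. The filter bank and the decision algorithm are deliberately left out: the input is assumed to already be the nearly sinusoidal output of the selected filter, and `freq_from_zero_crossings` is a hypothetical helper name.

```python
import math

def freq_from_zero_crossings(y, fs):
    """Mean frequency (Hz) from the positive-going zero crossings of y."""
    crossings = []
    for i in range(len(y) - 1):
        if y[i] < 0.0 <= y[i + 1]:
            # Linear interpolation gives a sub-sample crossing time.
            crossings.append(i + (-y[i]) / (y[i + 1] - y[i]))
    if len(crossings) < 2:
        return None
    mean_period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return fs / mean_period

# Stand-in for the chosen filter output: a 220 Hz sinusoid.
fs = 44100
filtered = [math.sin(2 * math.pi * 220 * n / fs + 0.3) for n in range(2048)]
f0 = freq_from_zero_crossings(filtered, fs)
print(round(f0, 2))
```

Averaging over several crossings trades latency for stability; using a single pair of crossings would give the fastest response.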

10
FPM Pros/Cons
  • Easy implementation (hardware and software)
  • Efficiency of computation
  • Decision algorithm highly dependent on thresholds
  • But, automatic threshold setting provided for
    most situations

11
Cepstrum approach
  • Tool often used in speech processing
  • Cepstrum is defined as the power spectrum of the
    logarithm of the power spectrum
  • Clearly separates the contributions of the vocal
    tract and the excitation
  • A strong peak is displayed in the excitation part
    (high-quefrency region) at the quefrency equal to
    the fundamental period
  • Use a peak picker on the cepstrum and translate
    the peak quefrency into fundamental frequency
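
A small sketch of the cepstrum definition given above, using a naive DFT so that it runs with the standard library only (a real implementation would use an FFT). The names `dft` and `cepstrum_pitch`, the search range and the `eps` guard against log(0) are assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; adequate for a short demonstration frame."""
    n = len(x)
    tw = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [sum(x[t] * tw[(k * t) % n] for t in range(n)) for k in range(n)]

def cepstrum_pitch(x, fs, fmin=150.0, fmax=500.0, eps=1e-12):
    """Pick the strongest cepstral peak inside the expected pitch range."""
    log_power = [math.log(abs(c) ** 2 + eps) for c in dft(x)]
    cepstrum = [abs(c) ** 2 for c in dft(log_power)]
    q_min = int(fs / fmax)  # quefrency bounds, in samples
    q_max = int(fs / fmin)
    best_q = max(range(q_min, q_max + 1), key=lambda q: cepstrum[q])
    return fs / best_q      # quefrency (a period) back to frequency

# Synthetic test frame: ten harmonics of 250 Hz with 1/k amplitudes.
fs = 8000
frame = [sum(math.sin(2 * math.pi * 250 * k * n / fs) / k
             for k in range(1, 11)) for n in range(256)]
f0 = cepstrum_pitch(frame, fs)
print(f0)
```

Restricting the search to a plausible quefrency range keeps the peak picker away from the low-quefrency (vocal tract) region and from the repeated peaks at multiples of the period.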

12
Cepstrum Pros/Cons
  • Less confusion between candidates than in ACF
  • Proven method, especially suitable for signal
    easily characterized by source-filter models
    (e.g. voice)
  • Relatively computationally intensive (2 FFTs)

13
Harmonic Product Spectrum approach
  • Measures the maximum coincidence of harmonics for
    each spectral frame
  • Resulting periodic correlation array is searched
    for maximum which should correspond to
    fundamental frequency
  • A further algorithm is run for octave correction
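
One common way to compute a harmonic product spectrum is to multiply the magnitude spectrum by copies of itself compressed by factors 2, 3, and so on. The sketch below follows that recipe with a naive DFT; the helper names and the choice of three harmonics are assumptions, and the octave-correction step is not included.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of the first N/2 bins of a naive O(N^2) DFT."""
    n = len(x)
    tw = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [abs(sum(x[t] * tw[(k * t) % n] for t in range(n)))
            for k in range(n // 2)]

def hps_pitch(x, fs, n_harmonics=3):
    """Return the frequency of the bin that maximises the product."""
    mag = magnitude_spectrum(x)
    n_bins = len(mag) // n_harmonics  # keeps h * k inside the spectrum
    best_bin, best_val = 1, -1.0
    for k in range(1, n_bins):
        product = 1.0
        for h in range(1, n_harmonics + 1):
            product *= mag[h * k]     # harmonics of bin k coincide here
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * fs / len(x)

# Synthetic test frame: six harmonics of 250 Hz with 1/k amplitudes.
fs = 8000
frame = [sum(math.sin(2 * math.pi * 250 * k * n / fs) / k
             for k in range(1, 7)) for n in range(256)]
f0 = hps_pitch(frame, fs)
print(f0)
```

The estimate is quantised to the bin spacing `fs / N` (31.25 Hz here), which is coarse relative to low pitches and is why zero padding is used to interpolate the spectrum.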

14
HPS Pros/Cons
  • Simple to implement
  • Does well under wide variety of conditions
  • Poor low frequency resolution
  • Computational complexity increased by the zero
    padding required to interpolate low frequencies
  • Requires post-processing for error correction

15
Constant-Q transform approach
  • First computes the constant-Q transform, in which
    harmonics form a constant pattern in the log
    frequency domain (Q = fc/bw)
  • Compute the cross-correlation with a fixed comb
    pattern (ideal partial positions for given
    fundamental frequency)
  • Peak-pick the result to obtain fundamental
    frequency
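
The three steps can be sketched as below. This is a direct, unoptimised constant-Q evaluation with twelve bins per octave and a five-tooth comb; the function names, the Hamming window, the bin layout and the comb length are illustrative assumptions rather than the settings of (Brown 92).

```python
import cmath
import math

def constant_q_magnitudes(x, fs, fmin, bins_per_octave, n_bins):
    """Direct constant-Q magnitudes; every bin spans about Q cycles."""
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)  # Q = fc / bw
    mags = []
    for k in range(n_bins):
        fc = fmin * 2.0 ** (k / bins_per_octave)
        length = int(math.ceil(q * fs / fc))
        if length > len(x):
            raise ValueError("frame too short for the lowest bin")
        acc = 0j
        for n in range(length):
            w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            acc += w * x[n] * cmath.exp(-2j * math.pi * fc * n / fs)
        mags.append(abs(acc) / length)
    return mags

def comb_pitch(mags, fmin, bins_per_octave, n_harmonics=5):
    """Cross-correlate with a fixed harmonic comb and pick the peak."""
    # On a log-frequency axis harmonic h sits a fixed number of bins
    # above the fundamental, whatever the fundamental is.
    offsets = [round(bins_per_octave * math.log2(h))
               for h in range(1, n_harmonics + 1)]
    n_candidates = len(mags) - offsets[-1]
    scores = [sum(mags[k + off] for off in offsets)
              for k in range(n_candidates)]
    best = max(range(n_candidates), key=lambda k: scores[k])
    return fmin * 2.0 ** (best / bins_per_octave)

# Synthetic test frame: five harmonics of 220 Hz with 1/h amplitudes.
fs = 8000
frame = [sum(math.sin(2 * math.pi * 220 * h * n / fs) / h
             for h in range(1, 6)) for n in range(2500)]
mags = constant_q_magnitudes(frame, fs, fmin=55.0, bins_per_octave=12,
                             n_bins=72)
f0 = comb_pitch(mags, fmin=55.0, bins_per_octave=12)
print(round(f0, 3))
```

The result is quantised to the bin centres (semitones here), so a finer grid or interpolation would be needed for intermediate pitches.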

16
Constant-Q Pros/Cons
  • Complexity of the constant-Q transform reduced
    (Brown and Puckette 91) but still high
  • Sensitive to octave errors
  • Other peaks could be candidates

17
Least-Squares fitting approach
  • Perform least-squares spectral analysis: minimize
    the error by fitting sinusoids to the signal
    segment
  • Strong sinusoidal components are identified as
    sharp valleys in the least-squares error signal
  • Relatively few evaluations of the error signal
    are required to identify a valley
  • Fundamental frequency is obtained as the average
    of the partial frequencies, each divided by its
    partial number
  • Uses rectangular windowing to provide faster
    response
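
A rough sketch of the idea, not the algorithm of (Choi 97): a single sinusoid is fitted by least squares at each frequency of a dense grid, valleys of the residual error are taken as partials, and the fundamental is the average of partial frequency over partial number. The efficient evaluation scheme mentioned above is not reproduced, and the function names, the grid, the depth threshold and the assumption that the lowest valley is partial 1 are all illustrative.

```python
import math

def ls_captured_energy(x, f, fs):
    """Energy explained by the least-squares fit a*cos + b*sin at f."""
    w = 2.0 * math.pi * f / fs
    cc = cs = ss = xc = xs = 0.0
    for n, v in enumerate(x):
        c, s = math.cos(w * n), math.sin(w * n)
        cc += c * c
        cs += c * s
        ss += s * s
        xc += v * c
        xs += v * s
    det = cc * ss - cs * cs  # 2x2 normal equations
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    return a * xc + b * xs

def ls_pitch(x, fs, f_lo=100.0, f_hi=700.0, step=5.0, min_fraction=0.1):
    total = sum(v * v for v in x)
    freqs = [f_lo + i * step for i in range(int((f_hi - f_lo) / step) + 1)]
    err = [total - ls_captured_energy(x, f, fs) for f in freqs]
    partials = []
    for i in range(1, len(err) - 1):
        is_valley = err[i] < err[i - 1] and err[i] <= err[i + 1]
        # Keep only valleys that explain a fair share of the energy.
        if is_valley and total - err[i] >= min_fraction * total:
            partials.append(freqs[i])
    if not partials:
        return None
    # Assumption: the lowest valley is partial number 1.
    numbers = [max(1, round(f / partials[0])) for f in partials]
    return sum(f / k for f, k in zip(partials, numbers)) / len(partials)

# 40 ms rectangular-windowed frame with partials at 200, 400 and 600 Hz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs)
         + 0.6 * math.sin(2 * math.pi * 400 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 600 * n / fs) for n in range(320)]
f0 = ls_pitch(frame, fs)
print(f0)
```

Because the fit is done directly on the samples, the frame does not have to hold a power-of-two number of points, and the frequency grid is independent of the frame length.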

18
LS fitting Pros/Cons
  • Operates on shorter frame segments
  • Best option for real-time applications with
    minimum latency requirements
  • Efficient evaluation scheme allows reasonable
    computation complexity

19
Maximum Likelihood
  • Maximum likelihood algorithm searches through a
    set of possible ideal spectra and chooses the
    closest match (Noll 69)
  • Was adapted to sinusoidal modeling theory by
    finding the best fit of sets of harmonic partials
    to the measured model (McAulay 86)
  • Discrimination is enhanced by suppressing partials
    of small amplitude
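
The search over ideal harmonic spectra can be illustrated with a simple matching score. The score below is a heuristic stand-in for a likelihood, not the estimator of (McAulay 86) or (Puckette 98); `harmonic_match_pitch`, the candidate grid, the tolerance and the amplitude floor are assumptions, and the list of measured partials is made up for the example.

```python
def harmonic_match_pitch(partials, f_lo=50.0, f_hi=500.0, step=1.0,
                         n_harmonics=5, rel_tol=0.03, amp_floor=0.1):
    """Return the candidate f0 whose harmonic comb best matches partials.

    partials: list of (frequency_hz, amplitude) pairs, e.g. the peaks of
    a sinusoidal model.
    """
    strongest = max(a for _, a in partials)
    # Suppress partials of small amplitude before matching.
    kept = [(f, a) for f, a in partials if a >= amp_floor * strongest]
    best_f0, best_score = None, float("-inf")
    for i in range(int((f_hi - f_lo) / step) + 1):
        f0 = f_lo + i * step
        score = 0.0
        for h in range(1, n_harmonics + 1):
            target = h * f0
            tol = rel_tol * target
            # Closest measured partial to this ideal harmonic.
            f, a = min(kept, key=lambda p: abs(p[0] - target))
            err = abs(f - target)
            if err < tol:
                score += a * (1.0 - err / tol)
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0

# Made-up measurement: harmonics 2, 3 and 5 of roughly 200 Hz, the
# fundamental and the 4th harmonic missing, plus one weak spurious peak.
measured = [(401.0, 0.8), (599.0, 0.6), (1002.0, 0.5), (1450.0, 0.02)]
f0 = harmonic_match_pitch(measured)
print(f0)
```

Limiting each comb to a fixed number of harmonics is what keeps sub-multiples of the true fundamental from matching as many partials as the fundamental itself.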

20
ML Pros/Cons
  • Inherits high computational requirement from
    sinusoidal modeling
  • Very robust estimation
  • Allows guess of fundamental frequency even with
    several partials missing.

21
Other approaches
  • Neural Nets (Barnard 91)
  • Hidden Markov Models (Doval 91)
  • Parallel processing approaches (Rabiner 69)
  • Fourier of Fourier transforms (Marchand 2001)
  • Two-way mismatch model (Cano 98)
  • Subharmonic to harmonic ratio (Sun 2000)

22
Conclusions
  • Much research is still motivated by speech
    telecommunication
  • Abundant literature since 1950
  • Complete and objective performance overviews
    seem to be missing
  • Combination of techniques in parallel processing
    seems foreseeable with today's fast computers