Audio Fingerprinting Overview: RARE Algorithms, Resources - PowerPoint PPT Presentation

About This Presentation
Title:

Audio Fingerprinting Overview: RARE Algorithms, Resources

Description:

Title: Audio Fingerprinting Overview Author: Chris Burges Last modified by: Chris Burges Created Date: 4/9/2003 7:07:08 PM Document presentation format – PowerPoint PPT presentation

Slides: 16
Provided by: ChrisB138

Transcript and Presenter's Notes

1
Audio Fingerprinting Overview: RARE Algorithms, Resources
  • Chris Burges, John Platt,
  • Jon Goldstein, Erin Renshaw

http://msrweb/cburges/rare.htm
2
Let's agree on names
  • A fingerprint is a vector that represents a
    given audio clip. It lives in a database with a
    lot of other fingerprints.
  • A confirmation fingerprint is a second
    fingerprint used to confirm a match.
  • A trace is generated from audio every 186 ms.
    It's computed exactly the same way as a
    fingerprint.

3
Design of the Funnel
Analyze a Stream
Find 64 good projections of 6 seconds of audio
[Figure: funnel diagram. Labels: 6 sec of Song A; 6 sec of distorted Song A; 6 sec of Song B; good projection; 64 floats / frame; In Database?; Confirmed?; outputs 0, 0, 0, 1]
If 1, declare match
Good projections maximize d2 / d1
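The two checks in the funnel ("In Database?" then "Confirmed?") can be sketched as a two-stage lookup. This is a minimal illustration, not the RARE implementation: the Euclidean distance, the linear scan, the `database` layout and the `threshold` parameter are all assumptions made for the example.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_trace(trace, confirm_trace, database, threshold):
    """Return the id of the matched clip, or None if there is no match.

    database maps clip id -> (fingerprint, confirmation_fingerprint).
    threshold is a hypothetical acceptance distance, not a value from the slides.
    """
    # Stage 1 ("In Database?"): find the nearest stored fingerprint.
    best_id, best_dist = None, float("inf")
    for clip_id, (fingerprint, _) in database.items():
        d = distance(trace, fingerprint)
        if d < best_dist:
            best_id, best_dist = clip_id, d
    if best_id is None or best_dist > threshold:
        return None
    # Stage 2 ("Confirmed?"): check the second fingerprint of the same clip.
    _, confirmation = database[best_id]
    if distance(confirm_trace, confirmation) > threshold:
        return None
    return best_id
```

A match is declared only when both stages pass, which corresponds to the single 1 in the figure.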
4
Feature Extraction
(186 ms)
5
De-Equalization
De-equalize by flattening the log spectrum.
[Figure: log spectrum before and after de-equalization]
6
De-Equalization Details
  • Goal: Remove slow variation in frequency space
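One way to remove slow variation is to estimate a smooth envelope of the log spectrum from its lowest-order DCT coefficients and subtract it. The sketch below assumes that approach; the client-resources slide mentions modifying only the first 6 DCT coefficients, but tying that remark to this step is an assumption, and the function names are made up for the example.

```python
import math

def low_order_dct(values, n_coeffs):
    """First n_coeffs DCT-II coefficients, computed directly in O(n_coeffs * N)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]

def smooth_envelope(coeffs, n):
    """Inverse DCT-II using only the supplied low-order coefficients."""
    envelope = []
    for i in range(n):
        total = coeffs[0] / n
        for k in range(1, len(coeffs)):
            total += (2.0 / n) * coeffs[k] * math.cos(math.pi * k * (i + 0.5) / n)
        envelope.append(total)
    return envelope

def de_equalize(log_spectrum, n_coeffs=6):
    """Subtract the slowly varying envelope from a log spectrum."""
    n = len(log_spectrum)
    envelope = smooth_envelope(low_order_dct(log_spectrum, n_coeffs), n)
    return [x - e for x, e in zip(log_spectrum, envelope)]
```

A log spectrum made only of slow components (a constant plus a low-order cosine) comes out flat, while a narrow peak largely survives.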

7
Perceptual Thresholding
Remove coefficients that are below a perceptual
threshold to lower unwanted variance.
[Figure labels: inaudible to human; audible to human]
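A minimal sketch of the thresholding step, assuming a per-bin threshold curve is available. The curve itself is hypothetical here (the slides do not give one), and zeroing sub-threshold coefficients is just one reading of "remove".

```python
def perceptual_threshold(log_spectrum, thresholds):
    """Zero every coefficient that falls below its per-bin threshold.

    thresholds is a hypothetical audibility curve, one value per bin.
    """
    if len(log_spectrum) != len(thresholds):
        raise ValueError("log_spectrum and thresholds must have the same length")
    return [x if x >= t else 0.0 for x, t in zip(log_spectrum, thresholds)]
```

Coefficients the listener could not hear no longer contribute variance to the later projection.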
8
Project to 64 Floats
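The projection can be read as a matrix-vector product, with one matrix row per output float. The sketch below also computes the ratio d2 / d1 from the funnel slide, under the assumption that d1 is the distance between a clip and its distorted version and d2 is the distance between clips from different songs. That reading of the figure labels, and all names in the code, are assumptions.

```python
import math

def project(matrix, features):
    """Apply a projection matrix (one row per output float) to a feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def separation_ratio(matrix, clip, distorted_clip, other_clip):
    """d2 / d1 after projection: larger means the projection separates better."""
    p = project(matrix, clip)
    d1 = distance(p, project(matrix, distorted_clip))  # same song, distorted
    d2 = distance(p, project(matrix, other_clip))      # different song
    return float("inf") if d1 == 0 else d2 / d1
```

A projection that ignores the directions in which distortion acts keeps d1 small and so scores a high ratio.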
9
Bitvector yields 50x Speedup
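The slide does not say how the bitvector is built. One common construction, assumed here, keeps the sign of each float as one bit and uses Hamming distance as a cheap prefilter before any full floating-point comparison. The names and the `max_bits` cutoff are hypothetical.

```python
def to_bitvector(vector):
    """Pack the sign of each component into one integer (bit i set if value > 0)."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of bit positions in which two bitvectors differ."""
    return bin(a ^ b).count("1")

def prefilter(trace, fingerprint_bits, max_bits):
    """Ids of fingerprints whose bitvector is within max_bits of the trace's.

    fingerprint_bits maps clip id -> precomputed bitvector.
    """
    trace_bits = to_bitvector(trace)
    return [
        clip_id
        for clip_id, bits in fingerprint_bits.items()
        if hamming(trace_bits, bits) <= max_bits
    ]
```

Only the survivors of the prefilter need the full 64-float distance, which is where a speedup of this kind would come from.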
10
Example Architecture
[Figure: architecture diagram. Labels: Client (several); Internet; Server; Feature Extraction; Optional Pruning; Lookup; Audio stream; Audio stream identity]
11
Client Resources
  • Computing traces takes approx. 10% CPU on a 750 MHz
    P3.
  • However, we can get a speedup over the current DCT,
    since we're only modifying the first 6
    coefficients: O(N log(N)) → O(6N).
  • Total data loaded by client is 2.1MB.

12
Client Side Options
  • What can be done on the client side to off-load
    the server lookup? Three ideas (in addition to
    only querying untagged music, and adding ID3 tags
    when found)
  • Leverage Zipf's law (if it holds!)
  • Reduce rate at which traces are sent
  • Prune traces on the client

13
Client Side Pruning: Local Lookup
[Figure: Zipf's Law plot with axes log(times played) and log(rank)]
Having a database of fingerprints for e.g. the
top 10,000 songs would significantly reduce
server load, but we don't know by how much. It would also
require updates (e.g. weekly?)
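For a rough feel of what a local top-N database could absorb, the sketch below computes the share of plays that fall on the top-ranked songs under an idealised Zipf distribution. The exponent of 1 and the catalog size are assumptions; the 254,885 figure is borrowed from the fingerprint count on the margin-trees slide purely for illustration, and real play counts may not follow Zipf's law at all.

```python
def zipf_coverage(top_n, catalog_size, exponent=1.0):
    """Fraction of plays landing on the top_n ranks if plays ~ 1 / rank**exponent."""
    head = sum(1.0 / rank ** exponent for rank in range(1, top_n + 1))
    tail = sum(1.0 / rank ** exponent for rank in range(top_n + 1, catalog_size + 1))
    return head / (head + tail)

# Hypothetical scenario: top 10,000 songs out of a 254,885-song catalog.
coverage = zipf_coverage(10_000, 254_885)
```

The result is only as good as the assumed distribution, so it is a planning estimate rather than a measurement of server load.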
14
Client Options, cont.
  • Can reduce sampling by factor of 2 (from 186 to
    372 ms) at some (likely small) loss in accuracy.
    This would halve both client CPU and server load.

15
Client Side Pruning: Margin Trees
  • Using a tree built from first 24 components
  • No overpopulating, but flip 5 most error-prone
    bits in each trace
  • Gets a factor 2 reduction in throughput at 0.5%
    increase in false neg. for very noisy data
  • Number of nodes in tree (for 254,885
    fingerprints) was found to be 1,531,508
  • Requires updates (e.g. weekly?)
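The bit-flipping idea can be sketched without building the tree itself. The code below assumes that a bit is the sign of a component, that the most error-prone bits are the components closest to zero, and that bits are flipped one at a time; none of these details is stated on the slide, and `candidate_keys` is a made-up name.

```python
def candidate_keys(trace, n_components=24, n_flip=5):
    """Lookup keys for a trace: the sign-bit key plus single-bit-flip variants.

    Only the first n_components values are used. The n_flip components with
    the smallest magnitude are treated as the most error-prone bits.
    """
    head = trace[:n_components]
    key = 0
    for i, value in enumerate(head):
        if value > 0:
            key |= 1 << i
    # Components nearest zero are the likeliest to change sign under noise.
    risky = sorted(range(len(head)), key=lambda i: abs(head[i]))[:n_flip]
    return [key] + [key ^ (1 << i) for i in risky]
```

With the defaults a trace yields 6 keys to look up; a trace can be pruned on the client if none of them reaches a populated leaf.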

16
A note on the code
  • Upper bound: 22,000 lines of C.
  • File- and stream-based versions use the same
    libraries.