1 Audio Fingerprinting Overview: RARE Algorithms, Resources
- Chris Burges, John Platt, Jon Goldstein, Erin Renshaw
http://msrweb/cburges/rare.htm
2 Let's agree on names
- A fingerprint is a vector that represents a given audio clip. It lives in a database with a lot of other fingerprints.
- A confirmation fingerprint is a second fingerprint used to confirm a match.
- A trace is generated from audio every 186 ms. It's computed exactly the same way as a fingerprint.
3 Design of the Funnel
[Diagram: analyze a stream by finding 64 good projections of 6 seconds of audio (64 floats per frame). Each 6-second trace passes through two stages, "In Database?" and "Confirmed?": 6 sec of distorted Song A should still pass both stages (output 1) while Song B does not (output 0). If the final output is 1, declare a match. Good projections maximize d2/d1.]
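The funnel is a cascade of checks, each cheaper than the stage it gates. Below is a minimal C sketch of the two decision stages for a single candidate, assuming squared Euclidean distance between 64-float traces and placeholder thresholds; the actual distance measure, thresholds, and database search are not specified on this slide.

    /* Funnel sketch: a trace is accepted only if it passes a cheap
     * "In Database?" check and a more expensive "Confirmed?" check. */
    typedef struct { float v[64]; } Trace;   /* 64 projections per frame (assumed layout) */

    static float sq_dist(const Trace *a, const Trace *b)
    {
        float d = 0.0f;
        for (int i = 0; i < 64; ++i) {
            float t = a->v[i] - b->v[i];
            d += t * t;
        }
        return d;
    }

    /* Returns 1 (match) only if both stages succeed; thresholds are placeholders. */
    int funnel_match(const Trace *t, const Trace *db_fp, const Trace *confirm_fp,
                     float db_thresh, float confirm_thresh)
    {
        if (sq_dist(t, db_fp) > db_thresh)            return 0;  /* "In Database?" stage */
        if (sq_dist(t, confirm_fp) > confirm_thresh)  return 0;  /* "Confirmed?" stage */
        return 1;                                                /* declare match */
    }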
4 Feature Extraction (186 ms)
5 De-Equalization
De-equalize by flattening the log spectrum.
[Figure: log spectrum before and after de-equalization.]
6 De-Equalization Details
- Goal: Remove the slow variation in frequency space.
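A minimal sketch of one way to do this flattening, assuming the slow trend is estimated from the first 6 DCT coefficients of the log spectrum (slide 11 mentions modifying only the first 6 coefficients) and then subtracted; the exact transform and coefficient count used in RARE are assumptions here.

    #include <math.h>

    #define NUM_LOW 6   /* assumed number of low-order DCT coefficients */

    /* De-equalization sketch: estimate the slowly varying trend of the log
     * spectrum from its first NUM_LOW DCT-II coefficients and subtract it.
     * Computing only NUM_LOW coefficients costs O(NUM_LOW * n) rather than
     * O(n log n) for a full transform. */
    void deequalize(float *log_spec, int n)
    {
        const double PI = 3.14159265358979323846;
        double coef[NUM_LOW];

        /* Forward DCT-II, low-order coefficients only */
        for (int k = 0; k < NUM_LOW; ++k) {
            double s = 0.0;
            for (int i = 0; i < n; ++i)
                s += log_spec[i] * cos(PI * k * (i + 0.5) / n);
            coef[k] = 2.0 * s / n;
        }

        /* Subtract the smooth reconstruction (inverse DCT of those terms) */
        for (int i = 0; i < n; ++i) {
            double trend = 0.5 * coef[0];
            for (int k = 1; k < NUM_LOW; ++k)
                trend += coef[k] * cos(PI * k * (i + 0.5) / n);
            log_spec[i] -= (float)trend;
        }
    }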
7 Perceptual Thresholding
Remove coefficients that are below a perceptual threshold to lower unwanted variance.
[Figure: spectrum with the perceptual threshold; components below it are inaudible to a human, components above it are audible.]
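A sketch of the thresholding step, assuming the per-bin thresholds come from a psychoacoustic model computed elsewhere (not part of this sketch):

    #include <math.h>

    /* Perceptual thresholding sketch: zero out spectral coefficients whose
     * magnitude falls below a per-bin threshold, on the assumption that those
     * components are inaudible and only add unwanted variance. */
    void perceptual_threshold(float *coeffs, const float *threshold, int n)
    {
        for (int i = 0; i < n; ++i)
            if (fabsf(coeffs[i]) < threshold[i])
                coeffs[i] = 0.0f;   /* below the audibility threshold: discard */
    }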
8 Project to 64 Floats
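The slides only say that 64 good projections are found; below is a sketch assuming the projection is a learned linear map applied to each preprocessed frame.

    /* Linear projection sketch: map a preprocessed spectral frame of length n
     * to 64 floats using a 64 x n projection matrix stored row-major in `proj`.
     * The matrix is assumed to have been learned offline (e.g. to maximize the
     * d2/d1 separation mentioned on the funnel slide). */
    void project_to_64(const float *frame, int n, const float *proj, float out[64])
    {
        for (int k = 0; k < 64; ++k) {
            float acc = 0.0f;
            for (int i = 0; i < n; ++i)
                acc += proj[k * n + i] * frame[i];
            out[k] = acc;
        }
    }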
9 Bitvector yields 50x Speedup
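A sketch of the bitvector idea, assuming each of the 64 floats is quantized to one bit against a per-dimension threshold (the exact quantization rule is an assumption): the whole trace then fits in one 64-bit word, and filtering candidates becomes an XOR plus a popcount rather than 64 floating-point operations.

    #include <stdint.h>

    /* Quantize the 64-float trace to 64 bits, one per dimension. */
    uint64_t to_bitvector(const float v[64], const float thresh[64])
    {
        uint64_t bits = 0;
        for (int i = 0; i < 64; ++i)
            if (v[i] > thresh[i])
                bits |= (uint64_t)1 << i;
        return bits;
    }

    /* Comparing two quantized traces is now a single XOR plus a popcount. */
    int hamming_distance(uint64_t a, uint64_t b)
    {
        uint64_t x = a ^ b;
        int count = 0;
        while (x) { x &= x - 1; ++count; }   /* clear the lowest set bit */
        return count;
    }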
10 Example Architecture
[Diagram: multiple clients connect to a server over the Internet. Each client runs feature extraction and optional pruning on its audio stream; the server performs the lookup and returns the audio stream's identity.]
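A hypothetical client loop matching this architecture; extract_trace(), prune_locally() and query_server() are placeholders, not functions from the actual RARE code.

    #include <stdio.h>

    typedef struct { float v[64]; } Trace;

    extern int extract_trace(Trace *out);                         /* fills one 64-float trace */
    extern int prune_locally(const Trace *t);                     /* 1 = worth sending */
    extern int query_server(const Trace *t, char *id, size_t n);  /* 1 = identified */

    void client_loop(void)
    {
        Trace t;
        char identity[256];
        while (extract_trace(&t)) {                /* one trace per 186 ms frame */
            if (!prune_locally(&t))
                continue;                          /* optional client-side pruning */
            if (query_server(&t, identity, sizeof identity)) {
                printf("stream identified as: %s\n", identity);
                break;
            }
        }
    }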
11 Client Resources
- Computing traces takes approx. 10% CPU on a 750 MHz P3.
- However, we can get a speedup over the current DCT, since we're only modifying the first 6 coefficients: O(N log N) → O(6N).
- Total data loaded by the client is 2.1 MB.
12 Client Side Options
- What can be done on the client side to off-load the server lookup? Three ideas (in addition to only querying untagged music, and adding ID3 tags when found):
- Leverage Zipf's law (if it holds!)
- Reduce the rate at which traces are sent
- Prune traces on the client
13 Client Side Pruning: Local Lookup
[Plot: Zipf's law; log(times played) vs. log(rank).]
Having a database of fingerprints for, e.g., the top 10,000 songs would significantly reduce server load, but we don't know by how much. It also requires updates (e.g. weekly?).
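A sketch of such a local lookup, assuming the top-N fingerprints are stored on the client as 64-bit bitvectors and matched by Hamming distance; the database size, storage format, and threshold are all assumptions.

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        uint64_t    key;    /* bitvector fingerprint */
        const char *title;  /* song identity */
    } LocalEntry;

    static int popcount64(uint64_t x)
    {
        int c = 0;
        while (x) { x &= x - 1; ++c; }
        return c;
    }

    /* Returns the matching title, or NULL if the trace must go to the server. */
    const char *local_lookup(uint64_t trace_bits, const LocalEntry *db,
                             size_t db_size, int max_dist)
    {
        for (size_t i = 0; i < db_size; ++i)
            if (popcount64(trace_bits ^ db[i].key) <= max_dist)
                return db[i].title;
        return NULL;   /* not in the local top-N database: fall back to the server */
    }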
14 Client Options, cont.
- Can reduce the trace rate by a factor of 2 (from one trace every 186 ms to one every 372 ms) at some (likely small) loss in accuracy. This would halve both client CPU and server load.
15 Client Side Pruning: Margin Trees
- Uses a tree built from the first 24 components.
- No overpopulating of the tree; instead, flip the 5 most error-prone bits in each trace (sketched below).
- Gets a factor-of-2 reduction in throughput at a 0.5% increase in false negatives for very noisy data.
- The number of nodes in the tree (for 254,885 fingerprints) was found to be 1,531,508.
- Requires updates (e.g. weekly?).
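A sketch of the bit-flipping step, assuming it works by probing the margin tree under every combination of the 5 least reliable bits flipped (2^5 = 32 candidate keys per trace); how the error-prone bits are identified is not shown here.

    #include <stdint.h>

    enum { NUM_FLIP = 5 };

    /* Margin-tree probing sketch: build all 2^NUM_FLIP variants of a 24-bit key
     * by flipping subsets of its NUM_FLIP most error-prone bit positions
     * (given in flip_pos), so the tree need not be overpopulated. */
    int make_probe_keys(uint32_t key24, const int flip_pos[NUM_FLIP],
                        uint32_t out[1 << NUM_FLIP])
    {
        for (uint32_t mask = 0; mask < (1u << NUM_FLIP); ++mask) {
            uint32_t k = key24;
            for (int b = 0; b < NUM_FLIP; ++b)
                if (mask & (1u << b))
                    k ^= 1u << flip_pos[b];   /* flip one uncertain bit */
            out[mask] = k;
        }
        return 1 << NUM_FLIP;   /* number of candidate keys generated */
    }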
16 A note on the code
- Upper bound: 22,000 lines of C.
- The file-based and stream-based versions use the same libraries.