Title: PFASC: A Parallel Framework for Audio Similarity Clustering
1. Music Information Retrieval With Condor
Scott McCaulay, Joe Rinkovsky
Pervasive Technology Institute, Indiana University
2. Overview
- PFASC is a suite of applications developed at IU to perform automated similarity analysis of audio files
- Potential applications include organization of digital libraries, recommender systems, playlist generators, and audio processing
- PFASC is a project in the MIR field, an extension and adaptation of traditional text information retrieval techniques to sound files
- Elements of PFASC, specifically the file-by-file similarity calculation, have proven to be a very good fit with Condor
3. What We'll Cover
- Condor at Indiana University
- Background on Information Retrieval and Music Information Retrieval
- The PFASC project
- PFASC and Condor: experience to date and results
- Summary
4. Condor at IU
- Initiated in 2003
- Utilizes 2,350 Windows Vista machines from IU's Student Technology Clusters
- Minimum 2 GB memory, 100 Mb network
- Available to students at 42 locations on the Bloomington campus, 24 x 7
- Student use is top priority; Condor jobs are suspended immediately on use
5. Costs to Support Condor at IU
- Annual marginal cost to support the Condor pool at IU is under $15K
- Includes system administration, head nodes, file servers
- Purchase and support of STC machines are funded from Student Technology Fees
6. Challenges to Making Good Use of Condor Resources at IU
- Windows environment
  - Research computing environment at IU is geared to Linux or to exotic architectures
- Ephemeral resources
  - Machines are moderately to heavily used at all hours; longer jobs are likely to be preempted
- Availability of other computing resources
  - Local users are far from starved for cycles, so there is limited motivation to port
7. Examples of Applications Supported on Condor at IU
- Hydra Portal (2003)
  - Job submission portal
  - Suite of bio apps: BLAST, MEME, fastDNAml
- Condor Render Portal (2006)
  - Maya and Blender video rendering
- PFASC (2008)
  - Similarity analysis of audio files
8. Information Retrieval - Background
- Science of organizing documents for search and retrieval
- Dates back to the 1880s (Hollerith)
- Vannevar Bush, first US presidential science advisor, presages hypertext in "As We May Think" (1945)
- The concept of automated text document analysis, organization, and retrieval was met with a good deal of skepticism until the 1990s; some critics now grudgingly concede that it might work
9. Calculating Similarity: The Vector Space Model
- Each feature found in a file is assigned a weight based on the frequency of its occurrence in the file and on how common that feature is in the collection
- Similarity between files is calculated from their common features and weights; if two files share features that are not common to the entire collection, their similarity value will be very high
- This vector space model (Salton) is the basis of many text search engines, and it also works well with audio files
- For text files, features are words or character strings; for audio files, features are prominent frequencies within frames of audio, or sequences of frequencies across frames (see the sketch below)
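To make the weighting concrete, here is a minimal, illustrative sketch of TF-IDF-style weighting and cosine similarity over per-file feature counts. It is not the PFASC implementation; the function names and the log-based IDF are assumptions made for the example.

    import math
    from collections import Counter

    def tfidf_vectors(docs):
        """Weight each feature by its in-file count (TF) and its rarity across
        the collection (IDF); docs is a list of feature lists, one per file."""
        n = len(docs)
        counts = [Counter(d) for d in docs]
        df = Counter(feat for c in counts for feat in c)   # files containing each feature
        return [{feat: tf * math.log(n / df[feat]) for feat, tf in c.items()}
                for c in counts]

    def cosine(a, b):
        """Cosine similarity of two weighted vectors: near 1.0 for
        near-identical files, 0.0 when no weighted features are shared."""
        dot = sum(w * b.get(feat, 0.0) for feat, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    # e.g. vecs = tfidf_vectors(list_of_feature_lists); cosine(vecs[0], vecs[1])

Under this scheme a feature that appears in every file gets weight zero, so only features that are rare across the collection drive the score, which matches the behavior described above.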
10. Some Digital Audio History
- Uploaded to CompuServe, 10/1985
- One of the most popular downloads at the time!
- 10 seconds of digital audio
- Time to download (300 baud): 20 minutes
- Time to load: 20 minutes (tape), 2 minutes (disk)
- Storage space: 42K
- From this to Napster in less than 15 years
11. Explosion of Digital Audio
- Digital audio today is similar to text 15 years ago
- Poised for the 2nd phase of the digital audio revolution?
- Ubiquitous, easy to create, access, and share
- Lack of tools to analyze, search, or organize
12. How can we organize this enormous and growing volume of digital audio data for discovery and retrieval?
13. What's Done Today
- Pandora - Music Genome Project
  - Expert manual classification of 400 attributes
- Allmusic
  - Manual artist similarity classification by critics
- last.fm Audioscrobbler
  - Collaborative filtering from user playlists
- iTunes Genius
  - Collaborative filtering from user playlists
14. What's NOT Done Today
- Any analysis (outside of research) of similarity or classification based on the actual audio content of song files
15. Possible Hybrid Solution
- Automated analysis
- User behavior
- Manual metadata
- A classification/retrieval system could use elements of all three methods to improve performance
16. Music Information Retrieval
- Applying traditional IR techniques for classification, clustering, similarity analysis, pattern matching, etc. to digital audio files
- A recent field of study; it has accelerated with the inception of the ISMIR conference in 2000 and the MIREX evaluation in 2004
17. Common Basis of an MIR System
- Select a very small segment of audio data, 20-40 ms
- Use a fast Fourier transform (FFT) to convert it to frequency data
- This frame of audio becomes the equivalent of a word in a text file for similarity analysis
- The output of this feature extraction process is input to various analysis or classification processes
- PFASC additionally combines prominent frequencies from adjacent frames to create temporal sequences as features (see the sketch after this list)
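As a rough illustration of this recipe (and not the PFASC feature extractor itself), the sketch below slices audio into short frames, takes the FFT of each, keeps the most prominent frequency bins as per-frame "words", and joins adjacent frames into temporal-sequence features. The frame length, bin count, and function name are assumptions made for the example.

    import numpy as np

    def frame_features(samples, rate, frame_ms=30, top_k=4, seq_len=2):
        """Convert raw audio samples into 'words': the most prominent FFT bins
        of each short frame, plus bins of adjacent frames joined into temporal
        sequence features."""
        frame_len = int(rate * frame_ms / 1000)        # samples per ~30 ms frame
        n_frames = len(samples) // frame_len
        per_frame = []
        for i in range(n_frames):
            frame = samples[i * frame_len:(i + 1) * frame_len]
            spectrum = np.abs(np.fft.rfft(frame))      # magnitude spectrum of the frame
            top = tuple(sorted(np.argsort(spectrum)[-top_k:]))  # prominent bins
            per_frame.append(top)
        features = [("frame",) + bins for bins in per_frame]
        # temporal sequences: bins from seq_len adjacent frames, joined into one feature
        for i in range(n_frames - seq_len + 1):
            features.append(("seq",) + tuple(b for f in per_frame[i:i + seq_len] for b in f))
        return features

These features can then be counted and weighted exactly like words in the vector space model sketched earlier.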
18. PFASC as an MIR Project
- Parallel Framework for Audio Similarity Clustering
- Initiated at IU in 2008
- Team includes the School of Library and Information Science (SLIS), Cognitive Science, the School of Music, and the Pervasive Technology Institute (PTI)
- Have developed an MPI-based feature extraction algorithm, SVM classification, vector space similarity analysis, and some preliminary visualization
- Wish list includes a graphical workflow, a job submission portal, and use in MIR classes
19. PFASC Philosophy and Methodology
- Provide an end-to-end framework for MIR, from workflow to visualization
- Recognize temporal context as a critical element of audio and a necessary part of feature extraction
- Simple concept, simple implementation: one highly configurable algorithm for feature extraction
- Dynamic combination and tuning of results from multiple runs, with user-controlled weighting (see the sketch after this list)
- Make good use of available cyberinfrastructure
- Support education in MIR
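The combination-and-tuning bullet could work along the lines of the following minimal sketch, which blends per-run similarity matrices using user-chosen weights. The matrix names, weights, and function are illustrative assumptions, not PFASC code.

    import numpy as np

    def combine_runs(sim_matrices, weights):
        """Blend per-run N x N similarity matrices (values in [0, 1]) into one
        matrix, using user-controlled weights normalized to sum to 1."""
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()              # normalize the user weights
        combined = np.zeros_like(np.asarray(sim_matrices[0], dtype=float))
        for w, m in zip(weights, sim_matrices):
            combined += w * np.asarray(m, dtype=float)
        return combined

    # e.g. weight one run's results more heavily than another's:
    # combined = combine_runs([sim_run_a, sim_run_b], weights=[0.7, 0.3])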
20. PFASC Feature Extraction Example
- Summary of 450 files classified by genre, showing the most prominent frequencies across the spectrum
21. PFASC Similarity Matrix Example
- An audio file is summarized as a vector of feature values, and similarity is calculated between vectors
- The value is between 0.0 and 1.0: 0.0 means no commonality, 1.0 means the files are identical
- In the above example, same-genre files had similarity scores 3.352 times higher than different-genre files (a sketch of that comparison follows this list)
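For reference, a ratio like the one quoted above can be computed from an N x N similarity matrix and per-file genre labels roughly as follows. This is an illustrative sketch, not the PFASC evaluation code, and it assumes the similarity matrix is a NumPy array.

    import numpy as np

    def genre_similarity_ratio(sim, genres):
        """Mean same-genre similarity divided by mean different-genre similarity,
        given an N x N similarity matrix and one genre label per file."""
        genres = np.asarray(genres)
        same = genres[:, None] == genres[None, :]      # True where genres match
        off_diag = ~np.eye(len(genres), dtype=bool)    # ignore self-comparisons
        return sim[same & off_diag].mean() / sim[~same & off_diag].mean()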
22. Classification vs. Clustering
- Most work in MIR involves classification, e.g. genre classification, an exercise that may be arbitrary and of limited value
- Calculating similarity values among all songs in a library may be more practical for music discovery, playlist generation, and grouping by combinations of selected features
- Calculating similarity is MUCH more computationally intensive than categorization: comparing all songs in a library of 20,000 files requires 200 million comparisons (the arithmetic is spelled out below)
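The count behind that last bullet: comparing every distinct pair of n files takes n(n - 1)/2 similarity calculations, so for n = 20,000 that is 20,000 x 19,999 / 2 = 199,990,000, roughly 200 million.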
23. Using Condor for Similarity Analysis
- A good fit for IU Condor resources: a very large number of short-duration jobs
- Jobs are independent and can be restarted and run in any order (a sketch of a submit description follows this list)
- The large number of available machines provides a great wall-clock performance advantage over IU supercomputers
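A minimal sketch of what a Condor submit description for this kind of workload could look like; the executable name, argument scheme, and job count are illustrative assumptions, not the actual PFASC submit file.

    # one short, independent comparison job per queued process
    universe   = vanilla
    executable = pfasc_compare.exe
    arguments  = chunk_$(Process).list
    output     = compare_$(Process).out
    error      = compare_$(Process).err
    log        = compare.log
    queue 450

Because each job reads only its own chunk of the comparison list and writes its own output, a preempted job can simply be rerun, which is what makes the restart-in-any-order point above work.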
24. PFASC Performance and Resources
- A recent run of 450 jobs completed in 16 minutes; running it in serial on a desktop machine would have taken about 19 hours
- The largest run to date contained 3,245 files and over 5 million song-to-song comparisons; it completed in less than eight hours, versus over 11 days on a desktop
- Queue wait time for 450 processors on IU's Big Red is typically several days; for 3,000 processors it would be up to a month
25. Porting to Windows
26. Visualizing Results
27. Visualizing Results
28. PFASC Contributors
- Scott McCaulay (Project Lead)
- Ray Sheppard (MPI Programming)
- Eric Wernert (Visualization)
- Joe Rinkovsky (Condor)
- Steve Simms (Storage Workflow)
- Kiduk Yang (Information Retrieval)
- John Walsh (Digital Libraries)
- Eric Isaacson (Music Cognition)
29. Thank you!