PFASC: A Parallel Framework for Audio Similarity Clustering



1
Music Information Retrieval With Condor
Scott McCaulay, Joe Rinkovsky
Pervasive Technology Institute, Indiana University
2
Overview
  • PFASC is a suite of applications developed at IU
    to perform automated similarity analysis of audio
    files
  • Potential applications include organization of
    digital libraries, recommender systems, playlist
    generators, audio processing
  • PFASC is a project in the MIR field, an extension
    and adaptation of traditional Text Information
    Retrieval techniques to sound files
  • Elements of PFASC, specifically the file-by-file
    similarity calculation, have proven to be a very
    good fit for Condor

3
What We'll Cover
  • Condor at Indiana University
  • Background on Information Retrieval and Music
    Information Retrieval
  • The PFASC project
  • PFASC and Condor, experience to date and results
  • Summary

4
Condor at IU
  • Initiated in 2003
  • Utilizes 2,350 Windows Vista machines from IU's
    Student Technology Clusters
  • Minimum 2 GB memory, 100 Mbit/s network
  • Available to students at 42 locations on the
    Bloomington campus, 24x7
  • Student use is the top priority; Condor jobs are
    suspended immediately when a machine is in use

5
Costs to Support Condor at IU
  • The marginal annual cost to support the Condor
    pool at IU is < $15K
  • Includes system administration, head nodes, file
    servers
  • Purchase and support of STC machines are funded
    from Student Technology Fees

6
Challenges to Making Good Use of Condor
Resources at IU
  • Windows environment
    - The research computing environment at IU is
      geared to Linux or to exotic architectures
  • Ephemeral resources
    - Machines are moderately to heavily used at all
      hours; longer jobs are likely to be preempted
  • Availability of other computing resources
    - Local users are far from starved for cycles, so
      there is limited motivation to port

7
Examples of Applications Supported on Condor at IU
  • Hydra Portal (2003)
    - Job submission portal
    - Suite of bio apps: BLAST, MEME, fastDNAml
  • Condor Render Portal (2006)
    - Maya and Blender video rendering
  • PFASC (2008)
    - Similarity analysis of audio files

8
Information Retrieval - Background
  • Science of organizing documents for search and
    retrieval
  • Dates back to the 1880s (Hollerith)
  • Vannevar Bush, first US presidential science
    advisor, presages hypertext in As We May Think
    (1945)
  • The concept of automated text document analysis,
    organization and retrieval was met with a good
    deal of skepticism until the 1990s. Some critics
    now grudgingly concede that it might work

9
Calculating Similarity: The Vector Space Model
  • Each feature found in a file is assigned a weight
    based on the frequency of its occurrence in the
    file and how common that feature is in the
    collection
  • Similarity between files is calculated based on
    common features and their weights. If two files
    share features not common to the entire
    collection, their similarity value will be very
    high
  • This vector space model (Salton) is the basis of
    many text search engines, and also works well
    with audio files
  • For text files, features are words or character
    strings; for audio files, features are prominent
    frequencies within frames of audio or sequences
    of frequencies across frames (a sketch of the
    computation follows)
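
  A minimal sketch of this weighting and similarity
  computation in Python, assuming plain TF-IDF weights and
  cosine similarity; the deck does not give PFASC's exact
  formula, so the details here are illustrative:

    import math
    from collections import Counter

    def tfidf_vectors(docs):
        """docs: one list of features per file (words for text,
        frame features for audio). Weight each feature by its
        frequency in the file and its rarity in the collection."""
        n = len(docs)
        df = Counter()                 # files containing each feature
        for d in docs:
            df.update(set(d))
        vecs = []
        for d in docs:
            tf = Counter(d)
            vecs.append({f: tf[f] * math.log(n / df[f]) for f in tf})
        return vecs

    def cosine(u, v):
        """Similarity of two weighted vectors: 0.0 = nothing in
        common, 1.0 = identical feature weights."""
        dot = sum(w * v.get(f, 0.0) for f, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

  Features shared by every file get a weight of zero, which is
  what makes rare shared features dominate the score.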

10
Some Digital Audio History
  • Uploaded to CompuServe, 10/1985
  • One of the most popular downloads at the time!
  • 10 seconds of digital audio
  • Time to download (300 baud): 20 minutes
  • Time to load: 20 minutes (tape), 2 minutes (disk)
  • Storage space: 42K
  • From this to Napster in less than 15 years

11
Explosion of Digital Audio
  • Digital audio today similar to text 15 years ago
  • Poised for 2nd phase of the digital audio
    revolution?
  • Ubiquitous, easy to create, access, share
  • Lack of tools to analyze, search or organize

12
How can we organize this enormous and growing
volume of digital audio data for discovery and
retrieval?
13
What's Done Today
  • Pandora (Music Genome Project)
    - Expert manual classification of 400 attributes
  • Allmusic
    - Manual artist similarity classification by
      critics
  • Last.fm (Audioscrobbler)
    - Collaborative filtering from user playlists
  • iTunes Genius
    - Collaborative filtering from user playlists

14
What's NOT Done Today
  • Any analysis (outside of research) of similarity
    or classification based on the actual audio
    content of song files

15
Possible Hybrid Solution
  • Three complementary sources: automated analysis,
    user behavior, and manual metadata
  • A classification/retrieval system could use
    elements of all three methods to improve
    performance

16
Music Information Retrieval
  • Applying traditional IR techniques for
    classification, clustering, similarity analysis,
    pattern matching, etc. to digital audio files
  • A recent field of study that has accelerated
    since the inception of the ISMIR conference in
    2000 and the MIREX evaluation in 2004

17
Common Basis of an MIR System
  • Select a very small segment of audio data
    (20-40 ms)
  • Use a fast Fourier transform (FFT) to convert it
    to frequency data
  • This frame of audio becomes the equivalent of a
    word in a text file for similarity analysis
  • The output of this feature extraction process
    is input to various analysis or classification
    processes
  • PFASC additionally combines prominent frequencies
    from adjacent frames to create temporal sequences
    as features (see the sketch below)
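
  A minimal sketch of this kind of frame-based extraction
  using NumPy's FFT; the frame length, number of peaks, and
  the bigram scheme for temporal sequences are illustrative
  assumptions, not PFASC's actual parameters:

    import numpy as np

    def extract_features(samples, rate=44100, frame_ms=30, n_peaks=4):
        """Turn raw audio samples into 'words': the strongest
        frequency bins in each ~30 ms frame, plus pairs of
        adjacent frames as temporal-sequence features."""
        frame_len = int(rate * frame_ms / 1000)
        frames = []
        for i in range(len(samples) // frame_len):
            chunk = samples[i * frame_len:(i + 1) * frame_len]
            spectrum = np.abs(np.fft.rfft(chunk))
            # indices of the n_peaks most prominent frequency bins
            frames.append(tuple(sorted(np.argsort(spectrum)[-n_peaks:])))
        features = [f"F{p}" for peaks in frames for p in peaks]
        # sequences of frequencies across adjacent frames
        features += [f"S{a}-{b}" for a, b in zip(frames, frames[1:])]
        return features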

18
PFASC as an MIR Project
  • Parallel Framework for Audio Similarity
    Clustering
  • Initiated at IU in 2008
  • Team includes the School of Library and
    Information Science (SLIS), Cognitive Science,
    the School of Music, and the Pervasive Technology
    Institute (PTI)
  • Have developed an MPI-based feature extraction
    algorithm, SVM classification, vector space
    similarity analysis, and some preliminary
    visualization
  • Wish list includes graphical workflow, job
    submission portal, use in MIR classes

19
PFASC Philosophy and Methodology
  • Provide an end-to-end framework for MIR, from
    workflow to visualization
  • Recognize temporal context as a critical element
    of audio and a necessary part of feature
    extraction
  • Simple concept, simple implementation, one highly
    configurable algorithm for feature extraction
  • Dynamic combination and tuning of results from
    multiple runs, user controlled weighting
  • Make good use of available cyberinfrastructure
  • Support education in MIR

20
PFASC Feature Extraction Example
  • Summary of 450 files classified by genre, showing
    the most prominent frequencies across the spectrum

21
PFASC Similarity Matrix Example
  • Audio file summarized as a vector of feature
    values, similarity calculated between vectors
  • Values range from 0.0 to 1.0: 0.0 means no
    commonality, 1.0 means the files are identical
  • In the example above, same-genre files had
    similarity scores 3.352 times higher than
    different-genre files (see the sketch below)
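
  A minimal sketch of building such a matrix and checking the
  genre ratio, reusing the illustrative tfidf_vectors and
  cosine helpers from the vector space slide; the genre labels
  are assumed inputs:

    import itertools
    import numpy as np

    def similarity_matrix(vecs):
        """All-pairs similarity; every entry lies in [0.0, 1.0]."""
        n = len(vecs)
        sim = np.eye(n)             # a file vs. itself scores 1.0
        for i, j in itertools.combinations(range(n), 2):
            # cosine() is the helper from the vector space sketch
            sim[i, j] = sim[j, i] = cosine(vecs[i], vecs[j])
        return sim

    def genre_ratio(sim, genres):
        """Mean same-genre score divided by mean cross-genre score."""
        same, diff = [], []
        for i, j in itertools.combinations(range(len(genres)), 2):
            (same if genres[i] == genres[j] else diff).append(sim[i, j])
        return np.mean(same) / np.mean(diff)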

22
Classification vs. Clustering
  • Most work in MIR involves classification, e.g.
    genre classification, an exercise that may be
    arbitrary and limited in value
  • Calculating similarity values among all songs in
    a library may be more practical for music
    discovery, playlist generation, grouping by
    combinations of selected features
  • Calculating similarity is MUCH more
    computationally intensive than classification:
    comparing all songs in a library of 20,000 files
    pairwise requires 20,000 × 19,999 / 2, about 200
    million comparisons

23
Using Condor for Similarity Analysis
  • Good fit for IU Condor resources, a very large
    number of short duration jobs
  • Jobs are independent, can be restarted and run in
    any order
  • The large number of available machines provides a
    great wall-clock performance advantage over IU's
    supercomputers (a sketch of the job decomposition
    follows)
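
  A minimal sketch of that decomposition: split the n(n-1)/2
  comparisons into fixed-size blocks and submit one Condor job
  per block. The block size and argument format are
  illustrative assumptions, not PFASC's actual scripts:

    import itertools

    def pair_chunks(n_files, pairs_per_job=50000):
        """Yield blocks of (i, j) file-index pairs. Each block is
        an independent job: it can run in any order and simply be
        rerun from scratch if Condor preempts it."""
        pairs = itertools.combinations(range(n_files), 2)
        while True:
            block = list(itertools.islice(pairs, pairs_per_job))
            if not block:
                break
            yield block

    # e.g. the 3,245-file run: ~5.3 million pairs, ~106 jobs
    for job_id, block in enumerate(pair_chunks(3245)):
        print(f"job {job_id}: {len(block)} comparisons,"
              f" first pair {block[0]}, last pair {block[-1]}")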

24
PFASC Performance and Resources
  • A recent run of 450 jobs completed in 16 minutes.
    Time to run in serial on a desktop machine would
    have been about 19 hours
  • Largest run to date contained 3,245 files, over 5
    million song-to-song comparisons, completed in
    less than eight hours, would have been over 11
    days on a desktop
  • Queue wait time for 450 processors on IU's Big
    Red is typically several days; for 3,000
    processors it would be up to a month

25
Porting to Windows
26
Visualizing Results
27
Visualizing Results
28
PFASC Contributors
  • Scott McCaulay (Project Lead)
  • Ray Sheppard (MPI Programming)
  • Eric Wernert (Visualization)
  • Joe Rinkovsky (Condor)
  • Steve Simms (Storage Workflow)
  • Kiduk Yang (Information Retrieval)
  • John Walsh (Digital Libraries)
  • Eric Isaacson (Music Cognition)

29
Thank you!