PFASC: A Parallel Framework for Audio Similarity Clustering



1
Music Information Retrieval With Condor
Scott McCaulay, Joe Rinkovsky
Pervasive Technology Institute, Indiana University
2
Overview
  • PFASC is a suite of applications developed at IU
    to perform automated similarity analysis of audio
    files
  • Potential applications include organization of
    digital libraries, recommender systems, playlist
    generators, audio processing
  • PFASC is a project in the MIR field, an extension
    and adaptation of traditional Text Information
    Retrieval techniques to sound files
  • Elements of PFASC, specifically the file-by-file
    similarity calculation, have proven to be a very
    good fit for Condor

3
What We'll Cover
  • Condor at Indiana University
  • Background on Information Retrieval and Music
    Information Retrieval
  • The PFASC project
  • PFASC and Condor, experience to date and results
  • Summary

4
Condor at IU
  • Initiated in 2003
  • Utilizes 2,350 Windows Vista machines from IU's
    Student Technology Clusters
  • Minimum 2 GB memory, 100 Mbit/s network
  • Available to students at 42 locations on the
    Bloomington campus, 24x7
  • Student use is the top priority; Condor jobs are
    suspended immediately when a machine is in use

5
Costs to Support Condor at IU
  • The marginal annual cost to support the Condor
    pool at IU is < $15K
  • Includes system administration, head nodes, file
    servers
  • Purchase and support of STC machines are funded
    from Student Technology Fees

6
Challenges to Making Good Use of Condor
Resources at IU
  • Windows environment
    - The research computing environment at IU is
      geared to Linux or to exotic architectures
  • Ephemeral resources
    - Machines are moderately to heavily used at all
      hours; longer jobs are likely to be preempted
  • Availability of other computing resources
    - Local users are far from starved for cycles, so
      there is limited motivation to port

7
Examples of Applications Supported on Condor at IU
  • Hydra Portal (2003)
    - Job submission portal
    - Suite of bio apps: BLAST, MEME, fastDNAml
  • Condor Render Portal (2006)
    - Maya and Blender video rendering
  • PFASC (2008)
    - Similarity analysis of audio files

8
Information Retrieval - Background
  • Science of organizing documents for search and
    retrieval
  • Dates back to the 1880s (Hollerith)
  • Vannevar Bush, first US presidential science
    advisor, presages hypertext in As We May Think
    (1945)
  • The concept of automated text document analysis,
    organization and retrieval was met with a good
    deal of skepticism until the 1990s. Some critics
    now grudgingly concede that it might work

9
Calculating Similarity: The Vector Space Model
  • Each feature found in a file is assigned a weight
    based on the frequency of its occurrence in the
    file and how common that feature is in the
    collection
  • Similarity between files is calculated based on
    common features and their weights. If two files
    share features not common to the entire
    collection, their similarity value will be very
    high
  • This vector space model (Salton) is the basis of
    many text search engines, and also works well
    with audio files
  • For text files, features are words or character
    strings; for audio files, features are prominent
    frequencies within frames of audio or sequences
    of frequencies across frames (a sketch of the
    computation follows)
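
  A minimal sketch of this weighting and similarity
  computation in Python, assuming plain TF-IDF weights and
  cosine similarity; the deck does not give PFASC's exact
  formula, so the details here are illustrative:

    import math
    from collections import Counter

    def tfidf_vectors(docs):
        """docs: one list of features per file (words for text,
        frame features for audio). Weight each feature by its
        frequency in the file and its rarity in the collection."""
        n = len(docs)
        df = Counter()                 # files containing each feature
        for d in docs:
            df.update(set(d))
        vecs = []
        for d in docs:
            tf = Counter(d)
            vecs.append({f: tf[f] * math.log(n / df[f]) for f in tf})
        return vecs

    def cosine(u, v):
        """Similarity of two weighted vectors: 0.0 = nothing in
        common, 1.0 = identical feature weights."""
        dot = sum(w * v.get(f, 0.0) for f, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

  Features shared by every file get a weight of zero, which is
  what makes rare shared features dominate the score.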

10
Some Digital Audio History
  • Uploaded to CompuServe, 10/1985
  • One of the most popular downloads at the time!
  • 10 seconds of digital audio
  • Time to download (300 baud): 20 minutes
  • Time to load: 20 minutes (tape), 2 minutes (disk)
  • Storage space: 42K
  • From this to Napster in less than 15 years

11
Explosion of Digital Audio
  • Digital audio today similar to text 15 years ago
  • Poised for 2nd phase of the digital audio
    revolution?
  • Ubiquitous, easy to create, access, share
  • Lack of tools to analyze, search or organize

12
How can we organize this enormous and growing
volume of digital audio data for discovery and
retrieval?
13
What's Done Today
  • Pandora (Music Genome Project)
    - Expert manual classification of 400 attributes
  • Allmusic
    - Manual artist similarity classification by
      critics
  • Last.fm (Audioscrobbler)
    - Collaborative filtering from user playlists
  • iTunes Genius
    - Collaborative filtering from user playlists

14
What's NOT Done Today
  • Any analysis (outside of research) of similarity
    or classification based on the actual audio
    content of song files

15
Possible Hybrid Solution
  • Three complementary sources: automated analysis,
    user behavior, and manual metadata
  • A classification/retrieval system could use
    elements of all three methods to improve
    performance

16
Music Information Retrieval
  • Applying traditional IR techniques for
    classification, clustering, similarity analysis,
    pattern matching, etc. to digital audio files
  • A recent field of study that has accelerated
    since the inception of the ISMIR conference in
    2000 and the MIREX evaluation in 2004

17
Common Basis of an MIR System
  • Select a very small segment of audio data
    (20-40 ms)
  • Use a fast Fourier transform (FFT) to convert it
    to frequency data
  • This frame of audio becomes the equivalent of a
    word in a text file for similarity analysis
  • The output of this feature extraction process
    is input to various analysis or classification
    processes
  • PFASC additionally combines prominent frequencies
    from adjacent frames to create temporal sequences
    as features (see the sketch below)
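
  A minimal sketch of this kind of frame-based extraction
  using NumPy's FFT; the frame length, number of peaks, and
  the bigram scheme for temporal sequences are illustrative
  assumptions, not PFASC's actual parameters:

    import numpy as np

    def extract_features(samples, rate=44100, frame_ms=30, n_peaks=4):
        """Turn raw audio samples into 'words': the strongest
        frequency bins in each ~30 ms frame, plus pairs of
        adjacent frames as temporal-sequence features."""
        frame_len = int(rate * frame_ms / 1000)
        frames = []
        for i in range(len(samples) // frame_len):
            chunk = samples[i * frame_len:(i + 1) * frame_len]
            spectrum = np.abs(np.fft.rfft(chunk))
            # indices of the n_peaks most prominent frequency bins
            frames.append(tuple(sorted(np.argsort(spectrum)[-n_peaks:])))
        features = [f"F{p}" for peaks in frames for p in peaks]
        # sequences of frequencies across adjacent frames
        features += [f"S{a}-{b}" for a, b in zip(frames, frames[1:])]
        return features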

18
PFASC as an MIR Project
  • Parallel Framework for Audio Similarity
    Clustering
  • Initiated at IU in 2008
  • Team includes the School of Library and
    Information Science (SLIS), Cognitive Science,
    the School of Music, and the Pervasive Technology
    Institute (PTI)
  • Have developed an MPI-based feature extraction
    algorithm, SVM classification, vector space
    similarity analysis, and some preliminary
    visualization
  • Wish list includes graphical workflow, job
    submission portal, use in MIR classes

19
PFASC Philosophy and Methodology
  • Provide an end-to-end framework for MIR, from
    workflow to visualization
  • Recognize temporal context as a critical element
    of audio and a necessary part of feature
    extraction
  • Simple concept, simple implementation, one highly
    configurable algorithm for feature extraction
  • Dynamic combination and tuning of results from
    multiple runs, user controlled weighting
  • Make good use of available cyberinfrastructure
  • Support education in MIR

20
PFASC Feature Extraction Example
  • Summary of 450 files classified by genre, showing
    the most prominent frequencies across the spectrum

21
PFASC Similarity Matrix Example
  • Audio file summarized as a vector of feature
    values, similarity calculated between vectors
  • Values range from 0.0 to 1.0: 0.0 means no
    commonality, 1.0 means the files are identical
  • In the example above, same-genre files had
    similarity scores 3.352 times higher than
    different-genre files (see the sketch below)
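
  A minimal sketch of building such a matrix and checking the
  genre ratio, reusing the illustrative tfidf_vectors and
  cosine helpers from the vector space slide; the genre labels
  are assumed inputs:

    import itertools
    import numpy as np

    def similarity_matrix(vecs):
        """All-pairs similarity; every entry lies in [0.0, 1.0]."""
        n = len(vecs)
        sim = np.eye(n)             # a file vs. itself scores 1.0
        for i, j in itertools.combinations(range(n), 2):
            # cosine() is the helper from the vector space sketch
            sim[i, j] = sim[j, i] = cosine(vecs[i], vecs[j])
        return sim

    def genre_ratio(sim, genres):
        """Mean same-genre score divided by mean cross-genre score."""
        same, diff = [], []
        for i, j in itertools.combinations(range(len(genres)), 2):
            (same if genres[i] == genres[j] else diff).append(sim[i, j])
        return np.mean(same) / np.mean(diff)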

22
Classification vs. Clustering
  • Most work in MIR involves classification, e.g.
    genre classification, an exercise that may be
    arbitrary and limited in value
  • Calculating similarity values among all songs in
    a library may be more practical for music
    discovery, playlist generation, grouping by
    combinations of selected features
  • Calculating similarity is MUCH more
    computationally intensive than classification:
    comparing all songs in a library of 20,000 files
    pairwise requires 20,000 × 19,999 / 2, about 200
    million comparisons

23
Using Condor for Similarity Analysis
  • Good fit for IU Condor resources, a very large
    number of short duration jobs
  • Jobs are independent, can be restarted and run in
    any order
  • The large number of available machines provides a
    great wall-clock performance advantage over IU's
    supercomputers (a sketch of the job decomposition
    follows)
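
  A minimal sketch of that decomposition: split the n(n-1)/2
  comparisons into fixed-size blocks and submit one Condor job
  per block. The block size and argument format are
  illustrative assumptions, not PFASC's actual scripts:

    import itertools

    def pair_chunks(n_files, pairs_per_job=50000):
        """Yield blocks of (i, j) file-index pairs. Each block is
        an independent job: it can run in any order and simply be
        rerun from scratch if Condor preempts it."""
        pairs = itertools.combinations(range(n_files), 2)
        while True:
            block = list(itertools.islice(pairs, pairs_per_job))
            if not block:
                break
            yield block

    # e.g. the 3,245-file run: ~5.3 million pairs, ~106 jobs
    for job_id, block in enumerate(pair_chunks(3245)):
        print(f"job {job_id}: {len(block)} comparisons,"
              f" first pair {block[0]}, last pair {block[-1]}")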

24
PFASC Performance and Resources
  • A recent run of 450 jobs completed in 16 minutes.
    Time to run in serial on a desktop machine would
    have been about 19 hours
  • Largest run to date contained 3,245 files, over 5
    million song-to-song comparisons, completed in
    less than eight hours, would have been over 11
    days on a desktop
  • Queue wait time for 450 processors on IU's Big
    Red is typically several days; for 3,000
    processors it would be up to a month

25
Porting to Windows
26
Visualizing Results
27
Visualizing Results
28
PFASC Contributors
  • Scott McCaulay (Project Lead)
  • Ray Sheppard (MPI Programming)
  • Eric Wernert (Visualization)
  • Joe Rinkovsky (Condor)
  • Steve Simms (Storage Workflow)
  • Kiduk Yang (Information Retrieval)
  • John Walsh (Digital Libraries)
  • Eric Isaacson (Music Cognition)

29
Thank you!