Toward Automatic Music Audio Summary Generation from Signal Analysis


1
Toward Automatic Music Audio Summary Generation
from Signal Analysis
Patricia Signé
  • Seminar Communications Engineering
  • 11. December 2007

2
Agenda
  • Introduction
  • State of the art
  • Static vs Dynamic features
  • Automatic Music Audio Summary generation
  • Extraction of information from the signal
  • Representation by states: multipass approach
  • Conclusion
  • Questions

3
Introduction
  • Recent topic of interest, driven by commercial
    needs (browsing of online music), documentation
    (browsing over archives), and music
    information retrieval.
  • Storage of audio summaries has been standardized, e.g.
    the SDS of the MPEG-7 standard
  • Set of tools allowing the storage of sequential
    or hierarchical summaries
  • Only a few techniques exist for the automatic
    generation of audio summaries, in sharp contrast to
    video and text, for which multiple
    approaches to automatic summary
    generation exist.
  • A summary can be parameterized at three levels:
  • The type of the source
  • The goal of the summary
  • The output format

4
State of the art
  • Sequences approach
  • A similarity matrix applied to well-chosen
    features allows a visual representation of the
    structural information of a piece of music
    (Foote's work on the similarity matrix).
  • Signal features used in this study are the Mel
    Frequency Cepstral Coefficients (MFCC).
  • If a specific segment of music ranging from times
    t1 to t2 is repeated later in the music from t3
    to t4, the succession of features over both
    time periods is expected to be identical.
  • A key point of current approaches is their use
    of static features (MFCC) as the signal observation.
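
The similarity-matrix idea above can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure (the slide does not name one) and uses made-up two-dimensional vectors in place of real MFCC frames; the function names are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero frames
    return dot / (norm_u * norm_v)

def similarity_matrix(frames):
    """S[i][j] = similarity between the feature vectors of frames i and j."""
    return [[cosine_similarity(fi, fj) for fj in frames] for fi in frames]

# Toy stand-in for an MFCC sequence: frames 0-1 are repeated as frames 3-4.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
S = similarity_matrix(frames)

# A repeated segment shows up as high values parallel to the main diagonal.
print(S[0][3], S[1][4])
```

In this toy case the segment covering frames 0 to 1 reappears at frames 3 to 4, so the entries S[0][3] and S[1][4] form a short stripe parallel to the main diagonal.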

5
Static vs Dynamic features
  • Static features represent the signal around a
    given time but do not model any temporal
    evolution. When looking for
    repeated patterns in the music, it is therefore necessary to
    find identical evolutions of the features, or to
    average features over a period of time in order
    to get states.
  • Dynamic features model directly the temporal
    evolution of the spectral shape over a fixed time
    duration. The choice of the duration on which the
    modeling is performed, determines the kind of
    information that we will be able to derive from
    signal analysis.
  • Feature extraction
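
One possible way to model temporal evolution over a fixed duration is sketched below. This is an assumption, not necessarily the method of the presentation: each frequency band's energy trajectory within a window is summarized by the magnitudes of its discrete Fourier transform. The names `dynamic_features`, `window`, and `hop` are hypothetical, and the band energies are toy values.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the DFT of a real sequence, for bins 0..len(x)//2."""
    n = len(x)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def dynamic_features(band_energies, window, hop):
    """One vector per window: DFT magnitudes of every band's energy trajectory."""
    n_bands = len(band_energies[0])
    features = []
    for start in range(0, len(band_energies) - window + 1, hop):
        vector = []
        for b in range(n_bands):
            trajectory = [band_energies[start + t][b] for t in range(window)]
            vector.extend(dft_magnitudes(trajectory))
        features.append(vector)
    return features

# Toy input: 16 frames, 2 bands. Band 0 is constant, band 1 alternates 0, 1, 0, 1...
energies = [[1.0, float(t % 2)] for t in range(16)]
feats = dynamic_features(energies, window=8, hop=8)
print(len(feats), len(feats[0]))
```

The `window` parameter plays the role of the modeling duration: a longer window captures slower variations and yields fewer feature vectors per piece.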

6
Static vs Dynamic features
  • Using static features implies that, when looking
    for repeated patterns in the piece of music, an
    identical evolution of the features must be
    found.
  • Advantages of using dynamic features:
  • This problem of static features is
    solved with dynamic ones, i.e. if some arrangement
    of the music masks the repetition of the initial
    melody sequence, repeated patterns will still be
    recognized.
  • For an appropriate choice of the modeling
    duration, the search for repeated patterns in the
    music can be far easier.
  • The amount of data can be greatly reduced: for a
    4-minute piece of music, the size of the similarity
    matrix is around 24000 × 24000 in the case of the
    MFCC, but only 240 × 240 in the case of the
    dynamic features.
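
The size figures can be checked with a short calculation. The hop sizes are assumptions, chosen because one feature vector per second reproduces the 240 × 240 figure for a 4-minute piece; the slide itself does not state them.

```python
def similarity_matrix_side(duration_s, hop_s):
    """Number of feature frames, i.e. rows (= columns) of the similarity matrix."""
    return round(duration_s / hop_s)

duration_s = 4 * 60                                        # 4-minute piece
static_side = similarity_matrix_side(duration_s, 0.010)    # assumed 10 ms hop (MFCC)
dynamic_side = similarity_matrix_side(duration_s, 1.0)     # assumed 1 s hop (dynamic)

# Ratio of the number of matrix entries between the two cases.
reduction = (static_side ** 2) // (dynamic_side ** 2)
print(static_side, dynamic_side, reduction)
```

Under these assumptions the matrix shrinks by a factor of 100 per side, so it holds 10,000 times fewer entries.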

7
Automatic Music Audio Summary generation
  • Consider the musical piece as a succession of
    states, each state representing somewhat similar
    information found in different parts of the
    piece.
  • The states we are looking for are specific to each
    piece of music, so no supervised learning is
    possible to find them.
  • For automatic audio summary generation, use a
    human-like segmentation and structuring approach
    that analyses the data in successive passes.
  • From the signal data
  • Dynamic feature extraction: a first listening
    allows the detection of variations in the music
    without knowing whether a specific part is repeated
    later. This segmentation defines a set of
    templates, which we call potential states.
  • Finding the structure by using previously created
    templates
  • Templates are compared to reduce redundancies.
  • The reduced set of templates is used as
    initialization for a K-means algorithm.
  • The middle states, which are the output of the
    K-means algorithm, are used for the initialization
    of the hidden Markov model (HMM) learning.
  • Finally, the optimal representation of the piece
    as an HMM state sequence is obtained by
    applying the Viterbi algorithm.
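
The first passes of this procedure can be sketched as follows. This is a simplified illustration under stated assumptions: Euclidean distance, fixed thresholds, and toy two-dimensional vectors standing in for dynamic features. The function names and thresholds are hypothetical, and HMM learning and Viterbi decoding are left out.

```python
import math

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def potential_states(frames, change_threshold):
    """First pass: open a new segment when a frame departs from the current
    segment's mean; return one template (segment mean) per segment."""
    segments = [[frames[0]]]
    for f in frames[1:]:
        if dist(f, mean(segments[-1])) > change_threshold:
            segments.append([f])
        else:
            segments[-1].append(f)
    return [mean(s) for s in segments]

def reduce_templates(templates, merge_threshold):
    """Second pass: drop templates that nearly duplicate one already kept."""
    kept = []
    for t in templates:
        if all(dist(t, k) > merge_threshold for k in kept):
            kept.append(t)
    return kept

def kmeans(frames, initial_centroids, iterations=10):
    """K-means started from the reduced templates rather than random points."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        labels = [min(range(len(centroids)), key=lambda k: dist(f, centroids[k]))
                  for f in frames]
        for k in range(len(centroids)):
            members = [f for f, lab in zip(frames, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return centroids, labels

# Toy piece with structure A A A B B B A A A.
frames = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
          [5.0, 5.1], [5.1, 5.0], [5.0, 5.0],
          [0.1, 0.1], [0.0, 0.1], [0.1, 0.0]]
templates = potential_states(frames, change_threshold=1.0)
reduced = reduce_templates(templates, merge_threshold=1.0)
centroids, labels = kmeans(frames, reduced)
print(len(templates), len(reduced), labels)
```

The first pass finds three segments without knowing that the last one repeats the first; the comparison step merges them into two states, which then seed K-means.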

8
Automatic Music Audio Summary generation
HMM
Sequence chart
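
The final decoding step can be sketched with a small log-domain Viterbi implementation. The HMM parameters here are set by hand for illustration (two states, unit-variance Gaussian-like emissions around the values 0 and 5, a 0.9 probability of staying in a state); in the approach described, they would come from HMM learning initialized by K-means.

```python
import math

def viterbi(log_emit, log_trans, log_init):
    """Most likely state path. log_emit[t][s], log_trans[r][s], log_init[s]."""
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hand-set toy model (assumed values, not learned from data).
centers = [0.0, 5.0]
observations = [0.1, 0.0, 2.6, 0.2, 5.1, 4.9, 5.0]
log_emit = [[-0.5 * (x - c) ** 2 for c in centers] for x in observations]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_init = [math.log(0.5), math.log(0.5)]

path = viterbi(log_emit, log_trans, log_init)
print(path)
```

The third observation (2.6) is slightly closer to state 1 than to state 0, but the transition probabilities make a brief switch unlikely, so the decoded sequence keeps it in state 0. This smoothing is what distinguishes the HMM state sequence from frame-by-frame nearest-centroid labeling.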

9
Conclusion
  • Automatic generation of a music audio summary from
    signal analysis without using any other
    information
  • Consider the musical piece as a succession of
    states, each state representing somewhat similar
    information found in different parts of the
    piece.
  • Audio signal
  • Derive dynamic features representing time
    evolution of the energy content in various
    frequency bands.
  • From this observation, derive a representation of
    the music in terms of states.
  • Thanks!