Title: Computer Methods, Memory Models and Melodic Expectation
1 Computer Methods, Memory Models and Melodic Expectation
- Rob Turetsky
- MUS G6250 Music Cognition
- Prof. Alfred Lerdahl
- May 5, 2003
2 Talk Organization
- Motivation
- Melodic Expectations Overview
- Rule / Gestalt Based
- Automatic Structure Detection
- Memory Based
- Computational Model of Memory
- Theory based on Physiology
- Wrap-up and future directions
3 Motivation
- Personal Motivation
- Techno Clubs and Jazz Improv
- My research: Machine listening
- Applications
- Computer Composition Assistant
- Improved Feature Extraction Algorithms
- Blue Sky: Model and understand the mechanics of information representation and processing in the brain
4 Overview: Melodic Expectations
- Asks the question: what comes next?
- Schenker: Expectation and retrospection will reshape the meaning of what we hear
- Note: not "what should come next"
- Inherently tied with musical structure
- Three main camps
- Rules / Gestalt approaches
- Memory based / ecological approaches
- Physiological approach
- The main debate: What is innate vs. what is learned over time?
5 Common Themes
- Distinction between step and leap
- Narmour, Larson: Step = small interval, leap = large interval
- Lerdahl, Krumhansl: Defined by tonal models
- Gap filling / Skip reversal
- Rules: innate property of cognition
- Ecological: learned over time because of limitations of instrument range, playability
- Memory: learned over time because of exposure to music
- Reality: Probably a combination of all three
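The step/leap distinction and the gap-fill (skip reversal) expectation above can be sketched as a toy rule. This is only an illustration, not Narmour's or Larson's actual model: the 2-semitone threshold `STEP_MAX` and both function names are assumptions made for the example.

```python
STEP_MAX = 2  # assumed threshold in semitones; the talk does not specify one


def classify(interval):
    """Label a signed melodic interval (in semitones) as repeat, step, or leap."""
    size = abs(interval)
    if size == 0:
        return "repeat"
    return "step" if size <= STEP_MAX else "leap"


def gap_fill_expectation(prev_note, note):
    """Toy gap-fill rule: after a leap, expect motion back into the gap.

    Returns -1 (expect descent), +1 (expect ascent), or 0 (no expectation).
    """
    interval = note - prev_note
    if classify(interval) != "leap":
        return 0
    return -1 if interval > 0 else 1


# MIDI note numbers: C4 leaps up to G4, then steps back down into the gap.
melody = [60, 67, 65, 64]
labels = [classify(b - a) for a, b in zip(melody, melody[1:])]
```

A tonal-model definition of step and leap (the Lerdahl, Krumhansl view) would replace the fixed threshold with a lookup that depends on key and scale.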
6 Rules / Gestalt Approximations
- Attempt to model high level cognitive functions with rules
- Three main principles
- Heuristics (L&J, Narmour, Krumhansl, etc.)
- Physical Analogies (gravity, magnetism, inertia)
- Gestalt Principles (good continuation, etc.)
- Separation between style (top-down) and reflexive (bottom-up)
7 Expectation: Physical Analogies
- Larson 2002: Music as metaphor; tonic chords are like magnets, swing
- Operations on Alphabets
- Gravity, Magnetism, Inertia
8 Narmour: Gestalt generated rules
- Narmour (1990) breaks musical implication into 3 simple rules
- Similarity: A + A -> A
- Differentiation: A + B -> C
- Closure (nonformal)
- Syntactic parametric scale: an automatic input system that determines what is similar or different, closural or non-closural function, and the extent to which a melodic pattern is open or closed
- Similar or Different: based on interval size
9 Narmour: Archetypes
- Bases define five musical archetypes
- Process (P) or Iteration (I)
- Reversal (R)
- Registral Return (aba)
- Dyad
- Monad
- Also 5 archetypal derivatives
- Closure????
10 Narmour: Closure as Structure
- Closure: Termination, blunting, inhibition or weakening of the melodic implication
- Closure strongly linked with musical structure identification
- Rules of production: when closed, the initial and terminal tones of P & R lead to dyads that may imply P & R on a higher level
- Narmour promises: "One cannot apply any rule of the I-R model mechanistically"
- However, structure can be detected by computer
11 Structure: Why is it so tough to find?
Char/Word/Phrase Boundaries
Text
Video
Audio?
12 Audio Features 1: FFT
- Automatic Pitch Extraction / Transcription is an unsolved problem
- Use the FFT (Fast Fourier Transform)
- Idea: every audio signal is built up of sinusoids of different frequency and phase.
- What's cool about it: We can see f0 and the entire overtone series
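To make the "we can see f0 and the entire overtone series" point concrete, here is a minimal sketch: a textbook recursive radix-2 FFT applied to a synthetic harmonic tone. The sample rate, frame length, 250 Hz fundamental, and harmonic amplitudes are all assumptions chosen so that each partial falls exactly on an FFT bin; real recordings would need windowing and more careful peak picking.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


SR = 8000            # assumed sample rate in Hz
N = 256              # assumed frame length (bin spacing = SR / N = 31.25 Hz)
F0 = 250.0           # assumed fundamental, chosen to land exactly on bin 8
AMPS = [1.0, 0.5, 0.25]  # assumed amplitudes of f0 and its first two overtones

# Build one frame of a harmonic tone: a sum of sinusoids at multiples of F0.
frame = [
    sum(a * math.sin(2 * math.pi * (h + 1) * F0 * t / SR) for h, a in enumerate(AMPS))
    for t in range(N)
]

magnitude = [abs(c) for c in fft(frame)[: N // 2]]

# Bins holding real energy; everything else is floating-point noise here.
peaks = [k for k, m in enumerate(magnitude) if m > 1.0]
freqs = [k * SR / N for k in peaks]
```

In this idealized case `freqs` lists the fundamental followed by its overtones, which is the picture the slide describes.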
13 Comparing Timbre: MFCC
- What's the big idea?
- Model speech as source-filter
- Decorrelate feature components
- Simple harmonic series appears as single pitch pulse, multiple pitches are cloudy
- MFCCs can be used for timbre modeling (De Poli and Prandoni, 1997)
- Useful when wanting to compare instrumentation instead of pitches
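The MFCC pipeline can be sketched in a few steps: power spectrum, triangular mel filterbank, log, then a DCT to decorrelate the band energies. The version below is a simplified illustration, not the configuration used by De Poli and Prandoni. The filter count, coefficient count, frame length, sample rate, and the common 2595 * log10(1 + f/700) mel formula are all assumptions, and a naive DFT is used to keep the block self-contained.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..n/2 (fine for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters with centers equally spaced on the mel scale."""
    top = hz_to_mel(sr / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sr / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sr, n_filters=12, n_coeffs=6):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sr)
    # Small floor avoids log(0) in empty bands.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # DCT-II decorrelates the log band energies.
    return [
        sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters) for m, e in enumerate(log_e))
        for c in range(n_coeffs)
    ]


def tone(f0, amps, n=128, sr=8000):
    """Hypothetical test signal: harmonic tone with the given partial amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * (h + 1) * f0 * t / sr) for h, a in enumerate(amps))
        for t in range(n)
    ]


# Same pitch, different spectral envelopes ("instrumentation").
bright = mfcc(tone(250.0, [1.0, 0.8, 0.6, 0.4]), 8000)
dull = mfcc(tone(250.0, [1.0, 0.1]), 8000)
```

Because the coefficients summarize the spectral envelope rather than the exact partial frequencies, comparing vectors such as `bright` and `dull` (for example by Euclidean or cosine distance) is one way to compare timbre while largely ignoring pitch.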
14 The Similarity Matrix
- Pioneered by Foote, 2001
- Measure self similarity of every window in a song with every other window
- Theory: Windows of same section will have similar features. Windows of different sections will have different features.
- Off diagonal lines correspond to repeated sections
- Novelty Score: measure of newness; correlation with checkerboard matrix.
- Section breaks are peaks in the Novelty Score.
[Figure: similarity matrix over windows i and j, with entries cos(i, j), and the Novelty Score derived from it]
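The similarity matrix and novelty score can be sketched as follows, in the spirit of Foote's method but not a reproduction of it. The two-dimensional feature vectors, the two-section "song", and the kernel half-width are made-up assumptions; a real system would use per-window spectral features such as MFCCs and typically a tapered kernel.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(features):
    """sim[i][j] = cos(i, j) for every pair of windows."""
    return [[cosine(u, v) for v in features] for u in features]


def novelty_score(sim, half_width):
    """Correlate a checkerboard kernel along the main diagonal of sim.

    The kernel is +1 where both windows lie on the same side of the
    candidate boundary i and -1 where they lie on opposite sides.
    """
    n = len(sim)
    score = [0.0] * n
    for i in range(half_width, n - half_width + 1):
        total = 0.0
        for a in range(-half_width, half_width):
            for b in range(-half_width, half_width):
                sign = 1.0 if (a < 0) == (b < 0) else -1.0
                total += sign * sim[i + a][i + b]
        score[i] = total
    return score


# Assumed toy song: six windows of "section A" followed by six of "section B".
features = [(1.0, 0.1)] * 6 + [(0.1, 1.0)] * 6

sim = similarity_matrix(features)
novelty = novelty_score(sim, half_width=2)

# The section break is the peak of the novelty score.
boundary = max(range(len(novelty)), key=novelty.__getitem__)
```

Here `boundary` is the index of the first window of the new section. With several sections, one would pick all local maxima above a threshold instead of a single global peak.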
15 The Problem with Rules
- Is pitch explicitly recognized in the brain?
- Largely unsupported by experiments
- Where is the boundary between innate grouping principles and ecological?
- In other words, what depends on the limitations of instruments? On culture?
- Schenkerian Analysis: Also rules, but based on reducing everything to previously heard patterns.
16 Scheirer 96: Top Down vs. Bottom Up
17 Memory Models of Melodic Expectation
- Experimental (Von Hippel 2002): Musicians expect gap filling, non-musicians don't.
- Trained musicians innately build up heuristics that might be useful based on experience
- Experts learn by pattern classification
- Theoretical (Bod 2002): Memory captures things rules can't
- Classifier trained on folksong database outperforms rules based engine when good continuation and melodic structure disagree
18 Complementary Memory in Music
- Dowling et al. 2002: Recall of musical phrases improves over time (2-30 min)
- Short term (STM) and long term memory (LTM) both exposed to new information
- Over time, more weight is given to LTM
19 Short Term vs. Long Term Memory
- Long Term Memory (Neocortex)
- Stable, high capacity storage of knowledge
- Cued by semantics
- Short Term Memory (Hippocampus)
- Rapid storage of new memories
- Associative cues (fast recall, low capacity)
- Explicit memory: Episodic, semantic, encyclopedic, spatial
20 Catastrophic Interference - Example
- Focused learning will allow you to remember new facts fast, but can ruin relationships you've already built
21 McClelland 94: Why two memories?
- STM uses sequential learning
- Fast: Train each data point as it arrives
- Destructive: Catastrophic interference on already formed memories
- LTM uses interleaved learning
- Stable: Every new fact does not risk forming bad relationships
- Slow: To ensure stability, you must retrain entire network on every fact stored
- STM and LTM are complementary.
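The contrast between sequential and interleaved learning can be illustrated with a deliberately tiny model: a single linear unit trained by the delta rule on two overlapping associations. This is not McClelland's network; the input patterns, learning rate, and number of presentations are assumptions picked so that the effect is easy to see.

```python
def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train(examples, schedule, lr=0.2):
    """Delta-rule training of one linear unit, one example per step.

    `schedule` is the order in which example indices are presented.
    """
    w = [0.0] * len(examples[0][0])
    for idx in schedule:
        x, y = examples[idx]
        err = y - predict(w, x)
        w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w


def abs_error(w, example):
    x, y = example
    return abs(y - predict(w, x))


# Two "facts" whose inputs overlap in the middle feature.
fact_a = ((1.0, 1.0, 0.0), 1.0)
fact_b = ((0.0, 1.0, 1.0), -1.0)
examples = [fact_a, fact_b]

# Sequential: learn A completely, then learn B with no further exposure to A.
w_seq = train(examples, [0] * 100 + [1] * 100)

# Interleaved: same number of presentations, but A and B alternate.
w_int = train(examples, [0, 1] * 100)

seq_errors = (abs_error(w_seq, fact_a), abs_error(w_seq, fact_b))
int_errors = (abs_error(w_int, fact_a), abs_error(w_int, fact_b))
```

After sequential training the unit fits the newest fact but has drifted well away from the first one, because learning B changed the shared weight. Interleaved training reaches a weight vector that satisfies both facts.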
22 STM vs. LTM: machine learning models, signal flow
Perceptual Input
Teaches during downtime
Recalls as needed
Fast, Sequential: Hippocampus (Hopfield Net)
Slow, Interleaved: Neocortex (Pseudo-semantic Net)
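The fast, associative side of this diagram can be sketched as a classic Hopfield network: patterns are stored in one shot with a Hebbian outer-product rule and recalled from a partial or corrupted cue. The pattern length, the two stored patterns, and the function names below are assumptions made for illustration; the talk's actual model is not specified.

```python
def train_hopfield(patterns):
    """Hebbian storage: w[i][j] accumulates p[i] * p[j], with no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, probe, sweeps=5):
    """Asynchronous updates: each unit takes the sign of its weighted input."""
    s = list(probe)
    n = len(s)
    for _ in range(sweeps):
        for i in range(n):
            h = sum(w[i][j] * s[j] for j in range(n))
            s[i] = 1 if h >= 0 else -1
    return s


# Two orthogonal +/-1 patterns standing in for stored "memories".
memory_1 = [1, 1, 1, 1, -1, -1, -1, -1]
memory_2 = [1, -1, 1, -1, 1, -1, 1, -1]

weights = train_hopfield([memory_1, memory_2])

# Cue with a corrupted copy of memory_1 (first element flipped).
cue = [-1, 1, 1, 1, -1, -1, -1, -1]
recovered = recall(weights, cue)
```

Each pattern is stored after a single presentation (fast, sequential), and a noisy cue settles to the nearest stored pattern (associative recall). Capacity is low: storing many correlated patterns in a network this small would cause them to interfere.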
23 Physiological Models
- Problem with connectionist models: Brain is much more complex than simple machine learning structure
- Let's take a step back.
- Rules in the brain: theory and future
- Innate: There will be a low-level structure in the brain that performs this analysis
- Ecological: Expect to see a vast change because of the sequencer / sampler
- Learned: There will be vast differences across cultures
24 Analog in the visual world
- P. Sajda in the BME Department at Columbia!
25 Hastily Written Conclusion
- Gestalt and group principles can be implemented in intermediate perceptual circuits for (innate) bottom-up processing
- Memory models serve for top-down processing based on experience, such as heuristics or recognition of previously seen patterns
- Computers can be used to model these predictions in some way, shape or form.