An Introduction to Functional Data Analysis presentation

About This Presentation

Transcript and Presenter's Notes

Title: An Introduction to Functional Data Analysis

1
An Introduction to Functional Data Analysis

Jim Ramsay
McGill University

2
Overview

Well use three case studies to see what is meant
by functional data, and to consider some
important issues in the analysis of functional
data
Human growth data
US nondurable goods manufacturing index
Thirty years of Montreal weather

3
Human Growth From data to functions
4

We need repeated and regular access to subjects
for up to 20 years.
Height changes over the day, and must be measured
at a fixed time.
Height is measured in supine position in infancy,
followed by standing height. The change involves
an adjustment of about 1 cm.
Measurement error is about 0.5 cm in later years,
but is rather larger in infancy.
Measurements are not taken at equally spaced
points in time.

5
Challenges to functional modeling

We want smooth curves that fit the data as well
as is reasonable.
We will want to look at velocity and
acceleration, so we want to differentiate twice
and still be smooth.
In principle the curves should be monotone i.
e., have a positive derivative.

6
The monotonicity problem

The tibia of a newborn measured daily shows us
that over the short term growth takes places in
spurts.
This babys tibia grows as fast as 2 mm/day!
How can we fit a smooth monotone function?

7
Weighted sums of basis functions

We need a flexible method for constructing curves
to fit the data.
We begin with a set of basic functional building
blocks fk(t), called basis functions.
Our fitting function x(t) is a weighted sum of
these

8
B-splines for growth data

Order 4 splines look smooth, but their second
derivatives are rough.
We use order 6 B-splines because we want to
differentiate the result at least twice.
We place a knot at each of the 31 ages.
The total number of basis functions order
number of interior knots. 35 in this case.

9
Isnt using 35 basis functions to fit 31
observations a problem?

Yes. We will fit each observation exactly.
This will ignore the fact that the measurement
error is typically about 0.5 cm.
But well fix this up later, when we look at
roughness penalties.

10
Okay, lets see what happens

These two Matlab commands define the basis and
fit the data
hgtbasis
create_bspline_basis(1,18, 35, 6, age)
hgtfd
data2fd(hgtfmat, age, hgtbasis)

11
Why we need to smooth

Noise in the data has a huge impact on derivative
estimates.

12
Please let me smooth the data!

This command sets up 12 B-spline basis functions
defined by equally spaced knots. This gives us
about the right amount of fitting power given the
error level.
hgtbasis
create_bspline_basis(1,18, 12, 6)

These are velocities are much better.
They go negative on the right, though.

14
Lets see some accelerations

These acceleration curves are too unstable at the
ends.
We need something better.

15
A measure of roughness

What do we mean by smooth?
A function that is smooth has limited curvature.
Curvature depends on the second derivative. A
straight line is completely smooth.

16
Total curvature

We can measure the roughness of a function x(t)
by integrating its squared second derivative.
The second derivative notation is D2x(t).

17
Total curvature of acceleration

Since we want acceleration to be smooth, we
measure roughness at the level of acceleration

18
The penalized least squares criterion

We strike a compromise between fitting the data
and keeping the fit smooth.

19
How does this control roughness?

Smoothing parameter ? controls roughness.
When ? 0, only fitting the data matters.
But as ? increases, we place more and more
emphasis on penalizing roughness.
As ? ? 8, only roughness matters, and functions
having zero roughness are used.

We can either smooth at the data fitting step, or
smooth a rough function.
This Matlab command smooths the fit to the data
obtained using knots at ages. The roughness of
the fourth derivative is controlled.
lambda 0.01
hgtfd smooth_fd(hgtfd, lambda, 4)

21
Accelerations using a roughness penalty

These accelerations are much less variable at the
extremes.

22
The corresponding velocities look good, too
23
How did you choose ??

We smooth just enough to obtain tolerable
roughness in the estimated curves (accelerations
in this case), but not so much as to lose
interesting variation.
There are data-driven methods for choosing ?, but
they offer only a reasonable place to begin
exploring.
But this is inevitably involves judgment.

24
What about monotonicity?

The growth curves should be monotonic.
The velocities should be non-negative.
Its hard to prevent linear combinations of
anything from breaking rules like monotonicity.
We need an indirect approach to constructing a
monotonic model.

25
A differential equation for monotonicity

Any strictly monotonic function x(t) must satisfy
a simple linear differential equation

The reason is simple Because of strict
monotonicity, the first derivative Dx(t) will
never be 0, and function w(t) is therefore
simply D2x(t)/Dx(t).
26
The solution of the differential equation

Consequently, any strictly monotonic function
x(t) must be expressible in the form

This suggests that we transform the monotone
smoothing problem into one of estimating function
w(t), and constants ß0 and ß1.
27
What we have learned from the growth data

We can control smoothness by either using a
restricted number of basis functions, or by
imposing a roughness penalty.
Roughness penalty methods generally work better
than simple basis expansions.
Differential equations can play a useful role in
defining constrained functions.

28
Phase-Plane Plotting the Nondurable Goods Index
29

Nondurable goods last less than two years Food,
clothing, cigarettes, alcohol, but not personal
computers!!
The nondurable goods manufacturing index is an
indicator of the economics of everyday life.
The index has been published monthly by the US
Federal Reserve Board since 1919.
It complements the durable goods manufacturing
index.

30
What we want to do

Look at important events.
Examine the overall trend in the index.
Have a look at the annual or seasonal behavior of
the index.
Understand how the seasonal behavior changes over
the years and with specific events.

31
The log nondurable goods index
32
Events and Trends

Short term
1929 stock market crash
1937 restriction of money supply
1974 end of Vietnam war, OPEC oil crisis
Medium term
Depression
World War II
Unusually rapid growth 1960-1974
Unusually slow growth 1990 to present
Long term increase of 1.5 per year

33
The evolution of seasonal trend

We focus on the years 1948 to 1999
We estimate long- and medium-term trend by spline
smoothing, but with knots too far apart to
capture seasonal trend
We subtract this smooth trend to leave only
seasonal trend

34
Smoothing the data
We want to represent the data yj by a smooth
curve x(t). The curve should have at least two
smooth derivatives. We use spline smoothing,
penalizing the size of the 4th derivative.
A function Pspline in S-PLUS is available by ftp
from ego.psych.mcgill.ca/pub/ramsay/FDAfuns
35
Three years of typical trend 1964-1966
36
Seasonal Trend

Typically three peaks per year
The largest is in the fall, peaking at the
beginning of October
The low point is mid-December

37
Non-seasonal trend is in red
38
Seasonal trend data nonseasonal trend
39
Phase-Plane Plots

Looking at seasonal trend itself does not reveal
as much as looking at the interplay between
Velocity or its first derivative, reflecting
kinetic energy in the system.
Acceleration or its second derivative, reflecting
potential energy.
The phase-plane diagram plots acceleration
against velocity.
For purely sinusoidal trend, the plot would be an
ellipse.

40
Position of a swinging pendulum
41
Phase-plane plot for pendulum
42
Phase-plane plot for 1964

There are three large loops separated by two
small loops or cusps
Spring cycle mid-January into April
Summer cycle May through August
Fall cycle October through December

43
A look at the years 1929-1931.
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
1929 through 1931

The stock market crash shows up as a large
negative surge in velocity.
Subsequent years nearly lose the fall production
cycle, as people tighten their belts and spend
less at Christmas.

48
What happened in 1937-1938?
49
(No Transcript)
50
(No Transcript)
51
(No Transcript)
52
1937 and 1938

The Treasury Board, fearing that the economy was
becoming overheated again, clamped down on the
money supply. The effect was catastrophic, and
nearly wiped out the fall cycle.
This new crash was even more dramatic than that
of 1929, but was forgotten because of the
outbreak of World War II.

53
What about World War II?
54
(No Transcript)
55

During World War II, the seasonal cycle became
very small, since the war, and the production
that fed it, lasted all year long.
Now look at three pivotal years, 1974 to 1976,
when the Vietnam War ended and the OPEC oil
crisis happened. Watch the shrinking of the fall
cycle.

56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
What about today?
60
(No Transcript)
61
These days

Over the last ten years the size of all three
cycles have become much smaller.
Why?
Is variation now smoothed out by information
technology?
Are the aging baby boomers spending less?
Are personal computers, video games, and other
electronic goods really durable?
Has manufacturing now moved off shore?

62
Conclusions

We can separate long- and medium-term trends from
seasonal trends by smoothing.
Phase-plane plots are great ways to inspect
seasonality.
Derivatives were used in two ways to penalize
roughness, and to reflect the dynamics of
manufacturing.

63
Trends in Seasonality

We see by inspection that seasonal trends change
systematically over time, and can also change
abruptly.
We first estimate the principal components of
seasonal variation, using a version of principal
components analysis adapted to functional data,
and sensitive only to effects periodic over one
year.

64
(No Transcript)
65
(No Transcript)
66
(No Transcript)
67
The Components

Relative sizes of spring and summer cycles (53)
Joint size of spring and summer cycles (25)
Size of fall cycle (11)

68
Plotting Component Scores

We can compute scores at each year for these
three principal components, sometimes called
empirical orthogonal functions.
Plotting the evolution of these scores over the
51 years shows some interesting structural
changes in the economics of everyday life.

69
(No Transcript)
70
(No Transcript)
71
Wrap-up

Phase-plane plots are good for inspecting
seasonal quasi-harmonic trends
Principal components analysis reveals main
components of variation in seasonal trend.
Plotting component scores shows how trend has
evolved.

This was joint with work with James B. Ramsey,
Dept. of Economics, New York University, and is
reported in
Ramsay, J. O. and Ramsey, J. B. (2001) Functional
data analysis of the dynamics of the monthly
index of non-durable goods production. Journal
of Econometrics, 107, 327-344.

73
Phase and Amplitude Variation in Montreal Weather

34 years of daily temperatures, 1961-1994
inclusive
Values are averages of daily maximum and minimum
12410 observations in tenths of a degree Celsius
Available for Montreal and 34 other Canadian
weather stations

74
(No Transcript)
75

We know that there are two kinds of variation in
these data
Amplitude variation day-to-day and year-to-year
variation in temperature at events such as the
depth of winter.
Phase variation the timing of these events --
the seasons arrive early in some years, and late
in others.

76
Goals

Separate phase variation from amplitude variation
by registering the series to its strictly
periodic image.
Estimate components of variation due to amplitude
and phase variation.

77
Smoothing

The registration process requires that we smooth
the data two ways
With an unconstrained smooth that removes the
day-to-day variation, but leaves longer-term
variation unchanged.
With a strictly periodic smooth that eliminates
all but strictly periodic trend.

78
Unconstrained smooth

Raw data are represented by a B-spline expansion
using 500 basis functions of order 6.
Knot about every 25 days.
The standard deviation of the raw data about this
smooth, adjusted for degrees of freedom, is 1.52
degrees Celsius.

79
(No Transcript)
80
Periodic smooth

The basis is Fourier, with 9 basis functions
judged to be enough to capture most of the
strictly periodic trend for a period of one year.
The standard deviation of the raw about data
about this smooth is 2.18 deg C.
Compare this to 2.07 deg C. for the unconstrained
smooth.

81
(No Transcript)
82

Plotting the unconstrained B-spline smooth minus
the constrained Fourier smooth reveals some
striking discrepancies.
We focus on Christmas, 1989. The Ramsays spent
the holidays in a chalet in the Townships, and
awoke to 37 deg C. No skiing, car dead,
marooned!
This temperature would still be cold in
mid-January, but less unusual.

83
(No Transcript)
84
Registration

Let the unconstrained smooth be x(t) and the
strictly periodic smooth be x0(t).
We need to estimate a nonlinear strictly
increasing smooth transformation of time h(t),
called a warping function, such that a fitting
criterion is minimized.

85
Fitting criterion
The fitting criterion was the smallest eigenvalue
of the matrix
This criterion measures the extent to which a
plot of xh(t) against x0(t) is linear, and thus
whether the two curves are in phase.
86
The warping function h(t)

Every smooth strictly monotone function h(t) such
that h(0) 0 can be represented as

We represent unconstrained function w(v) by a
B-spline expansion. Constant C is determined by
constraint h(T) T.
87
The deformation u(t) h(t) - t

Plotting this allows us to see when the seasons
come early (negative deformation) or late
(positive deformation).

88
(No Transcript)
89

Mid-winter for 1989-1990 arrived about 25 days
early.
The next step is to register the temperature data
by computing x(t) xh(t). The registered
curve x(t) contains only amplitude variation.
Registration was done by Matlab function
registerfd, available by ftp from
ego.psych.mcgill.ca/pub/ramsay/FDAfuns

90
(No Transcript)
91
Amplitude variation

The standard deviation of the difference between
the unconstrained smooth and the strictly
periodic smooth is 2.15 C.
The standard deviation of the difference between
the registered smooth and the period smooth is
1.73 C.
(2.152 1.732)/2.152 .35, the proportion of
the variation due to phase.

The standard deviation of the raw data around the
registered smooth is 2.13 C, compared with 2.07 C
for the unregistered smooth.
About 10 of the total variation is due to phase.

93
Conclusions

Phase variation is an important part of weather
behavior.
Statisticians seldom think about phase variation,
and classical time series methods ignore it
completely.
Phase variation needs more attention, and
registration is an essential tool.

94
Unique Aspects of Functional Data Analysis

The data are smooth, so we can use derivatives in
various ways.
Differential equations can play a big role.
Events in functional data occur over different
time scales.
Time itself may be an elastic medium, and vary
over functional observations.

95
Finding out More

Ramsay, J. O. and Silverman, B. W. (1997, 2004)
Functional Data Analysis. Springer.
Ramsay, J. O. and Silverman, B. W.
(2002) Applied Functional Data Analysis. Springer
Visit the FDA website www.psych.mcgill.ca/misc/fd
a/
Software in Matlab, R and S-PLUS available at
ego.psych.mcgill.ca/pub/ramsay

Write a Comment

User Comments (0)

About PowerShow.com

An Introduction to Functional Data Analysis PowerPoint PPT Presentation