1
Methods for selecting extreme events in multivariate data
Alberto Bernacchia
Dip. di Fisica "E. Fermi", Università di Roma "La Sapienza"
2
Summary
  • How to deal with extremes and multivariate data?
  • Extreme Value Theory is well established for
    univariate data.
  • For multivariate data, there is no finite
    parameterization of extreme value distributions
    (Coles 2001)
  • One possible approach is dimensionality
    reduction. Principal Component Analysis (PCA) is
    a widely used technique, but it is driven mostly
    by the center of the density rather than by its
    extremes
  • Recently, Nonlinear PCA (NLPCA) has been applied
    to geophysical data (Hsieh 2001), but the results
    are neither robust nor reproducible
  • I define a new method for detecting extreme
    events (only) of multivariate densities,
    eliminating the drawbacks of NLPCA

3
Nonlinear PCA
Kramer (1991)
  • It determines low-dimensional nonlinear
    representations of data
  • The projection over the nonlinear manifold is
    continuous
  • It reduces to PCA by tuning a control parameter

Hsieh (2001)
Feed-forward neural network
Bottleneck
Cost function
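The bottleneck architecture named on this slide can be sketched as follows. This is a minimal illustration, assuming tanh hidden layers, a single linear bottleneck unit and a mean squared reconstruction error as the cost function; the layer sizes, the weight initialisation and all function names are hypothetical, and training (minimising the cost) is left out.

```python
import numpy as np

def init_params(n_in, n_hidden, rng):
    """Small random weights for an encoder/bottleneck/decoder network (hypothetical sizes)."""
    s = 0.1
    return {
        "W1": s * rng.standard_normal((n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": s * rng.standard_normal((n_hidden, 1)),    "b2": np.zeros(1),
        "W3": s * rng.standard_normal((1, n_hidden)),    "b3": np.zeros(n_hidden),
        "W4": s * rng.standard_normal((n_hidden, n_in)), "b4": np.zeros(n_in),
    }

def forward(params, X):
    """Map data X (N, n_in) to the 1-D bottleneck variable u and back to data space."""
    h1 = np.tanh(X @ params["W1"] + params["b1"])    # nonlinear encoding layer
    u = h1 @ params["W2"] + params["b2"]             # bottleneck: one value per sample
    h2 = np.tanh(u @ params["W3"] + params["b3"])    # nonlinear decoding layer
    X_hat = h2 @ params["W4"] + params["b4"]         # reconstruction in data space
    return u, X_hat

def cost(params, X):
    """Mean squared reconstruction error (assumed form of the cost function)."""
    _, X_hat = forward(params, X)
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))          # synthetic stand-in for the data
params = init_params(n_in=3, n_hidden=4, rng=rng)
u, X_hat = forward(params, X)
J = cost(params, X)
```

Minimising `cost` over the weights is a non-convex problem, so different initial weights can end in different minima.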
4
Nonlinear PCA
Hsieh (2001)
ENSO Sea Surface Temperatures
5
Nonlinear PCA
Hsieh (2001)
Lorenz attractor
6
Drawbacks
  • To be computationally tractable, it requires a
    preliminary dimensionality reduction, which
    discards part of the information

7
Drawbacks
  • Even at reduced computational cost, the global
    solution (the global minimum of the cost
    function) is often not reached, so results
    cannot be rigorously reproduced

Local minima
Lorenz attractor
8
Drawbacks
  • The solution depends strongly on two parameters,
    leaving a fundamental ambiguity open

Christiansen (2005)
NH wintertime geopotential height
9
Summary 2
  • Nonlinear PCA (NLPCA) determines a
    low-dimensional (highly informative) nonlinear
    fit to data, but it is ambiguous and fragile
  • I want a robust dimensionality reduction method
    that accounts not for the entire body of the
    data but only for the extremes, i.e. a method
    able to detect a subspace of the data where the
    extreme events are dense

10
The Generating Function for detecting extremes
Estimated Generating Function
Working assumption: the vectors maximizing G, at
fixed modulus, are the directions of the extreme
events
Argument of G: a modulus times a unit vector
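The working assumption can be tried out numerically. The sketch below assumes that the estimated generating function is the logarithm of the sample mean of exp(y n·x), with y the modulus and n the unit vector; that form, the projected gradient ascent used to maximise it, and all function names and data are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def generating_function(X, y, n):
    """log of the sample mean of exp(y * n.x); X is (N, d), n is a unit vector."""
    z = y * (X @ n)
    m = z.max()
    return m + np.log(np.mean(np.exp(z - m)))    # numerically stable log-mean-exp

def maximize_direction(X, y, n0, steps=500, lr=0.1):
    """Projected gradient ascent over unit vectors, at fixed modulus y."""
    n = n0 / np.linalg.norm(n0)
    for _ in range(steps):
        z = y * (X @ n)
        w = np.exp(z - z.max())
        w /= w.sum()                 # weights: points with large projections dominate
        grad = y * (w @ X)           # gradient of G with respect to n
        grad -= (grad @ n) * n       # keep only the component tangent to the sphere
        n = n + lr * grad
        n /= np.linalg.norm(n)       # project back onto the unit sphere
    return n

rng = np.random.default_rng(0)
N = 5000
# Synthetic 2-D data: coordinate 0 has a long right tail, coordinate 1 is Gaussian.
X = np.column_stack([rng.exponential(1.0, N) - 1.0, rng.standard_normal(N)])
X -= X.mean(axis=0)                  # enforce mean 0
n_start = np.array([1.0, 1.0])
n_best = maximize_direction(X, y=0.8, n0=n_start)
```

Both coordinates have variance close to 1, so the covariance barely separates them, while the maximiser of G turns towards the long-tailed coordinate. Different starting vectors can converge to different local maxima.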
11
The Generating Function for detecting extremes
Expansion in cumulants of the projection (mean = 0):
variance, skewness, kurtosis
Small y → PCA
Large y → finite-size effects (few data points
contribute to the sum)
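The expansion named on this slide can be written out as below, assuming that the estimated generating function is the logarithm of the sample mean of exp(y n̂·x). The symbols N (number of data points), x_i (data vectors), n̂ (unit vector) and κ_k (k-th cumulant of the projection n̂·x) are notation introduced here for illustration.

```latex
G(y\,\hat{n})
  = \log\!\left[\frac{1}{N}\sum_{i=1}^{N} e^{\,y\,\hat{n}\cdot x_i}\right]
  = \frac{y^{2}}{2}\,\kappa_2
  + \frac{y^{3}}{6}\,\kappa_3
  + \frac{y^{4}}{24}\,\kappa_4
  + O(y^{5}),
\qquad \kappa_1 = 0 .
```

Here κ_2 is the variance of the projection, κ_3 is its skewness times σ³ and κ_4 is its excess kurtosis times σ⁴, with σ² = κ_2. For small y the quadratic term dominates, so maximising G over n̂ amounts to maximising the projected variance, which is the first principal component.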
12
The Generating Function for detecting extremes
The value of y is fixed by a tolerance on the
finite-size error e2
For uncorrelated Normal data, it is given by
For exponentially correlated Normal data, one
must solve
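One way to see how a tolerance can fix y is the following textbook relation, offered only as a sketch. It assumes that e2 denotes the relative variance of the sample mean of exp(y z) for N independent standard Normal values z, which equals (exp(y²) - 1) / N; whether this is the definition used on the slide is an assumption, and the sample size of 600 is just an illustrative number.

```python
import numpy as np

def relative_error_sq(y, n_samples):
    """Relative variance of the sample mean of exp(y*z), with z ~ N(0, 1) i.i.d."""
    return np.expm1(y ** 2) / n_samples

def modulus_for_tolerance(tol_sq, n_samples):
    """Invert the relation above: the y at which the relative variance equals tol_sq."""
    return np.sqrt(np.log1p(tol_sq * n_samples))

# Illustrative numbers only: tolerance 0.1 and 600 independent samples.
y_tol = modulus_for_tolerance(0.1, 600)
check = relative_error_sq(y_tol, 600)    # recovers the tolerance
```

Under this assumption y grows only like the square root of log N, so even a large data set permits a moderate modulus. Correlated data carry fewer independent samples, which is why a separate condition is needed in that case.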
13
An application: El Niño Southern Oscillation
Sea Surface Temperatures (monthly anomalies),
1949-1999
27 x 9 grid (step 5°)
14
An application: El Niño Southern Oscillation
Exponential correlations over short periods
15
An application: El Niño Southern Oscillation
Set e2 = 0.1
y = 2.1
Two solutions: El Niño and La Niña
Note: the method is applied to the entire space,
and not just to the first two principal components
(PC1 and PC2)
16
An application: El Niño Southern Oscillation
Comparison with the NLPCA solution
La Niña
El Niño
Hsieh (2001)
17
An application: El Niño Southern Oscillation
El Niño
NLPCA, Hsieh (2001)
Maximum of Generating Function
18
An application: El Niño Southern Oscillation
La Niña
NLPCA, Hsieh (2001)
Maximum of Generating Function
19
Generalized Extreme Value fit of projected data
60 blocks, 10 data points each
El Niño: ξ = -0.15 ± 0.21
La Niña: ξ = -0.30 ± 0.13
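A block-maxima fit of this kind can be sketched with SciPy. The example assumes 600 projected values split into 60 blocks of 10, as on the slide, but uses synthetic Gaussian numbers in place of the projected temperature data; `block_maxima` is a hypothetical helper. Note that SciPy's `genextreme` shape parameter `c` has the opposite sign to the usual GEV shape ξ.

```python
import numpy as np
from scipy.stats import genextreme

def block_maxima(series, n_blocks, block_size):
    """Split a 1-D series into consecutive blocks and return the maximum of each block."""
    used = np.asarray(series)[: n_blocks * block_size]
    return used.reshape(n_blocks, block_size).max(axis=1)

rng = np.random.default_rng(0)
projected = rng.standard_normal(600)     # synthetic stand-in for data projected on a direction
maxima = block_maxima(projected, n_blocks=60, block_size=10)

# Maximum-likelihood GEV fit; SciPy's shape c corresponds to -xi.
c, loc, scale = genextreme.fit(maxima)
xi = -c
```

A negative ξ corresponds to a distribution with a finite endpoint, while ξ > 0 indicates a heavy tail. With only 60 block maxima the uncertainty on ξ is large, so the sign is meaningful only relative to its error bar.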
20
Conclusions
  • Interesting directions in the space of data are
    given by the local maxima of the (biased)
    generating function in n dimensions
  • These directions are assumed to point towards
    the extreme events of the underlying (if any)
    stationary probability distribution
  • The method is computationally cheap, and has no
    free parameters (once the tolerance error is
    fixed)
  • La Niña is characterized by a finite lower bound
    of temperatures.
  • El Niño seems to have a finite tail, but the
    result is only weakly significant

21
Application to Lorenz attractor
22
Application to Lorenz attractor
Set e2 = 0.1
y = ?
two solutions
23
Application to Lorenz attractor
Set e2 = 0.1
y = 3.7
two solutions
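Test data on the Lorenz attractor can be generated as sketched below. The slides do not state the parameters or the integration scheme, so the classical values (sigma = 10, rho = 28, beta = 8/3), the fourth-order Runge-Kutta step, the step size and the initial condition are all assumptions made for illustration.

```python
import numpy as np

def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classical parameter values assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate_lorenz(state0, dt=0.01, n_steps=5000):
    """Fixed-step fourth-order Runge-Kutta integration; returns an (n_steps, 3) array."""
    traj = np.empty((n_steps, 3))
    state = np.asarray(state0, dtype=float)
    for i in range(n_steps):
        k1 = lorenz_rhs(state)
        k2 = lorenz_rhs(state + 0.5 * dt * k1)
        k3 = lorenz_rhs(state + 0.5 * dt * k2)
        k4 = lorenz_rhs(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[i] = state
    return traj

trajectory = integrate_lorenz([1.0, 1.0, 1.0])
anomalies = trajectory - trajectory.mean(axis=0)   # mean 0 in each coordinate
```

Successive points of such a trajectory are strongly correlated, so the number of effectively independent samples is much smaller than the number of time steps.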
24
Drawbacks
  • For nearly isotropic data, it gives a poor fit
    even for normal distributions

Christiansen (2005)
25