EXTREME-VALUE ANALYSIS: FOCUSING ON THE FIT AND THE CONDITIONS, WITH HYDROLOGICAL APPLICATIONS



1
EXTREME-VALUE ANALYSIS: FOCUSING ON THE FIT AND
THE CONDITIONS, WITH HYDROLOGICAL APPLICATIONS
Dávid Bozsó, Pál Rakonczai, András Zempléni
Eötvös Loránd University, Budapest
4th Conference on Extreme Value Analysis:
Probabilistic and Statistical Models and their
Applications
2
Table of contents
  • Goodness of fit procedures
  • Checking the conditions D, D(u_n) and D'(u_n)
  • Multivariate problems
  • Copulas
  • Simulations
  • Goodness of fit tests for copulas
  • Time dependence
  • Hydrological applications

3
Generalized Pareto distribution
  • Peaks over a sufficiently high threshold u can be
    modeled by the generalized Pareto distribution
    (under mild conditions)

  • Appropriate threshold selection is very important
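
A minimal sketch of the peaks-over-threshold fit described on this slide. The data are synthetic (a Gumbel sample standing in for a daily series, not the Tisza record), and the 95% quantile threshold is an arbitrary illustrative choice. `scipy.stats.genpareto.fit` performs a maximum-likelihood fit; the location is fixed at 0 because the excesses are already shifted by the threshold.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
# Synthetic stand-in for a daily series (hypothetical data).
x = rng.gumbel(loc=300.0, scale=60.0, size=20000)

u = np.quantile(x, 0.95)      # illustrative threshold: the 95% quantile
excess = x[x > u] - u         # peaks over the threshold, shifted to start at 0

# Maximum-likelihood GPD fit with the location parameter fixed at 0.
shape, loc, scale = genpareto.fit(excess, floc=0)
print(f"u={u:.1f}  exceedances={excess.size}  shape={shape:.3f}  scale={scale:.1f}")
```

For a Gumbel-type tail the fitted shape should be close to zero; a clearly positive shape would indicate a heavier tail.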

4
Goodness of fit in univariate threshold models
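  • Usual goodness-of-fit tests (chi-squared,
    Kolmogorov-Smirnov) are not sensitive to
    discrepancies in the tails
  • A better alternative is the Anderson-Darling test
  • $A^2 = n \int_{-\infty}^{\infty} \frac{(F_n(x) - F(x))^2}{F(x)\,(1 - F(x))}\, dF(x)$,
  • where the discrepancies near the tails get
    larger weights. Its computation:
    $A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\left[\log z_i + \log(1 - z_{n+1-i})\right]$,
    with $z_i = F(x_{(i)})$ for the ordered sample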

5
Goodness of fit - continued
  • Modification: often the focus is on one tail only
  • For maxima: upper-tail version of the statistic
    and its computation (Zempléni, 2004)
  • Critical values can be simulated (as in
    Choulakian and Stephens, 2001)
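
The formulas on these two slides did not survive in the transcript, so the following is a hedged sketch rather than the authors' exact definition. It computes the classical Anderson-Darling statistic and an upper-tail variant that uses the weight 1/(1 - F) in place of 1/(F(1 - F)); treating this variant as the "modification for maxima" is an assumption. The input `z` holds the fitted CDF values of the sample.

```python
import numpy as np

def ad_statistics(z):
    """Return (A2, B2) for z_i = F(x_i), each strictly between 0 and 1.

    A2: classical Anderson-Darling statistic.
    B2: upper-tail variant, n * integral of (F_n - F)^2 / (1 - F) dF
        (assumed form of the one-sided modification).
    """
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    i = np.arange(1, n + 1)
    a2 = -n - np.mean((2 * i - 1) * (np.log(z) + np.log1p(-z[::-1])))
    b2 = n / 2 - 2 * z.sum() - np.mean((2 * (n - i) + 1) * np.log1p(-z))
    return a2, b2

a2, b2 = ad_statistics(np.random.default_rng(0).random(200))
print(f"A2={a2:.3f}  B2={b2:.3f}")
```

Because the parameters are estimated from the same data, critical values have to be simulated under the fitted model, as the slide notes.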

6
Finding thresholds
  • Theoretical results related to the GPD are doubly
    asymptotic, since not only the sample size but
    also the threshold has to tend to infinity
  • How can we find suitable thresholds?
  • Suggestion:
  • Increase the threshold level step by step
  • Fit the GPD (by the ML method, for example) and
    perform AD-type tests in each case
  • Select some levels for which the fit is
    acceptable
  • For more details, see Bozsó et al. (2005)
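
The stepwise procedure above can be sketched as follows. This is an illustration under stated assumptions, not the implementation of Bozsó et al.: the test statistic is the upper-tail AD variant with weight 1/(1 - F), the p-value comes from a small parametric bootstrap, and the function names `gpd_fit_pvalue` and `scan_thresholds` are hypothetical. The data are synthetic exponential values, for which the GPD holds exactly above any threshold.

```python
import numpy as np
from scipy.stats import genpareto

def upper_tail_ad(z):
    # Upper-tail AD-type statistic; z is clipped to keep the logarithm finite.
    z = np.sort(np.clip(z, 1e-12, 1 - 1e-12))
    n = z.size
    i = np.arange(1, n + 1)
    return n / 2 - 2 * z.sum() - np.mean((2 * (n - i) + 1) * np.log1p(-z))

def gpd_fit_pvalue(excess, n_boot, rng):
    """ML fit of the GPD plus a parametric-bootstrap p-value of the AD-type test."""
    c, _, s = genpareto.fit(excess, floc=0)
    stat = upper_tail_ad(genpareto.cdf(excess, c, scale=s))
    boot = np.empty(n_boot)
    for b in range(n_boot):
        sim = genpareto.rvs(c, scale=s, size=excess.size, random_state=rng)
        cb, _, sb = genpareto.fit(sim, floc=0)   # refit, as for the real data
        boot[b] = upper_tail_ad(genpareto.cdf(sim, cb, scale=sb))
    return c, s, float(np.mean(boot >= stat))

def scan_thresholds(x, thresholds, n_boot=50, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    rows = []
    for u in thresholds:
        excess = x[x > u] - u
        c, s, p = gpd_fit_pvalue(excess, n_boot, rng)
        rows.append((float(u), int(excess.size), c, s, p))
    return rows

rng = np.random.default_rng(2)
x = rng.exponential(scale=50.0, size=3000)       # hypothetical data
rows = scan_thresholds(x, np.quantile(x, [0.90, 0.95]), n_boot=30, rng=rng)
for u, n_exc, c, s, p in rows:
    print(f"u={u:7.1f}  n={n_exc:4d}  shape={c:6.3f}  scale={s:6.1f}  p={p:.2f}")
```

Thresholds whose p-value stays above the chosen significance level are the candidates for an "acceptable" fit; a realistic study would use far more bootstrap replications.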

7
Hydrological applications
  • Daily water level data from several stations
    along the river Tisza were given (time span more
    than 100 years)
  • As an illustration we have chosen Szeged station,
    but in fact we have repeated the suggested
    procedures (almost) automatically for all the
    stations
  • In later parts of the talk we shall also use data
    from Csenger (river Szamos)

8
Finding thresholds
9
Focusing on the conditions
  • So far (for iid data):
      • Threshold selection
      • Fit a GPD model to the data over the selected
        threshold
  • But dependence is present:
      • Possible long-range dependence?
      • Are the return levels affected by it?

10
Conditions D and D(u_n)
11
How to check condition D?
  • Set p = 1 and r = 1 in the definition of condition D
    and choose the threshold u as the level of interest,
    e.g. 400 or 430 cm in our example
  • Calculate d(l)
    for each lag l = 1, ..., 1000
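
The formula for d(l) is missing from the transcript. One plausible reading, assuming p = r = 1 and stationarity, is the absolute difference between the joint probability that two observations l steps apart both stay below u and the product of the two marginal probabilities. The sketch below estimates that quantity empirically; the AR(1) coefficient 0.9 and the simulated series are hypothetical comparison sequences.

```python
import numpy as np

def d_lag(x, u, max_lag):
    """Empirical |P(X_t<=u, X_{t+l}<=u) - P(X_t<=u) P(X_{t+l}<=u)|, l = 1..max_lag."""
    below = np.asarray(x) <= u
    out = np.empty(max_lag)
    for l in range(1, max_lag + 1):
        a, b = below[:-l], below[l:]
        out[l - 1] = abs(np.mean(a & b) - a.mean() * b.mean())
    return out

rng = np.random.default_rng(1)
n = 20000
iid = rng.standard_normal(n)

# AR(1) comparison series with a hypothetical coefficient of 0.9.
eps = rng.standard_normal(n)
ar = np.empty(n)
ar[0] = eps[0]
for t in range(1, n):
    ar[t] = 0.9 * ar[t - 1] + eps[t]

# Threshold at the 83% quantile, mirroring the 430 cm level on the next slide.
d_iid = d_lag(iid, np.quantile(iid, 0.83), 50)
d_ar = d_lag(ar, np.quantile(ar, 0.83), 50)
print(f"lag 1: iid={d_iid[0]:.4f}  AR(1)={d_ar[0]:.4f}")
```

Repeating the simulation many times gives a sample mean and a 95% quantile curve for each reference sequence, against which the curve of the observed data can be compared.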

12
Applications: daily water level data
  • 400 cm: 80% quantile
  • 430 cm: 83% quantile
  • Compare with d(l) for well-known sequences:
  • iid, normally distributed sequence
  • AR(1) series

13
Applications: daily water level data
  • Hydrological data (level 430 cm)
  • Normal iid sequences: sample mean, 95% quantile
  • AR(1) sequences: sample mean, 95% quantile
  • The simulation study confirms our hypothesis: the
    empirical data lie within the 95% confidence interval

14
Condition D'(u_n)
  • Practical procedure: select a sequence (u_n),
    calculate the sum in the definition of D'(u_n)
  • and plot it as a function of k
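
The expression to be calculated is missing from the transcript. In the standard formulation (Leadbetter et al., 1983) the anti-clustering condition concerns the sum n Σ_{j=2}^{[n/k]} P(X_1 > u_n, X_j > u_n), which should become small as k grows. The sketch below estimates that sum from a long stationary series; taking this as the plotted quantity is an assumption, as are the block length n = 1000 and the 99% quantile threshold. The second series, the running maximum of neighbouring iid normals, is a simple sequence with clustered exceedances.

```python
import numpy as np

def dprime_sum(x, u, n, ks):
    """Estimate n * sum_{j=2}^{[n/k]} P(X_1 > u, X_j > u) for each k in ks."""
    exc = np.asarray(x) > u
    max_lag = n // min(ks) - 1
    # Joint exceedance probability at lag h = j - 1, estimated over the series.
    joint = np.array([np.mean(exc[:-h] & exc[h:]) for h in range(1, max_lag + 1)])
    cum = np.concatenate(([0.0], np.cumsum(joint)))
    return np.array([n * cum[n // k - 1] for k in ks])

rng = np.random.default_rng(3)
x = rng.standard_normal(200001)
iid = x[:-1]
ymax = np.maximum(x[:-1], x[1:])        # Y_t = max(X_t, X_{t+1})

ks = np.arange(1, 21)
n = 1000                                 # hypothetical block length
dp_iid = dprime_sum(iid, np.quantile(iid, 0.99), n, ks)
dp_max = dprime_sum(ymax, np.quantile(ymax, 0.99), n, ks)
print(f"k=20: iid={dp_iid[-1]:.2f}  max-sequence={dp_max[-1]:.2f}")
```

For the iid series the estimate decays roughly like 1/k, while the lag-one clustering of the max-sequence leaves a contribution that does not vanish with k.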

15
Applications: daily water level data
  • Hydrological data
  • Normal iid sequence: sample mean
  • Y_n = max(X_n, X_{n+1}), where X_n has a standard
    normal distribution: sample mean

16
Multivariate models
  • Copulas are very useful tools for investigating
    dependence among the coordinates of multivariate
    observations
  • The marginal distributions and the dependence
    structure can be modeled separately!
  • Which parametric models to use for the
    hydrological applications? (in two dimensions)

17
Hydrological applications
  • Water level peaks measured at two different
    stations are shown (peaks were paired with each
    other if they occurred within one month of each
    other)
  • With the help of the earlier algorithm we can
    choose threshold levels (blue lines) and fit a GPD
    to the marginals

Only those peaks that are extreme in both
coordinates are used!
18
QQ-Plot for marginals
19
Empirical copula
  • After transforming the data into uniform
    marginals the empirical copula is obtained
  • Which parametric copula is the most adequate
    for the given application?
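
A minimal sketch of the transformation to uniform marginals and of the resulting empirical copula. The scaling rank/(n + 1) is one common convention and is an assumption here (the slides do not say which one was used; fitted GPD marginals could be used instead of ranks). Ties are assumed to be absent.

```python
import numpy as np

def pseudo_obs(xy):
    """Rank-transform each column to (0, 1): u_ij = rank_ij / (n + 1)."""
    xy = np.asarray(xy, dtype=float)
    ranks = xy.argsort(axis=0).argsort(axis=0) + 1   # ranks 1..n, no ties assumed
    return ranks / (xy.shape[0] + 1)

def empirical_copula(u, a, b):
    """C_n(a, b): share of pseudo-observations with u1 <= a and u2 <= b."""
    return float(np.mean((u[:, 0] <= a) & (u[:, 1] <= b)))

# Three hypothetical paired peaks (values in cm).
peaks = np.array([[610.0, 410.0], [720.0, 530.0], [850.0, 480.0]])
u = pseudo_obs(peaks)
print(u)
print(empirical_copula(u, 0.5, 0.5))
```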

20
Conceivable copulas in 2D
  • Elliptical copulas: Gauss, Student-t
  • Archimedean copulas: Gumbel, Clayton
  • Other copulas: Fréchet

21
Simulation - Gauss
22
Simulation - Student-t
23
Simulation - Clayton I.
24
Simulation - Clayton II.
25
Simulation - Gumbel
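
The simulation slides contain only figures. As a hedged illustration of how such samples can be generated, the sketch below draws from a Gaussian copula (correlated normals pushed through the normal CDF) and from a Clayton copula (gamma frailty construction). The parameter values are arbitrary; Kendall's tau is computed as a check against the known values (2/π) arcsin(ρ) and θ/(θ + 2).

```python
import numpy as np
from math import erf, sqrt

def gaussian_copula(n, rho, rng):
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
    phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))
    return phi(z)                       # standard normal CDF, applied elementwise

def clayton_copula(n, theta, rng):
    """Frailty construction for theta > 0: U_i = (1 + E_i / V)^(-1/theta)."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n, 1))
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v) ** (-1.0 / theta)

def kendall_tau(u):
    dx = np.sign(u[:, None, 0] - u[None, :, 0])
    dy = np.sign(u[:, None, 1] - u[None, :, 1])
    n = len(u)
    return float((dx * dy).sum() / (n * (n - 1)))

rng = np.random.default_rng(5)
tau_gauss = kendall_tau(gaussian_copula(1000, 0.7, rng))
tau_clayton = kendall_tau(clayton_copula(1000, 2.0, rng))
print(f"Gauss tau={tau_gauss:.3f}  Clayton tau={tau_clayton:.3f}")
```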
26
Goodness of fit for copulas
  • Cramér-von Mises and Kolmogorov-Smirnov
    functionals might be used to test the
    null hypothesis
  • A simple approach, which is based on the
    multivariate probability integral transformation
    of F, is defined by
  • $K(t) = P(C(U_1, \ldots, U_d) \le t)$,
  • where (U_1, ..., U_d) is a vector of uniform
    variables having C as their joint distribution

27
Visual comparison
  • Genest et al. (2003) proposed a graphical
    procedure for model selection through the visual
    comparison of the non-parametric estimate K_n(·)
    of K to the parametric estimate K(θ_n, ·)
  • The better the fit, the closer the graphs of
    these functions are
  • Question: how to define the distance between the
    graphs?
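
A hedged sketch of the comparison between K_n and a parametric K. The empirical version below uses one common variant (share of the other points lying strictly below each point in both coordinates); the exact definition on the lost slide may differ. The closed forms K(t) = t - t log(t)/θ for Gumbel and K(t) = t + t(1 - t^θ)/θ for Clayton are standard. The parameter is treated as known here instead of being estimated, and the distance is a plain unweighted mean squared difference.

```python
import numpy as np

def kendall_values(u):
    """V_i: share of the other points strictly below point i in both coordinates."""
    u = np.asarray(u)
    below = (u[None, :, 0] < u[:, None, 0]) & (u[None, :, 1] < u[:, None, 1])
    return below.sum(axis=1) / (len(u) - 1)

def K_empirical(v, t):
    return np.mean(v[:, None] <= np.asarray(t)[None, :], axis=0)

def K_gumbel(t, theta):
    return t - t * np.log(t) / theta        # theta = 1 gives independence

def K_clayton(t, theta):
    return t + t * (1.0 - t ** theta) / theta

# Hypothetical sample from a Clayton copula with theta = 2.
rng = np.random.default_rng(6)
v = rng.gamma(shape=0.5, scale=1.0, size=(500, 1))
u = (1.0 + rng.exponential(size=(500, 2)) / v) ** (-0.5)

t = np.linspace(0.01, 0.99, 99)
kn = K_empirical(kendall_values(u), t)
dist_clayton = float(np.mean((kn - K_clayton(t, 2.0)) ** 2))
dist_indep = float(np.mean((kn - K_gumbel(t, 1.0)) ** 2))
print(f"distance to Clayton={dist_clayton:.5f}  to independence={dist_indep:.5f}")
```

A weighted version would multiply the squared differences by a weight function before averaging, which is the idea taken up on the following slides.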

28
Weighted quadratic differences
29
Which weights to use?
  • In order to compare which test statistic
    performs better at detecting discrepancies in the
    upper tail, we applied the following algorithm:
  • 1. Simulate a sample from a parametric copula
  • 2. Randomly choose two discordant points
    (x, y) near the right tail and permute their
    coordinates so that the new points x', y' are
    concordant (the marginals do not change, but
    the copula does)
  • 3. Perform the three versions of the test on the
    modified data set
  • 4. Repeat steps 2 and 3, and investigate which
    statistic is faster in detecting the changes
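
Step 2 of the algorithm can be sketched as below. "Near the right tail" is interpreted here as both coordinates exceeding their q-quantile, and the permutation as swapping the second coordinates of the chosen pair; both are assumptions, and `make_one_pair_concordant` is a hypothetical name.

```python
import numpy as np

def kendall_tau(xy):
    dx = np.sign(xy[:, None, 0] - xy[None, :, 0])
    dy = np.sign(xy[:, None, 1] - xy[None, :, 1])
    n = len(xy)
    return float((dx * dy).sum() / (n * (n - 1)))

def make_one_pair_concordant(xy, q, rng):
    """Swap the second coordinates of one random discordant pair in the joint
    upper tail (in place). Returns True if a swap was made."""
    lo = np.quantile(xy, q, axis=0)
    tail = np.flatnonzero((xy[:, 0] > lo[0]) & (xy[:, 1] > lo[1]))
    pairs = [(i, j) for a, i in enumerate(tail) for j in tail[a + 1:]
             if (xy[i, 0] - xy[j, 0]) * (xy[i, 1] - xy[j, 1]) < 0]
    if not pairs:
        return False
    i, j = pairs[rng.integers(len(pairs))]
    xy[i, 1], xy[j, 1] = xy[j, 1], xy[i, 1]
    return True

rng = np.random.default_rng(4)
xy = rng.random((300, 2))               # hypothetical sample (independence copula)
before = xy.copy()
tau_before = kendall_tau(xy)
swapped = make_one_pair_concordant(xy, 0.7, rng)
tau_after = kendall_tau(xy)
print(swapped, f"{tau_before:.4f} -> {tau_after:.4f}")
```

Each swap leaves both marginal samples untouched and raises Kendall's tau slightly, so repeated swaps move the dependence structure away from the simulated copula only in the upper tail.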

30
The data and its permutations
The number in the title gives the number
of changed pairs
31
Detecting changes
  • In general, the tests based on weighted squared
    deviations perform better than the original one.
  • Of the two weighted tests, the modified
    version is more sensitive!

32
Simulation results
  • We recorded how many steps the different tests
    needed to detect the changes during the
    replications
  • As expected, the modified weights were the best!

33
Time dependence
  • Has the dependence structure of the observations
    changed in the last century?
  • Windows of 80 years with a step size of 5 years
    were used to detect possible changes
  • First, we have to decide which copula to use
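
The moving-window idea can be sketched as follows. The data are synthetic paired peaks with a year attached, and the dependence in each window is summarised by Kendall's tau and the corresponding Gumbel parameter θ = 1/(1 - τ). Using tau inversion as the estimator is an assumption; the slides do not state how the copula parameter was fitted.

```python
import numpy as np

def kendall_tau(x, y):
    s = np.sign(x[:, None] - x[None, :]) * np.sign(y[:, None] - y[None, :])
    n = len(x)
    return float(s.sum() / (n * (n - 1)))

def rolling_dependence(years, x, y, width=80, step=5):
    """Kendall's tau and Gumbel theta in windows of `width` years, moved by `step`."""
    out = []
    first, last = int(years.min()), int(years.max())
    for start in range(first, last - width + 2, step):
        m = (years >= start) & (years < start + width)
        tau = kendall_tau(x[m], y[m])
        out.append((start, start + width - 1, tau, 1.0 / (1.0 - tau)))
    return out

# Hypothetical paired peaks: two per year over 1901-2000.
rng = np.random.default_rng(7)
years = np.repeat(np.arange(1901, 2001), 2)
x = rng.standard_normal(years.size)
y = x + rng.standard_normal(years.size)

windows = rolling_dependence(years, x, y)
for start, end, tau, theta in windows:
    print(f"{start}-{end}: tau={tau:.3f}  Gumbel theta={theta:.2f}")
```

Because consecutive windows overlap heavily, the resulting estimates are strongly correlated, which is one reason simulated critical values are needed to judge whether a change is significant.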

34
Time dependence
  • In all three cases, the Gumbel copula
    seems to fit better than the Fréchet copula!

35
Simulated critical values
36
Applications to the hydrological data set: time
dependence
  • The only (marginally) significant value is marked
    with
  • A simulation study may be used for detecting
    changes in the dependence structure

37
References
  • Bozsó, D., Rakonczai, P. and Zempléni, A. (2005).
    Floods on river Tisza and some of its affluents.
    Extreme-value modelling in practice. Statisztikai
    Szemle, accepted for publication. (In Hungarian.)
  • Choulakian, V. and Stephens, M.A. (2001).
    Goodness-of-fit tests for the generalized Pareto
    distribution. Technometrics 43, 478-484.
  • D'Agostino, R.B. and Stephens, M.A. (1986).
    Goodness-of-fit Techniques. Marcel Dekker.
  • Genest, C., Quessy, J.-F. and Rémillard, B.
    (2003). Goodness-of-fit Procedures for Copula
    Models Based on the Probability Integral
    Transformation. GERAD.
  • Leadbetter, M.R., Lindgren, G. and Rootzén, H.
    (1983). Extremes and Related Properties of Random
    Sequences and Processes. Springer.
  • Zempléni, A. (2004). Goodness-of-fit test in
    extreme value applications. Discussion paper No.
    383, SFB 386, Statistische Analyse Diskreter
    Strukturen, TU München.