Course on statistical modelling of Extreme Values RClim

1
Course on statistical modelling of Extreme Values
RClim
  • Dag Johan Steinskog
  • NERSC - 28-31 August 2006

2
Outline
  • General about RClim
  • Useful links
  • Short about the methods
  • Graphical methods
  • Exercises

3
Why R?
  • It is one of the most powerful high-level
    languages for doing statistical analysis
  • It has built-in functions and contributed
    libraries that can do most modern statistical
    methods
  • It allows you to learn up-to-date statistical
    practices from its comprehensive online help
  • It enables you to communicate easily with
    statisticians
  • It runs on most operating systems (UNIX, PC, Mac)
  • It is freely available as a 26 MB download from
    www.r-project.org

4
What is RClim?
  • Initiative by C. A. S. Coelho, C. A. T. Ferro, D.
    B. Stephenson and D. J. Steinskog.
  • Part of Deliverable WP4.3 in ENSEMBLES (EU project)
  • Goals:
  • develop statistical methods for analysing spatial
    extremes
  • read/write large gridded fields in netcdf format
  • produce nice geographical contour maps
  • do general climate analysis at many grid points
  • Development:
  • Spring 2005: Initiative started
  • Spring 2006: Completed in its current form and
    implemented in KNMI's Climate Explorer
  • August 2006: Paper submitted to Journal of
    Climate
  • Future development: Will be updated and expanded,
    including methods for daily data

5
How to get started with RClim?
  • Webpage: http://www.met.reading.ac.uk/cag/rclim/
  • Packages that must be installed:
  • RNetCDF
  • evd
  • ismev
  • maps
  • mapdata
  • mapproj
  • Source rclim.txt to load all functions.
  • It can be found at
    http://www.nersc.no/dagjs/RCourse/rclim.txt

6
Short note about the methods
  • Four groups of methods included in RClim
  • Read/write data from/to netcdf files
  • General statistics
  • Extreme value statistics
  • Graphical methods
  • Tip: Read and try out the examples given for each
    of the functions.

7
Graphical methods
  • bluered - Generate a colour scale for plotting.
  • identifyfield - Identify points by clicking on an
    image plot of the first time slice of a given p x
    q x n three-dimensional array, or an image plot
    of a p x q two-dimensional matrix. Also applies a
    user-specified function to each selected point
    of the three-dimensional array.
  • moviefield - Produce a movie of a given p x q x n
    three-dimensional array.
  • plotmap - Map a two-dimensional p x q data matrix
    in either a cylindrical equidistant latitude and
    longitude projection or a stereographic projection.
  • plotpc - Plot EOFs and Principal Components (PCs,
    time series) that are the output of the eof.r
    function.
  • projmap - Generate a pixel plot of a matrix on a
    specified map projection (stereographic only so
    far).
  • zoom - Zoom in on either a p x q two-dimensional
    map (matrix) or a p x q x n three-dimensional
    array (field), by selecting two corners that
    define a rectangular region of the space and a
    subset of time slices.

8
Example of graphical methods
9
General statistics
  • anomalyfield - Calculate anomalies by subtracting
    mean annual cycle or long-term mean of a
    three-dimensional p x q x n array of monthly mean
    values.
  • applyfield - Compute basic statistics (e.g. mean,
    variance, min, max, skewness, quantile) for a
    given p x q x n three-dimensional array.
  • applyfieldnew - Same as above, but also allows
    specification of fraction of non-missing values
    for the computation of the statistics. Grid
    points with larger fraction of missing values
    than specified are excluded.
  • corfield - Compute correlation between each point
    of a p x q x n three-dimensional array, and a
    vector of length n.
  • covfield - Compute covariance between each point
    of a p x q x n three-dimensional array, and a
    vector of length n.
  • detrend - Remove time trend from a time series.
  • detrendfield - Remove time trend at each grid
    point of a p x q x n three-dimensional array.
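The grid-point statistics on this slide all follow one pattern: take the time series at each grid point and apply a function to it. The sketch below illustrates that pattern in plain Python rather than R, using nested lists for a p x q x n field. The names monthly_anomalies, correlation and apply_field are hypothetical and are not RClim functions; the toy field and index series are invented for illustration.

```python
import math
from statistics import mean

def monthly_anomalies(series):
    """Subtract the mean annual cycle from a monthly series.

    Assumes the series starts in the first month of a cycle and
    has at least 12 values.
    """
    clim = [mean(series[m::12]) for m in range(12)]
    return [x - clim[t % 12] for t, x in enumerate(series)]

def correlation(x, y):
    """Pearson correlation between two series of equal length."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def apply_field(field, func):
    """Apply func to the time series at each grid point of a p x q x n field."""
    return [[func(ts) for ts in row] for row in field]

# Toy 1 x 2 x 24 field: an annual cycle plus or minus a slow trend.
n = 24
cycle = [10 * math.sin(2 * math.pi * t / 12) for t in range(n)]
index = [0.1 * t for t in range(n)]   # hypothetical index series of length n
field = [[[c + i for c, i in zip(cycle, index)],
          [c - i for c, i in zip(cycle, index)]]]

anoms = apply_field(field, monthly_anomalies)
corr_map = apply_field(anoms, lambda ts: correlation(ts, index))
```

Removing the annual cycle first matters here: without it the correlation map would mostly reflect the shared seasonal cycle rather than the relation to the index.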

10
General statistics cont.
  • eof - Compute EOFs and PCs of a p x q x n
    three-dimensional array.

Read/write data
  • netcdfinfo - Extract data structure, names of
    variables and dimension information from a netcdf
    file.
  • netcdfread - Extract longitude vector, latitude
    vector, time vector and data array from a netcdf
    file.
  • netcdfwrite - Write two-dimensional map or
    three-dimensional array of data, longitude
    vector, latitude vector and time vector into a
    netcdf file.

11
Extreme Value Statistics
  • acs - Compute average cluster size for a given
    three-dimensional p x q x n array of excesses.
    First two dimensions p and q are space dimensions
    (e.g. longitude and latitude). Third dimension n
    is time.

12
Extreme Value Statistics
  • xpareto - Compute shape and scale parameters of a
    Generalized Pareto Distribution for a given p x q
    x n three-dimensional array.
  • The distribution function is
    Pr(Z ≤ z) = 1 - (1 + ξz/σ)^(-1/ξ),
    which is defined for z > 0 and 1 + ξz/σ > 0, where
    σ > 0 is the scale parameter and ξ is the shape
    parameter. In this presentation maximum
    likelihood estimation is used to estimate the
    parameters.
  • Shape: form of the tail
  • Scale: variability of extremes
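As a rough illustration of fitting a Generalized Pareto distribution by maximum likelihood at one grid point, the sketch below writes out the distribution function, its inverse and the negative log-likelihood in plain Python, then picks the best parameters from a coarse grid. The grid search is a stand-in chosen for brevity; it is not how xpareto or gpd.fit optimise the likelihood, and all function names here are hypothetical.

```python
import math
import random

def gpd_cdf(z, scale, shape):
    """Pr(Z <= z) for an excess z over the threshold."""
    if z <= 0:
        return 0.0
    if abs(shape) < 1e-12:            # shape -> 0 gives the exponential limit
        return 1.0 - math.exp(-z / scale)
    t = 1.0 + shape * z / scale
    if t <= 0:                        # beyond the upper bound (shape < 0 only)
        return 1.0
    return 1.0 - t ** (-1.0 / shape)

def gpd_quantile(p, scale, shape):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if abs(shape) < 1e-12:
        return -scale * math.log(1.0 - p)
    return scale / shape * ((1.0 - p) ** (-shape) - 1.0)

def gpd_nll(excesses, scale, shape):
    """Negative log-likelihood of the excesses under a GPD."""
    if scale <= 0:
        return math.inf
    total = len(excesses) * math.log(scale)
    for z in excesses:
        if abs(shape) < 1e-12:
            total += z / scale
        else:
            t = 1.0 + shape * z / scale
            if t <= 0:                # observation outside the support
                return math.inf
            total += (1.0 + 1.0 / shape) * math.log(t)
    return total

def gpd_fit_grid(excesses, scales, shapes):
    """Crude maximum likelihood: the grid point with the smallest NLL."""
    return min(((s, k) for s in scales for k in shapes),
               key=lambda pair: gpd_nll(excesses, pair[0], pair[1]))

# Simulated excesses with scale 2 and shape -0.2 (a bounded tail).
random.seed(1)
sample = [gpd_quantile(random.random(), 2.0, -0.2) for _ in range(500)]
scales = [1.0 + 0.1 * i for i in range(21)]     # 1.0 ... 3.0
shapes = [-0.5 + 0.05 * i for i in range(21)]   # -0.5 ... 0.5
scale_hat, shape_hat = gpd_fit_grid(sample, scales, shapes)
```

For a full field the same fit would be repeated independently at each of the p x q grid points.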

13
Extreme Value Statistics
14
Extreme Value Statistics
  • xparetotvt - Fit Generalized Pareto distribution
    with time-varying threshold at each grid point
    for a given p x q x n three-dimensional array of
    monthly data.

Shape parameter time varying threshold
15
Extreme Value Statistics
  • boundexcesses - Compute upper bound of excesses
    for a given 2 x p x q array of Generalized Pareto
    distribution parameters. First index of the first
    dimension of the array represents the scale
    parameter. Second index of the first dimension of
    the array represents the shape parameter.
  • Defined as -σ/ξ for a negative shape parameter ξ.
  • Regions with zero or positive shape parameters
    have no bound (i.e. have an infinite upper tail).
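The bound can be sketched in a few lines of Python. The layout below mirrors the 2 x p x q convention described on the slide (first index scale, second index shape), but the function name bound_excesses and the parameter values are made up for illustration.

```python
import math

def bound_excesses(scale, shape):
    """Upper bound of the excesses: -scale/shape if shape < 0, else infinite."""
    return -scale / shape if shape < 0 else math.inf

# Hypothetical 2 x p x q parameter array with p = q = 2.
scale = [[2.0, 1.5], [1.0, 3.0]]
shape = [[-0.25, 0.0], [0.1, -0.5]]
params = [scale, shape]

bounds = [[bound_excesses(params[0][i][j], params[1][i][j])
           for j in range(2)] for i in range(2)]
```

The bound is measured as an excess, so the largest possible value of the variable itself is the threshold plus this bound.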

16
Extreme Value Statistics
17
Extreme Value Statistics
  • returnperiod - Compute return period for a given
    p x q matrix of excesses and a given 2 x p x q
    array of Generalized Pareto distribution
    parameters.
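One common way to turn a fitted Generalized Pareto distribution into a return period is to invert the probability of exceeding a given excess. The Python sketch below assumes that convention; whether returnperiod uses exactly this definition and these units is not stated on the slide, and the exceedances_per_year argument is a hypothetical addition for converting to years.

```python
import math

def gpd_survival(z, scale, shape):
    """Pr(Z > z) for an excess z >= 0 under a GPD."""
    if abs(shape) < 1e-12:
        return math.exp(-z / scale)
    t = 1.0 + shape * z / scale
    return t ** (-1.0 / shape) if t > 0 else 0.0

def return_period(z, scale, shape, exceedances_per_year=1.0):
    """Average waiting time, in years, between excesses larger than z."""
    p = gpd_survival(z, scale, shape)
    return math.inf if p == 0 else 1.0 / (exceedances_per_year * p)

# Hypothetical 1 x 2 map of excesses with matching GPD parameters.
excess_map = [[3.0, 5.0]]
scale_map = [[2.0, 2.0]]
shape_map = [[-0.25, -0.25]]
period_map = [[return_period(z, s, k, exceedances_per_year=3.0)
               for z, s, k in zip(zr, sr, kr)]
              for zr, sr, kr in zip(excess_map, scale_map, shape_map)]
```

An excess above the upper bound of the fitted distribution has zero exceedance probability, so its return period comes out as infinite.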

18
Extreme Value Statistics
  • tvt - Compute time-varying threshold for a given
    monthly time series.
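The slide does not say how tvt builds its threshold. One plausible construction, sketched below in Python purely as an assumption, is to shift the mean annual cycle upwards by a constant chosen so that a fixed fraction of the months exceeds it. The function name and the exceed_fraction parameter are hypothetical.

```python
import math
import random
from statistics import mean

def time_varying_threshold(series, exceed_fraction=0.25):
    """Mean annual cycle plus a constant offset (an assumed construction).

    Returns the threshold for every time step and a list of
    (time index, excess) pairs for the values above it.
    Assumes 0 <= exceed_fraction < 1 and at least 12 monthly values.
    """
    n = len(series)
    clim = [mean(series[m::12]) for m in range(12)]
    anoms = [x - clim[t % 12] for t, x in enumerate(series)]
    k = int(exceed_fraction * n)          # target number of exceedances
    offset = sorted(anoms)[n - k - 1]     # at most k anomalies lie above this
    threshold = [clim[t % 12] + offset for t in range(n)]
    excesses = [(t, a - offset) for t, a in enumerate(anoms) if a > offset]
    return threshold, excesses

# Ten years of synthetic monthly data: annual cycle plus noise.
random.seed(0)
series = [10 * math.sin(2 * math.pi * t / 12) + random.gauss(0, 1)
          for t in range(120)]
threshold, excesses = time_varying_threshold(series, exceed_fraction=0.25)
```

A threshold that follows the seasonal cycle lets extremes be defined relative to the time of year, so that warm winter months can count as extreme as well as hot summer months.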

19
Extreme Value Statistics
  • xdependence - Compute extreme dependence measures
    between a given p x q x n three-dimensional array
    and a given time series of length n.
  • Suppose, for example, that we want to
    investigate how extreme temperatures at one
    place are related to extreme temperatures at
    another place.
  • The statistic χ provides a measure of extreme
    dependence for asymptotically dependent
    distributions.

20
Extreme Value Statistics
  • However, χ provides no discriminating
    information for asymptotically independent
    distributions (Coles, 2001).
  • An alternative measure has been suggested.
  • It is defined for thresholds in the range
    0 < u < 1, and the statistic ranges from -1 to 1.
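The two dependence measures can be estimated empirically from a pair of series. The Python sketch below assumes the definitions given in Coles (2001), where the alternative measure is usually written chi-bar: after transforming both series to uniform margins U and V, chi(u) = 2 - log Pr(U < u, V < u) / log Pr(U < u) and chibar(u) = 2 log Pr(U > u) / log Pr(U > u, V > u) - 1. That these are the exact formulas shown on the slides is an assumption, and the function names are hypothetical.

```python
import math

def to_uniform(x):
    """Rank-transform a series to approximately uniform margins on (0, 1)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

def _log_ratio(num, den):
    """log(num) / log(den), or NaN when either log is undefined or zero."""
    if num <= 0 or den <= 0 or den == 1:
        return math.nan
    return math.log(num) / math.log(den)

def chi_chibar(x, y, u):
    """Empirical chi(u) and chibar(u) for a threshold 0 < u < 1."""
    ux, uy = to_uniform(x), to_uniform(y)
    n = len(x)
    p_below = sum(a < u for a in ux) / n
    p_both_below = sum(a < u and b < u for a, b in zip(ux, uy)) / n
    p_above = sum(a > u for a in ux) / n
    p_both_above = sum(a > u and b > u for a, b in zip(ux, uy)) / n
    chi = 2.0 - _log_ratio(p_both_below, p_below)
    chibar = 2.0 * _log_ratio(p_above, p_both_above) - 1.0
    return chi, chibar

# Two toy cases: identical series and perfectly opposed series.
x = [float(i) for i in range(100)]
chi_same, chibar_same = chi_chibar(x, x, 0.8)
chi_opp, chibar_opp = chi_chibar(x, x[::-1], 0.8)
```

In the field setting described above, the second series would be the time series at each grid point in turn, giving a map of each measure.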

21
Extreme Value Statistics
  • xdependence1 - Same as above, but also allows
    specification of fraction of non-missing values
    for the computation of the statistics. Grid
    points with larger fraction of missing values
    than specified are excluded.

22
Extreme Value Statistics
  • xexcess - Compute mean excess and variance of
    excess for a given n x p x q three-dimensional
    array.
  • xindex - Compute the intervals estimator for the
    extremal index, an index for time clusters, for a
    given time series and threshold.
  • xindexfield - Compute the intervals estimator for
    the extremal index at each grid point of a p x q
    x n three-dimensional array and with a given
    threshold.
  • mygpd.fit - Same as the gpd.fit function from the
    ismev package but with standard error calculation
    deactivated.
  • Maximum-likelihood fitting for the GPD model,
    including generalized linear modelling of each
    parameter.
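The intervals estimator used for the extremal index can be written down compactly. The Python sketch below follows the formula of Ferro and Segers (2003) as I understand it, based on the gaps between successive threshold exceedances; it is an illustration for a single time series, not the code behind xindex, and the function name is hypothetical.

```python
import math

def intervals_estimator(series, threshold):
    """Intervals estimator of the extremal index for one time series."""
    times = [t for t, x in enumerate(series) if x > threshold]
    if len(times) < 2:
        return math.nan                   # needs at least two exceedances
    gaps = [b - a for a, b in zip(times, times[1:])]
    m = len(gaps)                         # number of exceedances minus one
    if max(gaps) <= 2:
        num = 2.0 * sum(gaps) ** 2
        den = m * sum(g * g for g in gaps)
    else:
        num = 2.0 * sum(g - 1 for g in gaps) ** 2
        den = m * sum((g - 1) * (g - 2) for g in gaps)
    return min(1.0, num / den)

# Toy series: four clusters of four consecutive exceedances each.
series = [0.0] * 80
for start in (0, 20, 40, 60):
    for k in range(4):
        series[start + k] = 1.0

theta = intervals_estimator(series, threshold=0.5)
mean_cluster_size = 1.0 / theta   # a common reading of the extremal index
```

Values near 1 indicate exceedances that occur in isolation, while smaller values indicate clustering in time. Reading 1/theta as an average cluster size is a standard interpretation, though the slides do not say whether acs is computed this way.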

23
Extreme Value Statistics
  • xgev - Compute location, shape and scale
    parameters of a Generalized Extreme Value
    Distribution for annual block maxima or minima of
    a given p x q x n three-dimensional array.
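The block-maxima approach can be sketched in two steps: reduce the monthly series to one maximum per year, then work with the Generalized Extreme Value distribution of those maxima. The Python below shows the extraction step and the standard GEV distribution function and return level; the fitting itself is omitted, and the function names and parameter values are hypothetical rather than taken from xgev.

```python
import math

def annual_maxima(series, block=12):
    """Maximum of each complete block of `block` consecutive values."""
    n_blocks = len(series) // block
    return [max(series[b * block:(b + 1) * block]) for b in range(n_blocks)]

def gev_cdf(x, loc, scale, shape):
    """Pr(M <= x) for a block maximum M under a GEV distribution."""
    if abs(shape) < 1e-12:                # Gumbel limit
        return math.exp(-math.exp(-(x - loc) / scale))
    t = 1.0 + shape * (x - loc) / scale
    if t <= 0:                            # outside the support
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))

def gev_return_level(period, loc, scale, shape):
    """Level exceeded on average once every `period` blocks."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(shape) < 1e-12:
        return loc - scale * math.log(y)
    return loc - scale / shape * (1.0 - y ** (-shape))

maxima = annual_maxima([1.0, 5.0, 2.0, 7.0, 3.0, 4.0], block=3)
level_50 = gev_return_level(50, loc=20.0, scale=2.0, shape=-0.1)
```

Block minima can be handled with the same code by negating the series first and negating the resulting levels afterwards.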

24
Extreme Value Statistics
  • xparetotvtcov - Fit Generalized Pareto
    distribution with time-varying threshold at each
    grid point for a given p x q x n
    three-dimensional array of monthly data. Allows
    linear modelling of the parameters.
  • The relationship between extremes and factors
    (e.g. time and ENSO) can be examined by modelling
    the shape and scale parameters of the GP
    distribution as functions of these factors.
  • For instance, the following model can be used to
    analyse how the variability of summer
    temperature excesses is related to ENSO
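The model equation itself does not appear in this transcript, so the form used below is an assumption: the scale parameter depends on an ENSO index x through a log link, log(scale) = beta0 + beta1 * x, with a constant shape. The Python sketch writes out the corresponding negative log-likelihood, which is the quantity a fitting routine would minimise. The function name, the link and the toy data are all hypothetical.

```python
import math

def gpd_nll_log_scale(excesses, covariate, beta0, beta1, shape):
    """GPD negative log-likelihood with log(scale) = beta0 + beta1 * x."""
    total = 0.0
    for z, x in zip(excesses, covariate):
        scale = math.exp(beta0 + beta1 * x)
        if abs(shape) < 1e-12:            # exponential limit
            total += math.log(scale) + z / scale
            continue
        t = 1.0 + shape * z / scale
        if t <= 0:                        # observation outside the support
            return math.inf
        total += math.log(scale) + (1.0 + 1.0 / shape) * math.log(t)
    return total

# Toy data: excesses that are larger when the index is positive.
enso = [1.0, -1.0] * 5
excesses = [math.exp(0.5 * x) for x in enso]

nll_with = gpd_nll_log_scale(excesses, enso, 0.0, 0.5, 0.0)
nll_without = gpd_nll_log_scale(excesses, enso, 0.0, 0.0, 0.0)
```

A lower value with the covariate than without it suggests that the variability of the excesses is related to the index; a formal comparison would use a likelihood ratio test on the fitted models.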

25
Extreme Value Statistics
26
Simple demonstration of RClim
  • Simple study of monthly mean 2 metre temperature
  • ERA-40 reanalysis data from September 1957 to
    August 2002
  • Simple plotting and EVD analysis

27
Exercises
  • Exercise 1
  • Simplified study of the heat wave in Europe
    summer 2003.
  • Based on paper by Coelho et al. (to be submitted
    in August 2006)
  • Text: www.nersc.no/dagjs/RCource/ex1.pdf
  • Exercise 2
  • Simplified study of the heat wave in Svalbard
    spring 2006
  • Text: www.nersc.no/dagjs/RCource/ex2.pdf

28
Useful links
  • R software
  • http://www.r-project.org
  • RClim in general
  • http://www.met.reading.ac.uk/cag/rclim/
  • Notes and exercises for this course
  • http://www.nersc.no/dagjs/RCourse

29
Thanks!
"It seems that the rivers know the theory. It
only remains to convince the engineers of the
validity of this analysis!" - Emil Gumbel