P presentation | free to view

About This Presentation

Transcript and Presenter's Notes

Title: P

1
Nonlinear time series for modelling river flows
and their extremes

Péter Elek and László Márkus
Dept. Probability Theory and Statistics
Eötvös Loránd University
Budapest, Hungary

2
River Tisza and its aquifer
3
Tisza produces devastating floods
4
(No Transcript)
5
Water discharge at Vásárosnamény(We have 5 more
monitoring sites)from1901-2000
6
Basic properties of the series

Series exhibit
slight upward trend in mean
strong seasonal component both in mean and
variance
Seasonal components can be removed by a local
polinomial fitting (LOESS) procedure
The distribution of the standardized series are
still periodic.
The distribution is skewed.

7
Empirical and smoothed seasonal components
8
Bi-monthly distributions
9
Autocorrelation function is slowly decaying
10
How long does the river remember...?

Long memory - known ever since Hursts original
work on the Aswan dam (1952) of the Nile River
Short and long memory - autocorrelations die out
at exponential versus hyperbolic rate
Equivalent for long memory - spectral density has
pole at zero, meaning it tends to infinity at
zero at polynomial speed.

11
Long memory processes

Short and long memory - autocorrelations die out
at exponential versus hyperbolic rate
Equivalent for long memory - spectral density has
pole at zero, meaning it tends to infinity at
zero at polynomial speed.

12
Indicators of long memory

Nonparametric statistics
Rescaled adjusted range or R/S
Classical
Los (test)
Taqqus graphical (robust)
Variance plot
Log-periodogram (Geweke-Porter Hudak)

Classic R/S biased, lack of distribution theory,
no test or confidence intervals, sensitive for
short mem component if present.
Los modification - convergence to Brownian
Bridge. Test of long memory often accepts short
memory when long is present, but reliably rejects
long mem when only short is present.
Taqqus graphical R/S get out the most of the
data drop blocks from the beginnig, reestimate
R/S from the shorter data, log-log plot all
together and fit a straight line (regression)

14
(No Transcript)
15
Variance plot

The variance of the mean tends to zero as n2H-2
Estimate the variance of means from the
aggregated processes
Fit a straight line on the log-log scale
Determine the slope from a linear regression

16
The log - spectrum

The order of the pole of the spectral density is
1-2H
On the log-log scaleplot of the periodogram this
means a straight line of steepness 1-2H
The celebrated Geweke - Porter Hudak statistics
uses log(sin2(?/2)) instead of the log of the
frequencies.

17
(No Transcript)
18
Linear long-memory model fractional
ARIMA-process (Montanari et al., Lago Maggiore,
1997)

Fractional ARIMA-model
Fitting is done by Whittle-estimator
based on the empirical and theoretical
periodogram
quite robust consistent and asymptotically
normal for linear processes driven by innovatons
with finite forth moments (Giraitis and
Surgailis, 1990)

19
Results of fractional ARIMA fit

H0.846 (standard error 0.014)
p-value 0.558 (indicates goodness of fit)
Innovations can be reconstructed using a linear
filter (the inverse of the filter above)

20
Reconstruct the innovation from the fitted model
21
The density of the innovations
22
Reconstructed innovations are uncorrelated...

But not independent

23
Simulations using i.i.d. innovations

If we assume that innovations are i.i.d, we can
generate synthetic series
Use resampling to generate synthetic innovations
Apply then the linear filter
Add the sesonal components to get a synthetic
streamflow series
But these series do not approximate well the
high quantiles of the original series

24
But they fail to catch the densities and
underestimate the high quantiles of the original
series
25
Logarithmic linear model

It is quite common to take logarithm in order to
get closer to the normal distribution
It is indeed the case here as well
Even the simulated quantiles from a fitted linear
model seem to be almost acceptable

But the backtransformed quantiles are clearly
unrealistic.

27
Lets have a closer look at innovations

Innovations can be regarded as shocks to the
linear system
Few properties
Squared and absolute values are autocorrelated
Skewed and peaked marginal distribution
There are periods of high and low variance
All these point to a GARCH-type model
The classical GARCH is far too heavy tailed to
our purposes

28
Simulation from the GARCH-process

Simulations
Generate i.i.d. series from the estimated
GARCH-residuals
Then simulate the GARCH(1,1) process using these
residuals
Apply the linear filter and add the seasonalities
The simulated series are much heavier-tailed than
the original series

29
General form of GARCH-models

Need a model between
i.i.d.-driven FARIMA-series and
GARCH(1,1)-driven FARIMA-series
General form of GARCH-models

30
A smooth transition GARCH-model
31
ACF of GARCH-residuals
32
Results of simulationsat Vásárosnamény
33
Back to the original GARCH philosophy

The above described GARCH model is somewhat
artificial, and hard to find heuristic
explanations for it
why does the conditional variance depend on the
innovations of the linear filter?
in the original GARCH-context the variance is
dependent on the lagged values of the process
itself.
A possible solution condition the variance on
the lagged discharge process instead !
The fractional integration does not seem to be
necessary
almost the same innovations as from an ARMA(3,1)
In extreme value theory long memory in linear
models does not make a difference

34
Estimated variance of innovations plotted against
the lagged discharge

Spectacularly linear relationship
This approves the new modelling attempt
Distorted at sites with damming along Tisza River

35
The variance is not conditional on the lagged
innovation but it is conditional on the lagged
water discharge.
36

Theoretical problems arise in the new model
Existence of stationary solution
Finiteness of all moments
Consistence and asymptotic normality of quasi
max-likelihood estimators
Heuristically clearer explanation can be given
The discharge is indicative of the saturation of
the watershed
A saturated watershed results in more
straightforward reach for precipitation to the
river, hence an increase in the water supply.
A saturated watershed gives away water quicker.
The possible changes are greater and so is the
uncertainty for the next discharge value.

37
An example ZtN(0,1) c20, a10.95, ?01, ?12,
m1
38
Existence and moments of the stationary solution

We assume that ct constant
The model has a unique stationary solution if the
corresponding ARMA-model is stationary
i.e. all roots of the characteristic equation lie
within the unit circle
Moreover, if the m-th moment of Zt is finite then
the same holds for the stationary distribution of
Xt, too.
These are in contrast to the usual, quadratic
ARCH-type innovations. There the condition for
stationarity is more complicated and not all
moments of the stationary distribution are finite.

39
Sketch of the proof I.

The process can be embedded into a
(pq)-dimensional Markov-chain YtAYt-1Et
where Yt(Xt-c, Xt-1-c,...Xt-p1-c, et, et-1,...,
et-q1) and Et(et, 0,...).
Yt is aperiodic and irreducible (under some
technical conditions).
General condition for geometric ergodicity and
hence for existence of a unique stationary
distribution (Meyn and Tweedie, 1993) there
exists a V?1 function with 0lt?lt1, blt? and C
compact set
E( V(Y1) Y0y ) ? (1-?) V(y) b IC(y)
In other words V is bounded on a compact set
and is a contraction outside it.
Moreover E?(V(Yt)) is finite (? is the
stationary distribution).

40
Sketch of the proof II.

In the given case
if E(Ztm) is finite,
V(y) 1 QPymm will suffice
where
BPAP-1 is the real valued block
Jordan-decomposition of A
and Q is an appropriately chosen diagonal matrix
with positive elements.
This also implies the finiteness of the mth
moment of Xt.

41
Estimation

Estimation of the ARMA-filter can be carried out
by least squares.
Essentially only the uncorrelatedness of
innovations is needed for consistency.
Additional moment condition is needed for
asymptotic normality (e.g. Francq and Zakoian,
1998).
The ARCH-equation is estimated by quasi maximum
likelihood (assuming that Zt is Gaussian), using
the ?t innovations calculated from the
ARMA-filter.
The QML estimator of the ARCH-parameters is
consistent and asymptotically normal under mild
conditions (Zt does not need to be Gaussian).

42
Estimation of the ARCH-equation in the case of
known ?t innovations(along the lines of
Kristensen and Rahbek, 2005)

Maximising the Gaussian log-likelihood
we obtain the QML-estimator of ?.
For simplicity we assume that ?0? ?mingt0 in the
whole parameter space.

43
Consistency of the estimator

By the ergodic theorem
It is easy to check that L(a)ltL(a0) for all a?a0
where a0 denotes the true parameter value.
All other conditions of the usual consistency
result for QML (e.g. Pfanzagl, 1969) are
satisfied hence the estimator is consistent.

44
Asymptotic normality I.

Using the notations
for the derivatives of the log-likelihood
for the information matrix
and for the expected Hessian

45
Asymptotic normality II.

A standard Taylor-expansion implies
Finiteness of the fourth moment with a martingale
central limit theorem yields
Moreover, the asymptotic covariance matrix can be
consistently estimated by the empirical
counterparts of H and V.

46
Estimation of the ARCH-equation when ?t is not
known

In this case the innovations of the ARMA-model
are calculated using the estimated
ARMA-parameters
If the ARMA-parameter vector is estimated
consistently, the mean difference of squared
innovations tends to zero
If the ARMA parameter estimate is asymptotically
normal, a stronger statement holds

47
Consistency in the case of estimated innovations

Now the following expression is maximised
But the difference tends to zero (uniformly on
the parameter space)
which then yields consistency of the estimate of
?.

48
Asymptotic normality in the case of estimated
innovations

Under some moment conditions the least squares
estimate of the ARMA-parameters is asymptotically
normal, hence
This way the differences between the first and,
respectively, the second derivatives both
converge to zero
As a result, all the arguments for asymptotic
normality given above remain valid.

49
Estimation results
Station m a0 a1
Komárom 789 1807.8 (2009.8) 26.06 (2.22)
Nagymaros 586 544.7 (154.1) 11.95 (0.57)
Budapest 580 907.1 (314.0) 10.29 (0.55)
Tivadar 23 24,49 (5,95) 18,80 (1,13)
Namény 30 82,45 (32,82) 20,71 (0,51)
Záhony 45 67,04 (17,31) 12,37 (1,19)
50
How to simulate the residuals of the new
GARCH-type model

Residuals are highly skewed and peaked.
Simulation
Use resampling to simulate from the central
quantiles of the distribution
Use Generalized Pareto distribution to simulate
from upper and lower quantiles
Use periodic monthly densities

51
The simulation process

resampling and GPD
Zt
smoothed GARCH
?t
FARIMA filter
Xt
Seasonal filter
52
Evaluating the model fit

Independence of residual series
ACF, extremal clustering
Fit of probability density and high quantiles
Variance lagged discharge relationship
Extremal index
Level exceedance times
Flood volume distribution

53
ACF of original and squared innovation series
residual series
54
Results of new simulationsat Vásárosnamény
55
Densities and quantiles at all 6 locations
56
Reestimated (from the fitted model)
discharge-variance relationship
57
Seasonalities of extremes

The seasonal appearance of the highest values
(upper 1) of the simulated processes follows
closely the same for the observed one.

58
Extremal index to measure the clustering of high
values

Estimated for the observed and simulated
processes containing all seasonal components
Estimation by block method
Value of block length changes from 0.1 to 1.
Value of threshold ranges from 95 to 99.9.

59
Estimated extremal indices displayed
60
Extremal indices in the threshold GARCH model
61
Extremal indices in the discharge dependent GARCH
model
62
Level exceedance times

The distribution of the level exceedance times
may serve as a further goodness of fit measure.
It has an enormous importance as it represents
the time until the dams should stand high water
pressure.

63
Flood volume distribution

The match of the empirical and simulated flood
volume distributions also approve the good fit.

64
Multivariate modelling

Final aim to model the runoff processes
simultaneously
Nonlinear interdependence and non-Gaussianity
should be addressed here, too
First, the joint behaviour of the discharges
inflowing into Hungary should be modelled
Differential equation-oriented models of
conventional hydrology may be used to describe
downstream evolution of runoffs
Now we concentrate on joint modelling of two
rivers Tisza (at Tivadar) and Szamos (at
Csenger)

65
Issues of joint modelling

Measures of linear interdependences (the
cross-correlations) are likely to be
insufficient.
High runoffs appear to be more synchronized on
the two rivers than small ones
The reason may be the common generating weather
patterns for high flows
This requires a non-conventional analysis of the
dependence structure of the observed series

66
Basic statistics of Tivadar (Tisza) and Csenger
(Szamos)

The model described previously was applied to
both rivers
Correlations between the series of raw values,
innovations and residuals are highest when either
series at Tivadar are lagged by one day
Correlations
Raw discharges 0.79
Deseasonalized data 0.77
Innovations 0.40
Residuals 0.48
Conditional variances 0.84

67
Displaying the nature of interdependence

The joint plot may not be informative because of
the highly non-Gaussian distributions
Transform the marginals into uniform
distributions (produce the so-called copula),
then the scatterplot is more informative on the
joint behaviour
The strange behaviour of the copula of the
innovations is characterized by the concentration
of points
1. at the main diagonal, and especially at the
upper right corner (tail dependence)
2. at the upper left (and the lower right)
corner(s)
Linearly dependent variables do not display this
type of copula
The GARCH-residuals lack the second type of
irregularity

2
1
68
A possible explanation of this type of
interdependence

The variance process is essentially common for
the two rivers
This is justified by the high correlation (0.84)
Generate two linearly interdependent residual
series (correlation0.48)
Multiply by the common standard deviation
distributed as Gamma
Observe the obtained copula
This justifies the hypothesis that the common
variance causes the nonlinear interdependence of
the given type

69
Tail dependence
70
Conclusion

Long range dependency does not have much impact
on extremes of river discharges
Nonlinearities are influental and have to be
accounted for
The proposed model has a stationary solution
Its estimation is consistent and asymptotically
normal
The suggested model catches densities and
quantiles well
The model fitting procedure does not optimise on
quantiles and maxima clustering, therefore their
fit really shows model adequacy
Other fit-independent measures include level
exceedence and flood volume distributions
Something different has to be done with dammed
water

71
Possible refinements of the model

The simulated process is still slightly heavier
tailed than the original one.
The conditional distribution of residuals depends
on the lagged water discharge
very high residual values occur at low discharge
Possible solutions
innovation as a mixture of a GARCH-part and of a
discharge-independent part
Markov-switching model with discharge-dependent
transition probabilities

72
Thank you for your attention!

Write a Comment

User Comments (0)

About PowerShow.com

P PowerPoint PPT Presentation