Title: P
 1 Nonlinear time series for modelling river flows 
and their extremes
- Péter Elek and László Márkus 
 - Dept. Probability Theory and Statistics 
 - Eötvös Loránd University 
 - Budapest, Hungary 
 
  2River Tisza and its aquifer 
 3Tisza produces devastating floods 
 4(No Transcript) 
 5Water discharge at Vásárosnamény(We have 5 more 
monitoring sites)from1901-2000 
 6Basic properties of the series
- Series exhibit 
 - slight upward trend in mean 
 - strong seasonal component both in mean and 
variance  - Seasonal components can be removed by a local 
polinomial fitting (LOESS) procedure  - The distribution of the standardized series are 
still periodic.  - The distribution is skewed.
 
  7Empirical and smoothed seasonal components 
 8Bi-monthly distributions 
 9Autocorrelation function is slowly decaying 
 10How long does the river remember...?
- Long memory - known ever since Hursts original 
work on the Aswan dam (1952) of the Nile River  - Short and long memory - autocorrelations die out 
at exponential versus hyperbolic rate  - Equivalent for long memory - spectral density has 
pole at zero, meaning it tends to infinity at 
zero at polynomial speed.  
  11Long memory processes
- Short and long memory - autocorrelations die out 
at exponential versus hyperbolic rate  - Equivalent for long memory - spectral density has 
pole at zero, meaning it tends to infinity at 
zero at polynomial speed. 
  12Indicators of long memory
- Nonparametric statistics 
 - Rescaled adjusted range or R/S 
 - Classical 
 - Los (test) 
 - Taqqus graphical (robust) 
 - Variance plot 
 - Log-periodogram (Geweke-Porter Hudak) 
 
  13- Classic R/S biased, lack of distribution theory, 
no test or confidence intervals, sensitive for 
short mem component if present.  - Los modification - convergence to Brownian 
Bridge. Test of long memory often accepts short 
memory when long is present, but reliably rejects 
long mem when only short is present.  - Taqqus graphical R/S get out the most of the 
data drop blocks from the beginnig, reestimate 
R/S from the shorter data, log-log plot all 
together and fit a straight line (regression) 
  14(No Transcript) 
 15Variance plot
- The variance of the mean tends to zero as n2H-2 
 - Estimate the variance of means from the 
aggregated processes  - Fit a straight line on the log-log scale 
 - Determine the slope from a linear regression
 
  16The log - spectrum
- The order of the pole of the spectral density is 
1-2H  - On the log-log scaleplot of the periodogram this 
means a straight line of steepness 1-2H  - The celebrated Geweke - Porter Hudak statistics 
uses log(sin2(?/2)) instead of the log of the 
frequencies. 
  17(No Transcript) 
 18Linear long-memory model   fractional 
ARIMA-process (Montanari et al., Lago Maggiore, 
1997)
- Fractional ARIMA-model 
 -  
 - Fitting is done by Whittle-estimator 
 - based on the empirical and theoretical 
periodogram  - quite robust consistent and asymptotically 
normal for linear processes driven by innovatons 
with finite forth moments (Giraitis and 
Surgailis, 1990)  
  19Results of fractional ARIMA fit
- H0.846 (standard error 0.014) 
 - p-value 0.558 (indicates goodness of fit) 
 - Innovations can be reconstructed using a linear 
filter (the inverse of the filter above)  
  20Reconstruct the innovation from the fitted model 
 21The density of the innovations 
 22Reconstructed innovations are uncorrelated...
  23Simulations using i.i.d. innovations
- If we assume that innovations are i.i.d, we can 
generate synthetic series  - Use resampling to generate synthetic innovations 
  - Apply then the linear filter 
 - Add the sesonal components to get a synthetic 
streamflow series  - But these series do not approximate well the 
high quantiles of the original series  
  24But they fail to catch the densities and 
underestimate the high quantiles of the original 
series 
 25Logarithmic linear model
- It is quite common to take logarithm in order to 
get closer to the normal distribution  - It is indeed the case here as well 
 - Even the simulated quantiles from a fitted linear 
model seem to be almost acceptable 
  26- But the backtransformed quantiles are clearly 
unrealistic. 
  27Lets have a closer look at innovations
- Innovations can be regarded as shocks to the 
linear system  - Few properties 
 - Squared and absolute values are autocorrelated 
 - Skewed and peaked marginal distribution 
 - There are periods of high and low variance 
 - All these point to a GARCH-type model 
 - The classical GARCH is far too heavy tailed to 
our purposes 
  28Simulation from the GARCH-process
- Simulations 
 - Generate i.i.d. series from the estimated 
GARCH-residuals  - Then simulate the GARCH(1,1) process using these 
residuals  - Apply the linear filter and add the seasonalities 
 - The simulated series are much heavier-tailed than 
the original series  
  29General form of GARCH-models
- Need a model between 
 - i.i.d.-driven FARIMA-series and 
 - GARCH(1,1)-driven FARIMA-series 
 - General form of GARCH-models 
 
  30A smooth transition GARCH-model 
 31ACF of GARCH-residuals 
 32Results of simulationsat Vásárosnamény 
 33Back to the original GARCH philosophy
- The above described GARCH model is somewhat 
artificial, and hard to find heuristic 
explanations for it  - why does the conditional variance depend on the 
innovations of the linear filter?  - in the original GARCH-context the variance is 
dependent on the lagged values of the process 
itself.  - A possible solution condition the variance on 
the lagged discharge process instead !  - The fractional integration does not seem to be 
necessary  - almost the same innovations as from an ARMA(3,1) 
 - In extreme value theory long memory in linear 
models does not make a difference  
  34Estimated variance of innovations plotted against 
the lagged discharge
- Spectacularly linear relationship 
 - This approves the new modelling attempt 
 - Distorted at sites with damming along Tisza River
 
  35  The variance is not conditional on the lagged 
innovation but it is conditional on the lagged 
water discharge.  
 36- Theoretical problems arise in the new model 
 - Existence of stationary solution 
 - Finiteness of all moments 
 - Consistence and asymptotic normality of quasi 
max-likelihood estimators  - Heuristically clearer explanation can be given 
 - The discharge is indicative of the saturation of 
the watershed  - A saturated watershed results in more 
straightforward reach for precipitation to the 
river, hence an increase in the water supply.  - A saturated watershed gives away water quicker. 
 - The possible changes are greater and so is the 
uncertainty for the next discharge value. 
  37An example ZtN(0,1) c20, a10.95, ?01, ?12, 
m1 
 38Existence and moments of the stationary solution
- We assume that ct  constant 
 - The model has a unique stationary solution if the 
corresponding ARMA-model is stationary  - i.e. all roots of the characteristic equation lie 
within the unit circle  - Moreover, if the m-th moment of Zt is finite then 
the same holds for the stationary distribution of 
Xt, too.  - These are in contrast to the usual, quadratic 
ARCH-type innovations. There the condition for 
stationarity is more complicated and not all 
moments of the stationary distribution are finite. 
  39Sketch of the proof I.
- The process can be embedded into a 
(pq)-dimensional Markov-chain YtAYt-1Et  - where Yt(Xt-c, Xt-1-c,...Xt-p1-c, et, et-1,..., 
et-q1) and Et(et, 0,...).  - Yt is aperiodic and irreducible (under some 
technical conditions).  - General condition for geometric ergodicity and 
hence for existence of a unique stationary 
distribution (Meyn and Tweedie, 1993) there 
exists a V?1 function with 0lt?lt1, blt? and C 
compact set  - E( V(Y1)  Y0y ) ? (1-?) V(y)  b IC(y) 
 - In other words V is bounded on a compact set 
and is a contraction outside it.  - Moreover E?(V(Yt)) is finite (? is the 
stationary distribution). 
  40Sketch of the proof II.
- In the given case 
 - if E(Ztm) is finite, 
 - V(y)  1  QPymm will suffice 
 - where 
 - BPAP-1 is the real valued block 
Jordan-decomposition of A  - and Q is an appropriately chosen diagonal matrix 
with positive elements.  - This also implies the finiteness of the mth 
moment of Xt. 
  41Estimation
- Estimation of the ARMA-filter can be carried out 
by least squares.  - Essentially only the uncorrelatedness of 
innovations is needed for consistency.  - Additional moment condition is needed for 
asymptotic normality (e.g. Francq and Zakoian, 
1998).  - The ARCH-equation is estimated by quasi maximum 
likelihood (assuming that Zt is Gaussian), using 
the ?t innovations calculated from the 
ARMA-filter.  - The QML estimator of the ARCH-parameters is 
consistent and asymptotically normal under mild 
conditions (Zt does not need to be Gaussian). 
  42Estimation of the ARCH-equation in the case of 
known ?t innovations(along the lines of 
Kristensen and Rahbek, 2005) 
- Maximising the Gaussian log-likelihood 
 - we obtain the QML-estimator of ?. 
 - For simplicity we assume that ?0? ?mingt0 in the 
whole parameter space. 
  43Consistency of the estimator
- By the ergodic theorem 
 - It is easy to check that L(a)ltL(a0) for all a?a0 
where a0 denotes the true parameter value.  - All other conditions of the usual consistency 
result for QML (e.g. Pfanzagl, 1969) are 
satisfied hence the estimator is consistent. 
  44Asymptotic normality I.
- Using the notations 
 - for the derivatives of the log-likelihood 
 - for the information matrix 
 - and for the expected Hessian
 
  45Asymptotic normality II.
- A standard Taylor-expansion implies 
 - Finiteness of the fourth moment with a martingale 
central limit theorem yields  - Moreover, the asymptotic covariance matrix can be 
consistently estimated by the empirical 
counterparts of H and V.  
  46Estimation of the ARCH-equation when ?t is not 
known
- In this case the innovations of the ARMA-model 
are calculated using the estimated 
ARMA-parameters  - If the ARMA-parameter vector is estimated 
consistently, the mean difference of squared 
innovations tends to zero  - If the ARMA parameter estimate is asymptotically 
normal, a stronger statement holds 
  47Consistency in the case of estimated innovations
- Now the following expression is maximised 
 - But the difference tends to zero (uniformly on 
the parameter space)  - which then yields consistency of the estimate of 
?. 
  48Asymptotic normality in the case of estimated 
innovations
- Under some moment conditions the least squares 
estimate of the ARMA-parameters is asymptotically 
normal, hence  - This way the differences between the first and, 
respectively, the second derivatives both 
converge to zero  - As a result, all the arguments for asymptotic 
normality given above remain valid. 
  49Estimation results
Station m a0 a1
Komárom 789 1807.8 (2009.8) 26.06 (2.22)
Nagymaros 586 544.7 (154.1) 11.95 (0.57)
Budapest 580 907.1 (314.0) 10.29 (0.55)
Tivadar 23 24,49 (5,95) 18,80 (1,13)
Namény 30 82,45 (32,82) 20,71 (0,51)
Záhony 45 67,04 (17,31) 12,37 (1,19) 
 50How to simulate the residuals of the new 
GARCH-type model
- Residuals are highly skewed and peaked. 
 - Simulation 
 - Use resampling to simulate from the central 
quantiles of the distribution  - Use Generalized Pareto distribution to simulate 
from upper and lower quantiles  - Use periodic monthly densities
 
  51The simulation process
resampling and GPD
Zt
smoothed GARCH
?t
FARIMA filter
Xt
Seasonal filter 
 52Evaluating the model fit
- Independence of residual series 
 - ACF, extremal clustering 
 - Fit of probability density and high quantiles 
 - Variance  lagged discharge relationship 
 - Extremal index 
 - Level exceedance times 
 - Flood volume distribution
 
  53ACF of original and squared innovation series  
residual series 
 54Results of new simulationsat Vásárosnamény 
 55Densities and quantiles at all 6 locations 
 56Reestimated (from the fitted model) 
discharge-variance relationship 
 57Seasonalities of extremes
- The seasonal appearance of the highest values 
(upper 1) of the simulated processes follows 
closely the same for the observed one. 
  58Extremal index to measure the clustering of high 
values
- Estimated for the observed and simulated 
processes containing all seasonal components  - Estimation by block method 
 - Value of block length changes from 0.1 to 1. 
 - Value of threshold ranges from 95 to 99.9.
 
  59Estimated extremal indices displayed 
 60Extremal indices in the threshold GARCH model 
 61Extremal indices in the discharge dependent GARCH 
model 
 62Level exceedance times
- The distribution of the level exceedance times 
may serve as a further goodness of fit measure.  - It has an enormous importance as it represents 
the time until the dams should stand high water 
pressure. 
  63Flood volume distribution
- The match of the empirical and simulated flood 
volume distributions also approve the good fit. 
  64Multivariate modelling
- Final aim to model the runoff processes 
simultaneously  - Nonlinear interdependence and non-Gaussianity 
should be addressed here, too  - First, the joint behaviour of the discharges 
inflowing into Hungary should be modelled  - Differential equation-oriented models of 
conventional hydrology may be used to describe 
downstream evolution of runoffs  - Now we concentrate on joint modelling of two 
rivers Tisza (at Tivadar) and Szamos (at 
Csenger)  
  65Issues of joint modelling
- Measures of linear interdependences (the 
cross-correlations) are likely to be 
insufficient.  - High runoffs appear to be more synchronized on 
the two rivers than small ones  - The reason may be the common generating weather 
patterns for high flows  - This requires a non-conventional analysis of the 
dependence structure of the observed series 
  66Basic statistics of Tivadar (Tisza) and Csenger 
(Szamos)
- The model described previously was applied to 
both rivers  - Correlations between the series of raw values, 
innovations and residuals are highest when either 
series at Tivadar are lagged by one day  - Correlations 
 - Raw discharges 0.79 
 - Deseasonalized data 0.77 
 - Innovations 0.40 
 - Residuals 0.48 
 - Conditional variances 0.84 
 
  67Displaying the nature of interdependence
- The joint plot may not be informative because of 
the highly non-Gaussian distributions  - Transform the marginals into uniform 
distributions (produce the so-called copula),  - then the scatterplot is more informative on the 
joint behaviour  - The strange behaviour of the copula of the 
innovations is characterized by the concentration 
of points  - 1. at the main diagonal, and especially at the 
upper right corner (tail dependence)  - 2. at the upper left (and the lower right) 
corner(s)  - Linearly dependent variables do not display this 
type of copula  - The GARCH-residuals lack the second type of 
irregularity 
2
1 
 68A possible explanation of this type of 
interdependence
- The variance process is essentially common for 
the two rivers  - This is justified by the high correlation (0.84) 
 - Generate two linearly interdependent residual 
series (correlation0.48)  - Multiply by the common standard deviation 
distributed as Gamma  - Observe the obtained copula 
 - This justifies the hypothesis that the common 
variance causes the nonlinear interdependence of 
the given type 
  69Tail dependence 
 70Conclusion
- Long range dependency does not have much impact 
on extremes of river discharges  - Nonlinearities are influental and have to be 
accounted for  - The proposed model has a stationary solution 
 - Its estimation is consistent and asymptotically 
normal  - The suggested model catches densities and 
quantiles well  - The model fitting procedure does not optimise on 
quantiles and maxima clustering, therefore their 
fit really shows model adequacy  - Other fit-independent measures include level 
exceedence and flood volume distributions  - Something different has to be done with dammed 
water 
  71Possible refinements of the model
- The simulated process is still slightly heavier 
tailed than the original one.  - The conditional distribution of residuals depends 
on the lagged water discharge  - very high residual values occur at low discharge 
 - Possible solutions 
 - innovation as a mixture of a GARCH-part and of a 
discharge-independent part  - Markov-switching model with discharge-dependent 
transition probabilities 
  72Thank you for your attention!