Title: Practical issues and tools for modeling temporal and spatio-temporal trends in atmospheric pollutant monitoring data
1Practical issues and tools for modeling temporal and spatio-temporal trends in atmospheric pollutant monitoring data
Paul D. Sampson
Department of Statistics, University of Washington
The International Environmetrics Society, Modelling Spatio-Temporal Trends Workshop
3 November 2003
2Our experience in analysis of trends in atmospheric pollutants
- Part I: Meteorological adjustment and long-term temporal trends in ozone
  - Meteorological adjustment of western Washington and northwest Oregon surface ozone observations with investigation of trends. Reynolds, Das, Sampson, Guttorp, NRCSE TRS 15 (http://www.nrcse.washington.edu/pdf/trs15_doe.pdf)
  - Meteorological adjustment of Chicago, Illinois, surface ozone observations with investigation of trends. Reynolds, Caccia, Sampson, Guttorp, NRCSE TRS 25 (http://www.nrcse.washington.edu/pdf/trs25_chicago.pdf)
  - A review of statistical methods for the meteorological adjustment of tropospheric ozone. Thompson, Reynolds, Cox, Guttorp, Sampson, Atmospheric Environment 35, 617-630, 2001.
- Part II: Spatial trend for health effects studies
  - Spatial estimation of ambient air concentrations for ozone, 1986-94, for chronic health effects modeling in 83 counties in the U.S. Current contract with U.S. EPA.
  - Spatio-temporal modeling and prediction of ambient PM2.5 concentrations for acute and chronic health effects modeling with the NIH/NHLBI cohort study, MESA (Multi-Ethnic Study of Atherosclerosis). Current proposal and ongoing collaboration with colleagues at the University of Washington's Northwest Center for Particulate Matter and Health.
3Spatio-temporal modeling of ambient PM exposure for chronic health effect studies
Paul D. Sampson
Department of Statistics, University of Washington
Northwest Center for Particulate Matter and Health, External Science Advisory Committee Meeting
12 November 2003
4Motivation for fine(r)-scale spatial modeling of pollutant exposure for chronic health effect studies
- Major North American cohort studies of PM: a single community-wide exposure/monitor is used to characterize a metropolitan area. This fails to address important local spatial variation of air pollutants known to exist within regions.
- Hoek et al.: 3-component regression model to predict exposure to air pollutants (black smoke and NO2). Incorporates (a) regional background levels, (b) urban gradient (based on population density), and (c) proximity to heavily trafficked roadways and other point sources.
- Build on this approach to combine, in a spatial model, average concentration data from fixed-site ambient monitors and spatial covariate information encoded in a GIS, including population density, proximity to roads, and traffic density.
5- Aside: U.S. EPA currently funded: Epidemiologic Research on Health Effects of Long-Term Exposure to Ambient P.M. and Other Air Pollutants (June 2003)
  - Laden, Schwartz et al. (Harvard): Chronic Exposure to Particulate Matter and Cardiopulmonary Disease. Nurses' Health Study: prospective cohort study of 121,700 women throughout the U.S.
  - Knutson, Beeson et al. (Loma Linda): Relating Cardiovascular Disease Risk to Ambient Air Pollutants using GIS and Bayesian Neural Networks. AHSMOG study.
  - Samet, Zeger, Dominici et al. (Johns Hopkins): Chronic and Acute Exposure to Ambient Fine Particulate Matter and Other Air Pollutants: National Cohort Studies of Mortality and Morbidity. Data from the Medicare beneficiary file and National Claims History File.
  - Diez-Roux, Keeler, Samson, Lin (Michigan): Long-Term Exposure to Ambient PM and Subclinical Atherosclerosis. MESA Study.
6EPA apparently mandated/directed that all these studies be concerned with computing exposure estimates from ambient monitoring data and GIS-based information on local traffic, population density, ...
Jon Samet (Johns Hopkins): EPA should invest in drawing national maps of exposure, as all our research groups are trying to do the same thing.
7Applications
- MESA Air: NHLBI-funded Multi-Ethnic Study of Atherosclerosis; effects of ambient PM (and other pollutants) on subclinical cardiovascular function
  - 8700 subjects, aged 50-89, from 9 communities, assessed prospectively, longitudinally.
  - Monitoring data and exposure assessment:
    - Current AQS PM monitors (mostly 3-day sampling)
    - Supplemental monitors, up to 5 per community (2-week integrated measurements of key pollutants)
    - Mobile gradient monitoring (2-week integrated sampling)
  - PLUS:
    - Distances to nearest major roadways, with traffic volume and composition
    - Distances to pollutant point sources
  - EVALUATION on PM2.5 and co-pollutants measured at 10% of homes
- Preliminary demonstration of spatio-temporal modeling using S. Calif. ozone data.
8Personal PM exposure for subject i at time t: sum of non-ambient (N) and ambient (A) components.
Ambient exposure is ambient concentration times an ambient exposure attenuation factor reflecting time spent outside the home and particle infiltration into the home.
Model for ambient concentration: trend + residual.
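The equations on this slide did not survive transcription. One way the three statements above might be written is sketched below. All symbols are assumed notation chosen for illustration, not taken from the original slide: E for exposure, alpha for the attenuation factor, C for ambient concentration at the subject's location s_i, mu for the trend, and epsilon for the residual.

```latex
\begin{align}
E_{it} &= E^{N}_{it} + E^{A}_{it}
  && \text{(non-ambient plus ambient exposure)} \\
E^{A}_{it} &= \alpha_{it}\, C(s_i, t)
  && \text{(attenuation factor times ambient concentration)} \\
C(s, t) &= \mu(s, t) + \varepsilon(s, t)
  && \text{(trend plus residual)}
\end{align}
```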
9Smoothly varying spatio-temporal trend is further decomposed:
- The 1st term represents long-term mean concentration and will derive from a Bayesian analysis of a spatial regression model combining average concentration data from fixed-site ambient monitors and spatial covariate information encoded in a GIS.
- The 2nd component represents mainly smooth seasonal temporal variation.
10The variance model for the residual term
represents the spatio-temporal variation
considered primarily at the 2-week time scale of
the fixed sites and mobile gradient monitors.
Estimation of this component will be based on
(extensions to) the Bayesian model for the
Sampson-Guttorp spatial deformation approach to
nonstationary spatial covariance as demonstrated
in Damian et al. (2001, 2003). This modeling
strategy accommodates the spatially varying effects
of predominant meteorology, coastlines, and
topographic features that underlie the
statistical relationship between time-varying
pollutant levels at different points in space.
23(1) Estimation of the long-term mean spatial field
Following Hoek and colleagues (2002 Atmos Env, 2003 Epidemiology), assume a regression model for the long-term mean whose covariates represent population density, proximity to roads, traffic density, and possibly local topographic and climatic wind patterns. The Bayesian analysis incorporates prior information on the parameters and on the spatial covariance structure of residuals from this regression model, in a manner similar to that of our Bayesian framework for spatial estimation of the residual component (see (3) below).
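The regression equation itself is missing from the transcript. A plausible form, under assumed notation, is given below: mu_0(s) is the long-term mean at location s, X_k(s) are the p GIS covariates listed above, gamma_k are regression coefficients, and nu(s) is a spatially correlated residual. None of these symbols come from the original slide.

```latex
\begin{equation}
\mu_0(s) \;=\; \gamma_0 + \sum_{k=1}^{p} \gamma_k X_k(s) + \nu(s),
\qquad
\operatorname{Cov}\bigl(\nu(s), \nu(s')\bigr) = \Sigma_\nu(s, s')
\end{equation}
```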
24Mean field
Note that monitoring observations will be used directly in the estimation, not just in the specification or calibration of the regression model as in the work of Hoek et al. That is, in Hoek et al., (long-term) exposure is estimated as the fitted regression prediction alone. In our (geostatistical) approach, we will be estimating the space-time field, so the long-term exposure at a point also includes an estimated (kriged) spatial residual.
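The contrast between a regression-only prediction and a prediction that adds a kriged residual can be sketched in a few lines. This is a simplified plug-in illustration, not the Bayesian analysis described in the talk. The exponential covariance, its range and nugget values, the function names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_km=10.0):
    """Exponential covariance between two sets of 2-D sites (assumed form)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-dist / range_km)

def predict_long_term_mean(coords, X, y, coords_new, X_new, nugget=0.05):
    """Return (regression-only, regression + kriged residual) predictions."""
    A = np.column_stack([np.ones(len(y)), X])              # monitors: intercept + covariates
    A_new = np.column_stack([np.ones(len(X_new)), X_new])  # target points
    K = exp_cov(coords, coords) + nugget * np.eye(len(y))  # covariance among monitors
    K_inv_A = np.linalg.solve(K, A)
    # Generalized least squares estimate of the regression coefficients
    coef = np.linalg.solve(A.T @ K_inv_A, K_inv_A.T @ y)
    resid = y - A @ coef
    # Simple kriging of the regression residuals to the target points
    kriged_resid = exp_cov(coords_new, coords) @ np.linalg.solve(K, resid)
    reg_only = A_new @ coef
    return reg_only, reg_only + kriged_resid

# Synthetic illustration: 15 monitors, 2 covariates, 4 target homes
rng = np.random.default_rng(0)
coords = rng.uniform(0, 50, size=(15, 2))
X = rng.normal(size=(15, 2))
y = 10 + X @ np.array([1.5, -0.5]) + rng.normal(scale=0.5, size=15)
homes = rng.uniform(0, 50, size=(4, 2))
X_homes = rng.normal(size=(4, 2))
reg_only, with_resid = predict_long_term_mean(coords, X, y, homes, X_homes)
print(reg_only.round(2), with_resid.round(2))
```

The two returned vectors differ only by the kriged residual, which is the part of the monitoring data that a regression-only exposure estimate discards.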
25(2) Smooth, spatially varying, temporal
variation.
The spatial index in the 2nd component allows for
the possibility that the magnitude and precise
details of the seasonal variation may vary from
location to location over the spatial scale of
the regional target communities. Preliminary
analysis of PM2.5 monitoring data in the Los
Angeles County region suggests some spatial
variation in seasonality, but in some regions we
expect to find that this seasonal variation is
homogeneous, permitting an additive (separable)
decomposition of the spatio-temporal trend.
26Trend decomposition
Characterize and estimate the seasonal structure of air pollutant concentrations in terms of a model built from temporal basis functions describing possible seasonal trend patterns, with spatially varying coefficients of these trend patterns.
Example: O3 trend components. (What do we expect with PM more generally?)
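The model equation is missing from the transcript. A form consistent with the description, and with the J+1 coefficients mentioned in the technical details, might be the following. The symbols are assumed: f_j(t) for the J temporal basis functions and beta_j(s) for their spatially varying coefficients, with beta_0(s) playing the role of the long-term mean when the basis functions average to zero over time.

```latex
\begin{equation}
\mu(s, t) \;=\; \beta_0(s) + \sum_{j=1}^{J} \beta_j(s)\, f_j(t)
\end{equation}
```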
27We compute trend components empirically as smoothed versions of the temporal singular vectors of the T×N data matrix (rather than assuming parametric forms such as trigonometric functions). Arbitrary amounts of missing data are accommodated in an EM-like iterative calculation of the SVD. The Bayesian spatial regression model can incorporate the coefficients of these trend components as spatial fields, and thus provide the basis for estimation of the trend at target homes.
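An EM-like SVD with missing data can be sketched as an alternation between filling the missing cells from the current low-rank approximation and recomputing the SVD. This is one common way to do it and is not necessarily the authors' exact algorithm. The function name, the site-mean initial fill, the stopping rule, and the synthetic data are assumptions, and the smoothing of the singular vectors mentioned above is omitted.

```python
import numpy as np

def em_svd(data, rank, n_iter=500, tol=1e-8):
    """Truncated SVD of a T x N matrix in which NaN marks missing values.

    Assumes every column (site) has at least one observed value.
    """
    missing = np.isnan(data)
    filled = np.where(missing, np.nanmean(data, axis=0), data)  # start from site means
    for _ in range(n_iter):
        U, d, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * d[:rank]) @ Vt[:rank]     # rank-limited reconstruction
        new_filled = np.where(missing, approx, data)      # overwrite only missing cells
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    U, d, Vt = np.linalg.svd(filled, full_matrices=False)
    # Columns of U[:, :rank] are the temporal singular vectors (trend components)
    return U[:, :rank], d[:rank], Vt[:rank], filled

# Synthetic illustration: 60 time points x 8 sites, constant + annual cycle
t = np.arange(60)
basis = np.column_stack([np.ones(60),
                         np.sin(2 * np.pi * t / 12),
                         np.cos(2 * np.pi * t / 12)])
rng = np.random.default_rng(1)
truth = basis @ rng.normal(size=(3, 8))
obs = truth.copy()
rows, cols = np.indices(truth.shape)
obs[(7 * rows + 3 * cols) % 11 == 0] = np.nan   # remove roughly 9% of the entries
U, d, Vt, filled = em_svd(obs, rank=3)
```

Observed values are never altered. Only the missing cells are updated on each pass, so the resulting singular vectors reflect all available data rather than only complete rows.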
38(3) Nonstationary residual spatio-temporal variation.
Final component: spatio-temporal variation at the (2-week) time scale of the fixed sites and mobile gradient monitors.
Sampson-Guttorp spatial deformation approach (Damian et al. 2001, 2003) to model the nonstationary spatial covariance structure. Allows for spatially varying effects of predominant meteorology, coastlines, and topographic features that underlie the statistical relationship between time-varying pollutant levels at different points in space.
Bayesian analysis provides a full posterior distribution for the model parameters, and thus a ready computation of multiple imputations of exposures for the health effects analysis.
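The core idea of the deformation approach is that correlation is stationary and isotropic in a deformed coordinate system, not in geographic coordinates. The toy sketch below illustrates that idea only. The deformation function is a hypothetical hand-written map standing in for a fitted smooth mapping (such as a thin-plate spline), and the exponential correlation and all names are assumptions for the example.

```python
import numpy as np

def toy_deform(coords):
    """Hypothetical nonlinear map from geographic (x, y) to deformed coordinates.

    Stretches the x-axis progressively for x >= 0. It is a stand-in for a
    fitted deformation, not an estimate from any data.
    """
    x, y = coords[:, 0], coords[:, 1]
    return np.column_stack([x + 0.5 * x**2, y])

def deformed_correlation(coords, deform, range_d=1.0):
    """Isotropic exponential correlation evaluated on deformed coordinates."""
    d_coords = deform(coords)
    dist = np.linalg.norm(d_coords[:, None, :] - d_coords[None, :, :], axis=-1)
    return np.exp(-dist / range_d)

# Three equally spaced sites along the x-axis (geographic spacing = 1)
sites = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
corr = deformed_correlation(sites, toy_deform)
print(corr.round(3))
```

Sites 0 and 1 are as far apart geographically as sites 1 and 2, yet the first pair is more strongly correlated because the map stretches space more at larger x. Correlation that depends on location and not only on separation is the nonstationarity that the deformation is meant to capture.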
41Observed vs Predicted ozone at 3 validation sites.
45Conclusion
- We can estimate/predict both the day-to-day deviations from the trend and the seasonal shape of the trend quite well, but
- We sometimes miss the long-term mean.
  -> Need to incorporate extra local information to predict the mean concentration.
47Technical details
48Details, issues, and extensions
- Gaussian assumption after transformation
- Current AQS data: sampling usually every 3 days; proposed sampling on 2-week intervals
- Conditional, hierarchical approach to estimating the parameters of our space-time models from this incomplete data, beginning with models estimated from the longer-term AQS data and then updating estimated model parameters with data from the new fixed and mobile monitors.
49- First stage of analysis: build separate models and estimates for the three major exposures of interest, PM2.5, NOx, and O3.
- Second stage: take advantage of the association between PM2.5 and NOx in a multivariate (co-kriging) analysis that assumes only that spatial nonstationarity can be expressed in a common underlying deformed coordinate system.
50Equivalently, in matrix form, we write C as an S×T (space-time) matrix of observations; β is an S×(J+1) matrix of coefficients, with one row for each of the S observation sites (i=1,...,S), multiplying the (J+1)×T matrix F, whose rows contain the values of the temporal basis functions. An obvious calculation is an SVD of the concentration matrix C.
51where the columns of the (truncated) matrix of right singular vectors are considered to represent the matrix of values of the J+1 temporal basis functions F.
Issues: smoothness of the singular vectors as components of trend; computation with missing data.
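The matrix equations on these two slides were lost in transcription. Under the stated dimensions (C is S×T, the coefficient matrix is S×(J+1), F is (J+1)×T), they might be reconstructed as below. The symbols beta for the coefficient matrix and E for the residual matrix are assumptions, as is the choice to absorb the singular values into the coefficients.

```latex
\begin{align}
C &= \beta F + E
  && \text{(trend plus residual, in matrix form)} \\
C &= U D V^{\top} \;\approx\; U_{J+1} D_{J+1} V_{J+1}^{\top}
  && \text{(SVD, truncated to } J+1 \text{ components)} \\
F &= V_{J+1}^{\top}, \qquad \beta = U_{J+1} D_{J+1}
  && \text{(right singular vectors as temporal basis)}
\end{align}
```

Here the subscript J+1 denotes keeping the first J+1 columns of U and V and the leading (J+1)×(J+1) block of D.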
52Posterior sample
53Site variances