Spatial Data Analysis Areas I: Rate Smoothing and the MAUP


1
Spatial Data Analysis Areas I: Rate Smoothing and the MAUP
Ifgi, Muenster, Fall School 2005

Gilberto Câmara
INPE, Brazil

2
Areal data

The study region is partitioned into disjoint areas
The region is the union of the areas
Each area has one or more associated measures
The measures are treated as random variables
Examples
Map of Germany divided into municipalities. For
each area, we measure the unemployment rate and
the literacy rate.
Is unemployment correlated with years of schooling?
What about Brazil?

3
Violence in Minas Gerais
4
Violence in Minas Gerais
5
Violence in Minas Gerais
6
Attributes in areal data

As a general rule, each measure is a sum, count
or similar aggregate function over the whole
area
Each value is associated with the whole
corresponding area
If we need to choose a single location, we usually
take the polygon centroid
There are no intermediate values

7
What is mapped in areal data?

Typical values are rates or proportions
Numerator: events
Denominator: population at risk
Log maps?
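The numerator/denominator structure above can be sketched in a few lines. This is a minimal illustration, not the procedure used for the maps in this presentation: the function names are hypothetical, the scale of 100,000 follows the map titles on the next slides, and the natural logarithm is an assumption (the slides do not say which base is used).

```python
import math

def rate_per_100k(events, population_at_risk):
    """Raw rate: events (numerator) over population at risk (denominator)."""
    return 1e5 * events / population_at_risk

def log_rate(events, population_at_risk):
    """Natural log of the rate; undefined (None) when there are no events."""
    if events == 0:
        return None  # log(0) is undefined, so zero-event areas need special care
    return math.log(rate_per_100k(events, population_at_risk))

# Hypothetical area: 12 deaths among 48,000 residents at risk
print(rate_per_100k(12, 48_000))
print(log_rate(12, 48_000))
print(log_rate(0, 48_000))
```

Areas with zero events have no log rate, which is one practical reason to treat log maps with care.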

8
Log rate of motor vehicle accident deaths per
100,000 residents, 1990-92
9
Log rate of homicide deaths of males aged 15-49 per
100,000 residents of the same age group, 1990-92
10
Models of Discrete Spatial Variation
Random variable in area i

Number of ill people
Number of newborn babies
Per capita income

Source: Renato Assunção (UFMG/Brasil)
11
Dealing with rates and proportions
When the study variable is a rate or a
proportion, mapping those rates is the first
obvious step in any analysis. However, the use of
raw observed rates might be misleading, since the
variability of those rates will be a function of
the population counts, which differ widely
between the areas (Bailey, 1995).
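The dependence of rate variability on population can be made concrete. The sketch below assumes that the event count in an area is Poisson with mean n * theta (n the population at risk, theta the true rate), so the raw rate y / n has variance theta / n. The Poisson assumption, the function name and the numbers are illustrative, not taken from the slides.

```python
import math

def raw_rate_std_error(true_rate, population):
    """Standard deviation of the raw rate y / n when y ~ Poisson(n * true_rate)."""
    return math.sqrt(true_rate / population)

true_rate = 0.001  # hypothetical: 100 events per 100,000 residents
for population in (500, 5_000, 500_000):
    se = raw_rate_std_error(true_rate, population)
    print(f"n = {population:>7}: rate = {true_rate * 1e5:.0f} "
          f"+/- {se * 1e5:.1f} per 100,000")
```

With the same underlying rate, the smallest area has a standard error larger than the rate itself, so an extreme value on a raw-rate map may reflect a small population rather than a real difference in risk.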
12
Source: Fred Ramos (CEDEST/Brasil)
13
Model-Driven Approaches

Model of discrete spatial variation
Each subregion is described by a statistical
distribution Z_i
e.g., homicide counts are Poisson-distributed.
The main objective of the analysis is to estimate
the joint distribution of the random variables
Z = (Z_1, ..., Z_n)
We use a model-driven approach to correct the
missing data
It is called the Empirical Bayes method...
We could also use the Full Bayes method (but
that is another story...)

14
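In Bayesian statistics, the best estimate
of the true and unknown rate \theta_i is

\hat{\theta}_i = w_i r_i + (1 - w_i) \mu_i

where r_i is the measured rate in area i and

w_i = \frac{\sigma_i^2}{\sigma_i^2 + \mu_i / n_i}

with \mu_i and \sigma_i^2 the prior mean and variance of the rate
and n_i the population at risk in area i
Source: Fred Ramos (CEDEST/Brasil)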
15
Empirical Bayes
Simplifying assumptions for estimating means and
variances for all random variables of all areas
(Marshall, 1991)
Source: Fred Ramos (CEDEST/Brasil)
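A minimal sketch of Empirical Bayes rate smoothing follows. It assumes the global method-of-moments form usually attributed to Marshall (1991): a single prior mean mu and prior variance sigma2 are estimated from all areas, and each raw rate r_i is shrunk towards mu with weight w_i = sigma2 / (sigma2 + mu / n_i). The function name and the data are hypothetical, and the formulas are the standard textbook form rather than a transcription of these slides.

```python
def empirical_bayes_rates(events, populations):
    """Global Empirical Bayes smoothing of rates (method-of-moments sketch)."""
    total_pop = sum(populations)
    mu = sum(events) / total_pop                 # global mean rate (prior mean)
    raw = [y / n for y, n in zip(events, populations)]
    mean_pop = total_pop / len(populations)

    # Population-weighted variance of the raw rates around mu
    s2 = sum(n * (r - mu) ** 2 for r, n in zip(raw, populations)) / total_pop
    # Remove the expected sampling variance; truncate at zero
    sigma2 = max(s2 - mu / mean_pop, 0.0)

    smoothed = []
    for r, n in zip(raw, populations):
        denom = sigma2 + mu / n
        w = sigma2 / denom if denom > 0 else 0.0  # weight given to the raw rate
        smoothed.append(w * r + (1 - w) * mu)
    return mu, sigma2, raw, smoothed

# Hypothetical event counts and populations at risk for five areas
events = [0, 3, 40, 150, 900]
populations = [200, 1_000, 20_000, 50_000, 600_000]

mu, sigma2, raw, smoothed = empirical_bayes_rates(events, populations)
for n, r, s in zip(populations, raw, smoothed):
    print(f"n = {n:>7}: raw = {r * 1e5:7.1f}  smoothed = {s * 1e5:7.1f} per 100,000")
```

Small areas get a weight close to zero and are pulled almost entirely to the global rate, while large areas keep nearly their raw rate. A local variant would estimate mu and sigma2 from each area's neighbours instead of from the whole region.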
16
Source: Fred Ramos (CEDEST/Brasil)
17
Infant Mortality Rate São Paulo (Raw)
Source: Fred Ramos (CEDEST/Brasil)
18
Infant Mortality Rate São Paulo (Corrected)
Source: Fred Ramos (CEDEST/Brasil)
19
Some Important Questions

How does scale matter?
How do the spatial partitions matter?
How does proximity matter?
What can we learn by studying how multiple data
vary in space?
How many prior assumptions can we impose on our
spatial data?

20
A Question of Scale
The Modifiable Areal Unit Problem (MAUP)

A basic problem with areal data
The spatial definition of the boundaries of the
areas affects the results
Different results can be obtained just by
changing the boundaries of these zones.
This problem is known as the modifiable areal
unit problem
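The effect can be reproduced with a toy example. The sketch below uses eight hypothetical fine zones of equal population on a 2 x 4 grid (top row numbered 0 to 3 left to right, bottom row 4 to 7 right to left) and two made-up variables. Zoning A merges horizontal neighbours and zoning B merges vertical neighbours; all names and values are invented for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)

def aggregate(values, zoning):
    """Average the fine-zone values inside each aggregated zone."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zoning]

# Hypothetical values of two variables in the eight fine zones
x = [1, 3, 2, 6, 5, 7, 8, 4]
y = [2, 1, 4, 3, 6, 5, 9, 7]

zoning_a = [[0, 1], [2, 3], [4, 5], [6, 7]]  # horizontal pairs
zoning_b = [[0, 7], [1, 6], [2, 5], [3, 4]]  # vertical pairs

print("fine zones:", round(pearson(x, y), 3))
print("zoning A:  ", round(pearson(aggregate(x, zoning_a), aggregate(y, zoning_a)), 3))
print("zoning B:  ", round(pearson(aggregate(x, zoning_b), aggregate(y, zoning_b)), 3))
```

The same underlying data give a correlation of about 0.64 at the fine level, about 0.92 under zoning A and about 0.47 under zoning B. Nothing changed except the boundaries.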

21
Scale Effects
Per capita income
Jobs / population
Illiterate / population
Source: Fred Ramos (CEDEST/Brasil)
22
Scale Effects
Per capita income
Jobs / population
Illiterate / population
Source: Fred Ramos (CEDEST/Brasil)
23
Scale Effects: Fighting the MAUP
Population > 60 years
Illiterates
Per capita income
270 ZONES OD97
Source: Fred Ramos (CEDEST/Brasil)
24
Scale Effects: Fighting the MAUP
Population > 60 years
Illiterates
Per capita income
96 DISTRICTS OF SÃO PAULO
Source: Fred Ramos (CEDEST/Brasil)
25
Scale Effects: Fighting the MAUP
Source: Fred Ramos (CEDEST/Brasil)
Population > 60 years
Illiterates
Per capita income
96 INCOME-HOMOGENEOUS ZONES IN SÃO PAULO
26
Correlation matrices
270 ZONES OD97
VARIABLES
A) Percentage of population aged 60 or more
B) Percentage of illiterate population
C) Per capita individual income
96 DISTRICTS
96 INCOME-AGGREGATED
Source: Fred Ramos (CEDEST/Brasil)
27
The Question of Scale
Get census data
Adaptation
Identify inter-tract variation
Reduce data variability
Minimize the outlier effect
28
Regionalization