Spatial Data Analysis Areas I: Rate Smoothing and the MAUP


1
Spatial Data Analysis Areas I: Rate Smoothing and the MAUP
Ifgi, Muenster, Fall School 2005
  • Gilberto Câmara
  • INPE, Brazil

2
Areal data
  • The study region is partitioned into disjoint areas
  • The region is the union of the areas
  • Each map has one or more associated measures
  • Measures are treated as random variables
  • Examples
  • Map of Germany divided into municipalities. For
    each area, we measure the unemployment rate and
    the literacy rate.
  • Is unemployment correlated with years of schooling?
  • What about Brazil?

3
Violence in Minas Gerais
4
Violence in Minas Gerais
5
Violence in Minas Gerais
6
Attributes in areal data
  • As a general rule, each measure is a sum, count,
    or similar aggregate function over the whole
    area
  • Each value is associated with the whole
    corresponding area
  • If we need to choose a single location, we usually
    take the polygon centroid
  • There are no intermediate values

7
What is mapped in areal data?
  • Typical values are rates or proportions
  • Numerator: events
  • Denominator: population at risk
  • Log maps?
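
A minimal sketch of the rate calculation behind such maps, assuming rates are expressed per 100,000 people at risk. The function names and the handling of zero counts are illustrative assumptions, not part of the original slides.

```python
import math

def rate_per_100k(events, population_at_risk):
    """Raw rate: events (numerator) over population at risk (denominator)."""
    return events / population_at_risk * 100_000

def log_rate(events, population_at_risk):
    """Natural log of the rate; None when the rate is zero (log undefined)."""
    rate = rate_per_100k(events, population_at_risk)
    return math.log(rate) if rate > 0 else None

# Hypothetical area: 12 events among 48,000 people at risk.
print(rate_per_100k(12, 48_000))
print(log_rate(12, 48_000))
```

Areas with zero events need an explicit decision on a log map (here they are simply returned as None).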

8
Log rate of motor vehicle accident deaths per
100,000 residents, 1990-92
9
Log ratio of homicide deaths of males aged 15-49 per
100,000 residents of the same age group, 1990-92
10
Models of Discrete Spatial Variation
Random variable in area i:
  • number of ill people
  • number of newborn babies
  • per capita income

Source Renato Assunção (UFMG/Brasil)
11
Dealing with rates and proportions
When the study variable is a rate or a
proportion, mapping those rates is the first
obvious step in any analysis. However, the use of
raw observed rates might be misleading, since the
variability of those rates will be a function of
the population counts, which differ widely
between the areas. (Bailey, 1995)
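
A small sketch of why raw rates in small areas are unstable. It assumes event counts are Poisson with mean equal to the true rate times the population, so the observed rate has variance true_rate / population. The rate and population values are hypothetical.

```python
import math

def rate_standard_error(true_rate, population):
    """Standard error of an observed rate under the assumed Poisson model:
    Var(count) = true_rate * population, so Var(count / population)
    = true_rate / population."""
    return math.sqrt(true_rate / population)

true_rate = 10 / 100_000  # hypothetical underlying rate: 10 per 100,000
for population in (1_000, 10_000, 1_000_000):
    se = rate_standard_error(true_rate, population)
    print(f"population={population:>9,}  SE per 100,000 = {se * 100_000:.1f}")
```

With the same underlying rate, the small area's observed rate can swing far more than the large area's, which is what makes a raw-rate map hard to read.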
12
Source Fred Ramos (CEDEST/Brasil)
13
Model-Driven Approaches
  • Model of discrete spatial variation
  • Each subregion is described by a statistical
    distribution Zi
  • e.g., homicide counts are Poisson (?, ?).
  • The main objective of the analysis is to estimate
    the joint distribution of the random variables
    Z = {Z1, ..., Zn}
  • We use a model-driven approach to correct the
    missing data
  • It is called the Empirical Bayes method...
  • We could also use the Full Bayes method (but
    that is another story...)

14
In Bayesian statistics, the best estimate of the
true and unknown rate in area i is a function of
the measured rate in area i.
[Equations not recovered]
Source Fred Ramos (CEDEST/Brasil)
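
A Bayesian estimate of this kind is commonly written as a weighted average of the measured rate and a prior mean. The form below follows the usual presentation of the empirical Bayes estimator (Marshall, 1991); the symbols are assumptions chosen here and may differ from the notation on the original slide.

```latex
\hat{\theta}_i = w_i \, r_i + (1 - w_i)\, \mu ,
\qquad
w_i = \frac{\sigma^2}{\sigma^2 + \mu / n_i} ,
\qquad
r_i = \frac{y_i}{n_i}
```

Here y_i is the event count and n_i the population at risk in area i, r_i is the measured rate, and \mu and \sigma^2 are the prior mean and variance of the true rates. When n_i is small, w_i is close to 0 and the estimate is pulled toward \mu; when n_i is large, w_i is close to 1 and the estimate stays near the measured rate.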
15
Empirical Bayes
Simplifying assumptions for estimating means and
variances for all random variables of all areas
(Marshall, 1991)
Source Fred Ramos (CEDEST/Brasil)
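
A minimal sketch of a global empirical Bayes smoother in the spirit of Marshall (1991). It assumes a single prior mean and variance shared by all areas, both estimated from the data by the method of moments. The function name and the example counts are hypothetical.

```python
def empirical_bayes_rates(events, populations):
    """Shrink raw rates toward the global rate; small areas shrink more."""
    total_pop = sum(populations)
    m = sum(events) / total_pop              # global rate (prior mean)
    if m == 0:
        return [0.0] * len(events)           # no events anywhere
    raw = [y / n for y, n in zip(events, populations)]
    mean_pop = total_pop / len(populations)
    # Population-weighted variance of the raw rates around the global rate.
    s2 = sum(n * (r - m) ** 2 for r, n in zip(raw, populations)) / total_pop
    # Moment estimate of the prior variance, truncated at zero.
    prior_var = max(s2 - m / mean_pop, 0.0)
    smoothed = []
    for r, n in zip(raw, populations):
        weight = prior_var / (prior_var + m / n)
        smoothed.append(m + weight * (r - m))
    return smoothed

# Hypothetical data: two small areas and two large ones.
events = [0, 4, 30, 900]
populations = [500, 1_000, 50_000, 500_000]
for y, n, s in zip(events, populations, empirical_bayes_rates(events, populations)):
    print(f"raw={y / n * 1e5:7.1f}  smoothed={s * 1e5:7.1f}  (per 100,000)")
```

The smallest areas end up close to the global rate, while the largest area keeps a value close to its raw rate.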
16
Source Fred Ramos (CEDEST/Brasil)
17
Infant Mortality Rate São Paulo (Raw)
Source Fred Ramos (CEDEST/Brasil)
18
Infant Mortality Rate São Paulo (Corrected)
Source Fred Ramos (CEDEST/Brasil)
19
Some Important Questions
  • How does scale matter?
  • How do the spatial partitions matter?
  • How does proximity matter?
  • What can we learn by studying how multiple data
    vary in space?
  • How many prior assumptions can we impose on our
    spatial data?

20
A Question of Scale
The Modifiable Areal Unit Problem (MAUP)
  • A basic problem with areal data
  • The spatial definition of the boundaries of the
    areas affects the results
  • Different results can be obtained just by
    changing the boundaries of these zones.
  • This problem is known as the modifiable areal
    unit problem
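
A toy illustration of the effect, assuming eight equally populated units on a 2 x 4 grid and two variables with made-up values. The same units are grouped into four zones in two different ways, and the correlation between the variables is computed at each level.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def aggregate(values, zones):
    """Zone value = mean of its member units (equal populations assumed)."""
    return [sum(values[i] for i in zone) / len(zone) for zone in zones]

# Units indexed row by row: 0 1 2 3 on the top row, 4 5 6 7 on the bottom row.
x = [1, 3, 5, 7, 2, 4, 6, 8]   # hypothetical variable, e.g. income
y = [2, 3, 1, 8, 6, 2, 7, 5]   # hypothetical variable, e.g. jobs
horizontal_zones = [(0, 1), (2, 3), (4, 5), (6, 7)]  # side-by-side pairs
vertical_zones = [(0, 4), (1, 5), (2, 6), (3, 7)]    # top-bottom pairs

print("units:           ", round(pearson(x, y), 2))
print("horizontal zones:", round(pearson(aggregate(x, horizontal_zones),
                                         aggregate(y, horizontal_zones)), 2))
print("vertical zones:  ", round(pearson(aggregate(x, vertical_zones),
                                         aggregate(y, vertical_zones)), 2))
```

The underlying data never change. Only the zone boundaries do, yet the three correlations differ.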

21
Scale Effects
Per capita income
Jobs/ population
Illiterate / population
Source Fred Ramos (CEDEST/Brasil)
22
Scale Effects
Per capita income
Jobs/ population
Illiterate / population
Source Fred Ramos (CEDEST/Brasil)
23
Scale Effects: Fighting the MAUP
Population > 60 years
Illiterates
per capita income
270 ZONES OD97
Source Fred Ramos (CEDEST/Brasil)
24
Scale Effects: Fighting the MAUP
Population > 60 years
Illiterates
per capita income
96 DISTRICTS OF SÃO PAULO
Source Fred Ramos (CEDEST/Brasil)
25
Scale Effects: Fighting the MAUP
Source Fred Ramos (CEDEST/Brasil)
Population > 60 years
Illiterates
per capita income
96 INCOME-HOMOGENOUS ZONES IN SÃO PAULO
26
Correlation matrices
270 ZONES OD97
VARIABLES
A) Percentage of population 60 years old or more
B) Percentage of illiterate population
C) Per capita individual income
96 DISTRICTS
96 INCOME-AGGREGATED
Source Fred Ramos (CEDEST/Brasil)
27
A Question of Scale
Get census data
Adaptation
Identify inter-tract variation
Reduce data variability
Minimize the outlier effect
28
Regionalization
  • Reaggregate N small areas (finest scale available)
    into M bigger regions to reduce scale effects.
  • A possible solution: constrained clustering

29
Regionalization: Maps as graphs
30
Regionalization: Maps as graphs
Simple aggregation
Population-constrained aggregation
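
A rough sketch of population-constrained aggregation on a map treated as a graph. This is a simple greedy region-growing heuristic chosen for illustration, not necessarily the method behind the slides. The adjacency structure, populations, and threshold are hypothetical, and the heuristic does not guarantee that every final region reaches the threshold.

```python
def aggregate_by_population(adjacency, population, min_population):
    """Grow contiguous regions until each reaches min_population, then merge
    undersized leftovers into their least-populated neighbouring region.
    Returns a dict mapping each area to a region id."""
    region_of, region_pop = {}, []
    for seed in sorted(adjacency):
        if seed in region_of:
            continue
        region_id = len(region_pop)
        region_of[seed] = region_id
        total = population[seed]
        frontier = [seed]
        while frontier and total < min_population:
            area = frontier.pop(0)
            for neighbour in sorted(adjacency[area]):
                if neighbour not in region_of and total < min_population:
                    region_of[neighbour] = region_id
                    total += population[neighbour]
                    frontier.append(neighbour)
        region_pop.append(total)
    for rid, total in enumerate(region_pop):
        if total >= min_population:
            continue
        members = [a for a, r in region_of.items() if r == rid]
        neighbours = {region_of[n] for a in members for n in adjacency[a]} - {rid}
        if neighbours:
            target = min(neighbours, key=lambda r: (region_pop[r], r))
            for a in members:
                region_of[a] = target
            region_pop[target] += total
            region_pop[rid] = 0
    return region_of

# Hypothetical 2 x 3 grid of areas:  A B C / D E F
adjacency = {"A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B", "F"},
             "D": {"A", "E"}, "E": {"B", "D", "F"}, "F": {"C", "E"}}
population = {"A": 300, "B": 250, "C": 100, "D": 200, "E": 150, "F": 400}
print(aggregate_by_population(adjacency, population, min_population=500))
```

Because regions only grow along graph edges and merges only join adjacent regions, every resulting region is contiguous.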