Geographically weighted regression - PowerPoint PPT Presentation

1
Geographically weighted regression
  • Danlin Yu
  • Yehua Dennis Wei
  • Dept. of Geog., UWM

2
Outline of the presentation
  • Spatial non-stationarity: an example
  • GWR: some definitions
  • 6 good reasons for using GWR
  • Calibration and tests of GWR
  • An example: housing hedonic model in Milwaukee
  • Further information

3
1. Stationary vs. non-stationary
  • Stationary process (assumed): $y_i = \beta_0 + \beta_1 x_{1i}$
  • Non-stationary process (more realistic): $y_i = \beta_{i0} + \beta_{i1} x_{1i}$
  • [Figure: stationary vs. non-stationary process]
4
Simpson's paradox
  • [Figure: house price vs. house density, in two
    panels: spatially aggregated data and spatially
    disaggregated data]
5
Stationary vs. non-stationary
  • If a non-stationary process is modeled with a
    stationary model
  • Wrong conclusions might be drawn
  • Residuals of the model might be highly spatially
    autocorrelated
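
One common way to check whether residuals are spatially autocorrelated is Moran's I. The slides do not name a specific measure, so its use here is an assumption, as is the binary neighbor rule (two points count as neighbors if they lie within a hypothetical `threshold` distance of each other). Values near zero suggest no spatial pattern; clearly positive values suggest that nearby residuals are alike.

```python
import numpy as np

def morans_i(coords, values, threshold):
    """Moran's I with binary weights: w_ij = 1 if 0 < distance <= threshold."""
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    # Pairwise distances between all locations.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = ((dist > 0) & (dist <= threshold)).astype(float)
    # I = (n / S0) * (sum_ij w_ij z_i z_j) / (sum_i z_i^2), with S0 = sum_ij w_ij
    return len(z) / w.sum() * (z @ w @ z) / (z @ z)

# Hypothetical residuals at ten points along a line.
coords = np.column_stack([np.arange(10.0), np.zeros(10)])
trend_residuals = np.arange(10.0)                 # neighbors are alike
alternating_residuals = np.tile([1.0, -1.0], 5)   # neighbors are opposite

print(morans_i(coords, trend_residuals, threshold=1.0))
print(morans_i(coords, alternating_residuals, threshold=1.0))
```

The smooth trend gives a positive value and the alternating pattern a negative one, which is the contrast a residual check looks for.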

6
Why do relationships vary spatially?
  • Sampling variation
  • Nuisance variation, not real spatial
    non-stationarity
  • Relationships intrinsically different across
    space
  • Real spatial non-stationarity
  • Model misspecification
  • Can significant local variations be removed?

7
2. Some definitions
  • Spatial non-stationarity: the same stimulus
    provokes a different response in different parts
    of the study region
  • Global models: statements about processes which
    are assumed to be stationary and as such are
    location independent

8
Some definitions
  • Local models: spatial decompositions of global
    models; the results of local models are location
    dependent, a characteristic we usually
    anticipate from geographic (spatial) data

9
Regression
  • Regression establishes a relationship between a
    dependent variable and a set of independent
    variables
  • A typical linear regression model looks like
  • $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_n x_{ni} + \varepsilon_i$
  • With $y_i$ the dependent variable, $x_{ji}$ ($j$ from 1 to
    $n$) the set of independent variables, and $\varepsilon_i$ the
    residual, all at location $i$
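
A minimal sketch of fitting this global model by ordinary least squares with NumPy. The data are synthetic and the function name `fit_ols` is hypothetical; the point is only that a single coefficient vector is estimated for every location.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares: one global coefficient vector for all locations."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return beta, residuals

# Synthetic data (hypothetical): y = 2 + 3*x1 - 1*x2 + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

beta, residuals = fit_ols(X, y)
print(beta)
```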

10
Regression
  • When applied to spatial data, as can be seen, it
    assumes a stationary spatial process
  • The same stimulus provokes the same response in
    all parts of the study region
  • Highly untenable for spatial processes

11
Geographically weighted regression
  • Local statistical technique to analyze spatial
    variations in relationships
  • Spatial non-stationarity is assumed and will be
    tested
  • Based on the First Law of Geography: everything
    is related to everything else, but closer
    things are more related

12
GWR
  • Addresses the non-stationarity directly
  • Allows the relationships to vary over space,
    i.e., the $\beta$s do not need to be the same everywhere
  • This is the essence of GWR, in the linear form
  • $y_i = \beta_{i0} + \beta_{i1} x_{1i} + \beta_{i2} x_{2i} + \dots + \beta_{in} x_{ni} + \varepsilon_i$
  • Instead of remaining the same everywhere, the $\beta$s now
    vary with location ($i$)

13
3. 6 good reasons for using GWR
  • GWR is part of a growing trend in GIS towards
    local analysis
  • Local statistics are spatial disaggregations of
    global ones
  • Local analysis intends to understand the spatial
    data in more detail

14
Global vs. local statistics
  • Global statistics
  • Similarity across space
  • Single-valued statistics
  • Not mappable
  • GIS unfriendly
  • Search for regularities
  • aspatial
  • Local statistics
  • Difference across space
  • Multi-valued statistics
  • Mappable
  • GIS friendly
  • Search for exceptions
  • spatial

15
6 good reasons for using GWR
  • Provides useful link to GIS
  • GISs are very useful for the storage,
    manipulation and display of spatial data
  • Analytical functions are not fully developed
  • In some cases the link between GIS and spatial
    analysis has been a step backwards
  • Better spatial analytical tools are called for to
    take advantage of GIS functions

16
GWR and GIS
  • An important catalyst for the better integration
    of GIS and spatial analysis has been the
    development of local spatial statistical
    techniques
  • GWR is among the recent developments in
    local spatial analytical techniques

17
6 good reasons for using GWR
  • GWR is widely applicable to almost any form of
    spatial data
  • Spatial link between health and wealth
  • Presence/absence of a disease
  • Determinants of house values
  • Regional development mechanisms
  • Remote sensing

18
6 good reasons for using GWR
  • GWR is truly a spatial technique
  • It uses geographic information as well as
    attribute information
  • It employs a spatial weighting function with the
    assumption that near places are more similar than
    distant ones (geography matters)
  • The outputs are location-specific, hence mappable
    for further analysis

19
6 good reasons for using GWR
  • Residuals from GWR are generally much lower and
    usually much less spatially dependent
  • GWR models give much better fits to data, EVEN
    accounting for added model complexity and number
    of parameters (decrease in degrees of freedom)
  • GWR residuals are usually much less spatially
    dependent

21
6 good reasons for using GWR
  • GWR as a spatial microscope
  • Instead of determining an optimal bandwidth
    (or number of nearest neighbors), it can be set
    a priori
  • A series of bandwidths can be selected and the
    resulting parameter surfaces examined at different
    levels of smoothing (like adjusting the
    magnification of a microscope)

22
6 good reasons for using GWR
  • GWR as a spatial microscope
  • Different levels of detail exhibit different
    spatially varying patterns, which gives researchers
    more flexibility in discovering interesting
    spatial patterns, examining theories, and
    determining further steps

23
4. Calibration of GWR
  • Locally weighted least squares
  • Weights are attached to locations
  • Based on the First Law of Geography: everything
    is related to everything else, but closer
    things are more related than remote ones
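
Locally weighted least squares can be sketched as follows. At each regression point, the coefficients solve $\hat{\beta}(i) = (X^T W_i X)^{-1} X^T W_i y$, where $W_i$ is a diagonal matrix of distance-based weights. The Gaussian weight function, the function name `gwr_coefficients_at`, and the synthetic data are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def gwr_coefficients_at(point, coords, X, y, bandwidth):
    """Estimate the local coefficients at `point` by weighted least squares."""
    design = np.column_stack([np.ones(len(y)), X])    # intercept + predictors
    dist = np.linalg.norm(coords - point, axis=1)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel (assumed)
    xtw = design.T * weights                          # X^T W without forming W
    return np.linalg.solve(xtw @ design, xtw @ y)

# Synthetic data (hypothetical): the slope grows from 1 in the west to 3 in the east.
rng = np.random.default_rng(1)
coords = rng.uniform(size=(500, 2))
x1 = rng.normal(size=500)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.1, size=500)

west = gwr_coefficients_at(np.array([0.1, 0.5]), coords, x1, y, bandwidth=0.1)
east = gwr_coefficients_at(np.array([0.9, 0.5]), coords, x1, y, bandwidth=0.1)
print("west slope:", west[1], "east slope:", east[1])
```

Repeating the fit at every observation location yields one coefficient vector per location, which is what makes the results mappable.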

24
Weighting schemes
  • A weighting scheme determines the weights
  • Most schemes tend to be Gaussian or Gaussian-like,
    reflecting the type of dependency found in most
    spatial processes
  • A scheme can be either fixed or adaptive
  • Both schemes, based on Gaussian or Gaussian-like
    functions, are implemented in GWR 3.0 and R

25
Fixed weighting scheme
  • [Figure: the weighting function and bandwidth
    under a fixed scheme]
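
A fixed scheme can be sketched with a Gaussian weight function that uses the same bandwidth at every regression point. The exact form $w_{ij} = \exp(-\tfrac{1}{2}(d_{ij}/b)^2)$ and the name `fixed_gaussian_weights` are assumptions; the slide's own formula did not survive in the transcript.

```python
import numpy as np

def fixed_gaussian_weights(point, coords, bandwidth):
    """Gaussian weights with one bandwidth for the whole study region."""
    dist = np.linalg.norm(np.asarray(coords, dtype=float) - point, axis=1)
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

# Hypothetical observations at increasing distance from the regression point.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
weights = fixed_gaussian_weights(np.array([0.0, 0.0]), coords, bandwidth=1.0)
print(weights)
```

The weight is 1 at the regression point itself and decays smoothly with distance, so an observation many bandwidths away contributes almost nothing.
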
26
Problems of fixed schemes
  • Might produce large estimate variances where data
    are sparse, while masking subtle local variations
    where data are dense
  • In extreme cases, fixed schemes might not be
    able to calibrate in local areas where data are
    too sparse to satisfy the calibration
    requirements (there must be more observations
    than parameters)

27
Adaptive weighting schemes
  • [Figure: the weighting function and bandwidth
    under an adaptive scheme]
28
Adaptive weighting schemes
  • Adaptive schemes adjust themselves according to the
    density of data
  • Shorter bandwidths where data are dense and
    longer where sparse
  • Finding nearest neighbors is one of the most often
    used approaches
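
A nearest-neighbor adaptive scheme can be sketched as below. The bisquare weight function is an assumption (one of the Gaussian-like functions commonly used for this purpose), and `adaptive_bisquare_weights` and `k` are hypothetical names. The distance array is taken to include the regression point itself (distance 0), so `k` should be at least 2.

```python
import numpy as np

def adaptive_bisquare_weights(dist, k):
    """Bisquare weights; the bandwidth is the distance to the k-th nearest point."""
    dist = np.asarray(dist, dtype=float)
    bandwidth = np.sort(dist)[k - 1]      # grows where data are sparse
    u = dist / bandwidth
    weights = np.where(u < 1.0, (1.0 - u ** 2) ** 2, 0.0)
    return weights, bandwidth

# Hypothetical distances from a regression point in a dense and a sparse area.
dense = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
sparse = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

dense_w, dense_b = adaptive_bisquare_weights(dense, k=4)
sparse_w, sparse_b = adaptive_bisquare_weights(sparse, k=4)
print(dense_b, dense_w)
print(sparse_b, sparse_w)
```

The two areas receive the same weights from very different bandwidths, so each local calibration draws on the same number of observations regardless of data density.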

29
Calibration
  • Surprisingly, the results of GWR appear to be
    relatively insensitive to the choice of weighting
    function, as long as it is a continuous
    distance-based function (Gaussian or
    Gaussian-like functions)
  • Whichever weighting function is used, however, the
    result will be sensitive to the bandwidth(s)

30
Calibration
  • An optimal bandwidth (or number of nearest
    neighbors) satisfies either
  • Minimum cross-validation (CV) score
  • CV score: the sum of squared differences between
    the observed values and the GWR-calibrated values
    obtained with that bandwidth or number of nearest
    neighbors, each observation being left out of its
    own calibration
  • Minimum Akaike Information Criterion (AIC)
  • An information criterion that accounts for the added
    complexity of GWR models
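
Bandwidth selection by cross-validation can be sketched as a leave-one-out loop: each observation is predicted from a local fit that excludes it, and the squared prediction errors are summed. The Gaussian kernel, the candidate bandwidths, and the synthetic data are assumptions; real software typically searches the bandwidth more efficiently than this grid.

```python
import numpy as np

def cv_score(coords, X, y, bandwidth):
    """Leave-one-out cross-validation score for a fixed Gaussian bandwidth."""
    design = np.column_stack([np.ones(len(y)), X])
    score = 0.0
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        weights[i] = 0.0                      # leave observation i out
        xtw = design.T * weights
        beta_i = np.linalg.solve(xtw @ design, xtw @ y)
        score += (y[i] - design[i] @ beta_i) ** 2
    return score

# Synthetic data (hypothetical): the slope varies from west to east.
rng = np.random.default_rng(2)
coords = rng.uniform(size=(200, 2))
x1 = rng.normal(size=200)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.1, size=200)

candidates = [0.1, 0.2, 0.4, 0.8, 5.0]        # 5.0 is close to a global fit
scores = {b: cv_score(coords, x1, y, b) for b in candidates}
best_bandwidth = min(scores, key=scores.get)
print(scores, best_bandwidth)
```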

31
Tests
  • Is GWR really better than OLS?
  • An ANOVA table test (done in GWR 3.0, R)
  • The Akaike Information Criterion (AIC)
  • The lower the AIC, the better the model
  • Rule of thumb: a decrease in AIC of 3 is
    regarded as a successful improvement

32
Tests
  • Are the coefficients really varying across space?
  • F-tests based on the variance of coefficients
  • Monte Carlo tests: random permutation of the data
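
A Monte Carlo test of this kind can be sketched as follows: the observed variance of each local coefficient is compared with the variances obtained after randomly reassigning the locations among the observations. This is an illustrative sketch under assumed choices (fixed Gaussian kernel, 99 permutations, hypothetical function names and synthetic data), not the procedure as implemented in GWR 3.0 or R.

```python
import numpy as np

def local_coefficients(coords, X, y, bandwidth):
    """Local coefficient vectors at every observation location."""
    design = np.column_stack([np.ones(len(y)), X])
    betas = np.empty((len(y), design.shape[1]))
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        xtw = design.T * weights
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas

def monte_carlo_p_values(coords, X, y, bandwidth, n_perm=99, seed=0):
    """Pseudo p-values for the spatial variability of each coefficient."""
    rng = np.random.default_rng(seed)
    observed = local_coefficients(coords, X, y, bandwidth).var(axis=0)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffle the locations while keeping each (X, y) pair together.
        shuffled = coords[rng.permutation(len(y))]
        simulated = local_coefficients(shuffled, X, y, bandwidth).var(axis=0)
        exceed += simulated >= observed
    return (exceed + 1) / (n_perm + 1)

# Synthetic data (hypothetical): constant intercept, spatially varying slope.
rng = np.random.default_rng(3)
coords = rng.uniform(size=(150, 2))
x1 = rng.normal(size=150)
y = 1.0 + (1.0 + 2.0 * coords[:, 0]) * x1 + rng.normal(scale=0.1, size=150)

p_values = monte_carlo_p_values(coords, x1, y, bandwidth=0.2)
print(p_values)   # order: intercept, slope
```

A small pseudo p-value for a coefficient indicates that its observed spatial variation is unlikely to arise when location carries no information.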

33
5. An example
  • Housing hedonic model in Milwaukee
  • Data: MPROP 2004, 3430 samples used
  • Dependent variable: the assessed value (price)
  • Independent variables: air conditioner, floor
    size, fireplace, house age, number of bathrooms,
    soil and impervious surface (acquired by remote
    sensing)

34
The global model
35
The global model
  • 62% of the dependent variable's variation is
    explained
  • All determinants are statistically significant
  • Floor size is the largest positive determinant;
    house age is the largest negative determinant
  • Deteriorated environmental conditions (a large
    portion of soil and impervious surface) have a
    significant negative impact

36
GWR run summary
  • Number of nearest neighbors for calibration: 176
    (adaptive scheme)
  • AIC: 76317.39 (global: 81731.63)
  • GWR performs better than the global model
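
Worked out with the two AIC values reported above, the improvement is far larger than the rule-of-thumb threshold of 3 given in the Tests section:

```latex
\[
\Delta\mathrm{AIC}
  = \mathrm{AIC}_{\text{global}} - \mathrm{AIC}_{\text{GWR}}
  = 81731.63 - 76317.39
  = 5414.24
\]
```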

37
GWR run non-stationarity check
Tests are based on the variance of coefficients; the
coefficients of all independent variables vary
significantly over space
39
General conclusions
  • Except for floor size, the established
    relationships between house values and the
    predictors are not necessarily significant
    everywhere in the City
  • The same amount of change in these attributes
    (ceteris paribus) will bring a larger amount of
    change in house values for houses located near the
    Lake than for those farther away

40
General conclusions
  • In the northwest and central eastern part of the
    City, the relationship between house age and house
    value is the opposite of what the global model suggests
  • This is where the original immigrants built their
    houses, and historical value outweighs the
    negative impact of house age on house values

41
6. Interested Groups
  • The GWR 3.0 software package can be obtained from
    Professor Stewart Fotheringham
    (stewart.fotheringham_at_MAY.IE)
  • GWR R code is available from Danlin Yu directly
    (danlinyu_at_uwm.edu)
  • Any interested groups can contact either
    Professor Yehua Dennis Wei (weiy_at_uwm.edu) or me
    for further info.

42
Interested Groups
  • The book Geographically Weighted Regression: The
    Analysis of Spatially Varying Relationships is
    HIGHLY recommended for anyone who is interested
    in applying GWR to their own problems

43
Acknowledgement
  • Parts of the content of this workshop are from the
    CSISS 2004 summer workshop Geographically
    Weighted Regression and Associated Statistics
  • Special thanks go to Professors Stewart
    Fotheringham, Chris Brunsdon, Roger Bivand and
    Martin Charlton

44
Thank you all
  • Questions and comments