Delineating Metropolitan Housing Submarkets with Fuzzy Clustering Methods

1
Delineating Metropolitan Housing Submarkets with
Fuzzy Clustering Methods
  • Julie Sungsoon Hwang
  • Department of Geography, University of Washington
  • Jean-Claude Thill
  • Department of Geography, State University of New
    York at Buffalo

November 10, 2005, North American Meetings of the
Regional Science Association International
2
Outline
  • Research objectives
  • Methodology specification
  • Methodology illustration
  • Evaluating the performance of fuzzy clustering
  • Conclusions

3
Research objectives
  • Demonstrate the use of the fuzzy c-means (FCM)
    algorithm for delineating housing submarkets
  • Comparison to K-means
  • Discuss the empirical characteristics of FCM in
    this application, in particular the choice of
    parameters
  • Cluster validity index

4
Challenges
  • Are the boundaries of clusters crisp?

5
  • Methodology specification

6
  • Our task is to group census tracts into homogeneous
    housing submarkets within a metropolitan area
  • Using the fuzzy c-means algorithm
  • In order to examine whether fuzzy set-based
    clustering can do a better job
  • Implemented in 85 metropolitan areas
  • Most of the data sets are public (e.g. 2000 Census)
  • The whole procedure is automated in GIS

7
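Methodology flow chart
[Flow chart, repeated for each metropolitan area. Candidate variables x1 ... xm for tracts 1 ... n (labels: National, Regional, Metro, Local) are reduced to significant variables y1 ... yk (k selected variables), which feed the cluster analysis. The output is a (c × n) membership matrix U1 ... Uc for c submarkets, where Uj is the membership to cluster j. The chart contrasts a crisp partition (memberships of 0 or 1, e.g. tract 1: 1, 0, 0) with a fuzzy partition (graded memberships, e.g. tract 1: 0.85, 0.05, 0.10).]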
8
Explanatory variables for house price
Var_Name   Variable definition              Data     Year   Spatial unit
Socioeconomic/demographic characteristics of residents
pcincome   per capita income                Census   2000   Census Tract
college    college degree                   Census   2000   Census Tract
managep    management workers               Census   2000   Census Tract
prodp      production workers               Census   2000   Census Tract
famcpchl   family with children             Census   2000   Census Tract
nfmalone   nonfamily living alone           Census   2000   Census Tract
black_p    black                            Census   2000   Census Tract
nhwht_p    non-hispanic white               Census   2000   Census Tract
nativebr   native born                      Census   2000   Census Tract
Structural characteristics of housing units
medroom    median number of rooms           Census   2000   Census Tract
hudetp     detached housing unit            Census   2000   Census Tract
yrhublt    median year structure built      Census   2000   Census Tract
Locational characteristics (amenities) of neighborhoods
ptratio    pupil to teacher ratio           NCES     2002   School District
schexp     school expenditure per student   NCES     2002   School District
vrlcrime   violent crime rate               FBI      2003   Designated Place
prpcrime   property crime rate              FBI      2003   Designated Place
jobacm     job accessibility (Hansen 1959)  CTPP     2000   Census Tract
NCES: National Center for Education Statistics. FBI: annual report, Crime in the U.S. 2003. CTPP: Census Transportation Planning Package.
Dependent variable: median home value of owner-occupied housing units
9
Study set: 85 metropolitan areas
10
What is fuzzy c-means (FCM)?
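  • Clustering method that minimizes the following
    objective function

J_m(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^{m} \lVert x_k - v_i \rVert^{2}

x_k: vector of data point k, 1 ≤ k ≤ n
v_i: center of cluster i, 1 ≤ i ≤ c
u_ik: membership degree of data point k with cluster i, u_ik ∈ [0, 1]
m: fuzziness amount associated with assigning data point k to cluster i, 1 ≤ m < ∞
  • Updates cluster means vi and membership degree
    uik until the algorithm converges

v_i = \frac{\sum_{k=1}^{n} (u_{ik})^{m} x_k}{\sum_{k=1}^{n} (u_{ik})^{m}}    (III-3a)
u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{2/(m-1)} \right]^{-1}    (III-3b)
Source: Bezdek 1981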
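The alternating update described on this slide can be sketched in a few lines of NumPy. This is a minimal illustration of the standard fuzzy c-means iteration from Bezdek (1981), not the authors' GIS implementation. The function name `fcm`, the random initialization, the stopping rule (largest absolute change in a membership), and the default values are all assumptions made for the example. It requires m > 1, because the membership update is undefined at m = 1.

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Illustrative fuzzy c-means. X is (n, p); returns (U, V).

    U has shape (c, n): U[i, k] is the membership of point k in cluster i.
    V has shape (c, p): the cluster centers from the last center update.
    Assumes m > 1 (the crisp limit m = 1 is not handled here).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)            # memberships of each point sum to 1
    for _ in range(max_iter):
        Um = U ** m
        # Center update: membership-weighted mean of the data points.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared distances between every center and every point, shape (c, n).
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)               # guard against division by zero
        # Membership update: inverse distances raised to 2/(m-1), normalized per point.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() <= tol
        U = U_new
        if converged:
            break
    return U, V
```

A crisp assignment, if one is needed for mapping, can be read off as `U.argmax(axis=0)`; the graded memberships in `U` are what distinguish the fuzzy result from K-means.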
11
FCM missing elements
  • Optimal number of clusters c
  • Optimal fuzziness amount m

[Diagram: c and m shown as inputs to FCM]
12
Extended fuzzy c-means algorithm
  • Step 1: Initialize the parameters related to
    fuzzy partitioning: c = 2 (2 ≤ c ≤ cmax), m = 1
    (1 ≤ m ≤ mmax), where c is an integer and m is a
    real number. Fix minc, where minc is the incremental
    value of m (0 < minc ≤ 0.1). Fix the cut-off
    threshold εL. Choose a validity index v.
  • Step 2: Given c and m, initialize U(0) so that it
    is a fuzzy partition matrix. Then at step l, l = 0,
    1, 2, ...
  • Step 3: Calculate the c fuzzy cluster centers
    vi(l) with (III-3a) and U(l)
  • Step 4: Update U(l+1) using (III-3b) and vi(l)
  • Step 5: Compare U(l) to U(l+1) in a convenient
    matrix norm: if ||U(l+1) − U(l)|| ≤ εL, go to
    Step 6; otherwise return to Step 3.
  • Step 6: Compute the validity index for the given c
    and m
  • Step 7: If c < cmax, then increase c ← c + 1 and
    go to Step 3; otherwise go to Step 8
  • Step 8: If m < mmax, then increase m ← m + minc
    and go to Step 3; otherwise go to Step 9
  • Step 9: Obtain the optimal validity index, together
    with the corresponding optimal number of clusters c
    and optimal fuzziness exponent m. The optimal fuzzy
    partition U is the one obtained for that c and m.
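The outer search over c and m in Steps 6 to 9 can be sketched as a plain grid search. This is a hedged illustration, not the authors' code: the function name `extended_fcm_search` and the callables `fit` and `validity` are hypothetical placeholders for a clustering routine (Steps 2 to 5) and a validity index (Step 6). Two assumptions differ from the slide. The sketch re-initializes the partition for every (c, m) pair, and it starts m at `m_min` = 1.1 rather than 1, because the membership update is undefined at m = 1.

```python
def extended_fcm_search(X, fit, validity, c_max, m_max,
                        m_min=1.1, m_inc=0.1, minimize=True):
    """Grid search over the number of clusters c and the fuzziness exponent m.

    fit(X, c, m) -> (U, V)           hypothetical clustering routine
    validity(X, U, V, m) -> float    hypothetical validity index
    Returns (best, scores): best is (index value, c, m, U) and
    scores maps each (c, m) pair to its validity index.
    """
    # Count the steps in m with integers to avoid floating-point drift.
    n_steps = int(round((m_max - m_min) / m_inc)) + 1
    best = None
    scores = {}
    for j in range(n_steps):                  # Step 8: loop over m
        m = round(m_min + j * m_inc, 10)
        for c in range(2, c_max + 1):         # Step 7: loop over c
            U, V = fit(X, c, m)               # Steps 2-5: cluster for this (c, m)
            v = validity(X, U, V, m)          # Step 6: score the partition
            scores[(c, m)] = v
            better = best is None or (v < best[0] if minimize else v > best[0])
            if better:                        # Step 9: keep the best partition so far
                best = (v, c, m, U)
    return best, scores
```

Set `minimize=False` for indices where larger is better, such as the partition coefficient; the default suits indices such as Xie-Beni, where smaller values indicate a better partition.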

13
Cluster validity indices
Partition coefficient
Partition entropy
SVi index where w is set to 2 in this study
Xie-Beni index
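Three of the indices named on this slide have widely used textbook definitions, sketched below. These may differ in detail from the formulas on the original slide, which did not survive in the transcript, and the SVi index is left out because its definition is not given here. The function names are chosen for the example. The Xie-Beni version shown uses squared memberships, as in the original 1991 definition; some variants use the exponent m instead.

```python
import numpy as np

def partition_coefficient(U):
    """Mean of squared memberships; 1 for a crisp partition, 1/c when fully fuzzy."""
    return float((U ** 2).sum() / U.shape[1])

def partition_entropy(U):
    """Mean entropy of memberships; 0 for a crisp partition, ln(c) when fully fuzzy."""
    safe = np.clip(U, 1e-12, 1.0)             # avoid log(0); 0 * log(.) still gives 0
    return float(-(U * np.log(safe)).sum() / U.shape[1])

def xie_beni(X, U, V):
    """Compactness divided by separation; smaller values indicate a better partition.

    X is (n, p) data, U is (c, n) memberships, V is (c, p) centers.
    """
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)    # (c, n) squared distances
    compactness = ((U ** 2) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)   # (c, c) center distances
    np.fill_diagonal(sep, np.inf)             # ignore the distance of a center to itself
    return float(compactness / (X.shape[0] * sep.min()))
```

The partition coefficient is maximized and the partition entropy and Xie-Beni index are minimized when comparing candidate values of c and m.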
14
Determining c and m
  • Selected validity indices are calibrated over the
    study set
  • Xie-Beni index is recommended as a validity index
  • Average m is 1.38

15
Histogram of m for FCM
16
  • Methodology illustration

17
Median home value of Buffalo, NY
18
Dimensionality of Buffalo housing market
Hedonic regression equation of median home value in Buffalo, NY
Predictor                     Coefficient   Standard error   t-statistic   p-value
Constant                      -1455768      164417           -8.85         0.000
Per capita income             2.3667        0.2791           8.48          0.000
college degree                88221         11346            7.78          0.000
family couple with children   65735         18775            3.50          0.001
detached housing unit         -31260        5527             -5.66         0.000
Housing age (year)            692.88        80.26            8.63          0.000
non-hispanic white            11186         3914             2.86          0.005
native born status            130039        31111            4.18          0.000
Job accessibility             -0.05266      0.02227          -2.36         0.019
Adjusted R-squared: 84.3%
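As a quick consistency check on the table, each t-statistic should equal the coefficient divided by its standard error. The short script below recomputes that ratio from the reported values; the dictionary layout and variable names are chosen for this example only.

```python
# (coefficient, standard error, reported t-statistic) as listed in the table.
rows = {
    "Constant": (-1455768, 164417, -8.85),
    "Per capita income": (2.3667, 0.2791, 8.48),
    "college degree": (88221, 11346, 7.78),
    "family couple with children": (65735, 18775, 3.50),
    "detached housing unit": (-31260, 5527, -5.66),
    "Housing age (year)": (692.88, 80.26, 8.63),
    "non-hispanic white": (11186, 3914, 2.86),
    "native born status": (130039, 31111, 4.18),
    "Job accessibility": (-0.05266, 0.02227, -2.36),
}

# t = coefficient / standard error for each predictor.
t_stats = {name: coef / se for name, (coef, se, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name:30s} computed {t_stats[name]:7.2f}   reported {reported:7.2f}")
```

Every recomputed ratio agrees with the reported t-statistic to two decimal places.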
19
Optimal number of housing submarkets c, optimal
fuzziness amount m, Buffalo, NY
c \ m       1.0      1.1      1.2      1.3      1.4      1.5      1.6      1.7      1.8      1.9
2           0.4735   0.4570   0.4380   8.0983   10.4115  12.5478  14.4334  16.0634  17.4645  18.6721
3           0.4136   0.3889   0.3460   0.3385   10.7864  12.9137  14.7939  16.4217  17.8290  19.0553
4           0.7802   0.7116   0.6080   0.5241   1.3154   6.8837   7.4807   8.0441   8.5632   9.0391
5           0.5560   0.5622   0.5940   0.6121   0.4683   0.3404   0.6489   0.6850   0.7206   0.7555
6           0.6223   0.7578   1.0187   0.8173   0.6907   1.3393   1.4074   1.4819   1.5595   1.6382
7           0.8836   0.6903   0.6881   0.6016   0.6148   0.9515   2.4397   2.6306   2.8317   3.0383
8           0.5981   0.5888   0.5703   0.5232   0.3992   0.7381   0.8910   1.2388   1.2926   1.3538
9           0.9645   0.6160   0.4836   0.4866   0.8449   1.4020   1.4198   1.8317   1.8639   1.9161
10          0.7053   0.6004   0.6619   0.5873   0.5868   1.3465   1.5081   1.6875   1.8215   1.8591
Optimal c   3        3        3        3        8        5        5        5        5        5
Values in the cells are the Xie-Beni index for the given
c and m; the last row gives the c with the smallest index for each m
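Reading the optimum off this table amounts to taking the minimum of the Xie-Beni index, per column for the best c at each m and over the whole grid for the best (c, m) pair. The snippet below does this with the values copied from the table; the variable names are chosen for the example.

```python
import numpy as np

m_values = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
c_values = [2, 3, 4, 5, 6, 7, 8, 9, 10]

# Xie-Beni index: one row per c, one column per m, copied from the table.
xb = np.array([
    [0.4735, 0.4570, 0.4380, 8.0983, 10.4115, 12.5478, 14.4334, 16.0634, 17.4645, 18.6721],
    [0.4136, 0.3889, 0.3460, 0.3385, 10.7864, 12.9137, 14.7939, 16.4217, 17.8290, 19.0553],
    [0.7802, 0.7116, 0.6080, 0.5241, 1.3154, 6.8837, 7.4807, 8.0441, 8.5632, 9.0391],
    [0.5560, 0.5622, 0.5940, 0.6121, 0.4683, 0.3404, 0.6489, 0.6850, 0.7206, 0.7555],
    [0.6223, 0.7578, 1.0187, 0.8173, 0.6907, 1.3393, 1.4074, 1.4819, 1.5595, 1.6382],
    [0.8836, 0.6903, 0.6881, 0.6016, 0.6148, 0.9515, 2.4397, 2.6306, 2.8317, 3.0383],
    [0.5981, 0.5888, 0.5703, 0.5232, 0.3992, 0.7381, 0.8910, 1.2388, 1.2926, 1.3538],
    [0.9645, 0.6160, 0.4836, 0.4866, 0.8449, 1.4020, 1.4198, 1.8317, 1.8639, 1.9161],
    [0.7053, 0.6004, 0.6619, 0.5873, 0.5868, 1.3465, 1.5081, 1.6875, 1.8215, 1.8591],
])

# Best c for each m: row index of the column minimum.
best_c_per_m = [c_values[i] for i in xb.argmin(axis=0)]

# Best (c, m) overall: position of the global minimum.
i, j = np.unravel_index(xb.argmin(), xb.shape)
best_c, best_m, best_xb = c_values[i], m_values[j], float(xb[i, j])

print(best_c_per_m)             # [3, 3, 3, 3, 8, 5, 5, 5, 5, 5]
print(best_c, best_m, best_xb)  # 3 1.3 0.3385
```

The global minimum of 0.3385 at c = 3 and m = 1.3 is only slightly below the value of 0.3404 at c = 5 and m = 1.5, so the two candidate partitions score almost equally on this index.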
20
Buffalo housing submarkets
c = 3, m = 1.3
21
  • Evaluating the performance of fuzzy clustering

22
Compare FCM with K-means (KM)
  • Compare the sum of squared errors
    derived from KM (m = 1) and FCM (m set to its
    optimal value) for a given c
  • Fuzzy clustering outperforms crisp clustering

23
Conclusions
  • Fuzzy set theory provides a mechanism for
    handling the uncertainty involved in classification
    tasks
  • The fuzzy c-means algorithm is of practical use in
    delineating housing submarkets
  • Fuzzy set theory needs further attention in
    social science fields
  • More work on the choice of parameters is needed