Predictive Models I - PowerPoint PPT Presentation

About This Presentation
Title:

Predictive Models I

Slides: 24
Provided by: garylchris
Transcript and Presenter's Notes

1
Predictive Models I
  • RNR/Geog 420/520
  • Spring 2000

2
Predictive Models
  • Important to understand what we are attempting to
    predict
  • These models predict location
  • This prediction is based on reasoned or measured
    relationships
  • No predictive model is perfect
  • Some are more efficient than others

3
Broad Model Types
  • Deductive models are based on reasoning in which
    the conclusion follows necessarily from the
    presented premises
  • Inductive models base validity on observations
    about part of a class as evidence for a
    proposition about the whole class

4
The Goal of Predictive Models
  • Models in a GIS should maintain a high percentage
    of correct predictions while decreasing the area
    needed to obtain these predictions
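The trade-off stated on this slide can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the counts are hypothetical, and the gain measure (1 minus the ratio of area fraction to correct fraction) is one common way to summarize the trade-off, not a measure named in the slides.

```python
def model_performance(sites_in_zone, total_sites, zone_cells, total_cells):
    """Return (fraction of sites correctly predicted, fraction of area used)."""
    correct = sites_in_zone / total_sites
    area = zone_cells / total_cells
    return correct, area


def gain(correct, area):
    """Gain = 1 - area / correct (an assumed summary measure, not from the slides)."""
    return 1.0 - area / correct


# Hypothetical counts: 80 of 100 known sites fall inside a predicted zone
# that covers 2,500 of the 10,000 cells in the study area.
correct, area = model_performance(80, 100, 2500, 10000)
print(correct, area, gain(correct, area))
```

A model that flags the whole study area predicts every site correctly but has a gain of zero; shrinking the flagged area while keeping the correct fraction high pushes the gain toward 1.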

5
Why Make Predictive Models
  • Resource management
  • Reduce costs but maintain service
  • Planning decisions
  • Discover favored habitats
  • Understand behavior
  • Discover preferences
  • Prove theories
  • Disprove theories

6
Components of a Model
  • Variables
  • Study Group
  • Control Group
  • Suitable Statistical model/test

7
Variable Selection
  • Variables are usually selected because they are
    thought to exert influence on the phenomena being
    studied
  • The data type (e.g. nominal vs. continuous) of a
    variable can restrict the types of models it is
    possible to make
  • A GIS allows the researcher to control continuous data

8
Study Group
  • Locations where phenomena being investigated are
    located
  • A good study group requires a good collection
    strategy
  • SAMPLE(<mask_grid>, grid, ..., grid)
  • zone x y cellvalue1 cellvaluen
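The output layout listed above (zone, x, y, then one cell value per grid) can be mimicked in a few lines. This is not the GRID implementation of SAMPLE: the function `sample_grids`, the list-of-lists grids, the cell-centre coordinates, and the tiny example data are all assumptions made for illustration.

```python
def sample_grids(mask, grids, cellsize=1.0, xmin=0.0, ymax=None):
    """Return (zone, x, y, value1, ..., valuen) for every non-empty mask cell.

    mask and each grid are equal-sized lists of rows; None in the mask
    stands for NODATA. Row 0 is assumed to be the northern edge.
    """
    if ymax is None:
        ymax = len(mask) * cellsize
    rows = []
    for r, mask_row in enumerate(mask):
        for c, zone in enumerate(mask_row):
            if zone is None:
                continue  # cell is outside the study group mask
            x = xmin + (c + 0.5) * cellsize  # cell-centre easting
            y = ymax - (r + 0.5) * cellsize  # cell-centre northing
            rows.append((zone, x, y) + tuple(g[r][c] for g in grids))
    return rows


# Hypothetical 2 x 2 grids: two site cells in the mask, two variable grids.
mask = [[None, 1], [2, None]]
elev = [[840, 852], [854, 805]]
slope = [[5, 7], [3, 9]]
rows = sample_grids(mask, [elev, slope])
for row in rows:
    print(*row)
```

Each printed row is one record of the kind a regression or a Kolmogorov-Smirnov test would take as input.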

9
Control Group
  • Often necessary to discover the significance of
    spatial patterns
  • For example, is it significant if 75% of coyote
    dens are located within 50 meters of houses?
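One way to approach the question on this slide is to ask how likely the observed count would be if dens were placed like the control locations. The sketch below uses an exact binomial tail; the sample sizes and the control proportion are hypothetical numbers chosen for illustration, and the binomial test itself is an assumption rather than the method the slides prescribe.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 15 of 20 dens (75%) lie within 50 m of a house,
# while 40% of random control locations lie within 50 m of a house.
n_dens, dens_near = 20, 15
control_proportion = 0.40

p_value = binomial_upper_tail(dens_near, n_dens, control_proportion)
print(f"P(at least {dens_near} of {n_dens} near houses by chance) = {p_value:.4f}")
```

If most of the landscape were within 50 m of a house, the same 75% would be unremarkable, which is why the control group is needed.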

10
Trend Surface Models
  • Based strictly on location
  • Trend surface models use a polynomial regression
    to fit a least-squares surface to input points
  • As the order of the polynomial is increased, the
    surface being fitted becomes progressively more
    complex
  • Two variations: Linear and Logistic
  • TREND(<point_cover | point_file>, spot_item,
    order, LINEAR | LOGISTIC, cellsize, xmin,
    ymin, xmax, ymax)
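The least-squares fit behind the linear variation can be sketched in plain Python. This is not how GRID's TREND is implemented: the helper names (`poly_terms`, `solve`, `fit_trend`, `predict`), the normal-equations approach, and the sample points are assumptions made for illustration.

```python
def poly_terms(x, y, order):
    """All monomials x**i * y**j with i + j <= order (constant term first)."""
    return [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]


def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * coef[k] for k in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def fit_trend(points, order):
    """Least-squares polynomial surface for (x, y, z) points via normal equations."""
    rows = [poly_terms(x, y, order) for x, y, _ in points]
    z = [p[2] for p in points]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(n)]
    return solve(ata, atz)


def predict(coef, x, y, order):
    """Evaluate the fitted surface at (x, y)."""
    return sum(c * t for c, t in zip(coef, poly_terms(x, y, order)))


# Hypothetical points lying exactly on the plane z = 2 + 3x - y.
points = [(0, 0, 2), (1, 0, 5), (0, 1, 1), (1, 1, 4), (2, 1, 7)]
coef = fit_trend(points, order=1)  # term order for order 1: [1, y, x]
print(coef, predict(coef, 3, 2, order=1))
```

Raising `order` adds more monomials, which is what makes the fitted surface progressively more complex; normal equations become numerically fragile at high orders, so this sketch is only suitable for low orders and small coordinates.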

11
Linear/Logistic Trend Surfaces
  • Linear Trend Surfaces
      • Useful for continuous data
      • Uses x, y, and z values to model data trends
      • Z values are continuous
      • Creates smooth surfaces
      • Surface complexity increases as order of
        polynomial increases
  • Logistic Trend Surfaces
      • Useful for binary types of data (e.g. yes/no)
      • Uses x, y, and z values to model data trends
      • Z values are 0 or 1
      • Creates smooth surfaces
      • Surface complexity increases as order of
        polynomial increases

12
First Order Trend Surfaces
  • Based on Site Locations (Logistic)
  • Based on Pottery Counts (Linear)

13
Third Order Trend Surfaces
  • Based on Site Locations (Logistic)
  • Based on Pottery Counts (Linear)

14
Locational Characteristic Models
  • Based on values of variables at the study group
    locations
  • Univariate Analysis (Kolmogorov-Smirnov)
  • Multivariate Analysis (multiple regression,
    cluster, classification, principal components)

15
Kolmogorov-Smirnov Test
  • Statistic is the maximum difference between
    cumulative proportions of two samples, usually
    study group and control group
  • Use GRID's SAMPLE command to extract values for
    both groups
  • Preference can be seen graphically

Significance at the 5% level is reached if the statistic exceeds the critical value for the two sample sizes
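The statistic described on this slide can be computed directly from two lists of sampled values. The sketch below is a minimal pure-Python version; the sample values are hypothetical, and the critical value uses the standard large-sample approximation 1.36 * sqrt((n1 + n2) / (n1 * n2)) for the 5% level, which is an assumption about the threshold intended here and is only approximate for samples this small.

```python
from math import sqrt


def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between two empirical cumulative proportions."""
    values = sorted(set(sample_a) | set(sample_b))
    d = 0.0
    for v in values:
        cdf_a = sum(1 for s in sample_a if s <= v) / len(sample_a)
        cdf_b = sum(1 for s in sample_b if s <= v) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_5pct(n_a, n_b):
    """Large-sample two-sample critical value at the 5% level (assumed form)."""
    return 1.36 * sqrt((n_a + n_b) / (n_a * n_b))


# Hypothetical distances to water (m) at study group and control group locations.
study = [10, 20, 30, 40, 50]
control = [30, 60, 70, 80, 90, 100]

d = ks_statistic(study, control)
critical = ks_critical_5pct(len(study), len(control))
print(d, critical, d > critical)
```

Plotting the two cumulative curves against the variable shows the same preference graphically: the largest vertical gap between them is the statistic.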
16
Multiple Regression Models
  • Regression models measure the relationship between
    dependent and independent variables
  • The dependent variable in linear regression is
    generally a real number
  • The dependent variable in logistic regression is
    either a 1 or a 0

17
Data Used in Regressions
  • Linear
      Dep    var1   var2
      43     840    149
      22     852    155
      69.4   854    151
      15     805    134
      46     853    062
  • Logistic
      Dep    var1   var2
      1      840    149
      0      852    155
      1      854    151
      0      805    134
      1      853    062

18
Creating Multiple Regression Models in GRID
  • Subject SAMPLE results to regression, using either
      • Statistics software
      • GRID's REGRESSION command
  • Results of the regression include coefficients
    and a constant, or y-intercept
  • Model made by multiplying variables by
    coefficients and adding the constant
  • surface = 1.250 + (-0.029 x img1) + (0.263 x img2)
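Applying the model equation above to grids is a cell-by-cell calculation. The sketch below shows that arithmetic in plain Python using the constant and coefficients from the slide; the function name `apply_linear_model` and the 2 x 2 values for img1 and img2 are hypothetical.

```python
def apply_linear_model(constant, coefficients, grids):
    """Return constant + sum(coefficient * grid) evaluated for every cell."""
    nrows, ncols = len(grids[0]), len(grids[0][0])
    return [
        [
            constant + sum(b * g[r][c] for b, g in zip(coefficients, grids))
            for c in range(ncols)
        ]
        for r in range(nrows)
    ]


# Hypothetical input grids; only the constant and coefficients come from the slide.
img1 = [[10.0, 20.0], [30.0, 40.0]]
img2 = [[1.0, 2.0], [3.0, 4.0]]

surface = apply_linear_model(1.250, [-0.029, 0.263], [img1, img2])
for row in surface:
    print([round(v, 3) for v in row])
```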

19
Results of GRID's Regression
  • Grid > regression hsam.txt logistic brief <
      coef #   coef
      ------   ----------------
      0        -3.797
      1        -0.001
      2         0.014
      3         0.006
      4         0.000
      5         0.055
      ------   ----------------
      RMS Error    0.393
      Chi-Square   51.608
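Logistic coefficients like those listed above turn into a probability through the logistic function, p = 1 / (1 + exp(-z)), where z is the constant plus the sum of coefficient times variable. The sketch below uses the six coefficients from the slide; the slides do not say which variable goes with which coefficient, so the variable values are hypothetical placeholders.

```python
from math import exp

# Coefficient 0 is the constant; coefficients 1-5 multiply the five variables.
coefficients = [-3.797, -0.001, 0.014, 0.006, 0.000, 0.055]


def linear_predictor(coefs, values):
    """z = constant + sum(coefficient_i * value_i)."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], values))


def logistic(z):
    """Map the linear predictor onto a probability between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


# Hypothetical cell values for the five independent variables.
cell_values = [100.0, 50.0, 20.0, 7.0, 10.0]

z = linear_predictor(coefficients, cell_values)
p = logistic(z)
print(z, p)
```

Evaluating this for every cell of the input grids yields a probability surface rather than the unbounded surface a linear model produces.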

20
Results of Stata's Regression

Logit Estimates                              Number of obs =     383
                                             chi2(13)      =   53.88
                                             Prob > chi2   =  0.0000
Log Likelihood = -220.37417                  Pseudo R2     =  0.1089

--------------------------------------------------------------------------
    site |     Coef.   Std. Err.       t    P>|t|   [95% Conf. Interval]
---------+----------------------------------------------------------------
   aspew |  .0004698    .002341    0.201   0.841   -.0041336    .0050731
   aspns |  .0021229   .0023099    0.919   0.359   -.0024193     .006665
    elev | -.0038272   .0042056   -0.910   0.363   -.0120971    .0044428
   relfa | -.1647048   .0695988   -2.366   0.018   -.3015648   -.0278448
   relfm |  .2218111   .0720802    3.077   0.002    .0800717    .3635505
 texture | -.5435591   .2748572   -1.978   0.049   -1.084042   -.0030762
   ridge |  .0014501   .0032384    0.448   0.655   -.0049179    .0078182
     sd1 |  .0001864   .0004607    0.405   0.686   -.0007195    .0010924
     sd2 | -.0001118   .0012555   -0.089   0.929   -.0025806    .0023571
     sd3 | -.0052209   .0021802   -2.395   0.017    -.009508   -.0009337
 shelter | -.0012764   .0015435   -0.827   0.409   -.0043115    .0017587
   slope |  .0752924   .0386194    1.950   0.052   -.0006493    .1512342
  wadist |  .0007286   .0007215    1.010   0.313   -.0006902    .0021474
   _cons |  3.439327   4.429328    0.776   0.438   -5.270564    12.14922
--------------------------------------------------------------------------

The corrected Y-intercept constant = 4.0704389
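The summary figures in this output are tied together by simple arithmetic, which makes a quick consistency check possible. The sketch below assumes the reported chi2 is the likelihood-ratio statistic, 2 * (LL_model - LL_null), and that the pseudo R2 is 1 - LL_model / LL_null; it also recomputes one test statistic as coefficient divided by standard error. All input numbers are copied from the output above.

```python
log_likelihood = -220.37417  # "Log Likelihood" from the output
chi2 = 53.88                 # "chi2(13)" from the output

# Assumed relation: chi2 = 2 * (LL_model - LL_null), so LL_null = LL_model - chi2 / 2.
ll_null = log_likelihood - chi2 / 2.0

# Assumed definition of the pseudo R2: 1 - LL_model / LL_null.
pseudo_r2 = 1.0 - log_likelihood / ll_null

# Test statistic for the relfm row: coefficient divided by its standard error.
relfm_coef, relfm_se = 0.2218111, 0.0720802
relfm_ratio = relfm_coef / relfm_se

print(round(pseudo_r2, 4), round(relfm_ratio, 3))
```

Both recomputed values agree with the printed Pseudo R2 and the t column for relfm to the precision shown, which supports the column alignment of the table.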
21
Regression Model

22
Probability Models
  • Group 1
  • Group 2

23
Model Strength