Title: Lee Herrington
1 The Importance of Error in RASTER GIS
- Lee Herrington
- Rei Liu
- Susanna McMaster
2 Comments on Raster GIS (1995)
- With the introduction of GRID in ESRI software, RASTER processing has become more acceptable to vector folks.
- This was good because RASTER processing is a very powerful adjunct.
- Both of these studies were raster-based ESF PhD thesis projects.
3 Contents
- Rei Liu studied the effect of error in GIS data on decisions made regarding forest stands to harvest
  - Based on profit estimates derived from a raster skidding model
- Susanna studied the effects, on decisions made by a forest management program, of error in
  - Cell resolution (not really error, but a bad choice)
  - Forest stand attribute data
4 Rei Liu - Overview
- The problem
- The skidding model
- The analysis procedure
- The results
5 Rei Liu - The problem
- Error propagation in GIS analyses is very difficult
- Systems in forest management use multiple data layers
- We wondered what effect errors in these data layers would have on decisions made using GIS decision support models (and in decision MAKING systems)
- Decided to focus on harvest planning models
6 Rei Liu - Decision Support
- There are two kinds of systems used to support people who make decisions
  - Decision Support Systems (DSS, or in the case of geo data, SDSS), which provide information that a human can use when making decisions
  - Decision Making Systems (DMS), which MAKE decisions
- You can play "what if" with DSSs but not with DMSs
7 Rei Liu - The problem (cont'd)
- Review of GI used in forestry revealed
  - Locational errors for features (roads, compartments, waterways, etc.) were not considered important in practice
  - Accuracy in DEMs was not considered important either
  - However, inclusions in soil pedons might be important
- What impact would this attitude have on GIS modeling?
8 Rei Liu - The Problem (cont'd)
- A major cost in moving harvested wood from forest compartments to the roadside is the skidding cost
- Skidding is the process of removing felled timber to the nearest landing
- The landing is a place on the roadside where harvested wood is stored before transportation to the mill
9 Rei Liu - The problem boils down to
- How to use GIS to calculate skidding costs
- How to determine the effect of error in the data needed to find skidding costs
- How to make a model that could be used in a test of the effects of error
10 Rei Liu - Skidding cost model
- Distance is the least-cost path from the timber on a pixel to a road.
- Cost() depends on
  - Distance from timber to landing or nearest road
  - Slopes along that path
  - Soil tractability along the path
    - influenced by soil inclusions in soil maps
  - Can't cross water
- Based on a model made in '86 using MAP (the second GIS thesis at ESF)
11 Friction values (1986)
12 Friction values (1986)
Now we would use -1 for water, really steep slopes, or other places that could not be crossed.
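The idea of a least-cost path over a friction surface, with -1 marking cells that cannot be crossed, can be sketched as follows. This is a minimal illustration and not the MAP or IDRISI implementation: the 4-neighbour moves, the rule that a step costs the mean friction of the two cells, and the names `cost_distance` and `BARRIER` are all assumptions made for the example.

```python
import heapq

BARRIER = -1  # assumed marker for cells that cannot be crossed (water, very steep slopes)


def cost_distance(friction, sources):
    """Accumulated least cost from every cell to the nearest source cell.

    friction: list of rows; each value is the cost of crossing one cell,
              or BARRIER for an impassable cell.
    sources:  list of (row, col) road or landing cells, which get cost 0.
    Moves are 4-connected; a step costs the mean friction of the two cells.
    Unreachable cells keep a cost of infinity.
    """
    rows, cols = len(friction), len(friction[0])
    cost = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        cost[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > cost[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if friction[nr][nc] == BARRIER:
                continue  # absolute barrier: never entered
            nd = d + (friction[r][c] + friction[nr][nc]) / 2.0
            if nd < cost[nr][nc]:
                cost[nr][nc] = nd
                heapq.heappush(heap, (nd, nr, nc))
    return cost


# Toy 3x3 friction grid: a barrier in the middle and one high-friction cell.
friction = [
    [1, 1, 1],
    [1, BARRIER, 5],
    [1, 1, 1],
]
cost = cost_distance(friction, sources=[(0, 0)])
for row in cost:
    print(row)
```

With an absolute barrier value, water no longer has to be given an arbitrarily large friction to keep paths from crossing it.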
13 Rei Liu - Skidding Costs Model
- Rei Liu based his model on the MAP model but did the work in IDRISI
- Data layers used were the same
  - SLOPE
  - SOIL
  - ROADS
  - WATER
14 '86 model
- In MAP, cost could be forced to run only downhill or uphill
- But there was no absolute barrier, so water had to be made very expensive to cross
15 Rei Liu - Max Potential Stumpage
[Flowchart. Boxes: Comptmnt, Fst Type, Calc Vol, Scalar, Mkt Val, Haul, Skid, Max Potential Stumpage]
MPS = Mkt_val - (Haul + Skid)
16 Rei Liu - Skidding Model
[Flowchart. Inputs: Soils, DEM, Roads, Streams. Operations: Reclass, Surface, Reclass. Layers: Soil Frict, Slope Frict, Water Frict, Friction, Cost. Output: Cost/Cord x Cords/pix (from Vol calc) = Skid]
17 Adding Error
- How to do that?
  - Use RANDOM to add error to each layer
- Then what?
  - Do this several hundred times and determine what the cumulative error is
- How?
This is a Monte Carlo Technique
19 Details
- DEM
  - Added random error to cell values
  - Calculated slope
  - These steps are not as simple as they sound!
- SOIL
  - Added inclusions at randomly located places in soil polygons at different intensities
  - Varied resolution of soil pedons representing soil tractability
- Roads and compartment boundaries
  - Shifted images relative to one another
20 And then
- Created a random layer with
  - mean = 0 feet
  - sd = 21 feet
- But this created a new DEM with a lower spatial autocorrelation
  - Moran's I (king's case) dropped from 0.9907 to 0.9855
  - didn't look like elevation any more - too rough
- Filtered with a low-pass filter
  - increased Moran's I back to near the original value
  - surface looked like elevation - smoother
- BUT we did have a change in the resolution of the data, since the filter smooths over a 3x3 kernel
Now we would use an FFT method to smooth the surface
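The sequence described here (add random error to a DEM, check Moran's I, then low-pass filter) can be sketched as below. This is an illustration under stated assumptions, not the procedure run in IDRISI: the synthetic DEM, the seed, the edge handling of the filter, and the function names `morans_i`, `add_noise` and `low_pass` are all made up for the example. Only the sd of 21, the king's-case neighbourhood and the 3x3 kernel come from the slides.

```python
import math
import random


def morans_i(grid):
    """Moran's I with king's-case (8-neighbour) binary weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(map(sum, grid)) / n
    num = 0.0    # sum over neighbour pairs of (x_i - mean) * (x_j - mean)
    weights = 0  # number of neighbour pairs (sum of the binary weights)
    for r in range(rows):
        for c in range(cols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                        num += (grid[r][c] - mean) * (grid[nr][nc] - mean)
                        weights += 1
    den = sum((v - mean) ** 2 for row in grid for v in row)
    return (n / weights) * (num / den)


def add_noise(grid, sd, rng):
    """Add independent Gaussian error (mean 0, given sd) to every cell."""
    return [[v + rng.gauss(0.0, sd) for v in row] for row in grid]


def low_pass(grid):
    """3x3 mean filter; edge cells average only the neighbours that exist."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            window = [grid[nr][nc]
                      for nr in range(max(0, r - 1), min(rows, r + 2))
                      for nc in range(max(0, c - 1), min(cols, c + 2))]
            new_row.append(sum(window) / len(window))
        out.append(new_row)
    return out


# Synthetic smooth "elevation" surface in feet (an assumption, not the study DEM).
dem = [[1000 + 300 * math.sin(r / 8) + 200 * math.cos(c / 10) for c in range(40)]
       for r in range(40)]
rng = random.Random(42)
noisy = add_noise(dem, sd=21.0, rng=rng)
smoothed = low_pass(noisy)

print("original:", round(morans_i(dem), 4))
print("with error:", round(morans_i(noisy), 4))
print("after low-pass:", round(morans_i(smoothed), 4))
```

Uncorrelated error lowers the autocorrelation of the surface, and the mean filter raises it again at the price of averaging each cell over its neighbourhood.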
21 Processed image does not look like elevation!
[Figure: two DEM images, labeled Moran's I = 0.9907 and Moran's I = 0.9855]
22 After Smoothing
[Figure: two DEM images, each labeled Moran's I = 0.9907]
23 The Skidding Model - Implementation for DEM error
- Create friction surface by adding
  - Slope from DEM, with the weight of the actual slope
  - Soil tractability, with a weight of 0-100
  - Water, with a weight of 100
- Compute modified distance cost (friction)
- Compute cost-to-landing for each cell using volume estimates and modified cost
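As a rough sketch of the steps above, the friction surface is a cell-by-cell sum of the three weights, and the skidding cost for a cell is the cost per cord multiplied by the cords on that cell. The function names, the toy values and the units are assumptions for illustration; only the additive rule and the weight of 100 for water come from the slides.

```python
WATER_FRICTION = 100  # weight given to water cells on the slide


def friction_surface(slope, soil, water):
    """Cell-by-cell sum of slope, soil weight (0-100) and the water weight."""
    return [
        [s + t + (WATER_FRICTION if w else 0) for s, t, w in zip(srow, trow, wrow)]
        for srow, trow, wrow in zip(slope, soil, water)
    ]


def skid_cost(cost_per_cord, cords_per_cell):
    """Skidding cost per cell = cost per cord x cords on the cell."""
    return [
        [c * v for c, v in zip(crow, vrow)]
        for crow, vrow in zip(cost_per_cord, cords_per_cell)
    ]


# Toy 2x2 layers (hypothetical values).
slope = [[5, 10], [0, 20]]
soil = [[30, 0], [60, 10]]
water = [[False, False], [True, False]]

friction = friction_surface(slope, soil, water)
print(friction)
```

The friction surface would then feed a cost-distance calculation to give the cost per cord for each cell.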
24 Rei Liu - The Skidding Model - Implementation
- Assume costs to mills for product
- Compute gross profit for each cell
  - P = value - (cost-to-skid + cost-to-transport)
- Ran other versions of model with
  - random variation in soil inclusions, etc.
  - random shifting of boundaries and roads
  - all sources of error
- Averaged for compartments
  - (Note - this area averaging reduced extremes in cost)
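A small worked example of the per-cell profit and the compartment averaging, assuming the profit is value minus the sum of skidding and transport costs. The numbers, the compartment ids and the function names are hypothetical.

```python
from collections import defaultdict


def cell_profit(value, skid, transport):
    """Gross profit for one cell: value minus skidding and transport costs."""
    return value - (skid + transport)


def compartment_means(profit, compartment_id):
    """Average the per-cell profit over the cells of each compartment."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for prow, crow in zip(profit, compartment_id):
        for p, cid in zip(prow, crow):
            totals[cid] += p
            counts[cid] += 1
    return {cid: totals[cid] / counts[cid] for cid in totals}


# Toy 2x2 raster: same value and transport cost everywhere, varying skid cost.
skid = [[10.0, 70.0], [30.0, 30.0]]
profit = [[cell_profit(100.0, s, 20.0) for s in row] for row in skid]
ids = [[1, 1], [2, 2]]  # compartment id of each cell

means = compartment_means(profit, ids)
print(profit)
print(means)
```

In compartment 1 the cell profits differ widely, but the compartment mean hides that spread, which is the averaging effect noted on the slide.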
25 The Skidding Model - Estimating effects of error
- Used Monte Carlo technique
  - Made 200 realizations of models with a new random error layer each time
- From the 200 realizations ranked the profitability of each compartment
  - Based on maximum potential stumpage
- From ranked compartments made analyses of changes in ranking
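The Monte Carlo procedure above can be sketched in a few lines. The skidding model itself is replaced here by a hypothetical stand-in, `toy_model`, that simply adds Gaussian error to a fixed profit per compartment; the compartment names, profits, error size and seed are assumptions. Only the 200 realizations and the ranking step come from the slides.

```python
import random
from statistics import mean


def rank_compartments(profit_by_compartment):
    """Return {compartment: rank}, where rank 1 is the most profitable."""
    ordered = sorted(profit_by_compartment, key=profit_by_compartment.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def monte_carlo_ranks(model, n_runs, rng):
    """Run the model n_runs times, each with a fresh random error draw,
    and collect the rank of every compartment in every realization."""
    ranks = {}
    for _ in range(n_runs):
        for name, rank in rank_compartments(model(rng)).items():
            ranks.setdefault(name, []).append(rank)
    return ranks


# Hypothetical stand-in for the skidding model.
BASE_PROFIT = {"A": 120.0, "B": 100.0, "C": 95.0, "D": 60.0}


def toy_model(rng, error_sd=10.0):
    return {name: p + rng.gauss(0.0, error_sd) for name, p in BASE_PROFIT.items()}


ranks = monte_carlo_ranks(toy_model, n_runs=200, rng=random.Random(1))
for name in BASE_PROFIT:
    r = ranks[name]
    print(name, "mean rank", round(mean(r), 2), "range", min(r), "to", max(r))
```

Each compartment ends up with a distribution of ranks rather than a single rank, and the changes in ranking are what the analysis examines.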
26 Average Ranking Results (200 runs)
- Introducing error into the DEM had no effect on the average ranking of compartments
  - due to spatial averaging, this result is not surprising
- Compartment boundary shifting did change the ranking of compartments
  - small ones the most
- Ranking did vary with soil limitation classes
- Introduction of soil inclusions into the model had a large effect on the ranking
- Combining all sources of error had a large effect on ranking
27 Rei Liu problems with
- Have to remember that the max. potential stumpage values are AVERAGES for each compartment
- Therefore effects of introduced errors in the DEM will average out in many cases
28 Distribution of Results
- With Monte Carlo, the input of randomly distributed error results in a distribution of output values
  - In this case there will be a distribution of rankings
- With the DEM error, it was estimated that error might cause errors in estimated gross profit of 13%
  - A significant amount of money!
29 Cost of Uncertainty
- An analysis of the expected cost of uncertainty (ECU) showed that
  - The propagation of error through the model increased with increasing DEM error
  - As a % of gross profit, ECU values ranged from
    - a minimum of 8.3% for small errors (sd = 15)
    - a maximum of 13.1% for large errors (sd = 21)
- This is money which could be invested in improving quality control
30 Rei Liu - Conclusion!
- Obtaining and maintaining quality data pays, even in forest management
31 Susanna - Overview
- The study objectives
- The model
- The analysis procedure
- The results
32 SM - Study Objectives
- Assess the impact of error in cell resolution and stand table attribute values
  - on a spatial model designed to identify sites suitable for pulpwood management in N. Minnesota
33 SM - differences with Rei Liu
- In this case we are evaluating error in decisions made using a standard decision making model
- How does pixel or cell size influence model results (error in cell size choice)?
- How do errors in the stand attributes influence model results?
34 SM - The Model
- Analysis carried out in SPANS (quadtree)
- A little different kind of system
  - All layers are raster but can be of different resolution
  - In any given set of data, a gigantic identity overlay is made of all layers, and all attributes are linked to the resultant polygons
- Models are written as FORTRAN-style programs
35 Quadtree
36 System9 Quadtree
- All layers are created as quadtrees
- Layers don't have to be the same size, and the cells are different sizes
- Analysis is carried out by doing an intersect of all layers, so you end up with one quadtree layer where each cell contains information from all layers
- Write FORTRAN-like statements to do analysis
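To show why quadtree cells come in different sizes, here is a minimal sketch of a generic region quadtree built from a square raster: a block that holds a single value becomes one leaf, and a mixed block is split into four quadrants. This is a textbook structure and not the storage format of SPANS or any other product; the function names and the toy grid are assumptions.

```python
def build_quadtree(grid, r0=0, c0=0, size=None):
    """Region quadtree of a square raster whose side is a power of two.

    Returns the cell value if the block is uniform (a leaf), otherwise a
    list of four children in the order [NW, NE, SW, SE].
    """
    if size is None:
        size = len(grid)
    first = grid[r0][c0]
    uniform = all(grid[r][c] == first
                  for r in range(r0, r0 + size)
                  for c in range(c0, c0 + size))
    if uniform:
        return first  # a 1x1 block is always uniform, so recursion ends here
    half = size // 2
    return [
        build_quadtree(grid, r0, c0, half),
        build_quadtree(grid, r0, c0 + half, half),
        build_quadtree(grid, r0 + half, c0, half),
        build_quadtree(grid, r0 + half, c0 + half, half),
    ]


def count_leaves(node):
    """Number of leaf cells in the tree."""
    if not isinstance(node, list):
        return 1
    return sum(count_leaves(child) for child in node)


# Toy 4x4 raster of class codes (hypothetical).
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 3, 3],
    [1, 1, 3, 4],
]
tree = build_quadtree(grid)
print(tree)
print(count_leaves(tree), "leaves instead of", len(grid) * len(grid[0]), "cells")
```

Uniform areas collapse into large cells, while detail is kept only where the values change.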
37 SM - The Model
- The model was the standard MN management decision model
- The management decisions for each forest stand were based on
  - geographic layers (stands, roads, water)
  - attribute data for each stand
- The possible decisions were
  - No action
  - Thin
  - Clearcut
  - Regenerate
38 MN forest management system
39 SM - The Analytic procedure
- Run spatial model using original data
- Manipulate specific layers
  - change resolution of stands (compartment) layer
  - introduce error in stand attributes (age data)
- Re-run the model
- Compare original results with results from the re-run
  - summary statistics
  - which stands to cut
  - what volume to be cut
  - visual map comparison
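The run, perturb, re-run and compare loop can be sketched as below for the stand-age case. The decision rule is a hypothetical stand-in (clearcut at or above an assumed rotation age) and not the MN management model; the stand records, error size, seed and function names are likewise assumptions. The 25 trials and the "not cut" and "falsely cut" measures follow the slides.

```python
import random

ROTATION_AGE = 50.0  # hypothetical threshold, not taken from the MN model


def decide(stand):
    """Toy decision rule: clearcut stands at or above the rotation age."""
    return "clearcut" if stand["age"] >= ROTATION_AGE else "no action"


def perturb_age(stands, sd, rng):
    """Copy the stands, adding Gaussian error to each age (floored at 0)."""
    return [dict(s, age=max(0.0, s["age"] + rng.gauss(0.0, sd))) for s in stands]


def compare(original, perturbed):
    """Area wrongly left uncut and area falsely cut after the re-run."""
    not_cut = falsely_cut = 0.0
    for o, p in zip(original, perturbed):
        before, after = decide(o), decide(p)
        if before == "clearcut" and after != "clearcut":
            not_cut += o["area"]
        elif before != "clearcut" and after == "clearcut":
            falsely_cut += o["area"]
    return not_cut, falsely_cut


# Hypothetical stand table.
stands = [
    {"id": 1, "age": 62.0, "area": 10.0},
    {"id": 2, "age": 48.0, "area": 4.0},
    {"id": 3, "age": 51.0, "area": 6.0},
]

rng = random.Random(7)
trials = [compare(stands, perturb_age(stands, sd=5.0, rng=rng)) for _ in range(25)]
best = min(trials, key=sum)   # least total area misassigned
worst = max(trials, key=sum)  # most total area misassigned
print("best case (not cut, falsely cut):", best)
print("worst case (not cut, falsely cut):", worst)
```

Stands whose age lies close to the decision threshold are the ones that flip, so the size of the effect depends on how the stand ages are distributed around that threshold.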
40 SM - Results: Resolution
[Chart: Effect of resolution on stands included. Series: misclassified, eliminated. X axis: pixel size in meters (20m, 40m, 80m, 160m, 320m). Y axis: 0 to 40.]
41 SM - Results: Attributes
Area either not cut or falsely cut at different levels of introduced random error in age.
"Best" and "worst" refer to the best-case and worst-case model results from 25 trials.
42 SM - Results: Attributes
Cords either not cut or falsely cut at different levels of introduced random error in Site Index.
43 Conclusions
- Rei Liu
  - Error in raster coverages can impact decisions made using a GIS decision support system
  - The cost of quality data is worth the investment
- Susanna
  - Picking the right raster size can impact the results of decision making programs
  - Accurate attribute data is very important in these programs
Quality pays!