Title: Playing with Spaghetti: Vector and Raster Data Models in Depth
1Playing with Spaghetti: Vector and Raster Data Models in Depth
- Talbot J. Brooks
- ASU Dept. of Geography
6Tonight's topics
- Recap of discussion so far
- Big picture overview: raster vs. vector
- The details: vector data models
- The details: raster data models
- Cardinality: an exercise
3Review: you tell me
- What is the difference between vector and raster data?
- Basic vector data types
- Examples of raster data
- Computer file structures
- Flat
- Hierarchical
- Network
- Relational
4RASTER AND VECTOR FORMATS
RASTER Grid-based, Simplify reality VECTOR
Analog map, Cartography
5DATA MODEL OF RASTER AND VECTOR
[Figure: the REAL WORLD shown as a GRID (RASTER) of numbered cells and as VECTOR features]
6RASTER DATA MODEL
- derived from the formulation that the real world has spatial elements and that objects fill those elements
- the real world is represented with uniform cells
- the full set of cells forms a rectangle
- cells may also be triangles, hexagons, or more complex shapes
- each cell reports its own true characteristics
- a single cell does not by itself represent an object
- an object is represented by a group of cells
7Creating a Raster
[Figure, three panels: "Reality - Hydrography" (Lake, River, Pond); "Reality overlaid with a grid"; "Resulting raster" of coded cells]
Legend: 0 No Water Feature, 1 Water Body, 2 River
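A minimal Python sketch of the idea on this slide. The grid below is a made-up example (it is not the grid from the figure); only the legend codes 0, 1 and 2 come from the slide. It shows that a feature such as the lake exists in a raster only as a group of cells that share a code.

```python
from collections import Counter

# Legend from the slide; the grid itself is a hypothetical example.
LEGEND = {0: "No Water Feature", 1: "Water Body", 2: "River"}

grid = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 2, 0, 0, 0, 0],
    [0, 0, 2, 0, 1, 0],
]

def count_cells(raster):
    """Count how many cells carry each code."""
    return Counter(value for row in raster for value in row)

def cells_with_code(raster, code):
    """Return the (row, col) position of every cell holding the given code."""
    return [(r, c) for r, row in enumerate(raster)
            for c, value in enumerate(row) if value == code]

counts = count_cells(grid)
for code, name in LEGEND.items():
    print(f"{code} {name}: {counts[code]} cells")
```

Asking "where is the river?" means collecting every cell coded 2, for example with `cells_with_code(grid, 2)`. No single cell is the river.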
8VECTOR DATA MODEL
- derived from the formulation of spatial concepts that emphasize real-world objects
- the geometry primitives of the vector data model are point, line and polygon
- objects can be built from these primitives
- object location is determined by its represented location point
- the uniqueness of the vector data model lies in its management and storage of geometry primitives:
- spaghetti model
- topology model
9VECTOR CHARACTERISTICS
POINT X LINE POLYGON
10RASTER TO VECTOR
RIVER CHANGED FROM RASTER TO VECTOR FORMAT
RIVER THAT HAS BEEN
VECTORISED ORIGINAL RIVER
11PROS AND CONS OF RASTER MODEL
- pros
- raster data is more affordable
- simple data structure
- very efficient overlay operation
- cons
- topological relationships are difficult to implement
- raster data requires large storage
- not all world phenomena relate directly to a raster representation
- raster data is mainly obtained from satellite images and scanning
12PROS AND CONS OF VECTOR MODEL
- pros
- more efficient data storage
- topological encoding more efficient
- suitable for most usage and compatible with data
- good graphic presentation
- cons
- overlay operation not efficient
- complex data structure
13A look behind the scenes: Vector GIS data models
- Spaghetti model
- Topological vector model
- Cardinality (this is gonna hurt!)
- Break
14The Spaghetti Model
- The spaghetti model is the simplest vector data model
- The model is a direct representation of a graphical image
- NO explicit topological information
15Spaghetti Model
- Description: direct line-for-line translation of the paper map (often viewed as raw digital data)
- Pros: easy to implement, good for fast drawing
- Cons: sequential storage and searches, storage of attribute data
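A rough Python sketch of what spaghetti storage looks like, assuming three made-up features (the names and coordinates are hypothetical). Each feature is only a string of coordinates, the boundary shared by the two parcels is stored twice, and the only way to find out that they touch is to scan the list and compare raw coordinates.

```python
# Hypothetical spaghetti-style storage: each feature is just a list of
# (x, y) coordinates, with no record of how features relate to each other.
features = [
    ("parcel_A", [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]),
    ("parcel_B", [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)]),
    ("road",     [(0, 3), (4, 3)]),
]

def segments(coords):
    """Return the undirected line segments of a coordinate string."""
    return {frozenset(pair) for pair in zip(coords, coords[1:])}

def shares_boundary(name_1, name_2):
    """Sequentially search the feature list and compare raw coordinates."""
    found = {}
    for name, coords in features:  # sequential scan, there is no index
        if name in (name_1, name_2):
            found[name] = segments(coords)
    return bool(found[name_1] & found[name_2])

print(shares_boundary("parcel_A", "parcel_B"))
print(shares_boundary("parcel_A", "road"))
```

The segment from (2, 0) to (2, 2) appears in both parcels. Nothing in the data says it is the same line, which is what "no explicit topological information" means in practice.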
16Spaghetti model
17Topology
- Branch of mathematics dealing with geometric properties of objects that remain invariant under transformations
- Neighborhood relationships remain the same
- Topology is the distinguishing basis for more complicated vector models
18Topological Vector Model
- Topological data models are provided with information that can help us in obtaining solutions to common operations in advanced GIS analytical techniques.
- This is done by explicitly recording adjacency information into the data structure, eliminating the need to determine it for multiple operations.
- Each line segment, the basic logical entity in topological data structures, begins and ends when it either contacts or intersects another line, or when there is a change in direction of the line.
19Topological Vector Model
- Each line has two sets of numbers, a pair of coordinates and an associated node number.
- Each line segment has its identification number that is used as a pointer to indicate which set of nodes represent its beginning and ending.
20Topological Vector Model
- Polygons also have identification codes that relate back to the link numbers. Each link in the polygon is now capable of looking left and right at the polygon numbers to see which two polygons it borders. The left and right polygons are also stored explicitly, so that even this tedious step is eliminated.
- The topological data model more closely approximates how we as map readers identify the spatial relationships contained in an analog map document.
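A small Python sketch of the left/right idea, assuming a made-up map of two square polygons, A and B, that share one link. All IDs, coordinates and field names here are hypothetical. Because each link stores its begin node, end node, left polygon and right polygon, adjacency can be read straight from the link table without comparing any coordinates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    link_id: int
    from_node: int
    to_node: int
    left_poly: str    # polygon on the left when travelling from_node -> to_node
    right_poly: str   # polygon on the right

# Hypothetical node table: node number -> (x, y)
nodes = {1: (2.0, 0.0), 2: (2.0, 2.0)}

# Hypothetical link table; "outside" stands for the area beyond the map.
links = [
    Link(1, 2, 1, "A", "outside"),  # outer boundary of A, node 2 around to node 1
    Link(2, 1, 2, "B", "outside"),  # outer boundary of B, node 1 around to node 2
    Link(3, 1, 2, "A", "B"),        # shared boundary, heading north
]

def neighbors(poly_id):
    """Polygons on the other side of every link that bounds poly_id."""
    result = set()
    for link in links:
        if link.left_poly == poly_id:
            result.add(link.right_poly)
        elif link.right_poly == poly_id:
            result.add(link.left_poly)
    return result

def boundary_links(poly_id):
    """IDs of the links that make up the polygon's boundary."""
    return [l.link_id for l in links if poly_id in (l.left_poly, l.right_poly)]

print(neighbors("A"))
print(boundary_links("B"))
```

Compare this with the spaghetti model, where the same question requires scanning coordinate lists for matching segments.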
21Topological Vector Model
22How do we preserve topology in a computer database?
- What are we storing?
- Points, lines, polygons
- What do we need to preserve?
- Neighborhood relationships between these objects
- Terminology
- point, link, node, polygon
23Terminology
- Point: x, y coordinate identifying a geographic location
- Link (line, arc): an ordered set of points with a node at the beginning and end of it
- Node: the beginning and end of a link (often defined where 3 or more lines connect)
- Polygon: two or more links connected at the nodes; contains a point inside to identify the polygon's attributes
24Nevada
Utah
California
Arizona
25Identify the polygons
26Create the polygon attribute table (PAT)
27Identify the nodes
28Node table
29Identify the links (arcs, lines)
30Simplify this
31Create the topology!
32Nodes First
33Nodes First
34Polygons
35Polygons
36Identify the points
37Link List
38Point Coordinates
39Putting it all together
40Putting it all together
41Putting it all together
42Putting it all together
43Putting it all together
44Cardinality
- Cardinality is the relationship between spatial objects, attributes, or spatial objects and attributes.
- This relationship may be defined as
- 1:1
- 1:many
- many:many
45Cardinality
- We can use cardinality to establish relationships and rules among objects and attributes
- This becomes the basis for modeling how data is arranged within a GIS, especially one that uses vector data.
46Cardinality contd
- Entity-entity relationships are described by cardinality, which may be
- One to one. A FOREST can have only one MANAGER and a MANAGER can have only one FOREST
- Many to one. Many FACILITIES may be contained within one FOREST
- Many to many. The relationship water_supply may have many entries and may be connected to many entries (FACILITIES, FOREST, etc.)
47Cardinality contd
- The same concept applies to space
- A bathroom is located within a house (1:1)
- Many homes are within a town (many:1)
- Many people are within many homes (many:many)
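One way to get a feel for cardinality is to look at actual pairs of related things. The sketch below is only an illustration: the function name and the sample pairs are made up, and it infers cardinality from observed data, whereas a real data model states cardinality as a rule up front.

```python
from collections import defaultdict

def cardinality(pairs):
    """Classify a relationship given as (left, right) pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # Several left items share one right item
    left_many = any(len(v) > 1 for v in lefts_per_right.values())
    # One left item is linked to several right items
    right_many = any(len(v) > 1 for v in rights_per_left.values())
    if left_many and right_many:
        return "many:many"
    if left_many:
        return "many:1"
    if right_many:
        return "1:many"
    return "1:1"

# Hypothetical sample data echoing the slide examples
forest_manager = [("Forest A", "Manager X"), ("Forest B", "Manager Y")]
homes_in_town = [("Home 1", "Town"), ("Home 2", "Town")]
people_in_homes = [("Ann", "Home 1"), ("Ann", "Home 2"), ("Bob", "Home 1")]

for name, pairs in [("forest-manager", forest_manager),
                    ("homes-town", homes_in_town),
                    ("people-homes", people_in_homes)]:
    print(name, cardinality(pairs))
```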
48Diagram Characteristics
- Boxes represent entities
- Ovals represent attributes
- Diamonds represent relationships
- Note how cardinality is depicted
- Key attributes are underlined
- Multi-valued attributes are in double ovals
49Entity-Relationship (ER) Diagrams: A Conceptual Model
50Exercise: work in pairs, 10 minutes
- Create a simple ER diagram for your neighborhood
- Pick a feature that matches each geometry type (point, line, polygon). For example:
- For points, you might pick fire hydrants and lamp posts
- For lines, you might pick streets and water mains
- For polygons, pick parcels or zip codes
51Break time!
52Raster data
53What type of data?
- Continuous data
- Examples: elevation, temperature
- Square grid tessellation, also called raster
54Raster Models (tessellation)
55Raster
Data values are stored in rows and columns
56Two types
- Scanned Map images
- Digital Raster Graphic
- Other maps
- Tessellation Models
- Square Grid Tessellation
- Hexagon Tessellation
57Scanned Maps
- Scanned map as a photograph
- The value of each cell represents the color on the map; it needs to be interpreted the way a paper/analog map is interpreted
58Digital Raster Graphic (DRG)
There is typically another file linked with the
DRG, so that the geographic position of the
graphic is known
59MapQuest
60Maps or Images??
61Summary of scanned maps
- Have the characteristics of an analog map in that the location information and the attributes are stored as a visual product
- No queries can be made based on the database
62Tessellation Models
- Location-based spatial data model: the process of dividing an area into smaller, contiguous tiles with no gaps between them
- Types: regular and irregular
- Uses: continuous surfaces
- Pros: easy to implement and manipulate
- Cons: high data storage, output not cartographic quality
63Spatial and Attribute Data
- Combined in a single file
- Unlike the scanned maps, they can be searched
64Tessellation Models
Most common
Rarely used
65Tessellation models
Regular grid
66Data
- Rows and columns containing the attribute value associated with each data layer
- The row/column location of the data value represents the spatial position
- Exact geographic position is typically established with header information before the rows and columns of data
- Also need knowledge of what the values represent (e.g., elevation in meters), typically part of the metadata
67Rows and Columns
68Geographic Position
origin
orientation
size of each cell
69Sample data
70Each cell has a value
71Data File
Origin (x,y) Ymax (x,y) Row,col Cell size
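A short Python sketch of how header information turns a row/column position into a geographic position. It assumes a north-up grid whose origin is the upper-left corner; the header field names and the sample numbers are hypothetical, and real formats differ in where they place the origin.

```python
from dataclasses import dataclass

@dataclass
class RasterHeader:
    x_origin: float   # x of the upper-left corner (assumed convention)
    y_origin: float   # y of the upper-left corner (assumed convention)
    cell_size: float  # width and height of each square cell
    n_rows: int
    n_cols: int

def cell_center(header, row, col):
    """Geographic (x, y) of the center of the cell at (row, col)."""
    if not (0 <= row < header.n_rows and 0 <= col < header.n_cols):
        raise IndexError("cell is outside the raster")
    x = header.x_origin + (col + 0.5) * header.cell_size
    y = header.y_origin - (row + 0.5) * header.cell_size  # rows run downward
    return x, y

def cell_at(header, x, y):
    """Row and column of the cell containing geographic position (x, y)."""
    col = int((x - header.x_origin) // header.cell_size)
    row = int((header.y_origin - y) // header.cell_size)
    if not (0 <= row < header.n_rows and 0 <= col < header.n_cols):
        raise IndexError("position is outside the raster")
    return row, col

# Hypothetical header: 4 rows x 5 columns of 30 m cells
header = RasterHeader(500000.0, 4100000.0, 30.0, 4, 5)
print(cell_center(header, 2, 3))
print(cell_at(header, 500105.0, 4099925.0))
```

Only the header carries coordinates. The data values themselves are located purely by their row and column.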
72Tessellation models
Hexagonal mesh
Primary advantage over square grid tessellation
is distance measurements. Important in
applications that need to spread distances evenly
- e.g., spread of forest fires
73Distance between adjacent cells?
Example modeling the spread of a fire from one
cell to the next adjacent cell.
74Distance measurements between cells are the same in the hexagon model
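The difference can be checked with a little arithmetic. The Python sketch below computes center-to-center distances from one cell to each of its neighbors, assuming a unit cell size for the square grid and a unit center spacing for the hexagonal mesh (the hexagon centers are laid out with axial coordinates, which is one common convention and an assumption here).

```python
import math

def square_neighbor_distances(cell_size=1.0):
    """Distances from a square cell to its 8 surrounding cells."""
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    return sorted(round(math.hypot(dr, dc) * cell_size, 6) for dr, dc in offsets)

def hex_neighbor_distances(spacing=1.0):
    """Distances from a hexagonal cell to its 6 surrounding cells."""
    axial = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
    distances = []
    for q, r in axial:
        x = spacing * (q + r / 2)
        y = spacing * (math.sqrt(3) / 2) * r
        distances.append(round(math.hypot(x, y), 6))
    return sorted(distances)

print(square_neighbor_distances())
print(hex_neighbor_distances())
```

In the square grid the four diagonal neighbors are farther away (by a factor of the square root of 2) than the four edge neighbors, so a fire spreading "one cell" covers different distances depending on direction. In the hexagonal mesh all six neighbors are the same distance away.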
75Where do we get raster data? Four sources
- Data that are collected in a raster format (e.g., satellite data)
- Data in vector format converted to raster format
- Data in a paper map converted to raster format
- DRG
- Converted into a tessellation database
- Interpolating data from points
76One: satellite data
- Example: Landsat Thematic Mapper (TM) data from USGS
79Multispectral
- Multispectral means that each cell has more than one value associated with it, one for each of several sections of the electromagnetic spectrum (these are called bands)
80Bands and Resolution
- Fixed spatial resolution (either 30 meters or 120 meters) depending on the band
Landsats 4-5
Band    Wavelength (micrometers)    Resolution (meters)
1       0.45-0.52                   30
2       0.52-0.60                   30
3       0.63-0.69                   30
4       0.76-0.90                   30
5       1.55-1.75                   30
6       10.40-12.50                 120
7       2.08-2.35                   30
81What can we do with the bands?
- Band 1 penetrates water for bathymetric mapping along coastal areas and is useful for soil-vegetation differentiation and for distinguishing forest types.
- Band 2 detects green reflectance from healthy vegetation.
- Band 3 is designed for detecting chlorophyll absorption in vegetation.
- Band 4 data is ideal for detecting near-IR reflectance peaks in healthy green vegetation and for detecting water-land interfaces.
- The two mid-IR bands (bands 5 and 7) are useful for vegetation and soil moisture studies and for discriminating between rock and mineral types.
- The thermal-IR band (band 6) is designed to assist in thermal mapping, and is used for soil moisture and vegetation studies.
82False color
- Bands 4, 3, and 2 can be combined to make false-color composite images where band 4 is displayed as red, band 3 as green, and band 2 as blue. This combination makes vegetation appear as shades of red, brighter reds indicating more vigorously growing vegetation. Soils with no or sparse vegetation range from white (sands) to greens or browns depending on moisture and organic matter content. Water bodies will appear blue. Deep, clear water appears dark blue to black in color, while sediment-laden or shallow waters appear lighter in color. Urban areas appear blue-gray in color. Clouds and snow appear bright white. Clouds and snow are usually distinguishable from each other by the shadows associated with clouds.
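A tiny Python sketch of the band-to-color assignment described above. The digital numbers are invented for illustration (they are not real Landsat values); the point is only that each display pixel takes its red value from band 4, its green value from band 3, and its blue value from band 2.

```python
# Hypothetical digital numbers (0-255) for a 1 x 3 strip of cells:
# vegetation, bare sand, deep clear water. Values are made up.
band4 = [[200, 180, 10]]   # near-IR
band3 = [[40, 170, 15]]    # red
band2 = [[50, 160, 30]]    # green

def false_color(nir, red, green):
    """Build (R, G, B) display triples: band 4 -> R, band 3 -> G, band 2 -> B."""
    return [list(zip(nir_row, red_row, green_row))
            for nir_row, red_row, green_row in zip(nir, red, green)]

composite = false_color(band4, band3, band2)
for label, pixel in zip(["vegetation", "sand", "water"], composite[0]):
    print(label, pixel)
```

With these made-up numbers the vegetated cell comes out strongly red, the sand cell close to white, and the water cell dark with a blue tint, matching the description on the slide.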
83False Color Example
84False Color example
85False Color example
86With the same data (NDVI)
Normalized difference vegetation index
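NDVI is computed per cell as (NIR - Red) / (NIR + Red). For Landsat TM that means band 4 (near-IR) and band 3 (red). A minimal Python sketch follows, using invented cell values; returning None for a cell with no signal in either band is an assumption of this sketch, not a standard.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one cell."""
    total = nir + red
    if total == 0:
        return None  # assumption: no signal in either band means no data
    return (nir - red) / total

# Hypothetical band 4 (near-IR) and band 3 (red) values for three cells:
# vegetation, bare sand, water.
band4 = [200, 180, 10]
band3 = [40, 170, 15]

values = [ndvi(n, r) for n, r in zip(band4, band3)]
print([round(v, 3) for v in values])
```

Values range from -1 to 1. Healthy vegetation reflects strongly in the near-IR and absorbs red, so it pushes the index toward 1.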
87Where do we get raster data? Four sources
- Data that are collected in a raster format (e.g., satellite data)
- Data in vector format converted to raster format
- Data in a paper map converted to raster format
- DRG
- Converted into a tessellation database
- Interpolating data from points
88Second source for raster data
- Data that are in another format (either vector or
paper map) and need to be converted to a raster
format
89Land use in vector format
To convert it, we need to decide what size each
cell needs to be. How do we decide? Minimum
mapping unit and spatial resolution.
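To see why cell size matters, here is a rough Python sketch that converts one made-up polygon to a raster at several cell sizes and compares the resulting area. The polygon, the map extent, and the rule "a cell belongs to the polygon if its center falls inside" are all assumptions for illustration; GIS packages offer other assignment rules.

```python
import math

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(polygon, x_min, y_max, width, height, cell_size):
    """Code each cell 1 if its center lies inside the polygon, else 0."""
    n_cols = math.ceil(width / cell_size)
    n_rows = math.ceil(height / cell_size)
    grid = []
    for row in range(n_rows):
        y = y_max - (row + 0.5) * cell_size
        grid.append([1 if point_in_polygon(x_min + (col + 0.5) * cell_size, y, polygon)
                     else 0 for col in range(n_cols)])
    return grid

def raster_area(grid, cell_size):
    """Area covered by cells coded 1."""
    return sum(map(sum, grid)) * cell_size ** 2

# Hypothetical land-use polygon (true area 60 m x 30 m = 1800 sq m)
# inside a 100 m x 100 m map extent.
parcel = [(10, 10), (70, 10), (70, 40), (10, 40)]
for size in (50, 25, 10):
    grid = rasterize(parcel, 0, 100, 100, 100, size)
    print(size, "m cells:", raster_area(grid, size), "sq m")
```

Coarse cells can badly overstate or understate the area of a feature, and a feature smaller than a cell can vanish entirely, which is why the smallest feature you need to keep (the minimum mapping unit) should drive the choice of cell size.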
90Sort the database
91Minimum mapping unit
92Better
This would give us a 2 m cell size
93Default settings
94Resulting data
95Resulting data
755 (default)
96200 meters
97100 meters
9810 meters
102Which is best?
vector
100 meter
10 meter
103Area and database size comparisons
104Three: conversion from a paper map
- Scanning can convert to a DRG or into a square grid or hexagon database
- Same rules apply as with vector scanning: the best approach is to trace to mylar, then scan
- (My personal experience: it is easier to vector digitize, then use software to convert to raster format)
Note: with scanning you can create either a DRG or a tessellation database
105Database size can be a problem: compaction
Run-length encoding
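A minimal Python sketch of run-length encoding for one raster row. Instead of storing every cell, it stores each value once together with the number of consecutive cells that hold it. The function names and the sample row are made up for illustration.

```python
def rle_encode(row):
    """Collapse a row of cell values into (value, run_length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, count in runs for _ in range(count)]

# Hypothetical row of water codes: 0 = no water feature, 1 = water body
row = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

Twelve cell values shrink to two pairs here. The saving depends on long runs of identical values, so a row where neighboring cells rarely match gains little or nothing.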
106In some cases, there is very little you can do
107Four sources
- Data that are collected in a raster format (e.g., satellite data)
- Data in vector format converted to raster format
- Data in a paper map converted to raster format
- Interpolating data from points