Title: GIS Data Models
1. GIS Data Models
2. Objective
- To understand the evolution of GIS and how its history directly affects its form and function today
3. Topics
- Brief history of GIS evolution
- Explanation of databases
- Overview of GIS data models
4. Early GIS
- A GIS can be conceptualized as a set of overlays placed upon a base map.
- GIS-type systems predate the invention of the computer.
- For example, Dr. John Snow mapped cholera deaths using overlays in 1854, which allowed him to find that the victims were all drinking from a common well.
- Louis-Alexandre Berthier drew maps of the Battle of Yorktown using overlays.
5. Enter the computer
- With the advent of the computer, a new tool was added to the arsenal available to designers, cartographers, and engineers.
- The invention of Computer Aided Design (CAD) in the late 1950s and early 1960s allowed vector maps to be displayed as lines on a computer screen.
- Data was stored in binary file formats with dot representations for points, lines, and arcs.
- This data model could make little or no use of attribute data.
6. Enter the computer (contd)
- The computer changed basic cartographic technique because it allowed more complex analysis of geographic information at relatively fast rates.
- The Dept. of Geography at the University of Washington pioneered the way to modern GIS.
7. University of Washington GIS Gurus
- Nystuen: fundamental spatial concepts (distance, orientation, connectivity)
- Tobler: computer algorithms for map projections, computer cartography
- Bunge: theoretical geography; geometric basis for geography (points, lines, and areas)
- Berry: Geographical Matrix of places by characteristics (attributes); regional studies by overlaying maps of different themes; systematic studies by detailed evaluation of a single layer
8. Canada Geographic Information System
- First modern GIS, developed in the 1960s; Roger Tomlinson was the key developer
- use of scanning for input of high density area objects
- maps had to be redrafted (scribed) for scanning
- note: scribing is as labor intensive as digitizing
- vectorization of scanned images
- geographical partitioning of data into "map sheets" or "tiles", but with edgematching across tile boundaries
- partitioning of data into themes or layers
9. CGIS (contd)
- use of an absolute system of coordinates for the entire country, with precision adjustable to the resolution of the data
- number of digits of precision can be set by the system manager and changed from layer to layer
- internal representation of line objects as chains of incremental moves in 8 compass directions rather than straight lines between points (Freeman chain code)
- coding of area object boundaries by arc, with pointers to left and right area objects
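The chain-code idea can be sketched as follows. This is a minimal illustration, not CGIS's actual implementation: the direction numbering (0 = east, counting counter-clockwise) and the function names `encode_chain` and `decode_chain` are assumptions made for the example.

```python
# Freeman 8-direction chain code: a line is stored as a start cell plus a
# sequence of unit moves, instead of a list of absolute coordinates.
# The numbering below (0 = east, counter-clockwise) is an assumed convention.
MOVES = {
    0: (1, 0),    # E
    1: (1, 1),    # NE
    2: (0, 1),    # N
    3: (-1, 1),   # NW
    4: (-1, 0),   # W
    5: (-1, -1),  # SW
    6: (0, -1),   # S
    7: (1, -1),   # SE
}
CODES = {step: code for code, step in MOVES.items()}


def encode_chain(cells):
    """Turn a list of adjacent grid cells into (start, [direction codes])."""
    chain = []
    for (x0, y0), (x1, y1) in zip(cells, cells[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in CODES:
            raise ValueError(f"cells {(x0, y0)} and {(x1, y1)} are not neighbours")
        chain.append(CODES[step])
    return cells[0], chain


def decode_chain(start, chain):
    """Rebuild the list of grid cells from a start cell and direction codes."""
    cells = [start]
    for code in chain:
        dx, dy = MOVES[code]
        x, y = cells[-1]
        cells.append((x + dx, y + dy))
    return cells


line = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 3)]
start, chain = encode_chain(line)
print(start, chain)  # (0, 0) [0, 1, 2, 3]
```

Each move fits in three bits, which is one reason an incremental encoding of this kind can be more compact than storing full coordinates for every vertex.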
10. CGIS (contd)
- first "topological" system with planar enforcement in each layer; relationships between arcs and areas coded in the database
- separation of data into attribute and locational files
- "descriptor dataset" (DDS) and "image dataset" (IDS)
- concept of an attribute table
- implementation of functions for polygon overlay, measurement of area, and user-defined circles and polygons for query
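A rough sketch of how arc records with left and right area pointers, kept apart from an attribute table, can answer a topological question. The record layout, the IDs, and the land-use values are all invented for illustration and do not reflect the real DDS or IDS file formats.

```python
from collections import defaultdict

# Locational data: each arc records the area object on its left and right.
# IDs are hypothetical; 0 stands for "outside the mapped region".
arcs = [
    {"arc_id": 1, "left": 0, "right": 101},
    {"arc_id": 2, "left": 101, "right": 102},
    {"arc_id": 3, "left": 102, "right": 0},
    {"arc_id": 4, "left": 102, "right": 103},
]

# Attribute data kept separately, keyed by the same area IDs.
attributes = {
    101: {"land_use": "forest"},
    102: {"land_use": "cropland"},
    103: {"land_use": "wetland"},
}


def adjacency(arcs):
    """Derive which areas share a boundary, using only the arc pointers."""
    neighbours = defaultdict(set)
    for arc in arcs:
        left, right = arc["left"], arc["right"]
        if left and right:  # skip arcs on the outer edge
            neighbours[left].add(right)
            neighbours[right].add(left)
    return neighbours


neighbours = adjacency(arcs)
for area_id in sorted(neighbours[102]):
    print(area_id, attributes[area_id]["land_use"])
```

No coordinates are consulted here: adjacency comes entirely from the relationships coded on the arcs, and the attribute lookup is a separate step joined through the area ID.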
12. Explanation of database types
- a database is a collection of non-redundant data which can be shared by different application systems
- implies separation of physical storage from use of the data by an application program, i.e. program/data independence
- changes can be made to data without affecting other components of the system
13. Database types
- tabular ("flat file") - data in a single table
- hierarchical
- network
- relational
14. The ideal GIS database is one that maximizes the uniqueness of every feature while minimizing total data quantity
15. Hierarchical databases
- Developed in the 1960s by International Business Machines (IBM)
- Somewhat resembles real-world filing systems
- Tree-structured, similar to folder arrangements in a computer directory
- The database keeps track of the different record types, their attributes, and the hierarchical relationships between them
- The attribute which assigns records to levels in the database structure is called the key (e.g. is the record a department, part, or supplier?)
16. Features of a hierarchical model
- a set of record "types"
- e.g. supplier record type, department record type, part record type
- a set of links connecting all record types in one data structure diagram (tree)
- at most one link between two record types, hence links need not be named
- for every record, there is only one parent record at the next level up in the tree
17. Features (contd)
- e.g. every county has exactly one state, every part has exactly one department
- no connections between occurrences of the same record type
- cannot go between records at the same level unless they share the same parent
- diagram
18. Pros and cons
- data must possess a tree structure
- tree structure is natural for geographical data
- data access is easy via the key attribute, but difficult for other attributes
- in the business case, easy to find a record given its type (department, part or supplier)
- in the geographical case, easy to find a record given its geographical level (state, county, city, census tract), but difficult to find it given any other attribute
19. Pros and cons (contd)
- e.g. find the records with population 5,000 or less
- tree structure is inflexible
- cannot define new linkages between records once the tree is established
- e.g. in the geographical case, new relationships between objects
- cannot define linkages laterally or diagonally in the tree, only vertically
20. Pros and cons (contd)
- the only geographical relationships which can be coded easily are "is contained in" or "belongs to"
- DBMSs based on the hierarchical model (e.g. System 2000) have often been used to store spatial data, but have not been very successful as bases for GIS
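The access pattern described for the hierarchical model can be sketched with nested dictionaries. This is only an illustration of the idea, not how a hierarchical DBMS stores data; the place names, populations, and function names are invented.

```python
# Hierarchical model sketch: every record has exactly one parent, and records
# are reached by walking down the tree level by level.
# All names and populations are hypothetical.
tree = {
    "State A": {
        "County 1": {
            "City X": {"population": 4200},
            "City Y": {"population": 18000},
        },
        "County 2": {
            "City Z": {"population": 900},
        },
    },
}


def find_by_key(tree, state, county, city):
    """Easy case: follow the hierarchy directly to one record."""
    return tree[state][county][city]


def find_by_population(tree, limit):
    """Hard case: no path through the tree helps, so every leaf is visited."""
    matches = []
    for state, counties in tree.items():
        for county, cities in counties.items():
            for city, record in cities.items():
                if record["population"] <= limit:
                    matches.append((state, county, city))
    return matches


print(find_by_key(tree, "State A", "County 1", "City Y"))
print(find_by_population(tree, 5000))
```

The first lookup touches one branch per level, while the population query has to scan the whole tree. The only relationship the structure itself expresses is "belongs to".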
21. Network data model
- developed in the mid 1960s as part of the work of CODASYL (Conference on Data Systems Languages), which proposed the programming language COBOL (1966) and then the network model (1971)
- other aspects of database systems also proposed at this time include the database administrator, data security, and the audit trail
- objective of the network model is to separate data structure from physical storage and eliminate unnecessary duplication of data with its associated errors and costs
22. Networked model (contd)
- uses the concepts of a data definition language and a data manipulation language
- uses the concept of m:n linkages or relationships
- an owner record can have many member records
- a member record can have several owners
- hierarchical model allows only 1:n
- example of a network database:
- a hospital database has three record types
- patient: name, date of admission, etc.
23. Networked model (contd)
- doctor: name, etc.
- ward: number of beds, name of staff nurse, etc.
- need to link patients to doctor, also to ward
- doctor record can own many patient records
- patient record can be owned by both doctor and ward records
- network DBMSs include methods for building and redefining linkages, e.g. when a patient is assigned to a ward
24. Problems with the networked model
- links between records of the same type are not allowed
- while a record can be owned by several records of different types, it cannot be owned by more than one record of the same type (a patient can have only one doctor, only one ward)
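The hospital example and its ownership restriction can be sketched as below. This is a toy model under stated assumptions, not a CODASYL implementation: the classes `Patient` and `Owner`, the `link` method, and all names and dates are hypothetical.

```python
# Network model sketch: a member record may have owners of several types,
# but at most one owner of each type. All names are invented for illustration.
class Patient:
    def __init__(self, name, admitted):
        self.name = name
        self.admitted = admitted
        self.owners = {}  # owner type -> owner record, e.g. "doctor", "ward"


class Owner:
    def __init__(self, kind, name):
        self.kind = kind   # "doctor" or "ward"
        self.name = name
        self.members = []  # one owner record can own many patient records

    def link(self, patient):
        """Attach a patient, detaching any previous owner of the same type."""
        previous = patient.owners.get(self.kind)
        if previous is not None:
            previous.members.remove(patient)
        patient.owners[self.kind] = self
        self.members.append(patient)


dr_lee = Owner("doctor", "Dr. Lee")
ward_3 = Owner("ward", "Ward 3")
ward_5 = Owner("ward", "Ward 5")

p = Patient("A. Jones", "2024-01-15")
dr_lee.link(p)
ward_3.link(p)
ward_5.link(p)  # reassignment: the patient moves from Ward 3 to Ward 5

print({kind: owner.name for kind, owner in p.owners.items()})
```

Because `owners` holds one entry per owner type, the patient can be linked to a doctor and a ward at once but never to two wards, which mirrors the limitation noted above.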
25. Relational database model
- the most popular DBMS model for GIS
- Used by ArcInfo
- flexible approach to linkages between records comes closest to modeling the complexity of spatial relationships between objects
- proposed by IBM researcher E.F. Codd in 1970
- more of a concept than a data structure
- internal architecture varies substantially from one RDBMS to another
26. Relational databases (contd)
- each record has a set of attributes
- the range of possible values (domain) is defined for each attribute
- records of each type form a table or relation
- each row is a record or tuple
- each column is an attribute
- note the potential confusion: a "relation" is a table of records, not a linkage between records
- the degree of a relation is the number of attributes in the table
27. Relational databases (contd)
- 1 attribute is a unary relation
- 2 attributes is a binary relation
- n attributes is an n-ary relation
- Examples:
- unary: COURSES(SUBJECT)
- binary: PERSONS(NAME,ADDRESS), OWNER(PERSON NAME,HOUSE ADDRESS)
- ternary: HOUSES(ADDRESS,PRICE,SIZE)
28. How a relational database works
- a key of a relation is a subset of attributes with the following properties:
- unique identification
- the value of the key is unique for each tuple
- nonredundancy
- no attribute in the key can be discarded without destroying the key's uniqueness
- A prime attribute of a relation is an attribute which participates in at least one key
- All other attributes are non-prime
29. Relational database key example
- For example, a phone number is a unique key in a phone directory
- in the normal phone directory the key attributes are last name, first name, street address
- if street address is dropped from this key, the key is no longer unique (many Smith, Mary's)
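The two key properties, unique identification and nonredundancy, can be checked against a small sample table. This is a sketch under stated assumptions: the directory rows and the helper names `is_unique` and `is_key` are invented, and testing sample data only shows that a candidate key is consistent with those rows, not that it holds for every possible tuple.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """Unique, and no longer unique once any single attribute is dropped."""
    if not is_unique(rows, attrs):
        return False
    return all(
        not is_unique(rows, subset)
        for subset in combinations(attrs, len(attrs) - 1)
    )


# Hypothetical phone directory entries.
directory = [
    {"last": "Smith", "first": "Mary", "street": "12 Oak St", "phone": "555-0101"},
    {"last": "Smith", "first": "Mary", "street": "9 Elm Ave", "phone": "555-0102"},
    {"last": "Smith", "first": "John", "street": "12 Oak St", "phone": "555-0103"},
    {"last": "Jones", "first": "Mary", "street": "12 Oak St", "phone": "555-0104"},
]

print(is_key(directory, ("phone",)))                    # True
print(is_key(directory, ("last", "first", "street")))   # True
print(is_unique(directory, ("last", "first")))          # False: two Mary Smiths
print(is_key(directory, ("phone", "last")))             # False: "last" is redundant
```

In this sample both `phone` and the combination of last name, first name, and street address behave as keys, so all four attributes would count as prime.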
30. Pros and cons
- the most flexible of the database models
- no obvious match of implementation to model: the model is the user's view, not the way the data is organized internally
- is the basis of an area of formal mathematical theory
31. Pros and cons (contd)
- most RDBMS data manipulation languages require the user to know the contents of relations, but allow access from one relation to another through common attributes
- Example: given two relations
- PROPERTY(ADDRESS,VALUE,COUNTY_ID)
- COUNTY(COUNTY_ID,NAME,TAX_RATE)
- to answer the query "what are the taxes on property x" the user would:
32. Pros and cons (contd)
- retrieve the property record
- link the property and county records through the common attribute COUNTY_ID
- compute the taxes by multiplying VALUE from the property tuple with TAX_RATE from the linked county tuple
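The three steps of the tax query can be sketched with Python's built-in `sqlite3` module. The two tables follow the PROPERTY and COUNTY relations from the example; the sample row values and the helper name `taxes_on` are assumptions made for illustration.

```python
import sqlite3

# In-memory database holding the two relations from the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE county (
        county_id INTEGER PRIMARY KEY,
        name      TEXT,
        tax_rate  REAL
    );
    CREATE TABLE property (
        address   TEXT PRIMARY KEY,
        value     REAL,
        county_id INTEGER REFERENCES county(county_id)
    );
    -- Hypothetical sample rows.
    INSERT INTO county VALUES (1, 'Example County', 0.0125);
    INSERT INTO property VALUES ('10 Main St', 200000, 1);
""")


def taxes_on(conn, address):
    """Retrieve the property, link it to its county, multiply value by rate."""
    row = conn.execute(
        """
        SELECT p.value * c.tax_rate
        FROM property AS p
        JOIN county AS c ON c.county_id = p.county_id
        WHERE p.address = ?
        """,
        (address,),
    ).fetchone()
    return None if row is None else row[0]


print(taxes_on(conn, "10 Main St"))
```

The join condition on `county_id` is the "common attribute" link. The user still has to know that both relations carry that column, which is the point made about data manipulation languages above.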
34. Evolution of GIS data models
35. CAD model
- Vector based mapping
- Maps created with computer aided design (CAD) programs
- Little or no attribute data
36. Coverage data model
- Created in 1981 by ESRI as part of ArcInfo, the first commercially available GIS package
- Spatial data stored with attribute data using indexed binary files
- Allowed for storage of topological relationships
37. Limitations of coverage model
- All features have a generic behavior
- For example, a highway running across a polygon split that polygon, which made defining behaviors extremely difficult
- Required use of macro code (Arc Macro Language, AML) to resolve complex features
38. The geodatabase
- Created in 2000 by ESRI
- Allows for specific behaviors to be assigned to specific features without writing code
- Based upon a relational database
- Described as an object-oriented data model
39. The ArcGIS Environment
- ArcGIS is packaged similarly to Microsoft Office. Whereas Office encompasses Excel, Word, and PowerPoint, ArcGIS comes with:
- ArcMap
- ArcCatalog
- ArcToolbox
40. Component Overview: ArcCatalog
- ArcCatalog acts as the operating system for GIS; its look and feel is similar to Windows Explorer
- ArcCatalog allows users to preview data in both a geographic (map) format and a table format (attributes).
- ArcCatalog is the principal management tool for reading and writing metadata
41. Explorer vs. ArcCatalog
[Figure: Windows Explorer and ArcCatalog side by side, each showing a Table of Contents and a Preview Pane]
42. A Quick Tour of ArcCatalog
[Figure: callouts for Quick Launch Buttons, Navigation Buttons, and Preview Selection Tabs]
43. ArcCatalog uses a unique symbol set to indicate data formats
[Figure: symbols for Raster, Geodatabase, Feature Dataset, and Feature Classes]
44. Symbology is maintained among different data formats
- Shapefile formats appear green
45. Preview tab allows user to view either geography or table
- Toggle views here
46. Metadata: Data about the data
- Metadata toolbar accessed from VIEW toolbars
- Allows users to edit all metadata and select metadata format convention
47. References
- National Center for Geographic Information and Analysis (NCGIA), Core Curriculum at http://www.geog.ubc.ca/courses/klink/gis.notes/ncgia/toc.html
- Goodchild, M.F., and K.K. Kemp, eds. 1990. NCGIA Core Curriculum in GIS. National Center for Geographic Information and Analysis, University of California, Santa Barbara CA