CP6514 Lecture 4 Databases and GIS - PowerPoint PPT Presentation

1 / 25
About This Presentation
Title:

CP6514 Lecture 4 Databases and GIS

Description:

The power of GIS lies not only in the pretty maps it can make, but in the ... Quantile Classification. Data broken up into intervals such that: ... – PowerPoint PPT presentation

Number of Views:105
Avg rating:3.0/5.0
Slides: 26
Provided by: sfre
Category:

less

Transcript and Presenter's Notes

Title: CP6514 Lecture 4 Databases and GIS


1
UST/ENV/PAD/PDD 642/742 Introduction to
Geographic Information Systems
Lecture 4-1 GIS Database Management and
Thematic Mapping I
2
Attribute Database and GIS
The power of GIS lies not only in the pretty maps
it can make, but in the quality and amount of
attribute data behind the map
The referenced attribute data can be used to
drive thematic displays, perform spatial
analyses, and derive new data
3
Definition of a Database
  • A Database is a logically coherent collection
    of data about some aspect(s) of the real world,
    with some inherent meaning
  • The Database is designed, implemented and
    populated with data for a specific purpose
  • More informally, a database is an organized
    collection of records (rows) representing
    distinct persons, places or things.
  • Each record is comprised of a consistent set
    of fields (columns) which contains information
    describing that distinct person, place or thing

4
A Simple Database
5
Field Types Text
  • The Text Field Type holds characters
  • You can have a single character in a text
    string or a set of characters
  • Depending on the type of information and the
    guiding business rules, values that consist
    entirely as digits should be considered strings
  • In queries, text values should be enclosed in
    single or double quotes
  • e.g. LU_CODE A
  • Name Abigail Kristoffersson
  • Zip_Code 30332

6
Field Types Integers
  • Integer Field Types hold integer values
    restricted to particular integer ranges
  • The Long Integer holds signed integers
    between 2,147,483,648 and 2,147,483,647
  • The Short Integer holds signed integers
    between 32,768 and 32,767
  • Integer arithmetic is very fast, but long
    integers have twice the memory requirements as
    short integers
  • e.g. Age 32 short Int
  • Yearned_Income 10000000 long Int

7
Field Types Floats and Doubles
  • Floats and Doubles hold decimal numbers,
    depending on size and accuracy
  • The Float Field Type holds decimal numbers
    with up to 7 significant digits
  • The Double Field Type holds decimal numbers
    with up to 17 significant digits with accuracy
  • Calculations are slower and approximate
  • e.g. Height 1.720000 float
  • X-Coord 2,134,587.916673144 double

8
Field Types Date
  • The Date Field Type allows you to store both
    date and time information for a range between
    Jan. 1, 0100 and Dec. 1, 9999
  • e.g. End_of_World Dec. 31, 1999 1159 PM
  • End_of_World January 1, 2000
  • End_of_World 1/1/2000 0001
  • If no time is specified in the date field
    value, the system will assume 1200 midnight

9
Field Types BLOB
  • BLOB stands for Binary Large Object and is the
    binary data type of a column in an RDBMS table
    that stores large image, text, or geometry data
  • The BLOB field type essentially can store
    complex spatial information
  • The BLOB field is often the RDBMS equivalent
    of the Shape field in the dBase component of a
    shapefile

10
Common Database Models
  • Flat File Data Management Systems
  • Hierarchical Data Management Systems
  • Network Data Management Systems
  • Relational Data Management Systems

11
Flat File Database and GIS
Single table of rows and columns Unique
identifier for each record High degree of
redundancy Several null values Not very
flexible
12
Relational Database Model
  • Consists of several flat file tables or
    Entities
  • Database defined both in terms of entities and
    their relationships
  • Multiple tables can be joined to form a single
    virtual table
  • Both entities and relationships are defined by
    unique identifiers or primary keys

13
JOINS
  • In GISs, new virtual tables are usually
    formed when two or more tables are JOINED
  • In JOINING operations, the source table fields
    are appended to the destination table, if the
    values in the specified join field coincide --
    you therefore have to specify a from source
    table, a source field item, a to destination
    table, and a destination field item before
    implementing a join
  • JOINs are used when the relationship between a
    source and destination has a cardinality ratio
    of 1 1 or 1 N
  • You can then make thematic displays or perform
    analyses using the fields of the source table
  • You have the option of keeping all the source
    records, or keeping only those source records
    that had valid joins

14
One-to-One Relationship (JOIN)
TABLE 2 DESTINATION
TABLE 1 SOURCE
JOIN Relationship TABLE_1 . TR_NAME TABLE_2
. TRACT_NO
New Virtual Table
15
Many-to-One Relationship (JOIN)
TABLE 1 SOURCE
TABLE 2 DESTINATION
New Virtual Table
JOIN Relationship TABLE_1. Soil_Code
TABLE_2. Code
16
Advantages of Relational Model
  • Easier Data Sharing
  • Multi-user Transaction Processing
  • Multiple Views of Database
  • Flexibility, Scale Economies
  • Access Control
  • Redundancy Control
  • Fewer Data Entry Errors
  • Greater Data Consistency
  • Ease in Backup and Recovery

17
Graphic and Attribute Data Linkages (JOIN)
Table 1
Table 2
(SOURCE)
(DESTINATION)
103
I-1
102
104
R-1
C-1
105
101
Interim Output from Table 1 and Table 2
Zoning Layer
Parcel Layer
Table 3.
(Source)
Final Output from Interim Output and Table 3
18
RELATE
  • In RELATING operations, the procedure involves
    declaring the two tables that are going to be
    related, and the two fields that contain the
    relate values -- the relate operation then
    forms a relationship between the two tables,
    while visually keeping them separate
  • RELATES can be used when the relationship
    between a source and destination has a
    cardinality ratio of N 1
  • RELATES are primarily used to select related
    records in one table by selecting records in the
    other table
  • Selections may be done either by location
    spatial query or by attribute query
  • Each RELATE is given a user-specified name

19
RELATE (source destinationN1)
TABLE 1 SOURCE
TABLE 2 DESTINATION
20
Thematic Mapping Numeric Data Classification
  • Used for Exploration purposes
  • Used to make maps more readable, easier to
    understand
  • Used to depict something about the data or area
    that is not self-evident

Numeric Data Classification Methods Natural
Breaks Standard Deviation Quantiles Equal
Interval User Defined or Manual
21
Natural Break Classification
  • Data broken up into interval ranges such that
  • Variance within each range is minimized
  • Used for patterns/groupings inherent in data
  • Uses Jenks optimization method

22
Quantile Classification
  • Data broken up into interval ranges such that
  • each range has an equal number of observations
  • useful for linear patterns in data
  • not very useful if disproportionate numbers of
    observations with very similar values occur

23
Equal Interval Classification
  • Data broken up into intervals such that
  • The range of each interval is equal

Manual or User-specified Classification
  • Continuous - user sets UPPER LIMIT for each range
  • Discontinuous - user sets both UPPER and LOWER
    LIMITS for each range similar to continuous,
    but with some ranges removed
  • Discontinuous useful if no observations occur for
    some intermediate range -- these ranges may be
    skipped!

24
Standard Deviation Classification
  • First, the mean is found for the entire data set
  • Interval breaks found above and below the mean
    such that
  • Breaks occur at 1/4 or 1/3 or 1/2 or 1 standard
    deviation(s) above and below the mean, till all
    the data is classified
  • Useful in cases where the underlying population
    distribution is known
  • Typically, a two-color ramp helps to emphasize
    values above in blue and below in red the
    mean

25
Classification by Normalization
  • Instead of using the direct data value, a
    computation is made based on percent of total
  • Alternately, one can compute a ratio of two data
    values for each record
  • If the variation in the data values is extremely
    large, instead of using the direct data value, a
    computation is made based on the log value
  • The Normalized Value then drives the data
    display

or
or
Write a Comment
User Comments (0)
About PowerShow.com