Data Storage and Editing - PowerPoint PPT Presentation

About This Presentation
Title:

Data Storage and Editing

Description:

Data Storage and Editing (Entity and attribute) DeMers Chapter 6 http://www.iupui.edu/~jeswilso/g438/lecture5/ Introduction Any analysis performs must be based on ... – PowerPoint PPT presentation

Slides: 40
Provided by: KAED

Transcript and Presenter's Notes

Title: Data Storage and Editing


1
Data Storage and Editing
  • (Entity and attribute) DeMers Chapter 6
  • http://www.iupui.edu/~jeswilso/g438/lecture5/

2
Introduction
  • Any analysis performed must be based on good data,
    correctly organized and in the proper format.
  • In raster, we may need to display each coverage
    to isolate illogical or out-of-place grid cells
    as we compare them to the input document
  • In vector systems, we may have to build in
    topology after the initial data input, to
    pinpoint any digitization errors
  • To check entity-attribute agreement, we may
    need to output sample portions of our map for
    comparison against the original input material

3
Storage of GIS Databases
  • Raster: Attribute values for grid cells are the
    primary data stored in the computer. The values
    make up the actual grid, and the positions of
    grid cells are catalogued relative to the order
    in which they appear; e.g., if you store the
    origin of the grid, the cell size, and the number
    of rows and columns, all you need is the cell
    values
  • Vector: It is common for GISs to store vector
    entities and associated attributes in separate
    files (the reason for an RDBMS). For example, in
    the ArcView shapefile format, entities are stored
    in one file, attributes in another, and
    projection info in a third file; an Arc/Info
    coverage uses a workspace, an entity directory,
    and an info directory
  • Tiling - storage of individual sections (tiles)
    in predefined subsections. The purpose is to
    reduce volume of data needed for analysis of any
    particular section e.g., quad boundaries, TR
    grid, etc.
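
The raster bullet above can be illustrated with a minimal Python sketch. It assumes a row-major list of cell values and an origin at the upper-left corner of the grid; the class name, field names, and sample values are hypothetical and not taken from any particular GIS package.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RasterGrid:
    origin_x: float      # x of the upper-left corner (assumed convention)
    origin_y: float      # y of the upper-left corner (assumed convention)
    cell_size: float     # width and height of one square cell
    n_rows: int
    n_cols: int
    values: List[int]    # cell values stored row by row

    def value_at(self, row: int, col: int) -> int:
        # Position is implied by storage order, so no coordinates are stored.
        return self.values[row * self.n_cols + col]

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        # Coordinates are recomputed from the origin and cell size on demand.
        x = self.origin_x + (col + 0.5) * self.cell_size
        y = self.origin_y - (row + 0.5) * self.cell_size
        return (x, y)


# Hypothetical 2 x 3 grid of land-use codes
grid = RasterGrid(100.0, 200.0, 10.0, 2, 3, [1, 1, 2, 3, 3, 2])
print(grid.value_at(1, 0), grid.cell_center(1, 2))
```

Only the header (origin, cell size, rows, columns) and the ordered values are kept; every cell location is derived from them.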

4
The Importance of Editing the GIS Database
  • Most errors result from improper input
  • Generally, at least some errors will always occur
    and require editing, e.g., pushing the wrong
    digitizer button (vertex instead of node),
    pushing the wrong keyboard key when entering
    attribute information, and position errors in
    digitizing (shaky hand)
  • 3 general types of error:
  • Entity error (position error): primarily
    associated with the vector model (missing
    entities, incorrectly placed entities,
    disordered entities)
  • Attribute error: occurs in both vector and
    raster models (typing errors, misspellings, etc.)
  • Entity-attribute agreement error (a.k.a.
    logical consistency): correctly typed codes
    attached to the wrong entities

5
Accuracy
  • The degree to which information on a map or in a
    digital database matches true or accepted values
  • An issue pertaining to the quality of data and
    the number of errors contained in a data set or
    map
  • It is possible to consider horizontal and
    vertical accuracy with respect to geographic
    position
  • Attribute accuracy - conceptual, and logical
    accuracy
  • Level of accuracy required for particular
    applications varies greatly. Highly accurate data
    can be very difficult and costly to produce and
    compile
  • e.g., mapping standards employed by the United
    States Geological Survey (USGS): "requirements
    for meeting horizontal accuracy as 90 per cent of
    all measurable points must be within 1/30th of
    an inch for maps at a scale of 1:20,000 or
    larger, and 1/50th of an inch for maps at scales
    smaller than 1:20,000."

6
Accuracy Standards for Various Scale Maps
  • 1:1,200      3.33 feet
  • 1:2,400      6.67 feet
  • 1:4,800      13.33 feet
  • 1:10,000     27.78 feet
  • 1:12,000     33.33 feet
  • 1:24,000     40.00 feet
  • 1:63,360     105.60 feet
  • 1:100,000    166.67 feet
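
The ground distances in this list follow from the map tolerances quoted on the Accuracy slide (1/30 inch at 1:20,000 or larger, 1/50 inch at smaller scales). The following Python sketch reproduces them under that assumption; the function name is hypothetical.

```python
def ground_tolerance_feet(scale_denominator: int) -> float:
    """Ground distance, in feet, covered by the allowed map error."""
    # 1/30 inch for scales of 1:20,000 or larger, otherwise 1/50 inch
    divisor = 30 if scale_denominator <= 20000 else 50
    map_error_as_ground_inches = scale_denominator / divisor
    return map_error_as_ground_inches / 12  # 12 inches per foot


for denom in (1200, 2400, 4800, 10000, 12000, 24000, 63360, 100000):
    print(f"1:{denom:,}  {ground_tolerance_feet(denom):.2f} feet")
```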

8
Precision
  • Refers to the level of measurement and exactness
    of description in a GIS database (e.g., number of
    decimal places)
  • Precise locational data may measure position to a
    fraction of a unit e.g. to the millimeter
  • Precise attribute information may specify the
    characteristics of features in great detail
  • Important to realize, however, that precise
    data--no matter how carefully measured--may be
    inaccurate
  • Level of precision required for particular
    applications varies greatly. Engineering projects
    such as road construction require very precise
    information measured to the millimeter.
    Demographic analyses of marketing or electoral
    trends can often make do with less, say to the
    closest zip code or precinct boundary
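
The difference between precision and accuracy can be shown with a small Python sketch. All coordinates are invented for illustration: one position is recorded to the millimeter but is several meters off, while the other is recorded only to the nearest half meter and is closer to the assumed true value.

```python
import math

true_position = (1000.000, 2000.000)       # hypothetical accepted value, meters
precise_but_wrong = (1003.217, 1998.402)   # recorded to the millimeter
coarse_but_close = (1000.0, 2000.5)        # recorded to the nearest half meter


def positional_error(measured, truth):
    """Straight-line distance between a measured and a true position."""
    return math.hypot(measured[0] - truth[0], measured[1] - truth[1])


print(positional_error(precise_but_wrong, true_position))
print(positional_error(coarse_but_close, true_position))
```

The more precise record has the larger positional error, which is the point of the fourth bullet above.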

9
Why be concerned about error? - The Problems of
Propagation and Cascading
  • Discussion focused to this point on errors that
    may be present in single sets of data
  • "Doing" GIS usually involves comparisons of many
    sets of data. If errors exist in one or all of
    the data layers, the solution to the GIS problem
    generated from them may itself be erroneous
  • Inaccuracy, imprecision, and error may be
    compounded in GIS that employ many data sources

10
DIGITIZATION (continued)
  • [Figure: coverage with tics numbered 1 to 4 and
    geographic features]

11
Error Propagation and Cascading
  • Occurs when one error leads to another
  • Means that erroneous, imprecise, and inaccurate
    information will skew a GIS solution when
    information is combined
  • DeMers - "error prone data will lead to error
    prone analysis"
  • e.g., if a map registration point has been
    mis-digitized in one coverage and is then used to
    register a second coverage
  • Result: the second coverage will propagate the
    first mistake
  • In this way, a single error may lead to others
    and spread until it corrupts data throughout the
    entire GIS project
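
The registration example above can be sketched in Python. This is a deliberately simplified illustration that registers a coverage by translation only, using a single tic; real registration uses several tics and a fuller transformation. All coordinates and names are hypothetical.

```python
def register_by_translation(points, tic_digitized, tic_map):
    """Shift digitized points so the digitized tic lands on its map position."""
    dx = tic_map[0] - tic_digitized[0]
    dy = tic_map[1] - tic_digitized[1]
    return [(x + dx, y + dy) for x, y in points]


tic_map = (500, 500)          # known map position of the registration point
tic_correct = (5, 5)          # where the tic should have been digitized
tic_misdigitized = (7, 4)     # where it was actually digitized
second_coverage = [(6, 6), (7, 9)]

good = register_by_translation(second_coverage, tic_correct, tic_map)
bad = register_by_translation(second_coverage, tic_misdigitized, tic_map)

# Every feature in the second coverage inherits the same displacement.
offsets = [(bx - gx, by - gy) for (bx, by), (gx, gy) in zip(bad, good)]
print(offsets)
```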

12
Entity Errors: Vector
  • Six categories identified by DeMers/ESRI
  • All entities that should have been entered are
    present
  • No extra entities have been entered
  • Entities are in the right place and are of the
    correct shape and size
  • Entities that are supposed to be connected to
    each other are connected
  • All polygons have a single label point which
    identifies them
  • All entities are within the outside boundary
    identified

13
Nodes and Vertices
  • Specific types of entity errors in vector GIS
  • can involve points, lines, polygons, nodes,
    vertices, label points
  • nodes - denote ends of lines or point where
    polygon closes on itself
  • vertices - denote change of direction within a
    line
  • points -> lines -> polys
  • Nodes are used to show specific topological
    relationships, e.g.
  • intersection of roads or streams
  • intersection between stream and lake
  • node errors include pseudo nodes and dangle
    nodes

14
Pseudo nodes
  • Occur where a line connects with itself or with
    another line
  • A line connects with itself to form a polygon,
    a.k.a. island pseudo node (fig. 6.1a, p. 161)
  • Also occur where two lines intersect (rather than
    crossing) (fig. 6.1b)
  • Pseudo nodes are not necessarily errors, but
    indicate the potential location of errors
  • e.g., a pseudo node in the middle of a line
    representing a road can be used to separate the
    road into two different speed limit zones
  • Others may indicate an error (pushed the wrong
    button when digitizing, placed the cursor at the
    wrong location)

15
Digitization errors - Pseudo node (diamond)
  • A pseudo node connects two and only two arcs
  • A pseudo node does not necessarily represent a
    serious error
  • [Figure labels: pseudo node, error]

16
Dangle nodes
  • A single node connected to a single line
  • Again, not necessarily an error, but may be
  • Can result from three possible mistakes (fig.
    6.2, p. 162)
  • Failure to close a polygon
  • Undershoot
  • Overshoot
  • Sometimes result from incorrect placement,
    sometimes from fuzzy tolerance and snapping
    distance
  • One method of general error detection is
    comparing the digitized map to the original
    document at equivalent scales; good for broad
    scale obvious errors, not for finer scale errors
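
The definitions of dangle and pseudo nodes can be turned into a simple check by counting how many arc ends meet at each node. The Python sketch below assumes arcs are given as (from-node, to-node) pairs; the function name and node labels are hypothetical, and flagged nodes still need review because not every dangle or pseudo node is an error.

```python
from collections import Counter


def classify_nodes(arcs):
    """Label each node by the number of arc ends that meet there."""
    valence = Counter()
    for from_node, to_node in arcs:
        valence[from_node] += 1
        valence[to_node] += 1   # a closed arc counts twice at its single node

    labels = {}
    for node, count in valence.items():
        if count == 1:
            labels[node] = "dangle"   # a single node connected to a single line
        elif count == 2:
            labels[node] = "pseudo"   # two and only two arc ends meet here
        else:
            labels[node] = "normal"   # three or more arc ends: true intersection
    return labels


arcs = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("f", "f")]
print(classify_nodes(arcs))
```

Here node "f" belongs to an arc that closes on itself, so it is reported as an island pseudo node.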

17
DIGITIZATION
  • For linear features such as rivers, roads, and
    railways, it is important to digitize each section
    separately (with a start node and an end node for
    each section) or to use routes later
  • [Figure label: node]

18
Digitization errors - Dangle Error (square)
  • [Figure labels: overshoot, undershoot, closed
    polygon, natural feature]
  • Acceptable dangle node, e.g., the end of a road

19
Label point and sliver errors
  • Polygon label point errors (points -> lines ->
    polys)
  • Label point is used to associate a polygon with
    attributes
  • If the label point is missing, or there is more
    than one, this indicates an error, e.g., fig. 6.4,
    p. 163
  • Sliver polygon errors
  • Commonly result from the incorrect practice of
    double digitizing
  • Can also result from overlay or merging
    operations which join coverages from different
    sources
  • Can be removed manually or by dissolving polygons
    smaller than a certain area and/or comparing the
    intended number of polys with the actual number
    (Fig. 6.5, p. 164)
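
Both checks on this slide can be sketched in Python: flagging polygons whose area falls below a threshold as sliver candidates, and flagging polygons that do not have exactly one label point. The polygon ids, coordinates, and threshold are hypothetical, and the area is computed with the standard shoelace formula rather than any Arc/Info routine.

```python
def polygon_area(ring):
    """Shoelace formula; ring is a list of (x, y) vertices, first not repeated."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def find_slivers(polygons, min_area):
    """Ids of polygons smaller than min_area (candidates for dissolving)."""
    return [pid for pid, ring in polygons.items() if polygon_area(ring) < min_area]


def label_errors(label_counts):
    """Polygons with no label point or more than one label point."""
    return {pid: n for pid, n in label_counts.items() if n != 1}


polygons = {
    "field": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "sliver": [(10, 0), (10.05, 5), (10, 10)],
}
print(find_slivers(polygons, min_area=1.0))
print(label_errors({"field": 1, "lake": 0, "park": 2}))
```

A small area alone does not prove a polygon is a sliver, so the candidates should be compared with the intended number of polygons before dissolving.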

20
Digitization errors-Labels
  • Missing labels or too many labels
  • [Figure: polygons with too many labels and with
    missing labels]

21
Sliver polygon errors
22
How to correct digitization errors?
  • List digitization errors using the commands
    NODEERRORS and LABELERRORS
  • Use ARCEDIT to edit the coverage, then use the
    edit feature (ef) command, e.g., ef label, ef
    node, ef arc
  • Use a series of commands such as nodesnap,
    arcsnap, reshape, split, add, delete, move, copy,
    rotate, extend, and unsplit
  • For labels use Createlabels

23
Topology
  • Topology is the process of projecting complex
    surfaces to simple ones
  • Topology is a procedure for explicitly defining
    spatial relationships connecting adjacent
    features (e.g., arcs, nodes, polygons, and
    points).
  • Different types of spatial relationships are
    expressed as lists of features e.g.
  • An area is defined by the arcs comprising its
    border
  • An arc is defined by a set of points (X,Y)

24
Topology-Main Concepts
  • The three major topological concepts are:
  • Connectivity: Arcs connected to each other at
    nodes
  • Contiguity/Adjacency: Arcs have direction and
    left and right sides
  • Area Definition: Arcs connected to surround an
    area define a polygon (area)
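
The three concepts can be illustrated with a small arc table in Python. This is a sketch of the general arc-node idea, not the actual Arc/Info file layout: each arc stores a from-node, a to-node, a left polygon, and a right polygon, and 0 is assumed here to stand for the area outside the map. All ids are hypothetical.

```python
from collections import defaultdict

# arc id: (from_node, to_node, left_polygon, right_polygon)
# Two polygons, A (west) and B (east), share arc 3.
arcs = {
    1: ("n1", "n2", 0, "A"),
    2: ("n2", "n1", 0, "B"),
    3: ("n1", "n2", "A", "B"),
}

node_arcs = defaultdict(list)      # connectivity: arcs meeting at each node
polygon_arcs = defaultdict(list)   # area definition: arcs bounding each polygon
adjacent = set()                   # adjacency: polygon pairs sharing an arc

for arc_id, (from_node, to_node, left, right) in sorted(arcs.items()):
    node_arcs[from_node].append(arc_id)
    node_arcs[to_node].append(arc_id)
    polygon_arcs[left].append(arc_id)
    polygon_arcs[right].append(arc_id)
    if left != 0 and right != 0:
        adjacent.add(frozenset((left, right)))

print(dict(node_arcs))
print(dict(polygon_arcs))
print(adjacent)
```

Because each shared boundary is stored once with its left and right polygons, neighbors can be found from the table without comparing coordinates.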

25
Spatial Relationships (Topology)
  • [Figure panels: Area Definition, Connectivity,
    Adjacency]

26
Polygon Topology

27
Advantages of Topology
  • Check for digitization errors (overshoot,
    undershoot, unclosed polygon, missing labels, too
    many labels)
  • Store data more efficiently (eliminate data
    redundancy-normalization)
  • Make spatial analysis faster

28
Topology
  • Topological data structures dominate GIS
    software.
  • Topology allows automated error detection and
    elimination.
  • Rarely are maps topologically clean when
    digitized or imported.
  • A GIS has to be able to build topology from
    unconnected arcs.
  • Nodes that are close together are snapped.
  • Slivers due to double digitizing and overlay are
    eliminated.

29
Creating topology in Arc/Info
  • After digitization and correction of digitization
    errors, topology can be built
  • The command BUILD is used for point, line, or
    polygon coverages
  • The command CLEAN is used for line and polygon
    coverages
  • CLEAN never creates topology for point coverages
  • BUILD never detects intersections of arcs and
    polygons

30
Topology commands
  • C\ARC CLEAN in-cov out-cov dangle-length
    fuzzy-tol
  • C\ARC CLEAN road1 road2 3.4
  • C\ARC BUILD in-cov POLY/ LINE/ POINT
  • C\ARC BUILD cities POINT
  • For features that have no intersection such as
    contours, BUILD with line option can be used
  • For features that have intersection such as roads
    and lots, it is better to first use CLEAN and
    then use BUILD

31
Tables created by topology
  • Arc Attribute Table (AAT)
  • Polygon Attribute Table (PAT)
  • Point Attribute Table (PAT): area and perimeter
    are 0
  • Route Attribute Table (RAT)
  • Feature Attribute Table (FAT)
  • Node Attribute Table (NAT)

32
Hint for topology
  • Make a copy of the original data before starting
    to build topology
  • Adopt a consistent strategy for naming coverages
  • For example, names of raw coverages start with R,
    e.g., Rroads and Rlanduse
  • Keep coverage names to 8 characters or fewer,
    without an extension (8.3)

33
Coordinate Transformation
  • The tablet coordinates must be converted to real
    world (map) coordinates
  • The commands used for coordinate transformation
    are:
  • CREATE or GENERATE - used to create a master
    coverage
  • The (X,Y) of the tic file (Tic.dbf) must be set
    to map coordinates.
  • TRANSFORM - used to transform the coverage

34
Coordinate Transformation-continue
  • Latitude (φ) and longitude (λ) must be converted
    to decimal degrees (DD), e.g., latitude 13 deg
    45 min 55 sec = 13 + 45/60 + 55/3600
  • Project the decimal degrees to plane coordinate
    e.g. UTM
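
The conversion to decimal degrees is simple arithmetic: degrees plus minutes divided by 60 plus seconds divided by 3600. A minimal Python sketch follows; the function name and the hemisphere parameter are assumptions added for illustration.

```python
def dms_to_decimal_degrees(degrees, minutes, seconds=0.0, hemisphere="N"):
    """Convert degrees, minutes, seconds to signed decimal degrees."""
    dd = abs(degrees) + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -dd if hemisphere in ("S", "W") else dd


print(dms_to_decimal_degrees(13, 45))            # 13 deg 45 min north
print(dms_to_decimal_degrees(13, 45, 55, "S"))   # same latitude plus 55 sec, south
```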

  • [Figure: digitizer coordinates (0,0) and (5,8)
    correspond to map coordinates (0,0) and (50,80)]

35
Generate
  • Generate can create a coverage from raw
    coordinates (Id, X,Y) e.g. from GPS
  • Create a file of tic coordinates, e.g., Tic1, which
    is ASCII with (TICID, X, Y)
  • Create a file of polygon coordinates e.g. poly1
  • GENERATE INPUT Tic1 TICS
  • GENERATE INPUT Poly1 POLYS Quit

36
Attribute Errors: Raster and Vector
  • Attribute errors are generally more difficult to
    detect
  • Types include:
  • Missing attributes: perhaps the only kind of
    attribute error traceable without comparison to
    source material; e.g., plot all polygons and color
    them according to a certain attribute; if the
    color is missing, the attribute is missing
  • Incorrect attribute values or text: more
    difficult to detect; one method is to plot all
    polygons and color them according to a certain
    attribute; if only one polygon has a certain
    attribute value and there should be others, it
    may stick out; in general, detection involves
    direct comparison with source material
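
The two checks described above also have a tabular equivalent, sketched here in Python: list the records whose attribute is missing, and list the values that occur only once, since those may be typing errors. The record layout, field name, and sample values are hypothetical; a value that occurs once is only a candidate and still has to be compared with the source material.

```python
from collections import Counter


def missing_attributes(records, field):
    """Ids of records where the field is absent or empty."""
    return [pid for pid, attrs in records.items() if attrs.get(field) in (None, "")]


def singleton_values(records, field):
    """Attribute values used by exactly one record (possible typing errors)."""
    counts = Counter(
        attrs[field] for attrs in records.values()
        if attrs.get(field) not in (None, "")
    )
    return sorted(value for value, n in counts.items() if n == 1)


records = {
    1: {"landuse": "forest"},
    2: {"landuse": "forest"},
    3: {"landuse": "forset"},   # misspelling that would stick out on a plot
    4: {"landuse": None},
    5: {},
}
print(missing_attributes(records, "landuse"))
print(singleton_values(records, "landuse"))
```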

37
Dealing With Projection Changes
  • Often, regardless of input method, separate
    GIS data inputs for a project will be based on
    different projection systems
  • It is necessary to transform all data to a common
    system before use in integrated modeling
    (examples in ArcView)
  • Joining Adjacent Coverages: Edge Matching
    (Union)
  • Joining two adjacent coverages (usually of the
    same theme) together to produce a single data set
    that covers a broader region; edge matching is
    also done in raster systems

38
Conflation and Rubber Sheeting
  • Conflation and rubber sheeting: refers to the
    registration (georeferencing) of two maps (vector
    or raster) in a non-linear way (overlay two maps)
  • Used to make maps from different sources spatially
    correspond with one another. Most often used with
    raster data using ground control points (GCPs).
    Conflation and rubber sheeting are synonymous
    terms according to DeMers (Figure 6.1, p. 174)
  • The need is to georeference the internal objects
    themselves, not just the map corners (rubber
    sheeting)
  • Templating: "cookie cutting"
  • If you have multiple coverages of different
    extents, the template is used to "cookie cut"
    them all to the same extent

39
Exercise
  • Characteristics of data storage in raster and
    vector
  • 3 general types of error in spatial databases
  • Accuracy vs. precision
  • Error propagation and cascading of error in GIS
  • Types of errors in vector GIS
  • Types of errors in attribute data
  • The concept of topology - what is it, what types
    of Relationships are stored for point, line, and
    poly features, why do we need it in GIS?