GEOINFO 2006 - PowerPoint PPT Presentation

1
GEOINFO 2006
  • Utilização da biblioteca TerraLib para algoritmos
    de agrupamento em Sistemas de Informações
    Geográficas

Use of the TerraLib library for clustering
algorithms in Geographic Information Systems
Mauricio P. Guidini
Carlos H. C. Ribeiro (Supervisor)
Nov 2006
2
... 3000 unregistered flights, with origin and
destination unknown to the authorities, invaded
Brazilian airspace in the first ten months of
this year. The Air Force calculates that about
30 of these flights were related to drug dealing
...
Translated from note from
25/10/2004
3
Data Mining in GIS
  • Objective
  • To present the integration of a Data Mining
    algorithm (k-means) to TerraLib/TerraView,
    forming a Geographic Information System for
    Unknown Air Traffic analysis (GisTAD).

4
Data Mining in GIS
  • Summary
  • Data Mining
  • Clustering Algorithms
  • Air Traffic
  • K-means Implementation
  • Results
  • Application

5
Data Mining in GIS
Data Mining Definition: a non-trivial process
of identifying valid, new, useful patterns
implicitly present in large volumes of data.
Knowledge Discovery in Databases (KDD) - Fayyad et
al. (1996)
6
Data Mining in GIS
  • How does DM proceed?
  • KDD process

7
Data Mining in GIS
Clustering Algorithms: The clustering process
tries to partition the data into groups whose
members have highly similar features, which helps
in understanding the information the data hold.
A good clustering algorithm is characterized by
the production of high-quality classes, where the
intraclass similarity is high and the interclass
similarity is low (Han & Kamber, 2001).
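The quality criterion above can be illustrated with a small sketch. It assumes Euclidean distance as the (inverse) similarity measure and uses hypothetical helper names `mean_intra_distance` and `mean_inter_distance`; neither comes from the presentation. A good clustering should give a small intra-cluster value and a large inter-cluster value.

```python
from itertools import combinations, product
from math import dist
from statistics import mean


def mean_intra_distance(clusters):
    """Average distance between pairs of points in the same cluster.

    Assumes at least one cluster holds two or more points.
    """
    return mean(dist(a, b)
                for cluster in clusters
                for a, b in combinations(cluster, 2))


def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters.

    Assumes at least two non-empty clusters.
    """
    return mean(dist(a, b)
                for c1, c2 in combinations(clusters, 2)
                for a, b in product(c1, c2))


# Two tight, well-separated groups of 2-D points (made-up data).
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
```

Low distance corresponds to high similarity, so `intra` much smaller than `inter` matches the "high intraclass, low interclass similarity" description.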
8
Data Mining in GIS
  • Major Categories
  • Partitioning: k-means, k-medoids
  • Hierarchical: CURE, BIRCH
  • Density-based: DBSCAN, OPTICS
  • Grid-based: STING
  • Model-based
  • Others
  • ANN: Kohonen network
  • Incremental: Leader

9
Data Mining in GIS
  • Air Traffic
  • Movement of aircraft, national or foreign, that
    fly over national territory.
  • Unknown Air Traffic
  • For unidentified airplanes (flight plan), two
    lines of action can be taken (Bernabeu 2004):
  • Intercept or
  • Generate an Unknown Air Traffic Report

10
Data Mining in GIS
  • Traffic Representation
  • Line segments
  • Latitude (decimal degrees)
  • Longitude (decimal degrees)
  • Distance (miles)
  • Heading
  • Restrictions
  • Acceptable deviations
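
As a rough illustration of this representation, one record could be modeled as below. This is only a sketch: the class name `RouteSegment`, the field names, the helper functions, and the choice of degrees for the heading are assumptions, not details taken from GisTAD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteSegment:
    """One traffic record seen as a line segment (illustrative field names)."""
    lat: float       # latitude, decimal degrees
    lon: float       # longitude, decimal degrees
    distance: float  # segment length, miles
    heading: float   # assumed to be in degrees, 0 <= heading < 360


def heading_diff(a: float, b: float) -> float:
    """Smallest absolute angle between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def within_deviation(a: RouteSegment, b: RouteSegment,
                     max_coord: float, max_dist: float,
                     max_heading: float) -> bool:
    """True if b differs from a by no more than the acceptable deviations."""
    return (abs(a.lat - b.lat) <= max_coord
            and abs(a.lon - b.lon) <= max_coord
            and abs(a.distance - b.distance) <= max_dist
            and heading_diff(a.heading, b.heading) <= max_heading)


# Made-up records: two similar segments and one heading the other way.
r1 = RouteSegment(lat=-10.0, lon=-60.0, distance=120.0, heading=350.0)
r2 = RouteSegment(lat=-10.1, lon=-60.1, distance=118.0, heading=10.0)
r3 = RouteSegment(lat=-10.0, lon=-60.0, distance=120.0, heading=170.0)
```

Comparing headings with `heading_diff` rather than a plain subtraction keeps 350 and 10 degrees close to each other, which a naive difference would not.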

11
Data Mining in GIS
K-means algorithm
Precondition: set max deviation values for
coordinates, distance and route
Begin
  K = 0
  While criterion condition not satisfied (deviation in clusters)
    Increase K
    Arbitrarily choose K centers (among data objects)
    While centers change (k-means)
      (re)assign routes to clusters based on weights
      update center values
    end (no movement between groups)
  end (deviation in groups ok)
  Save results
End
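The procedure above might be sketched in Python as follows. This is not the GisTAD code: the weighted squared Euclidean distance is an assumption (the transcript does not preserve the distance formulas), records are plain tuples of (latitude, longitude, distance, heading), heading wraparound at 360 degrees is ignored for brevity, and all function names are hypothetical.

```python
import random


def weighted_dist(a, b, weights):
    """Weighted squared Euclidean distance (an assumed measure)."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def kmeans(points, k, weights, rng, max_iter=100):
    """Plain k-means for a fixed k; returns (centers, labels)."""
    centers = rng.sample(points, k)  # centers chosen among data objects
    labels = None
    for _ in range(max_iter):
        # (Re)assign every record to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: weighted_dist(p, centers[c], weights))
            for p in points
        ]
        if new_labels == labels:  # no movement between groups
            break
        labels = new_labels
        # Update each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, labels


def deviations_ok(points, centers, labels, max_dev):
    """True if every record is within max_dev of its center on each feature."""
    return all(abs(x - m) <= d
               for p, lab in zip(points, labels)
               for x, m, d in zip(p, centers[lab], max_dev))


def cluster_routes(points, max_dev, weights, seed=0):
    """Increase K until the deviation criterion holds in every cluster."""
    rng = random.Random(seed)
    for k in range(1, len(points) + 1):
        centers, labels = kmeans(points, k, weights, rng)
        if deviations_ok(points, centers, labels, max_dev):
            return k, centers, labels
    raise ValueError("no clustering satisfies the deviation limits")


# Made-up records: (lat, lon, distance in miles, heading).
routes = [(0.0, 0.0, 10.0, 90.0), (0.1, 0.1, 10.5, 91.0),
          (5.0, 5.0, 40.0, 270.0), (5.1, 5.0, 41.0, 271.0)]
k, centers, labels = cluster_routes(
    routes, max_dev=(0.5, 0.5, 2.0, 5.0), weights=(1.0, 1.0, 1.0, 1.0))
```

Because K grows one step at a time and k-means is rerun from scratch for each K, the cost rises quickly with the number of records.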
12
Data Mining in GIS
Distance Measure
  • Minimize deviations
  • Improve cluster quality

(Formulas not transcribed)
13
Data Mining in GIS
  • GIS Integration
  • TerraLib
  • TerraView
  • k-means

14
Data Mining in GIS
  • Data preparation
  • 8000 records
  • looking for information (what?)
  • Search space restrictions

15
Data Mining in GIS
  • Numeric Tests
  • up to 500 records
  • GisTAD Tests
  • 319 records
  • 73 groups
  • Approx. time: 40 s

16
TerraView
17
TerraView
18
(No Transcript)
19
Data Mining in GIS
  • Applications
  • Air Operations
  • Improper use of air space

20
(No Transcript)
21
Data Mining in GIS
Conclusion: For the proposed problem, the
k-means algorithm is applicable and returned a
good set of clusters. However, the number of
records to be clustered can make the
algorithm very time-consuming.
22
Future Work
  • Other partitioning algorithms should be
    implemented to verify which one is the most
    efficient for the problem under analysis,
    for any number of records to be clustered.
  • The algorithms to be tested are:
  • Kohonen neural network
  • Leader algorithm
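
For orientation, the Leader algorithm mentioned above can be sketched in a few lines. This is the common single-pass variant, in which each record joins the first leader found within a distance threshold and otherwise becomes a new leader. Euclidean distance, the function name `leader_clustering`, and the sample data are assumptions for illustration only.

```python
from math import dist


def leader_clustering(points, threshold):
    """Single-pass Leader clustering; returns (leaders, labels)."""
    leaders = []
    labels = []
    for p in points:
        for i, leader in enumerate(leaders):
            if dist(p, leader) <= threshold:
                labels.append(i)       # join the first leader close enough
                break
        else:
            leaders.append(p)          # no leader nearby: p starts a new cluster
            labels.append(len(leaders) - 1)
    return leaders, labels


# Made-up 2-D points.
points = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.0, 10.4), (0.2, 0.1)]
leaders, labels = leader_clustering(points, threshold=1.0)
```

Each record is visited once, so the run time grows with the number of records times the number of leaders. The result depends on the order of the records and on the chosen threshold.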