Knowledge Discovery from Transportation Network Data

1
Knowledge Discovery from Transportation Network Data
  • Paper Review
  • Jiang, W., Vaidya, J., Balaporia, Z.,
    Clifton, C., and Banich, B. Knowledge Discovery
    from Transportation Network Data. In ICDE, 2005.

2
Outline
  • Background.
  • Experiments.
  • Structurally Similar Routes
  • Temporally Repeated Routes
  • Experiment results.
  • Conventional techniques.
  • New challenges.

3
A natural application area for Data Mining
  • Transportation and logistics are an important
    sector of the economy.
  • --Transportation consumes 60% of oil
    worldwide
  • Data mining has led to significant gains in
    other areas
  • Computer use is widespread in transportation and
    logistics.
  • --Inventory management, parcel tracking,
    and even on-truck location sensors

4
Existing Applications
  • Data Mining
  • Mining with transactional characteristics of
    freight and events.
  • -- e.g., classification on
    safety/accident records might find that trucks
    are prone to accidents at 7:00 AM on east-west
    roads.
  • -- Does not use the geometry of the network.
  • Network Structure
  • Optimization
  • -- Finds a solution (minimize cost)

5
Transportation Networks
  • Graph problems
  • Graph mining
  • e.g., finding frequent sub-graphs
  • Algorithms
  • WARMR
  • AGM
  • SUBDUE
  • FSG

6
Dataset
  • Six months of origin-destination (OD) data from a
    large third-party logistics company: 98,292
    transactions.
  • Represented as a directed graph by mapping
    locations to vertices.
  • Each transaction is then represented as the
    edge between its OD pair.
  • The edges are labeled with the other attributes
    of the transaction: pickup date, delivery date,
    distance, hours, weight, and mode (using a
    binning strategy).
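The graph construction described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the `Shipment` record layout, the function names, and the bin edges are all assumptions made for the example, and only the distance attribute is turned into an edge label.

```python
from collections import namedtuple

# Hypothetical record layout; the field names are assumptions, not the paper's schema.
Shipment = namedtuple("Shipment", "origin dest distance hours weight mode")

def bin_label(value, upper_edges):
    """Return the index of the first bin whose upper edge is >= value."""
    for i, upper in enumerate(upper_edges):
        if value <= upper:
            return i
    return len(upper_edges)  # overflow bin for values above the last edge

def build_od_graph(shipments, distance_bins):
    """Map locations to vertices and each shipment to a labeled directed edge."""
    vertices = set()
    edges = []  # a multigraph: one edge per transaction
    for s in shipments:
        vertices.update((s.origin, s.dest))
        edges.append((s.origin, s.dest, bin_label(s.distance, distance_bins)))
    return vertices, edges

# Made-up sample data, for illustration only.
shipments = [
    Shipment("A", "B", 120.0, 4.0, 900.0, "truck"),
    Shipment("B", "C", 640.0, 14.0, 15000.0, "truck"),
    Shipment("A", "B", 130.0, 5.0, 1200.0, "truck"),
]
vertices, edges = build_od_graph(shipments, distance_bins=[250, 500, 1000])
```

Binning keeps the number of distinct edge labels small, which matters because graph miners match edges by exact label.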

7
(No Transcript)
8
Mining Interests
  • Structurally Similar Routes
  • --Identify structurally similar patterns that
    occur in many locations.
  • Methods: SUBDUE, FSG
  • Temporally Repeated Routes
  • --Find patterns of routes repeated in time,
    rather than space.
  • Method: FSG

9
Structurally Similar Routes
  • We assign all vertices the same label.
  • Three variants for edge labels: weight, distance,
    and time.
  • -- OD_TD: TOTAL-DISTANCE
  • -- OD_GW: GROSS-WEIGHT
  • -- OD_TH: MOVE-TRANSIT-HOURS

10
Experiments with SUBDUE (MDL principle)
  • SUBDUE: a substructure discovery system
  • Results
  • Took about 3.25 hours on a graph of 100
    vertices and 561 edges to find the best 3
    patterns with a beam size of 4.
  • Would need an estimated 6 months on the complete graph.
  • The results were trivial.

11
  • Significant traffic from node 2 to node 4 via
    node 3, but not much return traffic (deadheading)
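The imbalance noted here (heavy traffic in one direction, little coming back) can be quantified with a simple count. The sketch below is an illustration only: `directional_imbalance` is a hypothetical helper, and the edge counts are made up rather than taken from the paper.

```python
from collections import Counter

def directional_imbalance(edges):
    """For each directed pair (u, v), return trips u->v minus trips v->u."""
    flow = Counter(edges)
    return {(u, v): n - flow.get((v, u), 0) for (u, v), n in flow.items()}

# Made-up traffic: many shipments 2 -> 3 -> 4, a single return leg 3 -> 2.
sample_edges = [(2, 3)] * 5 + [(3, 4)] * 4 + [(3, 2)] * 1
imbalance = directional_imbalance(sample_edges)
```

A large positive value on a pair suggests trucks on that leg often return empty.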

12
Experiments with FSG
  • FSG mines patterns across a set of graph
    transactions.
  • Divides the single graph into multiple distinct
    sub-graphs, and treats each sub-graph as a
    separate transaction.
  • Breadth first partitioning
  • Depth first partitioning
  • Both may result in patterns being broken across
    partitions
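A minimal sketch of breadth-first partitioning, assuming the graph is given as a list of directed `(u, v)` edges and that a partition is simply a chunk of at most `max_edges` edges in BFS order. The function name and the chunking rule are assumptions for illustration; the paper's exact procedure may differ.

```python
from collections import defaultdict, deque

def bfs_partitions(edges, max_edges):
    """Split an edge list into edge-disjoint chunks, filled in BFS order."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((idx, v))
        adj[v].append((idx, u))  # direction is ignored while traversing
    used, visited = set(), set()
    partitions, current = [], []
    for start in sorted(adj):  # restart BFS for each unvisited component
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for idx, nbr in adj[node]:
                if idx in used:
                    continue
                used.add(idx)
                current.append(edges[idx])
                if len(current) == max_edges:  # chunk is full, start a new one
                    partitions.append(current)
                    current = []
                if nbr not in visited:
                    visited.add(nbr)
                    queue.append(nbr)
    if current:
        partitions.append(current)
    return partitions

# A path a -> b -> c -> d -> e split into chunks of two edges.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
parts = bfs_partitions(path, max_edges=2)
```

In this example the route b -> c -> d ends up split between the two chunks, which is the kind of broken pattern the slide warns about.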

13
  • Results
  • Partition sizes: 400, 800, 1200 and 1600.
  • Depth-first partitioning: 200 frequent patterns
    were found with a minimum support of 120.
  • Breadth-first partitioning: 667 frequent patterns
    were found with a minimum support of 240.
  • Had runtime and memory problems with lower
    supports on the breadth-first partitions.
  • FSG is not an appropriate tool to use for mining
    recurrence patterns in a large single graph

14
(No Transcript)
15
Temporally Repeated Routes
  • FSG
  • Exploits the temporal nature of the
    transportation graph
  • Partition each graph into a set of graph
    transactions based on date

16
  • Results
  • Unable to run FSG on the entire data set due to
    insufficient memory / swap space.
  • Most discovered patterns were small. (The slide
    showed the biggest one.)

17
Patterns Discovered by Using Conventional Mining Algorithms
  • Mapped the dataset into a standard
    transactional representation.
  • Used traditional data mining approaches.
  • Used Weka for association rule mining, instance
    (tuple) classification and cluster analysis on
    the transportation data.
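To illustrate what association rule mining measures on a transactional (one row per shipment) representation, the sketch below computes support and confidence for a single rule. It is plain Python, not Weka, and the attribute names and rows are hypothetical.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    def matches(row, items):
        return all(row.get(key) == value for key, value in items.items())

    n_ante = sum(1 for r in rows if matches(r, antecedent))
    n_both = sum(
        1 for r in rows if matches(r, antecedent) and matches(r, consequent)
    )
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# Made-up rows with binned attributes, for illustration only.
rows = [
    {"mode": "truck", "weight": "heavy"},
    {"mode": "truck", "weight": "heavy"},
    {"mode": "truck", "weight": "light"},
    {"mode": "rail", "weight": "heavy"},
]
support, confidence = rule_stats(rows, {"mode": "truck"}, {"weight": "heavy"})
```

Here the rule "mode = truck implies weight = heavy" holds in two of four rows (support 0.5) and in two of the three truck rows (confidence about 0.67).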

18
Evaluations of Conventional Algorithms
  • Traditional data mining techniques have produced
    interesting and meaningful results to summarize
    our data.
  • Further experimentation is required to explore
    the potential and limitations of these techniques
    on temporal transportation network data.
  • These techniques lose some insights from the
    structural characteristics of the data.

19
Challenges for Data Mining Research
  • Handling the temporal aspects of graphs (dynamic
    graphs).
  • Incorporating the notion of events into a graph.
  • Expanding graph mining techniques beyond data
    similar to molecular structures.
  • Determining what makes a graph pattern
    interesting.