1
Estimating Missing Data in Sensor Network
Databases Using Spatial-Temporal Data Mining to
Support Space Data Analysis
  • Le Gruenwald
  • The University of Oklahoma
  • School of Computer Science
  • Norman, OK 73019
  • ggruenwald_at_ou.edu

2
Project Objective
  • To develop a mining framework to automatically
    estimate missing sensor readings and answer
    deliberate user stream queries

3
Project Progress (2007-2008)
  • Designed and developed a new Spatio-Temporal
    Mining Framework for answering mining queries and
    estimating missing node values (MASTER: Mining
    Autonomously Spatio-Temporal Environmental
    Rules).
  • Conducted simulation experiments comparing MASTER
    with existing approaches using climate sensor
    datasets obtained from the Sensor Webs Botanical
    Garden Project data server.

4
The Computing Environment (1)
  • Sensor Networks
  • Triggered by recent advances in Micro Electro
    Mechanical Systems (MEMS) technology, low-power
    analog and digital electronics, and low-power
    radio frequency (RF) design.
  • Purpose
  • To monitor, combine, analyze, and respond in a
    timely manner to the data collected by hundreds
    or thousands of sensors distributed in the
    physical world.
  • Examples
  • Space science: sensors collecting conditions on
    Mars.
  • Transportation: sensors for traffic monitoring.
  • Battlefield: sensors attached to soldiers or
    vehicles, or scattered throughout important
    areas.

5
The Computing Environment (2)
[Figure: Sensor1, Sensor2, ..., SensorN observe the real world and send data streams to a server; a user sends queries to the server and receives answers.]
6
The Computing Environment (3)
  • Data Streams
  • The most natural way to process data in the
    majority of sensor network applications
  • An append-only collection of tuples that is
    ordered by some increasing key value (often time)
    [Zdonik, 02]
  • Data Stream Example (Sensor X)

sens_id, time(n-4), reading
sens_id, time(n-3), reading
sens_id, time(n-2), reading
sens_id, time(n-1), reading
sens_id, time(n), reading
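
As an illustration of the stream layout above, here is a minimal Python sketch of an append-only collection of tuples ordered by time. The `Reading` and `SensorStream` names and the sample values are hypothetical and are not taken from the presentation.

```python
from collections import namedtuple

# Hypothetical tuple layout mirroring the slide: (sens_id, time, reading).
Reading = namedtuple("Reading", ["sens_id", "time", "reading"])


class SensorStream:
    """Append-only list of readings kept in increasing time order."""

    def __init__(self):
        self._tuples = []

    def append(self, item):
        # Reject tuples that would break the increasing-key ordering.
        if self._tuples and item.time <= self._tuples[-1].time:
            raise ValueError("stream key must be strictly increasing")
        self._tuples.append(item)

    def last(self, n):
        """Return the n most recent tuples, oldest first."""
        return self._tuples[-n:] if n > 0 else []


# Made-up readings for a sensor called "X" at times 0..4.
stream = SensorStream()
for t, value in enumerate([21.5, 21.7, 21.6, 21.9, 22.0]):
    stream.append(Reading("X", t, value))
```

Calling `stream.last(5)` returns the five most recent tuples, which corresponds to the window from time(n-4) to time(n) shown on the slide.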
7
Accomplishment
  • Developed a Spatio-Temporal Mining Framework to:
  • Capture the intrinsic spatial and temporal trends
    within sensor data streams
  • Automatically seek spatio-temporal trends to
    estimate any missing values
  • Support SQL-like query processing for
    evolving trend analysis

8
Framework Contributions (1)
  • Incrementally and compactly store data streams
    without abstracting away key trends using a
    single-pass storing procedure
  • Framework is resource-aware
  • User-defined space usage bound
  • User-defined storing time bound
  • User-defined estimation time bound
  • Quality of Service (QoS)
  • Resource bounds (above)
  • Probabilistic bound on estimation error margin

9
Framework Contributions (2)
  • Assumption-free: no statistical distribution
    models are assumed a priori (e.g., Markovian,
    Gaussian, etc.)
  • Comprehensive Association Rule Definition
  • Include temporal qualifiers (time expressions
    over user defined time attributes)
  • Include any number (as limited by the enforced
    overhead bounds) of node items in the rule
  • Each multidimensional node item in the rule may
    relate to other nodes in the same rule with
    respect to any data range from the entire vector
    space (hence linear and non-linear correlations)

10
System Architecture
11
Sensor Temporal Association Rule Examples
  • On weekdays during rush hours, if a traffic
    sensor A reports between 20 and 30 average
    passing cars per minute then it can be deduced
    that a far off traffic sensor will indicate an
    average of 15 to 20 passing cars per minute.
  • On spring days between 12-2pm if the temperature
    of a node A is between 30 and 35, its humidity
    between 50 and 60, and the temperature reported
    by a separate node B is between 20 and 25 then
    the humidity reported by a third node C is
    likely between 40 and 45.
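
The second example rule can be written down as data plus a small check. The sketch below is only an illustration: the `NodeItem` class, the reading of "spring" as March through May, and the sample readings are assumptions, not part of the presentation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class NodeItem:
    """One node attribute constrained to a closed value range."""
    node: str
    attribute: str
    low: float
    high: float

    def holds(self, snapshot):
        value = snapshot.get((self.node, self.attribute))
        return value is not None and self.low <= value <= self.high


def in_spring_midday(ts):
    # Assumed temporal qualifier: March-May, from 12:00 up to 14:00.
    return ts.month in (3, 4, 5) and 12 <= ts.hour < 14


# Antecedent and consequent ranges copied from the example rule.
antecedent = [
    NodeItem("A", "temperature", 30, 35),
    NodeItem("A", "humidity", 50, 60),
    NodeItem("B", "temperature", 20, 25),
]
consequent = NodeItem("C", "humidity", 40, 45)


def rule_fires(ts, snapshot):
    """True when the time qualifier and every antecedent item hold."""
    return in_spring_midday(ts) and all(i.holds(snapshot) for i in antecedent)


# Made-up readings for one round in which node C's humidity is missing.
ts = datetime(2008, 4, 15, 13, 0)
snapshot = {
    ("A", "temperature"): 32.0,
    ("A", "humidity"): 55.0,
    ("B", "temperature"): 22.5,
}
if rule_fires(ts, snapshot):
    print(f"{consequent.node} {consequent.attribute} is likely in "
          f"[{consequent.low}, {consequent.high}]")
```

When the rule fires, the consequent range is the candidate estimate for the missing humidity at node C.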

12
Temporal Association Rule Formalization (1)
  • Traditional Association Rule
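
The formula on this slide did not survive in the transcript. For reference, the standard textbook definition of an association rule over an item set $I$ and a transaction database $D$ is given below; the notation is an assumption and may differ from the slide.

```latex
\[
X \Rightarrow Y, \qquad X, Y \subseteq I, \quad X \cap Y = \emptyset
\]
\[
\mathrm{supp}(X) = \frac{|\{\, t \in D : X \subseteq t \,\}|}{|D|},
\qquad
\mathrm{supp}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y),
\qquad
\mathrm{conf}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
\]
```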

13
Temporal Association Rule Formalization (2)
  • Sensor-context Association Rule
  • If node items of the rule have values over the
    vector space of their transmission range then
  • We map back to Boolean items by evaluating each
    node item over a specific subspace (i.e., an item
    evaluates to true when the data falls in the
    particular subspace that is now part of the rule
    definition and false otherwise)
  • The goal of the rule-mining estimation method is
    to seek appropriate node items over appropriate
    respective subspaces to imply the consequent
    subspace of the missing node
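
The mapping back to Boolean items can be sketched as follows: each node item is true in a round when its reading falls inside its subspace, and support and confidence are then counted as for an ordinary association rule. This is a simplified one-dimensional illustration, not the MASTER algorithm itself. The function names, the sample history, the confidence threshold, and the use of the subspace midpoint as the estimate are all assumptions.

```python
def in_subspace(value, bounds):
    """Boolean item: true when the reading lies inside the closed range."""
    low, high = bounds
    return value is not None and low <= value <= high


def rule_stats(history, antecedent, consequent):
    """Count support and confidence of a subspace rule.

    history:    list of dicts mapping node -> reading, one dict per round
    antecedent: dict mapping node -> (low, high)
    consequent: tuple (node, (low, high))
    """
    node_c, bounds_c = consequent
    n_ante = n_both = 0
    for row in history:
        if all(in_subspace(row.get(n), b) for n, b in antecedent.items()):
            n_ante += 1
            if in_subspace(row.get(node_c), bounds_c):
                n_both += 1
    support = n_both / len(history) if history else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Made-up history of four rounds for nodes A, B and C.
history = [
    {"A": 31.0, "B": 22.0, "C": 41.0},
    {"A": 33.0, "B": 24.0, "C": 44.0},
    {"A": 34.0, "B": 21.0, "C": 47.0},
    {"A": 28.0, "B": 22.0, "C": 42.0},
]
antecedent = {"A": (30, 35), "B": (20, 25)}
consequent = ("C", (40, 45))
support, confidence = rule_stats(history, antecedent, consequent)

# Assumption: if the rule is confident enough, estimate the missing
# reading by the midpoint of the consequent subspace.
MIN_CONFIDENCE = 0.6  # hypothetical threshold
low, high = consequent[1]
estimate = (low + high) / 2 if confidence >= MIN_CONFIDENCE else None
```

In this toy history the antecedent holds in three of the four rounds and the consequent holds in two of those three, so the rule has support 0.5 and confidence 2/3.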

14
Temporal Association Rule Formalization (3)
15
Iterative Estimation Method Workflow
16
SQL-like Query Mining
  • MINE
  • IN
  • SELECT <node-list>
  • FROM <cluster-id>
  • WITH <largest-allowed-time-period>
  • WHERE <consequent-node-set>,
    <initial-relevant-subspace-list-expression>
  • HAVING <minimum-support-threshold>,
    <MCSS-threshold>,
    <minimum-confidence-threshold>

17
Estimation Results Synopsis (1)
  • Input
  • One year of readings, sampled every 5 minutes,
    from sensors embedded in the Huntington Botanical
    Garden
  • Reported tuple from each sensor pod (temperature,
    humidity, flux)
  • Output
  • Estimated missing sensor temperature values
  • Performance Measures
  • MAE (Mean Absolute Error)
  • Average Execution Time Per Round (in ms)
  • Space Consumption (in KB)
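
For reference, MAE is the average of the absolute differences between actual and estimated readings. The short sketch below shows the computation; the sample temperatures are made up and are not results from the experiments.

```python
def mean_absolute_error(actual, estimated):
    """Average of |actual - estimated| over paired readings."""
    if len(actual) != len(estimated):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("at least one reading is required")
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


# Made-up temperatures in degrees Celsius.
actual = [20.0, 21.5, 23.0, 22.0]
estimated = [20.5, 21.0, 23.0, 23.0]
mae = mean_absolute_error(actual, estimated)
print(mae)  # 0.5
```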

18
Sensor Network Spatial Map
19
NASA Dataset Server
20
Estimation Results Synopsis (2)
  • Performance Measures
  • MAE (Mean Absolute Error): 0.57 degrees Celsius
  • Absolute error distribution (temperature error in
    degrees Celsius)