DIMENSIONS: Why do we need a new Data Handling architecture for sensor networks

1
DIMENSIONS: Why do we need a new Data Handling
architecture for sensor networks?
  • Deepak Ganesan, Deborah Estrin (UCLA), John
    Heidemann (USC/ISI)
  • Presenter: Vijay Sundaram

2
Deployment: Microclimate monitoring at James
Reserve Park (UC Riverside)
Weather Sensor Network
Example queries:
  • How well does data fit model <M> of variation of
    temperature with altitude?
  • Send robotic agent to edge between low and high
    precipitation regions.
  • Get detailed data from node with maximum
    precipitation from Sept to Dec 2003.
  • Hmm, I wonder why packet loss is so high. Get a
    connectivity map of the network for all transmit
    power settings.
3
Goals
  • Flexible spatio-temporal querying
  • Provide ability to mine for interesting patterns
    and features in data.
  • Drill-down on details
  • Distributed Long-term networked data storage
  • Preserve ability for long-term data mining, while
    catering to node storage constraints
  • Performance
  • Reasonable Accuracy for wide range of queries
  • Low communication (energy) overhead

4
How can we achieve goals?
  • Exploit redundancy in data: potentially huge
    gains from lossy compression exploiting
    spatio-temporal correlation.
  • Exploit rarity of interesting features: preserve
    only interesting features.
  • Exploit scale of sensor network: large
    distributed storage, although limited local
    storage.
  • Exploit low cost of approximate query
    processing: allow approximate query processing
    that obtains sufficiently accurate responses.

5
Can existing systems satisfy design goals?
6
DIMENSIONS Design: Key Ideas
  • Construct hierarchy of lossy compressed summaries
    of data using wavelet compression.
  • Queries drill-down from root of hierarchy to
    focus search on small portions of the network.
  • Progressively age lossy data along
    spatio-temporal hierarchy to enable long-term
    storage

[Figure: hierarchy with Level 0, Level 1 and Level 2; labels: PROGRESSIVELY LOSSY, PROGRESSIVELY AGE]
7
Roadmap
  • Why wavelets?
  • Example: Precipitation Hierarchy
  • Spatial and Temporal Processing internals
  • Initial Results: Precipitation Dataset

8
Enabling Technique: Wavelets
  • Very popular signal processing approach that
    provides good time and frequency localization.
  • Used in JPEG2000 and geo-spatial data mining.
  • Preserves spatio-temporal features (edges,
    discontinuities) while providing a good
    approximation of long-term trends in data.
  • Efficient distributed implementation possible.
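
To make the localization claim concrete, here is a minimal sketch of a 1-D Haar decomposition in plain Python. It is not the DIMENSIONS implementation; the function names and the sample signal are made up for illustration, and the unnormalized average/difference form of the Haar transform is assumed.

```python
def haar_step(signal):
    """One Haar level: pairwise averages (trend) and pairwise differences (detail)."""
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details


def haar_decompose(signal):
    """Full decomposition. Assumes len(signal) is a power of two.

    Returns the overall mean and the detail bands, finest band first.
    """
    approx = list(signal)
    details_by_level = []
    while len(approx) > 1:
        approx, details = haar_step(approx)
        details_by_level.append(details)
    return approx[0], details_by_level


# Hypothetical readings with a sharp edge in the middle.
readings = [2.0, 2.0, 2.0, 2.0, 10.0, 10.0, 10.0, 10.0]
mean, bands = haar_decompose(readings)
print(mean)   # long-term trend
print(bands)  # only the coarsest band is non-zero: it carries the edge
```

For this signal the eight samples reduce to one mean and one non-zero detail coefficient, which is why an edge survives aggressive thresholding of the remaining coefficients.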

9
Sample Architecture: Precipitation Hierarchy
What is the maximum precipitation between
Sept-Dec 2002?
  • Local Processing: Construct lossy time-series
    summary (zero communication cost)
  • Spatial Data Processing: Hierarchical Lossy
    Compression
  • Organize network into hierarchy. At each higher
    level, reduce number of participating nodes by a
    factor of 4.
  • At each step of the hierarchy, summarize data
    from 4 quadrants, and propagate the summary to
    the next level.

[Figure labels: direct query to quadrant that best matches query; decreasing spatial resolution; decreasing temporal resolution; wavelet coefficients]
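
The drill-down on this slide can be sketched as a quadtree walk. This is a simplified illustration, not the system's code: each parent cell stores the exact maximum of its four quadrants as a stand-in for a lossy wavelet summary, the grid side is assumed to be a power of two, and all names are hypothetical.

```python
def build_pyramid(grid):
    """levels[0] is the raw grid; each higher level has a quarter as many cells."""
    levels = [grid]
    while len(levels[-1]) > 1:
        below = levels[-1]
        n = len(below) // 2
        above = [
            [
                max(below[2 * r][2 * c], below[2 * r][2 * c + 1],
                    below[2 * r + 1][2 * c], below[2 * r + 1][2 * c + 1])
                for c in range(n)
            ]
            for r in range(n)
        ]
        levels.append(above)
    return levels


def drill_down_max(levels):
    """Start at the root and repeatedly descend into the best-matching quadrant.

    Returns (row, col, value, summaries_inspected).
    """
    r = c = 0
    inspected = 1  # the root summary
    for level in range(len(levels) - 2, -1, -1):
        below = levels[level]
        children = [(2 * r + dr, 2 * c + dc) for dr in (0, 1) for dc in (0, 1)]
        inspected += len(children)
        r, c = max(children, key=lambda rc: below[rc[0]][rc[1]])
    return r, c, levels[0][r][c], inspected


# Hypothetical 4x4 precipitation grid.
grid = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 9, 5],
    [2, 2, 6, 7],
]
print(drill_down_max(build_pyramid(grid)))
```

With exact maxima the walk always finds the true answer. With lossy summaries the chosen quadrant can be wrong, which is where approximate answers come from.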

10
Spatial Decomposition
  • Recursively split network into non-overlapping
    square grids.
  • At each level of the hierarchy:
  • Elect a cluster-head.
  • The cluster-head combines and summarizes data
    from 4 quadrants.
  • The cluster-head propagates compressed data to
    the next level of the hierarchy.
  • Routing protocol: GPSR variant (DCS, Ratnasamy
    et al.)

[Figure: hierarchy construction]
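
A rough sketch of the recursive grid split and cluster-head election follows. The slide does not say how cluster-heads are elected, so the rule used here (the node nearest the cell centre, ties broken by lowest id) is an assumption, as are the function names and the unit cell size.

```python
import math
from collections import defaultdict


def cluster_key(x, y, level, base_cell=1.0):
    """Grid cell containing (x, y) at a level; the cell side doubles per level."""
    side = base_cell * (2 ** level)
    return (int(x // side), int(y // side))


def elect_clusterheads(nodes, level, base_cell=1.0):
    """nodes maps node id -> (x, y). Returns cell -> elected node id.

    Election rule (assumed): node closest to the cell centre, lowest id on ties.
    """
    side = base_cell * (2 ** level)
    members = defaultdict(list)
    for node_id, (x, y) in nodes.items():
        members[cluster_key(x, y, level, base_cell)].append(node_id)
    heads = {}
    for (cx, cy), ids in members.items():
        centre = ((cx + 0.5) * side, (cy + 0.5) * side)
        heads[(cx, cy)] = min(ids, key=lambda i: (math.dist(nodes[i], centre), i))
    return heads


# Hypothetical 4x4 deployment, one node at the centre of each unit cell.
nodes = {4 * j + i: (i + 0.5, j + 0.5) for j in range(4) for i in range(4)}
for level in range(3):
    print(level, len(elect_clusterheads(nodes, level)))  # 16, 4, 1 cluster-heads
```

Each level has a quarter as many cluster-heads as the one below it, matching the factor-of-4 reduction described for the hierarchy.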
11
Wavelet Compression Internals
[Figure: compression pipeline]
  • Input Data (x, y, time)
  • Filter: Haar filter or Daubechies 9/7 filter
  • Wavelet Subband Decomposition
  • Thresholding, Quantization, Drop Subbands
  • Lossless Encoder
  • Compressed Output
Cost Metric
  • Communication Budget
  • Error bound
12
Initial Results with Precipitation Dataset
Communication Overhead
  • 15x12 grid (50 km edge) of precipitation data from
    1949-1994, from Pacific Northwest. Gridded
    before processing.
  • Handpicked choice of threshold, quantization
    intervals, subbands to drop. Huffman Encoder at
    output.
  • Very large compression ratio up the hierarchy

M. Widmann and C. Bretherton. 50 km resolution
daily precipitation for the Pacific Northwest,
1949-94.
13
Find maximum annual precipitation for each year.
  • Exact answer for 89% of queries. Within 90% of
    answer for >95% of queries.
  • Queries require less than 3% of network.
  • Good performance on average with very low lookup
    overhead.

14
Locate boundary in annual precipitation between
Low and High Precipitation Areas
  • Error Metric: Number of nodes greater than 1
    pixel distance from drill-down boundary
  • Accuracy: Within 25% error for 93% of the queries
    (or within 13% error for 75% of the queries)
  • Less than 5% of the network queried.
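
One possible reading of the error metric above is sketched below. The slide does not define "pixel distance", so Chebyshev distance on grid coordinates is an assumption, as are the function name and the sample boundaries; the estimated boundary is assumed to be non-empty.

```python
def boundary_error(true_boundary, estimated_boundary, tolerance=1):
    """Count true-boundary cells lying more than `tolerance` cells
    (Chebyshev distance) from every cell of the estimated boundary."""
    def chebyshev(a, b):
        return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

    return sum(
        1
        for t in true_boundary
        if min(chebyshev(t, e) for e in estimated_boundary) > tolerance
    )


# Hypothetical (row, col) boundary cells.
true_boundary = [(0, 3), (1, 3), (2, 4), (3, 4)]
drill_down_boundary = [(0, 3), (1, 4), (2, 6), (3, 7)]
misses = boundary_error(true_boundary, drill_down_boundary)
print(misses, misses / len(true_boundary))
```

Dividing the count by the number of true-boundary cells gives a per-query error fraction that can be compared across queries.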

15
Open Issues
  • Load Balancing and Robustness
  • Hierarchical Model vs. Peer Model: lot of work in
    p2p systems
  • Irregular Node Placement
  • Use wavelet extensions for irregular node
    placement. Computationally more expensive
  • Gridify dataset with interpolation
  • Providing Query Guarantees
  • Can we bound error in response obtained for a
    drill-down query at a particular level of
    hierarchy?
  • Implementation on iPAQ/mote network
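
As an illustration of the "gridify with interpolation" option, the sketch below resamples irregularly placed readings onto a regular grid. The slide does not name an interpolation method; inverse-distance weighting is swapped in here as one simple choice, and the function name, coordinates and values are hypothetical.

```python
import math


def idw_gridify(readings, width, height, power=2.0):
    """readings: list of ((x, y), value) from irregularly placed nodes.

    Returns a height x width grid whose cell centres sit at (col + 0.5, row + 0.5),
    each filled by inverse-distance-weighted interpolation.
    """
    grid = []
    for r in range(height):
        row = []
        for c in range(width):
            centre = (c + 0.5, r + 0.5)
            num = den = 0.0
            exact = None
            for pos, value in readings:
                d = math.dist(pos, centre)
                if d == 0.0:
                    exact = value  # a node sits exactly on the cell centre
                    break
                w = 1.0 / d ** power
                num += w * value
                den += w
            row.append(exact if exact is not None else num / den)
        grid.append(row)
    return grid


# Two hypothetical nodes; the middle cell has no node of its own.
readings = [((0.5, 0.5), 10.0), ((2.5, 0.5), 20.0)]
print(idw_gridify(readings, width=3, height=1))
```

The resulting regular grid can be fed to an ordinary wavelet transform, at the price of interpolation error in cells without a nearby node.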

16
Summary
  • DIMENSIONS provides a holistic data handling
    architecture for sensor networks that can:
  • Support a wide range of sensor-network usage and
    query models (using drill-down querying of
    wavelet summaries)
  • Provide a gracefully degrading lossy storage
    model (by progressively ageing summaries)
  • Offer ability to tune energy expended for query
    performance. (tunable lossy compression)

17
Different optimization metrics
18
Other Examples: Packet Loss
  • Different example of dataset that exhibits
    spatial correlation
  • Throughput from one transmitter to proximate
    receivers is correlated
  • Throughput from multiple proximate transmitters
    to one receiver is correlated.
  • Typically, what we want to query is the
    deviations from normal and average throughput.

19
Packet-Loss Dataset: Get Throughput vs. Distance
Map
  • Involves expensive transfer of 12x14 map from
    each node.
  • Good approximate results can be obtained from
    querying compressed data.

20
Long-term Storage: Concepts
  • Data is progressively aged, both locally, and
    along the hierarchy.
  • Summaries that cover larger areas and longer
    time-periods are retained for much longer than
    raw time-series.
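
A toy version of progressive aging is sketched below. The retention rule (lifetime grows geometrically with hierarchy level) and the constants are assumptions chosen for illustration; the slides do not give a concrete aging schedule.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    level: int        # 0 = raw local time-series, higher = coarser summary
    created_at: int   # arbitrary time units, e.g. days
    size_bytes: int


def retention(level, base=7, growth=4):
    """Assumed policy: each level is kept `growth` times longer than the one below."""
    return base * growth ** level


def age_store(store, now, base=7, growth=4):
    """Drop every summary that has outlived the retention period of its level."""
    return [s for s in store if now - s.created_at <= retention(s.level, base, growth)]


# Hypothetical store: one item per level, all created at time 0.
store = [Summary(0, 0, 4096), Summary(1, 0, 512), Summary(2, 0, 64)]
print([s.level for s in age_store(store, now=10)])  # raw data already gone
print([s.level for s in age_store(store, now=30)])  # only the coarsest summary left
```

Because coarser summaries are also much smaller, keeping them longer costs little storage while preserving the ability to answer long-term queries approximately.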

21
Load Balancing and Robustness: Concepts
  • Hierarchical Model
  • Naturally fits wavelet processing
  • Strict hierarchies are vulnerable to node
    failures. Failures near root of hierarchy can be
    expensive to repair
  • Decentralized Peer Model
  • Summaries communicated to multiple nodes
    probabilistically.
  • Better robustness, but incurs greater
    communication overhead.
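
The robustness versus overhead trade-off can be put into rough numbers with a back-of-the-envelope model. The model is an assumption, not something taken from the slides: each node holding a copy fails independently with the same probability, and communication cost is simply proportional to the number of copies sent.

```python
def survival_probability(replicas, node_failure_prob):
    """Probability that at least one copy of a summary survives,
    assuming independent node failures."""
    return 1.0 - node_failure_prob ** replicas


def tradeoff(summary_bytes, replicas, node_failure_prob):
    """Bytes sent and survival probability for a given replication degree."""
    return {
        "bytes_sent": summary_bytes * replicas,
        "survival": survival_probability(replicas, node_failure_prob),
    }


# Hypothetical 100-byte summary, 10% node failure probability.
for replicas in (1, 2, 3):
    print(replicas, tradeoff(100, replicas, 0.1))
```

One copy corresponds to a strict hierarchy with a single cluster-head per summary; sending to several peers raises survival quickly but multiplies the bytes transmitted.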