Data-centric view of sensornets: An Overview

Provided by: pras72
Learn more at: https://ics.uci.edu

1
Data-centric view of sensornets: An Overview
  • Puru Kulkarni
  • Vijay Sundaram
  • Bhuvan Urgaonkar

2
Motivation
  • Ubiquitous presence of sensor networks
  • Communication, computation, limited storage,
    sensing capabilities
  • Used to sense, actuate, control
  • Sensors everywhere, data everywhere!
  • Require an infrastructure for data access and
    storage

3
Overview
  • Sensors sense/generate data
  • Users/Applications interested in data or some
    measure of data
  • Common user operations are
  • Queries and Monitoring
  • Actuate and Control

4
Typical Queries
  • Historical
  • What is the average rainfall over past 2 days?
  • Current
  • What is the current temperature in Rm 226?
  • Long Running
  • Temperature in Rm 226 over the next 4 hours
    every 30 seconds
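
The three query types above can be sketched over a small in-memory list of readings. This is only an illustration: the `readings` data, the function names, and the use of simulated time are assumptions, not part of the presentation.

```python
from statistics import mean

# Hypothetical readings: (timestamp_seconds, room, temperature)
readings = [
    (0, "Rm 226", 20.0),
    (30, "Rm 226", 21.0),
    (60, "Rm 226", 22.0),
    (60, "Rm 101", 18.0),
]

def historical_average(readings, room, start, end):
    """Historical query: average of stored values with start <= time <= end."""
    return mean(v for t, r, v in readings if r == room and start <= t <= end)

def current_value(readings, room):
    """Current query: the value with the latest timestamp for the room."""
    return max((t, v) for t, r, v in readings if r == room)[1]

def long_running(read_sensor, duration, period):
    """Long-running query: call read_sensor every `period` seconds of
    simulated time, from 0 up to and including `duration`."""
    for t in range(0, duration + 1, period):
        yield t, read_sensor(t)

avg_226 = historical_average(readings, "Rm 226", 0, 60)
now_226 = current_value(readings, "Rm 226")
samples = list(long_running(lambda t: t / 10, 60, 30))
```

A historical query needs stored data, a current query needs only the latest sample, and a long-running query keeps producing results over time, which is why the later slides treat streams separately.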

5
Issues
  • How to identify relevant sensors?
  • Computation vs. Communication tradeoff
  • Where to process query?
  • inside the sensor network (route query)
  • Need new techniques
  • at a centralized location (route data)
  • Large amounts of data transfer (not efficient)
  • Data gathering may not reflect query rate
  • How to process query?
  • queries on streaming data

6
DataSpace: Querying and Monitoring Deeply
Networked Collections in Physical Space
T. Imielinski and S. Goel, Rutgers University
  • Billions of objects populate space
  • Each produces and locally stores data
  • Location aware
  • Can be selectively monitored, queried and
    controlled
  • Physical world enhanced with data

7
Characteristics
  • Dataspace
  • Data lives on the object
  • Users access not only local information but can
    navigate entire dataspace
  • Spatial world divided in 3-D datacubes
  • e.g. CS Bldg., street, block, etc.
  • Communication, messaging and computation
    techniques for querying and monitoring required

8
Querying and Monitoring
  • Queries are spatially driven
  • Steps
  • Identify relevant datacubes
  • Identify relevant nodes (dataflocks)
  • Datacube directory service
  • Aggregation for queries on several datacubes
  • e.g. Information about Manhattan taxi cabs

9
Architecting DataSpace
  • Network as DataSpace engine
  • multicast mechanisms
  • (each node has an IP address!)
  • group membership based on
  • physical location
  • attribute (temperature, vehicles etc)
  • multicast fits selective node addressing criteria
    to access relevant data
  • e.g. what is average temperature in CS Bldg?
  • Query reaches only sensors that are in the CS Bldg
    datacube and have the corresponding group address
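
A minimal sketch of the selective addressing described above, assuming group membership is just a (datacube, attribute) pair. The `Node` class, the `deliver` function, and the sample nodes are hypothetical; a real deployment would rely on network-level multicast rather than a list filter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    datacube: str          # physical-location group, e.g. "CS Bldg"
    attributes: frozenset  # attribute groups the node has joined
    reading: float

def deliver(nodes, datacube, attribute):
    """Return the nodes reached by a query addressed to (datacube, attribute)."""
    return [n for n in nodes
            if n.datacube == datacube and attribute in n.attributes]

nodes = [
    Node("s1", "CS Bldg", frozenset({"temperature"}), 21.0),
    Node("s2", "CS Bldg", frozenset({"temperature"}), 23.0),
    Node("s3", "CS Bldg", frozenset({"vehicles"}), 4.0),
    Node("s4", "Library", frozenset({"temperature"}), 19.0),
]

# "What is the average temperature in CS Bldg?"
members = deliver(nodes, "CS Bldg", "temperature")
average = sum(n.reading for n in members) / len(members)
```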

10
Network as DataSpace engine
  • Space Handle encodes datacube information
  • Subject Handle encodes the attributes that are
    part of a multicast group
  • Dataspace address is an IPv6 multicast address

E.g. Space handle: 224.4.5, Subject handle: 8,
Dataspace address: 224.4.5.8

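
The example can be reproduced with a short sketch. The concatenation rule in `dataspace_address` is an assumption inferred from the single example on the slide, and the function name is hypothetical. Note that the slide's example is written in IPv4-style dotted notation; the standard-library `ipaddress` module confirms that it falls in the multicast range.

```python
import ipaddress

def dataspace_address(space_handle: str, subject_handle: int) -> str:
    """Append the subject handle to the space handle (assumed rule)."""
    return f"{space_handle}.{subject_handle}"

addr = dataspace_address("224.4.5", 8)

# 224.0.0.0/4 is the IPv4 multicast block, so this evaluates to True.
is_multicast = ipaddress.ip_address(addr).is_multicast
```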
11
Geographic Routing infrastructure
  • Route message based on physical location rather
    than IP address
  • Use GPS coordinates for locations
  • Avoids use of multicast for routing queries to
    datacubes
  • Once query reaches a region, use multicast

12
Geographic Routing infrastructure
  • Geo-router (routes based on datacube location)
  • Geo-node (issue query to nodes in datacube)
  • Geo-host (process geographic messages)
  • Approach
  • Route query to datacube
  • Geo-nodes route query within datacube
  • multicast with a TTL of 1
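
Routing by location rather than by address can be illustrated with greedy forwarding: each hop hands the query to the neighbor closest to the target location. This is a generic technique chosen for illustration, not necessarily the mechanism used by the DataSpace geo-routers; the planar coordinates, node names, and link table are all hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_route(positions, links, source, target):
    """Forward hop by hop to the neighbor closest to `target`.
    Stops when no neighbor is strictly closer than the current node."""
    path = [source]
    current = source
    while True:
        here = distance(positions[current], target)
        best = min(links[current],
                   key=lambda n: distance(positions[n], target),
                   default=None)
        if best is None or distance(positions[best], target) >= here:
            return path
        path.append(best)
        current = best

positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "d": (0.0, 1.0)}
links = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b"], "d": ["a"]}

route = greedy_route(positions, links, "a", (2.0, 0.0))
```

Once the query arrives at the node nearest the datacube, the last step on the slide (multicast with a TTL of 1) would take over for delivery inside the datacube.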

13
  • The Sensor Network as a Database
  • Govindan, Hellerstein, Hong, Madden, Franklin,
    Shenker
  • Querying the Physical World
  • Bonnet, Gehrke, Seshadri

14
Sensornet Database architecture
  • Given a routing and access mechanism, how to
    process queries?
  • Provide a DB-view to users/apps
  • well understood programming interface
  • common data operations use computation in network
  • help energy-efficiency
  • allow users to be unaware of actual network, but
    treat it as a database
  • Sensor Network Data -> Sensor Network Database

15
What is required?
  • Core DB operations tailored for sensor networks
  • Design appropriate building blocks for DB
    operations
  • Join, aggregation, grouping, selection etc

16
Sensornet Database Architecture
  • Two important ideas
  • in-network implementations of primitive database
    query operators such as grouping, aggregation,
    and joins
  • group communication and routing protocols, with
    possible processing at intermediate nodes,
    implement the operators in an
    application-independent way

17
Sensornet Database Architecture
  • Relax the semantics of database queries to allow
    approximate results
  • relaxation enables energy-efficient
    implementations even given the expected high
    level of network dynamics
  • A sensor network is a proxy for a continuous
    real-world phenomenon, and by nature samples that
    phenomenon discretely at some rate, with some
    degree of error.

18
In-network Implementation
  • JOIN operator
  • selection over cross-product of a pair of tables
  • Tuples generated at different nodes might be
    joined at a single node
  • Some JOIN implementations are blocking
  • Blocking is infeasible in sensor networks
  • tables can contain unbounded streams of data
  • amount of memory available is limited
  • Need to retool these operations
  • Pipelining
  • Partitioning

19
Non-Blocking Pipelined Joins
  • Symmetric hash-join
  • Maintains two hash tables (keyed by the column(s)
    used for the join)
  • On an input tuple, looks up matching tuples from
    the other input's hash table
  • Outputs any matching results
  • Ripple joins
  • Statistically sample the two tables to be joined,
    in order to produce a stream of joined tuples
  • Relative rates at which the two tables are
    sampled adapt to match the variance produced by
    the data in each
  • low energy approach to obtain approximate answers
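
A minimal sketch of a symmetric hash join as described above. The class and method names are hypothetical. Each arriving tuple is stored in its own side's hash table and probed against the other side's table, so results are emitted as soon as both halves of a match have arrived and neither input blocks the other.

```python
from collections import defaultdict

class SymmetricHashJoin:
    def __init__(self):
        # One hash table per input, keyed by the join column value.
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, key, row):
        """Store `row` on its own side, then return the joined pairs formed
        with matching rows already seen on the other side."""
        other = "right" if side == "left" else "left"
        self.tables[side][key].append(row)
        matches = self.tables[other].get(key, [])
        if side == "left":
            return [(row, m) for m in matches]
        return [(m, row) for m in matches]

join = SymmetricHashJoin()
out1 = join.insert("left", 1, "L1")    # nothing on the right yet
out2 = join.insert("right", 1, "R1")   # matches L1
out3 = join.insert("left", 1, "L2")    # matches R1
out4 = join.insert("right", 2, "R2")   # no left row with key 2
```

This sketch keeps every tuple forever. With unbounded streams and limited memory, a sensornet version would need some bound on the tables, such as a time window; that policy is not shown here.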

20
Partitioning
  • Partitioning
  • tuples are partitioned based on their join-column
    values and redistributed on the fly across
    multiple nodes
  • the work of joining the individual partitions is
    done in parallel by each of the nodes
  • Partitions can be defined by value,
    geographically, or by sensor type, and a node (or
    nodes) can be designated to perform the join for
    the partition
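
Partitioning by value can be sketched as follows, assuming integer join keys and a simple modulo rule for assigning keys to nodes. The function names and sample tables are hypothetical, and the per-node joins run sequentially here rather than in parallel.

```python
from collections import defaultdict

def partition(rows, num_nodes):
    """Assign each row to a node based on its join-column value (row[0])."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[0] % num_nodes].append(row)
    return parts

def local_join(r_part, s_part):
    """Join performed by a single node on its own partition."""
    return [(r, s) for r in r_part for s in s_part if r[0] == s[0]]

def partitioned_join(r_rows, s_rows, num_nodes):
    r_parts = partition(r_rows, num_nodes)
    s_parts = partition(s_rows, num_nodes)
    result = []
    for node in range(num_nodes):
        # Rows with equal keys always land on the same node,
        # so each node can join its partition independently.
        result.extend(local_join(r_parts.get(node, []), s_parts.get(node, [])))
    return result

R = [(1, "r1"), (2, "r2"), (3, "r3")]
S = [(1, "s1"), (3, "s3"), (4, "s4")]
joined = partitioned_join(R, S, num_nodes=2)
```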

21
In-network Implementation
  • Aggregation operators
  • summarization of a column(s) into a single
    numerical value, e.g. SUM, COUNT, AVERAGE, MIN,
    MAX, etc.
  • query is flooded in the network and the responses
    are routed on the reverse-path trees
  • results aggregated across several nodes
  • E.g. to calculate AVERAGE, each node returns (SUM,
    COUNT) values to its parent
  • Can be a very common operator
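
The AVERAGE example can be sketched as a recursive walk over the reverse-path tree, where every node combines its own reading with the (SUM, COUNT) pairs of its children. The tree, the readings, and the function name are hypothetical.

```python
def partial_aggregate(node, children, readings):
    """Return (SUM, COUNT) for the subtree rooted at `node`."""
    total, count = readings[node], 1
    for child in children.get(node, []):
        child_sum, child_count = partial_aggregate(child, children, readings)
        total += child_sum
        count += child_count
    return total, count

# Hypothetical routing tree: root has children a and b; a has child c.
children = {"root": ["a", "b"], "a": ["c"]}
readings = {"root": 10.0, "a": 20.0, "b": 30.0, "c": 40.0}

total, count = partial_aggregate("root", children, readings)
average = total / count
```

Returning (SUM, COUNT) rather than a local average matters: averaging the averages of subtrees with different sizes would give a wrong result, while sums and counts combine exactly.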

22
Distributed Sensnet DBs
  • How to represent devices in DBs on sensornets?
  • ADTs (Abstract Data Types)
  • Methods correspond to sensing functionality
  • Virtual Relations (VRs) store local data
  • Network used for query operations

23
Virtual Relation
  • VR with attributes as
  • Inputs to an ADT (device) function
  • Arguments to an ADT function
  • Output of the function
  • Timestamp of the function

24
Virtual Relation
  • Some VR properties
  • records are never updated or deleted
  • is naturally partitioned over the sensnet (each
    device takes care of its set of VR records)
  • What does this mean? A distributed DB
  • Records from the VRs (distributed over the
    devices) are processed using distributed query
    execution plans
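
One way to picture a virtual relation is as an append-only log of method invocations, with each device holding its own fragment. The sketch below is an assumption-laden illustration: the record layout, class names, and the stand-in sensing method are hypothetical and are not taken from the cited papers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VRRecord:
    device_id: str
    args: tuple      # arguments passed to the ADT function
    output: float    # value returned by the function
    timestamp: int   # when the function was invoked

class VirtualRelationFragment:
    """The part of a virtual relation stored on one device (append-only)."""

    def __init__(self, device_id, method):
        self.device_id = device_id
        self.method = method
        self._records = []

    def invoke(self, timestamp, *args):
        """Call the sensing method and append the resulting record."""
        record = VRRecord(self.device_id, args, self.method(*args), timestamp)
        self._records.append(record)
        return record

    def scan(self):
        """Read-only view; records are never updated or deleted."""
        return tuple(self._records)

# Stand-in for a sensing function: scales a fixed raw value.
fragment = VirtualRelationFragment("d1", lambda scale: 20.0 * scale)
fragment.invoke(5, 2.0)
records = fragment.scan()
```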

25
Approximate Results
  • Energy-efficiency can be achieved using
    approximate aggregates
  • Uniform sampling
  • Tuples are uniformly sampled and the resulting
    average is assumed to represent the actual
    average
  • Packet loss might invalidate the statistical
    assumptions that the confidence intervals depend
    on.
  • Logarithmic sampling
  • The number of respondents (or the size of memory
    needed for the count) scales logarithmically with
    the size of the network
  • Provides looser error bounds but uses
    significantly less memory or communication.
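
Uniform sampling for an approximate AVERAGE can be sketched as below. The function name, the data, and the sample size are hypothetical; the sketch ignores packet loss and does not compute error bounds.

```python
import random

def sampled_average(values, sample_size, rng):
    """Estimate the average from a uniform sample drawn without replacement."""
    sample = rng.sample(values, sample_size)
    return sum(sample) / len(sample)

values = list(range(1, 101))   # hypothetical readings from 100 nodes
rng = random.Random(0)         # seeded only to make the sketch repeatable

estimate = sampled_average(values, 10, rng)   # 10 responses instead of 100
exact = sampled_average(values, 100, rng)     # sampling everything is exact
```

Only `sample_size` nodes need to respond, which is where the energy saving comes from; the price is an estimate that varies with the sample.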

26
Complex query evaluation
  • R x S x T
  • What order to follow?
  • (R x S) x T or R x (S x T) or (R x T) x S
  • Decided by query optimizer
  • Usually depends on table size
  • With Sensornet DB
  • Need adaptive policy to route tuples based on
  • Energy consumption
  • Topology
  • Loss rates
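
The classical, size-based choice mentioned above can be sketched as picking the pair with the smallest estimated intermediate result. The table sizes, selectivities, and the cost formula are hypothetical simplifications. A sensornet optimizer would additionally weigh energy consumption, topology, and loss rates, which this sketch does not model.

```python
from itertools import combinations

def best_first_join(sizes, selectivity):
    """Pick the pair (A, B) minimizing the estimate |A| * |B| * selectivity."""
    def estimated_rows(pair):
        a, b = pair
        return sizes[a] * sizes[b] * selectivity[frozenset(pair)]
    return min(combinations(sorted(sizes), 2), key=estimated_rows)

sizes = {"R": 1000, "S": 10, "T": 100}
selectivity = {
    frozenset({"R", "S"}): 0.01,   # estimate: about 100 rows
    frozenset({"R", "T"}): 0.01,   # estimate: about 1000 rows
    frozenset({"S", "T"}): 0.01,   # estimate: about 10 rows
}

first = best_first_join(sizes, selectivity)
```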

27
Conclusions
  • Explosion of data from sensor networks needs an
    infrastructure for access, storage etc
  • Organizing sensors
  • Datacubes
  • Other techniques?
  • Identifying relevant sensors is preliminary to
    fetch data
  • Dataspace provided two solutions
  • Other approaches?

28
Conclusions
  • Sensornets as Distributed DB
  • Provide a database view to sensornet data
  • Pros
  • App development easy
  • In-network processing helps resource usage
  • Cons
  • Distributed DB can be difficult
  • Requires retooling DB operations for sensornets
  • Other approaches?

29
Representations for Device Functions
  • Internal Representation
  • We can't use traditional OO DB methods
  • They all demand immediate access
  • With the asynchronous nature of sensnets, this is
    unacceptable

30
Overview
  • Direction of sensor networks progress
  • Small form-factor devices
  • On-board computation
  • Wireless communication
  • Increased sensing capabilities
  • Improved OS and networking functionalities
  • Prediction
  • Every device (> 1) will have some sensor
  • Ubiquitous presence of sensor networks

31
Overview
  • Typical sensor networks usage
  • Sense, collect and convey data
  • Provides a ubiquitous computing platform
  • Applications query/monitor sensed data
  • Ecosystem dynamics
  • Temperature/weather sensing
  • Automobile traffic analysis
  • Data-centric network: generated data is more
    important than node identity

32
Requirements
  • Addressing
  • Identify relevant sensors
  • How to access/process data?
  • Communicate data and process centrally
  • Compute query at node and perform DB operations
  • Interface for querying/monitoring and control

33
What to do with data?
  • Answer queries/give useful info
  • How ??
  • Centralized approach
  • Communicate data
  • Store and process all data at central location
    (traditional DB approach)
  • Is all temporal data to be stored?
  • Communication overhead?

34
What to do with data?
  • De-centralized approach
  • Communicate query (query routing)
  • Required data attribute of node
  • Node stores and communicates data to queries
  • Processing at node
  • Computation overhead
  • Computation overhead smaller than communication!
  • How to aggregate data?
  • How to route queries?
  • How to map nodes to addresses for communication
    purposes?

35
Need for Decentralization
  • Centralized (Traditional databases)
  • Inefficient use of resources
  • Large amounts of data communicated to central
    location
  • All sensors send data all the time
  • Dissociates access to device from query load
  • Communication more expensive than computation
  • Decentralized (Distributed DBs)
  • Data on devices
  • In-network query processing
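
The communication argument can be made concrete by counting link-level transmissions for one round of an aggregate query over a routing tree. The topology and the cost model (one transmission per value per hop) are hypothetical simplifications.

```python
def depth(node, parent):
    """Number of hops from `node` to the root."""
    hops = 0
    while parent[node] is not None:
        node = parent[node]
        hops += 1
    return hops

def centralized_transmissions(parent):
    """Every raw reading is forwarded hop by hop to the root."""
    return sum(depth(node, parent) for node in parent)

def in_network_transmissions(parent):
    """Each non-root node sends one partial aggregate to its parent."""
    return sum(1 for node in parent if parent[node] is not None)

# Hypothetical chain: c -> b -> a -> root
parent = {"root": None, "a": "root", "b": "a", "c": "b"}

central = centralized_transmissions(parent)
in_network = in_network_transmissions(parent)
```

In this chain the centralized approach needs 6 transmissions against 3 for in-network aggregation, and the gap widens as the tree gets deeper.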

36
Pipelining Benefits
  • Provide streamed partial answers, hence, can
    enable query refinement
  • Schemes like ripple joins form a low energy
    approach to obtain approximate answers and can be
    used together with sampling
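
Streamed partial answers can be illustrated with a running average that yields a refined estimate after every input value. This is a minimal sketch of the pipelining idea only, not of ripple joins; the function name and input values are hypothetical.

```python
def running_average(stream):
    """Yield the average of all values seen so far, one estimate per input."""
    total = 0.0
    for count, value in enumerate(stream, start=1):
        total += value
        yield total / count

partial_answers = list(running_average([10.0, 20.0, 30.0]))
```

A user watching the partial answers can stop or refine the query as soon as the estimate is good enough, instead of waiting for the full result.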