Implementation and Research Issues in Query Processing for Wireless Sensor Networks

1
Implementation and Research Issues in Query
Processing for Wireless Sensor Networks
  • Wei Hong
  • Intel Research, Berkeley
  • whong@intel-research.net

Sam Madden, MIT, madden@csail.mit.edu
MDM Tutorial, January 19th 2004
2
Motivation
  • Sensor networks (aka sensor webs, emnets) are
    here
  • Several widely deployed HW/SW platforms
  • Low power radio, small processor, RAM/Flash
  • Variety of (novel) applications: scientific,
    industrial, commercial
  • Great platform for mobile ubicomp
    experimentation
  • Real, hard research problems to be solved
  • Networking, systems, languages, databases
  • We will summarize
  • The state of the art
  • Our experiences building TinyDB
  • Current and future research directions

Berkeley Mote
3
Sensor Network Apps
Habitat Monitoring: storm petrels on Great Duck
Island, microclimates on James Reserve.
Just the tip of the iceberg -- more tomorrow!
4
Declarative Queries
  • Programming Apps is Hard
  • Limited power budget
  • Lossy, low bandwidth communication
  • Require long-lived, zero admin deployments
  • Distributed Algorithms
  • Limited tools, debugging interfaces
  • Queries abstract away much of the complexity
  • Burden on the database developers
  • Users get
  • Safe, optimizable programs
  • Freedom to think about apps instead of details

5
TinyDB: Prototype Declarative Query Processor
  • Platform: Berkeley Motes + TinyOS
  • Continuous variant of SQL: TinySQL
  • Power- and data-acquisition-based in-network
    optimization framework
  • Extensible interface for aggregates, new types of
    sensors

6
Agenda
  • Part 1: Sensor Networks (50 Minutes)
  • TinyOS
  • NesC
  • Short Break
  • Part 2: TinyDB (1 Hour)
  • Data Model and Query Language
  • Software Architecture
  • Long Break / Hands-On
  • Part 3: Sensor Network Database Research
    Directions (1 Hour, 10 Minutes)

7
Part 1
  • Sensornet Background
  • Motes: Mote Hardware
  • TinyOS
  • Programming Model: NesC
  • TinyOS Architecture
  • Major Software Subsystems
  • Networking Services

8
A Brief History of Sensornets
  • People have used sensors for a long time
  • Recent CS History
  • (1998) Pottie & Kaiser: radio-based networks of
    sensors
  • (1998) Pister et al.: Smart Dust
  • Initial focus on optical communication
  • By 1999, radio-based networks, COTS Dust, Motes
  • (1999) Estrin & Govindan
  • Ad-hoc networks of sensors
  • (2000) Culler/Hill et al.: TinyOS + Motes
  • (2002) Hill / Dust: SPEC, mm^3-scale computing
  • UCLA / USC / Berkeley Continue to Lead Research
  • Many other players now
  • TinyOS/Motes as most common platform
  • Emerging commercial space
  • Crossbow, Ember, Dust, Sensicast, Moteiv, Intel

9
Why Now?
  • Commoditization of radio hardware
  • Cellular and cordless phones, wireless
    communication
  • (some radio pictures, etc.)
  • Low cost -> many/tiny -> new applications!
  • Real application for ad-hoc network research from
    the late 90s
  • Coming together of the EE and CS communities

10
Motes
4 MHz, 8-bit Atmel RISC uProc; 40 kbit radio; 4 KB
RAM; 128 KB program flash; 512 KB data flash; AA
battery pack; runs TinyOS
Mica Mote
Mica2Dot
11
History of Motes
  • Initial research goal wasn't hardware
  • Has since become more of a priority with emerging
    hardware needs, e.g.
  • Power consumption
  • (Ultrasonic) ranging + localization
  • MIT Cricket, NEST Project
  • Connectivity with diverse sensors
  • UCLA sensor board
  • Even so, now on the 5th generation of devices
  • Costs down to $50/node (Moteiv, Dust)
  • Greatly improved radio quality
  • Multitude of interfaces USB, Ethernet, CF, etc.
  • Variety of form factors, packages

12
Motes vs. Traditional Computing
  • Lossy, Ad-hoc Radio Communication
  • Sensing Hardware
  • Severe Power Constraints

13
Radio Communication
  • Low Bandwidth, Shared Radio Channel
  • 40 kbits/s on motes
  • Much less in practice
  • Encoding, Contention for Media Access (MAC)
  • Very lossy: 30% base loss rate
  • Argues against TCP-like end-to-end retransmission
  • And for link-layer retries
  • Generally, not well behaved

14
Types of Sensors
  • Sensors attach via daughtercard
  • Weather
  • Temperature
  • Light x 2 (high intensity PAR, low intensity,
    full spectrum)
  • Air Pressure
  • Humidity
  • Vibration
  • 2 or 3 axis accelerometers
  • Tracking
  • Microphone (for ranging and acoustic signatures)
  • Magnetometer
  • GPS

15
Power Consumption and Lifetime
  • Power typically supplied by a small battery
  • 1000-2000 mAh
  • 1 mAh = 1 milliamp of current for 1 hour
  • Typically quoted at optimum voltage and current
    drain rates
  • Power: Watts (W) = Amps (A) x Volts (V)
  • Energy: Joules (J) = W x time
  • Lifetime and power consumption vary by application
  • Processor: 5 mA active, 1 mA idle, 5 uA sleeping
  • Radio: 5 mA listen, 10 mA xmit/receive, 20 ms /
    packet
  • Sensors: 1 uA to 100s of mA, 1 us to 1 s per sample

16
Energy Usage in A Typical Data Collection Scenario
  • Each mote collects 1 sample of (light,humidity)
    data every 10 seconds, forwards it
  • Each mote can hear 10 other motes
  • Process
  • Wake up, collect samples (~1 second)
  • Listen to radio for messages to forward (~1
    second)
  • Forward data
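
A back-of-the-envelope lifetime estimate (our arithmetic, using the figures from the previous slide and assuming roughly 10 mA of combined processor + radio draw while awake): a mote awake ~2 s of every 10 s epoch draws on average

  \bar{I} \approx 0.2 \times 10\,\text{mA} + 0.8 \times 0.005\,\text{mA} \approx 2\,\text{mA}

  \text{lifetime} \approx 2000\,\text{mAh} / 2\,\text{mA} = 1000\,\text{h} \approx 6\ \text{weeks}

This is why low duty cycles and scheduled communication (later slides) dominate lifetime.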

17
Sensors: Slow, Power-Hungry, Noisy
18
Programming Sensornets TinyOS
  • Component Based Programming Model
  • Suite of software components
  • Timers, clocks, clock synchronization
  • Single and multi-hop networking
  • Power management
  • Non-volatile storage management

19
Programming Philosophy
  • Component Based
  • Wiring components together via interfaces and
    configurations
  • Split-Phased
  • Nothing blocks, ever.
  • Instead, completion events are signaled.
  • Highly Concurrent
  • Single thread of tasks, posted and scheduled
    FIFO
  • Events fired asynchronously in response to
    interrupts.

20
NesC
  • C-like programming language with component model
    support
  • Compiles into GCC-compatible C
  • 3 types of files
  • Interfaces
  • Set of function prototypes; no implementations
    or variables
  • Modules
  • Provide (implement) zero or more interfaces
  • Require zero or more interfaces
  • May define module variables, scoped to functions
    in module
  • Configurations
  • Wire (connect) modules according to
    requires/provides relationship

21
Component Example: Leds

module LedsC {
  provides interface Leds;
}
implementation {
  uint8_t ledsOn;
  enum {
    RED_BIT = 1,
    GREEN_BIT = 2,
    YELLOW_BIT = 4
  };
  ...
  async command result_t Leds.redOn() {
    dbg(DBG_LED, "LEDS: Red on.\n");
    atomic {
      TOSH_CLR_RED_LED_PIN();
      ledsOn |= RED_BIT;
    }
    return SUCCESS;
  }
  ...
}

22
Configuration Example

configuration CntToLedsAndRfm { }
implementation {
  components Main, Counter, IntToLeds, IntToRfm, TimerC;

  Main.StdControl -> Counter.StdControl;
  Main.StdControl -> IntToLeds.StdControl;
  Main.StdControl -> IntToRfm.StdControl;
  Main.StdControl -> TimerC.StdControl;

  Counter.Timer -> TimerC.Timer[unique("Timer")];
  IntToLeds <- Counter.IntOutput;
  Counter.IntOutput -> IntToRfm;
}
23
Split Phase Example

module IntToRfmM { ... }
implementation {
  bool pending;   // declarations implied by the original snippet
  TOS_Msg data;

  command result_t IntOutput.output(uint16_t value) {
    IntMsg *message = (IntMsg *)data.data;
    if (!pending) {
      pending = TRUE;
      message->val = value;
      atomic {
        message->src = TOS_LOCAL_ADDRESS;
      }
      if (call Send.send(TOS_BCAST_ADDR,
                         sizeof(IntMsg), &data))
        return SUCCESS;
      pending = FALSE;
    }
    return FAIL;
  }

  event result_t Send.sendDone(TOS_MsgPtr msg,
                               result_t success) {
    if (pending && msg == &data) {
      pending = FALSE;
      signal IntOutput.outputComplete(success);
    }
    return SUCCESS;
  }
}

24
Major Components
  • Timers: Clock, TimerC, LogicalTime
  • Networking: Send, GenericComm, AMStandard,
    lib/Route
  • Power Management: HPLPowerManagement
  • Storage Management: EEPROM, MatchBox

25
Timers
  • Clock: basic abstraction over hardware timers;
    periodic events, single frequency.
  • LogicalTime: fire an event some number of
    H:M:S:ms in the future.
  • TimerC: multiplex multiple periodic timers on
    top of LogicalTime.

26
Radio Stack
  • Interfaces
  • Send
  • Broadcast, or to a specific ID
  • split phase
  • Receive
  • asynchronous signal
  • Implementations
  • AMStandard
  • Application-specific messages
  • Id-based dispatch
  • GenericComm
  • AMStandard + serial IO
  • Lib/Route
  • Multihop

// Sending (split phase):
IntMsg *message = (IntMsg *)data.data;
message->val = value;
atomic {
  message->src = TOS_LOCAL_ADDRESS;
}
call Send.send(TOS_BCAST_ADDR, sizeof(IntMsg), &data);

// Receiving (asynchronous signal):
event TOS_MsgPtr ReceiveIntMsg.receive(TOS_MsgPtr m) {
  IntMsg *message = (IntMsg *)m->data;
  call IntOutput.output(message->val);
  return m;
}

Wiring equates the IntMsg sender with the ReceiveIntMsg handler.
27
Multihop Networking
  • Standard implementation: tree-based routing

Problems: parent selection, asymmetric links,
adaptation vs. stability
28
Geographic Routing
  • Any-to-any routing via geographic coordinates
  • See GPSR (Karp & Kung, MOBICOM 2000)
  • Requires coordinate system
  • Requires endpoint coordinates
  • Hard to route around local minima (holes)

Coordinates could be virtual, as in Rao et al.,
Geographic Routing Without Coordinate Information,
MOBICOM 2003
29
Power Management
  • HPLPowerManagement
  • TinyOS sleeps processor when possible
  • Observes the radio, sensor, and timer state
  • Application managed, for the most part
  • App. must turn off subsystems when not in use
  • Helper utility: ServiceScheduler
  • Periodically calls the start and stop methods
    of an app
  • More on power management in TinyDB later
  • Approach works because
  • single application
  • no interactivity requirements

30
Non-Volatile Storage
  • EEPROM
  • 512K off-chip, 32K on-chip
  • Writes at disk speeds, reads at RAM speeds
  • Interface: random access, read/write 256-byte
    pages
  • Maximum throughput: 10 KB / second
  • MatchBox Filing System
  • Provides a Unix-like file I/O interface
  • Single, flat directory
  • Only one file being read/written at a time

31
TinyOS Getting Started
  • The TinyOS home page
  • http://webs.cs.berkeley.edu/tinyos
  • Start with the tutorials!
  • The CVS repository
  • http://sf.net/projects/tinyos
  • The NesC Project Page
  • http://sf.net/projects/nescc
  • Crossbow motes (hardware)
  • http://www.xbow.com
  • Intel Imote
  • www.intel.com/research/exploratory/motes.htm

32
Part 2
  • The Design and Implementation of TinyDB

33
Part 2 Outline
  • TinyDB Overview
  • Data Model and Query Language
  • TinyDB Java API and Scripting
  • Demo with TinyDB GUI
  • TinyDB Internals
  • Extending TinyDB
  • TinyDB Status and Roadmap

34
TinyDB Revisited
SELECT MAX(mag) FROM sensors WHERE mag > thresh
SAMPLE PERIOD 64ms
  • High-level abstraction
  • Data-centric programming
  • Interact with sensor network as a whole
  • Extensible framework
  • Under the hood
  • Intelligent query processing: query optimization,
    power-efficient execution
  • Fault mitigation: automatically introduce
    redundancy, avoid problem areas

[Diagram: the App sends queries and triggers to TinyDB and receives data back.]
35
Feature Overview
  • Declarative SQL-like query interface
  • Metadata catalog management
  • Multiple concurrent queries
  • Network monitoring (via queries)
  • In-network, distributed query processing
  • Extensible framework for attributes, commands and
    aggregates
  • In-network, persistent storage

36
Architecture
[Architecture diagram: on the PC side, the TinyDB GUI and JDBC clients use the TinyDB Client API, optionally backed by a DBMS; on the mote side, the TinyDB query processor runs on each node (0-8) of the sensor network.]
37
Data Model
  • Entire sensor network as one single,
    infinitely-long logical table: sensors
  • Columns consist of all the attributes defined in
    the network
  • Typical attributes
  • Sensor readings
  • Meta-data: node id, location, etc.
  • Internal states: routing tree parent, timestamp,
    queue length, etc.
  • Nodes return NULL for unknown attributes
  • On server, all attributes are defined in
    catalog.xml
  • Discussion: other alternative data models?
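
As an illustration (our query, not from the slides), internal-state and sensor attributes are read identically:

  • SELECT nodeid, parent, light
  • FROM sensors
  • SAMPLE PERIOD 2s

A node without a light sensor simply returns NULL in the light column.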

38
Query Language (TinySQL)
  • SELECT <aggregates>, <attributes>
  • FROM sensors | <buffer>
  • WHERE <predicates>
  • GROUP BY <exprs>
  • SAMPLE PERIOD <const> | ONCE
  • INTO <buffer>
  • TRIGGER ACTION <command>

39
Comparison with SQL
  • Single table in FROM clause
  • Only conjunctive comparison predicates in WHERE
    and HAVING
  • No subqueries
  • No column aliases in SELECT clause
  • Arithmetic expressions limited to column op
    constant
  • Only fundamental difference: the SAMPLE PERIOD
    clause

40
TinySQL Examples
Find the sensors in bright nests.
  • SELECT nodeid, nestNo, light
  • FROM sensors
  • WHERE light > 400
  • EPOCH DURATION 1s

41
TinySQL Examples (cont.)
Count the number of occupied nests in each loud
region of the island.
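
The query itself did not survive in this transcript; a TinySQL query of this shape (column names follow the TAG paper's running example and are our assumption) would be:

  • SELECT region, COUNT(occupied)
  • FROM sensors
  • GROUP BY region
  • HAVING AVG(sound) > 200
  • EPOCH DURATION 10s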
42
Event-based Queries
  • ON event SELECT ...
  • Run query only when interesting events happen
  • Event examples
  • Button pushed
  • Message arrival
  • Bird enters nest
  • Analogous to triggers but events are user-defined
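
For instance (our sketch in the style of the ACQP paper; the event name and dist() predicate are assumptions), an event-based query might look like:

  • ON EVENT bird-detect(loc)
  • SELECT AVG(light), AVG(temp)
  • FROM sensors AS s
  • WHERE dist(s.loc, event.loc) < 10m
  • SAMPLE PERIOD 2s FOR 30s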

43
Query over Stored Data
  • Named buffers in Flash memory
  • Store query results in buffers
  • Query over named buffers
  • Analogous to materialized views
  • Example
  • CREATE BUFFER name SIZE x (field1 type1, field2
    type2, ...)
  • SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO
    name
  • SELECT field1, field2, ... FROM name SAMPLE
    PERIOD d

44
Using the Java API
  • SensorQueryer
  • translateQuery() converts TinySQL string into
    TinyDBQuery object
  • Static query optimization
  • TinyDBNetwork
  • sendQuery() injects query into network
  • abortQuery() stops a running query
  • addResultListener() adds a ResultListener that is
    invoked for every QueryResult received
  • removeResultListener()
  • QueryResult
  • A complete result tuple, or
  • A partial aggregate result, call
    mergeQueryResult() to combine partial results
  • Key difference from JDBC: push vs. pull

45
Writing Scripts with TinyDB
  • TinyDB's text interface
  • java net.tinyos.tinydb.TinyDBMain run select
  • Query results printed out to the console
  • All motes get reset each time a new query is posed
  • Handy for writing scripts with shell, perl, etc.

46
Using the GUI Tools
  • Demo time

47
Inside TinyDB
[Block diagram: the TinyDB query processor (schema; filters such as light > 400; multihop network) runs on top of TinyOS.]

10,000 lines embedded C code; 5,000 lines (PC-side)
Java; 3,200 bytes RAM (w/ 768-byte heap); 58 KB
compiled code (3x larger than the 2nd-largest
TinyOS program)
48
Tree-based Routing
  • Tree-based routing
  • Used in
  • Query delivery
  • Data collection
  • In-network aggregation
  • Relationship to indexing?

49
Power Management Approach
  • Coarse-grained app-controlled communication
    scheduling

[Schedule diagram: in each epoch (10s-100s of seconds), motes 1-5 wake together for a 2-4 s waking period, communicate, and sleep ("zzz") for the remainder of the epoch.]
50
Time Synchronization
  • All messages include a 5-byte time stamp
    indicating system time in ms
  • Synchronize (i.e., set system time to timestamp)
    with
  • Any message from parent
  • Any new query message (even if not from parent)
  • Punt on multiple queries
  • Timestamps written just after preamble is xmitted
  • All nodes agree that the waking period begins
    when (system time mod epoch duration == 0)
  • And lasts for WAKING_PERIOD ms
  • Adjustment of clock happens by changing duration
    of sleep cycle, not wake cycle.

51
Extending TinyDB
  • Why extend TinyDB?
  • New sensors -> attributes
  • New control/actuation -> commands
  • New data processing logic -> aggregates
  • New events
  • Analogous to concepts in object-relational
    databases

52
Adding Attributes
  • Types of attributes
  • Sensor attributes: raw or cooked sensor readings
  • Introspective attributes: parent, voltage, RAM
    usage, etc.
  • Constant attributes: constant values that can be
    statically or dynamically assigned to a mote,
    e.g., nodeid, location, etc.

53
Adding Attributes (cont)
  • Interfaces provided by Attr component
  • StdControl init, start, stop
  • AttrRegister
  • command registerAttr(name, type, len)
  • event getAttr(name, resultBuf, errorPtr)
  • event setAttr(name, val)
  • command getAttrDone(name, resultBuf, error)
  • AttrUse
  • command startAttr(attr)
  • event startAttrDone(attr)
  • command getAttrValue(name, resultBuf, errorPtr)
  • event getAttrDone(name, resultBuf, error)
  • command setAttrValue(name, val)

54
Adding Attributes (cont)
  • Steps to add attributes to TinyDB
  • Create attribute nesC components
  • Wire new attribute components to TinyDBAttr
    configuration
  • Reprogram TinyDB motes
  • Add new attribute entries to catalog.xml
  • Constant attributes can be added on the fly
    through TinyDB GUI

55
Adding Aggregates
  • Step 1: wire new nesC components

56
Adding Aggregates (cont)
  • Step 2: add entry to catalog.xml

    <aggregate>
      <name>AVG</name>
      <id>5</id>
      <temporal>false</temporal>
      <readerClass>net.tinyos.tinydb.AverageClass</readerClass>
    </aggregate>

  • Step 3 (optional): implement reader class in Java
  • a reader class interprets and finalizes aggregate
    state received from the mote network, returns
    final result as a string for display.

57
TinyDB Status
  • Latest release with TinyOS 1.1 (9/03)
  • Install the task-tinydb package in the TinyOS 1.1
    distribution
  • First release in TinyOS 1.0 (9/02)
  • Widely used by research groups as well as
    industry pilot projects
  • Successful deployments in Intel Berkeley Lab and
    redwood trees at UC Botanical Garden
  • Largest deployment: 80 weather-station nodes
  • Network longevity: 4-5 months

58
The Redwood Tree Deployment
  • Redwood Grove in UC Botanical Garden, Berkeley
  • Collect dense sensor readings to monitor climatic
    variations across
  • altitudes,
  • angles,
  • time,
  • forest locations, etc.
  • Versus sporadic monitoring points with 30-lb
    loggers!
  • Current focus: study how dense sensor data affects
    predictions of conventional tree-growth models

59
Data from Redwoods
[Figure: node placements by height in the tree: 36 m (top); 33 m: node 111; 32 m: node 110; 30 m: nodes 109, 108, 107; 20 m: nodes 106, 105, 104; 10 m: nodes 103, 102, 101.]
60
TinyDB Roadmap (near term)
  • Support for high frequency sampling
  • Equipment vibration monitoring, structural
    monitoring, etc.
  • Store and forward
  • Bulk reliable data transfer
  • Scheduling of communications
  • Port to Intel Mote
  • Deployment in Intel Fab equipment monitoring
    application and the Golden Gate Bridge monitoring
    application

61
For more information
  • http://berkeley.intel-research.net/tinydb or
    http://triplerock.cs.berkeley.edu/tinydb

62
Part 3
  • Database Research Issues in Sensor Networks

63
Sensor Network Research
  • Very active research area
  • Can't summarize it all
  • Focus: database-relevant research topics
  • Some outside of Berkeley
  • Other topics that are itching to be scratched
  • But, some bias towards work that we find
    compelling

64
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

65
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

66
Tiny Aggregation (TAG)
  • In-network processing of aggregates
  • Common data analysis operation
  • Aka "gather" operation or "reduction" in
    parallel programming
  • Communication reducing
  • Operator dependent benefit
  • Across nodes during same epoch
  • Exploit query semantics to improve efficiency!

Madden, Franklin, Hellerstein, Hong. Tiny
AGgregation (TAG), OSDI 2002.
67
Basic Aggregation
  • In each epoch
  • Each node samples local sensors once
  • Generates partial state record (PSR)
  • local readings
  • readings from children
  • Outputs PSR during assigned comm. interval
  • At end of epoch, PSR for whole network output at
    root
  • New result on each successive epoch
  • Extras
  • Predicate-based partitioning via GROUP BY

68
Illustration: Aggregation
SELECT COUNT(*) FROM sensors

[Animation frame: the epoch begins; in time interval 4, the deepest node transmits its partial COUNT (1).]
69
Illustration: Aggregation
SELECT COUNT(*) FROM sensors

[Animation frame: in interval 3, a node merges its child's PSR with its own reading and transmits COUNT = 2.]
70
Illustration: Aggregation
SELECT COUNT(*) FROM sensors

[Animation frame: in interval 2, two subtrees transmit partial COUNTs of 1 and 3.]
71
Illustration: Aggregation
SELECT COUNT(*) FROM sensors

[Animation frame: in interval 1, the root outputs the network-wide COUNT of 5.]
72
Illustration: Aggregation
SELECT COUNT(*) FROM sensors

[Animation frame: the next epoch begins; interval 4 repeats with a fresh partial COUNT.]
73
Aggregation Framework
  • As in extensible databases, TinyDB supports any
    aggregation function conforming to:

    Agg_n = { f_init, f_merge, f_evaluate }
      f_init:     a_0 -> <a_0>
      f_merge:    <a_1>, <a_2> -> <a_12>
      f_evaluate: <a_1> -> aggregate value

    (<...> denotes a partial state record, PSR)

    Example: AVERAGE
      AVG_init:     v -> <v, 1>
      AVG_merge:    <S_1, C_1>, <S_2, C_2> -> <S_1 + S_2, C_1 + C_2>
      AVG_evaluate: <S, C> -> S / C

  • Restriction: merge must be associative and
    commutative
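
For contrast (our added instance of the same template), MAX needs only a single value of state:

      MAX_init:     v -> <v>
      MAX_merge:    <a>, <b> -> <max(a, b)>
      MAX_evaluate: <a> -> a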
74
Taxonomy of Aggregates
  • TAG insight: classify aggregates according to
    various functional properties
  • e.g., partial-state size, duplicate sensitivity,
    exemplary vs. summary, monotonicity
  • Yields a general set of optimizations that can
    automatically be applied

Drives an API!
75
Use Multiple Parents
  • Use graph structure
  • Increase delivery probability with no
    communication overhead
  • For duplicate-insensitive aggregates, or
  • Aggs expressible as sum of parts
  • Send (part of) aggregate to all parents
  • In just one message, via multicast
  • Assuming independence, decreases variance

SELECT COUNT(*)

With a single parent (per-link success probability p,
so P(success from A -> R) = p^2):
  E(count) = c * p^2
  Var(count) = c^2 * p^2 * (1 - p^2) = V

With n parents, sending c/n to each:
  E(count) = n * (c/n) * p^2 = c * p^2
  Var(count) = n * (c/n)^2 * p^2 * (1 - p^2) = V / n
76
Multiple Parents Results
  • Better than previous analysis expected!
  • Losses aren't independent!
  • Insight: spreads data over many links

77
Acquisitional Query Processing (ACQP)
  • TinyDB acquires AND processes data
  • Could generate an infinite number of samples
  • An acquisitional query processor controls
  • when,
  • where,
  • and with what frequency data is collected!
  • Versus traditional systems where data is provided
    a priori

Madden, Franklin, Hellerstein, and Hong. The
Design of an Acquisitional Query Processor.
SIGMOD, 2003.
78
ACQP: What's Different?
  • How should the query be processed?
  • Sampling as a first class operation
  • How does the user control acquisition?
  • Rates or lifetimes
  • Event-based triggers
  • Which nodes have relevant data?
  • Index-like data structures
  • Which samples should be transmitted?
  • Prioritization, summary, and rate control
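
For example (our sketch; the LIFETIME clause is described in the ACQP paper), a user can specify how long the network should last and let the system derive the sampling rate:

  • SELECT nodeid, accel
  • FROM sensors
  • LIFETIME 30 days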

79
Operator Ordering: Interleave Sampling + Selection
At 1 sample / sec, total power savings could be
as much as 3.5 mW -> comparable to the processor!
  • SELECT light, mag
  • FROM sensors
  • WHERE pred1(mag)
  • AND pred2(light)
  • EPOCH DURATION 1s
  • E(sampling mag) >> E(sampling light)
  • 1500 uJ vs. 90 uJ
  • Cheapest plan: evaluate pred2(light) first and
    sample mag only if it passes

80
Exemplary Aggregate Pushdown
  • SELECT WINMAX(light, 8s, 8s)
  • FROM sensors
  • WHERE mag > x
  • EPOCH DURATION 1s
  • Novel, general pushdown technique
  • Mag sampling is the most expensive operation!

81
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

82
Heterogeneous Sensor Networks
  • Leverage small numbers of high-end nodes to
    benefit large numbers of inexpensive nodes
  • Still must be transparent and ad-hoc
  • Key to scalability of sensor networks
  • Interesting heterogeneities
  • Energy: battery vs. outlet power
  • Link bandwidth: Chipcon vs. 802.11x
  • Computing and storage: ATmega128 vs. XScale
  • Pre-computed results
  • Sensing nodes vs. QP nodes

83
Computing Heterogeneity with TinyDB
  • Separate query processing from sensing
  • Provide query processing on a small number of
    nodes
  • Attract packets to query processors based on
    service value
  • Compare the total energy consumption of the
    network
  • No aggregation
  • All aggregation
  • Opportunistic aggregation
  • HSN proactive aggregation

Mark Yarvis and York Liu, Intel's Heterogeneous
Sensor Network Project, ftp://download.intel.com/research/people/HSN_IR_Day_Poster_03.pdf
84
5x7 TinyDB/HSN Mica2 Testbed
85
Data Packet Saving
  • How many aggregators are desired?
  • Does placement matter?

86
Occasionally Connected Sensornets
[Diagram: sensornet patches with fixed gateways (GTWY) connect to the internet intermittently via mobile gateways.]
87
Occasionally Connected Sensornets Challenges
  • Networking support
  • Tradeoff between reliability, power consumption
    and delay
  • Data custody transfer duplicates?
  • Load shedding
  • Routing of mobile gateways
  • Query processing
  • Operation placement: in-network vs. on mobile
    gateways
  • Proactive pre-computation and data movement
  • Tight interaction between networking and QP

Fall, Hong and Madden, Custody Transfer for
Reliable Delivery in Delay Tolerant Networks,
http://www.intel-research.net/Publications/Berkeley/081220030852_157.pdf
88
Distributed In-network Storage
  • Collectively, sensornets have large amounts of
    in-network storage
  • Good for in-network consumption or caching
  • Challenges
  • Distributed indexing for fast query dissemination
  • Resilience to node or link failures
  • Graceful adaptation to data skews
  • Minimizing index insertion/maintenance cost

89
Example DIM
  • Functionality
  • Efficient range query for multidimensional data.
  • Approaches
  • Divide sensor field into bins.
  • Locality preserving mapping from m-d space to
    geographic locations.
  • Use geographic routing such as GPSR.
  • Assumptions
  • Nodes know their locations and network boundary
  • No node mobility

Xin Li, Young Jin Kim, Ramesh Govindan and Wei
Hong, Distributed Index for Multi-dimensional
Data (DIM) in Sensor Networks, SenSys 2003.
90
Statistical Techniques
  • Approximations, summaries, and sampling based on
    statistics
  • Applications
  • Limited bandwidth and large number of nodes ->
    data reduction
  • Lossiness -> predictive modeling
  • Uncertainty -> tracking correlations and changes
    over time

91
IDSQ
  • Idea: task sensors in order of best improvement
    to the estimate of some value
  • Choose leader(s)
  • Suppress subordinates
  • Task subordinates, one at a time
  • Until some measure of goodness (e.g., an error
    bound) is met
  • E.g., Mahalanobis distance, which accounts for
    correlations in axes and tends to favor minimizing
    the principal axis (formula below)

See Scalable Information-Driven Sensor Querying
and Routing for ad hoc Heterogeneous Sensor
Networks. Chu, Haussecker and Zhao. Xerox TR
P2001-10113. May, 2001.
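
For reference (our addition; this is the standard definition), the Mahalanobis distance of a point x from a distribution with mean \mu and covariance \Sigma is

  d_M(x) = \sqrt{(x - \mu)^\top \Sigma^{-1} (x - \mu)}

Unlike Euclidean distance, it discounts directions in which the distribution already has high variance.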
92
Graphical Representation

[Figure: the location estimate is modeled as a point with 2-dimensional Gaussian uncertainty; the principal axis of the uncertainty ellipse is marked.]
93
In-Net Regression
  • Linear regression: simple way to predict future
    values, identify outliers
  • Regression can be across local or remote values,
    multiple dimensions, or with high-degree
    polynomials
  • E.g., node A's readings vs. node B's
  • Or, location (X,Y) versus temperature
  • E.g., over many nodes

Guestrin, Thibaux, Bodik, Paskin, Madden.
Distributed Regression: an Efficient Framework
for Modeling Sensor Network Data. Under
submission.
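
As a reminder (our addition, the standard least-squares setup): given samples (x_i, y_i), simple linear regression picks a and b to minimize

  \sum_i (y_i - a x_i - b)^2

giving \hat{a} = \mathrm{Cov}(x, y) / \mathrm{Var}(x) and \hat{b} = \bar{y} - \hat{a}\,\bar{x}; a reading far from the fitted line can be flagged as an outlier.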
94
In-Net Regression (Continued)
  • Problem may require data from all sensors to
    build model
  • Solution partition sensors into overlapping
    kernels that influence each other
  • Run regression in each kernel
  • Requiring just local communication
  • Blend data between kernels
  • Requires some clever matrix manipulation
  • End result: regressed model at every node
  • Useful in failure detection, missing value
    estimation

95
Correlated Attributes
  • Data in sensor networks is correlated; e.g.,
  • Temperature and voltage
  • Temperature and light
  • Temperature and humidity
  • Temperature and time of day
  • etc.

96
Exploiting Correlations in Query Processing
  • Simple idea
  • Given predicate P(A) over expensive attribute A
  • Replace it with P'(A') over cheap attribute A',
    chosen so that P' evaluates like P
  • Problem: unless A and A' are perfectly
    correlated, P != P' for all time
  • So we could incorrectly accept or reject some
    readings
  • Alternative: use correlations to improve
    selectivity estimates in query optimization
  • Construct conditional plans that vary predicate
    order based on prior observations

97
Exploiting Correlations (Cont.)
  • Insight: by observing a (cheap and correlated)
    variable not involved in the query, it may be
    possible to improve query performance
  • Improves estimates of selectivities
  • Use conditional plans
  • Example: e.g., observe a cheap light or
    time-of-day reading to pick the order in which
    correlated, expensive predicates are evaluated

98
In-Network Join Strategies
  • Types of joins
  • non-sensor -> sensor
  • sensor -> sensor
  • Optimization questions
  • Should the join be pushed down?
  • If so, where should it be placed?
  • What if a join table exceeds the memory available
    on one node?

99
Choosing Where to Place Operators
  • Idea: choose a join node to run the operator
  • Over time, explore other candidate placements
  • Nodes advertise data rates to their neighbors
  • Neighbors compute expected cost of running the
    join based on these rates
  • Neighbors advertise costs
  • Current join node selects a new, lower cost node

Bonfils & Bonnet, Adaptive and Decentralized
Operator Placement for In-Network Query
Processing, IPSN 2003.
100
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

101
Adaptivity In Sensor Networks
  • Queries are long running
  • Selectivities change
  • E.g., night vs. day
  • Network load and available energy vary
  • All suggest that some adaptivity is needed
  • Of data rates or granularity of aggregation when
    optimizing for lifetimes
  • Of operator orderings or placements when
    selectivities change (cf. conditional plans for
    correlations)
  • As far as we know, this is an open problem!

102
Multiple Queries and Work Sharing
  • As sensornets evolve, users will run many queries
    simultaneously
  • E.g., traffic monitoring
  • Likely that queries will be similar
  • But have different endpoints, parameters, etc.
  • Would like to share processing, routing as much
    as possible
  • But how? Again, an open problem.

103
Concluding Remarks
  • Sensor networks are an exciting emerging
    technology, with a wide variety of applications
  • Many research challenges in all areas of computer
    science
  • Database community included
  • Some agreement that a declarative interface is
    right
  • TinyDB and other early work are an important
    first step
  • But there's lots more to be done!