Title: Multi-resolution Resource Behavior Queries Using Wavelets
1Multi-resolution Resource Behavior Queries Using Wavelets
- Jason Skicewicz
- Peter A. Dinda
- Jennifer M. Schopf
- Northwestern University
2The Tension
Video App
Sensor
Fine-grain measurement
Resource-appropriate measurement
Grid App
Resource signal (periodic sampling). Example: host load
Coarse-grain measurement
3Video Scheduling
Video App
Sensor
Fine-grain measurements needed
4Grid Scheduling
Grid App
Sensor
Coarse-grain measurements sufficient
5Interval Averages
Application
Sensor
Average over interval
Average over interval
Ideal Result
Adequate Result
6Contributions / Outline
- Application-sensor tension
- Query model to address tension
- Wavelets as basis for query model
- Promising early results
- Delay conundrum
7Schematic Representation of Query Model
Application
Sensor
x
x
Lower bandwidth used
Measurements at fs samples/second
Desired rate at fq samples/second
The desired-rate signal is an estimate of x, with error
x
8Application
Sensor
Query
Stream Error
x
t
t
Δ
Δq
9Application
Sensor
Query
Average CI
t_now = i_now Δ
(i_now - N + 1) Δ
x
t
t
Application gets average over this interval
Application wants average over this interval
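The interval query above can be sketched in a few lines of Python. This is only an illustration: `interval_average` is a hypothetical helper, and the 95% half-width uses a textbook normal approximation that assumes independent samples, which host load generally is not. The talk does not say how its confidence interval is computed.

```python
import math

def interval_average(samples, i_now, n):
    """Mean of the n most recent samples ending at index i_now,
    plus a rough 95% half-width (assumption: independent samples)."""
    start = i_now - n + 1                      # first index of the interval
    if start < 0 or i_now >= len(samples):
        raise ValueError("interval falls outside the recorded signal")
    window = samples[start:i_now + 1]
    mean = sum(window) / n
    if n < 2:
        return mean, float("inf")              # no spread estimate from one sample
    var = sum((v - mean) ** 2 for v in window) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)     # normal approximation, an assumption
    return mean, half_width

load = [0.5, 0.7, 0.6, 1.2, 1.0, 0.9, 1.1, 0.8]   # made-up load samples
mean, ci = interval_average(load, i_now=7, n=4)
print(f"average = {mean:.3f} +/- {ci:.3f}")
```

The index arithmetic mirrors the slide: the interval runs from sample i_now - N + 1 through sample i_now.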
10Contributions / Outline
- Application-sensor tension
- Query model to address tension
- Wavelets as basis for query model
- Promising early results
- Delay conundrum
11Wavelets As Basis for Query Model
- Natural time/frequency decomposition
- Provides a multi-resolution view of a resource
- Well known mathematical tool
- Invented in the 80s, hot in 90s and today
- Linear complexity
- Non-stationarity, other normal behaviors acceptable
- Burrus, Gopinath, Guo, Introduction to Wavelets and Wavelet Transforms: A Primer
- Analytic enabler
- Prediction on different resolutions
- Compression of measurement streams
Queries over wavelet domain representation of signal
12Multi-resolution Views
13High Level View of a 4-level Wavelet Decomposition
Sensor
Level 0
Wavelet Transform
Level 1
Wavelet Coefficients
Level 2
Level 3
- Resource Signal is decomposed into levels
- Samples at each level are at a different rate
- Each level captures different frequency content
- Corresponding inverse transform
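A minimal sketch of a 4-level decomposition and its inverse, using the Haar wavelet because it fits in a few lines. The talk itself refers to Daubechies wavelets (D4, D8), so Haar is a substitution made here for brevity. The function names are hypothetical, and the level numbering (level 0 coarsest) follows the slides.

```python
import math

def haar_decompose(x, levels):
    """Return `levels` lists: index 0 is the coarsest approximation,
    indices 1..levels-1 are detail coefficients from coarse to fine."""
    stages = levels - 1
    if len(x) % (2 ** stages) != 0:
        raise ValueError("length must be a multiple of 2**(levels - 1)")
    scale = 1 / math.sqrt(2)
    approx, details = list(x), []
    for _ in range(stages):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) * scale for a, b in pairs])   # high-pass half
        approx = [(a + b) * scale for a, b in pairs]          # low-pass half
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Inverse transform: rebuild the signal from all levels."""
    scale = 1 / math.sqrt(2)
    approx = list(coeffs[0])
    for detail in coeffs[1:]:
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.extend([(a + d) * scale, (a - d) * scale])
        approx = rebuilt
    return approx

x = [float(v) for v in range(16)]
coeffs = haar_decompose(x, levels=4)
# With input rate fs: level 3 runs at fs/2, level 2 at fs/4, levels 1 and 0 at fs/8.
lengths = [len(c) for c in coeffs]
rebuilt = haar_reconstruct(coeffs)
```

Each level holds half as many samples as the next finer one, which is why the levels run at different rates.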
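144-level Wavelet Decomposition: Time-frequency Localization
Frequency band covered by each level:
- Level 0: 0 to fs/16
- Level 1: fs/16 to fs/8
- Level 2: fs/8 to fs/4
- Level 3: fs/4 to fs/2
- x[n]: 0 to fs/2
Sample period Δ, fs = 1/Δ
Horizontal axis: time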
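The band edges on this slide follow a dyadic pattern that can be generated for any number of levels. The sketch below assumes the slide's numbering (level 0 is the coarsest approximation) and expresses each edge as a fraction of fs. `level_bands` is a hypothetical helper name.

```python
from fractions import Fraction

def level_bands(levels):
    """Return (low, high) band edges per level, as fractions of fs."""
    # Level 0 is the low-pass residue after levels - 1 filter stages.
    bands = [(Fraction(0), Fraction(1, 2 ** levels))]
    for level in range(1, levels):
        low = Fraction(1, 2 ** (levels - level + 1))
        high = Fraction(1, 2 ** (levels - level))
        bands.append((low, high))
    return bands

for level, (low, high) in enumerate(level_bands(4)):
    print(f"level {level}: {low} fs to {high} fs")
```

For 4 levels this gives the bands listed on the slide, ending at fs/2 for level 3.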
15Example Decomposition of Host Load
Lossless representation of resource signal
16Computing Wavelet Coefficients
- Streaming operation
- Number of levels, M, chosen arbitrarily
- Amortized work per sample O(1)
- O(n) for n samples
- Block by block operation
- Block of samples, n = 2^k
- Levels, M = lg(n) + 1
- Circular convolution over block, O(n)
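The streaming case can be sketched as follows, again with the Haar wavelet standing in for the longer filters used in the talk (an assumption made to keep the example short). Each stage holds at most one pending value, and stage j does work only on every 2^(j+1)-th sample, so the total work for n samples stays below n filter steps. `StreamingHaar` and its work counter are hypothetical names.

```python
import math

class StreamingHaar:
    """Haar transform fed one sample at a time (a sketch, not the authors' code)."""

    def __init__(self, stages):
        self.stages = stages
        self.pending = [None] * stages   # at most one waiting value per stage
        self.butterflies = 0             # work counter, for illustration only

    def push(self, sample):
        """Return the (level, coefficient) pairs completed by this sample."""
        scale = 1 / math.sqrt(2)
        out = []
        value = sample
        for stage in range(self.stages):
            if self.pending[stage] is None:
                self.pending[stage] = value        # wait for this stage's partner
                return out
            first, self.pending[stage] = self.pending[stage], None
            self.butterflies += 1
            # Stage 0 yields the finest detail, which has the highest level number.
            out.append((self.stages - stage, (first - value) * scale))
            value = (first + value) * scale
        out.append((0, value))                     # level 0: coarsest approximation
        return out

transform = StreamingHaar(stages=3)                # 3 stages give levels 0..3
coefficients = []
for sample in range(16):
    coefficients.extend(transform.push(float(sample)))
print(transform.butterflies, "filter steps for 16 samples")
```

The counter shows the amortized cost: 8 + 4 + 2 steps for 16 samples, which is the O(1) per sample claimed on the slide.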
17Proposed System
Application
Sensor
Network
Stream
Interval
Level 0
Level 0
Wavelet Transform
Inverse Wavelet Transform
Level L
Level M-1
Level M
Application receives levels based on its needs
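To illustrate how an application might receive only the levels it needs, the sketch below reconstructs a signal from levels 0 through a chosen level and treats the finer levels as zero. It uses the Haar wavelet and hypothetical function names (`decompose`, `reconstruct`, `view`). The system in the talk uses Daubechies filters and sends coefficients over a network, neither of which is modelled here.

```python
import math

S = 1 / math.sqrt(2)

def decompose(x, levels):
    """Haar decomposition; result[0] is the coarsest level, result[-1] the finest."""
    approx, details = list(x), []
    for _ in range(levels - 1):
        evens, odds = approx[0::2], approx[1::2]
        details.append([(a - b) * S for a, b in zip(evens, odds)])
        approx = [(a + b) * S for a, b in zip(evens, odds)]
    return [approx] + details[::-1]

def reconstruct(coeffs):
    """Inverse Haar transform over all supplied levels."""
    approx = list(coeffs[0])
    for detail in coeffs[1:]:
        approx = [v for a, d in zip(approx, detail)
                  for v in ((a + d) * S, (a - d) * S)]
    return approx

def view(x, levels, highest_level):
    """Rebuild x from levels 0..highest_level; finer levels are zeroed."""
    coeffs = decompose(x, levels)
    kept = [c if lvl <= highest_level else [0.0] * len(c)
            for lvl, c in enumerate(coeffs)]
    return reconstruct(kept)

load = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0, 6.0, 6.0]    # made-up load samples
coarse = view(load, levels=4, highest_level=0)      # one long-term average
medium = view(load, levels=4, highest_level=1)      # two half-length averages
exact = view(load, levels=4, highest_level=3)       # every level: lossless
```

A coarse-grain application could stop at `coarse` or `medium`, while a fine-grain one would request all levels.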
18Multi-resolution Views Using 14 Levels
19Wavelet Compression Gains, 14 Levels
Typical appropriate number of levels for host load, error < 20%
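The compression gain from sending only the coarse levels can be estimated by counting coefficients. The sketch assumes a dyadic decomposition with levels numbered 0 to M-1, where level 0 and level 1 each hold n/2^(M-1) coefficients and every finer level doubles that. The slide does not state which numbering its plot uses, so treat the figures as illustrative. `coefficient_fraction` is a hypothetical helper.

```python
from fractions import Fraction

def coefficient_fraction(levels, highest_level):
    """Fraction of all coefficients held by levels 0..highest_level."""
    if not 0 <= highest_level < levels:
        raise ValueError("highest_level must lie in 0..levels-1")
    # Levels 0..L together hold n / 2**(levels - 1 - L) of the n coefficients.
    return Fraction(1, 2 ** (levels - 1 - highest_level))

for top in (3, 6, 9, 13):
    share = coefficient_fraction(14, top)
    print(f"levels 0..{top} of 14: {float(share):.4%} of coefficients")
```

Under this assumption, each level dropped from the fine end halves the amount of data sent.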
20Contributions / Outline
- Application-sensor tension
- Query model to address tension
- Wavelets as basis for query model
- Promising early results
- Delay conundrum
21Offline Analysis System
22Load Traces
- DEC Unix: 5-second exponential average
- 1 Hz sample rate
- Traces collected in August 1997
- AXP0-PSC: interactive machine with high load
- AXP7-PSC: batch machine
- Sahara-CMU: large-memory compute server
- Themis-CMU: desktop workstation
- Windows 2000: percentage of CPU
- 1 Hz sample rate
- Trace collected in May 2001
- Tlab-03-NU: desktop, teaching lab machine
23Testcases
- Stream Queries
- One million samples per trace
- Interval Queries
- 2, 8, 32, 128, 512, 2048, 8192 second intervals
- 1000 randomized queries per interval length per trace
24Performance Evaluation
- Streaming queries metrics
- Error variance
- Error histograms
- Error mean
- Energy in error auto-covariance
- Interval query metrics
- Error variance
- Error histograms
- Error mean
Error mean 0 for all evaluations
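The metrics listed above can be computed along these lines. The exact definitions used in the evaluation are not given on the slide, so two are assumptions: relative error variance is taken as error variance divided by signal variance, and "energy in error auto-covariance" is taken as the sum of squared auto-covariance values at non-zero lags. `error_metrics` is a hypothetical helper.

```python
def error_metrics(signal, estimate, max_lag=8):
    """Summarize the error between a signal and its reconstruction."""
    n = len(signal)
    error = [e - s for s, e in zip(signal, estimate)]
    mean = sum(error) / n
    variance = sum((e - mean) ** 2 for e in error) / n
    signal_mean = sum(signal) / n
    signal_variance = sum((s - signal_mean) ** 2 for s in signal) / n
    # Biased auto-covariance estimate at lags 1..max_lag (assumed definition).
    autocov = [
        sum((error[i] - mean) * (error[i + lag] - mean) for i in range(n - lag)) / n
        for lag in range(1, min(max_lag, n - 1) + 1)
    ]
    return {
        "error_mean": mean,
        "error_variance": variance,
        "relative_error_variance":
            variance / signal_variance if signal_variance else float("nan"),
        "autocovariance_energy": sum(c * c for c in autocov),
    }

signal = [1.0, 2.0, 3.0, 4.0]          # made-up values
estimate = [1.5, 1.5, 3.5, 3.5]        # e.g. a coarse reconstruction
metrics = error_metrics(signal, estimate)
```

An error histogram would follow from binning the `error` list. A low auto-covariance energy indicates that successive errors are close to uncorrelated.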
25Streaming Queries, Relative Error Variance
Fewer than 1% of coefficients, error < 20%
26Streaming Queries, Error Histogram at Level 6
Errors follow a near-Gaussian distribution
27Interval Queries, Error Variance
Error variance approaches zero as interval increases
28Interval Queries, Error Histograms at Level 5
Distributions not always Gaussian
29Contributions / Outline
- Application-sensor tension
- Query model to address tension
- Wavelets as basis for query model
- Promising early results
- Delay conundrum
30Block By Block System Delay
M Levels
Wavelet Transform
Inverse Wavelet Transform
x[n]
xr[n]
Block
Block
n samples in block
n samples in block
Sample Acquisitions
Wavelet transform
Inverse transform
time
Samples delayed by block size
31Streaming System Delay, Example with Length 4 Wavelets (D4), 4 Levels
Level 0
Length 22
Length 22
Level 1
Length 22
Length 22
x[n]
xr[n-d]
Level 2
Length 10
Length 10
Delay K1
Level 3
Length 4
Length 4
Delay K2
High levels are delayed waiting for low-frequency computations; output is delayed by the high-order filter
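The filter lengths in this example follow from cascading a length-N filter through j stages of filtering and downsampling by 2: the equivalent single filter has length (2^j - 1)(N - 1) + 1. The sketch below applies that standard multirate identity under the slide's level numbering. `equivalent_filter_lengths` is a hypothetical helper, and the delay offsets K1 and K2 are not computed.

```python
def equivalent_filter_lengths(filter_len, levels):
    """Equivalent filter length seen by each level, level 0 first."""
    stages = levels - 1

    def cascade(j):
        # Length of a j-stage filter-and-downsample cascade.
        return (2 ** j - 1) * (filter_len - 1) + 1

    lengths = [cascade(stages)]              # level 0 passes through every stage
    for level in range(1, levels):
        lengths.append(cascade(levels - level))
    return lengths

print(equivalent_filter_lengths(filter_len=4, levels=4))
```

For D4 and 4 levels this gives 22, 22, 10 and 4, matching the lengths on the slide. The shorter filters at levels 2 and 3 finish earlier, which is why those levels need the extra delays to stay aligned with levels 0 and 1.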
32Delay Conclusions
- System implementation
- Delay must be taken into account
- Prediction may help reduce streaming delay
- Application scheduling
- Fine-grain apps more sensitive to delay
- Coarse-grain apps less sensitive to delay
- Suggestions?
We are working on a solution!
33Related Work
- Database queries over wavelet coefficients
- Shahabi et al., SSDBM 2000
- Chakrabarti et al., VLDB 2000
- Vitter et al., CIKM 98, SIGMOD 99
- Network traffic analysis and modeling
- Ribeiro et al., IEEE INFOCOM 2000
- Riedi et al., IEEE DSPCS 99
- Feldmann et al., SIGCOMM 98
- Wavelet theory
- Daubechies, Ten Lectures on Wavelets, SIAM, 92
- Mallat, IEEE Trans. on Pattern Analysis and Machine Intelligence, 89
34Conclusions
- Application-sensor tension
- Query model to address tension
- Wavelets as basis for query model
- Promising early results
- Delay conundrum
35Future Work
- Wavelets are an enabler of other techniques
- Prediction over wavelet coefficients
- Possibility of better results
- Can reduce system delay
- Further compression through processing
- Adaptive decompositions based on resource
- Looking at other resource streams
- RPS implementation
36Contact Information
- Webpage
- http://www.cs.northwestern.edu/jskitz
- Email address
- jskitz_at_cs.northwestern.edu
- Load traces and tools
- http://www.cs.northwestern.edu/pdinda/LoadTraces
- Matlab scripts
- Available by request (jskitz_at_cs.northwestern.edu)
37Frequency Information Vs. Rate
Input signal, x[n]
Decomposition
- Frequency information retained: up to fs/2
- Measurement rate: fs
Q: Why is this true?
A: The Nyquist criterion (sampling theory)
38Wavelet Transform, 1 Stage
LPF, HPF: FIR filters
x[n]
y[n]
h[n]
39Increasing Stages, Mallat's Tree Algorithm
x[n]
Stages can be arbitrarily increased
40Frequency Response
HPF
LPF
- Filters must be even order for PR
- Other special properties to retain PR
- The filters are order N = 8 (D8 wavelet)
41Reconstruction From the Wavelet Coefficients, 1 Stage
Upsampler
LPF, HPF: time-reversed filters, same response
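One analysis stage and its matching reconstruction stage can be sketched with the length-4 Daubechies (D4) filters and circular (periodic) extension. D4 is used here instead of the D8 filters on the frequency-response slide only to keep the coefficient list short. `analyze` and `synthesize` are hypothetical names. The synthesis loop is the transpose of the analysis step, which corresponds to upsampling and filtering with the time-reversed filters.

```python
import math

SQRT3 = math.sqrt(3)
# D4 low-pass (scaling) filter and the matching high-pass (wavelet) filter.
H = [c / (4 * math.sqrt(2)) for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]
G = [(-1) ** k * H[3 - k] for k in range(4)]

def analyze(x):
    """One stage: filter with LPF and HPF, keep every second output."""
    n = len(x)
    if n % 2:
        raise ValueError("block length must be even")
    low = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    high = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return low, high

def synthesize(low, high):
    """Inverse stage: upsample and apply the time-reversed filters."""
    n = 2 * len(low)
    x = [0.0] * n
    for i in range(len(low)):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * low[i] + G[k] * high[i]
    return x

x = [0.4, 0.9, 1.3, 1.1, 0.7, 0.2, 0.5, 0.8]   # made-up load block
low, high = analyze(x)
rebuilt = synthesize(low, high)
flat_low, flat_high = analyze([1.0] * 8)        # a constant has no detail
```

Feeding `low` back into `analyze` repeats the stage, which is the tree structure used for additional levels.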
42Reconstruction From Multiple Stages, The Inverse Wavelet Transform
Reconstructed signal is exactly the resource signal
43Q: How is the number of levels determined?
Answers
- Determined by accuracy constraints
- Determined by what levels are available
- Determined by the rate (fq) at which measurements are requested
44Example, Choosing Levels
Solution
L = 2
fq = fs / 6
M = 4 levels
Equation Satisfied!
Levels 0, 1 and 2 coefficients returned
45Streaming Query Tradeoffs
- Measurement rate, fq high
- Lower error variance
- Higher communication costs
- Measurement rate, fq low
- Higher error variance
- Very low communication costs
Wavelet approach yields accuracy at low rates
46Interval Query Tradeoffs
- Interval length N long
- Less dynamic rate
- Tighter confidence intervals
- Interval length N short
- More dynamic rate
- Wider confidence intervals
- Rate, fq high
- Shorter interval length
- Tighter confidence intervals
- Rate, fq low
- Longer interval length
- Wider confidence intervals
Confidence interval (c) provides flexibility
47Streaming Queries, Energy in Auto-covariance
Error becomes uncorrelated as levels added
48Interval Queries, Error Mean (32 seconds)
Error mean is zero at 8 levels, 3% of coefficients
49Interval Queries, Error Mean (512 seconds, about 8½ minutes)
As interval increases, need fewer levels