Sketching and Streaming Entropy via Approximation Theory

Slides: 22

Provided by: nic19

Learn more at: http://archive.dimacs.rutgers.edu

1
Sketching and Streaming Entropy via Approximation Theory
Nick Harvey (MSR/Waterloo), Jelani Nelson (MIT), Krzysztof Onak (MIT)
2
Streaming Model
m updates
Increment $x_4$
Increment $x_1$
$x \in \mathbb{Z}^n$
Goal: Compute statistics, e.g. $\|x\|_1$, $\|x\|_2$
Trivial solution: Store x (or store all updates), using $O(n \log m)$ space
Goal: Compute using $O(\mathrm{polylog}(nm))$ space
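
As a point of reference, the trivial solution on this slide can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the choice of the l1 and l2 norms as the example statistics is an assumption about what the slide intended.

```python
from collections import Counter

def apply_updates(updates):
    """Trivial solution: keep the whole frequency vector x in memory.

    `updates` is a sequence of coordinate indices; each one means
    "increment x_i". This uses space proportional to the number of
    distinct coordinates, not polylog(nm).
    """
    x = Counter()
    for i in updates:
        x[i] += 1
    return x

def l1_norm(x):
    # Sum of absolute frequencies (equals the stream length when
    # there are only increments).
    return sum(abs(v) for v in x.values())

def l2_norm(x):
    # Euclidean norm of the frequency vector.
    return sum(v * v for v in x.values()) ** 0.5

if __name__ == "__main__":
    x = apply_updates([4, 1, 4, 2, 4])
    print(dict(x), l1_norm(x), l2_norm(x))
```

A streaming algorithm has to answer the same kind of query without ever materializing `x`.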
3
Streaming Algorithms (a very brief introduction)

Fact [Alon-Matias-Szegedy 99, Bar-Yossef et al. 02, Indyk-Woodruff 05, Bhuvanagiri et al. 06, Indyk 06, Li 08, Li 09]:
Can compute $(1 \pm \varepsilon) F_p$ using
$O(\varepsilon^{-2} \log^c n)$ bits of space (if $0 \le p \le 2$)
$O(\varepsilon^{-O(1)} n^{1-2/p} \log^{O(1)} n)$ bits (if $2 < p < \infty$)
Another Fact: Mostly optimal [Alon-Matias-Szegedy 99, Bar-Yossef et al. 02, Saks-Sun 02, Chakrabarti-Khot-Sun 03, Indyk-Woodruff 03, Woodruff 04]
Proofs use communication complexity and information theory

4
Practical Motivation

General goal: Dealing with massive data sets (Internet traffic, large databases)
Network monitoring & anomaly detection
Stream consists of internet packets
$x_i$ = # packets sent to port i
Under typical conditions, x is very concentrated
Under a port scan attack, x is less concentrated
Can detect by estimating empirical entropy [Lakhina et al. 05, Xu et al. 05, Zhao et al. 07]

5
Entropy

Probability distribution $a = (a_1, a_2, \ldots, a_n)$
Entropy $H(a) = -\sum_i a_i \lg(a_i)$
Examples:
$a = (1/n, 1/n, \ldots, 1/n)$: $H(a) = \lg(n)$
$a = (0, \ldots, 0, 1, 0, \ldots, 0)$: $H(a) = 0$
small when concentrated, LARGE when not
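
The two examples can be checked directly. The sketch below computes the empirical entropy of a frequency vector by first normalizing it to a distribution; the function name and the convention that zero entries contribute nothing are assumptions of this illustration.

```python
import math

def empirical_entropy(counts):
    """Shannon entropy (in bits) of the distribution counts / sum(counts)."""
    total = sum(counts)
    # Terms with zero probability contribute 0 (the usual 0 * lg 0 = 0 convention).
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

if __name__ == "__main__":
    print(empirical_entropy([1] * 8))       # uniform over 8 items
    print(empirical_entropy([0, 0, 7, 0]))  # all mass on one item
```

In the port-scan setting of the previous slide, `counts` would be the per-port packet counts, with higher entropy indicating less concentrated traffic.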

6
Streaming Algorithms for Entropy

How much space to estimate H(x)?
[Guha-McGregor-Venkatasubramanian 06], [Chakrabarti-Do Ba-Muthu 06], [Bhuvanagiri-Ganguly 06]
[Chakrabarti-Cormode-McGregor 07]:
multiplicative $(1 \pm \varepsilon)$ approx: $O(\varepsilon^{-2} \log^2 m)$ bits
additive $\varepsilon$ approx: $O(\varepsilon^{-2} \log^4 m)$ bits
$\tilde{\Omega}(\varepsilon^{-2})$ lower bound for both
Our contributions:
Additive $\varepsilon$ or multiplicative $(1 \pm \varepsilon)$ approximation using $\tilde{O}(\varepsilon^{-2} \log^3 m)$ bits, and can handle deletions
Can sketch entropy in the same space

7
First Idea

If you can estimate $F_p$ for $p \approx 1$,
then you can estimate H(x)

Why?
Rényi entropy
8
Review of Rényi

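Definition: $H_p(x) = \frac{1}{1-p} \lg\left(\sum_i x_i^p\right)$ for a probability distribution x and $p \ne 1$
Convergence to Shannon: $\lim_{p \to 1} H_p(x) = H(x)$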

[Figure: plot of $H_p(x)$ against p, with p = 0, 1, 2 marked on the axis; portraits of Alfred Rényi and Claude Shannon]
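
A small numerical check of the convergence claim, as a sketch only. It assumes the standard definition of Rényi entropy, $H_p(x) = \lg(\sum_i x_i^p)/(1-p)$, since the slide's own formula is not preserved in the transcript; the function names are illustrative.

```python
import math

def shannon_entropy(a):
    """Shannon entropy in bits of a probability distribution a."""
    return sum(q * math.log2(1 / q) for q in a if q > 0)

def renyi_entropy(a, p):
    """Renyi entropy of order p (p != 1) in bits, standard definition."""
    if p == 1:
        raise ValueError("order p = 1 is the Shannon limit; use shannon_entropy")
    return math.log2(sum(q ** p for q in a if q > 0)) / (1 - p)

if __name__ == "__main__":
    a = [0.5, 0.25, 0.125, 0.125]
    print("Shannon:", shannon_entropy(a))
    for p in (2.0, 1.1, 1.01, 1.001):
        print(p, renyi_entropy(a, p))
```

Because $\sum_i x_i^p$ is exactly the normalized moment $F_p$, an estimate of $F_p$ for p slightly above 1 gives an estimate of $H_p(x)$, which is the link the "First Idea" slide relies on.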
9
Overview of Algorithm
Analysis

Set p = 1.01 and let x
Compute
Set
So

(using Li's compressed counting)

10
Making the tradeoff

How quickly does Hp(x) converge to H(x)?
Theorem: Let x be a distribution with $\min_i x_i \ge 1/m$.
Let . Then
Let . Then
Plugging in: $O(\varepsilon^{-3} \log^4 m)$ bits of space suffice for additive $\varepsilon$ approximation

Multiplicative Approximation

Additive Approximation

11
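Proof: A trick worth remembering

Let $f: \mathbb{R} \to \mathbb{R}$ and $g: \mathbb{R} \to \mathbb{R}$ be such that $\lim_{t \to t_0} f(t) = \lim_{t \to t_0} g(t) = 0$

L'Hôpital's rule says that $\lim_{t \to t_0} f(t)/g(t) = \lim_{t \to t_0} f'(t)/g'(t)$

It actually says more! It says $f(t)/g(t)$ converges to its limit at least as fast as $f'(t)/g'(t)$ does.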
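
As a worked illustration of the trick, here is how it applies to Rényi entropy. This is a sketch under the standard definition $H_p(x) = \lg(\sum_i x_i^p)/(1-p)$ for a probability distribution x, which is an assumption here because the slide's formulas are not preserved; it is not necessarily the exact argument used in the talk.

```latex
\[
  H_p(x) = \frac{f(p)}{g(p)}, \qquad
  f(p) = \lg \sum_i x_i^p, \qquad
  g(p) = 1 - p, \qquad
  f(1) = g(1) = 0 .
\]
\[
  \frac{f'(p)}{g'(p)}
  = -\,\frac{\sum_i x_i^p \lg x_i}{\sum_i x_i^p}
  \;\xrightarrow[\;p \to 1\;]{}\;
  -\sum_i x_i \lg x_i = H(x) .
\]
% Cauchy's mean value theorem: for p \ne 1 there is a \xi strictly
% between 1 and p with f(p)/g(p) = f'(\xi)/g'(\xi). Hence
\[
  \bigl| H_p(x) - H(x) \bigr|
  \;\le\; \sup_{\xi \text{ between } 1 \text{ and } p}
  \left| \frac{f'(\xi)}{g'(\xi)} - H(x) \right| .
\]
```

So a bound on how fast the derivative ratio approaches H(x) transfers directly to a bound on how fast $H_p(x)$ does.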

12
Improvements

Status: additive $\varepsilon$ approx using $O(\varepsilon^{-3} \log^4 m)$ bits
How to reduce space further?
Interpolate with multiple points: $H_{p_1}(x), H_{p_2}(x), \ldots$

13
Analyzing Interpolation

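Let f(z) be a $C^{k+1}$ function
Interpolate f with polynomial q with $q(z_i) = f(z_i)$, $0 \le i \le k$
Fact: $|q(y) - f(y)| \le \frac{|b-a|^{k+1}}{(k+1)!} \sup_{z \in [a,b]} |f^{(k+1)}(z)|$, where $y, z_i \in [a,b]$
Our case: Set $f(z) = H_{1+z}(x)$
Goal: Analyze $f^{(k+1)}(z)$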
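
The interpolation error bound can be checked numerically. The sketch below assumes the standard Lagrange remainder form, $|q(y) - f(y)| \le \frac{|b-a|^{k+1}}{(k+1)!} \sup |f^{(k+1)}|$, since the exact statement on the slide is not preserved; the test function exp and the equally spaced nodes are arbitrary choices for illustration.

```python
import math
import numpy as np

def interpolation_error(f, a, b, k, y):
    """Error at y of the degree-k polynomial interpolating f at k+1 nodes in [a, b]."""
    nodes = np.linspace(a, b, k + 1)
    # With k+1 points and degree k, the least-squares fit is the interpolant.
    coef = np.polyfit(nodes, f(nodes), k)
    return abs(np.polyval(coef, y) - f(y))

def lagrange_bound(a, b, k, sup_derivative):
    """Standard remainder bound: |b-a|^(k+1) / (k+1)! * sup |f^(k+1)|."""
    return abs(b - a) ** (k + 1) / math.factorial(k + 1) * sup_derivative

if __name__ == "__main__":
    # f = exp on [-1, 0]: every derivative is exp, so the sup over [-1, 0] is 1.
    err = interpolation_error(np.exp, -1.0, 0.0, 4, -0.3)
    print(err, lagrange_bound(-1.0, 0.0, 4, 1.0))
```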

14
Bounding Derivatives

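Rényi derivatives are messy to analyze
Switch to Tsallis entropy: $f(z) = S_{1+z}(x)$, where $S_p(x) = (1 - \sum_i x_i^p)/(p-1)$ for a probability distribution x
Can prove Tsallis also converges to Shannon

Fact
(when $a = -O(1/(k \log m))$, $b = 0$) can set $k = \log(1/\varepsilon) + \log\log m$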
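
A quick check that Tsallis entropy also converges to Shannon entropy. This sketch assumes the form $S_{1+z}(x) = (1 - \sum_i x_i^{1+z})/z$, which matches the estimator in the final algorithm of this talk; note that its limit as z tends to 0 is the Shannon entropy measured in nats (natural logarithm), so divide by ln 2 to get bits.

```python
import math

def shannon_entropy_nats(a):
    """Shannon entropy of a probability distribution a, natural logarithm."""
    return sum(q * math.log(1 / q) for q in a if q > 0)

def tsallis_entropy(a, z):
    """Tsallis entropy S_{1+z}(a) = (1 - sum_i a_i^(1+z)) / z, for z != 0."""
    if z == 0:
        raise ValueError("z = 0 is the Shannon limit; use shannon_entropy_nats")
    return (1 - sum(q ** (1 + z) for q in a if q > 0)) / z

if __name__ == "__main__":
    a = [0.5, 0.25, 0.125, 0.125]
    print("Shannon (nats):", shannon_entropy_nats(a))
    for z in (-1e-1, -1e-2, -1e-3, -1e-4):
        print(z, tsallis_entropy(a, z))
```

Negative z is used in the demo because the interpolation interval [a, b] on this slide lies to the left of 0.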
15
Key Ingredient: Noisy Interpolation

We don't have $f(z_i)$, we have $f(z_i) \pm \varepsilon$
How to interpolate in presence of noise?
Idea: we pick our $z_i$ very carefully

16
Chebyshev Polynomials

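Rogosinski's Theorem:
If q(x) has degree k and $|q(\beta_j)| \le 1$ $(0 \le j \le k)$, where $\beta_j = \cos(j\pi/k)$,
then $|q(x)| \le |T_k(x)|$ for $|x| > 1$
Map $[-1, 1]$ onto interpolation interval $[z_0, z_k]$
Choose $z_j$ to be image of $\beta_j$, $j = 0, \ldots, k$
Let $\tilde{q}(z)$ interpolate $f(z_j) \pm \varepsilon$ and $q(z)$ interpolate $f(z_j)$
$r(z) = (\tilde{q}(z) - q(z))/\varepsilon$ satisfies Rogosinski's conditions!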
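
The noise argument can be explored numerically. The sketch below takes the nodes to be the Chebyshev extrema $\beta_j = \cos(j\pi/k)$ (an assumption consistent with the cosine in the final algorithm), interpolates random noise values in [-1, 1] at those nodes, and compares the result outside [-1, 1] with $T_k$. The function names and the random-trial setup are illustrative only.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def chebyshev_extrema(k):
    """Nodes beta_j = cos(j*pi/k), j = 0..k."""
    return np.cos(np.arange(k + 1) * np.pi / k)

def extrapolate(values, k, x_out):
    """Evaluate at x_out the degree-k polynomial taking `values` at the extrema."""
    coef = C.chebfit(chebyshev_extrema(k), values, k)
    return float(C.chebval(x_out, coef))

def t_k(k, x):
    """Chebyshev polynomial T_k(x) for x >= 1, via the cosh formula."""
    return float(np.cosh(k * np.arccosh(x)))

def worst_noise_amplification(k, x_out, trials=200, seed=0):
    """Largest |r(x_out)| over random noise polynomials r with |r(beta_j)| <= 1."""
    rng = np.random.default_rng(seed)
    return max(abs(extrapolate(rng.uniform(-1, 1, k + 1), k, x_out))
               for _ in range(trials))

if __name__ == "__main__":
    k, x_out = 8, 1 + 1 / 8**2
    print(worst_noise_amplification(k, x_out), t_k(k, x_out))
```

The alternating noise pattern $(-1)^j$ is the extremal case: it reproduces $T_k$ itself, since $T_k(\cos(j\pi/k)) = (-1)^j$.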

17
Tradeoff in Choosing zk
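$T_k$ grows quickly once leaving $[z_0, z_k]$

$z_k$ close to 0 $\Rightarrow$ $|T_k(\mathrm{preimage}(0))|$ still small
but $z_k$ close to 0 $\Rightarrow$ high space complexity
Just how close do we need 0 and $z_k$ to be?

[Figure: number line marking $z_0$, $z_k$ and 0]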
18
The Magic of Chebyshev

[Paturi 92]: $T_k(1 + 1/k^c) \le e^{4k^{1-(c/2)}}$. Set c = 2.
Suffices to set $z_k = -O(1/(k^3 \log m))$
Translates to $\tilde{O}(\varepsilon^{-2} \log^3 m)$ space
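
A numerical sanity check of the Chebyshev growth bound, reading the garbled formula as $T_k(1 + 1/k^c) \le e^{4k^{1-(c/2)}}$ (that reading is an assumption). It uses the identity $T_k(x) = \cosh(k \cdot \mathrm{arccosh}(x))$ for $x \ge 1$; the helper names are illustrative.

```python
import math

def cheb_T(k, x):
    """Chebyshev polynomial of the first kind T_k(x), valid for x >= 1."""
    return math.cosh(k * math.acosh(x))

def paturi_bound(k, c):
    """Right-hand side e^(4 * k^(1 - c/2)) of the bound as read from the slide."""
    return math.exp(4 * k ** (1 - c / 2))

if __name__ == "__main__":
    for k in (2, 5, 10, 50):
        # With c = 2 the bound is the constant e^4, independent of k.
        print(k, cheb_T(k, 1 + 1 / k**2), paturi_bound(k, 2))
```

With c = 2, stepping a distance of $1/k^2$ outside [-1, 1] amplifies the noise by at most a constant factor.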

19
The Final Algorithm (additive approximation)

Set $k = \lg(1/\varepsilon) + \lg\lg(m)$,
$z_j = (k^2 \cos(j\pi/k) - (k^2+1))/(9k^3 \lg(m))$ $(0 \le j \le k)$
Estimate $S_{1+z_j} = (1 - (F_{1+z_j}/(F_1)^{1+z_j}))/z_j$ for $0 \le j \le k$
Interpolate degree-k polynomial $q(z_j) = S_{1+z_j}$
Output q(0)
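
The steps above can be sketched offline as follows. This is not a streaming implementation: exact moments computed from the full vector stand in for the sketch-based estimates of $F_{1+z_j}$, and the formulas are read as $k = \lg(1/\varepsilon) + \lg\lg m$ and $z_j = (k^2\cos(j\pi/k) - (k^2+1))/(9k^3 \lg m)$, which is an assumption about the garbled slide text. Rounding k up, the floor of 2 on k, and the function name are choices made for this illustration. The output approximates Shannon entropy in nats.

```python
import math
import numpy as np
from numpy.polynomial import chebyshev as C

def entropy_by_interpolation(x, eps):
    """Estimate Shannon entropy (nats) of frequency vector x by interpolating
    Tsallis entropies S_{1+z_j} and evaluating the interpolant at z = 0."""
    x = np.asarray(x, dtype=float)
    F1 = x.sum()                      # stream length m (increments only)
    lg_m = math.log2(F1)
    k = max(2, math.ceil(math.log2(1 / eps) + math.log2(lg_m)))
    beta = np.cos(np.arange(k + 1) * math.pi / k)        # Chebyshev extrema
    z = (k**2 * beta - (k**2 + 1)) / (9 * k**3 * lg_m)   # points z_j < 0
    # Exact moments here; a streaming algorithm would use (1 +/- eps) estimates.
    S = np.array([(1 - np.sum(x ** (1 + zj)) / F1 ** (1 + zj)) / zj for zj in z])
    # Interpolate in the beta variable, where z_j is the image of beta_j.
    coef = C.chebfit(beta, S, k)
    beta_at_zero = 1 + 1 / k**2       # preimage of z = 0 under the affine map
    return float(C.chebval(beta_at_zero, coef))

if __name__ == "__main__":
    x = [500, 250, 125, 125]
    exact = sum((c / 1000) * math.log(1000 / c) for c in x)
    print(entropy_by_interpolation(x, 0.1), exact)
```

Working in the $\beta$ variable rather than in z keeps the fit well conditioned, and it makes the evaluation point $1 + 1/k^2$ explicit, which is the point where the Chebyshev growth bound with c = 2 applies.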

20
Multiplicative Approximation

How to get multiplicative approximation?
Additive approximation is multiplicative, unless H(x) is small
H(x) small large CCM 07
Suppose and define
We combine $(1 \pm \varepsilon) RF_1$ and $(1 \pm \varepsilon) RF_{1+z_j}$ to get $(1 \pm \varepsilon) f(z_j)$
Question: How do we get $(1 \pm \varepsilon) RF_p$?
Two different approaches:
A general approach (for any p, and negative frequencies)
An approach exploiting $p \approx 1$, only for nonnegative freqs (better by log(m))

21
Questions / Thoughts

For what other problems can we use this "generalize-then-interpolate" strategy?
Some non-streaming problems too?
The power of moments?
The power of residual moments? CountMin (CM 05), CountSketch (CCF 02) $\to$ HSS (Ganguly et al.)
WANTED: Faster moment estimation (some progress in Cormode-Ganguly 07)
