Blind online optimization: Gradient descent without a gradient

Provided by: tauman
1
Blind online optimization: Gradient descent
without a gradient
  • Abie Flaxman CMU
  • Adam Tauman Kalai TTI
  • Brendan McMahan CMU

2
Standard convex optimization
  • Convex feasible set $S \subseteq \mathbb{R}^d$
  • Concave function $f : S \to \mathbb{R}$

3
Steepest ascent
  • Move in the direction of steepest ascent
  • Compute $f'(x)$ ($\nabla f(x)$ in higher dimensions)
  • Works for convex optimization
  • (and many other problems)

[Figure: successive iterates x1, x2, x3, x4]
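
The steepest-ascent step on this slide can be sketched in a few lines of Python. This is a minimal one-dimensional illustration only: the profit function, step size, and iteration count are hypothetical choices, not taken from the talk.

```python
def steepest_ascent(grad, x0, step=0.1, iters=100):
    """Repeatedly move in the direction of the derivative (1-D case)."""
    x = x0
    for _ in range(iters):
        x = x + step * grad(x)
    return x

# Hypothetical concave profit f(x) = -(x - 3)^2, so f'(x) = -2 * (x - 3).
x_best = steepest_ascent(lambda x: -2.0 * (x - 3.0), x0=0.0)
print(x_best)  # close to the maximizer x = 3
```

Note that this version needs the derivative at every step, which is exactly what the blind setting later in the talk does not provide.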
4
Typical application
  • Company produces certain numbers of cars per
    month
  • Vector $x \in \mathbb{R}^d$ (#Corollas, #Camrys, ...)
  • Profit of company is concave function of
    production vector
  • Maximize total (equivalently, average) profit

PROBLEMS
5
Problem definition and results
  • Sequence of unknown concave functions $f_1, f_2, \ldots$
  • In period $t$: pick $x_t \in S$, find out only $f_t(x_t)$
  • $S$ is convex

Theorem
6
Online model
expected regret
  • Holds for arbitrary sequences
  • Stronger than stochastic model
  • $f_1, f_2, \ldots$ i.i.d. from $D$
  • $x^* = \arg\max_{x \in S} E_D[f(x)]$
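
The regret formula on this slide did not survive extraction. For orientation, one standard way to write expected regret in a profit-maximization setting is given below. The horizon $T$ and the exact form are assumptions, and the talk's own statement may differ in constants or normalization.

```latex
\mathbb{E}[\text{regret}]
  \;=\; \max_{x \in S} \sum_{t=1}^{T} f_t(x)
  \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} f_t(x_t)\right]
```

The comparison is against the best single point in hindsight, which is why a bound of this kind can hold for arbitrary sequences and not only for i.i.d. ones.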

7
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

8
First try
[Zinkevich 03] If we could only compute gradients
[Figure: PROFIT versus CAMRYS; curves f1, f2, f3, f4 with the points f1(x1), f2(x2), f3(x3), f4(x4) at the iterates x1, x2, x3, x4]
9
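Idea: one-point gradient
With probability ½, estimate = $f(x + \delta)/\delta$
With probability ½, estimate = $-f(x - \delta)/\delta$
$E[\text{estimate}] \approx f'(x)$
[Figure: PROFIT versus CAMRYS, with the points $x - \delta$, $x$, and $x + \delta$ marked]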
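
A small Python sketch of the one-point idea, assuming the two equally likely estimates are $f(x+\delta)/\delta$ and $-f(x-\delta)/\delta$. The profit function and the value of `delta` are hypothetical. Averaging the two outcomes gives the central difference $(f(x+\delta) - f(x-\delta))/(2\delta)$, which is why the expectation approximates $f'(x)$.

```python
import random

def one_point_estimate(f, x, delta, rng=random):
    """Return a derivative estimate built from a single evaluation of f."""
    if rng.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

def expected_estimate(f, x, delta):
    """Exact average of the two equally likely outcomes above."""
    return 0.5 * (f(x + delta) / delta) + 0.5 * (-f(x - delta) / delta)

# Hypothetical concave profit with f'(x) = -2 * (x - 3).
f = lambda x: -(x - 3.0) ** 2
print(expected_estimate(f, 1.0, 0.01))  # approximately f'(1) = 4
```

Each individual estimate has magnitude about $|f(x)|/\delta$, so a single sample is very noisy. Only the average carries the derivative information.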
10
d-dimensional online algorithm
[Figure: iterates x1, x2, x3, x4 inside the feasible set S]
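
The slide shows the algorithm only as a picture. Below is a hedged sketch of a d-dimensional version. It assumes the feasible set is the unit ball and uses the estimate $(d/\delta)\, f_t(x_t + \delta u)\, u$ with $u$ uniform on the unit sphere. The function names, step size `eta`, and example profit are all hypothetical.

```python
import math
import random

def random_unit_vector(d, rng):
    """Uniform direction on the unit sphere (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def project_to_ball(x, radius):
    """Euclidean projection onto the centered ball of the given radius."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return x
    return [c * radius / norm for c in x]

def blind_gradient_ascent(evaluate, d, T, eta, delta, seed=0):
    """Gradient ascent from evaluations only, over the unit ball (assumed S).

    evaluate(t, y) returns just the number f_t(y); no gradient is requested.
    Returns the final center x and the average value obtained.
    """
    rng = random.Random(seed)
    x = [0.0] * d
    total = 0.0
    for t in range(T):
        u = random_unit_vector(d, rng)
        y = [xi + delta * ui for xi, ui in zip(x, u)]  # point actually played
        value = evaluate(t, y)
        total += value
        # One-point gradient estimate: (d / delta) * f_t(y) * u.
        g = [(d / delta) * value * ui for ui in u]
        x = [xi + eta * gi for xi, gi in zip(x, g)]
        # Keep x in a shrunken ball so that x + delta * u stays inside S.
        x = project_to_ball(x, 1.0 - delta)
    return x, total / T

# Hypothetical fixed concave profit peaked at (0.5, 0).
profit = lambda t, y: 1.0 - (y[0] - 0.5) ** 2 - y[1] ** 2
x_final, avg_profit = blind_gradient_ascent(profit, d=2, T=500, eta=0.01, delta=0.2)
```

Projecting onto the shrunken ball of radius `1 - delta` is a design choice that guarantees every played point is feasible. A general convex set would need its own projection routine.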
11
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

12
Analysis ingredients
  • E[1-point estimate] is the gradient of a smoothed version of $f$
  • The difference between the smoothed function and $f$ is small
  • Online gradient ascent analysis [Z03]
  • Online expected gradient ascent analysis
  • (Hidden complications)

13
1-pt gradient analysis
[Figure: PROFIT versus CAMRYS, with the points $x - \delta$ and $x + \delta$ marked]
14
1-pt gradient analysis (d-dim)
  • E[1-point estimate] is the gradient of a smoothed version of $f$
  • The difference between the smoothed function and $f$ is small
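
The formulas on this slide were lost in extraction. The following is one standard way to state the two ingredients. The notation is an assumption: $\hat f$ is the smoothed function, $\mathbb{B}$ the unit ball, $\partial\mathbb{B}$ the unit sphere, and $L$ a Lipschitz constant of $f$.

```latex
\hat f(x) = \mathbb{E}_{v \in \mathbb{B}}\big[f(x + \delta v)\big],
\qquad
\nabla \hat f(x) = \frac{d}{\delta}\,
  \mathbb{E}_{u \in \partial\mathbb{B}}\big[f(x + \delta u)\,u\big],
\qquad
\big|\hat f(x) - f(x)\big| \le L\,\delta .
```

The middle identity says the one-point estimate $(d/\delta)\,f(x+\delta u)\,u$ is an unbiased gradient of $\hat f$, and the last inequality says that optimizing $\hat f$ loses at most $L\delta$ per period relative to $f$.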

15
Online gradient ascent [Z03]

(concave, bounded gradient)
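
As a hedged illustration of the sighted online setting, here is projected online gradient ascent on an interval. The profit sequence is hypothetical. The constants `D` (diameter) and `G` (gradient bound) and the step size `D / (G * sqrt(T))` follow the standard analysis, under which regret is at most `D * G * sqrt(T)`.

```python
import math

def online_gradient_ascent(grads, T, eta, lo=-1.0, hi=1.0, x0=0.0):
    """Projected online gradient ascent on the interval [lo, hi].

    grads(t, x) returns f_t'(x); the list of points played is returned.
    """
    x, played = x0, []
    for t in range(T):
        played.append(x)
        x = min(hi, max(lo, x + eta * grads(t, x)))  # step, then project
    return played

# Hypothetical concave profits f_t(x) = -(x - c_t)^2 with alternating peaks.
T = 400
peaks = [0.8 if t % 2 == 0 else -0.2 for t in range(T)]
f = lambda t, x: -(x - peaks[t]) ** 2
grad = lambda t, x: -2.0 * (x - peaks[t])

D, G = 2.0, 4.0  # diameter of [-1, 1]; |f_t'(x)| <= 4 on that interval
eta = D / (G * math.sqrt(T))
xs = online_gradient_ascent(grad, T, eta)

best_fixed = sum(peaks) / T  # maximizer of the summed quadratics
regret = sum(f(t, best_fixed) for t in range(T)) - sum(f(t, xs[t]) for t in range(T))
print(regret, D * G * math.sqrt(T))
```

Here the learner sees the full derivative each period. The blind algorithm replaces `grads` with a one-point estimate built from a single evaluation.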
16
Expected gradient ascent analysis
  • Regular deterministic gradient ascent on $g_t$

(concave, bounded gradient)
17
Adaptive adversary
18
Hidden complication
[Figure: feasible set S]
19
Hidden complication
[Figure: feasible set S]
20
Hidden complication
[Figure: feasible set S]
21
Hidden complication
  • Thin sets are bad

[Figure: feasible set S]
22
Hidden complication
  • Round sets are good

Reshape into isotropic position [LV03]
23
Outline
  • Problem definition
  • Simple algorithm
  • Analysis sketch
  • Variations
  • Related work and applications

24
Variations
[Formula with terms labeled: diameter, gradient bound]
  • Works against an adaptive adversary
  • Chooses $f_t$ knowing $x_1, x_2, \ldots, x_{t-1}$
  • Also works if we only get a noisy estimate $h_t(x_t)$ of
    $f_t(x_t)$, i.e. $E[h_t(x_t) \mid x_t] = f_t(x_t)$

25
Related convex optimization
Setting | Sighted (see entire function(s)) | Blind (evaluations only)
Regular (single f) | Gradient descent, ... | Ellipsoid, Random walk [BV02], Sim. annealing [KV05], Finite difference
Stochastic (dist. over f's or dist. over errors) | Gradient descent (stoch.) | 1-pt. gradient appx. [G89, S97], Finite difference
Online (f1, f2, f3, ...) | Gradient descent (online) [Z03] | 1-pt. gradient appx. [BKM04], Finite difference [Kleinberg04]
26
Related discrete optimization
Linear function(s) over discrete set | Sighted (see entire function(s)) | Blind, aka bandit (evaluations only)
Regular (single f) | Shortest path, max, ... |
Stochastic (dist. over f's) | Huffman trees, ... |
Online (f1, f2, f3, ...) | Weighted majority, Online linear optimization [Hannan57, KV03] | Adversarial bandits, Blind linear optimization [AK04, MB04] (adaptive adversary)

27
Switching lanes (experts)
[Figure: grid of numeric values, one per lane per period]
28
Multi-armed bandit (experts)
[Figure: grid of numeric values, one per slot machine per period]
[R52, ACFS95, ...]
29
Driving to work (online routing)
[TW02, KV02, AK04, BM04]
  • Exponentially many paths
  • Exponentially many slot machines?
  • Finite dimensions
  • Exploration/exploitation tradeoff
30
Online product design
31
High dimensions
One-dimensional problem: easy
  • Discretize; special case of the multi-armed bandit problem
  • $1/\delta$ slot machines
  • No need for convexity

d-dimensional problem: harder
  • Discretizing at $\delta$ granularity
  • Exponentially many ($1/\delta^d$) slot machines, hence exponential regret
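
To make the counting argument on this slide concrete, here is a tiny Python sketch. The function name and the choice of 10 grid values per axis are hypothetical. It counts how many slot machines a grid discretization creates as the dimension grows.

```python
def num_slot_machines(points_per_axis, d):
    """Grid points when each of d axes is cut into points_per_axis values."""
    return points_per_axis ** d

# With 10 values per axis, the number of arms grows as 10^d.
for d in (1, 2, 10):
    print(d, num_slot_machines(10, d))
```

A bandit algorithm that treats each grid point as a separate arm has to try each arm at least once, so its cost scales with this count.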
32
Non-linear applications
33
Conclusions and future work
  • Can learn to optimize a sequence of unrelated
    functions from evaluations
  • Answer to "What is the sound of one hand
    clapping?"
  • Applications
      • Cholesterol
      • Paper airplanes
      • Advertising
  • Future work
      • Many players using same algorithm (game theory)