Approximation Algorithms

Slides: 60

Provided by: kevin589

Transcript and Presenter's Notes

1
Approximation Algorithms

Load Balancing
k-center selection
Pricing Method
Vertex Cover
Set Cover
Bin Packing
TSP

2
Approximation Algorithms

Q. Suppose I need to solve an NP-hard problem.
What should I do?
A. Theory says you're unlikely to find a
poly-time algorithm.
Must sacrifice one of three desired features.
Solve problem to optimality.
Solve problem in poly-time.
Solve arbitrary instances of the problem.
$\rho$-approximation algorithm.
Guaranteed to run in poly-time.
Guaranteed to solve arbitrary instance of the
problem.
Guaranteed to find solution within ratio $\rho$ of
true optimum.
Challenge. Need to prove a solution's value is
close to optimum, without even knowing what
optimum value is!

3
11.1 Load Balancing
4
Load Balancing

Input. $m$ identical machines; $n$ jobs, job $j$ has
processing time $t_j$.
Job $j$ must run contiguously on one machine.
A machine can process at most one job at a time.
Def. Let $J(i)$ be the subset of jobs assigned to
machine $i$. The load of machine $i$ is
$L_i = \sum_{j \in J(i)} t_j$.
Def. The makespan is the maximum load on any
machine: $L = \max_i L_i$.
Load balancing. Assign each job to a machine to
minimize makespan.

5
Load Balancing List Scheduling

List-scheduling algorithm.
Consider n jobs in some fixed order.
Assign job j to machine whose load is smallest so
far.
Implementation. O(n log n) using a priority
queue.

List-Scheduling(m, n, t_1, t_2, ..., t_n)
    for i = 1 to m
        L_i ← 0                    (load on machine i)
        J(i) ← ∅                   (jobs assigned to machine i)
    for j = 1 to n
        i = argmin_k L_k           (machine i has smallest load)
        J(i) ← J(i) ∪ {j}          (assign job j to machine i)
        L_i ← L_i + t_j            (update load of machine i)
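The list-scheduling pseudocode can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `list_scheduling`, the 0-based machine and job indices, and the rule that ties go to the lowest-numbered machine are assumptions made here. A binary heap plays the role of the priority queue.

```python
import heapq

def list_scheduling(m, times):
    """Assign each job, in the given order, to the least-loaded machine.

    Returns (assignment, loads), where assignment[i] lists the job indices
    on machine i and loads[i] is that machine's total processing time.
    """
    # Heap entries are (load, machine); ties go to the lowest machine index.
    heap = [(0, i) for i in range(m)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)        # machine with the smallest load
        assignment[i].append(j)              # assign job j to machine i
        heapq.heappush(heap, (load + t, i))  # update load of machine i
    loads = [0] * m
    for load, i in heap:
        loads[i] = load
    return assignment, loads

# Hypothetical example: 3 machines, 6 jobs.
assignment, loads = list_scheduling(3, [2, 3, 4, 6, 2, 2])
print(loads, max(loads))
```

Each job costs one pop and one push, so this sketch does O(n log m) heap work after O(m) setup.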
6
Load Balancing List Scheduling Analysis

Theorem. [Graham, 1966] Greedy algorithm is a
2-approximation.
First worst-case analysis of an approximation
algorithm.
Need to compare resulting solution with optimal
makespan $L^*$.
Lemma 1. The optimal makespan $L^* \ge \max_j t_j$.
Pf. Some machine must process the most
time-consuming job. ∎
Lemma 2. The optimal makespan $L^* \ge \frac{1}{m} \sum_j t_j$.
Pf.
The total processing time is $\sum_j t_j$.
One of $m$ machines must do at least a $1/m$ fraction
of total work. ∎

7
Load Balancing List Scheduling Analysis

Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load $L_i$ of bottleneck machine $i$.
Let $j$ be last job scheduled on machine $i$.
When job $j$ assigned to machine $i$, $i$ had smallest
load. Its load before assignment is $L_i - t_j$, so
$L_i - t_j \le L_k$ for all $1 \le k \le m$.

[Figure: timeline of machine $i$ from 0 to $L = L_i$; the jobs scheduled before $j$ (blue) end at $L_i - t_j$, followed by job $j$.]
8
Load Balancing List Scheduling Analysis

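Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider load $L_i$ of bottleneck machine $i$.
Let $j$ be last job scheduled on machine $i$.
When job $j$ assigned to machine $i$, $i$ had smallest
load. Its load before assignment is $L_i - t_j$, so
$L_i - t_j \le L_k$ for all $1 \le k \le m$.
Sum inequalities over all $k$ and divide by $m$
(the second sum runs over jobs, not machines):
$L_i - t_j \le \frac{1}{m} \sum_k L_k = \frac{1}{m} \sum_{j'} t_{j'} \le L^*$ (Lemma 2).
Now $L_i = (L_i - t_j) + t_j \le L^* + L^* = 2L^*$, using
Lemma 2 for the first term and Lemma 1 for the second. ∎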
9
Load Balancing List Scheduling Analysis

Q. Is our analysis tight?
A. Essentially yes.
Ex: $m$ machines, $m(m-1)$ jobs of length 1, one
job of length $m$ (last in the list).

[Figure: $m = 10$; machines 2 through 10 are marked idle; list scheduling makespan = 19.]
10
Load Balancing List Scheduling Analysis

Q. Is our analysis tight?
A. Essentially yes.
Ex: $m$ machines, $m(m-1)$ jobs of length 1, one
job of length $m$.

[Figure: $m = 10$; optimal makespan = 10.]
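For general $m$, the same instance gives the ratio below. This is a short worked calculation, assuming the length-$m$ job comes last in the list so that it lands on a machine already carrying $m - 1$ unit jobs.

```latex
\[
\frac{L_{\text{list}}}{L^*}
  = \frac{(m-1) + m}{m}
  = 2 - \frac{1}{m},
\qquad
m = 10:\quad \frac{19}{10} = 1.9 .
\]
```

The ratio approaches 2 as $m$ grows, which is why the analysis is essentially tight.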
11
Load Balancing LPT Rule

Longest processing time (LPT). Sort n jobs in
descending order of processing time, and then run
list scheduling algorithm.

LPT-List-Scheduling(m, n, t_1, t_2, ..., t_n)
    Sort jobs so that t_1 ≥ t_2 ≥ ... ≥ t_n
    for i = 1 to m
        L_i ← 0                    (load on machine i)
        J(i) ← ∅                   (jobs assigned to machine i)
    for j = 1 to n
        i = argmin_k L_k           (machine i has smallest load)
        J(i) ← J(i) ∪ {j}          (assign job j to machine i)
        L_i ← L_i + t_j            (update load of machine i)
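A minimal Python sketch of the LPT rule follows. The function name `lpt_scheduling` and the tie-breaking toward the lowest-numbered machine are assumptions for illustration. The only change from plain list scheduling is the sort in descending order of processing time.

```python
import heapq

def lpt_scheduling(m, times):
    """Longest-processing-time rule: sort jobs descending, then list-schedule.

    Returns (assignment, loads); assignment[i] holds original job indices.
    """
    # Job indices ordered by decreasing processing time (stable for ties).
    order = sorted(range(len(times)), key=lambda j: times[j], reverse=True)
    heap = [(0, i) for i in range(m)]   # (load, machine)
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j in order:
        load, i = heapq.heappop(heap)            # least-loaded machine
        assignment[i].append(j)
        heapq.heappush(heap, (load + times[j], i))
    loads = [0] * m
    for load, i in heap:
        loads[i] = load
    return assignment, loads

# Hypothetical example: the instance that hurts plain list scheduling.
_, loads = lpt_scheduling(10, [1] * 90 + [10])
print(max(loads))
```

On that instance the long job is placed first, so the unit jobs fill in around it.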
12
Load Balancing LPT Rule

Observation. If at most $m$ jobs, then
list-scheduling is optimal.
Pf. Each job put on its own machine. ∎
Lemma 3. If there are more than $m$ jobs, $L^* \ge 2 t_{m+1}$.
Pf.
Consider first $m+1$ jobs $t_1, \ldots, t_{m+1}$.
Since the $t_i$'s are in descending order, each
takes at least $t_{m+1}$ time.
There are $m+1$ jobs and $m$ machines, so by
pigeonhole principle, at least one machine gets
two jobs. ∎
Theorem. LPT rule is a 3/2-approximation
algorithm.
Pf. Same basic approach as for list scheduling.
By the observation, can assume number of jobs $> m$.
If the bottleneck machine $i$ holds only its last job $j$, then $L_i = t_j \le L^*$.
Otherwise $t_j \le t_{m+1} \le \frac{1}{2} L^*$ (Lemma 3), so
$L_i = (L_i - t_j) + t_j \le L^* + \frac{1}{2} L^* = \frac{3}{2} L^*$. ∎
13
Load Balancing LPT Rule

Q. Is our 3/2 analysis tight?
A. No.
Theorem. [Graham, 1969] LPT rule is a
4/3-approximation.
Pf. More sophisticated analysis of same
algorithm.
Q. Is Graham's 4/3 analysis tight?
A. Essentially yes.
Ex: $m$ machines, $n = 2m+1$ jobs: 2 jobs each of length
$m+1, m+2, \ldots, 2m-1$ and three jobs of length $m$.

14
11.2 Center Selection
15
Center Selection Problem

Input. Set of $n$ sites $s_1, \ldots, s_n$.
Center selection problem. Select $k$ centers $C$ so
that maximum distance from a site to nearest
center is minimized.

[Figure: sites in the plane, $k = 4$.]
16
Center Selection Problem

Input. Set of $n$ sites $s_1, \ldots, s_n$.
Center selection problem. Select $k$ centers $C$ so
that maximum distance from a site to nearest
center is minimized.
Notation.
dist(x, y) = distance between x and y.
$\mathrm{dist}(s_i, C) = \min_{c \in C} \mathrm{dist}(s_i, c)$ = distance
from $s_i$ to closest center.
$r(C) = \max_i \mathrm{dist}(s_i, C)$ = smallest covering
radius.
Goal. Find set of centers $C$ that minimizes $r(C)$,
subject to $|C| = k$.
Distance function properties.
dist(x, x) = 0 (identity)
dist(x, y) = dist(y, x) (symmetry)
dist(x, y) ≤ dist(x, z) + dist(z, y) (triangle
inequality)

17
Center Selection Example

Ex: each site is a point in the plane, a center
can be any point in the plane, dist(x, y) =
Euclidean distance.
Remark: search can be infinite!

[Figure: sites and centers in the plane, with covering radius $r(C)$.]
18
Greedy Algorithm A False Start

Greedy algorithm. Put the first center at the
best possible location for a single center, and
then keep adding centers so as to reduce the
covering radius each time by as much as possible.
Remark: arbitrarily bad!

[Figure: example with $k = 2$ centers, showing the sites and the location of greedy center 1.]
19
Center Selection Greedy Algorithm

Greedy algorithm. Repeatedly choose the next
center to be the site farthest from any existing
center.
Observation. Upon termination all centers in C
are pairwise at least r(C) apart.
Pf. By construction of algorithm.

Greedy-Center-Selection(k, n, s_1, s_2, ..., s_n)
    C = ∅
    repeat k times
        Select a site s_i with maximum dist(s_i, C)    (site farthest from any center)
        Add s_i to C
    return C
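The greedy rule can be sketched in Python as below, assuming sites are coordinate tuples and the distance is Euclidean (`math.dist`, Python 3.8+). The name `greedy_center_selection` is hypothetical. Because dist(s, C) is undefined while C is empty, this sketch takes the first site as the first center, which is an assumption; any site would do.

```python
import math

def greedy_center_selection(sites, k, dist=math.dist):
    """Pick k centers among the sites by the farthest-first rule.

    Returns (centers, radius), where radius is the covering radius r(C).
    """
    centers = [sites[0]]                        # arbitrary first center (assumption)
    # nearest[i] = distance from site i to its closest chosen center
    nearest = [dist(s, centers[0]) for s in sites]
    while len(centers) < k:
        far = max(range(len(sites)), key=lambda i: nearest[i])
        centers.append(sites[far])              # site farthest from any center
        nearest = [min(nearest[i], dist(sites[i], sites[far]))
                   for i in range(len(sites))]
    return centers, max(nearest)

# Hypothetical example: two clusters on a line, k = 2.
centers, radius = greedy_center_selection([(0, 0), (1, 0), (10, 0), (11, 0)], 2)
print(centers, radius)
```

Keeping the `nearest` array up to date makes each round O(n), so the whole run is O(nk) distance evaluations.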
20
Center Selection Analysis of Greedy Algorithm

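Theorem. Let $C^*$ be an optimal set of centers.
Then $r(C) \le 2r(C^*)$.
Pf. (by contradiction) Assume $r(C^*) < \frac{1}{2} r(C)$.
For each site $c_i$ in $C$, consider ball of radius $\frac{1}{2} r(C)$ around it.
Exactly one $c_i^*$ in each ball; let $c_i$ be the site
paired with $c_i^*$.
Consider any site $s$ and its closest center $c_i^*$ in $C^*$.
$\mathrm{dist}(s, C) \le \mathrm{dist}(s, c_i) \le \mathrm{dist}(s, c_i^*) + \mathrm{dist}(c_i^*, c_i) \le 2r(C^*)$.
The middle step is the triangle inequality; each of the two terms is $\le r(C^*)$ since $c_i^*$ is the closest center.
Thus $r(C) \le 2r(C^*)$. ∎

[Figure: balls of radius $\frac{1}{2} r(C)$ around the centers $c_i$ of $C$, each containing one optimal center $c_i^*$ of $C^*$, together with the sites and a site $s$.]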
21
Center Selection

Theorem. Let $C^*$ be an optimal set of centers.
Then $r(C) \le 2r(C^*)$.
Theorem. Greedy algorithm is a 2-approximation
for center selection problem.
Remark. Greedy algorithm always places centers
at sites, but is still within a factor of 2 of
best solution that is allowed to place centers
anywhere (e.g., points in the plane).
Question. Is there hope of a 3/2-approximation?
4/3?
Theorem. Unless P = NP, there is no $\rho$-approximation
for center-selection problem for any $\rho < 2$.
22
11.4 The Pricing Method: Vertex Cover
23
Weighted Vertex Cover

Weighted vertex cover. Given a graph G with
vertex weights, find a vertex cover of minimum
weight.

[Figure: the same graph with vertex weights 4, 2, 9, 2 shown twice; one vertex cover has weight 9, the other weight 2 + 2 + 4.]
24
Weighted Vertex Cover

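Pricing method. Each edge must be covered by
some vertex $i$. Edge $e$ pays price $p_e \ge 0$ to use
vertex $i$.
Fairness. Edges incident to vertex $i$ should pay
$\le w_i$ in total: $\sum_{e = (i,j)} p_e \le w_i$ for each vertex $i$.
Claim. For any vertex cover $S$ and any fair
prices $p_e$: $\sum_e p_e \le w(S)$.
Proof.
$\sum_{e \in E} p_e \le \sum_{i \in S} \sum_{e = (i,j)} p_e \le \sum_{i \in S} w_i = w(S)$. ∎
First inequality: each edge $e$ is covered by at least one node in $S$.
Second inequality: sum the fairness inequalities for each node in $S$.

[Figure: example graph with vertex weights 4, 2, 9, 2.]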
25
Pricing Method

Pricing method. Set prices and find vertex cover
simultaneously.

Weighted-Vertex-Cover-Approx(G, w)
    foreach e in E
        p_e = 0
    while (∃ edge i-j such that neither i nor j are tight)
        select such an edge e
        increase p_e without violating fairness
    S ← set of all tight nodes
    return S
(A node $i$ is tight when the prices of its incident edges sum to exactly $w_i$.)
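One way to realize the pricing loop in Python is sketched below, under stated assumptions: weights are positive integers given as a dict, edges are pairs, and each edge is visited once in list order and raised as far as fairness allows. The name `pricing_vertex_cover` is hypothetical. A single pass is enough in this sketch because raising an edge to its limit makes an endpoint tight, and a tight node stays tight.

```python
def pricing_vertex_cover(weights, edges):
    """Pricing-method sketch for weighted vertex cover.

    weights: dict mapping vertex -> weight; edges: list of (u, v) pairs.
    Returns (cover, price): the set of tight nodes and the edge prices.
    """
    paid = {v: 0 for v in weights}      # total price charged at each vertex
    price = {}
    for u, v in edges:
        slack_u = weights[u] - paid[u]
        slack_v = weights[v] - paid[v]
        # Raise p_e as far as fairness allows; this is 0 if an endpoint is tight.
        delta = min(slack_u, slack_v)
        price[(u, v)] = delta
        paid[u] += delta
        paid[v] += delta
    cover = {v for v in weights if paid[v] == weights[v]}   # tight nodes
    return cover, price

# Hypothetical example graph (not the one in Figure 11.8).
weights = {"a": 4, "b": 3, "c": 5, "d": 3}
edges = [("a", "b"), ("a", "d"), ("c", "d")]
cover, price = pricing_vertex_cover(weights, edges)
print(sorted(cover), price)
```

With floating-point weights the tightness test `paid[v] == weights[v]` would need a tolerance, which is why the sketch assumes integers.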
26
Pricing Method
[Figure 11.8: run of the pricing method on an example graph; labels mark the price of edge a-b and the vertex weights.]
27
Pricing Method Analysis

Theorem. Pricing method is a 2-approximation.
Pf.
Algorithm terminates since at least one new node
becomes tight after each iteration of while loop.
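Let $S$ = set of all tight nodes upon termination
of algorithm. $S$ is a vertex cover: if some edge
$i$-$j$ is uncovered, then neither $i$ nor $j$ is tight.
But then while loop would not have terminated.
Let $S^*$ be optimal vertex cover. We show $w(S) \le 2w(S^*)$:
$w(S) = \sum_{i \in S} w_i = \sum_{i \in S} \sum_{e = (i,j)} p_e \le \sum_{i \in V} \sum_{e = (i,j)} p_e = 2 \sum_{e \in E} p_e \le 2w(S^*)$. ∎
The steps use, in order: all nodes in $S$ are tight; $S \subseteq V$ and prices $\ge 0$; each edge counted twice; fairness lemma.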
28
Extra Slides
29
Load Balancing on 2 Machines

Claim. Load balancing is hard even if only 2
machines.
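Pf. NUMBER-PARTITIONING $\le_P$ LOAD-BALANCE
(NUMBER-PARTITIONING is NP-complete by Exercise 8.26).

[Figure: jobs a through g, each drawn with its length, split between two machines over the time axis from 0 to $L$: machine 1 runs a, d, f and machine 2 runs b, c, e, g (a "yes" instance).]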
30
Center Selection Hardness of Approximation

Theorem. Unless P = NP, there is no
$\rho$-approximation algorithm for metric k-center
problem for any $\rho < 2$.
Pf. We show how we could use a $(2 - \epsilon)$-approximation
algorithm for k-center to solve
DOMINATING-SET (see Exercise 8.29) in poly-time.
Let $G = (V, E)$, $k$ be an instance of
DOMINATING-SET.
Construct instance $G'$ of k-center with sites $V$
and distances
$d(u, v) = 2$ if $(u, v) \notin E$
$d(u, v) = 1$ if $(u, v) \in E$
Note that $G'$ satisfies the triangle inequality.
Claim: $G$ has dominating set of size $k$ iff there
exists $k$ centers $C^*$ with $r(C^*) = 1$.
Thus, if $G$ has a dominating set of size $k$, a $(2 - \epsilon)$-approximation
algorithm on $G'$ must find a
solution $C^*$ with $r(C^*) = 1$ since it cannot use
any edge of distance 2.
31
Vertex Cover Approximation

A vertex cover is a subset of vertices such that
every edge in the graph is incident to at least
one of these vertices.
The vertex cover optimization problem is to find a
vertex cover of minimum size.
For a good strategy, a heuristic is needed.

32
Vertex Cover

Consider an arbitrary edge (u, v) in the graph.
One of its two vertices must be in the cover, but
we do not know which one.
The idea of this heuristic is to simply put both
vertices into the vertex cover.
Then we remove all edges that are incident to u
and v (since they are now all covered), and
recurse on the remaining edges.
For every one vertex that must be in the cover,
we put two into our cover, so it is easy to see
that the cover we generate is at most twice the
size of the optimum cover.

33
Proof of approximation ratio

Claim: ApproxVC yields a factor-2 approximation.
Proof: Consider the set $C$ output by ApproxVC. Let
$C^*$ be the optimum VC. Let $A$ be the set of edges
selected by the line marked with (*) in the
figure. Observe that the size of $C$ is exactly
$2|A|$ because we add two vertices for each such
edge. However, note that in the optimum VC one of
these two vertices must have been added to the
VC, and the edges of $A$ share no endpoints, so the
size of $C^*$ is at least $|A|$. Thus
we have
$\frac{|C|}{2} = |A| \le |C^*|$.
Therefore
$\frac{|C|}{|C^*|} \le 2$.

34
Example
35
Approximate VC Algorithm Naive Approach

ApproxVC
    C = empty-set
    while (E is nonempty) do
(*)     let (u,v) be any edge of E
        add both u and v to C
        remove from E all edges incident to either u or v
    return C
Can we improve on it?
Why not consider vertices with higher degrees
first (greedy strategy)?
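The ApproxVC loop can be sketched in Python as follows. The name `approx_vertex_cover` is hypothetical, and "any edge" is taken to be the next edge in list order, which is an assumption. Skipping an edge that already has an endpoint in the cover has the same effect as removing covered edges from E.

```python
def approx_vertex_cover(edges):
    """2-for-1 heuristic: take both endpoints of each still-uncovered edge."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge (u, v) is still in E
            cover.update((u, v))                # add both u and v to C
    return cover

# Hypothetical example: the path 1 - 2 - 3 - 4.
print(sorted(approx_vertex_cover([(1, 2), (2, 3), (3, 4)])))
```

On this path the heuristic returns all four vertices while {2, 3} is optimal, so the factor of 2 is attained.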

36
Greedy VC

Greedy Approximation for VC: GreedyVC(G = (V, E))
    C = empty-set
    while (E is nonempty) do
        let u be the vertex of maximum degree in G
        add u to C
        remove from E all edges incident to u
    return C
For the example, it yields the optimum solution.
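A minimal Python sketch of GreedyVC follows. The name `greedy_vertex_cover` is hypothetical, and breaking degree ties toward the smallest vertex label is an assumption, so labels must be comparable. Degrees are recomputed on every round for clarity rather than speed.

```python
def greedy_vertex_cover(edges):
    """Repeatedly add a maximum-degree vertex and delete its incident edges."""
    remaining = {frozenset(e) for e in edges}
    cover = set()
    while remaining:
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # max() keeps the first maximum, so ties go to the smallest label.
        u = max(sorted(degree), key=degree.get)
        cover.add(u)
        remaining = {e for e in remaining if u not in e}
    return cover

# Hypothetical example: the path 1 - 2 - 3 - 4.
print(sorted(greedy_vertex_cover([(1, 2), (2, 3), (3, 4)])))
```

On the path this returns {2, 3}, which is optimal, whereas the 2-for-1 heuristic on the same edge list returns all four vertices.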

37
Greedy VC Example

Can we prove Greedy VC outperforms the other one?
No!
It can even perform worse than the other one.
However, it should also be pointed out that the
vertex cover constructed by the greedy heuristic
is (for typical graphs) smaller than that one
computed by the 2-for-1 heuristic, so it would
probably be wise to run both algorithms and take
the better of the two.

38
Third Attempt Use Matching

A matching is a subset of edges that have no
vertices in common
A matching is maximal if no more edges can be
added to it.
Maximal matchings will help us find good vertex
covers, and moreover, they are easy to generate:
repeatedly pick edges that are disjoint from the
ones chosen already, until this is no longer
possible.
Any vertex cover of a graph G must be at least as
large as the number of edges in any matching in
G; that is, any matching provides a lower bound
on OPT. This is simply because each edge of the
matching must be covered by one of its endpoints
in any vertex cover!

39
Example

Figure below shows how to convert from Maximal
Matching to Vertex Cover
a) A matching b) Completion to MaxMatch c)
Its VC

40
Vertex Cover from Matching

Let $S$ be a set that contains both endpoints of
each edge in a maximal matching $M$.
Then $S$ must be a vertex cover: if it isn't, that
is, if it doesn't touch some edge $e \in E$, then $M$
could not possibly be maximal since we could
still add $e$ to it. But our cover $S$ has $2|M|$
vertices.
We know that any vertex cover must have size at
least $|M|$.
Algorithm:
Find a maximal matching $M \subseteq E$.
Return $S$ = {all endpoints of edges in $M$}.

41
Vertex cover from Matching

This simple procedure always returns a vertex
cover whose size is at most twice optimal!
In summary, even though we have no way of finding
the best vertex cover, we can easily find another
structure, a maximal matching, with two key
properties:
1. Its size gives us a lower bound on the
optimal vertex cover.
2. It can be used to build a vertex cover,
whose size can be related to that of the optimal
cover using property 1.
Approximation ratio $\alpha \le 2$.

42
Set Cover Problem Revisited

Given a pair $(X, F)$ where $X = \{x_1, x_2, \ldots, x_m\}$ is a
finite set (a domain of elements) and $F = \{S_1, S_2, \ldots, S_n\}$
is a family of subsets of $X$, such
that every element of $X$ belongs to at least one
set of $F$.
Let $C \subseteq F$. (This is a collection of sets over $X$.)
We say that $C$ covers the domain if every element
of $X$ is in some set of $C$.
The problem is to find the minimum-sized subset $C$
of $F$ that covers $X$.

43
Set Cover

Vertex Cover is a type of set cover problem. The
domain to be covered is the set of edges, and each
vertex covers the subset of incident edges.
Decision-problem formulation of set cover (does
there exist a set cover of size at most k?) is
NP-complete.
There is a factor-2 approximation for the vertex
cover problem, but it cannot be applied to
generate a factor-2 approximation for set cover.

44
Set Cover

It is known that there is no constant factor
approximation to the set cover problem (unless P = NP).
There is however the greedy heuristic, which
achieves an approximation bound of $\ln m$, where
$m = |X|$ is the size of the underlying domain; we
omit the proof.
A simple greedy approach to set cover works by at
each stage selecting the set that covers the
greatest number of uncovered elements

45
Set Cover The Approx. Algorithm

Greedy-Set-Cover(X, F)
    U = X                      // U are the items to be covered
    C = empty                  // C will be the sets in the cover
    while (U is nonempty)      // there is someone left to cover
        select S in F that covers the most elements of U
        add S to C
        U = U - S
    return C
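Greedy-Set-Cover can be sketched in Python as below. The name `greedy_set_cover` is hypothetical, the family is assumed to be a list of Python sets, and ties go to the earliest set in the list, which is an assumption the slides leave open. The function returns indices into the family rather than the sets themselves.

```python
def greedy_set_cover(universe, family):
    """Pick, at each stage, the set covering the most uncovered elements.

    Returns the list of chosen indices into `family`, in selection order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() keeps the first maximum, so ties go to the earliest set.
        best = max(range(len(family)), key=lambda i: len(family[i] & uncovered))
        if not family[best] & uncovered:
            raise ValueError("family does not cover the universe")
        chosen.append(best)
        uncovered -= family[best]
    return chosen

# Hypothetical example.
family = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(range(1, 7), family))
```

Because ties resolve toward earlier sets, listing a family in an unlucky order is enough to make the heuristic pick more sets than necessary.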

46
Set Cover Bad Example

The optimal set cover consists of sets S5 and S6,
each of size 16. Initially all three sets S1, S5,
and S6 have 16 elements. If ties are broken in
the worst possible way, the greedy algorithm will
first select set S1. We remove all the covered
elements. Now S2, S5 and S6 all cover 8 of the
remaining elements. Again, if we choose poorly,
S2 is chosen. The pattern repeats, choosing S3
(size 4), S4 (size 2) and finally S5 and S6 (each
of size 1).

47
Bin Packing

Bin packing is another well-known NP-complete
problem, which is a variant of the knapsack
problem
Given a set of $n$ objects, where $s_i$ denotes the
size of the $i$th object ($0 < s_i < 1$, for
simplicity), put objects into bins.
Size of a bin is at most 1.
Use as few bins as possible.
Ex: fitting objects into a truck, etc.

48
Bin Packing Example
49
Bin Packing Approximation Factor

Theorem: The first-fit heuristic achieves a ratio
bound of 2.
Proof: Consider an instance $\{s_1, \ldots, s_n\}$ of the
bin packing problem. Let $S = \sum_i s_i$ denote the sum
of all the object sizes. Let $b^*$ denote the
optimal number of bins, and $b_{ff}$ denote the number
of bins used by first-fit.
$b^* \ge S$, since no bin can hold a total capacity
of more than 1 unit, and even if we were to fill
each bin exactly to its capacity, we would need
at least $S$ bins.

50
Bin Packing Analysis

We claim that $b_{ff} < 2S$.
To see this, let $t_i$ denote the total size of the
objects that first-fit puts into bin $i$.
Consider bins $i$ and $i+1$ filled by first-fit.
Assume that indexing is cyclical, so if $i$ is the
last index ($i = b_{ff}$) then $i + 1 = 1$.
We claim that $t_i + t_{i+1} > 1$. If not, then the
contents of bins $i$ and $i+1$ could both be put
into the same bin, and hence first-fit would
never have started to fill the second bin,
preferring to keep everything in the first bin.
Thus we have
$\sum_{i=1}^{b_{ff}} (t_i + t_{i+1}) > b_{ff}$.

51
Bin Packing Analysis

But this sum adds up all the elements twice, so
it has a total value of $2S$. Thus we have $2S > b_{ff}$.
Combining this with the fact that $b^* \ge S$ we
have
$b_{ff} < 2S \le 2b^*$, showing $b_{ff}/b^* < 2$ as
required.
Other heuristics:
Best fit puts the object into the bin
in which it fits most closely with the available
space (approx. ratio 17/10).
First fit decreasing first sorts the objects
in decreasing order of size (approx. ratio
11/9).
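The first-fit heuristic analyzed above can be sketched in Python as follows. The name `first_fit` and the `capacity` parameter are assumptions for illustration; the slides fix the capacity at 1. The example uses integer sizes with capacity 10 to avoid floating-point rounding.

```python
def first_fit(sizes, capacity=1):
    """Place each object into the first bin with enough room, else open a bin.

    Returns the bins as lists of object sizes, in the order they were opened.
    """
    bins = []    # bins[i] = sizes placed in bin i
    loads = []   # loads[i] = total size in bin i
    for s in sizes:
        for i, load in enumerate(loads):
            if load + s <= capacity:      # first bin where the object fits
                bins[i].append(s)
                loads[i] += s
                break
        else:                             # no existing bin has room
            bins.append([s])
            loads.append(s)
    return bins

# Hypothetical example: integer sizes, bins of capacity 10.
print(first_fit([5, 7, 5, 2, 4, 2, 5, 1, 6], capacity=10))
```

Sorting the input in decreasing order before calling the function gives first fit decreasing.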

52
Traveling Salesman Problem (TSP)

In the TSP, given a complete undirected graph
with nonnegative edge weights,
find a cycle that visits all vertices and is of
minimum cost (NP-complete).
Distances should satisfy the triangle
inequality:
for all $u, v, w \in V$: $c(u, w) \le c(u, v) + c(v, w)$
($c(u, v)$ = cost on edge $uv$ or cost of shortest
path).
There is an approximation algorithm for TSP with a
ratio of 2 (the tour that it produces cannot be
worse than twice the cost of the optimal tour).

53
TSP Observations

A TSP tour with one edge removed is a spanning tree
(not necessarily a minimum spanning tree).
Therefore, the cost of the minimum TSP tour is at
least as large as the cost of the MST.
MST can be computed efficiently, using, for
example, either Kruskal's or Prim's algorithm.
If we can find some way to convert the MST into a
TSP tour while increasing its cost by at most a
constant factor, then we will have an
approximation for TSP.
We will see that if the edge weights satisfy the
triangle inequality, then this is possible.

54
TSP and MST

Given any free tree there is a tour of the tree
called a twice-around tour that traverses the
edges of the tree twice, once in each direction.
The figure below shows an example.

[Figure panels: MST; twice-around tour; short-cut tour; optimal TSP tour.]

55
TSP

This path is not simple because it revisits
vertices, but we can make it simple by
short-cutting, that is, we skip over previously
visited vertices.
The final order in which vertices are visited
using the short-cuts is exactly the same as a
preorder traversal of the MST.
The triangle inequality assures us that the path
length will not increase when we take short-cuts.

56
Approximate Algorithm for TSP

ApproxTSP(G = (V, E))
    T = minimum spanning tree for G
    r = any vertex
    L = list of vertices visited by a preorder walk of T,
        starting with r
    return L
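ApproxTSP can be sketched in Python as below, assuming the vertices are points in the plane with Euclidean costs, which satisfy the triangle inequality. The names `approx_tsp` and `tour_cost` are hypothetical, the point list is assumed non-empty, and Prim's algorithm in its simple O(n²) form stands in for "minimum spanning tree for G".

```python
import math

def approx_tsp(points, root=0):
    """MST-based tour: build an MST with Prim, then list vertices in preorder."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n        # cheapest known edge from the tree to v
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[root] = 0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not in_tree[v] and d < best[v]:
                best[v], parent[v] = d, u
    # Iterative preorder walk of the tree starting at the root.
    tour, stack = [], [root]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_cost(points, tour):
    """Cost of the closed tour, returning to the starting vertex."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

# Hypothetical example: corners of the unit square.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = approx_tsp(square)
print(tour, tour_cost(square, tour))
```

The preorder list is the short-cut version of the twice-around tour, so the cycle closes by returning from the last vertex to the root.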

57
Approx.TSP Analysis

Claim: Approx-TSP has a ratio bound of 2.
Proof: Let $H$ denote the tour produced by this
algorithm and let $H^*$ be the optimum tour. Let $T$
be the minimum spanning tree.
We can remove any edge of $H^*$ resulting in a
spanning tree, and since $T$ is the minimum cost
spanning tree we have
$c(T) \le c(H^*)$.
Twice-around tour of $T$ has cost $2c(T)$, since
every edge in $T$ is hit twice. By the triangle
inequality, when we short-cut the twice-around tour of $T$ to
form $H$ we do not increase the cost of the tour,
and so we have
$c(H) \le 2c(T)$.
Combining these we have
$c(H)/2 \le c(T) \le c(H^*)$,
therefore
$c(H) / c(H^*) \le 2$.

58
Graph Partitioning

Input: An undirected graph $G = (V, E)$ with
nonnegative edge weights; a real number $\alpha \in (0, 1/2]$.
Output: A partition of the vertices into two
groups $A$ and $B$, each of size at least $\alpha |V|$.
Goal: Minimize the capacity of the cut $(A, B)$.
Applications: from circuit layout to program
analysis to image segmentation.
Graph Partitioning is NP-hard.
Removing the restriction on the sizes of A and B
would give the MINIMUM CUT problem, which we know
to be efficiently solvable using flow techniques.

59
Acknowledgements