Approximation Algorithms - PowerPoint PPT Presentation

Provided by: kevin589

1
Approximation Algorithms
  • Load Balancing
  • k-center selection
  • Pricing Method
  • Vertex Cover
  • Set Cover
  • Bin Packing
  • TSP

2
Approximation Algorithms
  • Q. Suppose I need to solve an NP-hard problem.
    What should I do?
  • A. Theory says you're unlikely to find a
    poly-time algorithm.
  • Must sacrifice one of three desired features.
  • Solve problem to optimality.
  • Solve problem in poly-time.
  • Solve arbitrary instances of the problem.
  • ρ-approximation algorithm.
  • Guaranteed to run in poly-time.
  • Guaranteed to solve arbitrary instance of the
    problem.
  • Guaranteed to find solution within ratio ρ of
    true optimum.
  • Challenge. Need to prove a solution's value is
    close to optimum, without even knowing what
    optimum value is!

3
11.1 Load Balancing
4
Load Balancing
  • Input. m identical machines; n jobs, job j has
    processing time tj.
  • Job j must run contiguously on one machine.
  • A machine can process at most one job at a time.
  • Def. Let J(i) be the subset of jobs assigned to
    machine i. The load of machine i is
    Li = Σ_{j ∈ J(i)} tj.
  • Def. The makespan is the maximum load on any
    machine, L = max_i Li.
  • Load balancing. Assign each job to a machine to
    minimize makespan.

5
Load Balancing: List Scheduling
  • List-scheduling algorithm.
  • Consider n jobs in some fixed order.
  • Assign job j to machine whose load is smallest so
    far.
  • Implementation. O(n log n) using a priority
    queue.

List-Scheduling(m, n, t1, t2, …, tn)
   for i = 1 to m
      Li ← 0                  (load on machine i)
      J(i) ← ∅                (jobs assigned to machine i)
   for j = 1 to n
      i = argmin_k Lk         (machine i has smallest load)
      J(i) ← J(i) ∪ {j}       (assign job j to machine i)
      Li ← Li + tj            (update load of machine i)
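The list-scheduling pseudocode can be sketched in Python with a binary heap as the priority queue. This is a minimal illustration, not the slides' own implementation; the function name `list_scheduling` and the (makespan, assignment) return shape are assumptions made for the example, and m ≥ 1 is assumed.

```python
import heapq

def list_scheduling(m, times):
    """Assign jobs, in the given order, to the machine with the smallest load.

    Returns (makespan, assignment), where assignment[i] lists the job
    indices placed on machine i (the set J(i) in the pseudocode).
    """
    heap = [(0, i) for i in range(m)]        # (load Li, machine i); already a valid heap
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)        # machine i has smallest load
        assignment[i].append(j)              # assign job j to machine i
        heapq.heappush(heap, (load + t, i))  # update load of machine i
    makespan = max(load for load, _ in heap)
    return makespan, assignment

# Hypothetical test instance: m(m-1) unit jobs followed by one job of length m.
m = 10
makespan, _ = list_scheduling(m, [1] * (m * (m - 1)) + [m])
print(makespan)  # 19
```

Each job costs one pop and one push, so the loop runs in O(n log m) time. On the instance shown, the unit jobs bring every machine to load 9 before the long job arrives, which gives makespan 19.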
6
Load Balancing: List Scheduling Analysis
  • Theorem. [Graham, 1966] Greedy algorithm is a
    2-approximation.
  • First worst-case analysis of an approximation
    algorithm.
  • Need to compare resulting solution with optimal
    makespan L*.
  • Lemma 1. The optimal makespan L* ≥ max_j tj.
  • Pf. Some machine must process the most
    time-consuming job. ▪
  • Lemma 2. The optimal makespan L* ≥ (1/m) Σ_j tj.
  • Pf.
  • The total processing time is Σ_j tj.
  • One of m machines must do at least a 1/m fraction
    of total work. ▪

7
Load Balancing: List Scheduling Analysis
  • Theorem. Greedy algorithm is a 2-approximation.
  • Pf. Consider load Li of bottleneck machine i.
  • Let j be last job scheduled on machine i.
  • When job j assigned to machine i, i had smallest
    load. Its load before assignment is Li - tj, so
    Li - tj ≤ Lk for all 1 ≤ k ≤ m.

[Figure: timeline of machine i from 0 to L = Li. The blue jobs scheduled before j end at time Li - tj; job j runs last.]
8
Load Balancing: List Scheduling Analysis
  • Theorem. Greedy algorithm is a 2-approximation.
  • Pf. Consider load Li of bottleneck machine i.
  • Let j be last job scheduled on machine i.
  • When job j assigned to machine i, i had smallest
    load. Its load before assignment is Li - tj, so
    Li - tj ≤ Lk for all 1 ≤ k ≤ m.
  • Sum inequalities over all k and divide by m:
    Li - tj ≤ (1/m) Σ_k Lk = (1/m) Σ_j' tj' ≤ L*,
    where the second sum runs over all jobs j' (not
    over machines) and the last step is Lemma 2.
  • Now Li = (Li - tj) + tj ≤ L* + L* = 2L*, using
    the bound above for Li - tj and Lemma 1 for tj. ▪
9
Load Balancing: List Scheduling Analysis
  • Q. Is our analysis tight?
  • A. Essentially yes.
  • Ex: m machines, m(m-1) jobs of length 1, one
    job of length m.

[Figure: m = 10. The 90 unit jobs bring every machine to load 9; the length-10 job then runs on machine 1 while machines 2 through 10 sit idle. List scheduling makespan = 19.]
10
Load Balancing: List Scheduling Analysis
  • Q. Is our analysis tight?
  • A. Essentially yes.
  • Ex: m machines, m(m-1) jobs of length 1, one
    job of length m.

[Figure: m = 10. Optimal makespan = 10.]
11
Load Balancing: LPT Rule
  • Longest processing time (LPT). Sort n jobs in
    descending order of processing time, and then run
    list scheduling algorithm.

LPT-List-Scheduling(m, n, t1, t2, …, tn)
   Sort jobs so that t1 ≥ t2 ≥ … ≥ tn
   for i = 1 to m
      Li ← 0                  (load on machine i)
      J(i) ← ∅                (jobs assigned to machine i)
   for j = 1 to n
      i = argmin_k Lk         (machine i has smallest load)
      J(i) ← J(i) ∪ {j}       (assign job j to machine i)
      Li ← Li + tj            (update load of machine i)
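The LPT rule differs from plain list scheduling only in the initial sort. A minimal Python sketch follows, assuming m ≥ 1; the name `lpt_schedule` and the sample job lengths are hypothetical choices for illustration.

```python
import heapq

def lpt_schedule(m, times):
    """Longest-processing-time rule: sort jobs in descending order, then list-schedule."""
    order = sorted(range(len(times)), key=lambda j: -times[j])  # t1 >= t2 >= ... >= tn
    heap = [(0, i) for i in range(m)]          # (load, machine)
    assignment = [[] for _ in range(m)]
    for j in order:
        load, i = heapq.heappop(heap)          # machine with smallest load
        assignment[i].append(j)
        heapq.heappush(heap, (load + times[j], i))
    return max(load for load, _ in heap), assignment

# Hypothetical instance: m = 3 machines, job lengths 5, 5, 4, 4, 3, 3, 3.
makespan, _ = lpt_schedule(3, [5, 5, 4, 4, 3, 3, 3])
print(makespan)  # 11
```

On this instance LPT ends with makespan 11, whereas grouping the jobs as {5, 4}, {5, 4}, {3, 3, 3} achieves 9, so sorting alone does not guarantee an optimal schedule.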
12
Load Balancing: LPT Rule
  • Observation. If at most m jobs, then
    list-scheduling is optimal.
  • Pf. Each job put on its own machine. ▪
  • Lemma 3. If there are more than m jobs, L* ≥ 2
    t_{m+1}.
  • Pf.
  • Consider first m+1 jobs t1, …, t_{m+1}.
  • Since the ti's are in descending order, each
    takes at least t_{m+1} time.
  • There are m+1 jobs and m machines, so by
    pigeonhole principle, at least one machine gets
    two jobs. ▪
  • Theorem. LPT rule is a 3/2-approximation
    algorithm.
  • Pf. Same basic approach as for list scheduling.
    By the observation, can assume number of jobs
    > m. If bottleneck machine i holds only one job,
    the schedule is optimal. Otherwise its last job j
    has j ≥ m+1, so tj ≤ t_{m+1} ≤ ½ L* by Lemma 3,
    and
    Li = (Li - tj) + tj ≤ L* + ½ L* = (3/2) L*. ▪
13
Load Balancing: LPT Rule
  • Q. Is our 3/2 analysis tight?
  • A. No.
  • Theorem. [Graham, 1969] LPT rule is a
    4/3-approximation.
  • Pf. More sophisticated analysis of same
    algorithm.
  • Q. Is Graham's 4/3 analysis tight?
  • A. Essentially yes.
  • Ex: m machines, n = 2m+1 jobs: 2 jobs each of
    length m, m+1, …, 2m-1 and one more job of
    length m.

14
11.2 Center Selection
15
Center Selection Problem
  • Input. Set of n sites s1, …, sn.
  • Center selection problem. Select k centers C so
    that maximum distance from a site to nearest
    center is minimized.

[Figure: example instance with k = 4; legend: site.]
16
Center Selection Problem
  • Input. Set of n sites s1, …, sn.
  • Center selection problem. Select k centers C so
    that maximum distance from a site to nearest
    center is minimized.
  • Notation.
  • dist(x, y) = distance between x and y.
  • dist(si, C) = min_{c ∈ C} dist(si, c) = distance
    from si to closest center.
  • r(C) = max_i dist(si, C) = smallest covering
    radius.
  • Goal. Find set of centers C that minimizes r(C),
    subject to |C| = k.
  • Distance function properties.
  • dist(x, x) = 0 (identity)
  • dist(x, y) = dist(y, x) (symmetry)
  • dist(x, y) ≤ dist(x, z) + dist(z, y) (triangle
    inequality)

17
Center Selection: Example
  • Ex: each site is a point in the plane, a center
    can be any point in the plane, dist(x, y) =
    Euclidean distance.
  • Remark: search can be infinite!

[Figure: sites and centers in the plane, with the covering radius r(C) marked.]
18
Greedy Algorithm: A False Start
  • Greedy algorithm. Put the first center at the
    best possible location for a single center, and
    then keep adding centers so as to reduce the
    covering radius each time by as much as possible.
  • Remark: arbitrarily bad!

[Figure: example with k = 2 centers, showing where greedy places center 1; legend: center, site.]
19
Center Selection: Greedy Algorithm
  • Greedy algorithm. Repeatedly choose the next
    center to be the site farthest from any existing
    center.
  • Observation. Upon termination all centers in C
    are pairwise at least r(C) apart.
  • Pf. By construction of algorithm.

Greedy-Center-Selection(k, n, s1, s2, …, sn)
   C = ∅
   repeat k times
      Select a site si with maximum dist(si, C)   (site farthest from any center)
      Add si to C
   return C
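The farthest-site greedy rule can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: sites are coordinate tuples, the distance defaults to Euclidean distance, and the first center is simply the first site, since dist(si, C) is undefined while C is empty. The name `greedy_centers` is hypothetical.

```python
import math

def greedy_centers(sites, k, dist=math.dist):
    """Pick k centers among the sites; return (centers, covering radius r(C))."""
    centers = [sites[0]]                          # assumption: arbitrary first center
    d = [dist(s, centers[0]) for s in sites]      # d[i] = dist(si, C)
    while len(centers) < k:
        far = max(range(len(sites)), key=d.__getitem__)   # site farthest from any center
        centers.append(sites[far])
        # Adding a center can only shrink each site's distance to C.
        d = [min(d[i], dist(s, sites[far])) for i, s in enumerate(sites)]
    return centers, max(d)

centers, radius = greedy_centers([(0, 0), (1, 0), (10, 0), (11, 0)], k=2)
print(centers, radius)  # [(0, 0), (11, 0)] 1.0
```

Keeping the array of current distances means each new center costs one pass over the sites, so the whole procedure takes O(nk) distance evaluations.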
20
Center Selection: Analysis of Greedy Algorithm
  • Theorem. Let C* be an optimal set of centers.
    Then r(C) ≤ 2r(C*).
  • Pf. (by contradiction) Assume r(C*) < ½ r(C).
  • For each site ci in C, consider ball of radius ½
    r(C) around it.
  • Exactly one ci* in each ball; let ci be the site
    paired with ci*.
  • Consider any site s and its closest center ci* in
    C*.
  • dist(s, C) ≤ dist(s, ci) ≤ dist(s, ci*) +
    dist(ci*, ci) ≤ 2r(C*). The middle step is the
    triangle inequality; each of the two terms is
    ≤ r(C*) since ci* is closest center.
  • Thus r(C) ≤ 2r(C*). ▪

[Figure: balls of radius ½ r(C) around the greedy centers ci, each containing one optimal center ci*; a site s is shown with its closest optimal center.]
21
Center Selection
  • Theorem. Let C* be an optimal set of centers.
    Then r(C) ≤ 2r(C*).
  • Theorem. Greedy algorithm is a 2-approximation
    for center selection problem.
  • Remark. Greedy algorithm always places centers
    at sites, but is still within a factor of 2 of
    best solution that is allowed to place centers
    anywhere (e.g., points in the plane).
  • Question. Is there hope of a 3/2-approximation?
    4/3?
  • Theorem. Unless P = NP, there is no
    ρ-approximation for center-selection problem for
    any ρ < 2.
22
11.4 The Pricing Method: Vertex Cover
23
Weighted Vertex Cover
  • Weighted vertex cover. Given a graph G with
    vertex weights, find a vertex cover of minimum
    weight.

[Figure: a graph with vertex weights 2, 4, 2, 9, shown with two vertex covers: one of weight 9 and one of weight 2 + 2 + 4.]
24
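Weighted Vertex Cover
  • Pricing method. Each edge must be covered by
    some vertex i. Edge e pays price pe ≥ 0 to use
    vertex i.
  • Fairness. Edges incident to vertex i should pay
    ≤ wi in total: for each vertex i,
    Σ_{e=(i,j)} pe ≤ wi.
  • Claim. For any vertex cover S and any fair
    prices pe: Σ_e pe ≤ w(S).
  • Proof. Σ_{e∈E} pe ≤ Σ_{i∈S} Σ_{e=(i,j)} pe
    ≤ Σ_{i∈S} wi = w(S). The first step holds since
    each edge e is covered by at least one node in S;
    the second sums the fairness inequalities for
    each node in S. ▪

[Figure: example graph with vertex weights 4, 2, 9, 2.]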
25
Pricing Method
  • Pricing method. Set prices and find vertex cover
    simultaneously.

Weighted-Vertex-Cover-Approx(G, w)
   foreach e in E
      pe = 0
   while (∃ edge i-j such that neither i nor j is tight)
      select such an edge e
      increase pe without violating fairness
   S ← set of all tight nodes
   return S
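A minimal Python sketch of the pricing method follows. It assumes that a node is tight when the prices on its incident edges sum exactly to its weight, that weights are non-negative integers (so the equality test is exact), and that each selected edge's price is raised as far as fairness allows. The function name and the dict/list input shapes are hypothetical.

```python
def pricing_vertex_cover(weights, edges):
    """weights: dict vertex -> weight; edges: list of (u, v) pairs.

    Returns (cover, price), where cover is the set of tight nodes and
    price maps each edge to the price it pays.
    """
    paid = {v: 0 for v in weights}        # total price paid to each vertex so far
    price = {}
    for u, v in edges:
        slack_u = weights[u] - paid[u]
        slack_v = weights[v] - paid[v]
        # If an endpoint is already tight the raise is 0; otherwise raise
        # the price until the endpoint with less slack becomes tight.
        raise_by = min(slack_u, slack_v)
        price[(u, v)] = raise_by
        paid[u] += raise_by
        paid[v] += raise_by
    cover = {v for v in weights if paid[v] == weights[v]}   # all tight nodes
    return cover, price

w = {"a": 2, "b": 4, "c": 2, "d": 9}
cover, price = pricing_vertex_cover(w, [("a", "b"), ("b", "c"), ("c", "d")])
print(sorted(cover), sum(price.values()))  # ['a', 'b', 'c'] 4
```

One pass over the edges suffices here because a tight node stays tight, so every edge has a tight endpoint once it has been processed. In the example the cover has weight 8, while {a, c} is a cover of weight 4, consistent with the factor-2 guarantee.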
26
Pricing Method
[Figure 11.8: example run of the pricing method; labels mark the price of edge a-b and the vertex weights.]
27
Pricing Method: Analysis
  • Theorem. Pricing method is a 2-approximation.
  • Pf.
  • Algorithm terminates since at least one new node
    becomes tight after each iteration of while loop.
  • Let S = set of all tight nodes upon termination
    of algorithm. S is a vertex cover: if some edge
    i-j is uncovered, then neither i nor j is tight.
    But then while loop would not have terminated.
  • Let S* be optimal vertex cover. We show w(S) ≤
    2w(S*):
    w(S) = Σ_{i∈S} wi = Σ_{i∈S} Σ_{e=(i,j)} pe
         ≤ Σ_{i∈V} Σ_{e=(i,j)} pe = 2 Σ_{e∈E} pe
         ≤ 2w(S*).
    After the definition of w(S), the steps use, in
    order: all nodes in S are tight; S ⊆ V and
    prices ≥ 0; each edge counted twice; fairness
    lemma. ▪
28
Extra Slides
29
Load Balancing on 2 Machines
  • Claim. Load balancing is hard even if only 2
    machines.
  • Pf. NUMBER-PARTITIONING ≤P LOAD-BALANCE
    (NUMBER-PARTITIONING is NP-complete by Exercise
    8.26).

[Figure: jobs a through g drawn as bars whose length is the job length. In a yes-instance, machine 1 runs a, d, f and machine 2 runs b, c, e, g, both on a time axis from 0 to L.]
30
Center Selection: Hardness of Approximation
  • Theorem. Unless P = NP, there is no
    ρ-approximation algorithm for metric k-center
    problem for any ρ < 2.
  • Pf. We show how we could use a (2 - ε)
    approximation algorithm for k-center to solve
    DOMINATING-SET (see Exercise 8.29) in poly-time.
  • Let G = (V, E), k be an instance of
    DOMINATING-SET.
  • Construct instance G' of k-center with sites V
    and distances
  • d(u, v) = 2 if (u, v) ∉ E
  • d(u, v) = 1 if (u, v) ∈ E
  • Note that G' satisfies the triangle inequality.
  • Claim: G has dominating set of size k iff there
    exists k centers C* with r(C*) = 1.
  • Thus, if G has a dominating set of size k, a (2 -
    ε)-approximation algorithm on G' must find a
    solution C* with r(C*) = 1 since it cannot use
    any edge of distance 2.
31
Vertex Cover Approximation
  • A vertex cover is a subset of vertices such that
    every edge in the graph is incident to at least
    one of these vertices.
  • The vertex cover optimization problem is to find a
    vertex cover of minimum size.
  • For a good strategy, a heuristic is needed.

32
Vertex Cover
  • Consider an arbitrary edge (u, v) in the graph.
    One of its two vertices must be in the cover, but
    we do not know which one.
  • The idea of this heuristic is to simply put both
    vertices into the vertex cover.
  • Then we remove all edges that are incident to u
    and v (since they are now all covered), and
    recurse on the remaining edges.
  • For every one vertex that must be in the cover,
    we put two into our cover, so it is easy to see
    that the cover we generate is at most twice the
    size of the optimum cover.

33
Proof of approximation ratio
  • Claim: ApproxVC yields a factor-2 approximation.
  • Proof: Consider the set C output by ApproxVC. Let
    C* be the optimum VC. Let A be the set of edges
    selected by the line marked with (*) in the
    figure. Observe that the size of C is exactly
    2|A| because we add two vertices for each such
    edge. However, note that in the optimum VC one of
    these two vertices must have been added to the
    VC, and since the edges in A share no endpoints,
    the size of C* is at least |A|. Thus we have
    |C| / 2 = |A| ≤ |C*|.
  • Therefore
    |C| / |C*| ≤ 2.

34
Example
35
Approximate VC Algorithm: Naive Approach
ApproxVC
   C = empty-set
   while (E is nonempty) do
      (*) let (u, v) be any edge of E
      add both u and v to C
      remove from E all edges incident to either u or v
   return C
  • Can we improve on it?
  • Why not consider vertices with higher degrees
    first (greedy strategy)?
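The two-for-one heuristic can be sketched in Python in a few lines. In this sketch, skipping an edge that already has an endpoint in the cover plays the role of removing covered edges from E; the function name and the edge-list input format are assumptions made for illustration.

```python
def approx_vertex_cover(edges):
    """2-for-1 heuristic: take both endpoints of every edge that is still uncovered."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge (u, v) is still in E
            cover.update((u, v))                # add both u and v to C
    return cover

path = [(1, 2), (2, 3), (3, 4)]
print(sorted(approx_vertex_cover(path)))  # [1, 2, 3, 4]
```

On this path the heuristic returns all four vertices, while {2, 3} already covers every edge, so the factor of 2 is attained.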

36
Greedy VC
  • Greedy approximation for VC:

GreedyVC(G = (V, E))
   C = empty-set
   while (E is nonempty) do
      let u be the vertex of maximum degree in G
      add u to C
      remove from E all edges incident to u
   return C
  • For the example, it yields the optimum solution.
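A direct, unoptimized Python sketch of the maximum-degree greedy heuristic is given below. Degrees are recomputed from the remaining edges in every round, and ties between vertices of equal degree are broken arbitrarily; both are simplifying assumptions of this example, as is the function name.

```python
def greedy_vertex_cover(edges):
    """Repeatedly add a vertex of maximum remaining degree to the cover."""
    remaining = {frozenset(e) for e in edges}
    cover = set()
    while remaining:
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        u = max(degree, key=degree.get)                      # vertex of maximum degree
        cover.add(u)
        remaining = {e for e in remaining if u not in e}     # remove edges incident to u
    return cover

star = [(0, i) for i in range(1, 5)]
print(greedy_vertex_cover(star))  # {0}
```

On a star the greedy choice finds the single-vertex optimum, whereas taking both endpoints of an arbitrary edge would use two vertices.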

37
Greedy VC: Example
  • Can we prove Greedy VC outperforms the other one?
  • No!
  • It can even perform worse.
  • However, it should also be pointed out that the
    vertex cover constructed by the greedy heuristic
    is (for typical graphs) smaller than the one
    computed by the 2-for-1 heuristic, so it would
    probably be wise to run both algorithms and take
    the better of the two.

38
Third Attempt: Use Matching
  • A matching is a subset of edges that have no
    vertices in common.
  • A matching is maximal if no more edges can be
    added to it.
  • Maximal matchings will help us find good vertex
    covers, and moreover, they are easy to generate:
    repeatedly pick edges that are disjoint from the
    ones chosen already, until this is no longer
    possible.
  • Any vertex cover of a graph G must be at least as
    large as the number of edges in any matching in
    G; that is, any matching provides a lower bound
    on OPT. This is simply because each edge of the
    matching must be covered by one of its endpoints
    in any vertex cover!

39
Example
  • Figure below shows how to convert from Maximal
    Matching to Vertex Cover.
  • (a) A matching. (b) Completion to a maximal
    matching. (c) Its VC.

40
Vertex Cover from Matching
  • Let S be a set that contains both endpoints of
    each edge in a maximal matching M.
  • Then S must be a vertex cover: if it isn't, that
    is, if it doesn't touch some edge e ∈ E, then M
    could not possibly be maximal since we could
    still add e to it. Our cover S has 2|M|
    vertices.
  • We know that any vertex cover must have size at
    least |M|.
  • Algorithm
  • Find a maximal matching M ⊆ E
  • Return S = {all endpoints of edges in M}

41
Vertex Cover from Matching
  • This simple procedure always returns a vertex
    cover whose size is at most twice optimal!
  • In summary, even though we have no way of finding
    the best vertex cover, we can easily find another
    structure, a maximal matching, with two key
    properties:
  • 1. Its size gives us a lower bound on the
    optimal vertex cover.
  • 2. It can be used to build a vertex cover,
    whose size can be related to that of the optimal
    cover using property 1.
  • Approximation ratio α ≤ 2.

42
Set Cover Problem Revisited
  • Given a pair (X, F) where X = {x1, x2, ..., xm} is
    a finite set (a domain of elements) and F =
    {S1, S2, ..., Sn} is a family of subsets of X, such
    that every element of X belongs to at least one
    set of F.
  • Consider C ⊆ F (a collection of sets over X).
    We say that C covers the domain if every element
    of X is in some set of C.
  • The problem is to find the minimum-sized subset C
    of F that covers X.

43
Set Cover
  • Vertex Cover is a type of set cover problem. The
    domain to be covered is the set of edges, and
    each vertex covers the subset of incident edges.
  • Decision-problem formulation of set cover (does
    there exist a set cover of size at most k?) is
    NP-complete.
  • There is a factor-2 approximation for the vertex
    cover problem, but it cannot be applied to
    generate a factor-2 approximation for set cover.

44
Set Cover
  • It is known that there is no constant-factor
    approximation to the set cover problem unless
    P = NP.
  • There is however the greedy heuristic, which
    achieves an approximation bound of ln m, where m
    = |X|, the size of the underlying domain; we
    omit the proof.
  • A simple greedy approach to set cover works by at
    each stage selecting the set that covers the
    greatest number of uncovered elements.

45
Set Cover: The Approx. Algorithm
Greedy-Set-Cover(X, F)
   U = X                     // U are the items to be covered
   C = empty                 // C will be the sets in the cover
   while (U is nonempty)     // there is someone left to cover
      select S in F that covers the most elements of U
      add S to C
      U = U - S
   return C
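Greedy-Set-Cover translates almost line for line into Python. The sketch below assumes the family is given as a list of Python sets and returns the indices of the chosen sets; ties are broken in favor of the lowest index, which is a choice made for this example only.

```python
def greedy_set_cover(universe, family):
    """Return indices of sets from `family` that together cover `universe`."""
    uncovered = set(universe)          # U: items still to be covered
    chosen = []                        # C: indices of the sets in the cover
    while uncovered:
        # Select the set that covers the most elements of U.
        best = max(range(len(family)), key=lambda i: len(family[i] & uncovered))
        if not family[best] & uncovered:
            raise ValueError("family does not cover the universe")
        chosen.append(best)
        uncovered -= family[best]      # U = U - S
    return chosen

family = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
print(greedy_set_cover(range(1, 7), family))  # [0, 3]
```

Each round scans the whole family, so this direct version takes O(|F| · |X|) set operations per selected set; it is meant to mirror the pseudocode rather than to be fast.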

46
Set Cover: Bad Example
  • The optimal set cover consists of sets S5 and S6,
    each of size 16. Initially all three sets S1, S5,
    and S6 have 16 elements. If ties are broken in
    the worst possible way, the greedy algorithm will
    first select set S1. We remove all the covered
    elements. Now S2, S5 and S6 all cover 8 of the
    remaining elements. Again, if we choose poorly,
    S2 is chosen. The pattern repeats, choosing S3
    (size 4), S4 (size 2) and finally S5 and S6 (each
    covering 1 remaining element).

47
Bin Packing
  • Bin packing is another well-known NP-complete
    problem, which is a variant of the knapsack
    problem.
  • Given a set of n objects, where si denotes the
    size of the ith object (0 < si < 1, for
    simplification), put objects into bins.
  • Each bin has capacity 1.
  • Use as few bins as possible.
  • Ex: fitting objects into a truck, etc.

48
Bin Packing Example
49
Bin Packing: Approximation Factor
  • Theorem: The first-fit heuristic achieves a ratio
    bound of 2.
  • Proof: Consider an instance {s1, ..., sn} of the
    bin packing problem. Let S = Σ_i si denote the sum
    of all the object sizes. Let b* denote the
    optimal number of bins, and bff denote the number
    of bins used by first-fit.
  • b* ≥ S, since no bin can hold a total capacity
    of more than 1 unit, and even if we were to fill
    each bin exactly to its capacity, we would need
    at least S bins.

50
Bin Packing: Analysis
  • We claim that bff ≤ 2S.
  • To see this, let ti denote the total size of the
    objects that first-fit puts into bin i.
  • Consider bins i and i+1 filled by first-fit.
    Assume that indexing is cyclical, so if i is the
    last index (i = bff) then i+1 = 1.
  • We claim that ti + t_{i+1} ≥ 1. If not, then the
    contents of bins i and i+1 could both be put
    into the same bin, and hence first-fit would
    never have started to fill the second bin,
    preferring to keep everything in the first bin.
    Thus we have
    Σ_{i=1}^{bff} (ti + t_{i+1}) ≥ bff.

51
Bin Packing: Analysis
  • But this sum adds up all the elements twice, so
    it has a total value of 2S. Thus we have 2S ≥
    bff.
  • Combining this with the fact that b* ≥ S we
    have bff ≤ 2S ≤ 2b*, showing bff / b* ≤ 2 as
    required.
  • Best fit attempts to put the object into the bin
    in which it fits most closely with the available
    space (approx. ratio 17/10).
  • First fit decreasing, in which the objects are
    first sorted in decreasing order of size
    (approx. ratio 11/9).
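For concreteness, here is a minimal Python sketch of first-fit and first-fit decreasing. It assumes the usual definition of first-fit (each object goes into the lowest-indexed bin that still has room, and a new bin is opened only when none does); the function names and the small floating-point tolerance are choices made for this example.

```python
def first_fit(sizes, capacity=1.0):
    """Place each object into the first bin with enough room; open a new bin if none fits."""
    bins, loads = [], []               # bins[i] holds the sizes in bin i, loads[i] is ti
    for s in sizes:
        for i, load in enumerate(loads):
            if load + s <= capacity + 1e-12:   # tolerance for float rounding
                bins[i].append(s)
                loads[i] += s
                break
        else:                          # no existing bin had room
            bins.append([s])
            loads.append(s)
    return bins

def first_fit_decreasing(sizes, capacity=1.0):
    """Sort objects in decreasing order of size, then run first-fit."""
    return first_fit(sorted(sizes, reverse=True), capacity)

items = [0.4, 0.4, 0.6, 0.6]
print(len(first_fit(items)), len(first_fit_decreasing(items)))  # 3 2
```

On this input first-fit pairs the two small objects and then needs a separate bin for each large one, while sorting first lets each large object share a bin with a small one.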

52
Traveling Salesman Problem (TSP)
  • In the TSP, given a complete undirected graph
    with nonnegative edge weights, find a cycle that
    visits all vertices and is of minimum cost
    (NP-complete).
  • Distances should satisfy the triangle
    inequality: for all u, v, w ∈ V,
    c(u, w) ≤ c(u, v) + c(v, w)
    (c(u, v) = cost on edge uv or cost of shortest
    path).
  • There is an approx. algorithm for TSP with a
    ratio of 2 (the tour that it produces cannot be
    worse than twice the cost of the optimal tour).

53
TSP: Observations
  • A TSP tour with one edge removed is a spanning
    tree (not necessarily a minimum spanning tree).
  • Therefore, the cost of the minimum TSP tour is at
    least as large as the cost of the MST.
  • MST can be computed efficiently, using, for
    example, either Kruskal's or Prim's algorithm.
  • If we can find some way to convert the MST into a
    TSP tour while increasing its cost by at most a
    constant factor, then we will have an
    approximation for TSP.
  • We will see that if the edge weights satisfy the
    triangle inequality, then this is possible.

54
TSP ? MST
  • Given any free tree there is a tour of the tree
    called a twice-around tour that traverses the
    edges of the tree twice, once in each direction.
    The figure below shows an example.

[Figure panels: MST; twice-around tour; short-cut tour; optimal TSP tour.]

55
TSP
  • This path is not simple because it revisits
    vertices, but we can make it simple by
    short-cutting, that is, we skip over previously
    visited vertices
  • The final order in which vertices are visited
    using the short-cuts is exactly the same as a
    preorder traversal of the MST.
  • The triangle inequality assures us that the path
    length will not increase when we take short-cuts.

56
Approximate Algorithm for TSP
ApproxTSP(G = (V, E))
   T = minimum spanning tree for G
   r = any vertex
   L = list of vertices visited by a preorder walk of T starting with r
   return L
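ApproxTSP can be sketched in Python for points in the plane, where Euclidean distance satisfies the triangle inequality. The sketch builds the MST with a simple O(n²) version of Prim's algorithm and then lists the vertices in preorder from root 0; the function names and the restriction to Euclidean points are assumptions of this example.

```python
import math

def approx_tsp(points):
    """Return a tour (list of point indices) from a preorder walk of the MST."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n              # cheapest known edge into the tree
    parent = [None] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0                      # r = vertex 0
    for _ in range(n):                 # Prim's algorithm on the complete graph
        u = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    tour, stack = [], [0]              # iterative preorder walk of T
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_cost(points, tour):
    """Cost of the closed tour, including the edge back to the start."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = approx_tsp(square)
print(tour, tour_cost(square, tour))  # [0, 1, 2, 3] 4.0
```

The preorder list already incorporates the short-cuts: a vertex is appended only the first time the walk reaches it.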

57
Approx-TSP: Analysis
  • Claim: Approx-TSP has a ratio bound of 2.
  • Proof: Let H denote the tour produced by this
    algorithm and let H* be the optimum tour. Let T
    be the minimum spanning tree.
  • We can remove any edge of H* resulting in a
    spanning tree, and since T is the minimum cost
    spanning tree we have
    c(T) ≤ c(H*).
  • Twice-around tour of T has cost 2c(T), since
    every edge in T is hit twice. By the triangle
    inequality, when we short-cut an edge of T to
    form H we do not increase the cost of the tour,
    and so we have
    c(H) ≤ 2c(T).
  • Combining these we have
    c(H) / 2 ≤ c(T) ≤ c(H*),
    therefore
    c(H) / c(H*) ≤ 2.

58
Graph Partitioning
  • Input: An undirected graph G = (V, E) with
    nonnegative edge weights; a real number a ∈ (0,
    1/2].
  • Output: A partition of the vertices into two
    groups A and B, each of size at least a|V|.
  • Goal: Minimize the capacity of the cut (A, B).
  • Applications: from circuit layout to program
    analysis to image segmentation.
  • Graph Partitioning is NP-hard.
  • Removing the restriction on the sizes of A and B
    would give the MINIMUM CUT problem, which we know
    to be efficiently solvable using flow techniques.

59
Acknowledgements
  • The last few algorithms are based on David
    Mount's 451 course, University of Waterloo.