Title: Approximation Algorithms
1. Approximation Algorithms

2. Motivation
- By now we've seen many NP-Complete problems.
- We conjecture none of them has a polynomial-time algorithm.

3. Motivation
- Is this a dead end? Should we give up altogether?

4. Motivation
- Or maybe we can settle for good approximation algorithms?
5. Introduction
- Objectives
  - To formalize the notion of approximation.
  - To demonstrate several such algorithms.
- Overview
  - Optimization and Approximation
  - VERTEX-COVER, SET-COVER
6. Optimization
- Many of the problems we've encountered so far are really optimization problems.
- I.e., the task can be naturally rephrased as finding a maximal/minimal solution.
- For example: finding a maximal clique in a graph.
7. Approximation
- An algorithm which returns an answer C which is close to the optimal solution C* is called an approximation algorithm.
- Closeness is usually measured by the ratio bound ρ(n) the algorithm produces,
- which is a function that satisfies, for any input size n, max(C/C*, C*/C) ≤ ρ(n).
8. VERTEX-COVER
- Instance: an undirected graph G = (V, E).
- Problem: find a set C ⊆ V of minimal size s.t. for any (u,v) ∈ E, either u ∈ C or v ∈ C.

Example
9. Minimum VC is NP-hard
- Proof: It is enough to show the decision problem below is NP-Complete.
- Instance: an undirected graph G = (V, E) and a number k.
- Problem: to decide if there exists a set V' ⊆ V of size k s.t. for any (u,v) ∈ E, u ∈ V' or v ∈ V'.
This follows immediately from the following observation.
10. Minimum VC is NP-hard
- Observation: Let G = (V, E) be an undirected graph. The complement V \ C of a vertex-cover C is an independent set of G.
- Proof: Two vertices outside a vertex-cover cannot be connected by an edge. ∎
11. VC - Approximation Algorithm
COR(B) 523-524
- C ← ∅
- E' ← E
- while E' ≠ ∅
  - do let (u,v) be an arbitrary edge of E'
  - C ← C ∪ {u,v}
  - remove from E' every edge incident to either u or v
- return C
12. Demo
Compare this cover to the one from the example.
13. Polynomial Time
- C ← ∅
- E' ← E
- while E' ≠ ∅ do
  - let (u,v) be an arbitrary edge of E'
  - C ← C ∪ {u,v}
  - remove from E' every edge incident to either u or v
- return C
14. Correctness
- The set of vertices our algorithm returns is clearly a vertex-cover, since we iterate until every edge is covered.
15. How Good an Approximation Is It?
Observe the set of edges our algorithm chooses:
- any VC contains at least 1 endpoint of each such edge;
- our VC contains both endpoints, hence it is at most twice as large.
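The pseudocode above translates directly into Python. This is our own sketch (the name `approx_vertex_cover` is not from the slides):

```python
# A sketch of the edge-picking 2-approximation for VERTEX-COVER.
# approx_vertex_cover is our own name, not from the slides.

def approx_vertex_cover(edges):
    """Return a vertex cover at most twice the minimum size.

    edges: iterable of (u, v) pairs of an undirected graph.
    """
    cover = set()
    remaining = {frozenset(e) for e in edges}  # E'
    while remaining:
        # let (u, v) be an arbitrary edge of E'; take BOTH endpoints
        u, v = next(iter(remaining))
        cover.update((u, v))
        # remove from E' every edge incident to either u or v
        remaining = {e for e in remaining if u not in e and v not in e}
    return cover
```

The chosen edges are vertex-disjoint, so any cover needs one endpoint per chosen edge while ours takes two; that is exactly the factor-2 argument above.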
16. The Traveling Salesman Problem

17. The Mission: A Tour Around the World

18. The Problem: Traveling Costs Money
19. Introduction
- Objectives
  - To explore the Traveling Salesman Problem.
- Overview
  - TSP: formal definition and examples
  - TSP is NP-hard
  - Approximation algorithm for special cases
  - Inapproximability result
20. TSP
- Instance: a complete weighted undirected graph G = (V, E) (all weights are non-negative).
- Problem: to find a Hamiltonian cycle of minimal cost.
21. Polynomial Algorithm for TSP?
What about the greedy strategy: at any point, choose the closest vertex not explored yet?
22. The Greedy Strategy Fails
[Figure: a weighted graph on which the greedy tour costs far more than the optimal tour]
24. TSP is NP-hard
- The corresponding decision problem:
- Instance: a complete weighted undirected graph G = (V, E) and a number k.
- Problem: to decide if there is a Hamiltonian cycle whose cost is at most k.
25. TSP is NP-hard
- Theorem: HAM-CYCLE ≤p TSP.
- Proof: By the straightforward efficient reduction illustrated below (verify!):
[Figure: edges of the HAM-CYCLE instance get weight 0, the remaining edges get weight 1, and the TSP bound is k = 0]
26. What Next?
- We'll show an approximation algorithm for TSP
- which yields a ratio bound of 2
- for cost functions which satisfy a certain property.
27. The Triangle Inequality
- Definition: We'll say the cost function c satisfies the triangle inequality if
- ∀u,v,w ∈ V: c(u,v) + c(v,w) ≥ c(u,w).
28. Approximation Algorithm
COR(B) 525-527
1. Grow a Minimum Spanning Tree (MST) for G.
2. Return the cycle resulting from a preorder walk on that tree.
29. Demonstration and Analysis
The cost of a minimal Hamiltonian cycle ≥ the cost of an MST.
30. Demonstration and Analysis
The cost of a preorder walk is twice the cost of the tree.

31. Demonstration and Analysis
Due to the triangle inequality, the Hamiltonian cycle is not worse.
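The two steps above can be sketched in Python. This is our own illustration, assuming the triangle inequality holds; `mst_tsp_tour` and the cost-matrix input are our choices, not the slides':

```python
# A sketch of the MST + preorder-walk 2-approximation, assuming the
# triangle inequality. mst_tsp_tour and the matrix input are our choices.
import heapq

def mst_tsp_tour(dist):
    """dist: symmetric matrix of non-negative costs (dist[i][j]).
    Returns a tour (vertex order) of cost at most twice the optimum."""
    n = len(dist)
    # 1. Grow a Minimum Spanning Tree rooted at vertex 0 (Prim's algorithm).
    children = {v: [] for v in range(n)}
    in_tree = [False] * n
    pq = [(0, 0, 0)]  # (edge cost, vertex, parent)
    while pq:
        _, v, parent = heapq.heappop(pq)
        if in_tree[v]:
            continue
        in_tree[v] = True
        if v != parent:
            children[parent].append(v)
        for u in range(n):
            if not in_tree[u]:
                heapq.heappush(pq, (dist[v][u], u, v))
    # 2. Return the cycle resulting from a preorder walk on that tree.
    tour, stack = [], [0]
    while stack:
        v = stack.pop()
        tour.append(v)
        stack.extend(reversed(children[v]))
    return tour
```

Skipping repeated vertices in the preorder walk is where the triangle inequality is used: each shortcut can only decrease the cost.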
32. What About the General Case?
COR(B) 528
- We'll show TSP cannot be approximated within any constant factor ρ ≥ 1
- by showing the corresponding gap version is NP-hard.
33. gap-TSP[ρ]
- Instance: a complete weighted undirected graph G = (V, E).
- Problem: to distinguish between the following two cases:
  - YES: there exists a Hamiltonian cycle whose cost is at most |V|.
  - NO: the cost of every Hamiltonian cycle is more than ρ|V|.
34. Instances
[Figure: instances arranged by min tour cost]

35. What Should an Algorithm for gap-TSP Return?
[Figure: instances arranged by min tour cost; for costs between |V| and ρ|V| the answer is DON'T-CARE]
36. gap-TSP Approximation
- Observation: An efficient approximation of factor ρ for TSP implies an efficient algorithm for gap-TSP[ρ].
37. gap-TSP is NP-hard
- Theorem: For any constant ρ ≥ 1, HAM-CYCLE ≤p gap-TSP[ρ].
- Proof idea: Edges from G cost 1. Other edges cost much more.
38. The Reduction Illustrated
[Figure: edges of the HAM-CYCLE instance get cost 1; the remaining edges get cost ρ|V| + 1]
Verify (a) correctness and (b) efficiency.
39. Approximating TSP is NP-hard
Approximating TSP within any constant factor ρ is NP-hard.
40. Summary
- We've studied the Traveling Salesman Problem (TSP).
- We've seen it is NP-hard.
- Nevertheless, when the cost function satisfies the triangle inequality, there exists an approximation algorithm with ratio bound 2.
41. Summary
- For the general case we've proven there is probably no efficient approximation algorithm for TSP.
- Moreover, we've demonstrated a generic method for showing approximation problems are NP-hard.
42. SET-COVER
- Instance: a finite set X and a family F of subsets of X such that every element of X belongs to some set in F.
- Problem: to find a subfamily C ⊆ F of minimal size which covers X, i.e., the union of the sets in C is X.
43. SET-COVER Example
44. SET-COVER is NP-Hard
- Proof: Observe the corresponding decision problem.
- Clearly, it's in NP (check!).
- We'll sketch a reduction from (decision) VERTEX-COVER to it:
45. VERTEX-COVER ≤p SET-COVER
- one element for every edge;
- one set for every vertex, containing the edges it covers.
46. Greedy Algorithm
COR(B) 530-533
- C ← ∅
- U ← X
- while U ≠ ∅ do
  - select S ∈ F that maximizes |S ∩ U|
  - C ← C ∪ {S}
  - U ← U - S
- return C
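The greedy rule above, as a short Python sketch (the name `greedy_set_cover` is ours, not the slides'):

```python
# A sketch of the greedy SET-COVER algorithm; greedy_set_cover is our name.

def greedy_set_cover(X, F):
    """X: set of elements; F: list of sets whose union covers X.
    Returns a subfamily of F covering X (within H(max |S|) of optimal)."""
    uncovered = set(X)  # U
    cover = []          # C
    while uncovered:
        # select S in F that maximizes |S ∩ U|
        best = max(F, key=lambda S: len(S & uncovered))
        cover.append(best)
        uncovered -= best
    return cover
```

Note the loop terminates only because F covers X; each iteration costs O(|F|) intersection computations.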
47. Demonstration
Compare to the optimal cover.
48. Is Being Greedy Worthwhile? How Do We Proceed From Here?
- We can easily bound the approximation ratio by log n.
- A more careful analysis yields a tight bound of ln n.
49. Loose Ratio-Bound
- Claim: If ∃ a cover of size k, then after k iterations the algorithm has covered at least ½ of the elements.
Suppose it doesn't, and observe the situation after k iterations:
50. Loose Ratio-Bound
More than ½ of the elements are still uncovered. Since this uncovered part can also be covered by k sets...
51. Loose Ratio-Bound
...there must be a set not chosen yet whose size is at least (n/2)·(1/k).
52. Loose Ratio-Bound
Thus in each of the k iterations we've covered at least (n/2)·(1/k) new elements.
53. Loose Ratio-Bound
Therefore after k·log n iterations (i.e., after choosing k·log n sets) all n elements must be covered, and the bound is proved.
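The halving argument can be written compactly. A sketch in our own notation, where $n_j$ denotes the number of elements still uncovered after $jk$ iterations:

```latex
n_0 = n, \qquad n_{j+1} \le \tfrac{1}{2}\, n_j
\quad\Longrightarrow\quad n_j \le \frac{n}{2^{j}} .
```

After $j = \lceil \log_2 n \rceil$ blocks of $k$ iterations, $n_j < 1$, so at most $k \log n$ sets are chosen against an optimum of $k$, giving the ratio bound $\log n$.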
54. Tight Ratio-Bound
- Claim: The greedy algorithm approximates the optimal set-cover within factor H(max{|S| : S ∈ F}),
- where H(d) is the d-th harmonic number: H(d) = 1 + 1/2 + ... + 1/d.

55. Tight Ratio-Bound
56. Claim's Proof
- Whenever the algorithm chooses a set, charge 1.
- Split the cost between all the newly covered elements.
57. Analysis
- That is, we charge every element x ∈ X with cx = 1 / |Si - (S1 ∪ ... ∪ Si-1)|,
- where Si is the first set which covers x.
58. Lemma
- Lemma: For every S ∈ F, Σ_{x∈S} cx ≤ H(|S|).
- Let ui = |S - (S1 ∪ ... ∪ Si)| = the number of members of S left uncovered after i iterations (so u0 = |S|).
- Let k be the smallest index for which uk = 0.
- For 1 ≤ i ≤ k, Si covers ui-1 - ui elements from S.

59. Lemma
This last observation yields Σ_{x∈S} cx ≤ Σ_{i=1..k} (ui-1 - ui) / |Si - (S1 ∪ ... ∪ Si-1)|.
Our greedy strategy promises that Si (1 ≤ i ≤ k) covers at least as many new elements as S, so |Si - (S1 ∪ ... ∪ Si-1)| ≥ ui-1, and the sum is at most Σ_{i=1..k} (ui-1 - ui) / ui-1.
For any b > a in N, H(b) - H(a) = 1/(a+1) + ... + 1/b ≥ (b-a)·(1/b); hence (ui-1 - ui)/ui-1 ≤ H(ui-1) - H(ui).
This is a telescopic sum: Σ_{i=1..k} (H(ui-1) - H(ui)) = H(u0) - H(uk) = H(|S|), since uk = 0, H(0) = 0, and u0 = |S|.
60. Analysis
- Now we can finally complete our analysis: |C| = Σ_{x∈X} cx ≤ Σ_{S∈C*} Σ_{x∈S} cx ≤ |C*| · H(max{|S| : S ∈ F}).
61. Summary
- As it turns out, we can sometimes find efficient approximation algorithms for NP-hard problems.
- We've seen two such algorithms:
  - for VERTEX-COVER (factor 2),
  - for SET-COVER (logarithmic factor).
62. The Subset Sum Problem
- Problem definition
  - Given a finite set S and a target t, find a subset S' ⊆ S whose elements sum to t.
- All possible sums
  - S = {x1, x2, .., xn}
  - Li = set of all possible sums of {x1, x2, .., xi}
- Example
  - S = {1, 4, 5}
  - L1 = {0, 1}
  - L2 = {0, 1, 4, 5} = L1 ∪ (L1 + x2)
  - L3 = {0, 1, 4, 5, 6, 9, 10} = L2 ∪ (L2 + x3)
  - Li = Li-1 ∪ (Li-1 + xi)
63. Subset Sum, Revisited
- Given a set S of numbers, find a subset S' that adds up to some target number t.
- To find the largest possible sum that doesn't exceed t:
  - T = {0}
  - for each x in S
    - T = union(T, x + T)
    - remove elements from T that exceed t
  - return largest element in T
- (Aside: How should we implement T?)
- x + T adds x to each element of the set T.
- Potential doubling at each step: complexity O(2^n).
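The exact list-based algorithm above is a few lines of Python. A sketch with our own naming (`subset_sum_exact`):

```python
# A sketch of the exact list-based algorithm (worst case exponential,
# since T can double at each step). subset_sum_exact is our name.

def subset_sum_exact(S, t):
    """Largest achievable subset sum of S not exceeding t."""
    T = {0}
    for x in S:
        T |= {x + y for y in T}        # union(T, x + T)
        T = {y for y in T if y <= t}   # remove elements exceeding t
    return max(T)
```

A Python `set` answers the implementation aside: it deduplicates repeated sums for free, though it cannot stop the worst-case exponential growth.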
64. Trimming
- To reduce the size of the set T at each stage, we apply a trimming process.
- If z < y are close elements, with (1-d)·y ≤ z ≤ y, we remove y and let z represent it.
- With d = 0.1: {10,11,12,15,20,21,22,23,24,29} → {10,12,15,20,23,29}
65. Subset Sum with Trimming
- Incorporate trimming into the previous algorithm:
  - T = {0}
  - for each x in S
    - T = union(T, x + T)
    - T = trim(d, T)
    - remove elements from T that exceed t
  - return largest element in T
- Trimming only eliminates values, it doesn't create new ones, so the final result is still the sum of a subset of S that doesn't exceed t.
- 0 < d < 1/n
66. Subset Sum with Trimming (cont.)
- At each stage, values in the trimmed T are within a factor somewhere between (1-d) and 1 of the corresponding values in the untrimmed T.
- The final result (after n iterations) is within a factor somewhere between (1-d)^n and 1 of the result produced by the original algorithm.

67. Subset Sum with Trimming (cont.)
- After trimming, the ratio between successive elements in T is at least 1/(1-d), and all of the values are between 0 and t.
- Hence the maximum number of elements in T is log_{1/(1-d)} t ≈ (ln t) / d.
- This is enough to give us a polynomial bound on the running time of the algorithm.
68. Subset Sum: Trim
- We want to reduce the size of a list by trimming:
  - L: the original list
  - L': the list after trimming L
  - d: trimming parameter, in (0, 1)
  - y: an element that is removed from L
  - z: the corresponding (representing) element in L' (also in L)
  - (y-z)/y ≤ d, i.e., (1-d)·y ≤ z ≤ y
- Example
  - L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}, d = 0.1
  - L' = {10, 12, 15, 20, 23, 29}
  - 11 is represented by 10: (11-10)/11 ≤ 0.1
  - 21, 22 are represented by 20: (21-20)/21 ≤ 0.1, (22-20)/22 ≤ 0.1
  - 24 is represented by 23: (24-23)/24 ≤ 0.1
69. Subset Sum: Trim (2)
- Trim(L, d)  // L = {y1, y2, .., ym}, sorted in increasing order
  - L' = {y1}
  - last = y1  // the most recent element z in L' which represents elements of L
  - for i = 2 to m do
    - if last < (1-d)·yi then  // yi cannot be represented by last
      - append yi onto the end of L'
      - last = yi
  - return L'
- Example
  - L = {10, 11, 12, 15, 20, 21, 22, 23, 24, 29}, d = 0.1
  - L' = {10, 12, 15, 20, 23, 29}
- Running time: O(m)
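Trim can be written directly from the pseudocode. A sketch assuming L is sorted in increasing order (the name `trim` mirrors the slides; the rest is ours):

```python
# A sketch of Trim(L, d); assumes L is sorted in increasing order.

def trim(L, d):
    """Thin out sorted list L so that every removed y has a kept
    representative z in the output with (1-d)*y <= z <= y."""
    trimmed = [L[0]]
    last = L[0]  # most recent kept representative z
    for y in L[1:]:
        if last < (1 - d) * y:  # y cannot be represented by last
            trimmed.append(y)
            last = y
    return trimmed
```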
70. Subset Sum: Approximation Algorithm
- Approx_subset_sum(S, t, e)  // S = {x1, x2, .., xn}
  - L0 = {0}
  - for i = 1 to n do
    - Li = Li-1 ∪ (Li-1 + xi)
    - Li = Trim(Li, e/n)
    - remove elements greater than t from Li
  - return the largest element in Ln
- Example
  - S = {104, 102, 201, 101}, t = 308, e = 0.20, d = e/n = 0.05
  - L0 = {0}
  - L1 = {0, 104}
  - L2 = {0, 102, 104, 206}; after trimming 104: L2 = {0, 102, 206}
  - L3 = {0, 102, 201, 206, 303, 407}; after trimming 206: L3 = {0, 102, 201, 303, 407}; after removing 407: L3 = {0, 102, 201, 303}
  - L4 = {0, 101, 102, 201, 203, 302, 303, 404}; after trimming 102, 203, 303: L4 = {0, 101, 201, 302, 404}; after removing 404: L4 = {0, 101, 201, 302}
  - The algorithm returns 302 (the optimum is 307 = 104 + 102 + 101).
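Putting the pieces together: a self-contained Python sketch of Approx_subset_sum, with the Trim step inlined so the block stands alone (all names are ours):

```python
# A self-contained sketch of Approx_subset_sum with Trim inlined.

def approx_subset_sum(S, t, e):
    """Return a subset sum <= t that is >= (1 - e) times the optimum."""
    n = len(S)
    d = e / n  # per-iteration trimming parameter
    L = [0]
    for x in S:
        # Li = Li-1 ∪ (Li-1 + xi), kept sorted for trimming
        L = sorted(set(L) | {x + y for y in L})
        # Trim(Li, e/n): drop y when the last kept z has (1-d)*y <= z
        trimmed, last = [L[0]], L[0]
        for y in L[1:]:
            if last < (1 - d) * y:
                trimmed.append(y)
                last = y
        # remove elements greater than t
        L = [y for y in trimmed if y <= t]
    return max(L)
```

Running it on the slide's example (S = {104, 102, 201, 101}, t = 308, e = 0.20) reproduces the trace above.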
71. Subset Sum: Correctness
- The approximate solution z is not smaller than (1-e) times the optimal solution y*, i.e., z ≥ (1-e)·y*.
- Proof
  - For every element y in L there is a z in L' such that (1-e/n)·y ≤ z ≤ y.
  - Hence for every element y in the untrimmed Li there is a z in the trimmed Li such that (1-e/n)^i · y ≤ z ≤ y.
  - If y* is the optimal solution, then there is a corresponding z in Ln with (1-e/n)^n · y* ≤ z ≤ y*.
  - Since (1-e) ≤ (1-e/n)^n ((1-e/n)^n is increasing in n),
  - (1-e)·y* ≤ (1-e/n)^n · y* ≤ z.
  - So the value z returned is not smaller than (1-e) times the optimal solution y*.
72. Subset Sum: Correctness (2)
- The approximation algorithm is fully polynomial.
- Proof
  - Successive elements z < z' in a trimmed Li must satisfy z'/z > 1/(1-e/n),
  - i.e., they differ by a factor of more than 1/(1-e/n).
  - The number of elements in each Li is therefore at most
    - log_{1/(1-e/n)} t   (t bounds the largest value)
    - = (ln t) / (-ln(1-e/n))
    - ≤ (ln t) / (e/n)   (Eq. 2.10: x/(1+x) ≤ ln(1+x) ≤ x, for x > -1)
    - = (n ln t) / e
  - So the length of each Li is polynomial in the input size and 1/e,
  - and hence the running time of the algorithm is polynomial.
73. Summary
- Not all problems are computable.
- Some problems can be solved in polynomial time (P).
- Some problems can be verified in polynomial time (NP).
- Nobody knows whether P = NP.
- But the existence of NP-complete problems is often taken as an indication that P ≠ NP.
- In the meantime, we use approximation to find good-enough solutions to hard problems.
74. What's Next?
- But where can we draw the line?
- Does every NP-hard problem have an approximation?
- And to within which factor?
- Can approximation be NP-hard as well?