Title: Chapter 9: Greedy Technique
1 Greedy Algorithms Technique, Dr. M. Sakalli,
modified from Levitin and CLRS
2 Greedy Technique
- The first key ingredient is the greedy-choice
property: a globally optimal solution can be
arrived at by making a locally optimal (greedy)
choice.
- In a greedy algorithm, the choice is made on the
fly at each step (while the algorithm progresses):
it takes whatever seems best at the moment and
then solves the subproblems that remain after the
choice is made. The choice made by a greedy
algorithm may depend on the choices made so far,
but it cannot depend on any future choices or on
the solutions to subproblems. Thus, unlike dynamic
programming, which solves the subproblems bottom
up, a greedy strategy usually progresses in a
top-down fashion, making one greedy choice after
another, iteratively reducing each given problem
instance to a smaller one.
3- Algorithms for optimization problems typically
go through a sequence of steps; dynamic
programming determines the best choices for
subproblem solutions bottom up.
- But in many cases much simpler, more efficient
algorithms are possible. A greedy algorithm always
makes the locally optimal choice that seems best
at the current moment, in the hope of reaching a
globally optimal solution. In most cases this does
not yield an optimal solution, but for many
problems it does.
- The activity-selection problem, for which a
greedy algorithm efficiently computes an optimal
solution, is one example; the greedy method also
works well for a wide range of problems, e.g.,
minimum-spanning-tree algorithms, Dijkstra's
algorithm for shortest paths from a single source,
and Chvátal's greedy set-covering heuristic.
- A greedy algorithm constructs a solution to an
optimization problem piece by piece through a
sequence of choices that are
- feasible
- locally optimal
- irrevocable (binding and abiding)
4 Applications of the Greedy Strategy
- Optimal solutions
- change making for normal coin denominations
- minimum spanning tree (MST)
- single-source shortest paths
- simple scheduling problems
- Huffman codes
- Approximations
- traveling salesman problem (TSP)
- knapsack problem
- other combinatorial optimization problems
5 Change-Making Problem
- Given unlimited amounts of coins of
denominations d1 > d2 > ... > dm,
- give change for amount n with the least number
of coins
- Example: d1 = 25c, d2 = 10c, d3 = 5c, d4 = 1c
and n = 48c
- Greedy solution: one d1, two d2, three d4
(25 + 10 + 10 + 1 + 1 + 1 = 48c, six coins)
- Greedy solution is
- optimal for any amount and a normal set of
denominations
- may not be optimal for arbitrary coin
denominations
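The greedy rule above (always take the largest coin that still fits) can be sketched as follows; the function name and the second example amount are illustrative choices, not from the slides.

```python
def greedy_change(n, denominations):
    """Greedy change-making: repeatedly take the largest
    denomination that still fits into the remaining amount.
    Optimal for canonical systems such as 25/10/5/1, but not
    for arbitrary denominations."""
    coins = []
    for d in sorted(denominations, reverse=True):
        while n >= d:
            n -= d
            coins.append(d)
    return coins

print(greedy_change(48, [25, 10, 5, 1]))  # [25, 10, 10, 1, 1, 1] -- 6 coins, optimal
print(greedy_change(30, [25, 10, 1]))     # [25, 1, 1, 1, 1, 1] -- 6 coins, but 10+10+10 needs only 3
```

The second call shows the failure mode on a non-normal denomination set: the greedy choice of 25 is irrevocable, so the cheaper 10+10+10 solution is never reached.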
6 Induced subgraph
- Let a graph be G = (V, E). We say that H is a
subgraph of G, and we write H ⊆ G, if H = (V', E')
with V' ⊆ V and E' ⊆ E.
- H need not contain all the edges of G.
- If H contains every edge of G whose two
endpoints are both vertices of H, then H is called
an induced subgraph. A subgraph may be edge
induced, vertex induced, or neither.
- Note: if at least one path exists between every
pair of vertices, then the graph is connected (but
not yet directed); a directed graph is one whose
every edge can only be followed from one vertex to
another.
7 Directed acyclic graph, DAG
- The definition of a cycle can seem ambiguous
(two or three vertices, or even a single vertex):
if there is a path returning to the same initial
vertex, that is a cyclic case. But!!
- A cyclic graph is a graph that has at least one
cycle through another vertex (at least one edge
for graphs with bidirectional edges, and at least
two for graphs with unidirectional, i.e. directed,
edges); an acyclic graph is one that contains no
cycles. The girth is the number of edges in the
shortest cycle; g > 3 means the graph is
triangle-free.
- A source is a vertex with no incoming edges,
while a sink is a vertex with no outgoing edges.
A directed acyclic graph (DAG) is a directed graph
with no directed cycles.
- A finite DAG must have at least one source and
at least one sink.
- The depth of a vertex in a finite DAG is the
length of the longest path from a source to that
vertex, while its height is the length of the
longest path from that vertex to a sink.
- The length of a finite DAG is the length (number
of edges) of a longest directed path. It is equal
to the maximum height of all sources and equal to
the maximum depth of all sinks. (Wikipedia.)
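The source/sink and depth definitions above can be sketched in code; this is an illustrative helper (function names and the example edge list are my own), computing depths in topological order with Kahn's algorithm.

```python
from collections import defaultdict

def sources_and_sinks(edges):
    """Return (sources, sinks) of a directed graph given as
    (u, v) edge pairs.  A source has no incoming edges; a
    sink has no outgoing edges."""
    nodes, has_in, has_out = set(), set(), set()
    for u, v in edges:
        nodes |= {u, v}
        has_in.add(v)    # v has an incoming edge
        has_out.add(u)   # u has an outgoing edge
    return nodes - has_in, nodes - has_out

def depths(edges):
    """Depth of every vertex of a DAG: length of the longest
    path from a source, computed by processing vertices in
    topological order (Kahn's algorithm)."""
    adj, indeg, nodes = defaultdict(list), defaultdict(int), set()
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
        nodes |= {u, v}
    depth = {v: 0 for v in nodes}
    queue = [v for v in nodes if indeg[v] == 0]  # the sources
    while queue:
        u = queue.pop()
        for v in adj[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return depth

edges = [('a', 'b'), ('a', 'c'), ('b', 'c'), ('c', 'd')]
print(sources_and_sinks(edges))   # sources: {'a'}, sinks: {'d'}
print(depths(edges)['d'])         # longest path a->b->c->d has length 3
```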
8 Minimum Weight Spanning Tree
- A spanning subgraph is a subgraph that contains
all the vertices of the original graph. A spanning
tree is a spanning subgraph that is a tree (a
connected acyclic subgraph) including all vertices
of G.
- Minimum spanning tree of a weighted, connected
graph G: a spanning tree of G of minimum total
weight. For a graph G = (V, E) with weight
function w, the weight of a spanning tree T that
connects all of V is
- w(T) = Σ(u, v)∈T w(u, v)
- If all weights are distinct (w injective), the
MST is unique.
- Running time: O(E lg V) using binary heaps;
O(E + V lg V) using Fibonacci heaps.
9 Partitioning V of G into A and V − A
- GENERIC-MST(G, w)
- 1 A ← ∅ // initialize
- 2 while A does not form a spanning tree // terminate
- 3 do if edge (u, v) is a safe edge for A
- 4 then A ← A ∪ {(u, v)} // add safe edges
- 5 return A
10 How to decide on the light edge
- Definitions: a cut (S, V − S) of an undirected
graph G = (V, E) is a partition of V; a light edge
crossing the cut is an edge of minimum possible
weight with one endpoint in S and the other in
V − S; and a cut respects A if no edge of A
crosses the cut.
- Line 3: do if edge (u, v) is safe for A
- In line 3 of the pseudocode given for
GENERIC-MST(G, w), there must be a spanning tree T
such that A ⊆ T; if there is an edge (u, v) ∈ T
such that (u, v) ∉ A, then (u, v) is said to be
safe for A. The rule: the following theorem.
11 How to decide on the light edge
(figure: a cut (S, V − S), with tree edge (x, y)
crossing the cut and light edge (u, v))
Overlapping subproblems suggest DP, but the MST
structure leads to an efficient greedy algorithm.
- Theorem: Suppose G = (V, E) is a connected,
undirected graph with a real-valued weight
function w defined on E. Let A be a subset of E
that is included in some MST of G, let (S, V − S)
be a cut that respects A, and let (u, v) be a
light edge crossing (S, V − S). Then edge (u, v)
is safe to be added to A.
- Proof (cut and paste): Let T be an MST of G with
A ⊆ T, and suppose the light edge (u, v) ∉ T.
Since T is a spanning tree, it contains a path
from u to v, and some edge (x, y) on that path
also crosses the cut (S, V − S) (one side of which
includes A); keeping both edges would create a
cycle, paying for two crossings. Exclude the
heavier edge and include the light one:
- T' = T − {(x, y)} ∪ {(u, v)}
- Since (u, v) is a light edge crossing the cut,
w(u, v) ≤ w(x, y). Therefore w(T') = w(T) −
w(x, y) + w(u, v) ≤ w(T), so T' is also an MST
containing A ∪ {(u, v)}, and (u, v) is safe for A.
12- Both MST algorithms, Kruskal's and Prim's,
determine a safe edge in line 3 of GENERIC-MST.
- In Kruskal's algorithm, the set A is a forest.
- The safe edge added to A is always a
least-weight edge connecting two distinct
components
- many induced subtrees gradually merge into each
other.
- In Prim's algorithm, the set A forms a single
tree that grows like a snowball, as one mass. The
safe edge added to A is always a least-weight edge
connecting the tree to a vertex not in the tree.
- Corollary
- Let G = (V, E) be a connected, undirected graph
with weight function w on E, let A be a subset of
E that is included in some MST for G, and let C be
a connected component (tree) in the forest
G_A = (V, A). If (u, v) is a light edge connecting
C to some other component in G_A, then edge
(u, v) is safe for A.
13 Kruskal's Algorithm
- It finds a safe edge to add to the growing
forest
- A safe edge (u, v):
- connecting any two trees in the forest and
- having the least weight.
- Let C1 and C2 denote the two trees that are
connected by (u, v). Since (u, v) must be a light
edge connecting C1 to some other tree, by the
corollary it must be a safe edge for C1.
- The next step is then the union of the two
disjoint trees C1 and C2.
- Kruskal's algorithm is a greedy algorithm,
because at each step it adds to the forest an edge
of the least possible weight.
- The implementation of Kruskal's algorithm
employs a disjoint-set data structure to maintain
several disjoint sets of elements. Each set
contains the vertices in a tree of the current
forest. The operation FIND-SET(u) returns a
representative element from the set that contains
u.
- Thus, if FIND-SET(u) = FIND-SET(v), vertices u
and v belong to the same tree; otherwise, if
(u, v) is a light edge, combine the two trees with
the UNION procedure.
14- MST-KRUSKAL(G, w)
- 1 A ← ∅
- 2 for each v ∈ V[G] // for each vertex
- 3 do MAKE-SET(v) // create |V| trees
- 4 sort the edges of E by nondecreasing weight w
- 5 for each e = (u, v) ∈ E, in order of
nondecreasing weight
- 6 do if FIND-SET(u) ≠ FIND-SET(v) // check if
- 7 then A ← A ∪ {(u, v)} // not in the same set
- 8 UNION(u, v) // merge two components
- 9 return A
- The running time for G = (V, E) depends on the
implementation of the disjoint-set data structure.
- Assume the disjoint-set-forest implementation
with the union-by-rank and path-compression
heuristics, since it is the asymptotically fastest
implementation known.
- Initialization: O(V); time to sort the edges in
line 4: O(E lg E).
- There are O(E) operations on the disjoint-set
forest, which in total take O(E α(E, V)) time,
where α is the functional inverse of Ackermann's
function. Since α(E, V) = O(lg E), the total
running time of Kruskal's algorithm is O(E lg E).
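The MST-KRUSKAL pseudocode above can be sketched in Python with a disjoint-set forest (union by rank plus path compression, as assumed in the analysis); the example graph is invented for illustration.

```python
def kruskal(vertices, edges):
    """Kruskal's MST: scan edges in nondecreasing weight order
    and add an edge iff its endpoints lie in different trees of
    the forest, tracked by a disjoint-set forest.
    `edges` is a list of (weight, u, v) triples."""
    parent = {v: v for v in vertices}   # MAKE-SET for every vertex
    rank = {v: 0 for v in vertices}

    def find(v):                        # FIND-SET with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(u, v):                    # UNION by rank
        ru, rv = find(u), find(v)
        if rank[ru] < rank[rv]:
            ru, rv = rv, ru
        parent[rv] = ru
        if rank[ru] == rank[rv]:
            rank[ru] += 1

    mst = []
    for w, u, v in sorted(edges):       # line 4: sort by weight
        if find(u) != find(v):          # line 6: different trees?
            mst.append((w, u, v))       # line 7: (u, v) is safe
            union(u, v)                 # line 8: merge components
    return mst

edges = [(1, 'a', 'b'), (2, 'a', 'c'), (3, 'b', 'c'), (4, 'c', 'd')]
print(kruskal('abcd', edges))   # [(1, 'a', 'b'), (2, 'a', 'c'), (4, 'c', 'd')]
```

Edge (3, 'b', 'c') is rejected because b and c are already in the same component, exactly the cycle check of line 6.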
17 Prim's MST algorithm
- Operates much like Dijkstra's algorithm for
finding shortest paths in a graph. Prim's
algorithm has the property that the edges in the
set A always form a single tree.
- Start from an arbitrary root vertex r (tree A)
and expand one vertex at a time, until the tree
spans all the vertices in V. At each turn, a light
edge connecting a vertex in A to a vertex in
V − A is added to the tree.
- On each iteration, construct T(i+1) from T(i) by
adding the vertex not in T(i) that is closest to
those already in T(i) (this is the greedy step!)
- The same corollary: the rule allows adding the
edges that are safe for A; the algorithm
terminates when all vertices are included in A, at
which point the edges in A form a minimum
spanning tree.
- This strategy is "greedy" since the tree is
augmented at each step with an edge that
contributes the minimum amount possible to the
tree's weight.
- Needs a priority queue for locating the closest
fringe vertex
- Next: analysis from the notes of CLRS, and the
MIT lectures
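The snowball growth described above can be sketched with Python's heapq as the priority queue; the adjacency-list representation and the example graph are illustrative choices, not from the slides.

```python
import heapq

def prim(adj, root):
    """Prim's MST sketch: grow a single tree from `root`,
    repeatedly adding the least-weight edge that crosses from
    the tree to a fringe vertex.  `adj` maps each vertex to a
    list of (weight, neighbor) pairs."""
    in_tree = {root}
    heap = [(w, root, v) for w, v in adj[root]]  # crossing edges
    heapq.heapify(heap)
    mst = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)   # lightest crossing edge
        if v in in_tree:                # stale entry: both ends
            continue                    # already in the tree
        in_tree.add(v)
        mst.append((w, u, v))
        for w2, x in adj[v]:            # new fringe edges
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return mst

adj = {'a': [(1, 'b'), (2, 'c')], 'b': [(1, 'a'), (3, 'c')],
       'c': [(2, 'a'), (3, 'b'), (4, 'd')], 'd': [(4, 'c')]}
print(prim(adj, 'a'))   # [(1, 'a', 'b'), (2, 'a', 'c'), (4, 'c', 'd')]
```

Leaving stale entries in the heap and skipping them on pop is a common simplification of the DECREASE-KEY operation; it keeps the O(E log V) bound with a binary heap.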
19 Notes about Kruskal's algorithm
- The algorithm looks easier than Prim's but is
harder to implement (checking for cycles!)
- Cycle checking: a cycle is created iff the added
edge connects vertices in the same connected
component
- Union-find algorithms
20 Shortest paths: Dijkstra's algorithm
- Single-Source Shortest Paths Problem: given a
weighted connected graph G, find shortest paths
from source vertex s to each of the other vertices
- Dijkstra's algorithm: similar to Prim's MST
algorithm, with a different way of computing
numerical labels. Among vertices not already in
the tree, it finds the vertex u with the smallest
sum
- d_v + w(v, u)
- where
- v is a vertex for which the shortest path has
already been found on preceding iterations (such
vertices form a tree)
- d_v is the length of the shortest path from the
source to v; w(v, u) is the length (weight) of the
edge from v to u
21 Example
(figure: weighted graph on vertices a, b, c, d, e;
the trace below uses edge weights such as
w(a, b) = 3, w(b, d) = 2, w(b, c) = 4,
w(a, d) = 7, w(d, e) = 4)

Tree vertices   Remaining vertices
a(-, 0)         b(a, 3)   c(-, ∞)    d(a, 7)   e(-, ∞)
b(a, 3)         c(b, 3+4) d(b, 3+2)  e(-, ∞)
d(b, 5)         c(b, 7)   e(d, 5+4)
c(b, 7)         e(d, 9)
e(d, 9)
22 Notes on Dijkstra's algorithm
- Doesn't work for graphs with negative weights
- Applicable to both undirected and directed
graphs
- Efficiency
- O(V^2) for graphs represented by a weight matrix
and an array implementation of the priority queue
- O((E + V) log V) for graphs represented by
adjacency lists and a min-heap implementation of
the priority queue
- Fibonacci heap: O(E + V log V), amortized
- Don't mix up Dijkstra's algorithm with Prim's
algorithm!
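A minimal min-heap sketch of Dijkstra's algorithm, run on a graph consistent with the slide-21 trace (the adjacency lists below are my reconstruction from that trace, not given explicitly on the slides):

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths with a min-heap priority
    queue.  `adj` maps vertex -> list of (weight, neighbor);
    all weights must be nonnegative.  Returns a dict of
    shortest distances from `source`."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):    # stale entry: skip
            continue
        for w, v in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w              # relax edge (u, v)
                heapq.heappush(heap, (d + w, v))
    return dist

adj = {'a': [(3, 'b'), (7, 'd')],
       'b': [(3, 'a'), (4, 'c'), (2, 'd')],
       'c': [(4, 'b')],
       'd': [(7, 'a'), (2, 'b'), (4, 'e')],
       'e': [(4, 'd')]}
print(dijkstra(adj, 'a'))   # {'a': 0, 'b': 3, 'd': 5, 'c': 7, 'e': 9}
```

The distances match the slide-21 trace: d is reached via b (3 + 2 = 5, not the direct 7), and e via d (5 + 4 = 9).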
23 Coding Problem
- Coding: assignment of bit strings to alphabet
characters
- Codewords: bit strings assigned to characters of
the alphabet
- Two types of codes:
- fixed-length encoding (e.g., ASCII)
- variable-length encoding (e.g., Morse code)
- Prefix-free codes: no codeword is a prefix of
another codeword
- Problem: if the frequencies of character
occurrences are known, what is the best binary
prefix-free code?
24 Huffman codes
- Any binary tree with edges labeled with 0s and
1s yields a prefix-free code for the characters
assigned to its leaves
- An optimal binary tree minimizing the expected
(weighted average) length of a codeword can be
constructed as follows
- Huffman's algorithm:
- Initialize n one-node trees with the alphabet
characters, and set the tree weights to their
frequencies.
- Repeat the following step n − 1 times: join the
two binary trees with the smallest weights into
one (as left and right subtrees) and make its
weight equal to the sum of the weights of the two
trees.
- Mark edges leading to left and right subtrees
with 0s and 1s, respectively.
25 Example
- character:  A     B     C     D     _
- frequency:  0.35  0.1   0.2   0.2   0.15
- codeword:   11    100   00    01    101
- average bits per character: 2.25
- for fixed-length encoding: 3
- compression ratio: (3 − 2.25)/3 × 100% = 25%
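The n − 1 merge steps above can be sketched with a min-heap; this illustrative version (function name my own) tracks only codeword lengths, since every merge pushes all symbols in the two joined trees one level deeper. Run on the example frequencies, it reproduces the 2.25 bits/character average.

```python
import heapq

def huffman_lengths(freqs):
    """Huffman's algorithm, returning codeword length per
    character.  Each heap entry is (weight, tiebreak, symbols);
    the tiebreak counter keeps symbol lists from ever being
    compared when weights are equal."""
    heap = [(f, i, [c]) for i, (c, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    depth = {c: 0 for c in freqs}
    count = len(heap)
    while len(heap) > 1:                 # n - 1 merge steps
        f1, _, s1 = heapq.heappop(heap)  # two smallest-weight trees
        f2, _, s2 = heapq.heappop(heap)
        for c in s1 + s2:                # every symbol in the merged
            depth[c] += 1                # tree moves one level deeper
        heapq.heappush(heap, (f1 + f2, count, s1 + s2))
        count += 1
    return depth

freqs = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
lengths = huffman_lengths(freqs)
avg = sum(freqs[c] * lengths[c] for c in freqs)
print(lengths)          # {'A': 2, 'B': 3, 'C': 2, 'D': 2, '_': 3}
print(round(avg, 2))    # 2.25, matching the slide
```

The lengths agree with the slide's codewords (A, C, D get 2 bits; B and _ get 3), even though the 0/1 labels themselves depend on left/right choices.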