Title: Graphs and Finding your way in the wilderness
1 Graphs and Finding your way in the wilderness
- Chapter 14 in DSPS
- Chapter 9 in DSAA
2 General Problems
- What is the shortest path from A to B?
- What is the shortest path from A to all nodes?
- What is the shortest/cheapest path between any two nodes?
- Search for a goal node, i.e. a node with specific properties, like a win in chess.
- What is the shortest tour? (visit all vertices)
  - no known polynomial-time algorithm in the number of edges
- What is the longest path from A to B?
3 Some Applications
- Route Finding
  - metacrawler, on the net, claims to find the shortest path between two points
- Game-Playing
  - great increase in chess/checkers end-game play occurred when it was recognized as graph search, not tree search
- Critical Path Analysis
  - multiperson/task job schedule analysis
  - answers: what are the key tasks that can't slip?
- Travel arrangement
  - cheapest cost to meet constraints
4 Definitions
- Graph: a set of edges E and vertices V.
- Edge: a pair of vertices (v, w)
  - edge may be directed or undirected
  - edge may have a cost
  - v and w are said to be adjacent
- Digraph or directed graph: a graph with directed edges
- Path: a sequence of vertices $v_1, \dots, v_n$ where each $\langle v_i, v_{i+1} \rangle$ is an edge.
- Cycle: a path where $v_1 = v_n$
- Tour: a cycle that contains every vertex
- DAG: directed acyclic graph
  - a directed graph with no directed cycles
  - Tree algorithms usually work with DAGs just fine
5 Adjacency Matrix Representation
- Matrix: A = new Boolean(V, V).
  - $O(V^2)$ memory cost, acceptable only if the graph is dense
- If $\langle v_i, v_j \rangle$ is an edge, set $A_{ij}$ = true, else false
- Special matrix multiply operator @
  - $\text{row}_i \mathbin{@} \text{col}_j = (\text{row}_{i1} \land \text{col}_{1j}) \lor (\text{row}_{i2} \land \text{col}_{2j}) \lor \dots$
- In A @ A, if entry ij is true, what does that mean?
  - There is some k so that $v_i \to v_k$ and $v_k \to v_j$, i.e. a length-2 path from $v_i$ to $v_j$.
- Similarly, $A^k$ indicates which pairs of vertices are connected by a length-k path.
- Cost: $O(kn^3)$.
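The boolean product described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `bool_matmul` and the three-vertex example graph are assumptions, not part of the slides.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is the OR over k of (a[i][k] AND b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [[any(a[i][k] and b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(a))]


# Hypothetical 3-vertex digraph with edges 0 -> 1 and 1 -> 2.
A = [[False, True, False],
     [False, False, True],
     [False, False, False]]

# A2[i][j] is true exactly when a length-2 path from i to j exists.
A2 = bool_matmul(A, A)
```

Here the only true entry of `A2` is `A2[0][2]`, matching the single length-2 path 0 -> 1 -> 2. Each product costs $O(n^3)$, which is where the $O(kn^3)$ bound for $A^k$ comes from.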
6 Matrix Review
- If A has n rows and k columns and B has k rows and m columns, then AB = C
- where C has n rows and m columns and (standard)
  - $C_{ij} = A_{i1}B_{1j} + \dots + A_{ik}B_{kj}$
  - or the i-j entry of C is the dot product of row i of A and column j of B. Total time cost is $O(nkm)$.
- Example: a matrix of size 3 by 3 represents a linear transformation from $R^3$ into $R^3$.
  - Points in $R^3$ are represented as a 3 by 1 (column) vector
- Essentially matrices stretch/shrink or rotate points.
- Determinant defines the amount of stretch/shrink.
- Theorem: Locally every differentiable function can be approximated by a matrix (linear transformation).
7 Cost Matrix Representation
- Now $A_{ij}$ = cost of the edge from $v_i$ to $v_j$.
- If no edge, either set the cost to Infinity or add a Boolean attribute to indicate no edge.
- New multiplication operation
  - $\text{row}_i \mathbin{@} \text{col}_j = \min(\text{row}_{i1} + \text{col}_{1j},\ \text{row}_{i2} + \text{col}_{2j},\ \dots,\ \text{row}_{in} + \text{col}_{nj})$
- Now $A^2$ contains the minimum cost path of length 2 between any 2 vertices.
- $A^n$ has complexity $O(n^4)$. Not good.
- $A^k$ records the minimum cost path of length exactly k. How to change this to allow all paths of length $\le k$? Set $A_{ii} = 0$.
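As a rough sketch of the new multiplication (min in place of OR, + in place of AND), assuming a square cost matrix with `float("inf")` marking a missing edge. The name `min_plus` and the example costs are made up for illustration.

```python
INF = float("inf")


def min_plus(a, b):
    """Min-plus product: entry (i, j) is the minimum over k of a[i][k] + b[k][j]."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical costs: 0 -> 1 costs 4, 1 -> 2 costs 1, 0 -> 2 costs 7.
# The diagonal is 0, so the product also keeps paths shorter than 2 edges.
C = [[0, 4, 7],
     [INF, 0, 1],
     [INF, INF, 0]]

C2 = min_plus(C, C)
```

With the zero diagonal, `C2[0][2]` is 5: the two-edge route through vertex 1 beats the direct edge of cost 7.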
8 Adjacency List Representation
- Here each vertex is the head of a linked list which stores all the adjacent vertices. The cost to a node, if appropriate, is also stored.
- The linked lists are usually stored in an array, since you probably know how many vertices there are.
- Memory cost is $O(V + E)$, so if $E \ll V^2$, use the list representation.
- Graph is sparse if E is $O(V)$.
- To simplify the discussion, we will assume that each vertex has a method sons which returns all adjacent vertices.
- Now, we will redo the same problems with the list representation.
9 General Graph Search Algorithm: Looking for a node with a property
- Set Store equal to some node
- while (Store is non-empty) do
  - choose (and remove) a node n in Store
  - if n is a solution, stop
  - else add SOME sons of n to Store
- Decisions
  - What should Store be?
  - How do we choose a node?
  - What does add mean?
  - How do we pick which sons to store?
- Cycles are a problem
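One possible reading of the general algorithm, as a hedged Python sketch. The function name `graph_search`, the `depth_first` flag, and the `seen` set used to sidestep the cycle problem are all assumptions made for this illustration. The choice of Store (stack or queue) is the only thing that changes between the two modes.

```python
from collections import deque


def graph_search(start, sons, is_goal, depth_first=True):
    """Generic search: `sons` maps a node to its adjacent nodes,
    `is_goal` tests the property we are looking for."""
    store = deque([start])
    seen = {start}  # guards against cycles: each node enters the store once
    while store:
        # Stack behaviour gives depth-first, queue behaviour gives breadth-first.
        node = store.pop() if depth_first else store.popleft()
        if is_goal(node):
            return node
        for son in sons(node):
            if son not in seen:
                seen.add(son)
                store.append(son)
    return None  # store ran empty without finding a solution
```

For example, with a small dictionary graph `g = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}`, the call `graph_search(0, g.__getitem__, lambda n: n == 3)` returns 3 in either mode, even though the graph contains cycles.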
10 Problem: Is Graph Connected? (matrix rep)
- Note: A is an n by n boolean matrix where n is the number of vertices.
- Set $A_{ii}$ to true
- Set $A_{ij}$ to true if there is an edge between i and j.
- Let $B = A^2$, using boolean arithmetic
- Note $B_{ij}$ is true iff there is a k such that $A_{ik}$ is true and $A_{kj}$ is true, i.e. if there is a 2-path from i to j.
- $A^k$ represents whether a k-path exists between any vertices.
- Let C = boolean sum of $A^i$ where $i = 1, \dots, n-1$. (why?)
- Graph connected if C is all ones.
- Time complexity $O(N^4)$!
- How about directed graphs? Basically the same algorithm.
- For directed graphs, strongly connected means there is a directed path between any two vertices.
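A minimal sketch of this test, assuming the graph is given as a square boolean adjacency matrix. The name `is_connected_matrix` is hypothetical. Because the diagonal is set to true, $A^{n-1}$ already contains every shorter path, so this sketch computes that single power instead of forming the boolean sum explicitly.

```python
def is_connected_matrix(adj):
    """adj: n by n list of booleans. Returns True if every pair is linked by a path."""
    n = len(adj)
    # Set A[i][i] to true, so A^k covers all paths of length <= k.
    a = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    reach = a
    for _ in range(n - 2):  # after the loop, reach is A^(n-1)
        reach = [[any(reach[i][k] and a[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return all(all(row) for row in reach)
```

Each of the roughly n products costs $O(n^3)$, which gives the $O(N^4)$ total quoted on the slide. For a directed graph the same code tests strong connectivity.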
11 Is Undirected Graph G Connected? (adjacency list representation)
- Suppose G is an undirected graph with N vertices.
- Let S be any node
- Do a (depth/breadth) first search of G starting from S, counting the number of nodes reached. Be careful not to double count.
- If the number of nodes does not equal N, G is disconnected.
- Searching a graph is like searching a tree, except that nodes may be revisited.
- Need to keep track of revisits, else infinite loop.
12 Depth First Search Pseudo-Code
- Store: Stack
- Choose: pop
- Initial node: any node
- Add: push only new sons (unvisited ones)
  - keep a boolean field visited, initialized to false.
  - When a node is popped, mark it as visited.
- Graph connected if all nodes visited.
- Properties
  - Memory cost: number of nodes
  - Guaranteed to find a solution, if one exists (not the shortest solution). How could we guarantee that?
  - Time: number of nodes (exponential in the depth for k-ary trees)
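The stack-based search above might look like the following in Python, used here for the connectivity test. This is a sketch under assumptions: the graph is a dictionary from each vertex to the list of its sons, and a `visited` set stands in for the boolean field on each node. The name `is_connected_dfs` is made up.

```python
def is_connected_dfs(graph):
    """graph: dict mapping every vertex to a list of adjacent vertices."""
    if not graph:
        return True
    start = next(iter(graph))  # initial node: any node
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # a node can be pushed more than once before it is popped
        visited.add(node)  # mark when popped, as on the slide
        stack.extend(son for son in graph[node] if son not in visited)
    return len(visited) == len(graph)  # connected if all nodes were visited
```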
13 Breadth First Search
- G is an undirected graph
- As before, each node has a boolean visited field.
- Initial node is arbitrary
- Store: Queue
- Choose: dequeue and mark as visited
- Add: enqueue only those sons that have not been visited
- Properties
  - Time: number of nodes to solution
  - Space: number of nodes
  - Guaranteed to find the shortest solution (fewest edges)
14 Is Directed Graph Acyclic? (array representation)
- Let $A_{ij}$ be true if there is a directed edge from i to j.
- Similar to the previous case, if $B = A^2$ with boolean multiplication, then $B_{ij}$ is true iff there is a directed 2-path from i to j.
- Algorithm
  - For i = 1 to n (why? a simple cycle can use all n vertices)
    - Compute $A^i$.
    - If some diagonal element is true, exit with true (a cycle exists)
  - end for
  - Exit with false (the graph is acyclic).
15 Is Directed Graph Acyclic? (adjacency list rep)
- With care, breadth first search works.
- Define the indegree of a node v as the number of edges of the form (u, v), with u arbitrary.
- Define the outdegree of a node v as the number of edges of the form (v, u), with u arbitrary.
- A node with indegree 0 is like the root of a tree.
- A node with outdegree 0 is a terminal node.
16 Breadth First Search Pseudo-code
- Algorithm idea (has numerous variations/implementations)
- Store: Queue
- Compute indegree of all nodes
- Enqueue all nodes of indegree 0
- While Queue is not empty
  - Dequeue node n and lower the indegree of every node v with an edge (n, v)
  - Enqueue any node whose indegree becomes 0.
- If any node still has positive indegree, then cyclic.
- Why does the algorithm terminate?
- Properties
  - Time and space: number of nodes
  - finds the nodes closest to the roots first.
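The indegree-based test can be sketched as below. The function name `has_cycle` and the dictionary representation (every vertex is a key, mapped to the list of its sons) are assumptions for the example.

```python
from collections import deque


def has_cycle(graph):
    """graph: dict mapping every vertex to the list of vertices it points to."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for w in graph[n]:  # lower the indegree of every node with an edge (n, w)
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    # Any vertex never dequeued still has positive indegree, so it lies on
    # or behind a cycle.
    return removed < len(graph)
```

The loop terminates because each vertex is enqueued at most once: its indegree reaches 0 only one time.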
17 Best-First Search
- Goal: find least cost solution
- Here edges have a cost (positive)
- Store: priority queue
- Add: enqueue(), which puts in right order
- Choose: dequeue(), chooses element of least cost
- Properties
  - Finds cheapest solution
  - Time and memory exponential in depth of tree.
18 Best First Pseudo-Code
- Set distance from S to S to 0
- Priority Queue PQ <- S
- while (PQ is not empty)
  - vertex <- PQ.dequeue()
  - sons <- vertex.sons()
  - for each son in sons
    - PQ.enqueue(son, cost from S to son)
19 Topological Sort
- Given a directed acyclic graph
- Produce a linear ordering of the vertices such that if a path exists from v1 to v2, then v1 is before v2.
- If v1 is before v2, is there a path from v1 to v2?
  - NO (not necessarily)
- Note: there may be multiple correct topological sorts
- Algorithm Idea
  - any vertex with indegree 0 can be first
  - Output and delete that vertex
  - update indegrees of its sons
  - Repeat until empty
- So we need to compute and keep track of indegrees
20 Algorithm Implementation
- HashTable of (vertex, indegree, sons)
- Queue of vertices
- Step 1: read each edge (v, w) and add 1 to the indegree of w
  - linear
- Step 2: Add all vertices with indegree 0 to queue Q.
- Step 3: Process Q by
  - dequeue vertex
  - update indegrees of its sons (constant time each by hashing)
  - enqueue any son whose indegree becomes 0.
- Time complexity: linear
- Space: linear
- Proof: Does everything get enqueued?
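The three steps can be followed almost literally in Python. In this sketch, dictionaries play the role of the hashtable, and the function name `topological_sort`, the edge-list input format, and the `ValueError` on a cyclic input are assumptions rather than anything fixed by the slides.

```python
from collections import deque


def topological_sort(vertices, edges):
    """vertices: iterable of hashable vertices; edges: iterable of (v, w) pairs."""
    indegree = {v: 0 for v in vertices}
    sons = {v: [] for v in indegree}
    for v, w in edges:  # Step 1: one pass over the edges
        sons[v].append(w)
        indegree[w] += 1
    queue = deque(v for v in indegree if indegree[v] == 0)  # Step 2
    order = []
    while queue:  # Step 3
        v = queue.popleft()
        order.append(v)
        for w in sons[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(indegree):
        # Not everything got enqueued: some vertex sits on a cycle.
        raise ValueError("graph is not acyclic")
    return order
```

For example, `topological_sort("abcd", [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])` places `a` first and `d` last; the relative order of `b` and `c` is one of several valid choices.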
21 Unweighted Single-Source Shortest Path Algorithm
- Input: Directed graph and start node S
- Output: the minimum cost, in terms of number of edges traversed, from the start node to all other nodes.
- Idea: do a level order search (breadth-first search)
- We'll use a hashtable to mark elements as seen,
  - i.e. we'll track vertices that we've visited
  - use hashtable to hold this information by marking vertices that have been visited
- We'll use a queue to store the vertices to be opened
  - to open a node means to consider its sons
- Each vertex will have a field for distance to start node.
22 Shortest Path Algorithm
- Set distance from S to S to 0; mark S as visited (enter in hashtable)
- queue <- S
- while (queue is not empty)
  - vertex <- queue.dequeue()
  - record cost to vertex
  - sons <- vertex.sons()
  - newSons <- sons that are unmarked
  - mark newSons as visited, so no vertex is enqueued twice
  - fill in distance measure to newSons (distance to vertex + 1)
  - queue.enqueue(newSons)
- Essentially, breadth-first search
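A compact Python sketch of this breadth-first shortest-path computation follows. It assumes the graph is a dictionary from each vertex to its list of sons, and it lets one dictionary `dist` serve as both the distance field and the visited hashtable. The name `bfs_distances` is hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Returns a dict of edge-count distances from source to every reachable vertex."""
    dist = {source: 0}  # a vertex is "marked" once it has an entry here
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for son in graph.get(vertex, []):
            if son not in dist:  # only new sons are opened later
                dist[son] = dist[vertex] + 1
                queue.append(son)
    return dist
```

Vertices are marked when they are enqueued rather than when they are dequeued, so no vertex enters the queue twice and the first distance recorded is the shortest one.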
23 Discussion
- Will this terminate?
- Will we ever revisit a node?
- Computational cost?
  - $O(E)$
- What are the memory requirements?
  - Suppose we don't count the graph (virtual graphs)
  - Can we bound the queue?
  - Only by $O(E)$, and if m-ary tree, this is exponential.
- Did we need the hashtable?
  - This avoids a linear search of the constructed graph
  - removes a factor of $O(G)$.
24 Positive-Weighted Single-Source Cheapest Path
- Suppose we have positive costs associated with every edge in a directed graph.
- Problem: Find the shortest path (total cost) from a given vertex S to every vertex.
- Solution: Dijkstra's algorithm
  - BFS idea still works, with slight modifications
  - Replace Queue by Priority queue.
  - As before, replace newSons by betterSons.
  - As before, replace add entry by update entry (which may be add)
  - Note: may reopen an old son (if we return with a better path)
- Dense graphs: $O(V^2)$; sparse graphs: $O(E \log V)$
25 Weighted Shortest Path Pseudo-Code
- Set distance from S to S to 0 (on node)
- Priority Queue PQ <- S
- while (PQ is not empty)
  - vertex <- PQ.dequeue() (remove min)
  - mark vertex as visited (enter in hashtable)
  - record cost to vertex
  - sons <- vertex.sons()
  - goodSons <- new sons OR old sons with better cost estimates
  - PQ.enqueue(goodSons) (enqueue puts in proper order)
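The pseudo-code above corresponds roughly to the following Python sketch built on the standard `heapq` module. The function name `dijkstra` and the input format (vertex mapped to a list of `(son, cost)` pairs, costs positive) are assumptions. Since `heapq` has no update operation, a better estimate is pushed as a new entry and stale entries are skipped when popped; vertices must be mutually comparable so that ties on cost can be ordered.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict vertex -> list of (son, cost) pairs with cost > 0."""
    dist = {source: 0}
    visited = set()
    pq = [(0, source)]
    while pq:
        d, vertex = heapq.heappop(pq)  # remove min
        if vertex in visited:
            continue  # stale entry: vertex was already finalized more cheaply
        visited.add(vertex)
        for son, cost in graph.get(vertex, []):
            new_d = d + cost
            # goodSons: new sons, or old sons with a better cost estimate
            if son not in dist or new_d < dist[son]:
                dist[son] = new_d
                heapq.heappush(pq, (new_d, son))
    return dist
```

With a binary heap this runs in $O(E \log V)$, the sparse-graph bound mentioned earlier.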
26 Graphs with negative edge costs
- Dijkstra doesn't work (since we may have cycles which lower the cost)
- Input: Directed graph with arbitrary edge costs and vertex S.
- Output: Minimum cost from S to every vertex OR graph has a negative cost cycle.
- Note: If no negative cost cycles, then a vertex can be visited (expanded) at most V times.
- Algorithm: Add a counter to each vertex so each time it is visited with lower cost, the counter goes up. If the counter exceeds V, then the graph has a negative cost cycle and we exit. Otherwise the queue will eventually be empty.
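One way to realize the counter idea is a queue-based relaxation loop, sketched below. The name `shortest_paths_negative`, the `None` return value for a negative cycle, and the `in_queue` set are assumptions of this sketch. The counter here tracks how many times each vertex is expanded (dequeued), following the note that a vertex is expanded at most V times when no negative cycle exists.

```python
from collections import deque


def shortest_paths_negative(graph, source):
    """graph: dict mapping every vertex to a list of (son, cost) pairs.
    Returns a dict of distances, or None if a negative-cost cycle is reachable."""
    n = len(graph)
    dist = {source: 0}
    count = {v: 0 for v in graph}  # times each vertex has been expanded
    queue = deque([source])
    in_queue = {source}
    while queue:
        v = queue.popleft()
        in_queue.discard(v)
        count[v] += 1
        if count[v] > n:
            return None  # expanded too often: negative-cost cycle
        for w, cost in graph[v]:
            if w not in dist or dist[v] + cost < dist[w]:
                dist[w] = dist[v] + cost
                if w not in in_queue:  # re-open w with its lower cost
                    queue.append(w)
                    in_queue.add(w)
    return dist
```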
27 Weighted Single-Source Shortest-Path Problems for Acyclic Graphs
- Easy since no cycles
- Edge costs may be positive or negative
- Best-first search works
  - Node may be reentrant
  - A reentrant node may require updating its cost.
- Or apply the topological sorting algorithm. (text)
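The topological-sort route can be sketched as follows, under the assumption that the graph is a dictionary mapping every vertex to a list of `(son, cost)` pairs. The name `dag_shortest_paths` is made up. Vertices are relaxed once each, in topological order, so negative edge costs cause no trouble.

```python
from collections import deque

INF = float("inf")


def dag_shortest_paths(graph, source):
    """graph: dict mapping every vertex of a DAG to a list of (son, cost) pairs."""
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w, _ in graph[v]:
            indegree[w] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:  # topological order by repeatedly removing indegree-0 vertices
        v = queue.popleft()
        order.append(v)
        for w, _ in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    dist = {v: INF for v in graph}
    dist[source] = 0
    for v in order:  # every predecessor of v is final before v is processed
        if dist[v] == INF:
            continue  # not reachable from the source
        for w, cost in graph[v]:
            dist[w] = min(dist[w], dist[v] + cost)
    return dist
```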
28 Algorithm Display
- Idea
- Iterative use of breadth first search
- addition of edges to effect other choices
- See Diagrams provided
- Analysis (requires augmented path be cheapest)
- Runs in linear time
29 Minimum Spanning Tree
- Given an undirected connected graph with edge costs
- Output: a subtree of the graph such that
  - it contains all vertices
  - the sum of the costs of its edges is minimum
- If costs are not given, assume 1. What then?
- Note: all spanning trees have the same number of edges
- Application
  - Is an undirected graph with n vertices connected?
  - IFF the minimal spanning tree has n-1 edges.
30 Prim's Algorithm
- Let G be given as (V, E) where V has n vertices
- Let T = empty
- Algorithm idea: grow the cheapest tree
- Choose a random v to start and add to T
- Repeat (until T has n vertices)
  - select the edge of minimum length that does not form a cycle and that attaches to the current tree (how to check?)
  - add edge to T
- The proof is more difficult than the code.
- Complexity depends on G and code
  - $O(V^2)$ for dense graphs
  - $O(E \log V)$ for sparse graphs (use binary heap)
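A heap-based sketch of the idea, assuming an undirected graph stored as a dictionary in which each edge appears in both endpoint lists as `(neighbor, cost)`. The name `prim_mst` is hypothetical. The "does not form a cycle" check reduces to asking whether the far endpoint is already in the tree.

```python
import heapq


def prim_mst(graph, start):
    """graph: dict vertex -> list of (neighbor, cost); each edge listed both ways.
    Returns the tree edges as (v, w, cost) triples."""
    in_tree = {start}
    tree_edges = []
    heap = [(cost, start, w) for w, cost in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        cost, v, w = heapq.heappop(heap)  # cheapest edge leaving the tree so far
        if w in in_tree:
            continue  # both endpoints already in the tree: would form a cycle
        in_tree.add(w)
        tree_edges.append((v, w, cost))
        for x, c in graph[w]:
            if x not in in_tree:
                heapq.heappush(heap, (c, w, x))
    return tree_edges
```

If the result has fewer than n-1 edges, the graph was not connected, which ties back to the connectivity application of spanning trees.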
31 Kruskal's Algorithm
- Given graph G = (V, E)
- Sort edges on the basis of cost.
- Add the least cost edge to Forest, as long as no cycle is formed.
- Cost of cycle checking is?
  - If implemented as adjacency list, $O(E^2)$
  - If implemented as hash table, $O(1)$
- Proof more difficult.
- Time complexity $O(E \log E)$
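A short sketch of Kruskal's algorithm follows. Note that it swaps in a union-find (disjoint-set) structure for the cycle check, which is a different technique from the adjacency-list and hash-table options listed above; the name `kruskal_mst` and the `(cost, v, w)` edge format are likewise assumptions.

```python
def kruskal_mst(vertices, edges):
    """vertices: iterable of hashable vertices; edges: list of (cost, v, w).
    Returns the chosen edges in the order they were added."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the component representative (path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = []
    for cost, v, w in sorted(edges):  # least-cost edges first
        root_v, root_w = find(v), find(w)
        if root_v != root_w:  # different components, so no cycle is formed
            parent[root_v] = root_w
            forest.append((cost, v, w))
    return forest
```

Sorting dominates the running time, which is consistent with the $O(E \log E)$ bound.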
32 Finding the least cost between pairs of points
- Idea: Dynamic programming
- Let $c_{ij}$ be the edge cost between $v_i$ and $v_j$.
- Define $C_{ij}$ as the minimum cost for going from $v_i$ to $v_j$.
- Finding the subproblems
  - Suppose P is the path from $v_i$ to $v_j$ which realizes the minimum cost and $v_k$ is an intermediary node.
  - Then the subpaths from i to k and from k to j must be optimal, otherwise P would not be optimal.
- Now define $D^{k}_{ij}$ as the minimum cost for going from $v_i$ to $v_j$ using any of $v_1, v_2, \dots, v_k$ as an intermediary.
- Define $D_{ij}$ as the minimum cost for going from i to j.
- $D_{ij} = \min(c_{ij},\ \min_k D^{k}_{ij})$.
33 All-Pairs (Floyd's) Pseudo-Code
- Initialization
  - $D^{0}_{ij} = \text{cost}(i, j)$ for all vertices i, j: $O(V^2)$
- $D^{k+1}_{ij} = \min(D^{k}_{ij},\ D^{k}_{i,k+1} + D^{k}_{k+1,j})$
- This last statement is true since the shortest path from $v_i$ to $v_j$ using $v_1, \dots, v_{k+1}$ as intermediaries either doesn't use $v_{k+1}$, or the path divides into a path from $v_i$ to $v_{k+1}$ and one from $v_{k+1}$ to $v_j$.
- The cost of this is $O(V^3)$, i.e. a single loop over all vertices with $V^2$ per loop.
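The recurrence translates into three nested loops. The sketch below assumes a square cost matrix with 0 on the diagonal and `float("inf")` for missing edges; it uses zero-based vertex numbers and overwrites a single matrix in place instead of keeping a separate $D^k$ per step. The name `floyd` and the example costs are made up.

```python
INF = float("inf")


def floyd(cost):
    """cost: n by n matrix of edge costs. Returns the matrix of least path costs."""
    n = len(cost)
    d = [row[:] for row in cost]  # D^0 = cost(i, j)
    for k in range(n):  # allow vertex k as an intermediary
        for i in range(n):
            for j in range(n):
                # Either the best path avoids k, or it splits into i..k and k..j.
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


# Hypothetical 4-vertex digraph.
example = [[0, 3, INF, 7],
           [8, 0, 2, INF],
           [5, INF, 0, 1],
           [2, INF, INF, 0]]
result = floyd(example)
```

The outer loop runs once per vertex and the two inner loops do $V^2$ work each time, matching the $O(V^3)$ cost.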
34 NP vs P
- Multiple ways to define
- Define new computational model (imaginary)
  - add to programming language
    - choose S1, S2, ..., Sn where Si are statements
  - Semantics: the algorithm always chooses the best Si to execute.
- This is the Non-Deterministic model
- If a problem can be solved in polynomial time with the non-deterministic model, it is in the class NP.
- If a problem can be solved in polynomial time on a standard computer (deterministic), then it is in the class P.
- Unsolved (and possibly unsolvable): does NP = P?
35 NP-Completeness
- A problem is in NP if it can be solved in polynomial time on a non-deterministic machine.
- A problem p is NP-hard if any problem in NP can be polynomially reduced to p; it is NP-complete if it is also in NP.
- A problem P1 can be polynomially reduced to P2 if P1 can be solved in polynomial time assuming that P2 can be solved in polynomial time.
- Alternatively, if P1 can be transformed into P2 and solutions of P2 mapped back to P1 and all the transformations take polynomial time.
- This is a way of forming a taxonomy of difficulty of various problems
36 NP-Complete problems
- Boolean Satisfiability
- Traveling Salesman
- Bin Packing: given packages of size a1, ..., an and bins of size k, what is the fewest number of bins needed to store all the packages?
- Scheduling: Given tasks whose times are t1, ..., tn and k processors, what is the minimum completion time?
- Graph: Given a graph, find the clique of maximum size.
  - A clique is a completely connected subgraph.
- Subset-sum: Given a finite set S of n numbers and a target number t, does some subset of S sum to t?
- Vertex Cover: A vertex cover is a subset of vertices which hits every edge. The problem is to find a cover with the fewest number of vertices.