Title: Mergesort, Analysis of Algorithms
1. Mergesort, Analysis of Algorithms
John von Neumann and ENIAC (1945)
2. Why Does It Matter?

| Run time (nanoseconds)                 | 1.3 N^3      | 10 N^2      | 47 N log2 N  | 48 N          |
|----------------------------------------|--------------|-------------|--------------|---------------|
| Time to solve a problem of size:       |              |             |              |               |
|   1,000                                | 1.3 seconds  | 10 msec     | 0.4 msec     | 0.048 msec    |
|   10,000                               | 22 minutes   | 1 second    | 6 msec       | 0.48 msec     |
|   100,000                              | 15 days      | 1.7 minutes | 78 msec      | 4.8 msec      |
|   1 million                            | 41 years     | 2.8 hours   | 0.94 seconds | 48 msec       |
|   10 million                           | 41 millennia | 1.7 weeks   | 11 seconds   | 0.48 seconds  |
| Max size problem solved in one:        |              |             |              |               |
|   second                               | 920          | 10,000      | 1 million    | 21 million    |
|   minute                               | 3,600        | 77,000      | 49 million   | 1.3 billion   |
|   hour                                 | 14,000       | 600,000     | 2.4 billion  | 76 billion    |
|   day                                  | 41,000       | 2.9 million | 50 billion   | 1,800 billion |
| N multiplied by 10, time multiplied by | 1,000        | 100         | 10           | 10            |
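The table entries follow directly from the four running-time formulas. A minimal sketch of that arithmetic, assuming the constants are exactly as printed on the slide; the names `MODELS`, `time_seconds`, and `max_size` are illustrative and not from the slides:

```python
from math import log2

# Running-time models from the slide, in nanoseconds.
MODELS = {
    "1.3 N^3": lambda n: 1.3 * n**3,
    "10 N^2": lambda n: 10 * n**2,
    "47 N log2 N": lambda n: 47 * n * log2(n),
    "48 N": lambda n: 48 * n,
}

def time_seconds(model, n):
    """Predicted time, in seconds, to solve a problem of size n."""
    return MODELS[model](n) / 1e9

def max_size(model, budget_seconds):
    """Largest n whose predicted time fits in the budget."""
    cost = MODELS[model]
    budget_ns = budget_seconds * 1e9
    lo, hi = 1, 2
    while cost(hi) <= budget_ns:      # double until the budget is exceeded
        lo, hi = hi, 2 * hi
    while hi - lo > 1:                # binary search between lo and hi
        mid = (lo + hi) // 2
        if cost(mid) <= budget_ns:
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    for name in MODELS:
        print(name, time_seconds(name, 1000), max_size(name, 1))
```

The slide rounds its entries, so for example the cubic model's exact one-second limit of 916 appears in the table as 920.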
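3. Orders of Magnitude

| Seconds | Equivalent      |
|---------|-----------------|
| 1       | 1 second        |
| 10      | 10 seconds      |
| 10^2    | 1.7 minutes     |
| 10^3    | 17 minutes      |
| 10^4    | 2.8 hours       |
| 10^5    | 1.1 days        |
| 10^6    | 1.6 weeks       |
| 10^7    | 3.8 months      |
| 10^8    | 3.1 years       |
| 10^9    | 3.1 decades     |
| 10^10   | 3.1 centuries   |
| . . .   | forever         |
| 10^21   | age of universe |

| Meters Per Second | Imperial Units  | Example                 |
|-------------------|-----------------|-------------------------|
| 10^-10            | 1.2 in / decade | Continental drift       |
| 10^-8             | 1 ft / year     | Hair growing            |
| 10^-6             | 3.4 in / day    | Glacier                 |
| 10^-4             | 1.2 ft / hour   | Gastro-intestinal tract |
| 10^-2             | 2 ft / minute   | Ant                     |
| 1                 | 2.2 mi / hour   | Human walk              |
| 10^2              | 220 mi / hour   | Propeller airplane      |
| 10^4              | 370 mi / min    | Space shuttle           |
| 10^6              | 620 mi / sec    | Earth in galactic orbit |
| 10^8              | 62,000 mi / sec | 1/3 speed of light      |

| Powers of 2 | Approximately |
|-------------|---------------|
| 2^10        | thousand      |
| 2^20        | million       |
| 2^30        | billion       |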
4. Impact of Better Algorithms
- Example 1: N-body simulation.
  - Simulate gravitational interactions among N bodies.
    - physicists want N = number of atoms in the universe
  - Brute force method: N^2 steps.
  - Appel (1981): N log N steps, enables new research.
- Example 2: Discrete Fourier Transform (DFT).
  - Breaks down waveforms (sound) into periodic components.
    - foundation of signal processing
    - CD players, JPEG, analyzing astronomical data, etc.
  - Grade school method: N^2 steps.
  - Runge-König (1924), Cooley-Tukey (1965): FFT algorithm takes N log N steps, enables new technology.
5. Mergesort
- Mergesort (divide-and-conquer)
  - Divide array into two halves.

  input:   A L G O R I T H M S
6. Mergesort
- Mergesort (divide-and-conquer)
  - Divide array into two halves.
  - Recursively sort each half.

  input:   A L G O R I T H M S
  divide:  A L G O R | I T H M S
  sort:    A G L O R | H I M S T
7. Mergesort
- Mergesort (divide-and-conquer)
  - Divide array into two halves.
  - Recursively sort each half.
  - Merge two halves to make sorted whole.

  input:   A L G O R I T H M S
  divide:  A L G O R | I T H M S
  sort:    A G L O R | H I M S T
  merge:   A G H I L M O R S T
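The three steps (divide, recursively sort, merge) can be sketched in a few lines. This is a minimal illustration that returns a new list rather than sorting in place; the function names `merge` and `mergesort` are chosen here and are not taken from the slides.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:        # <= keeps equal keys in their original order
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])            # at most one of these two is non-empty
    result.extend(right[j:])
    return result

def mergesort(a):
    """Return a sorted list containing the items of sequence a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2                  # divide
    return merge(mergesort(a[:mid]),   # recursively sort each half
                 mergesort(a[mid:]))   # then merge

if __name__ == "__main__":
    print("".join(mergesort("ALGORITHMS")))  # prints AGHILMORST
```

With the input from the slide, the two halves are ALGOR and ITHMS, which sort to AGLOR and HIMST before the final merge.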
8. Mergesort Analysis
- How long does mergesort take?
  - Bottleneck: merging (and copying).
    - merging two files of size N/2 requires N comparisons
  - T(N) = number of comparisons to mergesort N elements.
    - to make analysis cleaner, assume N is a power of 2
- Claim. T(N) = N log2 N.
  - Note: same number of comparisons for ANY file.
    - even already sorted
  - We'll prove this several different ways to illustrate standard techniques.
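Before the proofs, the claim can be checked numerically. The sketch below assumes the recurrence implied by the bullets above, T(1) = 0 and T(N) = 2 T(N/2) + N; the recurrence is not written out in the surviving slide text.

```python
def T(n):
    """Comparisons to mergesort n elements, n a power of 2 (assumed recurrence)."""
    if n == 1:
        return 0
    return 2 * T(n // 2) + n           # sort two halves, then merge with n comparisons

if __name__ == "__main__":
    for k in range(11):
        n = 2 ** k
        assert T(n) == n * k           # n * log2(n), since log2(2^k) = k
        print(n, T(n))
```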
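9. Proof by Picture of Recursion Tree

| Level | Subproblems      | Cost of merging at this level |
|-------|------------------|-------------------------------|
| 0     | T(N)             | N                             |
| 1     | 2 x T(N/2)       | 2 (N/2)                       |
| 2     | 4 x T(N/4)       | 4 (N/4)                       |
| . . . |                  |                               |
| k     | 2^k x T(N / 2^k) | 2^k (N / 2^k)                 |
| . . . |                  |                               |
| last  | N/2 x T(2)       | N/2 (2)                       |

Tree height: log2 N levels. Total: N log2 N.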
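Written as a sum, the picture says that every level costs N and there are log2 N levels. A short derivation, assuming the recurrence T(N) = 2 T(N/2) + N with T(1) = 0 (implied by the analysis slide but not written out in the surviving text):

```latex
T(N) \;=\; \sum_{k=0}^{\log_2 N - 1} 2^k \cdot \frac{N}{2^k}
     \;=\; \sum_{k=0}^{\log_2 N - 1} N
     \;=\; N \log_2 N
```

Level k has 2^k subproblems of size N / 2^k, and the last level (k = log2 N - 1) is the row of T(2) nodes.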
10. Proof by Telescoping
- Claim. T(N) = N log2 N (when N is a power of 2).
- Proof. For N > 1, divide both sides of the recurrence by N and telescope.
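The equations for this proof did not survive in the text. A standard reconstruction, assuming the recurrence T(N) = 2 T(N/2) + N with T(1) = 0:

```latex
\begin{align*}
\frac{T(N)}{N} &= \frac{2\,T(N/2)}{N} + 1 \;=\; \frac{T(N/2)}{N/2} + 1 \\
               &= \frac{T(N/4)}{N/4} + 1 + 1 \\
               &\;\;\vdots \\
               &= \frac{T(N/N)}{N/N} + \underbrace{1 + \cdots + 1}_{\log_2 N} \\
               &= \log_2 N
\end{align*}
```

The last step uses T(1) = 0. Multiplying both sides by N gives T(N) = N log2 N.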
11. Mathematical Induction
- Mathematical induction.
  - Powerful and general proof technique in discrete mathematics.
  - To prove a theorem true for all integers N ≥ 0:
    - Base case: prove it to be true for N = 0.
    - Induction hypothesis: assume it is true for arbitrary N.
    - Induction step: show it is true for N + 1.
- Claim: 0 + 1 + 2 + 3 + . . . + N = N(N+1) / 2 for all N ≥ 0.
- Proof (by mathematical induction):
  - Base case (N = 0).
    - 0 = 0(0+1) / 2.
  - Induction hypothesis: assume 0 + 1 + 2 + . . . + N = N(N+1) / 2.
  - Induction step:
      0 + 1 + . . . + N + (N+1) = (0 + 1 + . . . + N) + (N+1)
                                = N(N+1) / 2 + (N+1)
                                = (N+2)(N+1) / 2
12. Proof by Induction
- Claim. T(N) = N log2 N (when N is a power of 2).
- Proof. (by induction on N)
  - Base case: N = 1.
  - Inductive hypothesis: T(N) = N log2 N.
  - Goal: show that T(2N) = 2N log2 (2N).
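The algebra for the goal is missing from the surviving text. One way to fill it in, assuming the recurrence T(2N) = 2 T(N) + 2N with T(1) = 0:

```latex
\begin{align*}
T(2N) &= 2\,T(N) + 2N \\
      &= 2N \log_2 N + 2N                      && \text{(inductive hypothesis)} \\
      &= 2N \bigl(\log_2 (2N) - 1\bigr) + 2N   && \text{(since } \log_2 (2N) = \log_2 N + 1\text{)} \\
      &= 2N \log_2 (2N)
\end{align*}
```

The base case holds because T(1) = 0 = 1 · log2 1.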
13. Proof by Induction
- What if N is not a power of 2?
  - T(N) satisfies the following recurrence:
      T(1) = 0, and T(N) ≤ T(⌈N / 2⌉) + T(⌊N / 2⌋) + N for N > 1.
- Claim. T(N) ≤ N ⌈log2 N⌉.
- Proof. See supplemental slides.
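The bound can be checked numerically for small N before proving it. This sketch assumes the standard recurrence with one half of size ⌈N/2⌉ and the other of size ⌊N/2⌋, taken with equality (the worst case); the helper name `ceil_log2` is illustrative.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """T(1) = 0; T(n) = T(ceil(n/2)) + T(floor(n/2)) + n (assumed recurrence)."""
    if n <= 1:
        return 0
    return T((n + 1) // 2) + T(n // 2) + n

def ceil_log2(n):
    """Exact integer ceiling of log2(n) for n >= 1, avoiding floating point."""
    return (n - 1).bit_length()

if __name__ == "__main__":
    assert all(T(n) <= n * ceil_log2(n) for n in range(1, 5000))
    print(T(1000), 1000 * ceil_log2(1000))
```

When N is a power of 2 the bound is tight, since ⌈log2 N⌉ = log2 N and the recurrence reduces to the earlier one.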
14. Computational Complexity
- Framework to study efficiency of algorithms. Example: sorting.
- MACHINE MODEL: count fundamental operations.
  - count number of comparisons
- UPPER BOUND: algorithm to solve the problem (worst-case).
  - N log2 N from mergesort
- LOWER BOUND: proof that no algorithm can do better.
  - N log2 N - N log2 e
- OPTIMAL ALGORITHM: lower bound ≈ upper bound.
  - mergesort
15. Decision Tree
Leaves of the decision tree for sorting three elements a1, a2, a3:
- print a1, a2, a3
- print a2, a1, a3
- print a1, a3, a2
- print a3, a1, a2
- print a2, a3, a1
- print a3, a2, a1
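The internal nodes of the tree are not preserved in the text. One possible decision tree with these six leaves can be sketched as nested comparisons; the comparison order below is an assumption, and the function name `sort3` is illustrative.

```python
from itertools import permutations

def sort3(a1, a2, a3):
    """Sort three items with at most three comparisons; each return is one leaf."""
    if a1 <= a2:
        if a2 <= a3:
            return (a1, a2, a3)
        elif a1 <= a3:
            return (a1, a3, a2)
        else:
            return (a3, a1, a2)
    else:
        if a1 <= a3:
            return (a2, a1, a3)
        elif a2 <= a3:
            return (a2, a3, a1)
        else:
            return (a3, a2, a1)

if __name__ == "__main__":
    for p in permutations((1, 2, 3)):
        print(p, "->", sort3(*p))
```

Every root-to-leaf path makes two or three comparisons, so the height is 3, the smallest possible for a binary tree with 3! = 6 leaves.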
16. Comparison Based Sorting Lower Bound
- Theorem. Any comparison based sorting algorithm must use Ω(N log2 N) comparisons.
- Proof. Worst case dictated by tree height h.
  - N! different orderings.
  - One (or more) leaves corresponding to each ordering.
  - Binary tree with N! leaves must have height
      h ≥ log2 (N!) ≥ N log2 N - N log2 e    (Stirling's formula)
- Food for thought. What if we don't use comparisons?
  - Stay tuned for radix sort.
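The height bound can be spelled out in two steps. The sketch below uses the elementary estimate N! ≥ (N/e)^N, a weak form of Stirling's formula, which is enough to obtain the lower bound N log2 N - N log2 e quoted on the Computational Complexity slide:

```latex
\begin{align*}
2^h \;\ge\; \#\text{leaves} \;\ge\; N!
  \quad&\Longrightarrow\quad h \;\ge\; \log_2 (N!) \\
e^N = \sum_{k \ge 0} \frac{N^k}{k!} \;\ge\; \frac{N^N}{N!}
  \quad&\Longrightarrow\quad N! \;\ge\; \left(\frac{N}{e}\right)^{N} \\
h \;\ge\; \log_2 (N!) \;\ge\; N \log_2 \frac{N}{e}
  \;&=\; N \log_2 N - N \log_2 e
\end{align*}
```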
17. Extra Slides
18. Proof by Induction
- Claim. T(N) ≤ N ⌈log2 N⌉.
- Proof. (by induction on N)
  - Base case: N = 1.
  - Define n1 = ⌊N / 2⌋, n2 = ⌈N / 2⌉.
  - Induction step: assume true for 1, 2, . . . , N - 1.
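The chain of inequalities for the induction step is missing from the surviving text. A standard reconstruction, assuming the recurrence T(N) ≤ T(n1) + T(n2) + N with n1 = ⌊N/2⌋ and n2 = ⌈N/2⌉, for N ≥ 2:

```latex
\begin{align*}
T(N) &\le T(n_1) + T(n_2) + N \\
     &\le n_1 \lceil \log_2 n_1 \rceil + n_2 \lceil \log_2 n_2 \rceil + N
          && \text{(inductive hypothesis)} \\
     &\le n_1 \lceil \log_2 n_2 \rceil + n_2 \lceil \log_2 n_2 \rceil + N
          && (n_1 \le n_2) \\
     &=   N \lceil \log_2 n_2 \rceil + N
          && (n_1 + n_2 = N) \\
     &\le N \bigl( \lceil \log_2 N \rceil - 1 \bigr) + N \\
     &=   N \lceil \log_2 N \rceil
\end{align*}
```

The second-to-last step uses n2 = ⌈N/2⌉ ≤ 2^{⌈log2 N⌉} / 2, so ⌈log2 n2⌉ ≤ ⌈log2 N⌉ - 1.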
19. Implementing Mergesort
- uses scratch array
20. Implementing Mergesort
- copy to temporary array
- merge two sorted sequences
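The code from these two slides is not preserved; only the annotations remain. The following is a minimal sketch of an implementation consistent with them, not the course's original code. It sorts in place using one scratch array, and the names `aux`, `_sort`, `lo`, `mid`, and `hi` are assumptions.

```python
def merge(a, aux, lo, mid, hi):
    """Merge sorted a[lo..mid] and a[mid+1..hi] (inclusive bounds) in place."""
    aux[lo:hi + 1] = a[lo:hi + 1]      # copy to temporary (scratch) array
    i, j = lo, mid + 1
    for k in range(lo, hi + 1):        # merge two sorted sequences back into a
        if i > mid:                    # left half exhausted
            a[k] = aux[j]
            j += 1
        elif j > hi:                   # right half exhausted
            a[k] = aux[i]
            i += 1
        elif aux[j] < aux[i]:          # strict < keeps the sort stable
            a[k] = aux[j]
            j += 1
        else:
            a[k] = aux[i]
            i += 1

def _sort(a, aux, lo, hi):
    if hi <= lo:
        return
    mid = (lo + hi) // 2
    _sort(a, aux, lo, mid)
    _sort(a, aux, mid + 1, hi)
    merge(a, aux, lo, mid, hi)

def mergesort(a):
    """Sort list a in place."""
    aux = [None] * len(a)              # scratch array, allocated once
    _sort(a, aux, 0, len(a) - 1)

if __name__ == "__main__":
    letters = list("ALGORITHMS")
    mergesort(letters)
    print("".join(letters))            # prints AGHILMORST
```

Allocating the scratch array once in `mergesort`, rather than inside every call to `merge`, avoids repeated allocation during the recursion.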
21. Profiling Mergesort Empirically
- Striking feature: all numbers SMALL!
- Comparisons: theory N log2 N = 9,966; actual = 9,976.
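The slide does not say how the comparisons were counted or what N was; 9,966 is consistent with N = 1000, since 1000 · log2 1000 ≈ 9,966. As a hedged experiment, the sketch below counts comparisons in a sentinel-based merge, which performs exactly one comparison per output element, matching the "N comparisons per merge" model of the analysis. It is not the course's code, and it assumes the keys are finite numbers.

```python
import random

def mergesort_count(a):
    """Sort list a of finite numbers in place; return the number of key comparisons."""
    if len(a) <= 1:
        return 0
    mid = len(a) // 2
    left, right = a[:mid], a[mid:]
    count = mergesort_count(left) + mergesort_count(right)
    left.append(float("inf"))          # sentinels: neither half ever runs out
    right.append(float("inf"))
    i = j = 0
    for k in range(len(a)):
        count += 1                     # exactly one comparison per output element
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
    return count

if __name__ == "__main__":
    data = random.sample(range(10**6), 1000)
    print(mergesort_count(data))       # prints 9976
```

Under this model the count depends only on N, not on the input order. For N = 1000 it is 9,976, which agrees with the slide's "actual" figure.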
22. Sorting Analysis Summary
- Running time estimates:
  - Home pc executes 10^8 comparisons/second.
  - Supercomputer executes 10^12 comparisons/second.

| Algorithm            | Computer | Thousand | Million   | Billion   |
|----------------------|----------|----------|-----------|-----------|
| Insertion Sort (N^2) | home     | instant  | 2.8 hours | 317 years |
| Insertion Sort (N^2) | super    | instant  | 1 second  | 1.6 weeks |
| Mergesort (N log N)  | home     | instant  | 1 sec     | 18 min    |
| Mergesort (N log N)  | super    | instant  | instant   | instant   |
| Quicksort (N log N)  | home     | instant  | 0.3 sec   | 6 min     |
| Quicksort (N log N)  | super    | instant  | instant   | instant   |

- Lesson 1: good algorithms are better than supercomputers.
- Lesson 2: great algorithms are better than good ones.