Title: Analysis of Computer Algorithm
1Analysis of Computer Algorithm
- Pradondet Nilagupta
- (pom_at_ku.ac.th)
- Department of Computer Engineering
- Kasetsart University
2What is algorithm? (1/2)
- Algorithm
- Recipe for getting things done successfully
- "Recipe" - well defined sequence of computational
steps - "things" - computational problems specifying an
input/output relation - "done" - in finite steps and time
- "successfully" - correctly
- a step-by-step procedure for solving a problem in
a finite amount of time.
3What is algorithm? (2/2)
- Any special method of solving a certain kind of
problem (Webster Dictionary) - An algorithm is "a process, or rules, for
mechanical calculation" (Oxford Concise).
4Example What is an Algorithm?
Problem Input is a sequence of integers stored
in an array. Output the
minimum.
INPUT (instance): 25, 90, 53, 23, 11, 34
OUTPUT: 11
Data structure: an array A holding the input
Algorithm (pseudocode):
  m = A[1]
  for i = 2 to size of input
    if m > A[i] then m = A[i]
  return m
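A minimal C sketch of this minimum-finding algorithm (the function name and the use of a plain int array are illustrative assumptions, not part of the slides):

#include <stdio.h>

/* Return the minimum of the first n elements of a[] (assumes n >= 1). */
int find_min(const int a[], int n)
{
    int m = a[0];                 /* m = A[1] in the slide's 1-based notation */
    for (int i = 1; i < n; i++)   /* for i = 2 to size of input */
        if (m > a[i])
            m = a[i];
    return m;
}

int main(void)
{
    int instance[] = { 25, 90, 53, 23, 11, 34 };
    printf("%d\n", find_min(instance, 6));   /* prints 11 */
    return 0;
}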
5What is a program?
- A program is the expression of an algorithm in a
programming language - a set of instructions which the computer will
follow to solve a problem
6Why do we analyze about them?
- understand their behavior, and (Job -- Selection,
performance, modify) - improve them. (Research)
7What do we analyze about them?
- Correctness
- Does the input/output relation match algorithm
requirement? - Amount of work done (aka complexity)
- Basic operations to do task
- Amount of space used
- Memory used
8What do we analyze about them?
- Simplicity, clarity
- Verification and implementation.
- Optimality
- Is it impossible to do better?
9Importance of Analyzing Algorithms
- Need to recognize limitations of various
algorithms for solving a problem - Need to understand relationship between problem
size and running time - When is a running program not good enough?
- Need to learn how to analyze an algorithm's
running time without coding it
10Importance of Analyzing Algorithms
- Need to learn techniques for writing more
efficient code - Need to recognize bottlenecks in code as well as
which parts of code are easiest to optimize
11Complexity
- The complexity of an algorithm is simply the
amount of work the algorithm performs to complete
its task.
12RAM model
- has one processor
- executes one instruction at a time
- each instruction takes "unit time"
- has fixed-size operands, and
- has fixed size storage (RAM and disk).
13What is the running time of this algorithm?
- PUZZLE(x)
    while x != 1
      if x is even then x = x / 2
      else x = 3x + 1
- Sample run: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1
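A small C sketch of PUZZLE (the printing of the trace is an assumption for illustration). Whether this loop terminates for every x is the open Collatz question, which is why the slide asks what its running time is:

#include <stdio.h>

/* Repeatedly halve even values and map odd values x -> 3x + 1 until x == 1. */
void puzzle(unsigned long x)
{
    while (x != 1) {
        if (x % 2 == 0)
            x = x / 2;
        else
            x = 3 * x + 1;
        printf("%lu ", x);   /* e.g. puzzle(7) prints 22 11 34 ... 4 2 1 */
    }
    printf("\n");
}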
14The Selection Problem (1/2)
- Problem given a group of n numbers, determine
the kth largest - Algorithm 1
- Store numbers in an array
- Sort the array in descending order
- Return the number in position k
15The Selection Problem(2/2)
- Algorithm 2
- Store first k numbers in an array
- Sort the array in descending order
- For each remaining number, if the number is
larger than the kth number, insert the number in
the correct position of the array - Return the number in position k
- Which algorithm is better?
16Define Problem
- Problem: a description of an input-output relationship
- Algorithm: a sequence of computational steps that transform the input into the output
- Data structure: an organized method of storing and retrieving data
- Our task: given a problem, design a correct and good algorithm that solves it
17Example Algorithm A
Problem: the input is a sequence of integers stored in an array. Output the minimum.
Algorithm A
18Example Algorithm B
This algorithm uses two temporary arrays.
- 1. copy the input a to array t1; assign n = size of input
- 2. while n > 1
  - for i = 1 to n/2
    - t2[i] = min(t1[2i - 1], t1[2i])
  - copy array t2 to t1
  - n = n/2
- 3. output t2[1]
19Visualize Algorithm B
(Figure: the pairwise-minimum tree for the input 7, 8, 9, 5, 6, 11, 34, 20; each loop halves the array of candidates until only the minimum, 5, remains after loop 3.)
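A C sketch of Algorithm B (0-based indexing; the helper name, the MAXN limit, and the power-of-two assumption on n are illustrative choices, not from the slides):

#include <string.h>

#define MAXN 64

/* Tournament-style minimum: repeatedly halve the array of candidates.
   Assumes n is a power of two and n <= MAXN. */
int find_min_tournament(const int a[], int n)
{
    int t1[MAXN], t2[MAXN];
    memcpy(t1, a, n * sizeof a[0]);           /* step 1: copy input to t1 */
    while (n > 1) {                           /* step 2 */
        for (int i = 0; i < n / 2; i++)       /* pair up neighbours */
            t2[i] = t1[2 * i] < t1[2 * i + 1] ? t1[2 * i] : t1[2 * i + 1];
        memcpy(t1, t2, (n / 2) * sizeof t1[0]);
        n = n / 2;
    }
    return t1[0];                             /* step 3: the minimum */
}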
20Example Algorithm C
Sort the input in increasing order. Return
the first element of the sorted data.
(Figure: the sorting routine is used as a black box.)
21Example Algorithm D
For each element, test whether it is the minimum.
22Which algorithm is better?
- The algorithms are correct, but which is the
best? - Measure the running time (number of operations
needed). - Measure the amount of memory used.
- Note that the running time of the algorithms increases as the size of the input increases.
23What do we need?
- Correctness Whether the algorithm computes
the correct solution for all instances. - Efficiency Resources needed by the algorithm
- 1. Time Number of steps.
- 2. Space amount of memory used.
- Measurement model Worst case, Average case
and Best case.
24Many important problems have no useful,
algorithmic solution
- intractable computation known algorithms for the
problem require too much time and/or memory to be
useful (example finding an optimal route) - noncomputable problem a problem which no
algorithm can solve (example determining whether
two programs are equivalent) - unknown algorithm a problem for which no one
knows an algorithm (example understanding
English)
25Algorithms vary in efficiency
example: sum the numbers from 1 to n
efficiency:
  space: 3 memory cells
  time: t(step 1) + t(step 2) + n * t(step 4) + n * t(step 5)
- space requirement is constant (i.e. independent of n)
- time requirement is linear (i.e. grows linearly with n). This is written O(n)
to see this graphically...
26Algorithm I's time requirements
The exact equation for the line is unknown because we lack precise values for the constants m and b. But we can say that time is a linear function of the size of the problem: time = O(n).
27Algorithm II for summation
First, consider a specific case: n = 100.
The key insight, due to Gauss: the numbers can be grouped into 50 pairs of the form
  1 + 100 = 101, 2 + 99 = 101, ..., 50 + 51 = 101
Second, generalize the formula for any (even) n:
  sum = (n / 2) * (n + 1)
Time requirement is constant: time = O(1)
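A brief C sketch contrasting the two summation algorithms (function names are made up for illustration; the closed form below works for any n, even or odd):

/* Algorithm I: linear time, adds the numbers one by one. */
long sum_loop(long n)
{
    long sum = 0;
    for (long i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* Algorithm II: constant time, Gauss's closed form n(n + 1)/2. */
long sum_gauss(long n)
{
    return n * (n + 1) / 2;
}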
28Sequential Search of a student database
What are this algorithm's time requirements? Hint: focus on the loop body
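The search loop itself appears only as a figure; a plausible C sketch of sequential search over an array of student numbers (all identifiers here are assumptions) would be:

/* Return the index of studentNum in student[0..n-1], or -1 if absent. */
int sequential_search(const int student[], int n, int studentNum)
{
    for (int i = 0; i < n; i++)       /* loop body runs at most n times */
        if (student[i] == studentNum)
            return i;
    return -1;
}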
29Time requirements for sequential search
because the amount of work is a constant multiple
of n, the time requirement is O(n) in the worst
case and the average case.
30O(n) searching is too slow
Consider searching UT's student database using sequential search on a computer capable of 20,000 integer comparisons per second, with n = 150,000 (students registered during the past 10 years):
  average case: (150,000 comparisons / 2) x (1 second / 20,000 comparisons) = 3.75 seconds
31Searching an ordered list is faster: an example of binary search
32The binary search algorithm
assuming that the entries in student are sorted
in increasing order,
1. ask user to input studentNum to search for
2. set found to no
3. while not done searching and found = no
4.   set middle to the index counter at the middle of the student list
5.   if studentNum = student[middle] then set found to yes
6.   if studentNum < student[middle] then chop off the last half of the student list
7.   if studentNum > student[middle] then chop off the first half of the student list
8. if found = no then print "no such student" else print "studentNum found at array index middle"
33The binary search algorithm
assuming that the entries in student are sorted
in increasing order,
1. ask user to input studentNum to search for
2. set beginning to 1
3. set end to n
4. set found to no
5. while beginning <= end and found = no
6.   set middle to (beginning + end) / 2, rounded down to the nearest integer
7.   if studentNum = student[middle] then set found to yes
8.   if studentNum < student[middle] then set end to middle - 1
9.   if studentNum > student[middle] then set beginning to middle + 1
10. if found = no then print "no such student" else print "studentNum found at array index middle"
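A runnable C version of this pseudocode (0-based indexing instead of the slide's 1-based list; the identifiers are assumptions):

/* Binary search in a sorted array student[0..n-1]; returns index or -1. */
int binary_search(const int student[], int n, int studentNum)
{
    int beginning = 0, end = n - 1;
    while (beginning <= end) {
        int middle = (beginning + end) / 2;      /* rounded down */
        if (studentNum == student[middle])
            return middle;                        /* found */
        else if (studentNum < student[middle])
            end = middle - 1;                     /* chop off the last half */
        else
            beginning = middle + 1;               /* chop off the first half */
    }
    return -1;                                    /* no such student */
}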
34Time requirements for binary search
At each iteration of the loop, the algorithm cuts
the list (i.e. the list called student) in
half. In the worst case (i.e. when studentNum
is not in the list called student) how many times
will this happen?
n = 16:  1st iteration 16/2 = 8, 2nd iteration 8/2 = 4, 3rd iteration 4/2 = 2, 4th iteration 2/2 = 1
The number of times a number n can be cut in half and not go below 1 is log2 n. Said another way, log2 n = m is equivalent to 2^m = n.
In the average case and the worst case, binary search is O(log2 n).
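A tiny C sketch of the counting argument above (purely illustrative):

/* Count how many times n can be halved before it reaches 1. */
int halvings(int n)
{
    int count = 0;
    while (n > 1) {
        n = n / 2;
        count++;        /* for n = 16 this returns 4 = log2(16) */
    }
    return count;
}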
35This is a major improvement
36Conclusions
- The design of an algorithm can make the
difference between useful and impractical. - Efficient algorithms for searching require lists
that are sorted.
37Time vs. Size of Input
- Measurement is parameterized by the size of the input.
- The algorithms A, B, C are implemented and run on a PC.
- Algorithm D is implemented and run on a supercomputer.
- Let Tk(n) be the amount of time taken by Algorithm k.
38Methods of Proof
- Proof by Contradiction
  - Assume a theorem is false; show that this assumption implies that a property known to be true is false; therefore the original hypothesis must be true
- Proof by Counterexample
  - Use a concrete example to show an inequality cannot hold
- Mathematical Induction
  - Prove a trivial base case, assume true for k, then show the hypothesis is true for k + 1
  - Used to prove recursive algorithms
39Review Induction
- Suppose
  - S(k) is true for a fixed constant k
    - Often k = 0
  - S(n) => S(n+1) for all n >= k
- Then S(n) is true for all n >= k
40Proof By Induction
- Claim: S(n) is true for all n >= k
- Basis
  - Show the formula is true when n = k
- Inductive hypothesis
  - Assume the formula is true for an arbitrary n
- Step
  - Show that the formula is then true for n + 1
41Induction Example Gaussian Closed Form
- Prove 1 + 2 + 3 + ... + n = n(n+1)/2
- Basis
  - If n = 0, then 0 = 0(0+1)/2
- Inductive hypothesis
  - Assume 1 + 2 + 3 + ... + n = n(n+1)/2
- Step (show true for n+1)
  - 1 + 2 + ... + n + (n+1) = (1 + 2 + ... + n) + (n+1)
  - = n(n+1)/2 + (n+1) = (n(n+1) + 2(n+1))/2
  - = (n+1)(n+2)/2 = (n+1)((n+1) + 1)/2
42Induction ExampleGeometric Closed Form
- Prove a^0 + a^1 + ... + a^n = (a^(n+1) - 1)/(a - 1) for all a != 1
- Basis: show that a^0 = (a^(0+1) - 1)/(a - 1)
  - a^0 = 1 = (a^1 - 1)/(a - 1)
- Inductive hypothesis
  - Assume a^0 + a^1 + ... + a^n = (a^(n+1) - 1)/(a - 1)
- Step (show true for n+1)
  - a^0 + a^1 + ... + a^(n+1) = (a^0 + a^1 + ... + a^n) + a^(n+1)
  - = (a^(n+1) - 1)/(a - 1) + a^(n+1) = (a^(n+1+1) - 1)/(a - 1)
43What is Algorithm Analysis?
- How to estimate the time required for an algorithm
- Techniques that drastically reduce the running time of an algorithm
- A mathematical framework that more rigorously describes the running time of an algorithm
44Running time for small inputs
45Running time for moderate inputs
46Important Question
- Is it always important to be on the most
preferred curve? - How much better is one curve than another?
- How do we decide which curve a particular
algorithm lies on? - How do we design algorithms that avoid being on
the bad curves?
47Algorithm Analysis(1/5)
- Measures the efficiency of an algorithm or its
implementation as a program as the input size
becomes very large - We evaluate a new algorithm by comparing its
performance with that of previous approaches
- Comparisons are asymptotic analyses of classes of algorithms
- We usually analyze the time required for an
algorithm and the space required for a data
structure
48Algorithm Analysis (2/5)
- Many criteria affect the running time of an
algorithm, including - speed of CPU, bus and peripheral hardware
- design think time, programming time and debugging
time - language used and coding efficiency of the
programmer - quality of input (good, bad or average)
49Algorithm Analysis (3/5)
- Programs derived from two algorithms for solving
the same problem should both be - Machine independent
- Language independent
- Environment independent (load on the system,...)
- Amenable to mathematical study
- Realistic
50Algorithm Analysis (4/5)
- In lieu of some standard benchmark conditions
under which two programs can be run, we estimate
the algorithm's performance based on the number
of key and basic operations it requires to
process an input of a given size - For a given input size n we express the time T to
run the algorithm as a function T(n) - Concept of growth rate allows us to compare
running time of two algorithms without writing
two programs and running them on the same computer
51Algorithm Analysis (5/5)
- Formally, let T(A,L,M) be the total run time for algorithm A if it were implemented in language L on machine M. Then the complexity class of algorithm A is O(T(A,L1,M1)) U O(T(A,L2,M2)) U O(T(A,L3,M3)) U ...
- Call the complexity class V; then the complexity of A is said to be f if V = O(f)
- The class of algorithms to which A belongs is said to be of at most linear/quadratic/etc. growth in the best case if the function TA,best(n) is such (the same also for average and worst case).
52Asymptotic Performance
- How does the algorithm behave as the problem size
gets very large? - Running time
- Memory/storage requirements
- Bandwidth/power requirements/logic gates/etc.
53Asymptotic Notation
- By now you should have an intuitive feel for
asymptotic (big-O) notation - What does O(n) running time mean? O(n2)?O(n lg
n)? - How does asymptotic running time relate to
asymptotic memory usage? - Our first task is to define this notation more
formally and completely
54Analysis of Algorithms
- Analysis is performed with respect to a
computational model - We will usually use a generic uniprocessor
random-access machine (RAM) - All memory equally expensive to access
- No concurrent operations
- All reasonable instructions take unit time
- Except, of course, function calls
- Constant word size
- Unless we are explicitly manipulating bits
55Input Size
- Time and space complexity
- This is generally a function of the input size
- E.g., sorting, multiplication
- How we characterize input size depends
- Sorting: number of input items
- Multiplication: total number of bits
- Graph algorithms: number of nodes + edges
- Etc
56Running Time
- Number of primitive steps that are executed
- Except for time of executing a function call most
statements roughly require the same amount of
time
- y = m * x + b
- c = 5/9 * (t - 32)
- z = f(x) + g(y)
- We can be more exact if need be
57Analysis
- Worst case
- Provides an upper bound on running time
- An absolute guarantee
- Average case
- Provides the expected running time
- Very useful, but treat with care: what is "average"?
- Random (equally likely) inputs
- Real-life inputs
58Function of Growth rate
59Asymptotic Analysis
60Asymptotic Notation (1/2)
- Think of n as the number of records we wish to
sort with an algorithm that takes f(n) to run.
How long will it take to sort n records? - What if n is big?
- We are interested in the range of a function as n
gets large.
61Asymptotic Notation (2/2)
- Will f stay bounded?
- Will f grow linearly?
- Will f grow exponentially?
- Our goal is to find out just how fast f grows
with respect to n.
62Three major notations
- O(g(n)), Big-Oh of g of n, the Asymptotic Upper Bound
- Θ(g(n)), Theta of g of n, the Asymptotic Tight Bound, and
- Ω(g(n)), Omega of g of n, the Asymptotic Lower Bound.
63Example (1/2)
- Example: f(n) = n^2 - 5n + 13.
- The constant 13 doesn't change as n grows, so it is not crucial. The low-order term, -5n, doesn't have much effect on f compared to the quadratic term, n^2.
- We will show that f(n) = Θ(n^2).
- Q: What does it mean to say f(n) = Θ(g(n))?
- A: Intuitively, it means that function f is the same order of magnitude as g.
64Example (2/2)
- Q: What does it mean to say f1(n) = Θ(1)?
- A: f1(n) = Θ(1) means that beyond some n, f1 is bounded above and below by a constant.
- Q: What does it mean to say f2(n) = Θ(n lg n)?
- A: f2(n) = Θ(n lg n) means that beyond some n, f2 is bounded above and below by a constant times n lg n. In other words, f2 is the same order of magnitude as n lg n.
65Big-Oh
- The O symbol was introduced in 1927 to indicate the relative growth of two functions based on their asymptotic behavior; it is now used to classify functions and families of functions
66Upper Bound Notation
- We say Insertion Sort's run time is O(n^2)
  - Properly we should say the run time is in O(n^2)
  - Read O as Big-O (you'll also hear it called "order")
- In general, a function
  - f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) <= c * g(n) for all n >= n0
  - e.g. if f(n) = 1000n and g(n) = n^2, take n0 > 1000 and c = 1; then f(n0) < 1 * g(n0) and we say that f(n) = O(g(n))
- The O notation indicates "bounded above by a constant multiple of."
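A toy C check of the definition with these particular f, g, c, and n0 (purely illustrative; it only spot-checks a finite range and proves nothing by itself):

#include <assert.h>

int main(void)
{
    const long long n0 = 1001, c = 1;
    for (long long n = n0; n <= 100000; n++) {
        long long f = 1000 * n;      /* f(n) = 1000n */
        long long g = n * n;         /* g(n) = n^2   */
        assert(f <= c * g);          /* holds for every n >= n0 */
    }
    return 0;
}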
67Example 1
For all n > 6, g(n) > 1 * f(n). Thus the function f is in the big-O of g; that is, f(n) is in O(g(n)).
68Example 2
There exists an n0 s.t. for all n > n0, f(n) < 1 * g(n). Thus, f(n) is in O(g(n)).
69Example 3
There exist n0 = 5 and c = 3.5 s.t. for all n > n0, f(n) < c * h(n). Thus, f(n) is in O(h(n)).
70Classification of Function BIG O (1/2)
- A function f(n) is said to be of at most logarithmic growth if f(n) = O(log n)
- A function f(n) is said to be of at most quadratic growth if f(n) = O(n^2)
- A function f(n) is said to be of at most polynomial growth if f(n) = O(n^k), for some natural number k > 1
- A function f(n) is said to be of at most exponential growth if there is a constant c such that f(n) = O(c^n), and c > 1
71Classification of Function BIG O (2/2)
- A function f(n) is said to be of at most factorial growth if f(n) = O(n!).
- A function f(n) is said to have constant running time if the size of the input n has no effect on the running time of the algorithm (e.g., assignment of a value to a variable). The equation for this algorithm is f(n) = c.
- Other logarithmic classifications
  - f(n) = O(n log n)
  - f(n) = O(log log n)
72Big O Fact
- A polynomial of degree k is O(n^k)
- Proof
  - Suppose f(n) = b_k n^k + b_(k-1) n^(k-1) + ... + b_1 n + b_0
  - Let a_i = |b_i|
  - f(n) <= a_k n^k + a_(k-1) n^(k-1) + ... + a_1 n + a_0 <= (a_k + a_(k-1) + ... + a_0) n^k for n >= 1, a constant times n^k
73Some Rules
- Transitivity
  - f(n) = O(g(n)) and g(n) = O(h(n)) => f(n) = O(h(n))
- Addition
  - f(n) + g(n) = O(max(f(n), g(n)))
- Polynomials
  - a_0 + a_1 n + ... + a_d n^d = O(n^d)
- Hierarchy of functions
  - n + log n = O(n); 2^n + n^3 = O(2^n)
74Some Rules
- Base of logs ignored
  - log_a n = O(log_b n)
- Power inside logs ignored
  - log(n^2) = O(log n)
- Bases and powers in exponents not ignored
  - 3^n is not O(2^n)
  - a^(n^2) is not O(a^n)
75Lower Bound Notation
- We say Insertion Sort's run time is Ω(n)
- In general, a function
  - f(n) is Ω(g(n)) if there exist positive constants c and n0 such that 0 <= c * g(n) <= f(n) for all n >= n0
- Proof
  - Suppose the run time is an + b
  - Assume a and b are positive (what if b is negative?)
  - an <= an + b
76Example Omega
- Example: n^(1/2) = Ω(lg n).
- Use the definition with c = 1 and n0 = 16. Checks OK.
- Let n >= 16: n^(1/2) >= (1) lg n if and only if n >= (lg n)^2, by squaring both sides.
- This is an example of polynomial vs. log.
77Theta, the Asymptotic Tight Bound
- Theta means that f is bounded above and below by g; Big-Theta implies the "best fit".
- f(n) does not have to be linear itself in order to be of linear growth; it just has to be between two linear functions.
78Examples
- 1. 2n^3 + 3n^2 + n = 2n^3 + 3n^2 + O(n)
     = 2n^3 + O(n^2 + n) = 2n^3 + O(n^2)
     = O(n^3), which is also O(n^4)
- 2. 2n^3 + 3n^2 + n = 2n^3 + 3n^2 + O(n)
     = 2n^3 + Θ(n^2 + n)
     = 2n^3 + Θ(n^2)
     = Θ(n^3)
79Examples
- 0.5n^2 - 5n + 2 = Ω(n^2). Let c = 0.25 and n0 = 25.
  - 0.5n^2 - 5n + 2 >= 0.25 n^2 for all n >= 25
- 0.5n^2 - 5n + 2 = O(n^2). Let c = 0.5 and n0 = 1.
  - 0.5 n^2 >= 0.5n^2 - 5n + 2 for all n >= 1
- 0.5n^2 - 5n + 2 = Θ(n^2) from (a) and (b) above.
  - Use n0 = 25, c1 = 0.25, c2 = 0.5 in the definition.
80Examples
- log_(100/97) n = O(n). (Notation: lg n = log2 n)
  - Try this one on your own.
- It is okay to say that 2n^2 + 3n - 1 = 2n^2 + Θ(n).
  - Recall the mergesort recurrence T(n) = 2T(n/2) + Θ(n).
- 6 * 2^n + n^2 = Ω(2^n). Let c = 1 and n0 = 1.
81Practical Complexity t < 250
82Practical Complexity t < 500
83Practical Complexity t < 1000
84Practical Complexity t < 5000
85Practical Complexity
86Things To Remember in Analysis
- Constants and low-order terms are ignored
  - e.g. if f(n) = 2n^2 then f(n) = O(n^2)
- The most important resource to analyze is running time; other factors are the algorithm used and the input to the algorithm
- The parameter N, usually referring to the number of data items processed, affects running time most significantly.
  - N may be the degree of a polynomial, the size of a file to be sorted or searched, the number of nodes in a graph, etc.
87Things To Remember in Analysis
- Worst case is the amount of time the program would take with the worst possible input configuration
  - the worst case is a bound for the input and is easier to find; it is usually the metric chosen
- Average case is the amount of time a program is expected to take using "typical" input data
  - the definition of "average" can affect the results
  - the average case is much harder to compute
88GENERAL RULES FOR ANALYSIS(1/5)
- 1. Consecutive statements
  - The maximum statement is the one counted
  - e.g. a fragment with a single for-loop followed by a double for-loop is O(n^2).
  - t1 + t2 = O(max(t1, t2))
89GENERAL RULES FOR ANALYSIS(2/5)
- 2. If/Else
- if (cond) then S1 else S2
- running time: the time of the test plus the larger branch, i.e. at most max(t1, t2) plus the test time
90GENERAL RULES FOR ANALYSIS(3/5)
- 3. For loops
  - The running time of a for-loop is at most the running time of the statements inside the for-loop times the number of iterations
  - for (i = 0, sum = 0; i < n; i++)
        sum += a[i];
  - the for loop iterates n times and executes 2 assignment statements each iteration => asymptotic complexity of O(n)
91GENERAL RULES FOR ANALYSIS(4/5)
- 4. Nested for-loops
  - Analyze inside-out. Total running time is the running time of the statement multiplied by the product of the sizes of all the for-loops
  - e.g. for (i = 0; i < n; i++)
        {
            for (j = 0, sum = a[0]; j < i; j++)
                sum += a[j];
            printf("sum for subarray 0 through %d is %d\n", i, sum);
        }
92GENERAL RULES FOR ANALYSIS(5/5)
93General Rule for Analysis
- Strategy for analysis
- analyze from inside out
- analyze function calls first
- if recursion behaves like a for-loop, the analysis is trivial; otherwise, use a recurrence relation to solve it
94Asymptotic Analysis Example
95Analysis of Algorithm A
- Number of assignments is 1 + (at most n-1); we don't know exactly how many, but it is less than n
- Number of comparisons is (n-1)
- Thus, the number of assignments + comparisons is less than 3n - 1. Note that (3n - 1) is in O(n).
- So we say that the worst case running time of Algorithm A is in O(n).
96Analysis of Algorithm B
- This algorithm uses two temporary arrays.
  Step 1: O(n).
  Step 2: n + n/2 + n/4 + ... + 1 <= 2n - 1.
  Step 3: O(1).
  Thus, Algorithm B is in O(n).
97Analysis of Algorithm C
- For each element, test whether it is the minimum.
- Correctness
  - Is this algorithm correct?
  - Can it enter an infinite loop?
98Complexity
- The worst case occurs when the minimum is the last element.
- For convenience, we consider only the testing step and ignore the other steps.
- The number of elements that need to be tested is n.
- Each test takes up to n steps.
- Thus, the worst case running time is in O(n^2); the best case running time is in O(n).
- Assuming that the minimum is equally likely to appear in every location of the array, the average case running time is
  (1/n)*n + (1/n)*2n + ... + (1/n)*n*n = n(n+1)/2, which is in O(n^2).
99Worst-case Running times
- An algorithm may take different amounts of time
on inputs of the same size. - For example, Algorithm B,C
Worst-case analysis The performance of the
algorithm is measured by its running time on the
worst input instance.
Average-case analysis The performance of the
algorithm is measured by the average running time
of all the possible input instances.
Best-case analysis The performance of the
algorithm is measured by its running time on the
best input instance.
100Example
- Example 1: calculate 1^3 + 2^3 + 3^3 + ... + n^3
  unsigned int sum( unsigned int n )
  {
      unsigned int i, partial_sum;
  /*1*/ partial_sum = 0;               // Lines 1 and 4: 1 unit each
  /*2*/ for( i = 1; i <= n; i++ )      // Init 1, tests n+1, increments n
  /*3*/     partial_sum += i * i * i;  // 4 units per execution: 4n
  /*4*/ return partial_sum;
  }
- No cost to call the function or return the value
- Total cost = 2 + 4n + (1 + (n+1) + n) = 6n + 4  =>  O(n)
101Example 2 Analyze the code fragment
- (1) assignment statement takes constant time c1
- (5) for loop, like Ex. 1, takes c2*n = Θ(n) time
- line (4) takes constant time c3
- inner loop (3) executes j times --> cost c3*j
- outer loop (2) executes n times, but the cost of each inner loop is different for each j --> c3 * (sum of j from 1 to n)
- cost = c1 + c2*n + c3*(n(n+1)/2)
- = O(c1 + c2*n + c3*n^2) = Θ(n^2)
  (1) sum = 0;
  (2) for( j = 1; j <= n; j++ )
  (3)     for( i = 1; i <= j; i++ )
  (4)         sum++;
  (5) for( k = 1; k <= n; k++ )
  (6)     A[k] = k - 1;
102Example 3. Compare asymptotic Analysis for
following
- In (1) the inner loop executes n times for each of the n executions of the outer loop --> sum1++ executes Θ(n^2) times
- In (2) the inner loop executes j times on the j-th outer iteration, so sum2++ executes (sum of j from 1 to n) times, which is about n^2/2
- Hence, both loops cost Θ(n^2), but the second requires about half the time of the first
  (1) sum1 = 0;
      for( i = 1; i <= n; i++ )
          for( j = 1; j <= n; j++ )
              sum1++;
  (2) sum2 = 0;
      for( i = 1; i <= n; i++ )
          for( j = 1; j <= i; j++ )
              sum2++;
103Example 4. A for-loop that does not run in Θ(n^2) time
  sum = 0;
  for( k = 1; k <= n; k *= 2 )
      for( j = 1; j <= n; j++ )
          sum++;
- The outer loop executes log n times since k is multiplied by 2 on each iteration, up to n
- The inner loop executes n times
- total cost --> sum over i = 1..log n of n = n log n
104MAXIMUM SUBSEQUENCE SUM PROBLEM
- Given (possibly negative) integers a1, a2, ..., an, find the maximum value of the sum a_i + a_(i+1) + ... + a_j over all 1 <= i <= j <= n.
- Ex.: for input -2, 11, -4, 13, -5, -2 the answer is 20 (a2 through a4).
105ALGORITHM 1 Obvious O(N3)
int MaxSubsequenceSum( const int A[ ], int N )
{
    int ThisSum, MaxSum, i, j, k;
/* 1*/  MaxSum = 0;
/* 2*/  for( i = 0; i < N; i++ )
/* 3*/      for( j = i; j < N; j++ )
        {
/* 4*/          ThisSum = 0;
/* 5*/          for( k = i; k <= j; k++ )
/* 6*/              ThisSum += A[ k ];
/* 7*/          if( ThisSum > MaxSum )
/* 8*/              MaxSum = ThisSum;
        }
/* 9*/  return MaxSum;
}
/* END */
106Algorithm 1
- Line 1 is O(1)
- The loop at line 2 is of size n
- The loop at line 3 could be small or could be of size n as well (worst case)
- The loop at line 5 is of size n
- Lines 7 to 8 take O(n^2) in total
- Total running time is O(n^3); more precisely, Θ(n^3)
107Algorithm 2 Improve O(N2)
int MaxSubsequenceSum( const int A[ ], int N )
{
    int ThisSum, MaxSum, i, j;
/* 1*/  MaxSum = 0;
/* 2*/  for( i = 0; i < N; i++ )
        {
/* 3*/      ThisSum = 0;
/* 4*/      for( j = i; j < N; j++ )
            {
/* 5*/          ThisSum += A[ j ];
/* 6*/          if( ThisSum > MaxSum )
/* 7*/              MaxSum = ThisSum;
            }
        }
/* 8*/  return MaxSum;
}
/* END */
108Algorithm 3 Divide and Conquer (n log n)
- Step 1: Break a big problem into two small sub-problems
- Step 2: Solve each of them efficiently
- Step 3: Combine the two sub-solutions
109Maximum subsequence sum by divide and conquer
Step 1: Divide the array into two parts: the left part and the right part.
The max. subsequence lies completely in the left part, completely in the right part, or spans the middle.
If it spans the middle, then it includes the max subsequence in the left part ending at the center plus the max subsequence in the right part starting just after the center.
110Example
- 4, -3, 5, -2, -1, 2, 6, -2
- Max subsequence sum for the first half: 6 (4, -3, 5); for the second half: 8 (2, 6)
- Max subsequence sum for the first half ending at the last element: 4 (4, -3, 5, -2)
- Max subsequence sum for the second half starting at the first element: 7 (-1, 2, 6)
- Max subsequence sum spanning the middle: 4 + 7 = 11
- The max subsequence spans the middle: 4, -3, 5, -2, -1, 2, 6
111Maxsubsum(A, left, right)
  if left == right, return max(A[left], 0)
  center = (left + right) / 2
  maxleftsum  = Maxsubsum(A, left, center)
  maxrightsum = Maxsubsum(A, center + 1, right)
  maxleftbordersum = 0; leftbordersum = 0
  for (i = center; i >= left; i--)
      leftbordersum += A[i]
      maxleftbordersum = max(maxleftbordersum, leftbordersum)
  Find maxrightbordersum similarly, scanning from center + 1 to right.
  return max(maxleftsum, maxrightsum, maxleftbordersum + maxrightbordersum)
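A self-contained C sketch of this divide-and-conquer routine (the function and helper names are assumptions; it follows the pseudocode above, including the convention that an empty subsequence has sum 0):

static int max3(int a, int b, int c)
{
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Maximum subsequence sum of A[left..right], divide and conquer, O(n log n). */
int MaxSubSum(const int A[], int left, int right)
{
    if (left == right)                       /* base case: one element */
        return A[left] > 0 ? A[left] : 0;

    int center = (left + right) / 2;
    int maxLeftSum  = MaxSubSum(A, left, center);
    int maxRightSum = MaxSubSum(A, center + 1, right);

    int maxLeftBorderSum = 0, leftBorderSum = 0;
    for (int i = center; i >= left; i--) {        /* best sum ending at center */
        leftBorderSum += A[i];
        if (leftBorderSum > maxLeftBorderSum)
            maxLeftBorderSum = leftBorderSum;
    }

    int maxRightBorderSum = 0, rightBorderSum = 0;
    for (int j = center + 1; j <= right; j++) {   /* best sum starting after center */
        rightBorderSum += A[j];
        if (rightBorderSum > maxRightBorderSum)
            maxRightBorderSum = rightBorderSum;
    }

    return max3(maxLeftSum, maxRightSum,
                maxLeftBorderSum + maxRightBorderSum);
}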
112Complexity Analysis
T(1) = 1
T(n) = 2T(n/2) + cn
     = 2(2T(n/4) + cn/2) + cn = 2^2 T(n/2^2) + 2cn
     = 2^2 (2T(n/2^3) + cn/2^2) + 2cn = 2^3 T(n/2^3) + 3cn
     ...
     = 2^k T(n/2^k) + kcn        (let n = 2^k, so k = log n)
T(n) = n T(1) + kcn = n + cn log n = O(n log n)
113Algorithm 4 Linear O(N)
int MaxSubsequenceSum( const int A[ ], int N )
{
    int ThisSum, MaxSum, j;
/* 1*/  ThisSum = MaxSum = 0;
/* 2*/  for( j = 0; j < N; j++ )
        {
/* 3*/      ThisSum += A[ j ];
/* 4*/      if( ThisSum > MaxSum )
/* 5*/          MaxSum = ThisSum;
/* 6*/      else if( ThisSum < 0 )
/* 7*/          ThisSum = 0;
        }
/* 8*/  return MaxSum;
}
/* END */
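A small test driver (hypothetical, not from the slides) that runs the linear algorithm on the earlier example input; all four algorithms should report 20 for it:

#include <stdio.h>

int MaxSubsequenceSum( const int A[ ], int N );   /* Algorithm 4 above */

int main(void)
{
    int a[] = { -2, 11, -4, 13, -5, -2 };
    printf("%d\n", MaxSubsequenceSum(a, 6));      /* prints 20 */
    return 0;
}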
114Insertion Sort Analysis (1/4)
- // A is an array of n elements with an order (<).
- // n = length(A).
- // The pseudocode uses liberal amounts of English.
- // Instead of begin-end or {} pairs, indentation is used for block structure.
- // Error handling and other such things needed in real programs are omitted.
- // How to do each step is machine dependent; constant time per step (i.e. c1, c2, ..., c10).
115Insertion Sort Analysis (2/4)
line  pseudocode                                                      cost   times
 1    // At start, the singleton sequence A[1] is trivially sorted    c1     0
 2    for j = 2 to n                                                  c2     n
 3    // Insert A[j] into the sorted sequence A[1..(j-1)]             c3     0
 4        do key = A[j]                                               c4     n - 1
 5           i = j - 1                                                c5     n - 1
 6    // let tj be the number of times the following
      //   while-loop is tested for the value j                       c6     0
 7           while (i > 0 and A[i] > key)                             c7     sum(j=2..n) tj
 8               do A[i+1] = A[i]                                     c8     sum(j=2..n) (tj - 1)
 9                  i = i - 1                                         c9     sum(j=2..n) (tj - 1)
10           A[i+1] = key                                             c10    n - 1
116Insertion Sort Analysis (3/4)
- // c1 = c3 = c6 = 0 since comments are ignored and take no time.
- // c2 = time to increment and compare the for-loop variable.
- // c4 = c8 = c10 = time to make an array element assignment.
- // c5 = c9 = time to decrement an integer.
- // c7 = time to evaluate the while-loop Boolean condition.
117Insertion Sort Analysis (4/4)
- Example:   1   2   3   4   5   6
- A:        50  20  40  60  10  30    Insert, j = 2, key = 20
-           20  50  40  60  10  30    Insert, j = 3, key = 40
-           20  40  50  60  10  30    Insert, j = 4, key = 60
-           20  40  50  60  10  30    Insert, j = 5, key = 10
-           10  20  40  50  60  30    Insert, j = 6, key = 30
-           10  20  30  40  50  60    done
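For reference, the same insertion sort in C (0-based indexing; the function name is an assumption):

/* Sort a[0..n-1] in increasing order by insertion sort. */
void insertion_sort(int a[], int n)
{
    for (int j = 1; j < n; j++) {        /* "for j = 2 to n" in the 1-based pseudocode */
        int key = a[j];
        int i = j - 1;
        while (i >= 0 && a[i] > key) {   /* shift larger elements to the right */
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;                  /* insert key into its place */
    }
}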
118Analysis of the Algorithm
- The running time depends on
- the number of primitive operations or steps
executed - the input size (in sorting, the number of
elements) and - the input itself (e.g. partially sorted).
- We generally want an upper bound on the running
time. - Multiplying the cost of each step times the
number of times it is run one gets
119Best Case Analysis
- The BEST CASE here is when the array is already sorted.
- For each j = 2, 3, ..., n we find that A[i] = A[j-1] <= key = A[j], since the array is presorted. This means that the while loop is tested (n - 1 times in total) and never entered: tj = 1 for each j.
120Best Case Analysis
- The best case run time is
  Tbest(n) = c2*n + c4*(n - 1) + c5*(n - 1) + c7*(n - 1) + c10*(n - 1)
           = (c2 + c4 + c5 + c7 + c10)*n - (c4 + c5 + c7 + c10)
           = A*n + B, since the ci are machine-dependent constants
- a linear function of n
- Generally the best case scenario is not important and is seldom of interest. It is good only for showing a bad lower bound.
121Worst Case Analysis (1/2)
- The WORST CASE is when the array is already sorted in decreasing (reverse) order.
- For each j = 2, 3, ..., n we find that A[i] > key for every tested i, since the array is decreasing.
- This means that the while loop is tested the maximum number of times for every value of j. That is, tj = j.
122Worst Case Analysis (2/2)
- The worst case run time is Tworst(n) = max time on any input of size n
- Tworst(n) = c2*n + c4*(n - 1) + c5*(n - 1) + c7*(n(n + 1)/2 - 1) + c8*(n(n - 1)/2) + c9*(n(n - 1)/2) + c10*(n - 1)
            = (c7 + c8 + c9)*n^2/2 + (c2 + c4 + c5 + c7/2 - c8/2 - c9/2 + c10)*n - (c4 + c5 + c7 + c10)
            = A*n^2 + B*n + C, a quadratic function of n
- This scenario (worst case) is what is usually wanted in analyzing an algorithm.
123Average Case Analysis (1/2)
- The AVERAGE CASE assumes a random distribution of data. On the average, half of the inner-loop iterations are performed, since about half of the elements A[i] will be greater than key. This means that tj = j/2, and the above sums yield
  Taverage(n) = c2*n + c4*(n - 1) + c5*(n - 1) + c7*((n(n + 1)/2 - 1)/2) + c8*(n(n - 1)/4) + c9*(n(n - 1)/4) + c10*(n - 1)
              = (c7 + c8 + c9)*n^2/4 + (c2 + c4 + c5 + c7/4 - c8/4 - c9/4 + c10)*n - (c4 + c5 + c7/2 + c10)
              = A*n^2 + B*n + C
- a quadratic function of n
124Average Case Analysis (2/2)
- This scenario (average case) is sometimes wanted
in analyzing an algorithm although it is often
difficult to determine what "average" input is. - Running time functions are often too complex to
derive exactly. - In this course we use asymptotic analysis.
125Example: Θ(n^3) vs Θ(n^2)
- As n gets large, Θ(n^2) is better (faster).
126Average Case
- Average case assumes all permutations are equally
likely. The inner loop is Θ(j/2).
the worst case. - Is this a fast sorting algorithm?
- YES, for small n.
- NO, for large n.
127Practical Complexities
- The time complexity function can be used to compare two programs P and Q that perform the same task.
- e.g. assume program P is Θ(n) and program Q is Θ(n^2)
- Assertion: program P is faster than program Q for sufficiently large n
- But if program P actually runs in 10^6 * n milliseconds, program Q runs in n^2 milliseconds, and we always have n < 10^6, then, other factors being equal, we should use program Q.
128Function growth for various values of n