Title: CSCI 1302
1. CSCI 1302
- Algorithmic Cost Complexity
- Analysis of Data Structures and Algorithms
2. Correctness is Not Enough
- It isn't sufficient that our algorithms perform the required tasks.
- We want them to do so efficiently, making the best use of
- Space
- Time
3. Time and Space
- Time
- Instructions take time.
- How fast does the algorithm perform?
- What affects its runtime?
- Space
- Data structures take space.
- What kind of data structures can be used?
- How does the choice of data structure affect the runtime?
4. Time vs. Space
- Very often, we can trade space for time.
- For example, maintain a collection of students with SSN information.
- Use an array of a billion elements (one per possible SSN) and have immediate access (better time).
- Use an array of 35 elements (one per student) and have to search (better space).
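The trade-off above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the names `MAX_ID`, `build_direct_table`, `lookup_direct`, and `lookup_compact` are hypothetical, and a small ID range stands in for the billion possible SSNs.

```python
# Hypothetical setup: student IDs are small integers in the range 0..MAX_ID-1.
MAX_ID = 1000  # stands in for the "billion" possible SSNs


def build_direct_table(students):
    """One slot per possible ID: a lot of space, constant-time lookup."""
    table = [None] * MAX_ID
    for student_id, name in students:
        table[student_id] = name
    return table


def lookup_direct(table, student_id):
    """Immediate access: a single index operation."""
    return table[student_id]


def lookup_compact(students, student_id):
    """Only one slot per student: little space, but we have to search."""
    for sid, name in students:
        if sid == student_id:
            return name
    return None


students = [(17, "Ada"), (905, "Grace"), (42, "Alan")]
table = build_direct_table(students)
```

Both lookups return the same answers; they differ only in how much memory is reserved and how much work each lookup performs.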
5. The Right Balance
- The best solution uses a reasonable mix of space and time.
- Select effective data structures to represent your data model.
- Utilize efficient methods on these data structures.
6. Scenarios
- I've got two algorithms that accomplish the same task.
- Which is better?
- Given an algorithm, can I determine how long it will take to run?
- Input is unknown.
- Don't want to trace all possible paths of execution.
- For different input, can I determine how an algorithm's runtime changes?
7. Measuring the Growth of Work
- While it is possible to measure the work done by an algorithm for a given set of input, we need a way to
- Measure the rate of growth of an algorithm based upon the size of the input
- Compare algorithms to determine which is better for the situation
8. Why Use Big-O Notation
- Used when we only know the asymptotic upper bound.
- If you are not guaranteed certain input, then it is a valid upper bound that even the worst-case input will be below.
- May often be determined by inspection of an algorithm.
- Thus we don't have to do a proof!
9. Size of Input
- In analyzing rate of growth based upon size of input, we'll use a variable.
- For each factor in the size, use a new variable.
- N is most common.
- Examples
- A linked list of N elements
- A 2D array of N x M elements
- A Binary Search Tree of P elements
10. Formal Definition
- For a given function $g(n)$, $O(g(n))$ is defined to be the set of functions
- $O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } 0 \le f(n) \le c\,g(n) \text{ for all } n \ge n_0 \,\}$
11. Visual O() Meaning
[Figure: work done versus size of input. Beyond $n_0$, the curve $c\,g(n)$ is an upper bound on the curve $f(n)$ for our algorithm, so $f(n) \in O(g(n))$.]
12. Simplifying O() Answers (Throw-Away Math!)
- We say $3n^2 + 2 \in O(n^2)$: drop constants!
- Because we can show that there is an $n_0$ and a $c$ such that
- $0 \le 3n^2 + 2 \le c\,n^2$ for $n \ge n_0$
- i.e., $c = 4$ and $n_0 = 2$ yields
- $0 \le 3n^2 + 2 \le 4n^2$ for $n \ge 2$
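The choice of constants can be spot-checked numerically. The sketch below is only a sanity check over a finite range, not a proof; `bound_holds` and its `n_max` parameter are names assumed for illustration.

```python
def bound_holds(f, g, c, n0, n_max=10_000):
    """Check 0 <= f(n) <= c*g(n) for every integer n in [n0, n_max]."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))


def f(n):
    return 3 * n**2 + 2


def g(n):
    return n**2


print(bound_holds(f, g, c=4, n0=2))  # the constants used on the slide
print(bound_holds(f, g, c=4, n0=1))  # fails at n = 1, since 5 > 4
```

The second call shows why the threshold matters: with c = 4 the inequality only starts to hold at n = 2.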
13. Correct but Meaningless
- You could say
- $3n^2 + 2 \in O(n^6)$ or $3n^2 + 2 \in O(n^7)$
- But this is like answering
- What's the world record for the mile?
- Less than 3 days.
- How long does it take to drive to Chicago?
- Less than 11 years.
14. Comparing Algorithms
- We now know the formal definition of O() notation (and what it means).
- If we can determine the O() of algorithms,
- This establishes the worst they perform.
- Thus now we can compare them and see which has the better performance.
15. Comparing Factors
[Figure: work done versus size of input for the growth rates 1, log N, N, and N^2.]
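To get a feel for how far apart these growth rates are, a small table can be printed. This is an illustrative sketch; `growth_row` is a hypothetical helper, and base-2 logarithms are an assumption (the base only changes a constant factor).

```python
import math


def growth_row(n):
    """Return the work estimates (1, log2 n, n, n^2) for input size n."""
    return (1, math.log2(n), n, n * n)


for n in (8, 64, 1024):
    one, log_n, lin, quad = growth_row(n)
    print(f"N={n:5d}  1={one}  log N={log_n:4.0f}  N={lin:5d}  N^2={quad:8d}")
```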
16. Correctly Interpreting O()
- O(1) or Order One
- Does not mean that it takes only one operation
- Does mean that the work doesn't change as N changes
- Is notation for constant work
- O(N) or Order N
- Does not mean that it takes N operations
- Does mean that the work changes in a way that is proportional to N
- Is notation for work that grows at a linear rate
17. Complex/Combined Factors
- Algorithms typically consist of a sequence of logical steps/sections.
- We need a way to analyze these more complex algorithms.
- It's easy: analyze the sections and then combine them!
18. Example: Insert in a Sorted Linked List
- Insert an element into an ordered list
- Find the right location
- Do the steps to create the node and add it to the list
[Figure: linked list head -> 17 -> 38 -> 142; inserting 75.]
- Step 1: find the location, O(N)
19. Example: Insert in a Sorted Linked List
[Figure: linked list head -> 17 -> 38 -> 75 -> 142.]
- Step 2: do the node insertion, O(1)
20. Combine the Analysis
- Find the right location: O(N)
- Insert node: O(1)
- Sequential, so add
- O(N) + O(1) = O(N + 1) = O(N)
- Only keep the dominant factor
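The two steps can be seen side by side in a short sketch. The `Node` class and the helper names are assumptions made for illustration; the list values follow the example of inserting 75 into 17, 38, 142.

```python
class Node:
    """A singly linked list node (hypothetical minimal definition)."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def insert_sorted(head, value):
    """Insert value into an ascending list; return the (possibly new) head."""
    # Step 1: find the location, O(N) in the worst case.
    prev, cur = None, head
    while cur is not None and cur.data < value:
        prev, cur = cur, cur.next
    # Step 2: create the node and link it in, O(1).
    node = Node(value, cur)
    if prev is None:
        return node
    prev.next = node
    return head


def to_list(head):
    """Collect the node values into a Python list for easy inspection."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out


head = Node(17, Node(38, Node(142)))
head = insert_sorted(head, 75)
print(to_list(head))
```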
21. Example: Search a 2D Array
- Search an unsorted 2D array (row, then column)
- Traverse all rows
- For each row, examine all the cells (changing columns)
[Figure: a 2D array with rows 1 to 5 and columns 1 to 10; traversing the rows is O(N).]
22. Example: Search a 2D Array
[Figure: the same 2D array; examining the cells across one row is O(M).]
23. Combine the Analysis
- Traverse rows: O(N)
- Examine all cells in row: O(M)
- Embedded, so multiply
- O(N) x O(M) = O(N*M)
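A sketch of the row-then-column search makes the multiplication visible by counting the cells examined. The function name `search_2d` and the sample grid are assumptions for illustration.

```python
def search_2d(grid, target):
    """Return ((row, col) or None, number of cells examined)."""
    examined = 0
    for r, row in enumerate(grid):        # traverse rows: N iterations
        for c, value in enumerate(row):   # examine each cell: M iterations
            examined += 1
            if value == target:
                return (r, c), examined
    return None, examined


# A hypothetical 5 x 10 grid (N = 5 rows, M = 10 columns).
grid = [[r * 10 + c for c in range(10)] for r in range(5)]
print(search_2d(grid, -1))  # a missing value forces all N*M cells to be examined
```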
24. Sequential Steps
- If steps appear sequentially (one after another), then add their respective O().
- loop (N iterations)
- . . .
- endloop
- loop (M iterations)
- . . .
- endloop
- Total: O(N + M)
25. Embedded Steps
- If steps appear embedded (one inside another), then multiply their respective O().
- loop
- loop
- . . .
- endloop
- endloop
- With N iterations of one loop and M of the other: O(N*M)
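Counting loop-body executions directly shows the difference between adding and multiplying. The two functions below are hypothetical stand-ins for the loop skeletons; each unit of "work" is one pass through a loop body.

```python
def sequential_work(n, m):
    """Two loops one after another: N + M body executions."""
    steps = 0
    for _ in range(n):
        steps += 1
    for _ in range(m):
        steps += 1
    return steps


def nested_work(n, m):
    """One loop inside another: N * M body executions."""
    steps = 0
    for _ in range(n):
        for _ in range(m):
            steps += 1
    return steps


print(sequential_work(5, 10), nested_work(5, 10))
```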
26. Correctly Determining O()
- Can have multiple factors
- O(N*M)
- O(log P + N^2)
- But keep only the dominant factors
- O(N + N log N) = O(N log N)
- O(N*M + P) remains the same
- O(V^2 + V log V) = O(V^2)
- Drop constants
- O(2N + 3N^2) = O(N + N^2) = O(N^2)
27. What You Should Know So Far
- We use O() notation to discuss the rate at which the work of an algorithm grows with respect to the size of the input.
- O() is an upper bound, so only keep dominant terms and drop constants.
28. Analyzing Data Structures
- We've talked about data structures and methods to act on these structures.
- Linked lists, arrays, trees
- Inserting, Deleting, Searching, Traversal, Sorting, etc.
- Now that we know about O() notation, let's discuss how each of these methods performs on these data structures!
29. Recipe for Determining O()
- Break algorithm down into known pieces
- We'll learn the Big-Os in this section
- Identify relationships between pieces
- Sequential is additive
- Nested (loop / recursion) is multiplicative
- Drop constants
- Keep only dominant factor for each variable
30. Array Size and Complexity
- How can an array change in size?
- const int N = 30
- int Data[N]
- We need to know what N is in advance to declare an array, but for analysis of complexity, we can still use N as a variable for input size.
31. Traversals
- Traversals involve visiting every element in a collection.
- Because we must visit every node, a traversal must be O(N) for any data structure.
- If we visit fewer than N elements, then it is not a traversal.
32. Searching for an Element
- Searching involves determining if an element is a member of the collection.
- Simple/Linear Search
- If there is no ordering in the data structure
- If the ordering is not applicable
- Binary Search
- If the data is ordered or sorted
- Requires non-linear access to the elements
33. Simple Search
- Worst case: the element to be found is the Nth element examined.
- Simple search must be used for
- Sorted or unsorted linked lists
- Unsorted array
- Binary tree
- Binary Search Tree if it is not full and balanced
34. Balanced Binary Search Trees
- If a binary search tree is not full, then in the worst case it takes on the structure and characteristics of a sorted linked list with N/2 elements.
35. Binary Search Trees
- If a binary search tree is not full or balanced, then in the worst case it takes on the structure and characteristics of a sorted linked list.
[Figure: a binary search tree holding 7, 11, 14, 42, and 58, shaped like a linked list.]
36. Example: Linked List
- Let's determine if the value 83 is in the collection.
[Figure: a linked list starting at Head and holding 42, 5, 19, and 35; every node is examined and 83 is not found.]
37. Simple/Linear Search Algorithm
- cur = head
- while ((cur != null) && (cur.Data != target))
-     cur = cur.Next
-
- if (cur != null)
-     DISPLAY "Yes, target is there"
- else
-     DISPLAY "No, target isn't there"
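A runnable rendering of the same loop is sketched below. The `Node` class and the function name `contains` are assumptions for illustration; the sample values follow the linked-list example where 83 is not found.

```python
class Node:
    """A singly linked list node (hypothetical minimal definition)."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def contains(head, target):
    """Walk the list until the target is found or the list runs out."""
    cur = head
    while cur is not None and cur.data != target:
        cur = cur.next
    return cur is not None


head = Node(42, Node(5, Node(19, Node(35))))
print("Yes, target is there" if contains(head, 83) else "No, target isn't there")
```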
38. Pre-Order Search Traversal Algorithm
- As soon as we get to a node, check to see if we have a match.
- Otherwise, look for the element in the left sub-tree.
- Otherwise, look for the element in the right sub-tree.
[Figure: a node holding 14 with unexplored left and right sub-trees.]
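The three bullets above translate almost directly into a recursive function. This is a sketch under assumptions: `TreeNode` and `preorder_search` are hypothetical names, and the sample tree is made up (it is an ordinary binary tree, not a binary search tree).

```python
class TreeNode:
    """A binary tree node (hypothetical minimal definition)."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def preorder_search(node, target):
    """Check the node first, then the left sub-tree, then the right sub-tree."""
    if node is None:
        return False
    if node.data == target:
        return True
    return preorder_search(node.left, target) or preorder_search(node.right, target)


# A made-up unordered tree for illustration.
root = TreeNode(22, TreeNode(14, TreeNode(3), TreeNode(67)), TreeNode(9))
print(preorder_search(root, 9))
```

Because the tree has no ordering to exploit, a failed search visits all N nodes, which is why this counts as a simple search.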
39-51. Find 9
[Figure sequence: a step-by-step pre-order search for 9 using a cur pointer; the node values 22, 14, and 67 appear as the search proceeds, and the final frame reads "9 Found!".]
52. Big-O of Simple Search
- The algorithm has to examine every element in the collection
- To return a false
- If the element to be found is the Nth element
- Thus, simple search is O(N).
53. Binary Search
- We may perform binary search on
- Sorted arrays
- Full and balanced binary search trees
- Tosses out ½ the elements at each comparison.
54. Full and Balanced Binary Search Trees
- Contains approximately the same number of elements in all left and right sub-trees (recursively) and is fully populated.
55-58. Binary Search Example
[Figure sequence: looking for 89 in a sorted collection of 8 elements.]
- 89 not found after 3 comparisons; 3 = log2(8)
59. Binary Search Big-O
- An element can be found by comparing and cutting the work in half.
- We cut work in ½ each time.
- How many times can we cut in half?
- log2 N
- Thus binary search is O(Log N).
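The halving can be observed by counting probes in a standard iterative binary search over a sorted array. This is a sketch; the function name and the sample data are assumptions. Depending on how the midpoint is chosen, the exact probe count for a miss can be up to floor(log2 N) + 1, which is still O(Log N).

```python
def binary_search(items, target):
    """Return (found, probes) for a sorted list; each probe halves the range."""
    lo, hi = 0, len(items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if items[mid] == target:
            return True, probes
        if items[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return False, probes


data = [7, 14, 23, 35, 42, 58, 77, 101]  # hypothetical sorted array, N = 8
print(binary_search(data, 89))
```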
60. Insertion
- Inserting an element requires two steps
- Find the right location
- Perform the instructions to insert
- If the data structure in question is unsorted, then it is O(1)
- Simply insert to the front
- Simply insert to end in the case of an array
- There is no work to find the right spot and only constant work to actually insert.
61. Insert into a Sorted Linked List
- Finding the right spot is O(N)
- Recurse/iterate until found
- Performing the insertion is O(1)
- 4-5 instructions
- Total work is O(N + 1) = O(N)
62. Inserting into a Sorted Array
- Finding the right spot is O(Log N)
- Binary search on the element to insert
- Performing the insertion
- Shuffle the existing elements to make room for the new item
63-69. Shuffling Elements
- Note: we must have at least one empty cell.
[Figure sequence: inserting 29 into a sorted array; 77 and then 35 are each shifted one cell to make room, and 29 is placed just before 35.]
70. Big-O of Shuffle
- Worst case: inserting the smallest number
[Figure: an array holding 35, 77, and 101.]
- Would require moving N elements
- Thus shuffle is O(N)
71. Big-O of Inserting into Sorted Array
- Finding the right spot is O(Log N)
- Performing the insertion (shuffle) is O(N)
- Sequential steps, so add
- Total work is O(Log N + N) = O(N)
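Both steps can be sketched with a fixed-capacity array. The function name and the sample values are assumptions; the binary search uses the standard library's `bisect.bisect_left`, and the shuffle is written out by hand so its linear cost is visible.

```python
import bisect


def insert_into_sorted_array(cells, count, value):
    """Insert value into a fixed-capacity array whose first `count` cells are
    sorted and in use. Returns the number of elements shifted."""
    if count >= len(cells):
        raise ValueError("need at least one empty cell")
    # Find the right spot with binary search: O(Log N).
    pos = bisect.bisect_left(cells, value, 0, count)
    # Shuffle larger elements one cell to the right: O(N) in the worst case.
    for i in range(count, pos, -1):
        cells[i] = cells[i - 1]
    cells[pos] = value
    return count - pos


cells = [12, 35, 77, 101, None]  # hypothetical array with one empty cell
moved = insert_into_sorted_array(cells, 4, 29)
print(cells, moved)
```

Inserting a value smaller than everything already stored shifts all N elements, which is the worst case described above.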
72. Inserting into a Full and Balanced BST
- Always insert when current = null.
- Find the right spot
- Binary search on the element to insert
- Perform the insertion
- 4-5 instructions to create and add the node
73-78. Full and Balanced BST Insert
[Figure sequence: adding 4 to a binary search tree holding 12, 41, 3, 98, 7, 35, and 2; the search descends from the root one level at a time, and 4 is attached as a new leaf.]
79. Big-O of Full Balanced BST Insert
- Find the right spot is O(Log N)
- Performing the insertion is O(1)
- Sequential, so add
- Total work is O(Log N + 1) = O(Log N)
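A sketch of the insertion that counts comparisons on the way down shows the logarithmic search followed by constant-time linking. `TreeNode` and `bst_insert` are hypothetical names, and the assumption that 12 is the root with 3 and 41 as its children is inferred from the values in the example, not stated by the slides.

```python
class TreeNode:
    """A binary search tree node (hypothetical minimal definition)."""

    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def bst_insert(root, value):
    """Insert value; return (root, comparisons made while descending)."""
    if root is None:
        return TreeNode(value), 0
    cur, comparisons = root, 0
    while True:
        comparisons += 1
        if value < cur.data:
            if cur.left is None:       # found the empty spot: O(1) to link
                cur.left = TreeNode(value)
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = TreeNode(value)
                break
            cur = cur.right
    return root, comparisons


root = None
for key in (12, 3, 41, 2, 7, 35, 98):  # assumed insertion order giving a full tree
    root, _ = bst_insert(root, key)
root, steps = bst_insert(root, 4)
print(steps)
```

With 7 elements in a full and balanced tree, the descent touches one node per level, so the comparison count tracks the height rather than N.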
80. Comparing Data Structures and Methods
Data Structure     Traverse   Search   Insert
Unsorted L List    N          N        1
Sorted L List      N          N        N
Unsorted Array     N          N        1
Sorted Array       N          Log N    N
Binary Tree        N          N        1
BST                N          N        N
FB BST             N          Log N    Log N
81. Two Sorting Algorithms
- Bubblesort
- Brute-force method of sorting
- Loop inside of a loop
- Mergesort
- Divide and conquer approach
- Recursively call, splitting in half
- Merge sorted halves together
82-88. Bubblesort
- Bubblesort works by comparing and swapping values in a list.
[Figure sequence: one pass of Bubblesort over a six-element array (cells 1 to 6) holding 5, 12, 35, 42, 77, and 101. Adjacent values are compared and swapped when out of order, with no swap when they are already in order; at the end of the pass the largest value is correctly placed.]
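89. Bubblesort
- void Bubblesort(ref int[] A)
- {
-     int to_do, index;
-     to_do = N - 1;                 // N = size of A; to_do starts at N-1
-     while (to_do != 0) {
-         index = 1;                 // array cells are numbered from 1
-         while (index <= to_do) {
-             if (A[index] > A[index + 1])
-                 Swap(A[index], A[index + 1]);
-             index++;
-         }
-         to_do--;                   // the largest remaining value is now in place
-     }
- } // Bubblesort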
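A runnable version of the same idea is sketched below using zero-based indexing, with a comparison counter added so the analysis can be checked. The function name, the counter, and the sample values are assumptions for illustration.

```python
def bubblesort(values):
    """Sort a list in place (ascending); return the number of comparisons."""
    comparisons = 0
    to_do = len(values) - 1
    while to_do > 0:
        for index in range(to_do):
            comparisons += 1
            if values[index] > values[index + 1]:
                # Swap the out-of-order neighbours.
                values[index], values[index + 1] = values[index + 1], values[index]
        to_do -= 1  # the largest remaining value is now in place
    return comparisons


data = [77, 42, 35, 12, 101, 5]  # hypothetical input order
count = bubblesort(data)
print(data, count)
```

For N = 6 the counter reaches 5 + 4 + 3 + 2 + 1 = 15, which is N(N-1)/2.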
90. Analysis of Bubblesort
- How many comparisons in the inner loop?
- to_do goes from N-1 down to 1, thus
- (N-1) + (N-2) + (N-3) + ... + 2 + 1
- Average N/2 for each pass of the outer loop.
- How many passes of the outer loop?
- N - 1
91. Bubblesort Complexity
- Look at the relationship between the two loops
- Inner is nested inside outer
- Inner will be executed for each iteration of outer
- Therefore the complexity is
- O((N-1)(N/2)) = O(N^2/2 - N/2) = O(N^2)
92. Mergesort
[Figure: mergesort of eight values. The list is split in half repeatedly (Log N levels) down to single elements, then sorted sublists are merged pairwise (Log N levels) into the halves 6 33 42 67 and 14 23 45 98, and finally into 6 14 23 33 42 45 67 98.]
93. Analysis of Mergesort
- Phase I
- Divide the list of N numbers into two lists of N/2 numbers
- Divide those lists in half until each list is size 1
- Log N steps for this stage.
- Phase II
- Build sorted lists from the decomposed lists
- Merge pairs of lists, doubling the size of the sorted lists each time
- Log N steps for this stage.
94. Analysis of the Merging
[Figure: the merge phase for eight values. Four merges produce sorted pairs, two merges produce the sorted halves 6 33 42 67 and 14 23 45 98, and one final merge produces 6 14 23 33 42 45 67 98.]
95. Mergesort Complexity
- Each of the N numerical values is compared or copied during each pass.
- The total work for each pass is O(N).
- There are a total of Log N passes.
- Therefore the complexity is
- O(Log N + N Log N) = O(N Log N)
- The Log N term is the break-apart phase; the N Log N term is the merging.
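The two phases can be sketched as a short recursive function: the recursion splits the list in half, and `merge` does the O(N) work of combining two sorted lists. The function names and the input order are assumptions for illustration.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list: O(N) comparisons/copies."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # copy whatever remains on either side
    merged.extend(right[j:])
    return merged


def mergesort(values):
    """Return a new sorted list; the recursion depth is about Log N."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    return merge(mergesort(values[:mid]), mergesort(values[mid:]))


print(mergesort([98, 23, 45, 14, 6, 67, 33, 42]))  # hypothetical input order
```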
96. Summary
- You now have the O() for basic methods on varied data structures.
- You can combine these in more complex situations.
- Break algorithm down into known pieces
- Identify relationships between pieces
- Sequential is additive
- Nested (loop/recursion) is multiplicative
- Drop constants
- Keep only dominant factor for each variable
97. FIN