Title: Algorithms

1. Algorithms & Data Structures
- COM328
- Algorithm Analysis
2. Learning Outcomes
- At the end of this lecture you should:
  - Understand the need to measure algorithm efficiency.
  - Understand how we can compare the efficiency of algorithms.
  - Be familiar with the concepts of growth rate, and upper and lower performance bounds.
  - Be able to perform some simple algorithm analysis.
  - Understand what is meant by asymptotic complexity and Big-Oh.
  - Be able to determine the Big-Oh value for a given algorithm complexity.
  - Understand the limitations of Big-Oh.
3. Reading Material
- Recommended chapters to review include:
  - Chapter 2, R. Lafore
  - Chapter 5, M. A. Weiss (DS & PS)
  - Chapter 2, M. A. Weiss (DS & AA)
  - Chapter 1, D. S. Malik & P. S. Nair
  - Chapter 3, Goodrich & Tamassia
4. Introduction
- "Languages come and go, but algorithms stand the test of time."
- "An algorithm must be seen to be believed."
  - Donald Knuth
5. Introduction
- We are interested in the design of good algorithms and data structures.
- An algorithm is a step-by-step procedure for performing a task in a finite amount of time.
- A data structure is a systematic way of organizing and accessing data.
- Analysis of standard algorithms aims to provide you with:
  - An understanding of algorithm constraints
  - An understanding of the trade-off between execution time and memory usage
  - A guide for your choice of algorithm for a particular purpose
  - A repertoire of standard algorithms
6. Measuring the Efficiency of Algorithms
- Consider two algorithms which carry out the same task.
  - What does it mean to compare the algorithms?
  - How can we conclude that one is better or more efficient than the other?
  - What are the factors that affect the efficiency of an algorithm?
7. Performance Factors
- Our primary interest is in the running time (time complexity) of algorithms and data structure operations.
- We are interested in determining how the running time depends on the size of the input.
- Our secondary interest is in space usage (space complexity), i.e. how much memory is required to solve the problem.
- We will concentrate on analysing time complexity and use it to compare the relative costs of two or more algorithms, e.g. which search or sort to use.
8. Time Complexity
- The time an algorithm takes to solve a problem clearly depends on two sets of parameters:
  - The number of computational steps.
  - The time taken to complete each of these steps.
- The number of computational steps can be calculated by examining the code.
- The time taken for each step depends on many factors:
  - The performance of the computer
  - The efficiency of the compiler, etc.
- We will therefore analyse the number of computational steps.
9. Problem Analysis: File Download
- Suppose when downloading a file over the internet there is an initial two-second delay (to set up the connection), after which the download proceeds at 1.6 KB/sec. If a file is N kilobytes, the time to download is described by the formula:
  - T(N) = N/1.6 + 2
- This is a linear function: downloading an 80 KB file takes approx. 52 seconds, while a file twice as large (160 KB) takes approx. 102 seconds (roughly twice as long).
- Time taken is proportional to the amount of input (N).
- This property, where time is directly proportional to the amount of input, is the signature of a linear algorithm.
- A linear algorithm is one of the most efficient algorithms.
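The formula above can be checked with a short sketch (the class and method names here are illustrative, not from the slides):

```java
// Sketch of the download-time formula T(N) = N/1.6 + 2.
public class DownloadTime {

    // Time in seconds to download a file of n kilobytes:
    // 2 seconds of connection setup plus n / 1.6 seconds of transfer.
    static double downloadTime(double n) {
        return n / 1.6 + 2;
    }

    public static void main(String[] args) {
        System.out.println(downloadTime(80));   // approx. 52 seconds
        System.out.println(downloadTime(160));  // approx. 102 seconds
    }
}
```

Doubling the file size roughly doubles the time, which is exactly the linear behaviour described above.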
10. Problem Analysis: Delivering Packages
- Delivering packages to 50 houses, each one mile apart.
- Solution 1: Collect all 50 packages from the shop. Drive to the house closest to the shop, deliver a package, then drive to the next closest house (1 mile away), and so on, finally driving back to the shop when all packages are delivered.
- Solution 2: Collect the first package from the shop, drive to the first house, deliver the package and drive back to the shop to collect the second package. Repeat this process for each package.
11. Problem Analysis: Delivering Packages
- In Solution 1, the distance the driver travels to deliver all 50 packages is 1 + 1 + ... + 1 (50 times) = 50 miles. Therefore the total distance travelled to deliver the packages and return to the shop is 50 + 50 = 100 miles.
- In Solution 2, the distance the driver travels to deliver all 50 packages is 2 x (1 + 2 + 3 + 4 + ... + 50) = 2550 miles.
- So suppose there are n packages to deliver:
  - Solution 1: 1 + 1 + ... + 1 (n times) + n = 2n
  - Solution 2: 2 x (1 + 2 + 3 + ... + n) = 2(n(n+1)/2) = n² + n
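The two strategies can be sketched and compared directly (class and method names are illustrative):

```java
// Sketch comparing the two delivery strategies for n packages.
public class Delivery {

    // Solution 1: carry all n packages, drive 1 mile between houses,
    // then n miles back to the shop: n + n = 2n miles.
    static long solution1(int n) {
        return 2L * n;
    }

    // Solution 2: one round trip per package:
    // 2 * (1 + 2 + ... + n) = n^2 + n miles.
    static long solution2(int n) {
        long total = 0;
        for (int k = 1; k <= n; k++) {
            total += 2L * k;   // round trip to house k and back
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(solution1(50)); // 100 miles
        System.out.println(solution2(50)); // 2550 miles
    }
}
```

For 50 houses the second strategy drives over 25 times as far, and the gap widens quadratically as n grows.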
12. Values of n
- In Solution 1, the distance travelled is a linear function of n.
- In Solution 2, for large values of n, the dominant term is n² and the term containing n is negligible.
- So when analysing an algorithm we usually count the number of operations performed (the number of steps taken), as this count does not depend on which computer or programming language is used to implement the algorithm.
13. A Simple Analysis
- Sum of an array of integers (a) of size N:

    int sum(int[] a, int N) {
        int s = 0;            // statement 1
        for (int i = 0;       // statement 2
             i < N;           // statement 3
             i++) {           // statement 7
            s = s + a[i];     // statements 4, 5, 6
        }
        return s;             // statement 8
    }

- Statements:
  - 1, 2, 8: executed once
  - 3, 4, 5, 6, 7: executed once per iteration of the for loop, i.e. N times
- Thus giving a total of 5n + 3
- The complexity function of the algorithm is f(n) = 5n + 3
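The count can be made concrete by instrumenting the function, using the same simplified counting as the analysis above (5 steps per iteration plus 3 executed once; the class name and the `steps` counter are illustrative):

```java
// Instrumented version of the array-sum function, counting the
// simplified statement executions from the analysis: f(n) = 5n + 3.
public class SumSteps {
    static long steps;

    static int sum(int[] a, int n) {
        steps = 0;
        int s = 0;             // executed once
        steps++;
        for (int i = 0; i < n; i++) {
            s = s + a[i];
            steps += 5;        // 5 simplified steps per iteration
        }
        steps += 2;            // loop initialisation and return, once each
        return s;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        System.out.println(sum(a, a.length)); // 10
        System.out.println(steps);            // 5*4 + 3 = 23
    }
}
```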
14. How 5n + 3 Grows
- Estimated running time for different values of n:
  - n = 10        => 53 steps
  - n = 100       => 503 steps
  - n = 1,000     => 5,003 steps
  - n = 1,000,000 => 5,000,003 steps
- As n grows, the number of steps grows in linear proportion to n for this sum function.
- This makes sense, since f(n) = 5n + 3 is a linear function of n.
15. Asymptotic Complexity
- Which term in the previous complexity function dominates?
- What about the 5 in 5n + 3?
- What about the 3?
- As n gets large, the 3 becomes insignificant.
- The 5 is inaccurate, as different operations require varying amounts of time.
- What is fundamental is that the time is linear in n.
- Asymptotic complexity: as n gets large, ignore all lower-order terms and concentrate on the highest-order term only, i.e.:
  - Drop lower-order terms such as the +3
  - Drop the constant coefficient of the highest-order term
16. Asymptotic Complexity (2)
- The 5n + 3 time bound is said to "grow asymptotically" like n.
- This gives us an approximation of the complexity of the algorithm (i.e. f(n) grows like n).
- It ignores lots of (machine-dependent) details and concentrates on the bigger picture. Why is this useful?
- As inputs get larger, any algorithm of a smaller order will be more efficient than an algorithm of a larger order.
- [Figure: time (steps) against input size (N) for 0.01n² and 10n]
17. Asymptotic Complexity (3)
- Note how, for small values of n, the value of 10n overwhelms the value of 0.01n².
- But as n increases, the difference between n² and n grows so quickly that it eventually more than compensates for the difference between the constants 10 and 0.01.
- Thus when n = 1000 the time taken by both terms is roughly equal; beyond that, 0.01n² overwhelms 10n so much that 10n becomes insignificant to the overall performance of the algorithm.
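The crossover point can be verified with a short sketch (class and method names are illustrative): setting 0.01n² = 10n gives n = 1000.

```java
// Sketch locating the crossover between 0.01*n^2 and 10*n.
public class Crossover {

    static double quadratic(double n) { return 0.01 * n * n; }
    static double linear(double n)    { return 10.0 * n; }

    public static void main(String[] args) {
        // Before the crossover the linear term is larger...
        System.out.println(quadratic(100) < linear(100));     // true
        // ...at n = 1000 the two terms are equal...
        System.out.println(quadratic(1000) == linear(1000));  // true
        // ...and beyond it the quadratic term overwhelms 10n.
        System.out.println(quadratic(10000) > linear(10000)); // true
    }
}
```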
18Another Analysis
int sum 0 for (int i0 i lt n i)
sum for (int j0 j lt 2n j) if
((j2)0) sum else sum--
- f(n)loop1 3n 2
- f(n)loop2 5n 1
- f(n) f(n)loop1 f(n)loop2 8n 3
19. A More Complex Analysis

    int sum = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            sum++;
        }
    }

- Analysis shows:
  - sum++, j < n and j++ are executed n x n times (n²)
  - i < n, i++ and j = 0 are executed n times
  - sum = 0 and i = 0 are executed once
- Therefore:
  - f(n) = 3n² + 3n + 2
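The same counting can be reproduced by instrumenting the nested loops, using the simplified counts from the analysis above (the class name and counter are illustrative):

```java
// Instrumented nested loops, accumulating the simplified statement
// counts from the analysis: f(n) = 3n^2 + 3n + 2.
public class NestedSteps {

    static long steps(int n) {
        long count = 2;               // sum = 0 and i = 0, once each
        for (int i = 0; i < n; i++) {
            count += 3;               // i < n, i++, j = 0: n times each
            for (int j = 0; j < n; j++) {
                count += 3;           // sum++, j < n, j++: n*n times each
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(steps(10)); // 3*100 + 3*10 + 2 = 332
    }
}
```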
20. An Asymptotic Notation: Big-O
- We use a convention called Big-O notation to represent different complexity classes.
- Big-O makes no attempt to provide exact running times for an algorithm, but estimates how fast the execution time grows as the size of the input grows.
- Big-O simplifies the analysis of complexity expressions.
- Big-O defines an upper bound on the asymptotic growth of a complexity class.
21. Big-O Notation - Simplification
- Only consider the term which grows fastest as N increases
  - e.g. if f(N) is N² + N, then the relationship is O(N²)
- Drop any constants before the largest term
  - e.g. if f(N) is 5N² + 4N, then the relationship is O(N²)
- The base of any logs can be ignored
  - e.g. if f(N) is log_b N, then the relationship is O(log N)
- What is the Big-O notation for the earlier code example?
  - f(N) is 3N² + 3N + 2, then the relationship is ________?
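The log-base rule follows from the change-of-base identity log_b N = ln N / ln b: two bases differ only by a constant factor, which Big-O discards. A small sketch (names are illustrative):

```java
// Sketch of why the base of a logarithm is ignored in Big-O:
// logs in different bases differ only by a constant factor.
public class LogBases {

    // Logarithm of n in base b, via change of base.
    static double logBase(double b, double n) {
        return Math.log(n) / Math.log(b);
    }

    public static void main(String[] args) {
        // The ratio log2(N) / log10(N) is the same constant for
        // every N, so O(log2 N) and O(log10 N) are the same class.
        for (double n : new double[]{10, 1000, 1e6}) {
            System.out.println(logBase(2, n) / logBase(10, n));
        }
    }
}
```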
22. Best, Worst, Average Cases
- Consider searching for an element in an array of elements:

    23 45 21 19 56 33 89 67 40 11

- Best case is when the element is first:
  - f_best(n) = 1 => O(1)
- Worst case is when the element is last:
  - f_worst(n) = n => O(N)
- Average is probably midway (but depends on the distribution):
  - f_average(n) = n/2 => O(N)
- Usually we analyse an algorithm's worst-case behaviour (because it's easier). The worst case is sometimes uncommon and can be ignored; at other times it is very common and cannot be ignored.
- In some cases, average-case behaviour is more useful (but it's usually much more difficult to evaluate).
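The cases above can be seen directly by counting comparisons in a sequential search over the example array (class and method names are illustrative):

```java
// Sequential search over the example array, counting comparisons
// to illustrate best and worst cases.
public class SequentialSearch {

    // Returns the number of comparisons needed to find key,
    // or -1 if key is absent.
    static int comparisons(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) {
                return i + 1;   // found after i+1 comparisons
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {23, 45, 21, 19, 56, 33, 89, 67, 40, 11};
        System.out.println(comparisons(a, 23)); // best case: 1 => O(1)
        System.out.println(comparisons(a, 11)); // worst case: n = 10 => O(N)
    }
}
```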
23. Classes of Complexity
- Many algorithms can be written in multiple ways. Usually the simplest, most obvious algorithm will also be the slowest; sleeker, cleverer, less intuitive algorithms developed over the years may be faster. Some common running times for algorithms are:

    Notation    Complexity    Example
    O(1)        constant      array access
    O(log N)    logarithmic   binary search
    O(N)        linear        sequential search
    O(N log N)  N log N       quicksort, heap sort, merge sort
    O(N²)       quadratic     selection sort, insertion sort
    O(N³)       cubic         multiplying two NxN matrices
    O(2^N)      exponential   travelling salesman problem (TSP)

- We will examine some of these algorithms in future lectures.
24. Graph of Big-O Times
- [Figure: number of steps against number of items (N) for O(N²), O(N), O(log N) and O(1) - Lafore p. 72]
25. Limitations of Big-Oh
- Big-Oh is very effective in establishing an algorithm's complexity class, but it has some limitations:
  - It is not appropriate for small amounts of input; here, just use the simplest algorithm.
  - Large constants (ignored by Big-Oh) may come into play when an algorithm is excessively complex.
    - e.g. in a complex algorithm, 2N log N is probably better than 1000N even though its growth rate is larger.
  - Large constants also come into play because our analysis disregards constants and cannot differentiate between memory access (cheap) and disk access (expensive).
  - The analysis assumes infinite memory, but with large data sets this can be a problem.
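The 2N log N versus 1000N example can be made concrete: 2N log₂N is smaller than 1000N whenever log₂N < 500, i.e. for every N below 2^500, which is far beyond any realistic input. A sketch (names are illustrative):

```java
// Sketch of how constants hidden by Big-Oh matter in practice:
// 2*N*log2(N) grows faster than 1000*N, yet stays smaller until
// log2(N) = 500, i.e. until N = 2^500.
public class HiddenConstants {

    static double nLogN(double n) {
        return 2 * n * (Math.log(n) / Math.log(2));
    }

    static double bigLinear(double n) {
        return 1000 * n;
    }

    public static void main(String[] args) {
        // For every practically reachable input size, the algorithm
        // with the "worse" growth rate still wins.
        for (double n : new double[]{1e3, 1e6, 1e9, 1e12}) {
            System.out.println(nLogN(n) < bigLinear(n)); // true each time
        }
    }
}
```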
26. Summary
- The efficiency of an algorithm is determined in terms of its growth rate.
- The growth rate of an algorithm tells us how quickly its running time grows as a function of the size of the problem.
- The computational complexity of an algorithm is represented using Big-O notation.
- Using Big-O notation, we can classify algorithms into different complexity classes.
- We note that it's easier to analyse an algorithm's worst case than its average case.
- We would like to avoid algorithms that are of exponential complexity.
27. Questions
- Q1. Analyse the following segments of code, estimate their complexity and provide an answer in Big-O notation.
28. Questions
- Q2. Calculate complexity orders for algorithms to:
  - a) Calculate the minimum element in an array (given an array of N items, find the smallest item):
    - Initialise min to the first element of the array
    - Scan the array and update min as appropriate
  - b) Compute the closest points in a plane (given N points in a plane, i.e. an x-y coordinate system, find the pair of points that are closest together):
    - Calculate the distance between each pair of points
    - Retain the minimum distance
- Q3. An algorithm takes 0.5 ms for input size 100. How long will it take for input size 500 if the running time is: a) linear, b) quadratic, c) cubic?