CS201: PART 1 - PowerPoint PPT Presentation

Slides: 65
Provided by: robert1788
1
CS201 PART 1
  • Data Structures & Algorithms
  • S. Kondakci

2
Analysis of Algorithms
[Figure: Input → Algorithm → Output]
  • An algorithm is a step-by-step procedure for
    solving a problem in a finite amount of time.
  • Theoretical Analysis of Algorithms
      • Uses a high-level description of the algorithm
        instead of an implementation
      • Characterizes running time as a function of the
        input size n.
      • Takes into account all possible inputs
      • Allows us to evaluate the speed of any design
        independent of its implementation.

3
Program Efficiency
Program efficiency is a measure of the amount
of resources required to produce desired results.
Efficiency Aspects
  1) What are the important resources we should try
     to optimize?
  2) Where are the important efficiency gains to be
     made?
  3) How important is efficiency in the first place?
4
Efficiency Today
  • User Efficiency. The amount of time and effort
    users will spend to learn how to use the program,
    how to prepare the data, how to configure and
    customize the program, and how to interpret and
    use the output.
  • Maintenance Efficiency. The amount of time and
    effort the maintenance group will spend reading a
    program and its technical documentation in order
    to understand it well enough to make any
    necessary modifications.
  • Algorithmic Complexity. The inherent efficiency
    of the method itself, regardless of which machine
    we run it on or how we code it.
  • Coding Efficiency. This is the traditional
    efficiency measure. Here we are concerned with
    how much processor time and memory space a
    computer program requires to produce desired
    results. Coding efficiency is the key step
    towards optimal usage of machine resources.

5
Programmer's Duty
  • Programmers should keep these goals in mind:
  • Correct, robust, and reliable.
  • Easy to use for its intended end-user group.
  • Easy to understand and easy to modify.
  • Portable.
  • Consistent in Input/Output behavior.
  • Supplied with user documentation.

6
Optimization
  • Optimization on CPU time: Consider a network
    security assessment tool as a real-time
    application. The application works like a
    security scanner protocol designed to audit,
    monitor, and correct all aspects of network
    security. Real-time processing of the intercepted
    network packets containing inspection information
    requires fast data processing. Besides, such a
    process should generate some auditing
    information.
  • Optimization on memory: Developing programs that
    do not fit easily into the memory space available
    on your system is often quite demanding.
    Kernel-level processing of the network packets
    requires kernel memory optimization and a
    powerful and failsafe memory management
    capability.
  • Providing run-time continuity: Extensive
    machine-level optimization is a major requirement
    for continuously running programs, such as the
    security scanner daemons.
  • Reliability and correctness: One of the
    indispensable efficiency requirements is
    absolute reliability. The second important
    efficiency factor is correctness. That is, your
    program should do exactly what it is supposed to
    do. Choosing and implementing a reliable
    inspection methodology should be done with
    precision.
  • Optimization on programmer's time: How efficiently
    a programmer works depends on the choice of team
    policy and development tool selection.

7
Coding Efficiency: Unstructured Code
/Efficient Programming/S. Kondakci-1999
8
Coding Efficiency: Structured Code
9
Protecting Against Run-time Errors
  • Illegal pointer operations.
  • Array subscript out of bounds.
  • Endless loops may cause the stack to grow into the
    heap area.
  • Representation errors, such as network byte
    order, number conversions, division by zero, and
    undefined results, e.g., tan(90°) is undefined.
  • Trying to write over the kernel's text area, or
    the data area.
  • Referencing objects declared as prototype but not
    defined.
  • Performing operations on a pointer pointing at
    NULL.
  • Operating system weaknesses.

10
Assertions
A common pitfall is making assumptions that turn
out not to be justified. Most mistakes arise from
simply misunderstanding the interaction between
various pieces of code. The assertion rule states
that you should always state boldly and explicitly
the things that you have not yet covered clearly
enough. Any assumptions you make in writing your
programs should be documented somewhere in the
code itself, particularly if you know or expect
the assumption to be false in other environments.
11
Does the Machine Understand Your Assumptions?
Remember, those assumptions are yours. They should
be presented to the machine by whatever means you
can provide in your code; the machine cannot check
assumptions it has not been told about. This is
simply a matter of including explicit checks in
your code, even for things that "cannot happen".

    if (p == NULL)
        panic("Driver routine: p is NULL\n");
    if (p->p_flags & BUSY) {
        /* Safe to continue */
        <etcetera>
    }

    ASSERT(p != NULL);
    if (p->p_flags & BUSY) {
        /* Safe to continue */
        <etcetera>
    }
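The two styles above can be sketched in standard C. This is only an illustration: `struct proc`, its `p_flags` field, the `BUSY` bit, and both function names are hypothetical stand-ins, since the slides do not define them. The first function reports the violated assumption to its caller; the second uses `assert` from `<assert.h>`, which aborts the program when the assumption is false (and is compiled out when `NDEBUG` is defined).

```c
#include <assert.h>
#include <stddef.h>

#define BUSY 0x01            /* hypothetical flag bit, for illustration only */

struct proc {                /* hypothetical stand-in for the driver's data */
    int p_flags;
};

/* Explicit check: report the violated assumption to the caller.
 * Returns -1 if p is NULL, otherwise 1 if BUSY is set and 0 if not. */
int is_busy_checked(const struct proc *p)
{
    if (p == NULL)
        return -1;
    return (p->p_flags & BUSY) ? 1 : 0;
}

/* Assertion: state the assumption in the code itself.
 * The program aborts here if p is NULL (unless built with NDEBUG). */
int is_busy_asserted(const struct proc *p)
{
    assert(p != NULL);
    return (p->p_flags & BUSY) ? 1 : 0;
}
```

The explicit check suits conditions that can occur in production; the assertion suits conditions that should be impossible if the rest of the code is correct.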
12
Guidelines for the implementation
  • Protect input parameters using call-by-value.
  • Avoid global variables and functions with side
    effects.
  • Make all temporary variables local to functions
    where they are used.
  • Never halt or sleep in a function. Spawn a
    dedicated function if necessary.
  • Avoid producing output within a function unless
    the sole purpose of the function is output.
  • Where appropriate use return values to return the
    status of function calls.
  • Avoid confusing programming tricks.
  • Always strive for simplicity and clarity. Never
    sacrifice clarity of expression for cleverness of
    expression.
  • Keep any assertions local to your code.
  • Never sacrifice clarity of expression for minor
    reductions in execution time.

13
Debugging and Tracing
Making use of the preprocessor can allow you to
incorporate many debugging aids in your module,
for instance, the driver module. Later, in the
production version these debugging aids can be
removed.
    #ifdef DEBUG
    #define TRACE_OPEN  (debugging & 0x01)
    #define TRACE_CLOSE (debugging & 0x02)
    #define TRACE_READ  (debugging & 0x04)
    #define TRACE_WRITE (debugging & 0x08)
    int debugging = -1;  /* enable all trace output */
    #else
    #define TRACE_OPEN  0
    #define TRACE_CLOSE 0
    #define TRACE_READ  0
    #define TRACE_WRITE 0
    #endif
    ...
14
Tracing Later in the Program
Later in the code, the trace information can be
output in a manner similar to this:

    if (TRACE_READ)
        printf("Device driver read, Packet number (%d)\n", pack_no);
    <etcetera>
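Put together, the trace macros and the trace test might look like the following self-contained sketch. It makes several assumptions that are not in the slides: `DEBUG` is defined in the file itself rather than on the compiler command line, only the read trace bit is enabled, and `driver_read` and `driver_write` are hypothetical routines that return 1 when they print a trace line.

```c
#include <stdio.h>

#define DEBUG 1                       /* assumption: normally passed as -DDEBUG */

#ifdef DEBUG
#define TRACE_READ  (debugging & 0x04)
#define TRACE_WRITE (debugging & 0x08)
static int debugging = 0x04;          /* enable only the read trace */
#else
#define TRACE_READ  0                 /* production build: tests compile to nothing */
#define TRACE_WRITE 0
#endif

/* Hypothetical driver routine: returns 1 if a trace line was printed. */
int driver_read(int pack_no)
{
    if (TRACE_READ) {
        printf("Device driver read, Packet number (%d)\n", pack_no);
        return 1;
    }
    return 0;
}

/* Hypothetical driver routine: the write trace is disabled above. */
int driver_write(int pack_no)
{
    if (TRACE_WRITE) {
        printf("Device driver write, Packet number (%d)\n", pack_no);
        return 1;
    }
    return 0;
}
```

Because the production branch defines each macro as the constant 0, a compiler can discard the trace statements entirely, so the debugging aids cost nothing once `DEBUG` is removed.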
15
Checking Programs With lint (Unix)
The lint utility is intended to verify some
facets of a C program, such as its potential
portability. The name lint derives from the idea
of picking the fluff out of a C program. It does
this by advising on C constructs (including
functions) and usage that might turn out to be
bugs, portability problems, inconsistent
declarations, bad function and argument types, or
dead code. See the manual section lint(1) for
further explanations.
16
Now, Linting
    lint -hxa mytest.c

    (8) warning: loop not entered at top
    (8) warning: constant in conditional context

    variable unused in function
        (3) z in main

    implicitly declared to return int
        (10) printf

    declaration unused in block
        (5) duble

    function returns value, which is always ignored
        printf
17
Test Coverage Analysis
Another tool for execution tracing and analysis
of programs is tcov, which can be used to trace
and analyze source code and report test coverage.
tcov does this by analysing the source code
step-by-step. The extra code is generated by
giving the -xa option to the compiler command,
i.e.,

    cc -xa -o src src.c

The -xa option invokes a runtime recording
mechanism that creates a .d file for every .c
file. The .d file accumulates execution data for
the corresponding source file. The tcov utility
can then be run on the source file to generate
statistics about the program. The following
example source file, getmygid.c, is analysed as

    cc -xa -o getmygid getmygid.c
    tcov -a getmygid.c
    ls -l getmy???
    -rwxr-xr-x 1 staff 25120 Feb 11 12:07 getmygid
    -rw------- 1 staff   519 Sep  9  1994 getmygid.c
    -rw-r--r-- 1 staff     9 Feb 11 12:07 getmygid.d
    -rw-r--r-- 1 staff  1025 Feb 11 12:08 getmygid.tcov
18
Example getmygid.c
    cat getmygid.c

    #include <stdio.h>
    char *msg = "I am sorry I cannot tell you everything";
    int gid, egid;
    int uid, euid, pid, ppid, i;
    int main()
    {
        gid = getgid();
        if (gid >= 0) printf("1- My GID is %d\n", gid);
        egid = getegid();
        if (egid >= 0) printf("2- My EGID is %d\n", egid);
        uid = getuid();
        if (uid >= 0) printf("3- My uid is %d\n", uid);
        euid = geteuid();
        if (euid >= 0) printf("4- My Euid is %d\n", euid);
        pid = getpid();
        if (pid >= 0) printf("5- My pid is %d\n", pid);
        ppid = getppid();
        if (ppid >= 0) printf("6- My ppid is %d\n", ppid);
        prt_msg("We came to end!!!");
        return 0;
        prt_msg(msg);
    }
    prt_msg(char *mesg)
    {
        printf("%s \n", mesg);
    }
19
Tcoving getmygid.c
    cat getmygid.tcov

           -> #include <stdio.h>
           -> char *msg = "I am sorry I cannot tell you everything";
           ->
           -> int gid, egid;
           -> int uid, euid, pid, ppid, i;
           -> int main()
           -> {
         2 ->     gid = getgid();
         2 ->     if (gid >= 0) printf("1- My GID is %d\n", gid);
         2 ->     egid = getegid();
         2 ->     if (egid >= 0) printf("2- My EGID is %d\n", egid);
         2 ->     uid = getuid();
         2 ->     if (uid >= 0) printf("3- My uid is %d\n", uid);
         2 ->     euid = geteuid();
         2 ->     if (euid >= 0) printf("4- My Euid is %d\n", euid);
         2 ->     pid = getpid();
         2 ->     if (pid >= 0) printf("5- My pid is %d\n", pid);
         2 ->     ppid = getppid();
         2 ->     if (ppid >= 0) printf("6- My ppid is %d\n", ppid);
         2 ->     prt_msg("We came to end!!!");
         2 ->     return 0;
         2 ->     prt_msg(msg);
         2 -> }
         2 -> prt_msg(mesg)
         2 -> char *mesg;
         2 -> {
         2 ->     printf("%s \n", mesg);
         2 -> }
20
Tcoving getmygid.c
As shown, tcov(1) generates an annotated listing
of the source file (getmygid.tcov), where each
line is prefixed with a number indicating how many
times the statements on that line were executed.
Finally, per-line and per-block statistics are
shown.

    Top 10 Blocks

    Line   Count
       9       2
      11       2
      13       2
      15       2
      17       2
      19       2
      21       2
      29       2

         8  Basic blocks in this file
         8  Basic blocks executed
    100.00  Percent of the file executed
        16  Total basic block executions
      2.00  Average executions per basic block
21
Have a nice break!
22
Analysis of Algorithms
[Figure: Input → Algorithm → Output]
An algorithm is a step-by-step procedure
for solving a problem in a finite amount of time.
23
Running Time
  • Most algorithms transform input objects into
    output objects.
  • The running time of an algorithm typically grows
    with the input size.
  • Average case time is often difficult to
    determine.
  • We focus on the worst-case running time.
      • Easier to analyze
      • Crucial to applications such as games, finance
        and robotics

24
Experimental Studies
  • Write a program implementing the algorithm
  • Run the program with inputs of varying size and
    composition
  • Use a function, like the built-in clock()
    function, to get an accurate measure of the
    actual running time
  • Plot the results
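The measurement step can be sketched with the standard C `clock()` function from `<time.h>`. The workload `work` below is a hypothetical placeholder for the algorithm under study, and `time_work` is an illustrative helper name; neither comes from the slides. Note that `clock()` reports processor time at a limited resolution, so very short runs may measure as zero.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: sums the integers 0..n-1 in a loop. */
unsigned long work(size_t n)
{
    volatile unsigned long sum = 0;   /* volatile keeps the loop from being optimized away */
    for (size_t i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Returns the processor time, in seconds, used by one call of work(n). */
double time_work(size_t n)
{
    clock_t start = clock();
    work(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_work` for a series of sizes (for example 10^5, 10^6, 10^7) and printing each size with its time gives the data points to plot.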

25
Limitations of Experiments
  • It is necessary to implement the algorithm, which
    may be difficult
  • Results may not be indicative of the running time
    on other inputs not included in the experiment.
  • In order to compare two algorithms, the same
    hardware and software environments must be used

26
Theoretical Analysis
  • Uses a high-level description of the algorithm
    instead of an implementation
  • Characterizes running time as a function of the
    input size, n.
  • Takes into account all possible inputs
  • Allows us to evaluate the speed of an algorithm
    independent of the hardware/software environment

27
Pseudocode
  • High-level description of an algorithm
  • More structured than English prose
  • Less detailed than a program
  • Preferred notation for describing algorithms
  • Hides program design issues


28
Pseudocode Details
  • Control flow
      • if ... then ... [else ...]
      • while ... do ...
      • repeat ... until ...
      • for ... do ...
      • Indentation replaces braces
  • Method declaration
      • Algorithm method (arg [, arg...])
      • Input ...
      • Output ...
  • Method/Function call
      • method (arg [, arg...])
  • Return value
      • return expression
  • Expressions
      • ← Assignment (like = in C)
      • = Equality testing (like == in C)
      • n^2 Superscripts and other mathematical
        formatting allowed

29
The Random Access Machine (RAM) Model
  • A CPU
  • A potentially unbounded bank of memory cells,
    each of which can hold an arbitrary number or
    character
  • Memory cells are numbered and accessing any cell
    in memory takes unit time.

30
Primitive Operations
  • Basic computations performed by an algorithm
  • Identifiable in pseudocode
  • Largely independent from the programming language
  • Exact definition not important
  • Assumed to take a constant amount of time in the
    RAM model
  • Examples
  • Evaluating an expression
  • Assigning a value to a variable
  • Indexing into an array
  • Calling a method
  • Returning from a method

31
Counting Primitive Operations
  • By inspecting the pseudocode, we can determine
    the maximum number of primitive operations
    executed by an algorithm, as a function of the
    input size
  • Algorithm arrayMax(A, n)                  # operations
        currentMax ← A[0]                     2
        for i ← 1 to n - 1 do                 2 + n
            if A[i] > currentMax then         2(n - 1)
                currentMax ← A[i]             2(n - 1)
            { increment counter i }           2(n - 1)
        return currentMax                     1
                                       Total  7n - 1
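For comparison with the pseudocode, a direct C rendering of arrayMax might look like this. It is a sketch that assumes the array holds at least one element; the name `array_max` is chosen here for illustration.

```c
#include <stddef.h>

/* Returns the largest of the first n elements of A; assumes n >= 1. */
int array_max(const int A[], size_t n)
{
    int current_max = A[0];
    for (size_t i = 1; i < n; i++)     /* the loop body runs n - 1 times */
        if (A[i] > current_max)
            current_max = A[i];
    return current_max;
}
```

The worst case for the operation count is an array sorted in increasing order, because the assignment inside the `if` then executes on every iteration.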

32
Estimating Running Time
  • Algorithm arrayMax executes 7n - 1 primitive
    operations in the worst case. Define
      • a = Time taken by the fastest primitive
        operation
      • b = Time taken by the slowest primitive
        operation
  • Let T(n) be the worst-case time of arrayMax. Then
    a(7n - 1) ≤ T(n) ≤ b(7n - 1)
  • Hence, the running time T(n) is bounded by two
    linear functions

33
Growth Rate of Running Time
  • Changing the hardware/software environment
      • affects T(n) by a constant factor, but
      • does not alter the growth rate of T(n)
  • The linear growth rate of the running time T(n)
    is an intrinsic property of algorithm arrayMax

34
Growth Rates
  • Growth rates of functions
      • Linear ≈ n
      • Quadratic ≈ n^2
      • Cubic ≈ n^3
  • In a log-log chart, the slope of the line
    corresponds to the growth rate of the function

35
Constant Factors
  • The growth rate is not affected by
      • constant factors or
      • lower-order terms
  • Examples
      • 10^2 n + 10^5 is a linear function
      • 10^5 n^2 + 10^8 n is a quadratic function

36
Big-Oh Notation
  • Given functions f(n) and g(n), we say that f(n)
    is O(g(n)) if there are positive constants c and
    n0 such that
      • f(n) ≤ c g(n) for n ≥ n0
  • Example: 2n + 10 is O(n)
      • 2n + 10 ≤ cn
      • (c - 2) n ≥ 10
      • n ≥ 10/(c - 2)
      • Pick c = 3 and n0 = 10
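The choice of constants can be sanity-checked numerically. The sketch below tests the inequality 2n + 10 ≤ cn over a finite range only, so it is a check rather than a proof; the function name `bound_holds` and the upper limit are assumptions made for illustration.

```c
/* Returns 1 if 2n + 10 <= c*n for every n in [n0, limit], else 0. */
int bound_holds(long c, long n0, long limit)
{
    for (long n = n0; n <= limit; n++)
        if (2 * n + 10 > c * n)
            return 0;                 /* found a counterexample */
    return 1;
}
```

With c = 3 the check passes from n0 = 10 onward and fails if the range starts at n = 9, which matches the derivation n ≥ 10/(c - 2). With c = 2 it fails for every n, since 2n + 10 > 2n always.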

37
Big-Oh Example
  • Example: the function n^2 is not O(n)
      • n^2 ≤ cn
      • n ≤ c
      • The above inequality cannot be satisfied since c
        must be a constant

38
More Big-Oh Examples
  • 7n - 2
      • 7n - 2 is O(n)
      • need c > 0 and n0 ≥ 1 such that
        7n - 2 ≤ cn for n ≥ n0
      • this is true for c = 7 and n0 = 1
  • 3n^3 + 20n^2 + 5
      • 3n^3 + 20n^2 + 5 is O(n^3)
      • need c > 0 and n0 ≥ 1 such that
        3n^3 + 20n^2 + 5 ≤ cn^3 for n ≥ n0
      • this is true for c = 4 and n0 = 21
  • 3 log n + log log n
      • 3 log n + log log n is O(log n)
      • need c > 0 and n0 ≥ 1 such that
        3 log n + log log n ≤ c log n for n ≥ n0
      • this is true for c = 4 and n0 = 2
39
Big-Oh and Growth Rate
  • The big-Oh notation gives an upper bound on the
    growth rate of a function
  • The statement f(n) is O(g(n)) means that the
    growth rate of f(n) is no more than the growth
    rate of g(n)
  • We can use the big-Oh notation to rank functions
    according to their growth rate

40
Big-Oh Rules
  • If f(n) is a polynomial of degree d, then f(n) is
    O(n^d), i.e.,
      • Drop lower-order terms
      • Drop constant factors
  • Use the smallest possible class of functions
      • Say "2n is O(n)" instead of "2n is O(n^2)"
  • Use the simplest expression of the class
      • Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"

41
Asymptotic Algorithm Analysis
  • The asymptotic analysis of an algorithm
    determines the running time in big-Oh notation
  • To perform the asymptotic analysis
  • We find the worst-case number of primitive
    operations executed as a function of the input
    size
  • We express this function with big-Oh notation
  • Example
      • We determine that algorithm arrayMax executes at
        most 7n - 1 primitive operations
      • We say that algorithm arrayMax "runs in O(n)
        time"
  • Since constant factors and lower-order terms are
    eventually dropped anyhow, we can disregard them
    when counting primitive operations

42
Computing Prefix Averages
  • We further illustrate asymptotic analysis with
    two algorithms for prefix averages
  • The i-th prefix average of an array X is the
    average of the first (i + 1) elements of X
  • A[i] = (X[0] + X[1] + ... + X[i])/(i + 1)

43
Prefix Averages (Quadratic)
  • The following algorithm computes prefix averages
    in quadratic time by applying the definition

Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X    # operations
    A ← new array of n integers              n
    for i ← 0 to n - 1 do                    n
        s ← X[0]                             n
        for j ← 1 to i do                    1 + 2 + ... + (n - 1)
            s ← s + X[j]                     1 + 2 + ... + (n - 1)
        A[i] ← s / (i + 1)                   n
    return A                                 1
44
Arithmetic Progression
  • The running time of prefixAverages1 is
    O(1 + 2 + ... + n)
  • The sum of the first n integers is n(n + 1) / 2
  • There is a simple visual proof of this fact
  • Thus, algorithm prefixAverages1 runs in O(n^2)
    time

45
Prefix Averages (Linear)
  • The following algorithm computes prefix averages
    in linear time by keeping a running sum

Algorithm prefixAverages2(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X    # operations
    A ← new array of n integers              n
    s ← 0                                    1
    for i ← 0 to n - 1 do                    n
        s ← s + X[i]                         n
        A[i] ← s / (i + 1)                   n
    return A                                 1
  • Algorithm prefixAverages2 runs in O(n) time
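Both prefix-average algorithms can be written in C roughly as follows. This is a sketch under one stated assumption: the output array holds `double` values so that averages are not truncated, whereas the pseudocode speaks of an array of integers. The caller supplies the output array; the function names are chosen here for illustration.

```c
#include <stddef.h>

/* Quadratic version: recomputes each prefix sum from scratch. */
void prefix_averages1(const int X[], double A[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double s = X[0];
        for (size_t j = 1; j <= i; j++)   /* i additions for prefix i */
            s += X[j];
        A[i] = s / (double)(i + 1);
    }
}

/* Linear version: keeps a running sum across iterations. */
void prefix_averages2(const int X[], double A[], size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += X[i];                        /* one addition per element */
        A[i] = s / (double)(i + 1);
    }
}
```

The two functions produce the same output; only the amount of repeated work differs, which is exactly the gap between O(n^2) and O(n).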

46
Computing Spans
  • We show how to use a stack as an auxiliary data
    structure in an algorithm
  • Given an array X, the span S[i] of X[i] is the
    maximum number of consecutive elements X[j]
    immediately preceding X[i] and such that
    X[j] ≤ X[i]
  • Spans have applications to financial analysis
      • E.g., stock at 52-week high

[Figure: example array X and its spans S]
47
Quadratic Algorithm
Algorithm spans1(X, n)
  Input: array X of n integers
  Output: array S of spans of X              # operations
    S ← new array of n integers              n
    for i ← 0 to n - 1 do                    n
        s ← 1                                n
        while s ≤ i ∧ X[i - s] ≤ X[i]        1 + 2 + ... + (n - 1)
            s ← s + 1                        1 + 2 + ... + (n - 1)
        S[i] ← s                             n
    return S                                 1
  • Algorithm spans1 runs in O(n^2) time
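The quadratic spans algorithm translates almost line for line into C. The sketch below follows the pseudocode, in which each span starts at 1 and so counts X[i] itself; the caller supplies the output array S, and the name `spans1` is kept from the slides.

```c
#include <stddef.h>

/* Quadratic spans: for each i, scan backwards while elements are <= X[i]. */
void spans1(const int X[], size_t S[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t s = 1;
        /* s <= i keeps the index i - s from running off the front of X */
        while (s <= i && X[i - s] <= X[i])
            s++;
        S[i] = s;
    }
}
```

For example, X = {6, 3, 4, 5, 2} gives S = {1, 1, 2, 3, 1}. The worst case is an increasing array, where the inner loop scans all the way back to index 0 for every i.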

48
Have a nice break!
49
Recursion
Recursion: a function calls itself an unknown
number of times. We call this a recursive call.

    for (i = 1; i <= n-1; i++)
        sum = sum + 1;

    int sum(int n)
    {
        if (n <= 1)
            return 1;
        else
            return (n + sum(n-1));
    }
50
Recursive function
    int f( int x )
    {
        if( x == 0 )
            return 0;
        else
            return 2 * f( x - 1 ) + x * x;
    }
51
Recursion
Calculate the factorial (n!) of a positive
integer: n! = n(n-1)(n-2)...(n-n+1), 0! = 1! = 1

    int factorial(int n)
    {
        if (n <= 1)
            return 1;
        else
            return (n * factorial(n-1));
    }
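A slightly more defensive C version is sketched below, together with an iterative equivalent for comparison. It assumes an unsigned 64-bit result type, which holds n! only up to n = 20; the function names are chosen here for illustration.

```c
/* Recursive factorial; assumes n <= 20 so the result fits in 64 bits. */
unsigned long long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;                         /* base case: 0! = 1! = 1 */
    return n * factorial(n - 1);
}

/* Iterative equivalent: same result, no call stack growth. */
unsigned long long factorial_iter(unsigned int n)
{
    unsigned long long result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Both versions perform about n multiplications; the recursive one additionally uses stack space proportional to n.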
52
Fibonacci numbers: a bad algorithm for n > 40!

    long fib(int n)
    {
        if (n <= 1)
            return 1;
        else
            return fib(n-1) + fib(n-2);
    }
53
Algorithm IterativeLinearSum(A,n)
Algorithm IterativeLinearSum(A, n)
  Input: An integer array A and an integer n (size)
  Output: The sum of the first n integers in A
    if n = 1 then
        return A[0]
    else
        sum ← 0
        while n ≠ 0 do
            sum ← sum + A[n - 1]
            n ← n - 1
        return sum
54
Algorithm LinearSum(A,n)
Algorithm LinearSum(A, n)
  Input: An integer array A and an integer n (size)
  Output: The sum of the first n integers in A
    if n = 1 then
        return A[0]
    else
        return LinearSum(A, n - 1) + A[n - 1]
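The recursive and iterative sums can be compared side by side in C. This sketch assumes n ≥ 1 for the recursive version, uses `long` for the result, and has the iterative version start from a sum of 0 and read A[n - 1] so that it stays within the array; the function names are illustrative.

```c
#include <stddef.h>

/* Recursive: sum of the first n elements of A; assumes n >= 1. */
long linear_sum(const int A[], size_t n)
{
    if (n == 1)
        return A[0];
    return linear_sum(A, n - 1) + A[n - 1];
}

/* Iterative: walks from the last of the n elements down to the first. */
long iterative_linear_sum(const int A[], size_t n)
{
    long sum = 0;
    while (n != 0) {
        sum += A[n - 1];
        n--;
    }
    return sum;
}
```

Both make n - 1 or n additions, so both run in O(n) time; the recursive version also uses O(n) stack space.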
55
Iterative Approach: Algorithm IterativeReverseArray(A,i,n)
Algorithm IterativeReverseArray(A, i, n)
  Input: An integer array A and integers i and n
  Output: The reversal of n integers in A starting
          at index i
    while n > 1 do
        swap A[i] and A[i + n - 1]
        i ← i + 1
        n ← n - 2
    return
56
Algorithm ReverseArray(A,i,n)
Algorithm ReverseArray(A, i, n)
  Input: An integer array A and integers i and n
  Output: The reversal of n integers in A starting
          at index i
    if n > 1 then
        swap A[i] and A[i + n - 1]
        call ReverseArray(A, i + 1, n - 2)
    return
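In C, the recursive and iterative reversals might be sketched as follows. The helper `swap_ints` and the function names are introduced here for illustration; both functions reverse n elements of A in place, starting at index i.

```c
#include <stddef.h>

/* Illustrative helper: exchanges two integers. */
static void swap_ints(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Recursive: swap the two ends, then reverse the n - 2 elements between them. */
void reverse_array(int A[], size_t i, size_t n)
{
    if (n > 1) {
        swap_ints(&A[i], &A[i + n - 1]);
        reverse_array(A, i + 1, n - 2);
    }
}

/* Iterative: the same steps expressed as a loop. */
void iterative_reverse_array(int A[], size_t i, size_t n)
{
    while (n > 1) {
        swap_ints(&A[i], &A[i + n - 1]);
        i++;
        n -= 2;
    }
}
```

Because the recursive call is the last action in the function, the loop form is a mechanical rewrite of it and avoids the stack growth.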
57
Higher-Order Recursion
Making more than one recursive call at a time.

Algorithm BinarySum(A, i, n)
  Input: An integer array A and integers i and n
  Output: The sum of n integers in A starting at
          index i
    if n = 1 then
        return A[i]
    return BinarySum(A, i, ⌈n/2⌉) + BinarySum(A, i + ⌈n/2⌉, ⌊n/2⌋)
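A C sketch of binary summation is shown below. It assumes n ≥ 1 and splits the range into a first part of ceil(n/2) elements and a second part of floor(n/2) elements, so that sizes that are not powers of two are also handled; the name `binary_sum` is illustrative.

```c
#include <stddef.h>

/* Sum of n elements of A starting at index i; assumes n >= 1. */
long binary_sum(const int A[], size_t i, size_t n)
{
    if (n == 1)
        return A[i];
    size_t half = (n + 1) / 2;            /* ceil(n/2) elements in the first part */
    return binary_sum(A, i, half)
         + binary_sum(A, i + half, n - half);
}
```

The total work is still O(n) additions, but the recursion depth is only about log2(n), compared with depth n for linear recursion.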
58
Kth Fibonacci Numbers
59
kth Fibonacci Numbers
Linear recursion
Algorithm LinearFibonacci(k)
  Input: An integer k
  Output: A pair (F_k, F_(k-1)) such that F_k is the
          kth Fibonacci number and F_(k-1) is the
          (k-1)st Fibonacci number
    if k ≤ 1 then
        return (k, 0)
    else
        (i, j) ← LinearFibonacci(k - 1)
        return (i + j, i)
60
kth Fibonacci Numbers
Binary recursion
Algorithm BinaryFib(k)
  Input: An integer k
  Output: The kth Fibonacci number
    if k ≤ 1 then
        return k
    else
        return BinaryFib(k - 1) + BinaryFib(k - 2)
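The difference between the linear and the binary recursion is easy to see in C. In this sketch the pair returned by the linear version is modelled with a small struct, `fib_pair`, which is a name invented here; both functions use the convention F_0 = 0, F_1 = 1 from the pseudocode.

```c
/* Pair (F_k, F_(k-1)); a hypothetical helper type for this sketch. */
struct fib_pair {
    unsigned long long cur;    /* F_k */
    unsigned long long prev;   /* F_(k-1), or 0 when k <= 1 */
};

/* Linear recursion: one recursive call per level, k calls in total. */
struct fib_pair linear_fibonacci(unsigned int k)
{
    if (k <= 1) {
        struct fib_pair base = { k, 0 };
        return base;
    }
    struct fib_pair p = linear_fibonacci(k - 1);
    struct fib_pair next = { p.cur + p.prev, p.cur };
    return next;
}

/* Binary recursion: two recursive calls per level, exponentially many calls. */
unsigned long long binary_fib(unsigned int k)
{
    if (k <= 1)
        return k;
    return binary_fib(k - 1) + binary_fib(k - 2);
}
```

The linear version computes F_50 with about 50 calls, while the binary version recomputes the same subproblems over and over, which is why it becomes impractical for k much beyond 40.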
61
Math you need to Review
  • Summations
  • Logarithms and Exponents
  • Proof techniques
  • Basic probability
  • Properties of logarithms
      • log_b(xy) = log_b x + log_b y
      • log_b(x/y) = log_b x - log_b y
      • log_b(x^a) = a log_b x
      • log_b a = log_x a / log_x b
  • Properties of exponentials
      • a^(b+c) = a^b a^c
      • a^(bc) = (a^b)^c
      • a^b / a^c = a^(b-c)
      • b = a^(log_a b)
      • b^c = a^(c log_a b)

62
Relatives of Big-Oh
  • big-Omega
      • f(n) is Ω(g(n)) if there is a constant c > 0
        and an integer constant n0 ≥ 1 such that
        f(n) ≥ c g(n) for n ≥ n0
  • big-Theta
      • f(n) is Θ(g(n)) if there are constants c' > 0 and
        c'' > 0 and an integer constant n0 ≥ 1 such that
        c' g(n) ≤ f(n) ≤ c'' g(n) for n ≥ n0
  • little-oh
      • f(n) is o(g(n)) if, for any constant c > 0, there
        is an integer constant n0 ≥ 0 such that
        f(n) ≤ c g(n) for n ≥ n0
  • little-omega
      • f(n) is ω(g(n)) if, for any constant c > 0, there
        is an integer constant n0 ≥ 0 such that
        f(n) ≥ c g(n) for n ≥ n0

63
Intuition for Asymptotic Notation
  • Big-Oh
  • f(n) is O(g(n)) if f(n) is asymptotically less
    than or equal to g(n)
  • big-Omega
  • f(n) is Ω(g(n)) if f(n) is asymptotically greater
    than or equal to g(n)
  • big-Theta
  • f(n) is Θ(g(n)) if f(n) is asymptotically equal
    to g(n)
  • little-oh
  • f(n) is o(g(n)) if f(n) is asymptotically
    strictly less than g(n)
  • little-omega
  • f(n) is ω(g(n)) if f(n) is asymptotically strictly
    greater than g(n)

64
Example Uses of the Relatives of Big-Oh
  • 5n^2 is Ω(n^2)
      • f(n) is Ω(g(n)) if there is a constant c > 0 and
        an integer constant n0 ≥ 1 such that
        f(n) ≥ c g(n) for n ≥ n0
      • let c = 5 and n0 = 1
  • 5n^2 is Ω(n)
      • f(n) is Ω(g(n)) if there is a constant c > 0 and
        an integer constant n0 ≥ 1 such that
        f(n) ≥ c g(n) for n ≥ n0
      • let c = 1 and n0 = 1
  • 5n^2 is ω(n)
      • f(n) is ω(g(n)) if, for any constant c > 0, there
        is an integer constant n0 ≥ 0 such that
        f(n) ≥ c g(n) for n ≥ n0
      • need 5 n0^2 ≥ c n0; given c, the n0 that
        satisfies this is n0 ≥ c/5 ≥ 0