Title: Integrating Advanced Algorithms into Undergraduate Computer Science Curriculum
1. Integrating Advanced Algorithms into Undergraduate Computer Science Curriculum
- Yana Kortsarts
- Widener University
- Computer Science Department
2. Plan
- Randomized Algorithms
- Teaching the Power of Randomization Using a Simple Game
- Additional Examples of Advanced Algorithms in the Undergraduate Computer Science Curriculum
3. Algorithm
- An algorithm is a sequence of instructions for solving a problem.
- A deterministic algorithm runs in the same way on the same input every time; its behavior is predictable.
- A randomized algorithm is an algorithm that makes random choices during execution.
4. Deterministic Algorithm
THE SAME INPUT
THE SAME BEHAVIOR
OUTPUT
5. Randomized Algorithm
THE SAME INPUT
DIFFERENT BEHAVIOR
OUTPUT
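The contrast between the two preceding slides can be made concrete with a minimal Python sketch. It is not part of the original slides: the quicksort variant and the names `quicksort`, `first_pivot`, and `random_pivot` are illustrative assumptions. With a fixed pivot rule the same input always costs the same number of comparisons; with a random pivot the output is the same but the cost changes from run to run.

```python
import random

def quicksort(items, choose_pivot):
    """Sort items; return (sorted_list, number of comparisons against pivots)."""
    if len(items) <= 1:
        return list(items), 0
    k = choose_pivot(len(items))          # index of the pivot element
    pivot = items[k]
    rest = items[:k] + items[k + 1:]
    smaller = [v for v in rest if v < pivot]
    larger = [v for v in rest if v >= pivot]
    left, c_left = quicksort(smaller, choose_pivot)
    right, c_right = quicksort(larger, choose_pivot)
    # Every element of rest is compared with the pivot once.
    return left + [pivot] + right, len(rest) + c_left + c_right

def first_pivot(n):
    """Deterministic rule: always take the first element."""
    return 0

def random_pivot(n):
    """Randomized rule: take a uniformly random element."""
    return random.randrange(n)

data = list(range(50))                    # already sorted: bad for first_pivot
det_sorted, det_cost = quicksort(data, first_pivot)
rnd_sorted, rnd_cost = quicksort(data, random_pivot)
print(det_cost)   # identical on every run
print(rnd_cost)   # changes from run to run
```

The sorted input is the kind of input an adversary would pick against the fixed pivot rule; no single input is bad for every random choice of pivots.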
6. Why Should We Teach Randomized Algorithms?
- Randomization is a general tool that applies in various computer science areas, not just a subject by itself.
- Significance: many of the breakthroughs in various algorithmic areas have used randomization.
- Example: Prime Number Test
- Simple polynomial-time, one-sided error Monte Carlo algorithm: Rabin Algorithm (1980)
- A deterministic polynomial-time algorithm was given by Agrawal, Kayal, and Saxena (2002).
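As an illustration of the one-sided error Monte Carlo test mentioned above, here is a hedged sketch of the Miller-Rabin primality test, assumed here to be the algorithm the slide refers to as the Rabin Algorithm. The function name `is_probably_prime` and the default of 20 rounds are illustrative choices, not taken from the slides. A prime is never rejected; a composite is wrongly accepted with probability at most 4^(-rounds).

```python
import random

def is_probably_prime(n, rounds=20):
    """Miller-Rabin sketch: False means certainly composite,
    True means prime with high probability (one-sided error)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):        # quick trial division
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)    # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                      # this base is not a witness
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                  # a proves that n is composite
    return True

print(is_probably_prime(97))    # a prime is always accepted
print(is_probably_prime(561))   # composite (3 * 11 * 17)
```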
7. Advantages of Randomized Algorithms
- Performance: for many problems, randomized algorithms run faster than the best known deterministic algorithm.
- Simplicity: many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance.
8. Challenges and Solutions
- The concept of a randomized algorithm can be difficult to understand.
- Usually, there is no separate course on randomized algorithms in the undergraduate CS curriculum.
- The idea of a randomized algorithm is clearer for students when presented as a game.
- The topic could be integrated into introductory courses.
9. Algorithm as Part of the Game
Design of an Algorithm for a Combinatorial Problem
GAME
Algorithm Player
- Designs the algorithm
- Goal: minimize the running time of the algorithm
Input Player
- Selects a test input for the selected algorithm
- Goal: maximize the running time of the algorithm
10. Deterministic Algorithms
Algorithm Player
- Deterministic Strategy (Deterministic Algorithm)
- Reveals an entire strategy (algorithm) first
Input Player
- Best Strategy: finding the worst input for the algorithm produced by the Algorithm Player
- Input Player can pick the worst example for the suggested algorithm
11. Deterministic Algorithms
- The problem facing the algorithm player is that if it uses a deterministic strategy then, since in a sense it "moves first", the second (input) player can indeed pick the worst example for the suggested algorithm.
12. Randomized Algorithms
Algorithm Player
- Randomized Strategy (Randomized Algorithm)
- A randomized algorithm can be seen as a distribution over all possible deterministic algorithms
- Doesn't reveal its cards fully in advance
- Tells the second player the probability with which it selects any one of the possible deterministic algorithms
- The coins have not fallen yet, and the game only begins after the input player chooses its adversarial input.
Input Player
- A bad input for a randomized algorithm has to be an input which is bad for several algorithms simultaneously
13. Game Description
- Player 1: decides on an integer $x > 0$
- Player 2: has to find a number $y_n$ so that $y_n \ge x$
- Rules
- Player 2 guesses $y_1 < y_2 < \cdots < y_n$
- $y_1 < y_2 < \cdots < y_{n-1} < x$ and $y_n \ge x$
- On a guess $y_j$, Player 1 either says
- "smaller than x, please provide a next guess", or
- "larger than or equal to x", and reveals x, stopping the game
14. Optimization Criteria
- Let the guesses be $y_1, y_2, \ldots, y_n$, so that
- $y_n \ge x$
- $y_j < x$ for all $j \le n - 1$
- The optimization criterion is the performance ratio
- $\dfrac{y_1 + y_2 + \cdots + y_n}{x}$
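15. EXAMPLE
Player 1: x is chosen, please provide a guess
Player 2: 1
Player 1: Smaller than x, next guess
Player 2: 3
Player 1: Smaller than x, next guess
Player 2: 10
Player 1: Smaller than x, next guess
Player 2: 28
Player 1: Smaller than x, next guess
Player 2: 76
Player 1: STOP! x = 37
Performance Ratio: (1 + 3 + 10 + 28 + 76) / 37 = 118 / 37 = 3.189189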
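16. EXAMPLE
Player 1: x is chosen, please provide a guess
Player 2: 1
Player 1: Smaller than x, next guess
Player 2: 2
Player 1: Smaller than x, next guess
Player 2: 3
Player 1: Smaller than x, next guess
Player 2: 4
Player 1: Smaller than x, next guess
Player 2: 23
Player 1: STOP! x = 5
Performance Ratio: (1 + 2 + 3 + 4 + 23) / 5 = 33 / 5 = 6.6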
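The rules of the game and the performance ratio used in the two examples above can be checked with a short Python sketch. The function name `performance_ratio` is an illustrative assumption; the ratio is taken to be the sum of all guesses divided by x, which matches the numbers on the example slides.

```python
def performance_ratio(guesses, x):
    """Validate one play of the guessing game and return sum(guesses) / x."""
    if x <= 0 or not guesses:
        raise ValueError("x must be positive and at least one guess is needed")
    if any(a >= b for a, b in zip(guesses, guesses[1:])):
        raise ValueError("guesses must be strictly increasing")
    if any(g >= x for g in guesses[:-1]) or guesses[-1] < x:
        raise ValueError("only the last guess may reach or exceed x")
    return sum(guesses) / x

print(performance_ratio([1, 3, 10, 28, 76], 37))  # 118 / 37
print(performance_ratio([1, 2, 3, 4, 23], 5))     # 33 / 5
```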
17. Teaching the Game
- Discussion of the selection of the optimization criterion
- Why not choose $y_i / x$, where $y_i \ge x$?
- Answer: then the simple strategy 1, 2, 3, ... is optimal
- Discussion of possible strategies
- Why not start with some large number?
- Why do we not benefit from increasing the next guess only a little compared to the previous guess?
- What is the disadvantage of making the next guess, say, 100 times larger than the previous guess?
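The questions about possible strategies can be explored experimentally. The following Python sketch is an assumption-laden illustration rather than part of the original slides: the helper names `linear`, `geometric`, `ratio`, and `worst_ratio` are hypothetical, and the search range of x up to 1000 is arbitrary. It plays each strategy against every x in the range and reports the worst performance ratio found.

```python
from itertools import count

def linear():
    """Guesses 1, 2, 3, ...: each guess only slightly larger than the last."""
    return count(1)

def geometric(base):
    """Return a strategy that guesses 1, base, base**2, ..."""
    def strategy():
        guess = 1
        while True:
            yield guess
            guess *= base
    return strategy

def ratio(strategy, x):
    """Play the strategy against x and return sum of guesses divided by x."""
    total = 0
    for guess in strategy():
        total += guess
        if guess >= x:
            return total / x

def worst_ratio(strategy, max_x):
    """Worst performance ratio over all x in 1..max_x (adversarial input)."""
    return max(ratio(strategy, x) for x in range(1, max_x + 1))

print(worst_ratio(linear, 1000))          # grows with x: many small guesses add up
print(worst_ratio(geometric(2), 1000))    # stays below 4
print(worst_ratio(geometric(100), 1000))  # one huge overshoot dominates
```

Growing too slowly makes the sum of earlier guesses large, while growing too fast makes the final overshoot large; doubling balances the two.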
18. The Powers of 2 Strategy for the Second Player
- It turns out that the simple strategy that selects powers of 2, $y_0 = 1, y_1 = 2, y_2 = 4, \ldots, y_i = 2^i$, is an optimal deterministic strategy for this game
- The worst case for the strategy is when the number selected by the first player is $x = 2^j + 1$
- In this case the game is played until the second player suggests $2^{j+1}$
19. EXAMPLE
- x = 13
- guesses: 1, 2, 4, 8, 16
- sum: 1 + 2 + 4 + 8 + 16 = 31
- performance ratio is 31 / 13 = 2.384615
- x = 65
- guesses: 1, 2, 4, 8, 16, 32, 64, 128
- sum: 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255
- performance ratio is 255 / 65 = 3.923
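20. The Powers of 2 Strategy Analysis
- The strategy $y_0 = 1, y_1 = 2, y_2 = 4, \ldots, y_i = 2^i$ gives the following performance ratio when the game stops at the guess $2^i$: $\dfrac{1 + 2 + \cdots + 2^i}{x} = \dfrac{2^{i+1} - 1}{x}$
- Worst case: $x = 2^j + 1$, the performance ratio is
- $\dfrac{2^{j+2} - 1}{2^j + 1}$, which is below 4 and approaches 4 as $j$ grows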
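The ratios for the powers of 2 strategy follow from the geometric sum. The short derivation below is a reconstruction from the surrounding slides (the stopping index $i$ and the worst case $x = 2^j + 1$), offered as a sketch rather than the authors' own presentation.

```latex
\[
\sum_{k=0}^{i} 2^k = 2^{i+1} - 1
\quad\Longrightarrow\quad
\text{ratio}(x) = \frac{2^{i+1} - 1}{x},
\qquad 2^{i-1} < x \le 2^{i}.
\]
\[
x = 2^{j} + 1 \;\Rightarrow\; i = j + 1,
\qquad
\text{ratio} = \frac{2^{j+2} - 1}{2^{j} + 1}
= 4 - \frac{5}{2^{j} + 1}
\;\longrightarrow\; 4 \quad (j \to \infty).
\]
```

For $j = 6$ this gives $4 - 5/65 = 255/65 \approx 3.923$, the value obtained for $x = 65$.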
21. Teaching the Game
- Encourage students to find the worst case for the powers of 2 strategy by themselves.
- This example serves well in illustrating the strict notion of the worst-case input.
- The bad instance for the powers of 2 strategy is a very specific and rare number (for example, x = 17: 1, 2, 4, 8, 16, 17, 32).
22. Teaching the Game
- If x is some random number, the powers of 2 strategy performs much better.
- A good place to discuss the difference between a random strategy and random inputs.
- The input is sometimes not within our control, while the randomized algorithm is within our control as the designers of the algorithms.
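The first point can be checked numerically. The sketch below is an illustration under stated assumptions: "random x" is modeled as x uniform over 1..2^16, and the name `powers_of_two_ratio` is hypothetical. It compares the average ratio of the powers of 2 strategy over all such x with its worst ratio in the same range.

```python
def powers_of_two_ratio(x):
    """Performance ratio of the guesses 1, 2, 4, ... against the number x."""
    guess, total = 1, 1
    while guess < x:
        guess *= 2
        total += guess
    return total / x

N = 2 ** 16
ratios = [powers_of_two_ratio(x) for x in range(1, N + 1)]
average = sum(ratios) / N   # x drawn uniformly from 1..N (random input)
worst = max(ratios)         # x chosen by an adversary (worst-case input)
print(average, worst)
```

The average stays well below the worst case, but that improvement depends on the input being random, which the algorithm designer cannot guarantee.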
23. A Deterministic Worst Case Lower Bound
- Let $\epsilon > 0$ be an arbitrarily small constant.
- We show that any deterministic strategy has examples with performance ratio at least $4 - \epsilon$
- Hence the powers of 2 strategy is an optimal deterministic strategy.
24. A Randomized Strategy
- The following simple randomized strategy gives an improved expected value
- Let $\alpha \in_R [0, 1)$ be chosen uniformly at random from the interval $[0, 1)$
- Define $y_j = \lfloor \exp(j + \alpha) \rfloor$
- Let $i$ be such that $\lfloor \exp(i - 1 + \alpha) \rfloor < x \le \lfloor \exp(i + \alpha) \rfloor$
- The expected performance ratio is approximately $e \approx 2.718$
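A back-of-the-envelope derivation of the expected performance ratio, offered as a sketch under assumptions and not as the full analysis from the cited paper: the random offset is written $\alpha$, the guesses are read as $y_j = \lfloor \exp(j + \alpha) \rfloor$ (consistent with the worked examples), and the rounding is ignored.

```latex
% Continuous relaxation: drop the floors, so y_j = e^{j + \alpha}.
\[
\sum_{j=0}^{i} e^{j+\alpha}
\;<\; e^{i+\alpha} \sum_{k=0}^{\infty} e^{-k}
\;=\; \frac{e}{e-1}\, e^{i+\alpha}.
\]
% i is the first index with e^{i+\alpha} \ge x, so i + \alpha - \ln x lies in [0,1)
% and is uniformly distributed there when \alpha is uniform on [0,1).
\[
\mathbb{E}\!\left[\frac{e^{i+\alpha}}{x}\right]
= \int_{0}^{1} e^{u}\, du = e - 1
\quad\Longrightarrow\quad
\mathbb{E}[\text{ratio}] \;<\; \frac{e}{e-1}\,(e-1) \;=\; e \approx 2.718 .
\]
```

Under these assumptions the expected ratio is about 2.718 for every x, compared with a worst case approaching 4 for the best deterministic strategy.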
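25. EXAMPLE
Player 1: x is chosen, please provide a guess
Player 2: $\alpha = 0.419$, $\lfloor \exp(\alpha) \rfloor = 1$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 4$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 11$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 3) \rfloor = 30$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 4) \rfloor = 83$
Player 1: STOP! x = 48
Performance Ratio: 129 / 48 = 2.6875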
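26. EXAMPLE
Player 1: x is chosen, please provide a guess
Player 2: $\alpha = 0.866$, $\lfloor \exp(\alpha) \rfloor = 2$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 6$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 17$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 3) \rfloor = 47$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 4) \rfloor = 129$
Player 1: STOP! x = 63
Performance Ratio: 201 / 63 = 3.190476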
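27. EXAMPLE
Player 1: x is chosen, please provide a guess
Player 2: $\alpha = 0.195$, $\lfloor \exp(\alpha) \rfloor = 1$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 1) \rfloor = 3$
Player 1: Smaller than x, next guess
Player 2: $\lfloor \exp(\alpha + 2) \rfloor = 8$
Player 1: STOP! x = 6
Performance Ratio: 12 / 6 = 2.0000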
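The three examples above can be reproduced with a small Python sketch of the randomized strategy. Assumptions are labeled: the random offset is called `alpha`, guesses are taken to be the floor of exp(j + alpha) (the rounding that matches the numbers on the slides), and the names `randomized_game` and `average_ratio` are illustrative. Averaging over evenly spaced values of `alpha` stands in for the expectation over a uniformly random offset.

```python
import math

def randomized_game(x, alpha):
    """Play guesses floor(exp(j + alpha)), j = 0, 1, ..., against x.
    Returns (list of guesses, performance ratio)."""
    guesses = []
    j = 0
    while True:
        guess = math.floor(math.exp(j + alpha))
        guesses.append(guess)
        if guess >= x:
            return guesses, sum(guesses) / x
        j += 1

def average_ratio(x, samples=1000):
    """Average ratio over evenly spaced alpha in [0, 1); an approximation
    of the expected ratio for alpha drawn uniformly at random."""
    total = sum(randomized_game(x, k / samples)[1] for k in range(samples))
    return total / samples

print(randomized_game(48, 0.419))   # guesses 1, 4, 11, 30, 83
print(randomized_game(63, 0.866))   # guesses 2, 6, 17, 47, 129
print(randomized_game(6, 0.195))    # guesses 1, 3, 8
print(average_ratio(10 ** 6))       # close to e for large x
```

In a classroom setting `alpha` would be drawn with `random.random()` after the first player has fixed x; the fixed values here only serve to replay the slides.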
28. A Deterministic Worst Case Lower Bound: Intuitive Explanation
- The second player cannot choose $y_{i+1}$ to be too large compared to $y_i$
- If $x = y_i + 1 \ll y_{i+1}$, then $(y_{i+1} + y_i)/x$ is already a large number
- Suppose instead that the second player always selects $y_{i+1}$ to be not much larger than $y_i$
- If the second player makes such choices many times, then for some $y_j$: $y_1 + y_2 + \cdots + y_{j-1} \gg y_j$
- The choice $x = y_j$ is bad for the second player, since $(y_1 + y_2 + \cdots + y_j)/y_j$ is large
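The trade-off described above can be made quantitative for one restricted family of strategies. The following worked calculation is an illustration only: it assumes geometric guesses $y_i = r^i$ with a real growth factor $r > 1$ and ignores integrality, so it is not the general lower-bound proof.

```latex
\[
y_i = r^{i}, \quad x \text{ just above } r^{j}
\;\Rightarrow\;
\frac{1 + r + \cdots + r^{j+1}}{x}
\;\approx\; \frac{r^{j+2}}{(r-1)\, r^{j}}
\;=\; \frac{r^{2}}{r-1}.
\]
\[
\frac{d}{dr}\, \frac{r^{2}}{r-1} = \frac{r\,(r-2)}{(r-1)^{2}} = 0
\;\Rightarrow\; r = 2,
\qquad
\left. \frac{r^{2}}{r-1} \right|_{r=2} = 4 .
\]
```

A factor close to 1 makes $r - 1$ small, so the accumulated earlier guesses dominate; a large factor makes the overshoot $r^2$ dominate. Within this family the best balance is $r = 2$, with worst-case ratio 4.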
29. Summary
- A simple game illustrating the power of randomization.
- The full analysis of the game is presented in "Teaching the Power of Randomization Using Simple Game", SIGCSE 2006 (Y. Kortsarts, J. Rufinus)
- Teaching the game
- Introduction to Computer Science I and II
- Design and Analysis of Algorithms
- Undergraduate Research Projects
30. Summary
- The game is well-motivated from the point of view of modern scheduling research
- Even though this specific game seems not to have been studied before, the techniques illustrated here have been used in a series of papers on approximating scheduling problems [11, 7, 8, 4]. These papers study the fast scheduling of conflicting jobs with the goal of minimizing the sum of finish times of these jobs. Hence, the suggested game is at the heart of modern research.
31. Advanced Algorithms in Introductory CS Curriculum
- Las Vegas: always gives the correct solution.
- Monte Carlo: may sometimes produce an incorrect solution
- "How (and why) to Introduce Monte Carlo Randomized Algorithms Into a Basic Algorithms Course?", Y. Kortsarts, J. Rufinus, Journal of Computing Sciences in Colleges, December 2005
- "Integrating a real-world scheduling problem into the basic algorithms course", Yana Kortsarts, Journal of Computing Sciences in Colleges, June 2007
32. Advanced Algorithms in Introductory CS Curriculum
- Merkle-Hellman Knapsack Cryptosystem [31]
- Elegant and beautiful underlying mathematics
- Due to its simple structure, the knapsack cryptosystem is an ideal model for introducing algorithmic techniques and the concept of a public key cryptosystem to computer science students
- Sequence Alignment [32, 33]
- Needleman and Wunsch Algorithm (Global Alignment)
- Smith-Waterman Algorithm (Local Alignment)
33. Knapsack Cryptosystem in Computer Science Curriculum
Cryptology
Design and Analysis of Algorithms
Introduction to Computer Science
Concept of Public Key Cryptosystem
- Knapsack Problem
- Subset-Sum Problem
- Algorithmic Techniques
- Concept of Public Key Cryptosystem
- Computational Problems
- Prime Numbers
- GCD, Euclidean Algorithm
- Modular Exponentiation
- Primitive Roots for Primes
Undergraduate Student Research Projects
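To show how the knapsack cryptosystem ties together the subset-sum problem, the Euclidean algorithm (modular inverse), and the public key concept, here is a toy Python sketch of the basic Merkle-Hellman scheme. The function names and the small parameters `W`, `Q`, and `R` are illustrative assumptions chosen for readability. The scheme is suitable for teaching only, since it is known to be breakable.

```python
from math import gcd

def public_key(w, q, r):
    """Derive the public key from the private key (w, q, r)."""
    if any(w[i] <= sum(w[:i]) for i in range(len(w))):
        raise ValueError("w must be superincreasing")
    if q <= sum(w) or gcd(r, q) != 1:
        raise ValueError("need q > sum(w) and gcd(r, q) == 1")
    return [(r * wi) % q for wi in w]

def encrypt(bits, pub):
    """Ciphertext is the subset sum of public weights selected by the bits."""
    return sum(b * k for b, k in zip(bits, pub))

def decrypt(c, w, q, r):
    """Undo the multiplier, then solve the easy superincreasing subset sum."""
    target = (c * pow(r, -1, q)) % q      # modular inverse (Python 3.8+)
    bits = []
    for wi in reversed(w):                # greedy, largest weight first
        if wi <= target:
            bits.append(1)
            target -= wi
        else:
            bits.append(0)
    return bits[::-1]

# Toy private key (hypothetical values for illustration).
W = [2, 7, 11, 21, 42, 89, 180, 354]      # superincreasing sequence
Q, R = 881, 588                           # modulus and multiplier
PUB = public_key(W, Q, R)

message = [0, 1, 1, 0, 0, 0, 0, 1]
cipher = encrypt(message, PUB)
print(cipher, decrypt(cipher, W, Q, R))
```

The public weights look like a hard general subset-sum instance, while the private superincreasing sequence makes the same instance solvable greedily.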
34. Sequence Alignment
- Global Alignment: compare two sequences in their entirety; the gap penalty is assessed regardless of whether gaps are located internally within a sequence or at the end of one or both sequences.
- The Needleman and Wunsch Algorithm.
- Local Alignment: find the best matching subsequences within the two search sequences.
- The Smith-Waterman Algorithm.
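The difference between global and local alignment shows up directly in the dynamic programming recurrences. The sketch below computes alignment scores only (no traceback) and makes simplifying assumptions that are not from the slides: a linear gap penalty and the scoring values match = 1, mismatch = -1, gap = -1. The function names are illustrative.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style score: both sequences aligned end to end."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap             # leading gaps are penalized
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman style score: best matching pair of substrings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, so an alignment can restart anywhere.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best

s, t = "XXGATTACAYY", "ZZGATTACAWW"
print(global_alignment_score(s, t))   # mismatched ends lower the score
print(local_alignment_score(s, t))    # only the common core is scored
```

The only changes from the global to the local version are the zero floor in each cell, the zero-initialized borders, and taking the maximum over the whole table instead of the bottom-right cell.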
35. REFERENCES
- [1] S. Arora, C. Lund, R. Motwani, M. Sudan and M. Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM, 45(3):501-555, 1998.
- [2] G. J. Brebner and L. G. Valiant. Universal schemes for parallel communication. Proceedings of the thirteenth annual ACM symposium on Theory of computing, pages 263-277, 1981.
- [3] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. The MIT Press, 2nd edition, 2001.
- [4] S. Chakrabarti, C. A. Phillips, A. S. Schulz, D. B. Shmoys, C. Stein and J. Wein. Improved scheduling algorithms for minsum criteria. ICALP '96, 875-886.
- [5] A. Fiat, R. M. Karp, M. Luby, L. A. McGeoch, D. D. Sleator, and N. E. Young. Competitive paging algorithms. Journal of Algorithms, 12(4):685-699, 1991.
- [6] O. Goldreich, S. Micali, and A. Wigderson. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM, 38(3):690-728, 1991.
- [7] L. A. Hall, D. B. Shmoys, and J. Wein. Scheduling to minimize average completion time: Off-line and on-line algorithms. SODA '96, 142-151, Jan 1996.
- [8] L. A. Hall, A. Schulz, D. B. Shmoys, and J. Wein. Scheduling to minimize average completion time: Off-line and on-line approximation algorithms. Math. Operations Research, 22:513-544, 1997.
- [9] G. Kalai. A subexponential randomized simplex algorithm. Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, 475-482, 1992.
- [10] R. M. Karp, E. Upfal and A. Wigderson. Constructing a perfect matching is in random NC. Combinatorica, 6(1):35-48, 1986.
- [11] M. Queyranne, M. Sviridenko. Approximation algorithms for shop scheduling problems with minsum objective. J. Scheduling, 5:287-305, 2002.
- [12] R. L. Rivest, A. Shamir, L. M. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Commun. ACM, 21(2):120-126, 1978.
- [13] N. Alon, R. Yuster and U. Zwick. Color-coding. Journal of the ACM, 42(4):844-856.
- [14] A. Bjorklund, T. Husfeldt and S. Khanna. Approximating Longest Directed Path. Symposium on Automata, Languages and Programming (ICALP) 2004, to appear.
- [15] D. Dor, U. Zwick. Selecting the Median. SIAM J. Comput., 28(5):1722-1758, 1999.
- [16] D. Dor and U. Zwick. Median Selection Requires (2+epsilon)n Comparisons. SIAM Journal on Discrete Mathematics, 14(3):312-325.
- [17] R. W. Floyd and R. L. Rivest. Expected time bounds for selection. Communications of the ACM, 18(3):165-172, 1975.
- [18] T. Feder, R. Motwani, C. Subi. Finding long paths and cycles in sparse Hamiltonian graphs. Proceedings of the ACM symposium on Theory of computing, pages 524-529, 1999.
- [19] H. Gabow. Finding paths and cycles of superpolylogarithmic size. Proceedings of the ACM symposium on Theory of computing, pages 407-416, 2004.
- [20] M. T. Goodrich and R. Tamassia. Using randomization in the teaching of data structures and algorithms. Proceedings of the thirtieth SIGCSE technical symposium on Computer science education, 53-57, 1999.
- [21] D. Karger, R. Motwani, and G. D. S. Ramkumar. On Approximating the Longest Path in a Graph. Algorithmica, 18:82-98, 1997.
- [22] R. M. Karp. Reducibility among combinatorial problems. In R. E. Miller and J. W. Thatcher, eds., Complexity of Computer Computations, Plenum Press, New York, 1972, pp. 85-103.
- [23] M. O. Rabin. Probabilistic algorithm for testing primality. J. Number Theory, 12:128-138, 1980.
- [24] N. Robertson and P. Seymour. Graph minors. II. Algorithmic aspects of tree-width. J. Algorithms, 7, 1986.
- [25] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
- [26] R. M. Karp. An introduction to randomized algorithms. Discrete Applied Mathematics, 34:165-201, 1991.
- [27] D. R. Karger. Global min-cuts in RNC, and other ramifications of a simple min-cut algorithm. In Proceedings of the 4th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 21-30, 1993.
- [28] M. J. Quinn. Parallel Programming in C with MPI and OpenMP. McGraw-Hill, 2004.
- [29] Y. Kortsarts, J. Rufinus. Teaching the Power of Randomization Using Simple Game. SIGCSE 2006.
- [30] Y. Kortsarts, J. Rufinus. How (and why) to Introduce Monte Carlo Randomized Algorithms Into a Basic Algorithms Course? Journal of Computing Sciences in Colleges, 2005.
- [31] R. C. Merkle, M. E. Hellman. Hiding Information and Signatures in Trapdoor Knapsacks. IEEE Transactions on Information Theory, vol. IT-24, 1978, pp. 525-530.
- [32] N. C. Jones and P. A. Pevzner. An Introduction to Bioinformatics Algorithms. The MIT Press, 2004.
- [33] D. E. Krane and M. L. Raymer. Fundamental Concepts of Bioinformatics. Benjamin Cummings, 2002.