Title: Case Studies
1. Case Studies
Experiencing Cluster Computing
2. Description
- Download the source from
- http://www.sci.hkbu.edu.hk/tdgc/tutorial/ExpClusterComp/casestudy/casestudy.zip
- Unzip the package
- Follow the instructions from each example
3. Hello World
4. Hello World
- The sample program uses MPI and has each MPI
process print
- Hello world from process i of n
- using the rank in MPI_COMM_WORLD for i and the
size of MPI_COMM_WORLD for n. You can assume that
all processes support output for this example.
- Note the order in which the output appears. Depending on your MPI implementation, characters from different lines may be intermixed. A subsequent exercise (I/O master/slaves) will show how to order the output.
- You may want to use these MPI routines in your solution
- MPI_Init, MPI_Comm_size, MPI_Comm_rank,
MPI_Finalize
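The exercise can be sketched as follows. This is a minimal illustration of the routines listed above; the helloworld.c shipped in the package may differ in detail.

```c
/* Minimal sketch of the Hello World exercise.
   Compile: mpicc -o helloworld helloworld.c
   Run:     mpirun -np 4 helloworld */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start up MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* i: this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* n: number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut down MPI */
    return 0;
}
```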
5. Hello World
- Source
- casestudy/helloworld/helloworld.c
- casestudy/helloworld/Makefile
- Compile and run
- mpicc -o helloworld helloworld.c
- mpirun -np 4 helloworld
- Sample output
- Hello world from process 0 of 4
- Hello world from process 3 of 4
- Hello world from process 1 of 4
- Hello world from process 2 of 4
6. Sending in a Ring
7. Sending in a Ring
- The sample program takes data from process zero and sends it to all of the other processes by sending it in a ring. That is, process i receives the data and sends it to process i+1, until the last process is reached.
- Assume that the data consists of a single integer. Process zero reads the data from the user.
- You may want to use these MPI routines in your
solution
- MPI_Send, MPI_Recv
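One possible ring solution can be sketched like this (illustrative only; the packaged ring.c may differ; run with at least 2 processes):

```c
/* Sketch of the ring exercise: process 0 reads an integer and passes it
   around the ring; a negative value terminates the loop. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    do {
        if (rank == 0) {
            scanf("%d", &value);        /* process 0 reads from the user */
            MPI_Send(&value, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        } else {
            /* receive from the left neighbour ... */
            MPI_Recv(&value, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     &status);
            /* ... and forward to the right neighbour, unless last */
            if (rank < size - 1)
                MPI_Send(&value, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        }
        printf("Process %d got %d\n", rank, value);
    } while (value >= 0);
    MPI_Finalize();
    return 0;
}
```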
8. Sending in a Ring
9. Sending in a Ring
- Source
- casestudy/ring/ring.c
- casestudy/ring/Makefile
- Compile and run
- mpicc -o ring ring.c
- mpirun -np 4 ring
- Sample Output
- 10
- Process 0 got 10
- 22
- Process 0 got 22
- -1
- Process 0 got -1
- Process 3 got 10
- Process 3 got 22
- Process 3 got -1
- Process 2 got 10
- Process 2 got 22
- Process 2 got -1
10. Finding PI using MPI collective operations
11. Finding PI using MPI collective operations
- The method evaluates PI using the integral of 4/(1+x^2) between 0 and 1. The integral is approximated by a sum of n intervals.
- The approximation to the integral in each interval is (1/n) * 4/(1+x^2).
- The master process asks the user for the number of intervals.
- The master then broadcasts this number to all of the other processes.
- Each process then adds up every size'th interval (x = rank/n, rank/n + size/n, ...).
- Finally, the sums computed by each process are added together using a reduction.
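The steps above can be sketched as follows (illustrative; the packaged pi.c may differ, e.g. in the reference constant used to compute the error):

```c
/* Sketch of the pi exercise: midpoint-rule integration of 4/(1+x^2)
   over [0,1], distributed round-robin over the processes. */
#include <math.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    const double PI25DT = 3.141592653589793238462643;
    int rank, size, n, i;
    double h, sum, x, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    for (;;) {
        if (rank == 0) {
            printf("Enter the number of intervals (0 quits) ");
            scanf("%d", &n);
        }
        /* the master broadcasts n to all processes */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (n == 0)
            break;
        h = 1.0 / (double)n;
        sum = 0.0;
        /* each process adds up every size'th interval (midpoint rule) */
        for (i = rank + 1; i <= n; i += size) {
            x = h * ((double)i - 0.5);
            sum += 4.0 / (1.0 + x * x);
        }
        mypi = h * sum;
        /* combine the partial sums with a reduction on the master */
        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("pi is approximately %.16f, Error is %.16f\n",
                   pi, fabs(pi - PI25DT));
    }
    MPI_Finalize();
    return 0;
}
```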
12. Finding PI using MPI collective operations
- Source
- casestudy/pi/pi.c
- casestudy/pi/Makefile
- Sample Output
- Enter the number of intervals (0 quits) 100
- pi is approximately 3.1416009869231249, Error is
0.0000083333333318
- Enter the number of intervals (0 quits) 1000
- pi is approximately 3.1415927369231262, Error is
0.0000000833333331
- Enter the number of intervals (0 quits) 10000
- pi is approximately 3.1415926544231256, Error is
0.0000000008333325
- Enter the number of intervals (0 quits) 100000
- pi is approximately 3.1415926535981269, Error is
0.0000000000083338
- Enter the number of intervals (0 quits) 1000000
- pi is approximately 3.1415926535898708, Error is
0.0000000000000777
- Enter the number of intervals (0 quits) 10000000
- pi is approximately 3.1415926535897922, Error is
0.0000000000000009
13. Implementing Fairness using Waitsome
14. Implementing Fairness using Waitsome
- Write a program to provide fair reception of messages from all sending processes. Arrange the program to have all processes except process 0 send 100 messages to process 0. Have process 0 print out the messages as it receives them. Use nonblocking receives and MPI_Waitsome.
- Is the MPI implementation fair?
- You may want to use these MPI routines in your
solution
- MPI_Waitsome, MPI_Irecv, MPI_Cancel
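One possible shape for the solution, assuming each sender uses the message tag as a counter as in the sample output on the next slide (the packaged fairness.c may differ):

```c
/* Sketch of the fairness exercise. Process 0 keeps one nonblocking
   receive outstanding per sender, draining them with MPI_Waitsome and
   reposting each as it completes. Run with at least 2 processes. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define MSGS_PER_SENDER 100

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (rank == 0) {
        int nsenders = size - 1;
        int remaining = nsenders * MSGS_PER_SENDER;
        MPI_Request *req = malloc(nsenders * sizeof(MPI_Request));
        MPI_Status  *sts = malloc(nsenders * sizeof(MPI_Status));
        int *idx = malloc(nsenders * sizeof(int));
        int *buf = malloc(nsenders * sizeof(int));
        int i, ndone;

        /* post one nonblocking receive per sender */
        for (i = 0; i < nsenders; i++)
            MPI_Irecv(&buf[i], 1, MPI_INT, i + 1, MPI_ANY_TAG,
                      MPI_COMM_WORLD, &req[i]);
        while (remaining > 0) {
            MPI_Waitsome(nsenders, req, &ndone, idx, sts);
            for (i = 0; i < ndone; i++) {
                printf("Msg from %d with tag %d\n",
                       sts[i].MPI_SOURCE, sts[i].MPI_TAG);
                remaining--;
                /* repost the receive for the sender that completed */
                MPI_Irecv(&buf[idx[i]], 1, MPI_INT, idx[i] + 1,
                          MPI_ANY_TAG, MPI_COMM_WORLD, &req[idx[i]]);
            }
        }
        /* cancel the receives left over after the final messages */
        for (i = 0; i < nsenders; i++) {
            MPI_Cancel(&req[i]);
            MPI_Wait(&req[i], MPI_STATUS_IGNORE);
        }
    } else {
        int tag, payload = rank;
        for (tag = 0; tag < MSGS_PER_SENDER; tag++)
            MPI_Send(&payload, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}
```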
15. Implementing Fairness using Waitsome
- Source
- casestudy/fairness/fairness.c
- casestudy/fairness/Makefile
- Sample Output
- Msg from 1 with tag 0
- Msg from 1 with tag 1
- Msg from 1 with tag 2
- Msg from 1 with tag 3
- Msg from 1 with tag 4
-
- Msg from 2 with tag 21
- Msg from 1 with tag 55
- Msg from 2 with tag 22
- Msg from 1 with tag 56
16. Master/slave
17. Master/slave
- Message passing is well-suited to handling computations where a task is divided up into subtasks, with most of the processes used to compute the subtasks and a few processes (often just one process) managing the tasks. The manager is called the "master" and the others the "workers" or the "slaves".
- In this example, you will build an Input/Output master/slave system. This will allow you to arrange relatively easily for different kinds of input and output from the program, including
- Ordered output (process 2 after process 1)
- Duplicate removal (a single instance of "Hello world" instead of one from each process)
- Input to all processes from a terminal
18. Master/slave
- This will be accomplished by dividing the
processes in MPI_COMM_WORLD into two sets
- The master (who will do all of the I/O) and the
slaves (who will do all of their I/O by
contacting the master).
- The slaves will also do any other computation that they might desire; for example, they might implement the Jacobi iteration.
- The master should accept messages from the slaves (of type MPI_CHAR) and print them in rank order (that is, first from slave 0, then from slave 1, etc.). The slaves should each send 2 messages to the master. For simplicity, have the slaves send the messages
- Hello from slave 3
- Goodbye from slave 3
- You may want to use these MPI routines in your
solution
- MPI_Comm_split, MPI_Send, MPI_Recv
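One way to structure the solution, assuming the highest rank acts as the master (the packaged io.c may organize this differently):

```c
/* Sketch of the I/O master/slave exercise. MPI_Comm_split separates the
   master (last rank) from the slaves, who are renumbered 0..size-2. */
#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MAXLEN 128

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Comm slavecomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    /* color 1 = master, color 0 = slaves (renumbered by world rank) */
    MPI_Comm_split(MPI_COMM_WORLD, rank == size - 1, rank, &slavecomm);

    if (rank == size - 1) {               /* master: does all the I/O */
        char buf[MAXLEN];
        int src, round;
        MPI_Status status;
        for (round = 0; round < 2; round++)       /* 2 messages each */
            for (src = 0; src < size - 1; src++) {  /* in rank order */
                MPI_Recv(buf, MAXLEN, MPI_CHAR, src, 0,
                         MPI_COMM_WORLD, &status);
                printf("%s\n", buf);
            }
    } else {                              /* slave: sends its output */
        char buf[MAXLEN];
        int srank;
        MPI_Comm_rank(slavecomm, &srank); /* rank among the slaves */
        sprintf(buf, "Hello from slave %d", srank);
        MPI_Send(buf, strlen(buf) + 1, MPI_CHAR, size - 1, 0,
                 MPI_COMM_WORLD);
        sprintf(buf, "Goodbye from slave %d", srank);
        MPI_Send(buf, strlen(buf) + 1, MPI_CHAR, size - 1, 0,
                 MPI_COMM_WORLD);
    }
    MPI_Comm_free(&slavecomm);
    MPI_Finalize();
    return 0;
}
```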
19. Master/slave
- Source
- casestudy/io/io.c
- casestudy/io/Makefile
- Sample Output
- mpicc -o io io.c
- mpirun -np 4 io
- Hello from slave 0
- Hello from slave 1
- Hello from slave 2
- Goodbye from slave 0
- Goodbye from slave 1
- Goodbye from slave 2
20. A simple output server
21. A simple output server
- Modify the previous example to accept three types of messages from the slaves. These types are
- Ordered output (just like the previous exercise)
- Unordered output (as if each slave printed directly)
- Exit notification (see below)
- The master continues to receive messages until it has received an exit message from each slave. For simplicity in programming, have each slave send the messages
- Hello from slave 3
- Goodbye from slave 3
- (with the ordered output mode) and
- I'm exiting (3)
- (with the unordered output mode)
- You may want to use these MPI routines in your solution
- MPI_Comm_split, MPI_Send, MPI_Recv
22. A simple output server
- Source
- casestudy/io2/io2.c
- casestudy/io2/Makefile
- Sample Output
- mpicc -o io2 io2.c
- mpirun -np 4 io2
- Hello from slave 0
- Hello from slave 1
- Hello from slave 2
- Goodbye from slave 0
- Goodbye from slave 1
- Goodbye from slave 2
- I'm exiting (0)
- I'm exiting (2)
- I'm exiting (1)
23. Benchmarking collective barrier
24. Benchmarking collective barrier
- The sample program measures the time it takes to perform an MPI_Barrier on MPI_COMM_WORLD.
- It will print the size of MPI_COMM_WORLD and the time for each test, and make sure that all processes are ready when the test begins.
- How does the performance of MPI_Barrier vary with the size of MPI_COMM_WORLD?
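The measurement loop might look like this sketch (NTESTS is an illustrative repetition count; the packaged barrier.c may differ):

```c
/* Sketch of the barrier benchmark: time NTESTS barriers and report the
   average. A warm-up barrier ensures all processes start together. */
#include <stdio.h>
#include <mpi.h>

#define NTESTS 100

int main(int argc, char *argv[])
{
    int rank, size, i;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Barrier(MPI_COMM_WORLD);      /* make sure everyone is ready */
    t0 = MPI_Wtime();
    for (i = 0; i < NTESTS; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    t1 = MPI_Wtime();
    if (rank == 0)
        printf("Kind\tnp\ttime (sec)\nBarrier\t%d\t%f\n",
               size, (t1 - t0) / NTESTS);
    MPI_Finalize();
    return 0;
}
```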
25. Benchmarking collective barrier
- Source
- casestudy/barrier/barrier.c
- casestudy/barrier/Makefile
- Sample Output
- mpirun -np 1 barrier
- Kind np time (sec)
- Barrier 1 0.000000
- Barrier 5 0.000212
- Barrier 10 0.000258
- Barrier 15 0.000327
- Barrier 20 0.000401
- Barrier 40 0.000442
26. Determining the amount of MPI buffering
27. Determining the amount of MPI buffering
- The sample program determines the amount of buffering that MPI_Send provides. That is, it determines how large a message can be sent with MPI_Send without a matching receive at the destination.
- You may want to use these MPI routines in your solution
- MPI_Wtime, MPI_Send, MPI_Recv
28. Determining the amount of MPI buffering
- Hint
- Use MPI_Wtime to establish a delay until an MPI_Recv is called at the destination process. By timing the MPI_Send, you can detect when the MPI_Send was waiting for the MPI_Recv.
- Source
- casestudy/buflimit/buflimit.c
- casestudy/buflimit/Makefile
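The timing idea in the hint can be sketched as follows, with an illustrative 2-second delay (the packaged buflimit.c may differ):

```c
/* Sketch of the buffering probe: the receiver delays posting its receive;
   if the matching MPI_Send returns in much less than the delay, the
   message was buffered. Run with exactly 2 processes. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define DELAY 2.0   /* seconds the receiver waits before its MPI_Recv */

int main(int argc, char *argv[])
{
    int rank, partner, bufsize;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    partner = 1 - rank;
    for (bufsize = 1024; bufsize <= 65536; bufsize *= 2) {
        char *buf = malloc(bufsize);
        double t0, t1, tstart = MPI_Wtime();
        if (rank == 0) {
            t0 = MPI_Wtime();
            MPI_Send(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
            t1 = MPI_Wtime();
            MPI_Recv(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            if (t1 - t0 > DELAY / 2)   /* the send had to wait */
                printf("MPI_Send blocks with buffers of size %d\n",
                       bufsize);
        } else {
            while (MPI_Wtime() - tstart < DELAY)
                ;                      /* delay before receiving */
            MPI_Recv(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, bufsize, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
        }
        printf("%d received %d fr %d\n", rank, bufsize, partner);
        free(buf);
    }
    MPI_Finalize();
    return 0;
}
```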
29. Determining the amount of MPI buffering
- Sample Output
- mpirun -np 2 buflimit
- Process 0 on tdgrocks.sci.hkbu.edu.hk
- Process 1 on comp-pvfs-0-1.local
- 0 received 1024 fr 1
- 1 received 1024 fr 0
- 0 received 2048 fr 1
- 1 received 2048 fr 0
- 0 received 4096 fr 1
- 1 received 4096 fr 0
- 0 received 8192 fr 1
- 1 received 8192 fr 0
- 0 received 16384 fr 1
- 1 received 16384 fr 0
- 0 received 32768 fr 1
- 1 received 32768 fr 0
- MPI_Send blocks with buffers of size 65536
- 0 received 65536 fr 1
- 1 received 65536 fr 0
30. Exploring the cost of synchronization delays
31. Exploring the cost of synchronization delays
- In this example, 2 processes are communicating
with a third.
- Process 0 is sending a long message to process 1
and process 2 is sending a relatively short
message to process 1 and then to process 0.
- The code is arranged so that process 1 has
already posted an MPI_Irecv for the message from
process 2 before receiving the message from
process 0, but also ensure that process 1
receives the long message from process 0 before
receiving the message from process 2.
32. Exploring the cost of synchronization delays
- This is a seemingly complex communication pattern, but it can occur in an application due to timing variations on each processor.
- If the message sent by process 2 to process 1 is short, but long enough to require a rendezvous protocol (meeting point), there can be a significant delay before the short message from process 2 is received by process 1, even though the receive for that message is already available.
- Explore the possibilities by considering various lengths of messages.
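The communication pattern can be sketched as follows, with illustrative message sizes (the packaged bad.c may differ):

```c
/* Sketch of the synchronization-delay pattern: process 1 posts the
   receive for process 2's message first, but completes the long receive
   from process 0 before it. Run with 3 processes. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define LONGSIZE  (1 << 20)   /* long message: 1M ints (illustrative) */
#define SHORTSIZE (1 << 12)   /* short message (illustrative) */

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        int *longbuf = malloc(LONGSIZE * sizeof(int));
        int *shortbuf = malloc(SHORTSIZE * sizeof(int));
        MPI_Send(longbuf, LONGSIZE, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(shortbuf, SHORTSIZE, MPI_INT, 2, 2, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        int *longbuf = malloc(LONGSIZE * sizeof(int));
        int *shortbuf = malloc(SHORTSIZE * sizeof(int));
        MPI_Request req;
        /* post the receive for process 2's message first ... */
        MPI_Irecv(shortbuf, SHORTSIZE, MPI_INT, 2, 2, MPI_COMM_WORLD,
                  &req);
        /* ... but complete the long receive from process 0 before it */
        MPI_Recv(longbuf, LONGSIZE, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 2) {
        int *shortbuf = malloc(SHORTSIZE * sizeof(int));
        double t0 = MPI_Wtime(), t1;
        MPI_Send(shortbuf, SHORTSIZE, MPI_INT, 1, 2, MPI_COMM_WORLD);
        t1 = MPI_Wtime();
        MPI_Send(shortbuf, SHORTSIZE, MPI_INT, 0, 2, MPI_COMM_WORLD);
        printf("Time for first send %f, for second %f\n",
               t1 - t0, MPI_Wtime() - t1);
    }
    MPI_Finalize();
    return 0;
}
```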
33. Exploring the cost of synchronization delays
34. Exploring the cost of synchronization delays
- Source
- casestudy/bad/bad.c
- casestudy/bad/Makefile
- Sample Output
- mpirun -np 3 maxtime
- 2 Litsize 1, Time for first send 0.000020,
for second 0.000009
35. Graphics
36. Graphics
- A simple MPI example program that uses a number of procedures in the MPE graphics library.
- The program draws lines and squares with different colors in graphic mode.
- The user can select a region, and the program will report the selected coordinates.
37. Graphics
- Source
- casestudy/graph/mpegraph.c
- casestudy/graph/Makefile
38. GalaxSee
39. GalaxSee
- The GalaxSee program lets the user model a number of bodies in space moving under the influence of their mutual gravitational attraction.
- It is effective for relatively small numbers of bodies (on the order of a few hundred), rather than the large numbers (over a million) currently being used by scientists to simulate galaxies.
- GalaxSee allows the user to see the effects that various initial configurations (mass, velocity, spatial distribution, rotation, dark matter, and presence of an intruder galaxy) have on the behavior of the system.
40. GalaxSee
- Command line options
- num_stars star_mass t_final do_display
- where
- num_stars: the number of stars (integer)
- star_mass: star mass (decimal)
- t_final: final time for the model in Myears (decimal)
- do_display: enter a 1 to show a graphical display, or a 0 to not show a graphical display
41. GalaxSee
- Source
- casestudy/galaxsee/Gal_pack.tgz
- Reference
- http://www.shodor.org/master/galaxsee/
42. Cracking RSA
43. Cryptanalysis
- Cryptanalysis is the study of how to compromise (defeat) cryptographic mechanisms, and cryptology is the discipline of cryptography and cryptanalysis combined.
- To most people, cryptography is concerned with keeping communications private. Indeed, the protection of sensitive communications has been the emphasis of cryptography throughout much of its history.
44. Encryption and Decryption
- Encryption is the transformation of data into a form that is as close to impossible as possible to read without the appropriate knowledge (a key; see below). Its purpose is to ensure privacy by keeping information hidden from anyone for whom it is not intended, even those who have access to the encrypted data.
- Decryption is the reverse of encryption; it is the transformation of encrypted data back into an intelligible form.
45. Cryptography
- Today's cryptography is more than encryption and decryption. Authentication is as fundamental a part of our lives as privacy.
- We use authentication throughout our everyday
lives - when we sign our name to some document
for instance - and, as we move to a world where
our decisions and agreements are communicated
electronically, we need to have electronic
techniques for providing authentication.
46. Public-Key vs. Secret-Key Cryptography
- A cryptosystem is simply an algorithm that can convert input data into something unrecognizable (encryption), and convert the unrecognizable data back to its original form (decryption).
- To encrypt, feed input data (known as "plaintext") and an encryption key to the encryption portion of the algorithm.
- To decrypt, feed the encrypted data (known as "ciphertext") and the proper decryption key to the decryption portion of the algorithm. The key is simply a secret number or series of numbers. Depending on the algorithm, the numbers may be random or may adhere to mathematical formulae.
47. Public-Key vs. Secret-Key Cryptography
- The drawback to secret-key cryptography is the
necessity of sharing keys.
- For instance, suppose Alice is sending email to
Bob. She wants to encrypt it first so any
eavesdropper will not be able to understand the
message. But if she encrypts using secret-key
cryptography, she has to somehow get the key into
Bob's hands. If an eavesdropper can intercept a
regular message, then an eavesdropper will
probably be able to intercept the message that
communicates the key.
48. Public-Key vs. Secret-Key Cryptography
- In contrast to secret-key is public-key cryptography. In such a system there are two keys, a public key and its inverse, the private key.
- In such a system, when Alice sends email to Bob,
she finds his public key (possibly in a directory
of some sort) and encrypts her message using that
key. Unlike secret-key cryptography, though, the
key used to encrypt will not decrypt the
ciphertext. Knowledge of Bob's public key will
not help an eavesdropper. To decrypt, Bob uses
his private key. If Bob wants to respond to
Alice, he will encrypt his message using her
public key.
49. The One-Way Function
- The challenge of public-key cryptography is developing a system in which it is impossible (or at least intractable) to deduce the private key from the public key.
- This can be accomplished by utilizing a one-way
function. With a one-way function, given some
input values, it is relatively simple to compute
a result. But if you start with the result, it is
extremely difficult to compute the original input
values. In mathematical terms, given x, computing
f(x) is easy, but given f(x), it is extremely
difficult to determine x.
50. RSA
- The RSA cryptosystem is a public-key cryptosystem that offers both encryption and digital signatures (authentication). Ronald Rivest, Adi Shamir, and Leonard Adleman developed the RSA system in 1977 [RSA78]; RSA stands for the first letter in each of its inventors' last names.
51. RSA Algorithm
- The RSA algorithm works as follows
- Take two large primes, p and q, and compute their product n = pq; n is called the modulus.
- Choose a number, e, less than n and relatively prime to (p-1)(q-1), which means e and (p-1)(q-1) have no common factors except 1.
- Find another number d such that (ed - 1) is divisible by (p-1)(q-1). The values e and d are called the public and private exponents, respectively.
- The public key is the pair (n, e); the private key is (n, d). The factors p and q may be destroyed or kept with the private key.
52. RSA Algorithm
- It is currently difficult to obtain the private key d from the public key (n, e). However, if one could factor n into p and q, then one could obtain the private key d. Thus the security of the RSA system is based on the assumption that factoring is difficult.
53. Encryption
- Suppose Alice wants to send a message m to Bob.
- Alice creates the ciphertext c by exponentiating: c = m^e mod n, where e and n are Bob's public key. She sends c to Bob.
- To decrypt, Bob also exponentiates: m = c^d mod n; the relationship between e and d ensures that Bob correctly recovers m.
- Since only Bob knows d, only Bob can decrypt this message.
54. Digital Signature
- Suppose Alice wants to send a message m to Bob in such a way that Bob is assured the message is authentic, has not been tampered with, and is from Alice.
- Alice creates a digital signature s by exponentiating: s = m^d mod n, where d and n are Alice's private key. She sends m and s to Bob.
- To verify the signature, Bob exponentiates and checks that the message m is recovered: m = s^e mod n, where e and n are Alice's public key.
55. Encryption
- Thus encryption and authentication take place
without any sharing of private keys
- each person uses only another's public key or
their own private key.
- Anyone can send an encrypted message or verify a
signed message, but only someone in possession of
the correct private key can decrypt or sign a
message.
56. What would it take to break the RSA cryptosystem?
- The obvious way to do this attack is to factor the public modulus, n, into its two prime factors, p and q. From p, q, and e, the public exponent, the attacker can easily get d, the private exponent. The hard part is factoring n; the security of RSA depends on factoring being difficult.
- You can use d to factor n, as well as use the factorization of n to find d.
57. What would it take to break the RSA cryptosystem?
- Another way to break the RSA cryptosystem is to find a technique to compute e-th roots mod n. Since c = m^e mod n, the e-th root of c mod n is the message m. This attack would allow someone to recover encrypted messages and forge signatures even without knowing the private key. This attack is not known to be equivalent to factoring. No general methods are currently known that attempt to break the RSA system in this way. However, in special cases where multiple related messages are encrypted with the same small exponent, it may be possible to recover the messages.
58. What would it take to break the RSA cryptosystem?
- Some people have also studied whether part of the message can be recovered from an encrypted message.
- The simplest single-message attack is the guessed plaintext attack. An attacker sees a ciphertext and guesses that the message might be, for example, "Attack at dawn," and encrypts this guess with the recipient's public key; by comparing the result with the actual ciphertext, the attacker knows whether or not the guess was correct. Appending some random bits to the message can thwart this attack.
59. What would it take to break the RSA cryptosystem?
- Of course, there are also attacks that aim not at the cryptosystem itself but at a given insecure implementation of the system.
- These do not count as "breaking" the RSA system, because it is not any weakness in the RSA algorithm that is exploited, but rather a weakness in a specific implementation.
- For example, if someone stores a private key insecurely, an attacker may discover it. One cannot emphasize strongly enough that to be truly secure, the RSA cryptosystem requires a secure implementation; mathematical security measures, such as choosing a long key size, are not enough. In practice, most successful attacks will likely be aimed at insecure implementations and at the key management stages of an RSA system.
60. How much does it cost to factor a large number?
61. The RSA Challenge Numbers
- The current challenge number, RSA-640, is 640 bits (193 digits) long.
- A link to each of the eight RSA challenge numbers is listed below.
- US$20,000 will be given to those who factor RSA-640.
- Reference
- http://www.rsasecurity.com/rsalabs/node.asp?id=2093
62. RSA Cracker
- Serial Randomized Brute Force Attack
- Source
- casestudy/rsa2/rsa2.c
- Reference
- http://www.daimi.au.dk/aveng/projects/rsa/
- Parallelized version?
- Do it yourself!!
- The programming structure is given to you.
- rsa2/Makefile
- rsa2/popsys.c
63. A Parallel Implementation of the Quadratic Sieve Algorithm
- The purpose of the project is to implement a
parallel version of the quadratic sieve algorithm
used for factoring large composite integers.
- Source
- casestudy/mpqs/mpqs_parallel.tgz
- Reference
- http://www.daimi.au.dk/pmn/scf02/CDROM/pr2/
64. END