Title: Beowulf Cluster Computing
1Beowulf Cluster Computing
PROJECT We constructed a parallel processing
computer system using the Beowulf cluster
computing design created at NASA in an attempt to
build a powerful computer that could assist in
Bioinformatics research and data
analysis. BEOWULF CLUSTERS A Beowulf Cluster
is a computer design that uses parallel
processing across multiple computers to create
cheap and powerful supercomputers. A Beowulf
Cluster in practice is usually a collection of
generic computers, either stock systems or
wholesale parts purchased independently and
assembled, connected through an internal
network. A cluster has two types of computers,
a master computer, and node computers. When a
large problem or set of data is given to a
Beowulf cluster, the master computer first runs a
program that breaks the problem into small
discrete pieces it then sends a piece to each
node to compute. As nodes finish their tasks,
the master computer continually sends more pieces
to them until the entire problem has been
computed. MPICH2 In order for
the master and node computers to communicate,
some sort message passing control structure is
required. MPI,(Message Passing Interface) is the
most commonly used such control, and the one that
we've incorporated into our project. MPICH2 is a
implementation of MPI that was specifically
designed for use with cluster computing systems
and parallel processing. It is an open source
set of libraries for various high level
programming languages that give programmers tools
to easily control how large problems are broken
apart and distributed to the various computers in
a cluster. OUR CLUSTER Using funding from
the Biology department, the cluster we
constructed contains eight computers with one
master and seven node computers. Each computer
in the cluster contains a dual core processor,
giving us a total of 16 processors to utilize.
Each runs on the Fedora Core 6 version of Linux
and uses the MPICH2 libraries for message
passing. They are all connected on a internal
network through a high speed gigabyte switch.
CLUSTER USES Clusters have a variety of
different applications in the world. They are
used in bioinformatics to run DNA string matching
algorithms or to run protein folding
applications. Geologists also use clusters to
emulate and predict earthquakes and model the
interior of the Earth and sea floor Clusters are
even used to render and manipulate
high-resolution graphics in engineering. Our
completed Beowulf cluster will use a computer
algorithm known as BLAST,(Basic Local Alignment
Search Tool), to analyze massive sets of DNA
sequences for research into Bioinformatics.
RESULTS The total processing power of our
cluster has yet to be determined. Once the
cluster has been completely streamlined and
stabilized, we will run benchmark tests to
calculate its average and peak performances CLUSTE
R LAYOUT AND DESIGN
Researcher Ben Case Researcher Stephen
Ciesla Advisor Ed Harcout Biology Consultant
Lorraine Olendzenski
Sample Cluster Computer
SATA Hard Drives
2 GB RAM
- Each Computer in the cluster is equipped with
- Intel Core 2 Duo 6400 Processor(Master Core 2
Duo 6700) - 2 Gigabytes of DDR RAM in Dual Channel
- D-Link Gigabyte Network Interface Card(Master 2x
Cards) - 60 Gigabyte Hard Drive(Master 1000 Gigabyte RAID
5)
Intel Core 2 Processor
D-Link Network Card