Title: Distributed Computing
1Distributed Computing
- Utilize unused PC resources
- Processing
- Complex calculations
- Load distribution
- 25% of storage is unused
- SANs
- 100 computers with 80 GB drives: 6 TB unused
2Process Sharing Applications
- For large-scale computations
- Data analysis, data mining, scientific computing
- Research Problems
- SETI_at_Home
- Folding_at_Home
- distributed.net
- Genome_at_Home
- FightAIDS_at_Home
- Climate simulation
- Economics
- Medicine
3Distributed Computing
- P2P is not distributed computing, but the two share similar challenges and issues: sharing and taking advantage of resources available at endpoints, and harnessing their power for computationally intensive problems
- SETI_at_home, fightaids_at_home, genome_at_home
- Grid computing and e-science
- Computational grids to solve/simulate real-life problems
- E-Science
- Commercial applications
- United Devices, Entropia, Avaki, etc.
4Distributed Computing
A central coordinator schedules tasks on volunteer computers (master-worker paradigm, cycle stealing)
- Dedicated Applications
- SETI_at_Home, distributed.net, Décrypthon (France)
- Production applications
- Folding_at_home, Genome_at_home, Xpulsar_at_home, Folderol, Exodus, Peer review
- Research Platforms
- Javelin, Bayanihan, JET, Charlotte (based on Java)
- Commercial Platforms
- Entropia, Parabon, United Devices, Platform (AC)
(Figure: a client application exchanges parameters and results with the coordinator; the coordinator sends parameters over the Internet to volunteer PCs; each volunteer PC downloads and executes the application)
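The coordinator/volunteer flow on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheduler of any project named above: the names `coordinator`, `volunteer`, and `app` are hypothetical, and local threads stand in for volunteer PCs reached over the Internet.

```python
import queue
import threading

def coordinator(work_units, app, n_volunteers=4):
    """Hand out (id, parameters) pairs to volunteers and collect their results."""
    tasks = queue.Queue()
    for unit_id, params in enumerate(work_units):
        tasks.put((unit_id, params))

    results = {}
    lock = threading.Lock()

    def volunteer():
        # Each volunteer repeatedly fetches a work unit, runs the
        # application on it, and reports the result to the coordinator.
        while True:
            try:
                unit_id, params = tasks.get_nowait()
            except queue.Empty:
                return  # no work left
            value = app(params)
            with lock:
                results[unit_id] = value

    threads = [threading.Thread(target=volunteer) for _ in range(n_volunteers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Return results in work-unit order, whichever volunteer produced them.
    return [results[i] for i in range(len(work_units))]
```

For example, `coordinator([1, 2, 3], lambda p: p * p)` returns `[1, 4, 9]`; the order of the results follows the work units, not the order in which volunteers finished.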
6Cycle Sharing Model
- Chunks of data are sent to clients in suspend mode
- Data is processed by a client when it is not in use, then returned to the master
- Internet-based (master-slave) computing
- Example: SETI_at_Home scans radio telescope images
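The client side of the cycle sharing model might look like the following sketch. It is an assumption-laden illustration: `cycle_sharing_client`, `is_idle`, `process`, and `max_polls` are hypothetical names, and a real client would ask the operating system or screen saver whether the machine is idle instead of calling a supplied function.

```python
def cycle_sharing_client(chunks, is_idle, process, max_polls=1000):
    """Process data chunks only while the machine is idle.

    Returns (results to send back to the master, chunks still pending).
    """
    pending = list(chunks)
    results = []
    polls = 0
    while pending and polls < max_polls:
        polls += 1
        if not is_idle():
            continue  # the owner is using the PC: stay suspended
        results.append(process(pending.pop(0)))
    return results, pending
```

With an idle pattern of busy, idle, idle, busy, idle and chunks `[1, 2, 3]`, a squaring `process` yields `[1, 4, 9]` after five polls; stopping after three polls leaves chunk `3` pending for a later idle period.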
7SETI_at_HOME
Client/Server P2P
- Launched in 1999
- Scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI)
- Distributes a screen-saver-based application to users
- Applies signal analysis algorithms to different data sets to process radio-telescope data
- Has more than 3 million users
- Has used over a million years of CPU time to date
(Figure: SETI_at_Home main server holding radio-telescope data, with numbered steps: 1. Install screen saver; 2. SETI client (screen saver) starts; 4. Client sends results back to server)
8Distributed Computing SETI_at_home
- Search for Extraterrestrial Intelligence project that has over two million computers crunching away and downloading data gathered from the Arecibo radio telescope in Puerto Rico
- The SETI_at_Home project is widely regarded as the fastest computer in the world
- In fact, the project has already performed the single largest cumulative computation to date
- From the architecture point of view, SETI_at_Home is based upon client-server
- The centralised servers hold enormous amounts of data gathered from the Arecibo radio telescope "listening" to the skies
- That data needs to be analysed for distinct or unusual radio waves that might suggest extraterrestrial communications
- http://setiathome.ssl.berkeley.edu
9SETI_at_Home
- Search for Extraterrestrial Intelligence
11Processing
- Intel's Netbatch
- 10,000 workstations over 25 locations
- Chip design
- Shortened time for chip development
- Reduced outlay for new mainframes
- $500 million in savings
12Processing
- Amerada Hess
- Connects 200 Dell PCs to handle complex seismic data interpretation
- Allowed them to replace a pair of IBM supercomputers
"We're running seven times the throughput at a fraction of the cost."
Richard Ross, CIO
13Storage
- Intel
- Distribution of computer-based training
- Prevents large downloads from central servers
- Preserves bandwidth
- Preserves expensive network storage
14P2P Distributed Computing
- Allows any node to play different roles (client,
server, system infrastructure)
(Figure: P2P system of PCs in which any PC may act as Client (PC) or Server (PC); arrows labeled request, accept, provide, and result; potential communications between PCs for parallel applications)
A request may relate to computations or data
An accept concerns computation or data
A very simple problem statement, but one that leads to many research issues: scheduling, security, message passing, data storage
Large scale enlarges the problem: volatility, confidence, etc.
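The idea that any node can play either role can be sketched as a single class with both a client-side and a server-side method. This is a toy model under stated assumptions: `Peer`, `request`, and `accept` are hypothetical names, calls are in-process, and none of the research issues listed above (scheduling, security, volatility) are handled.

```python
class Peer:
    """A node that can act as client (request) and as server (accept)."""

    def __init__(self, name):
        self.name = name
        self.served = 0  # how many requests this peer has accepted

    def accept(self, func, args):
        # Server role: accept a computation and provide the result.
        self.served += 1
        return func(*args)

    def request(self, other, func, *args):
        # Client role: ask another peer to run a computation.
        return other.accept(func, args)
```

Two peers `a` and `b` can then serve each other: `a.request(b, pow, 2, 10)` has `b` compute `1024`, and `b.request(a, sum, [1, 2, 3])` has `a` compute `6`.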
15Three Obstacles to Making P2P Distributed Computing Routine
- New approaches to problem solving
- Data Grids, distributed computing, peer-to-peer, collaboration grids
- Structuring and writing programs
- Abstractions, tools
- Enabling resource sharing across distinct institutions
- Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification
Credit: Ian Foster
16P2P for Distributed Computing or Web Computing
- The distributed computing P2P applications are highlighted by the use of millions of Internet clients to analyze data looking for extraterrestrial life (SETI_at_home, http://setiathome.ssl.berkeley.edu/) and the newer project examining the folding of proteins (Folding_at_home, http://www.stanford.edu/group/pandegroup/Cosm/).
- These are building distributed computing solutions for a special class of applications: those that can be divided into a huge number of essentially independent computations, where a central server system doles out separate work chunks to each participating client.
- In the parallel computing community, these problems are called "pleasingly parallel" or "embarrassingly parallel".
- This approach is included in the P2P category because the computing is peer based, even though it does not have the "peer-only communication" characteristic of all aspects of Gnutella and Napster for information transfer.
- SETI_at_home and Folding_at_home are elegantly implemented as screen savers that you download.
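A small sketch of what "essentially independent computations" means in practice: counting primes in a range splits cleanly into chunks whose partial counts are simply added. The problem, the function names, and the use of a local thread pool in place of remote clients are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(start, stop):
    """Work done by one client on one chunk: primes in [start, stop)."""
    return sum(1 for n in range(start, stop) if is_prime(n))

def split_range(start, stop, chunk_size):
    """Cut [start, stop) into independent work chunks."""
    return [(lo, min(lo + chunk_size, stop))
            for lo in range(start, stop, chunk_size)]

def parallel_count(start, stop, chunk_size=1000, clients=4):
    chunks = split_range(start, stop, chunk_size)
    with ThreadPoolExecutor(max_workers=clients) as pool:
        # No chunk needs data from any other chunk, so the order in
        # which they finish does not matter; the server just sums.
        partial_counts = pool.map(lambda c: count_primes(*c), chunks)
        return sum(partial_counts)
```

Because the chunks never communicate, `parallel_count(0, 100, chunk_size=10)` gives the same answer as the single call `count_primes(0, 100)`.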
17P2P space Distributed Computing
- Distributed Collaboration
- Use under-utilized Internet and/or network resources for improving computation and data analysis
- MetaComputing, CareScience, DataSynapse, Distributed.net, DistributedScience, Entropia, Parabon, The Open Lab
- Distributed Search Engines
- Used to easily look up and share files and offer content management
- BearShare, Filetopia, Hotline Connect, InfraSearch, Plebio, Jibe, LimeWire, MusicBrainz.org, NeuroGrid, NextPage, Redfoot, Opencola, Project Pandango
18Entropia Financial Modeling I
19Entropia Financial Modeling II
- Each basic financial instrument can be calculated independently
- Central server interprets the total simulation
- Make money, or learn what causes market swings, or ...
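The split between independent per-instrument calculations and a central server that combines them can be illustrated with a toy Monte Carlo sketch. Entropia's actual models are not described in these slides; the function names, the one-step Gaussian price model, and the sample portfolio format are all assumptions.

```python
import random
from statistics import mean

def simulate_instrument(name, start_price, volatility, n_paths=1000, seed=0):
    """One client's job: estimate a single instrument's expected price."""
    # A private, name-derived random stream keeps each instrument
    # independent of every other instrument and of execution order.
    rng = random.Random(f"{name}-{seed}")
    finals = [start_price * (1 + rng.gauss(0, volatility))
              for _ in range(n_paths)]
    return name, mean(finals)

def central_server(portfolio, seed=0):
    """Collect per-instrument results and combine them into a total."""
    results = dict(simulate_instrument(*inst, seed=seed) for inst in portfolio)
    return results, sum(results.values())
```

Here `portfolio` is a list of hypothetical `(name, start_price, volatility)` tuples. Since no instrument depends on another, reordering the portfolio or sending each tuple to a different PC leaves every per-instrument estimate unchanged.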
20Drug Structure Simulations
21United Devices also does Drug Simulation
- Parameter study: do billions of simulations, each with different parameters
- Search-engine-like interface to simulation
- Works because each calculation fits in a PC; a detailed molecular model would usually not
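A parameter study of this kind can be sketched as a grid of parameter combinations, each simulated independently, followed by a simple ranking query over the results. The scoring function below is a made-up stand-in, not a molecular model, and all names (`parameter_grid`, `toy_simulation`, `run_study`, `search`, `angle`, `distance`) are hypothetical.

```python
from itertools import product

def parameter_grid(**ranges):
    """Yield one dict per combination of the given parameter values."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        yield dict(zip(names, values))

def toy_simulation(params):
    # Hypothetical score, higher is better; peaks at angle=30, distance=2.
    return -((params["angle"] - 30) ** 2) - (params["distance"] - 2) ** 2

def run_study(simulate, **ranges):
    # Each combination is an independent job small enough for one PC.
    return [(params, simulate(params)) for params in parameter_grid(**ranges)]

def search(results, top=3):
    """Search-style query: return the best-scoring parameter sets."""
    return sorted(results, key=lambda item: item[1], reverse=True)[:top]
```

Running `run_study(toy_simulation, angle=range(0, 91, 10), distance=[1, 2, 3])` produces 30 independent simulations, and `search(results, top=1)` returns the combination with `angle` 30 and `distance` 2.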
22Performance of Entropia Network
23Peer-to-Peer (P2P) illusion among collaborating clients, for Napster-like services or collaboration
(Figure: six boxes labeled Server)