Title: Build A Cluster
1. Build A Cluster
2. What We'll Be Doing
- Overview of Rocks
- Bring up a frontend with Rocks
- Bring up compute nodes
3. Overview of Rocks
4. Rocks
- Goal: Make clusters easy
- Target audience: Scientists who want a capable computational resource in their own lab
5. Philosophy
- Caring for and feeding a system is not fun
- All compute nodes are 100% automatically installed
  - Critical for scaling
- Essential to track software updates
  - RHEL 3.0 has issued 225 source RPM updates since Oct 21
  - Roughly 1 updated SRPM per day
- Run on heterogeneous, standard, high-volume components
  - Use the components that offer the best price/performance!
6. More Philosophy
- Use installation as the common mechanism to manage a cluster
  - Everyone installs a system:
    - On initial bring-up
    - When replacing a dead node
    - When adding new nodes
- Rocks also uses installation to keep software consistent
  - If you catch yourself wondering whether a node's software is up to date, reinstall!
  - In 10 minutes, all doubt is erased
- Rocks doesn't attempt to incrementally update software
7. Rocks Cluster Distribution
- Fully-automated, cluster-aware distribution
- Software packages
  - Full Red Hat Linux distribution
    - Red Hat Enterprise Linux 3.0, rebuilt from source
  - De-facto standard cluster packages
  - Rocks packages
  - Rocks community packages
- System configuration
  - Configures the services in the packages
8. Rocks Hardware Architecture
9. Processors Supported
- x86 (Pentium/Athlon)
- Opteron
- Itanium
10. Interconnects Supported
- Ethernet
- Myrinet
- InfiniBand (in development)
  - We recently received IB gear
11. Storage
- NFS
  - The frontend exports all home directories
- Parallel Virtual File System version 1 (PVFS)
  - System nodes can be targeted as combined compute/PVFS nodes or strictly PVFS nodes
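As a minimal sketch of the NFS side (the /export/home path and the 10.1.0.0 private network below are assumptions; adjust for your site), the frontend's /etc/exports might contain:

    # /etc/exports on the frontend (path and network are assumptions)
    /export/home 10.1.0.0/255.255.0.0(rw,async)

After editing, re-export with exportfs -a; compute nodes then mount their home directories from the frontend.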
12. Minimum Hardware Requirements
- Frontend
  - 2 Ethernet connections
  - 18 GB disk drive
  - 512 MB memory
- Compute
  - 1 Ethernet connection
  - 18 GB disk drive
  - 512 MB memory
- Power
- Ethernet switches
13. Cluster Software Stack
14. Rocks Rolls
- Rolls are containers for software packages and the configuration scripts for those packages
- Rolls dissect a monolithic distribution
15. Guaranteeing Reliable Software Deployment
- Rolls are added by the Red Hat installer
- Software is added and configured in a known environment
  - Benefit: can guarantee correct software functionality
16. Red Hat Installer Modified to Accept Rolls
17. Status
18. But Are Rocks Clusters High-Performance Systems?
- 4 clusters on the June 2004 Top500 list:
  - #26, Texas Advanced Computing Center
  - #176, Scalable Systems Group, Dell Computer
  - #201, SDSC
  - #408, UCSD Center for Theoretical and Biological Physics
19. Who Cares?
- Over 500 subscribers to the Rocks Discussion List
- Over 200 clusters voluntarily registered on the Rocks Cluster Registration page
21. Install Compute Nodes
22. Login to the Frontend
- Create an ssh public/private key pair
  - You'll be prompted for a passphrase
  - These keys are used to securely log in to compute nodes without having to enter a password each time (see the sketch below)
- Execute insert-ethers
  - This utility listens for new compute nodes
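A minimal sketch of this first session (the key type and prompts depend on your OpenSSH version; accept the default key path when asked):

    # create the ssh public/private key pair; you'll be prompted for a passphrase
    ssh-keygen -t rsa

    # listen for new compute nodes (run as root on the frontend)
    insert-ethers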
23. Insert-ethers
- Used to integrate appliances into the cluster
- We'll choose "Compute"
24. Boot a Compute Node in Installation Mode
- Instruct the node to network boot
  - Network boot forces the compute node to run the PXE protocol (Preboot eXecution Environment)
- Can also use the Rocks Base CD
- If there is no CD and no PXE-enabled NIC, you can use a boot floppy built from Etherboot (http://www.rom-o-matic.net)
25. Insert-ethers Discovers the Node
26. Insert-ethers Status
27. eKV: Ethernet Keyboard and Video
- Monitor your compute node installation over the Ethernet network
  - No KVM required!
- Execute: ssh compute-0-0
28. Node Info Stored in a MySQL Database
- If you know SQL, you can execute some powerful commands (a sketch follows below)
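For example, a sketch assuming the Rocks database is named cluster and contains a nodes table, as on a standard install; the exact schema varies by release:

    # list every node the frontend knows about
    mysql -u root cluster -e 'SELECT * FROM nodes;'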
29. Cluster Database
30. Kickstart
- Red Hat's Kickstart
  - Monolithic flat ASCII file
  - No macro language
  - Requires forking based on site information and node type
- Rocks XML Kickstart
  - Decomposes a kickstart file into nodes and a graph
  - The graph specifies an OO framework
  - Each node specifies a service and its configuration
  - Macros and SQL for site configuration
  - Driven from a web CGI script (see the sketch below)
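As a rough sketch of that CGI-driven flow, an installing node fetches its generated kickstart file from the frontend's web server; the URL path below is an assumption and varies across Rocks releases:

    # ask the frontend to generate and return this node's kickstart file
    wget 'http://frontend/install/kickstart.cgi' -O /tmp/ks.cfg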
31. Extra insert-ethers Usage
- If you have more than one cabinet:
    insert-ethers --cabinet=1
- To replace a dead node:
    insert-ethers --replace="compute-0-0"
- To rebuild and restart relevant services:
    insert-ethers --update
32. Sample Node File

    <?xml version="1.0" standalone="no"?>
    <!DOCTYPE kickstart SYSTEM "@KICKSTART_DTD@" [<!ENTITY ssh "openssh">]>
    <kickstart>

      <description>
        Enable SSH
      </description>

      <package>&ssh;</package>
      <package>&ssh;-clients</package>
      <package>&ssh;-server</package>
      <package>&ssh;-askpass</package>

      <post>

        <file name="/etc/ssh/ssh_config">
          Host *
            CheckHostIP           no
            ForwardX11            yes
            ForwardAgent          yes
            StrictHostKeyChecking no
            UsePrivilegedPort     no
            FallBackToRsh         no
            Protocol              1,2
        </file>

        chmod o+rx /root
        mkdir /root/.ssh
        chmod o+rx /root/.ssh

      </post>
    </kickstart>
33. Sample Graph File

    <?xml version="1.0" standalone="no"?>
    <!DOCTYPE kickstart SYSTEM "@GRAPH_DTD@">
    <graph>

      <description>
        Default Graph for NPACI Rocks.
      </description>

      <edge from="base" to="scripting"/>
      <edge from="base" to="ssh"/>
      <edge from="base" to="ssl"/>
      <edge from="base" to="lilo" arch="i386"/>
      <edge from="base" to="elilo" arch="ia64"/>
      <edge from="node" to="base" weight="80"/>
      <edge from="node" to="accounting"/>
      <edge from="slave-node" to="node"/>
      <edge from="slave-node" to="nis-client"/>
      <edge from="slave-node" to="autofs-client"/>
      <edge from="slave-node" to="dhcp-client"/>
      <edge from="slave-node" to="snmp-server"/>
      <edge from="slave-node" to="node-certs"/>
      <edge from="compute" to="slave-node"/>
      <edge from="compute" to="usher-server"/>
      <edge from="master-node" to="node"/>
      <edge from="master-node" to="x11"/>
      <edge from="master-node" to="usher-client"/>

    </graph>
34. Kickstart Framework
35. Appliances
- Laptop / Desktop
- Appliances
  - Final classes
  - Node types
- Desktop IsA
  - standalone
- Laptop IsA
  - standalone
  - pcmcia
- Code re-use is good
36. Architecture Differences
- Conditional inheritance
  - Annotate edges with target architectures (see the XML sketch below)
  - If i386: Base IsA grub
  - If ia64: Base IsA elilo
- One graph, many CPUs
  - Heterogeneity is easy
  - Not so for SSI or imaging approaches
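Expressed in the graph XML (mirroring the sample graph file above, with the grub/elilo node names this slide uses):

    <!-- the edge is only followed when the installing node matches the arch -->
    <edge from="base" to="grub" arch="i386"/>
    <edge from="base" to="elilo" arch="ia64"/>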
37. Installation Timeline