Title: Ubuntu Linux for Server Networks in Science
1Ubuntu Linux for Server Networks in Science
- Tony Travis
- Rowett Research Institute
- Aberdeen
2Tony Travis (RRI, Aberdeen)
- Job
- Bioinformatician
- Expertise
- Bioinformatics
- WGS sequencing
- Image analysis
- Plant science
- Parallel computing
- Unix/Linux
3Funding
- Scottish Executive Environment and Rural Affairs Department (SEERAD)
- European Nutrigenomics Organisation (NuGO)
4Collaboration
- Computing Science
- Peter Gray
- Alun Preece
- Chris Burnett
- NuGO WP8
- Heiner Boeing
- Ulrich Harttig
German Institute of Human Nutrition,
Potsdam-Rehbrücke
5Beowulf cluster
- Donald Becker
- Scalable performance
- Commodity hardware
- Private network
- Open source software infrastructure
Grendel
6EPCC BOBCAT
- Budget-Optimised Beowulf Cluster using Affordable Technology
- Diskless nodes
- Dual ethernet: (1) System, (2) Application
7Separate network fabrics
- Cluster servers
- Private
- 192.168.0.0
- 192.168.1.0
- LAN
- 143.234.32.0
- NAT
- Head node
- to LAN/WAN
- Cluster nodes
- Private
- 192.168.0.0
- 192.168.1.0
- NAT
- 192.168.0.0
- To head node
8Traffic segregation
- System (192.168.0.0) 100Base-T
- PXE boot, DHCP, NFS
- Application (192.168.1.0) Gigabit
- HPC interconnect
- LAN (143.234.32.0) 100Base-T
- Interactive login, HTTP, Web services
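The segregation above can be illustrated with a short sketch that maps an address to the fabric it belongs to. The network addresses come from the slides; the /24 prefix lengths are an assumption, since the slides do not state netmasks, and `FABRICS` and `classify` are hypothetical names used only for illustration.

```python
import ipaddress

# Network addresses are from the slides; the /24 prefix lengths are ASSUMED.
FABRICS = {
    "system": ipaddress.ip_network("192.168.0.0/24"),       # PXE boot, DHCP, NFS
    "application": ipaddress.ip_network("192.168.1.0/24"),  # HPC interconnect
    "lan": ipaddress.ip_network("143.234.32.0/24"),         # login, HTTP, web services
}


def classify(address: str) -> str:
    """Return the name of the fabric containing `address`, or 'external'."""
    ip = ipaddress.ip_address(address)
    for name, network in FABRICS.items():
        if ip in network:
            return name
    return "external"


if __name__ == "__main__":
    for addr in ("192.168.0.10", "192.168.1.10", "143.234.32.5", "8.8.8.8"):
        print(addr, "->", classify(addr))
```

A lookup like this is only a model of the idea: on the real cluster the separation is enforced by physically separate interfaces and switches, not by software classification.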
9Distributed process space
- NASA/Scyld
- bproc Linux kernel modification
- Remote process management
- openMOSIX
- Linux kernel-level load balancer
- Automatic process migration
- openSSI
- Single System Image
10Beowulf operating system
- Ubuntu 6.06.1 LTS
- Kernel level Load-balancing for Linux
- Open source free version of MOSIX
- SSI (Single System Image)
- Type 2 Beowulf
11RRI/BioSS Beowulf cluster
[Diagram: cluster network layout. Labels: Data Filer, Local Filer, sw4, Lab, LAN, RAID, Desktop, Internal users, bobcat, topcat, sw5, sw3, WAN, compute nodes, External users]
12Windows is easier than Unix
- it depends what you want to do...
- Windows is a familiar system
- good desktop environment
- well supported
13Unix is hard to understand...
- command line
- similar to DOS
- cd
- mkdir
- blastall -p blastx...
- X-Windows
- graphical interface
- similar to Windows
14Bio-Linux Ubuntu 'biobuntu'
- NEBC (UK) Bio-Linux4 software
- NERC Environmental Bioinformatics Centre
- Ubuntu 6.06.1 LTS operating system
- Open source software based on Debian Linux
24Cluster computing
- Embarrassingly parallel
- Multiple instances of CPU-intensive tasks running on separate processors
- Distributed processing (PVM/MPI)
- Single instance of a CPU-intensive task divided up among all available processors
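The two styles can be contrasted in a minimal sketch. It uses Python's `multiprocessing` on a single machine as a stand-in for PVM/MPI across cluster nodes, which is an assumption made for brevity, not the setup described in the slides. The prime-counting task and all function names are hypothetical.

```python
from multiprocessing import Pool


def count_primes(bounds):
    """CPU-intensive stand-in task: count the primes in [lo, hi)."""
    lo, hi = bounds
    return sum(
        1
        for n in range(max(lo, 2), hi)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )


def split(lo, hi, parts):
    """Divide [lo, hi) into at most `parts` contiguous sub-ranges (needs hi > lo)."""
    step = -(-(hi - lo) // parts)  # ceiling division
    return [(a, min(a + step, hi)) for a in range(lo, hi, step)]


def independent_jobs(jobs, mapper=map):
    """Embarrassingly parallel: each job is a separate, unrelated task."""
    return list(mapper(count_primes, jobs))


def divided_job(lo, hi, parts, mapper=map):
    """One task divided among workers; the partial results are combined."""
    return sum(mapper(count_primes, split(lo, hi, parts)))


if __name__ == "__main__":
    with Pool(4) as pool:
        # Several unrelated tasks, one per worker.
        print(independent_jobs([(0, 1000), (5000, 6000)], pool.map))
        # A single task split four ways and summed.
        print(divided_job(0, 10000, 4, pool.map))
```

Passing the mapping function as a parameter keeps the task logic identical in both cases; only the way work is handed out differs.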
25Beowulf limitations
- Embarrassingly parallel
- Single CPU performance/memory
- Interconnect bandwidth
- Distributed (MPI)
- Cluster performance/memory
- Interconnect bandwidth
- openMosix MPI
- Load-balancing assists MPI
26NuGO Black Box project
- Develop 'lab-scale' server pre-configured with base2 and other bioinformatics tools
- Web-based appliance accessed and administered using a web browser
- Support remote login via SSH and VNC
- Compatible with NuGO BioMoby
- Encourage data sharing
27Why a black-box approach?
- Don't need to know how it works to use it
- Deploy a pre-configured Linux server
- Install from live DVD on existing hardware
- Pre-installed on systems supplied by NuGO
- Reduce need for IT support in every lab
- Automatic backup and software updates
- Autonomous system able to discover peer systems
and cooperate with them
31Web services
- Enable transparent use of Beowulf cluster as compute GRID resource
- Present non-interactive program interface
Semantic web
32Services
Model Explorer
Workflow Diagram
33NuGO-GRID
- Network of NuGO-Linux servers
- Interconnected to create GRID
- Compute GRID
- Load-balance between servers
- Data GRID
- Pool data and share resources
- P2P (Peer-to-peer)
- Local control of resource sharing
34Current status of NuGO-GRID
35SSH (Secure Shell)
- Used to tunnel the insecure database port 3306 through encrypted channel
- Encrypts data in transit and allows the database server port to remain hidden
Encrypted data
Database (port 3306)
Firewall (port 22)
Client (port 3306)
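As a rough sketch of the tunnel described on this slide, the helper below builds an OpenSSH local port-forwarding command. The user and host names are borrowed from the remote administration slide purely for illustration, and the assumption that the database listens on `localhost` of the SSH server is not stated in the slides. `tunnel_command` is a hypothetical name.

```python
import shlex


def tunnel_command(user, gateway, local_port=3306, remote_port=3306,
                   db_host="localhost"):
    """Build an ssh command that forwards local_port on the client to
    db_host:remote_port as seen from the gateway, over the SSH connection."""
    return [
        "ssh",
        "-N",  # do not run a remote command, forward ports only
        "-L", f"{local_port}:{db_host}:{remote_port}",
        f"{user}@{gateway}",
    ]


if __name__ == "__main__":
    # Illustrative values only; substitute a real account and server.
    print(shlex.join(tunnel_command("nugo", "nbx1.nugo.org")))
```

With the tunnel running, a database client connects to port 3306 on its own machine, and only port 22 needs to be open on the firewall.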
36Remote administration via SSH
nugo@nbx2$ sudo nbx-keyscan
nbx1.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx2.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
getaddrinfo nbx3.nugo.org: Name or service not known
getaddrinfo nbx4.nugo.org: Name or service not known
nbx5.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
read (nbx6.nugo.org): No route to host
nbx7.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx8.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx9.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx10.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
getaddrinfo nbx11.nugo.org: Name or service not known
nbx13.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx14.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx15.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx16.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx17.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx18.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx19.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx21.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3
nbx22.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3
nbx23.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3
getaddrinfo nbx24.nugo.org: Name or service not known
nbx25.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
nbx26.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
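Output of this kind can be summarised automatically. The sketch below is one possible approach, assuming each line is either a host followed by its SSH banner or a `getaddrinfo`/`read` error naming the host. `parse_keyscan` and the regular expressions are hypothetical; they are not part of `nbx-keyscan`, whose implementation the slides do not show.

```python
import re

# Host followed by an SSH banner, e.g. "host SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2".
BANNER = re.compile(r"^(?:# )?(\S+) (SSH-\S+ \S+)$")
# Resolver or connection errors; the colon is optional to tolerate mangled text.
ERROR = re.compile(r"^(?:getaddrinfo |read \()([\w.-]+)\)?:? (.+)$")

SAMPLE = """\
nbx1.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3.2
getaddrinfo nbx3.nugo.org: Name or service not known
read (nbx6.nugo.org): No route to host
nbx21.nugo.org SSH-2.0-OpenSSH_4.2p1 Debian-7ubuntu3
"""


def parse_keyscan(text):
    """Split scan output into ({host: banner}, {host: error message})."""
    up, down = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if m := ERROR.match(line):
            down[m.group(1)] = m.group(2)
        elif m := BANNER.match(line):
            up[m.group(1)] = m.group(2)
    return up, down


if __name__ == "__main__":
    reachable, unreachable = parse_keyscan(SAMPLE)
    print("reachable:", sorted(reachable))
    print("unreachable:", unreachable)
```

Grouping the reachable hosts by banner would also show which servers still run an older OpenSSH package build.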
37Dancer's shell / distributed shell
- Used to execute remote commands
- sequentially
- dsh -Mg up -- uptime
- concurrently
- dsh -cMg up -- uptime
- System administration
- local and remote NBX managers
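The difference between the sequential and concurrent modes above can be sketched in a few lines. This is not how `dsh` is implemented; it is a simplified model using a thread pool, and `run_sequentially`, `run_concurrently` and `ssh_uptime` are hypothetical names. The `ssh_uptime` runner assumes key-based SSH login is already configured.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_sequentially(hosts, run):
    """Call run(host) on each host in turn, as with `dsh -Mg`."""
    return {host: run(host) for host in hosts}


def run_concurrently(hosts, run, max_workers=8):
    """Call run(host) on all hosts at the same time, as with `dsh -cMg`."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(hosts, pool.map(run, hosts)))


def ssh_uptime(host):
    """Hypothetical runner: execute `uptime` on `host` through the ssh client."""
    result = subprocess.run(
        ["ssh", host, "uptime"], capture_output=True, text=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # A stand-in runner so the sketch runs without any remote hosts.
    def fake(host):
        return f"{host}: up"

    print(run_sequentially(["nbx1", "nbx2"], fake))
    print(run_concurrently(["nbx1", "nbx2"], fake))
```

Sequential execution keeps output in order and stops being practical as the number of servers grows; concurrent execution finishes in roughly the time of the slowest host.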
38IT policy at NuGO partners
- Limited access requested
- Port 22 (SSH) and 80 (HTTP) open
- Tunnel insecure protocols via SSH
- Client PC requirements modest
- Java enabled web browser
- Optional installation of Windows clients
- Remote admin of NBX's by NuGO
39What else is needed?
- Transparent p2p exchange of base2 data objects between NBX's via BioMoby
- Integrate NuGO R-server and RRI/BioSS Beowulf cluster into NuGO BioMoby
- Execute base2 plugins on remote compute servers (large memory SMP Beowulf)
- Establish NBX help desk and training
- Get more feedback from Focus teams
40In conclusion
- Ubuntu is Debian
- better presented
- better supported
- 'biobuntu'
- Ubuntu 6.06 LTS
- NEBC bio-linux
http://bioinformatics.rri.sari.ac.uk/biobuntu