Clustering Technology Overview - PowerPoint PPT Presentation

Slides: 49
Provided by: dpnmPos

Transcript and Presenter's Notes

Title: Clustering Technology Overview


1
Clustering Technology Overview
Clustering with Linux
Sungho Kim , Ph.D. President KESPER Inc.
2
Agenda
  • Linux Overview
  • Linux Kernel Features
  • Linux Network Protocols Overview
  • Linux Clustering Overview
  • Network Protocols for Clustering
  • HPC Cluster
  • Internet Cluster
  • HA Cluster
  • Conclusions
3
Linux Overview
  • Multi-user Multitasking Unix-like OS
  • Multi-architecture, multi-platform OS
  • Freely distributable open source OS with GNU
    software
  • IEEE POSIX compliance
  • Support for a wide range of peripherals
  • Wide configurability: from embedded systems to
    supercomputers
  • X Window support
  • Full network awareness
  • Support for various network protocols and services:
    TCP/IP, IPX/SPX, AppleTalk, Samba, NFS, Web,
    Mail, etc.
  • Full 32/64-bit support

4
Linux Hardware
  • Systems
  • IBM PC and compatibles
  • Apple Macintosh
  • from m68000 to powerpc
  • SUN
  • SGI
  • Atari/Amiga
  • Compaq alpha
  • Netwinder

CPU: Intel x86, AMD, Cyrix; Alpha EV5, EV6
(64-bit); PowerPC; Sparc, UltraSparc (64-bit); M68k;
StrongARM/ARM; MIPS
5
Linux Network
Network Applications
  • Web Server: Apache, Netscape
  • DHCP Server: dhcpd
  • FTP Server: proftpd ( ftp, ncftp )
  • Mail Server: sendmail / pine, mutt, elm;
    pop3 / imap / procmail
  • Mailing list: majordomo
  • Chatting Server: irc ( bitchx, irc )
  • File Server: Samba
  • News Server: innd ( tin, pine, trn )
  • DNS Server: bind
  • NIS Server: NIS
  • Network Interface Cards
  • 10/100/1000 Mb/s Ethernet
  • Myrinet
  • ATM
  • Token Ring / FDDI / HIPPI
  • ARCnet
  • ISDN
  • X.25
  • Frame Relay
  • Fibre Channel
  • WAN

6
Kernel Features
  • Kernel Options of 2.2.x
  • Code maturity level options
  • Processor type and features
  • Loadable module support
  • General Setup
  • Plug and Play support
  • Block devices
  • Networking options
  • SCSI options
  • SCSI low-level drivers
  • Network device support
  • Amateur Radio support
  • ISDN subsystem
  • CD-ROM drivers

  • Character devices
  • Mice
  • Video for Linux
  • Joystick support
  • Ftape, the floppy tape device driver
  • File systems
  • Network File Systems
  • Partition types
  • Console drivers
  • Sound
  • Kernel Hacking
7
Specific Features
  • Status of Kernel 2.4-test
  • USB support
  • Logical Volume Manager
  • Ext3 journaling file system
  • IrDA driver updates
  • Use of gas instead of as86
  • Athlon support
  • QuickCam support
  • XFree86 DRI (Direct Rendering Infrastructure)
  • Kernel HTTPD support
  • Direct decompression from Flash or ROM
  • I2O driver updates
  • DVD filesystem (UDF) support

8
Network Protocols
  • Supported Kernel Network Features
  • TCP/IP Protocol
  • IPX
  • Multicasting ( MBONE )
  • Tunneling ( GRE / Mobile-IP )
  • VPN
  • Advanced Router
  • WAN Router ( WAN Card Linux )
  • Frame Relay / X.25 / leased line
  • HIPPI ( Cluster and Supercomputer )
  • Token Ring
  • IP Masquerading ( NAT )
  • IP Alias ( Virtual IP / Virtual domain )
  • Bridging ( Bridging / Load Balancing )
  • ISDN / xDSL

9
Network Protocols
Linux Networking
  • Other Network Protocols
  • EQL ( Serial Line load balancing )
  • SLIP ( Serial Line Internet Protocol )
  • CSLIP ( Compressed Serial Line Internet Protocol )
  • PPP ( Point-to-Point Protocol )
  • PLIP ( Parallel Line Internet Protocol )
  • X.25 PLP ( Packet Layer Protocol )
  • HIPPI ( High Performance Parallel Interface )
  • FDDI ( Fiber Distributed Data Interface )
  • IPv6 ( IPng ) Experimental
  • ARCnet
  • SNMP

10
Cluster System Overview
  • Category of Cluster Systems
  • Categories depend on the configuration method
    and the application area
  • HPC Cluster: computation-intensive workloads
  • Bulk Storage Cluster: stored data sharing and
    service
  • Web/Internet Cluster: network load distribution
    and load balancing (LB)
  • HA Cluster: increases the availability of systems
  • Components: Network, OS, Storage, API

HPC: High Performance Computing; HA: High
Availability; LB: Load Balancer
11
Linux Cluster Network
Filtering
  • IP packet filtering
  • Linux socket filtering (BSD socket filtering)
  • Unix domain socket filtering ( X Window, syslog )
  • Firewall: packet filter / IP masquerading

IP kernel-level autoconfiguration
  • Network booting
  • X terminal
  • TFTP / BOOTP / RARP

  • IP Tunneling
  • Encapsulating data of protocol
  • VPN ( Virtual Private Network )
  • GRE tunneling
  • Generic Routing Encapsulation
  • CISCO Router
  • Mobile-IP for laptop
  • IP Firewalls/Masquerading
  • NAT ( Network Address Translation )
  • Modified firewall
  • IP auto forward
  • IP port forward

12
Linux HPC Cluster
  • Clustering Technology
  • A group of computers, connected by pre-configured
    networks, that execute jobs in parallel
  • Beowulf: Linux-based cluster
  • Characteristics of Clusters
  • High availability and expandability
  • High performance/price ratio

Personal Supercomputer
13
Linux HPC Cluster
  • Components of Clusters
  • Hardware
  • CPU: Intel Pentium, Digital Alpha, Mac G3
  • Network: Ethernet, Myrinet, ATM, Gigabit
    Ethernet
  • Storage: Fibre Channel/SCSI RAID
  • Software
  • Operating System / Compiler: Linux, Windows
    NT, DEC OSF
  • Communication Library: PVM, MPI
  • Administration Tool: CMS
  • Queuing Software: DQS, PBS
  • Application Libraries: BLACS, ATLAS,
    ScaLAPACK, PBLAS

14
Linux HPC Cluster
  • AVALON - Los Alamos National Lab.
  • Configuration of Hardware systems

Network Configuration
  • 4x 3Com SuperStack II 3900 36-port Fast Ethernet
    switches
  • 3Com SuperStack II 9300 12-port Gigabit Ethernet
    switch
  • Switched network of 144 Fast Ethernet ports
  • Cost about 300 per port
  • Cyclades multiport serial switches

Node Configuration (140)
  • 533 MHz Alpha 21164A microprocessor
  • DEC AlphaPC 164LX motherboard
  • ECC SDRAM DIMMs (256 Mbytes total per node)
  • Quantum Fireball ST3.2A 3079 MB EIDE U-ATA drive
  • Kingston Ethernet card with a DEC Tulip chipset

15
Linux HPC Cluster
[Diagram: one 3Com 9300 Gigabit Ethernet switch connected to four 3900 switches]
16
Linux HPC Cluster
  • Software Configurations
  • OS: RedHat Linux 5.0, kernel 2.1.125
  • MPI: MPICH and an in-house basic set of MPI routines
  • Compiler: egcs 1.1b
  • Application Programs
  • SPaSM
  • Gravitational tree code

17
Linux HPC Cluster
  • Performance (113/500)

                            70 nodes      140 nodes
    Linpack benchmark       19.7 GFlops   47.7 GFlops
    SPaSM                   12.8 GFlops   29.6 GFlops
    Gravitational treecode  10.0 GFlops   -

  • Price vs Performance
  • Price of Avalon: 313,000
  • Avalon's performance: comparable to a 64-CPU
    195 MHz SGI Origin 2000
    (SPaSM, tree code, and Linpack)
  • Price of a 64-CPU SGI Origin 2000: over 100 M
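
The price and performance figures on this slide can be turned into a price-per-GFlops and a per-node throughput figure. The following is a minimal sketch; the numbers are copied from the slide, the currency unit is not stated there, and the function names are illustrative only.

```python
# Figures copied from the slide. The currency unit is not stated in the
# transcript, so prices are treated as plain numbers.
AVALON_PRICE = 313_000
LINPACK_GFLOPS = {70: 19.7, 140: 47.7}   # node count -> measured GFlops
SPASM_GFLOPS = {70: 12.8, 140: 29.6}


def gflops_per_node(results):
    """Return the sustained GFlops per node for each measured node count."""
    return {nodes: gflops / nodes for nodes, gflops in results.items()}


def price_per_gflops(price, gflops):
    """Return the price paid for each sustained GFlops."""
    return price / gflops


if __name__ == "__main__":
    for name, results in (("Linpack", LINPACK_GFLOPS), ("SPaSM", SPASM_GFLOPS)):
        for nodes, per_node in gflops_per_node(results).items():
            print(f"{name:8s} {nodes:3d} nodes: {per_node:.3f} GFlops per node")
    cost = price_per_gflops(AVALON_PRICE, LINPACK_GFLOPS[140])
    print(f"Price per Linpack GFlops at 140 nodes: {cost:.0f}")
```

The same two functions can be applied to any other system for which a price and a measured benchmark result are known.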

18
Network Configuration
Simple Network Connection: nodes have Internet
IP addresses
LAN/WAN
Server 1
Server n
Cluster Server Farm
19
Network Configuration
Double Network Connection: nodes have Internet
IP addresses and local IP addresses
LAN/WAN
Server n
Server 1
Second-layer Network
Cluster Server Farm
20
Network Configuration
Double Network Connection, Master-Slave (NAT)
configuration: nodes have local IP addresses
Master Server
LAN/WAN
Slave Server n
Slave Server 1
Second-layer Network
Cluster Server Farm
21
Second-layer Network connection with Cross-bar
connection on 32 Node Cluster
Crossbar Inter-connection
16 Nodes
16 Nodes
  • 32 Host Bus Adapters
  • 12 Switches
  • 64 Cables

22
I/O Connection
Keyboard-Video-Mouse and Disk IO connection
Master (IO controller)
Node 0 (IO controller)
SCSI / FC
Console Splitter Switcher
Node 1 (IO controller)
RAID Controller(0)
Node 2 (IO controller)
RAID Controller(1)
Monitor Keyboard Mouse
Node 3 (IO controller)
Node 4 (IO controller)
Node 5 (IO controller)
Node 6 (IO controller)
23
Internet Cluster
Virtual Internet Cluster Server
  • Scalable and highly available server built on a
    cluster of real servers
  • The architecture of the cluster is transparent to
    end users, who see only a single virtual server

Methods to build a Virtual Internet Cluster Server
  • Virtual Server via NAT
  • Virtual Server via IP tunneling
  • Virtual Server via IP filtering
  • Virtual Server via Direct Routing

Ref: www.linux-vs.org
24
Internet Cluster
Internet Cluster
25
Internet Cluster
Virtual Internet Cluster Server via NAT
This is done by network address and port translation.
The implementation builds on the Linux IP Masquerading
code and reuses the port forwarding code. Refer to the
ipfwadm command. The whole process is shown in the
figures.
26
Internet Cluster
Intranet
DSU/Router
Internet
User
L4 Switch
Load Balancer Linux Box
LAN/WAN
Real Server 1
Real Server n
Virtual Cluster Server via NAT
27
Internet Cluster
How This Cluster Works
  • (1) Requests: User, via DSU/Router, to the Load
    Balancer Linux Box
  • (2) Scheduling and rewriting packets: Load
    Balancer Linux Box
  • (3) Processing the requests: Real Server n, over
    the LAN/WAN
  • (4) Rewriting replies: Load Balancer Linux Box
  • (5) Replies: back to the User

Virtual Cluster Server via NAT
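
Steps (2) and (4) on this slide, scheduling plus address rewriting in both directions, can be sketched in a few lines. This is an illustrative model only, not the kernel implementation: the `Packet` and `NatLoadBalancer` names, the round-robin scheduling policy, and the example addresses are all assumptions made for the sketch.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple          # (ip, port) of the sender
    dst: tuple          # (ip, port) of the receiver
    payload: str = ""


class NatLoadBalancer:
    """Toy model of a virtual server via NAT (hypothetical names)."""

    def __init__(self, virtual_addr, real_servers):
        self.virtual_addr = virtual_addr
        self._next_server = cycle(real_servers)   # round-robin scheduler
        self._connections = {}                    # client addr -> real server

    def forward_request(self, packet):
        """Step (2): schedule a real server and rewrite the destination."""
        if packet.dst != self.virtual_addr:
            raise ValueError("packet is not addressed to the virtual server")
        server = self._connections.get(packet.src)
        if server is None:
            server = next(self._next_server)
            self._connections[packet.src] = server
        return replace(packet, dst=server)

    def forward_reply(self, packet):
        """Step (4): rewrite the reply source back to the virtual address."""
        if self._connections.get(packet.dst) != packet.src:
            raise ValueError("reply does not match a tracked connection")
        return replace(packet, src=self.virtual_addr)


if __name__ == "__main__":
    lb = NatLoadBalancer(("203.0.113.10", 80),
                         [("10.0.0.1", 80), ("10.0.0.2", 80)])
    request = Packet(("198.51.100.7", 40000), ("203.0.113.10", 80), "GET /")
    inside = lb.forward_request(request)            # (2) now addressed to a real server
    reply = Packet(inside.dst, inside.src, "200 OK")  # (3) real server answers
    print(lb.forward_reply(reply))                  # (4) source is the virtual address
```

Because every reply has to pass back through the load balancer for rewriting, the balancer sees traffic in both directions, which is the main contrast with the tunneling method described next.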
28
Internet Cluster
Virtual Internet Cluster Server via IP Tunneling
IP tunneling (IP encapsulation) is a technique that
encapsulates IP datagrams within IP datagrams, which
allows datagrams destined for one IP address to be
wrapped and redirected to another IP address. IP
encapsulation is now commonly used in extranets,
Mobile-IP, IP multicast, and tunneled hosts or
networks. The load balancer encapsulates the packet
and forwards it to the server. When the server
receives the encapsulated packet, it decapsulates
the packet, processes the request, and finally
returns the result directly to the user. Refer to
the NET-3-HOWTO document.
29
Internet Cluster
(1) requests
DSU/Router
Internet
User
Load Balancer Linux Box
Replies going to the user directly
Virtual IP address is assigned
IP Tunnel
IP Tunnel
LAN/WAN
Real Server 1
Real Server n
Virtual Cluster Server via IP Tunneling
30
Internet Cluster
(1) requests
DSU/Router
Internet
User
(2) encapsulation
Load Balancer Linux Box
Virtual IP address is assigned
LAN/WAN
Real Server 1
Real Server n
(3) de-encapsulation reply to user
Virtual Cluster Server via IP Tunneling
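
The encapsulation and de-encapsulation steps on this slide can be illustrated by building IPv4 headers by hand. The sketch below is an assumption-laden model, not the kernel code path: it uses minimal 20-byte headers without options, the IP-in-IP protocol number 4, and made-up addresses and helper names.

```python
import socket
import struct

IPPROTO_IPIP = 4   # protocol number for IP-in-IP encapsulation
IPPROTO_TCP = 6


def checksum(header: bytes) -> int:
    """Ones' complement sum over 16-bit words, as used in the IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_packet(src: str, dst: str, proto: int, payload: bytes) -> bytes:
    """Build a minimal IPv4 packet (20-byte header, no options)."""
    def header(csum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            0x45, 0, 20 + len(payload),   # version/IHL, TOS, total length
            0, 0,                         # identification, flags/fragment
            64, proto, csum,              # TTL, protocol, header checksum
            socket.inet_aton(src), socket.inet_aton(dst),
        )
    return header(checksum(header(0))) + payload


def encapsulate(packet: bytes, balancer_ip: str, real_server_ip: str) -> bytes:
    """Step (2): wrap the original datagram in a new outer IP header."""
    return ipv4_packet(balancer_ip, real_server_ip, IPPROTO_IPIP, packet)


def decapsulate(outer: bytes) -> bytes:
    """Step (3): strip the outer header and recover the original datagram."""
    if outer[9] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP packet")
    header_len = (outer[0] & 0x0F) * 4
    return outer[header_len:]


def addresses(packet: bytes):
    """Return (source, destination) of the outermost IPv4 header."""
    return socket.inet_ntoa(packet[12:16]), socket.inet_ntoa(packet[16:20])


if __name__ == "__main__":
    original = ipv4_packet("198.51.100.7", "203.0.113.10", IPPROTO_TCP, b"GET /")
    tunneled = encapsulate(original, "10.0.0.254", "10.0.0.1")
    print("outer:", addresses(tunneled))
    print("inner:", addresses(decapsulate(tunneled)))
```

The inner datagram still carries the user's address as source and the virtual IP address as destination, which is why the real server, with the virtual IP address assigned to it, can reply to the user directly.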
31
Storage Cluster
The network is configured with one of the virtual
cluster server techniques. The disk storage is
connected via Fibre Channel, including SAN file
systems.
Internet
DSU/Router
LAN/WAN
Fibre Channel
half-duplex: 100 MBytes/sec, full-duplex:
200 MBytes/sec
FC Switch
Fibre RAID Storage
Linux Storage Cluster with GFS or SAN
32
Storage Cluster
  • SAN(Storage Area Network)
  • Scalability: 125 disks with one controller
  • Ease of management
  • Fast disk I/O speed: 100 MBytes/sec
    (half-duplex), 200 MBytes/sec (full-duplex)
  • Long distance: over 10 km (fiber-optic cable)

Fibre Channel: half-duplex 100 MBytes/sec,
full-duplex 200 MBytes/sec
Fibre Channel Switch
Fibre RAID Storage
Linux Storage Cluster with GFS or SAN
33
Storage Cluster
  • File Systems Supported by Linux
  • ext2/ext3 file systems
  • ISO 9660 ( CD-FS )
  • VFAT / FAT
  • SMB ( CIFS )
  • UFS
  • NTFS
  • UDF ( DVD-FS )
  • NFS / CODA
  • LVM ( Logical Volume Manager )
  • GFS ( Global File System )
  • ReiserFS ( journaling file system ), SGI XFS,
    IBM JFS
  • RIO ( Raw IO )
  • RAMFS
  • ROMFS

34
GFS Storage Cluster
  • Feature Overview of GFS
  • The Global File System (GFS) allows multiple
    Linux machines to share storage devices over a
    network. Each machine sees the network disks as
    local, and GFS itself appears as a local file
    system. Writes to a file by one Linux machine are
    seen by any other machine that later reads that
    file.

35
GFS Cluster Configuration
Normal Configuration
36
GFS Cluster Configuration
Complex Configuration
Cross-bar FC connection
37
GFS Cluster Configuration
NFS Configuration
GFS Configuration
Hybrid Configuration
38
GFS Cluster Performance
39
Enterprise Server Requirements
High Availability Cluster
40
High Availability Cluster
  • Server Downtime
  • Un-Planned Downtime
  • Hardware Fault
  • Software Fault
  • Planned Downtime
  • Hardware exchange
  • Hardware Upgrade
  • O/S upgrade
  • Software upgrade

Cost due to Downtime
41
Comparison of Availability
High Availability Cluster
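
The comparison chart for this slide did not survive in the transcript. As a generic illustration, and not a reconstruction of the slide's data, availability levels are usually compared by converting an uptime percentage into expected downtime per year. The helper names and the sample percentages below are assumptions.

```python
HOURS_PER_YEAR = 365 * 24   # 8760 hours, ignoring leap years


def downtime_hours_per_year(percent_available):
    """Expected downtime in hours per year for a given availability (%)."""
    return HOURS_PER_YEAR * (1 - percent_available / 100)


def availability_percent(downtime_hours):
    """Availability (%) implied by a yearly downtime in hours."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    # Sample levels chosen for illustration only.
    for percent in (99.0, 99.9, 99.99, 99.999):
        hours = downtime_hours_per_year(percent)
        print(f"{percent:7.3f}% available -> {hours * 60:8.1f} minutes down per year")
```

Both planned and unplanned downtime count against the yearly budget, so an upgrade window consumes the same allowance as a hardware fault.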
42
Concept of HA system
High Availability Cluster
Dual Network for Response
Heartbeat
Active or Standby Systems
Dual IO connection for Storage
Shared Storage
43
Lines of heartbeat
High Availability Cluster
  • Lines of heartbeat: serial connection, TCP/IP
    over LAN, shared SCSI

Components of HA
  • Redundant: systems, all connectable lines
  • Shared: disks, filesystem
  • Management software, including a
    heartbeat-checking daemon

[Diagram labels: Dual Network for Response, Heartbeat, Active or Standby Systems, Dual IO connection for Storage, Shared Storage]
Ref www.linux-ha.org
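
The role of a heartbeat-checking daemon on the standby system can be sketched as follows. This is a simplified model under stated assumptions, not the Linux-HA implementation: the class names, the timeout value, and the rule that the standby takes over only when every heartbeat line has gone silent are all choices made for the illustration. Time is passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen on each line (hypothetical names)."""

    def __init__(self, links, timeout, start=0.0):
        self.timeout = timeout
        # Treat every line as freshly seen at start-up.
        self.last_seen = {link: start for link in links}

    def beat(self, link, now):
        """Record a heartbeat received on one line at time `now`."""
        self.last_seen[link] = now

    def alive_links(self, now):
        """Lines that delivered a heartbeat within the timeout window."""
        return [link for link, seen in self.last_seen.items()
                if now - seen <= self.timeout]

    def peer_is_dead(self, now):
        """The peer is declared dead only when all lines are silent."""
        return not self.alive_links(now)


class StandbyNode:
    """Standby system that becomes active when the peer is declared dead."""

    def __init__(self, monitor):
        self.monitor = monitor
        self.active = False

    def check(self, now):
        if not self.active and self.monitor.peer_is_dead(now):
            self.active = True   # take over the service and the shared storage
        return self.active


if __name__ == "__main__":
    monitor = HeartbeatMonitor(["serial", "lan"], timeout=3.0)
    standby = StandbyNode(monitor)
    monitor.beat("serial", 5.0)      # the LAN line has been silent since start
    print(standby.check(6.0))        # serial line still alive, no takeover
    print(standby.check(9.0))        # every line silent for too long, takeover
```

Using more than one heartbeat line, as the slide lists, reduces the chance that a single broken cable is mistaken for a failed peer.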
44
Linux Internet Cluster Products
Concluding Remarks
  • Wyz Cluster
  • DR Cluster
  • Mission Critical Linux
  • Red Hat: piranha, High Availability Server
  • Turbo Linux: Turbo Cluster Server
  • VA Linux: VACM
  • Legato Cluster
  • Veritas
  • etc.

Linux clustering is a starting point from which Linux
can enter the enterprise market. Until now, however,
clustering technology has mainly been a concern of
technical development groups such as research
institutes and universities.
45
Why Linux Cluster?
Concluding Remarks
  • Cost-effective and easy to configure
  • Fast technical development with open source
  • Many references in various fields

Future Needs
  • New network configurability and TCP/IP stack
    performance
  • High availability for enterprise markets
  • Cluster filesystem and disk I/O performance
  • High-performance peripheral drivers
  • Stable management and scheduler
46
Concluding Remarks
Do Not Believe the Myths!
  • Is clustering technology mature enough?
  • Have ease of use and stability been achieved?
  • Is clustering a big market? If so, in which field?
  • Is Linux in the enterprise market? If not, in
    backend systems?
  • Can Linux vendors maintain their advantages?
47
Thank You !!!
48
KESPER Inc. RM 803 DongA Officetel
BongMyeong-Dong YuSeong-Gu Taejeon
305-709 Republic of Korea (South Korea) Tel.
82-42-828-7458 Fax 82-42-828-7455
Sungho Kim, President/CEO shkim_at_kesper.co.kr
or sungho_at_kesper.co.kr