Title: Solaris status and plans
1Solaris status and plans
- HEPIX Autumn 2003
- Ignacio Reguero, Michel Manent, Carlos Ungil
- presented by
- Sebastian Lopienski
2Executive Summary
- Current Status
- Some figures
- SUNINST0 Network Installation Server
- CAE Server Upgrade
- SUNDEV Technology Refresh
- 10 new Sun Fire V210s
- Implementation of EDG WP4 Quattor fabric management on Solaris
- System administration view
- Solaris 9 Certification
- Sun Blade Server 1600 and N1 Management
3Current Status of Solaris Usage at CERN
- Second platform for LHC physics
- Mostly for validation purposes (numerical software)
- Total population of 663 active nodes
- Figures from LanDB network database
- Around 300 on Solaris 8
- Of the rest, about half run Solaris 2.6 and half Solaris 7
- Problem: most of these machines cannot upgrade the OS without a hardware upgrade (disks and memory)
4SUNINST0 Network Installation Server
- Jumpstart server
- Network configurations responsible now fully extracted from LANDB network database
- With a single fetch procedure
- After the router fix, the Sun DHCP server is stable
- At the request of SM18 (LHC Magnet Test), had to demonstrate booting exotic devices (such as data acquisition devices) from it
- However, still working with the CS group to replace the DHCP server with the one run by CS
- A SOAP interface to an Oracle DB will allow us to update
- Similar to the one in place for the Print DNS hierarchy
5CAE Server Upgrade (1)
- CAE - Electronic design cluster
- To serve the electronics design community
- New server V480
- 4 x 900 MHz CPU
- 8 GB SDRAM
- Gigabit Ethernet
- A1000 RAID disk box
- 436 GB of RAID 5 space for users
- Had to coordinate the IDPROM change with the disk move
- Hit technical and sociological problems
- The IDPROM change is needed to keep the old Cadence licenses on the new server
- But Sun is reluctant to do it for new HW models
- In the end, they provided an ad hoc solution that does not support OBP upgrade
- Found out the hard way!
- And an OBP upgrade is needed for A1000 support
6CAE Server Upgrade (2)
- IDPROM problem not solved yet
- Considering using another machine
- Cannot be too new (works on V220 or 280R)
- Also lots of A1000 RAID box problems
- RAID manager software has to be coordinated with the firmware level in the controller and OBP
- So lots of upgrades are required before connecting old A1000s to the new server
- After the first installation, additional A1000s are not seen unless entries are added by hand to /kernel/drv/sd.conf with the relevant SCSI ID
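The hand-added sd.conf entries mentioned above follow the usual Solaris driver configuration syntax, one line per SCSI target and LUN. A minimal sketch of generating such lines follows; the target and LUN numbers are hypothetical, since the slides do not give the SCSI IDs actually used at CERN.

```python
def sd_conf_entries(targets, luns=(0,)):
    """Return sd.conf lines for the given SCSI target IDs and LUNs."""
    lines = []
    for target in targets:
        for lun in luns:
            lines.append(f'name="sd" class="scsi" target={target} lun={lun};')
    return lines


# Hypothetical example: two extra RAID boxes on targets 4 and 5, two LUNs each.
for line in sd_conf_entries([4, 5], luns=(0, 1)):
    print(line)
```

After editing the file, a reconfiguration boot is typically needed before the new devices are seen.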
7CAE Server Upgrade
8SUNDEV Technology Refresh
- A cluster for physics development
- 10 Sun Fire V210
- State-of-the-art SPARC machines
- Thin rack-mountable servers
- 1 unit in 19-inch racks
- They all fit in a single CERN rack together with the Gigabit switch and the Sun blade server
- Dual 1 GHz UltraSPARC-IIIi
- 2 GB memory
- 2 x 36 GB disk drives
- 4 x Gigabit Ethernet on the motherboard
- They are being installed with Solaris 8.7.3 (latest required for this hardware), later Solaris 9
- Performance improvement at least 120 over the current SUNDEV machines
9SUNDEV Technology Refresh
10Implementation of EDG WP4 Quattor fabric management on Solaris (1)
- We plan to use Quattor to manage all Solaris systems
- From Solaris 9 onwards
- What does it mean for us?
- Central Configuration DataBase (CDB)
- Configuration information
- Software to be installed
- Both applications and system
- A cache manager provided for the client accessing the DB
- To avoid dependency on the DB server or on the network
- The configuration database is linked to the network installation server
- The Jumpstart profile is generated from the database
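To illustrate what generating a Jumpstart profile from the configuration database could look like, here is a minimal sketch. The record layout, the function name and the example values are assumptions made for illustration; the slides do not describe the actual CDB schema or the generator used on SUNINST0. Only standard Jumpstart profile keywords are emitted.

```python
def render_jumpstart_profile(node):
    """Render a Jumpstart profile from a node's configuration record."""
    lines = [
        "install_type initial_install",  # install_type must come first
        f"system_type {node['system_type']}",
        f"cluster {node['cluster']}",
        "partitioning explicit",
    ]
    for fs in node["filesystems"]:
        lines.append(f"filesys {fs['slice']} {fs['size']} {fs['mount']}")
    for pkg in node.get("packages", []):
        lines.append(f"package {pkg} add")
    return "\n".join(lines) + "\n"


# Hypothetical record standing in for what would be fetched from the CDB.
EXAMPLE_NODE = {
    "system_type": "standalone",
    "cluster": "SUNWCreq",
    "filesystems": [
        {"slice": "c0t0d0s0", "size": "free", "mount": "/"},
        {"slice": "c0t0d0s1", "size": "2048", "mount": "swap"},
    ],
    "packages": ["SUNWbash"],
}

print(render_jumpstart_profile(EXAMPLE_NODE))
```

Keeping the profile as a pure function of the database record means the installation server never needs hand-edited per-host profiles.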
11Implementation of EDG WP4 Quattor fabric management on Solaris (2)
- Node Configuration Manager (NCM)
- A la SUE
- For configuration components
- Simplified SUE features
- NCM components are simplified SUE features
- They have a single action: configure
- They access the Configuration DB through the cache manager
- SPMA software distributor (package level)
- Replaces ASIS software distribution (file level)
- For Linux it uses RPMs; for Solaris it is implemented with Solaris PKG
- Allows installing packages from various SW repositories
- Several protocols supported: HTTP, file system (AFS), FTP, etc.
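Package-level distribution, as opposed to file-level distribution, can be pictured as reconciling a desired package list with what is installed on the host. The sketch below shows that idea only; it is not SPMA's actual algorithm, and the function name, data layout and package versions are hypothetical.

```python
def plan_package_changes(desired, installed):
    """Compare desired and installed {name: version} maps.

    Returns (to_install, to_remove, to_replace) as sorted lists of names.
    """
    to_install = sorted(n for n in desired if n not in installed)
    to_remove = sorted(n for n in installed if n not in desired)
    to_replace = sorted(
        n for n in desired if n in installed and installed[n] != desired[n]
    )
    return to_install, to_remove, to_replace


# Hypothetical package names and versions, for illustration only.
desired = {"SUNWbash": "2.05", "CERNmozilla": "1.4"}
installed = {"SUNWbash": "2.03", "CERNnetscape": "4.7"}
print(plan_package_changes(desired, installed))
```

In this sketch anything installed but not listed is scheduled for removal, which is only one possible policy for software that is not under central management.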
12Implementation of EDG WP4 Quattor fabric management on Solaris (3)
- Still working on
- Creation of Solaris NCM Components from existing SUE features (Juan Pelegrin)
- DB Access Control
- For delegation
- Behavior with unmanaged software
[Diagram labels: CDB (pan, xml), Host, NCM, SPMA (target.cf), PKG, REPOSITORY]
13Solaris 9 Certification
- Validating Solaris 9 running all SW on the new system
- Timescale: end of 2003
- Refsol9 reference machine now available
- No big changes in terms of Solaris, but new features
- Web Start Flash Archives: system images for installation
- Nice for farms (but only for the same HW)
- Resource pools
- Guaranteed resources for an application on large shared systems
- Gnome 2.0 is the standard desktop environment
- We deliver Mozilla 1.4 (instead of Netscape, recommended by Sun)
- Sun ONE Studio 8 as default compiler
- Replacement of ASIS and SUE with Quattor
- More Open Source software packaged with the system
- Perl, Bash,
- Some of these products are supported on the same basis as Sun native ones
- Probably an occasion to reduce the number of products maintained by us
14Solaris Reference machines Installation Server
15Sun Blade Server 1600
- Sun Blade server 1600
- Packaged farm
- Fits in 3 units of a 19-inch rack
- SSC Controller with Gigabit switch that manages up to 16 CPUs
- Several Gigabit Ethernet external connections
- VLAN with 16 Gigabit Ethernet interfaces
- Protection against attacks through packet filter configuration
- Console through serial port for each blade
- Received 12 x 650 MHz UltraSPARC-IIe
- Waiting for 4 Intel-compatible CPUs
- AMD Athlon XP-M 1.2 GHz
- Other specialized blades supported at hardware level
- SSL Encryptor
- Load Balancer
16Sun Blade Server 1600 system chassis
[Diagram labels: External Switch; SSC0 (active) and SSC1 (standby); Switch Fabric (x2); slots 0-15; blades 0-15 with interfaces ce0 and ce1 (137.138.x.x)]
17Sun Blade Server 1600
18Sun Blade Server 1600 Installation and Configuration
- Fully automated network installation (DHCP) using Jumpstart from SUNINST0
- Initial configuration, installation of application software
- One private IP address for each System Controller
- One IP address for each blade
- Ongoing test of Web Start Flash Archives
- Intended to quickly replicate one blade's operating environment and application software on other blades
- Sun VTS (Validation Test Suite) online diagnostics tool
- Verifies configuration and functionality of hardware controllers, devices and platforms
19Sun N1 System Management Framework
- Sun N1 Provisioning Server 3.0 Blade Edition being tested
- Automates configuration and deployment of different kinds of servers
- Including specialized servers
- Assignment may vary according to a schedule or other input: dynamic management of clusters
- To compare N1 with Quattor functionality
- Question: could N1 manage heterogeneous farms outside the Blade server scope?
20Questions?
- Unix Infrastructure section
- http://cern.ch/product-support/UI
- Ignacio.Reguero_at_cern.ch
- Michel.Manent_at_cern.ch
- Carlos.Ungil_at_cern.ch