Farm Management - PowerPoint PPT Presentation

Title: Farm Management


https://bbrweb.pd.infn.it:5212/farm/
D. Andreotti (1), A. Crescente (2), A. Dorigo (2),
F. Galeazzi (2), M. Marzolla (3), M. Morandin (2),
F. Safai Tehrani (4), R. Stroili (2), G. Tiozzo (2),
G. Vedovato (2) and the BaBar Computing Group
(1) I.N.F.N. of Ferrara, Italy
(2) Univ. and I.N.F.N. of Padova, Italy
(3) Univ. Ca' Foscari, Venezia and I.N.F.N. of Padova, Italy
(4) I.N.F.N. of Roma, Italy
A new dedicated facility for (re)processing of
BaBar raw data, supported by INFN, was installed
in Padova (Italy) in 2002 as part of the
distributed Tier A system at the disposal of the
experiment. The facility consists of four
independent farms, each capable of processing 2
million events (corresponding to 160 pb⁻¹ of raw
data) per day. Reconstructed data are stored in
an Objectivity federation, checked and finally
transferred to SLAC. The facility exploits
commodity CPU and disk storage while preserving
good reliability, high performance and
well-organized system management. The center,
which now comprises approximately 200 dual-CPU
Pentium III machines and 30 TB of disk space, has
been in operation since October 2002, and
experience so far has been very satisfactory.
  • First BaBar data processing farm fully based on:
    • Linux
    • cheap hardware
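
As a rough check on the capacity figures quoted in the abstract, the per-farm numbers can be turned into rates and facility-wide totals. The sketch below uses only the abstract's values (4 farms, 2 million events and 160 pb⁻¹ per farm per day); the variable names are illustrative, and the totals assume all four farms run at full capacity around the clock.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the abstract.
farms = 4
events_per_farm_per_day = 2_000_000
lumi_per_farm_per_day_pb = 160.0  # integrated luminosity in pb^-1

# Derived quantities (assuming every farm runs at nominal capacity all day).
events_per_second_per_farm = events_per_farm_per_day / SECONDS_PER_DAY
total_events_per_day = farms * events_per_farm_per_day
total_lumi_per_day_pb = farms * lumi_per_farm_per_day_pb
events_per_inverse_pb = events_per_farm_per_day / lumi_per_farm_per_day_pb

print(f"per farm: {events_per_second_per_farm:.1f} events/s")
print(f"facility: {total_events_per_day} events/day, "
      f"{total_lumi_per_day_pb:.0f} pb^-1/day")
print(f"events per pb^-1: {events_per_inverse_pb:.0f}")
```

Each farm therefore has to sustain roughly 23 events per second on average to reach its nominal daily capacity.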

Farm Performance
System is continuously stressed!
  • Existing hardware
    • All machines: 2 x 1.26 GHz CPU, 1 GB RAM
    • 140 clients, 40 GB local IDE disk (software RAID)
    • 20 servers, same configuration as clients,
      Gigabit Ethernet
    • 30 storage servers, 1.28 TB IDE disk with 3ware
      RAID controller, Gigabit Ethernet
    • 5 PR servers, up to 0.35 TB SCSI disk (10k RPM)
      with ServeRAID SCSI controller, Gigabit Ethernet
    • one tape library for 700 LTO tapes (70 TB
      uncompressed)
  • New acquisitions
    • new tape library for 700 LTO2 tapes (140 TB
      uncompressed)
    • 103 clients, 2 x Xeon 2.4 GHz, 2 GB RAM
    • 14 storage servers, 2 x Xeon 2.4 GHz, 2 GB RAM,
      1.4 TB IDE disk
    • 10 PR servers, 2 x Xeon 2.4 GHz, 2 GB RAM

Extensive work done to optimize resources and to
reduce bottlenecks (e.g., minimizing usage of NFS)
[Two plots; x-axis: time of day]
Farm Monitoring
  • Machines are organized into 4 identical farms,
    60 CPUs each:
    • 160 pb⁻¹/day/farm
    • 2,000,000 events/day/farm (output)
    • 160 GB/day/farm input (raw) data
    • 330 GB/week/farm output (Objy) data
  • Based on:
    • SNMP, to be compatible with the widest variety
      of hardware, using asynchronous non-blocking
      SNMPv2 bulk Get requests
    • RRDtool library, for graphs
    • PerfMC (presented at CHEP03), a high-performance
      monitoring program developed for this farm:
      • scalable
      • efficient
      • requires low resources
      • easily configurable using XML
      • operates in background (no GUI)
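
The idea behind asynchronous, non-blocking polling can be illustrated with a small sketch: all requests are issued at once and answers are collected as they arrive, so one slow or dead node does not hold up the rest. This is not PerfMC's implementation. The function `fake_bulk_get`, the host names and the returned values are hypothetical stand-ins (a simulated delay replaces the real SNMPv2 bulk Get request), and Python's `asyncio` is used only to show the pattern.

```python
import asyncio
import random


async def fake_bulk_get(host: str) -> dict:
    """Hypothetical stand-in for one SNMPv2 bulk Get request.

    It only sleeps to simulate network latency and returns made-up data.
    """
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return {"host": host, "load": round(random.uniform(0.0, 2.0), 2)}


async def poll_all(hosts, timeout=1.0):
    """Poll every host concurrently; a slow host yields an error entry."""

    async def poll_one(host):
        try:
            return await asyncio.wait_for(fake_bulk_get(host), timeout)
        except asyncio.TimeoutError:
            return {"host": host, "error": "timeout"}

    # gather() runs all requests concurrently and keeps the input order.
    return await asyncio.gather(*(poll_one(h) for h in hosts))


# Hypothetical node names, roughly matching the size of the farm.
hosts = [f"node{i:03d}" for i in range(200)]
results = asyncio.run(poll_all(hosts))
print(len(results), "hosts polled")
```

Because the requests overlap, the total polling time is close to that of the slowest single reply rather than the sum of all replies.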

Farm Management
  • Using IBM's xCAT (eXtreme Cluster Administration
    Toolkit), allowing:
    • remote power control (*)
    • remote BIOS console (*)
    • remote OS console
    • remote software reset
    • parallel remote shell
    • network installation
    • ...
    (*) on IBM machines only
  • Monitored quantities:
    • CPU
    • Disk I/O
    • Network I/O
    • Temperatures
  • Total disk needed for whole farm: 5 GB.
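
Quantities such as network I/O are commonly exposed over SNMP as cumulative counters rather than rates, so a monitoring tool has to difference two successive readings and allow for the counter wrapping around (RRDtool's COUNTER data source type does this). The sketch below shows that calculation, assuming 32-bit counters by default; the function name and the sample readings are illustrative and are not taken from the farm's monitoring code.

```python
def counter_rate(prev: int, curr: int, interval_s: float, bits: int = 32) -> float:
    """Average rate between two readings of a cumulative counter.

    A negative difference is treated as a single wrap of a `bits`-wide
    counter. A counter reset (e.g. after a reboot) cannot be told apart
    from a wrap by this calculation alone.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr - prev
    if delta < 0:
        delta += 1 << bits  # counter wrapped past its maximum value
    return delta / interval_s


# Plain case: 3000 bytes in 30 s.
print(counter_rate(1000, 4000, 30))        # 100.0
# Wrapped 32-bit counter: 296 bytes before the wrap plus 704 after it.
print(counter_rate(4294967000, 704, 10))   # 100.0
```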

[Screenshot of parallel installation of >100 clients]
MySQL: widely used for farm monitoring, management
and production; 12 databases, 3.5 GB total.
First boot: machines must support PXE.
SysAlarm: home-made Perl tool to parse system
logfiles and save errors in a MySQL database.
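
The kind of processing SysAlarm performs can be sketched as follows. This is not the SysAlarm code: it is written in Python rather than Perl, it uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs without a database server, and the table layout, the error keywords and the sample log lines are all assumptions made for illustration.

```python
import re
import sqlite3

# Classic syslog layout: "Mon DD HH:MM:SS host process[pid]: message".
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<proc>[^:\[]+)(?:\[\d+\])?:\s(?P<msg>.*)$"
)
# Assumed keywords that mark a line as an error.
ERROR_RE = re.compile(r"error|fail|panic", re.IGNORECASE)


def store_errors(lines, conn):
    """Parse syslog lines and insert the ones that look like errors."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS errors (ts TEXT, host TEXT, proc TEXT, msg TEXT)"
    )
    rows = []
    for line in lines:
        m = SYSLOG_RE.match(line)
        if m and ERROR_RE.search(m["msg"]):
            rows.append((m["ts"], m["host"], m["proc"], m["msg"]))
    conn.executemany("INSERT INTO errors VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Made-up sample lines, for illustration only.
sample = [
    "Mar  3 10:15:02 node017 kernel: hda: dma_intr: error=0x40",
    "Mar  3 10:15:05 node017 sshd[2211]: Accepted password for babar",
    "Mar  3 10:16:40 node042 nfsd[310]: mount request failed",
]
conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL server
stored = store_errors(sample, conn)
print(stored, "error lines stored")
```

Keeping the errors in a database rather than in flat files makes it straightforward to query them per host or per time window across the whole farm.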
Software installation: the Kickstart installation
method is preferred because it is easier to
configure according to machine type. Cloning (hard
disk copy) or imaging (partition copy) methods are
also possible. Second-level repositories can be used.
  • Problems
    • vendor driver availability and support for
      different Linux releases
    • had to recompile for large file support
    • NFS not optimal on Linux, particularly under
      heavy load

Network configuration: all machines are on a
private network. A few front-end machines have two
interfaces. Public machines resolve private names
using a NIS server.
Log server: used to centralize system logs on
one machine.