Title: TRIUMF SITE REPORT Corrie Kost
1. TRIUMF SITE REPORT, Corrie Kost
TRIUMF Site Report for HEPiX, Edinburgh, 24-28
May 2004 Corrie Kost
2. LINUX at TRIUMF
TRIUMF urges proper support for Scientific Linux
3. WAN
- Replacement MRV units (10 Gb/sec capable)
- Third Passport Router
4. WestGrid UBC/TRIUMF Site
- 504 dual 3.06 GHz Xeon IBM blades
- Red Hat Linux 9 to allow GPFS (NFS nixed)
- OPENPBS Scheduling with (MOAB) Maui
- 10 TB disk storage
- 70 TB tape storage
- Direct Gigabit connection between sites
- Possible 10 Gb/sec in future
- February 2004 opened for general use.
5. WestGrid UBC/TRIUMF Site (www.westgrid.ca)
- From a cold start:
  - GPFS servers load in 5-10 min
  - All nodes up in 60-90 min
- Bring up a single node: 10 min
- Rebuild (disk) for a node: 2 hrs
- Single-node failure rate: 1/day
- Node disk failures dominate
- Utilization about 87%
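To put the failure rate in perspective, the following sketch converts the cluster-wide figure into a per-node one. It uses the 504 blades and the roughly one node failure per day quoted on the slides, and it assumes that failures are spread evenly across nodes, which the slides do not state.

```python
# Figures quoted on the slides: 504 blades, about one node failure per day.
NODES = 504
FAILURES_PER_DAY = 1.0


def per_node_mtbf_days(nodes, failures_per_day):
    """Mean days between failures for one node, assuming evenly spread failures."""
    return nodes / failures_per_day


def expected_failures(failures_per_day, days):
    """Expected number of node failures across the cluster in a period."""
    return failures_per_day * days


if __name__ == "__main__":
    mtbf = per_node_mtbf_days(NODES, FAILURES_PER_DAY)
    print(f"per-node mean time between failures: {mtbf:.0f} days")
    print(f"expected failures in 30 days: {expected_failures(FAILURES_PER_DAY, 30):.0f}")
```

Under that assumption an individual blade fails about once every 504 days, even though the site sees a failure every day.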
6. Network / Servers
7. Servers Upgrade Program
8. LCG Grid Participant
9. High I/O Testbed
- Hardware nice, but:
- 40-pin IDE cable is a problem with the 2.6 kernel
- Mounting bracket screws can short audio and halt boot
10. STORM1, STORM2
- Dual 3.2 GHz Xeons
- 4 GB memory
- 4 3ware 8506-4LP controllers
- 16 SATA150 120 GB drives
- 20 GB ST92011A drive
- Intel 10GbE PXLA8590LR
11. High Speed I/O - Part 1
- Used ext2 for highest speeds (no journaling, but 2 TB file size limit) - RH 9
- One to four disks (writes), software RAID 0 on one 3-Ware controller: 50.6, 98, 124, 141 MB/sec respectively
- Four disks split over two 3-Ware controllers: 162 MB/sec writes
- Four disks on 1 hardware RAID 0 and software RAID 0: 138 MB/sec writes
- Adding 4 more disks on second 3-Ware: 250 MB/sec (slots 2,5), 247 MB/sec (slots 2,3)
- Adding 4 more disks on third 3-Ware: 273 MB/sec (slots 2,3,5), 265 MB/sec (slots 2,3,4)
- Adding 4 more disks on fourth 3-Ware: 283 MB/sec (slots 2,3,4,5)
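The single-controller write figures above can be read as a scaling curve. The sketch below takes the four measured values from the slide and compares each with ideal linear scaling from the one-disk result; the `efficiency` helper is an illustrative name, not something from the original tests.

```python
# Measured software RAID 0 write speeds on one 3-Ware controller (MB/sec),
# keyed by number of disks, as quoted on the slide.
ONE_CONTROLLER = {1: 50.6, 2: 98.0, 3: 124.0, 4: 141.0}
SINGLE_DISK = ONE_CONTROLLER[1]


def efficiency(disks, measured, single=SINGLE_DISK):
    """Fraction of ideal linear scaling achieved with `disks` disks."""
    return measured / (disks * single)


if __name__ == "__main__":
    for disks, mb_per_sec in ONE_CONTROLLER.items():
        ideal = disks * SINGLE_DISK
        print(f"{disks} disk(s): {mb_per_sec:6.1f} MB/sec measured, "
              f"{ideal:6.1f} ideal, {efficiency(disks, mb_per_sec):.0%} of linear")
```

With four disks on one controller the measured rate is roughly 70% of linear, which is consistent with the controller, rather than the disks, being the limit.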
12. High Speed I/O - Part 2
- Using 4 3-ware in hardware RAID 0 mode, software raided by Linux
- dd if=/dev/zero of=/raid/8GB bs=81920 count=104857
- Fedora1 non-smp 2.4.22-1.2188.nptl HT ext2 -T news: write 370 MB/sec
- Fedora1 non-smp 2.4.22-1.2188.nptl HT reiserfs: write 227 MB/sec
- Loaded e2fs module 1.35-7.1 to fix largefile and largefile4 creation with mkfs -T largefile /dev/md0
- Fedora1 non-smp 2.4.22-1.2188.nptl HT largefile ext2: write 349 MB/sec
- Fedora1 non-smp 2.4.22-1.2188.nptl noHT largefile ext2: write 300 MB/sec
- Fedora1 non-smp 2.6.61 HT largefile ext2: write 375 MB/sec
- Replaced 40 with 80 pin IDE cable to main disk; allowed SMP to boot
- Fedora1 SMP 2.6.61 noHT largefile ext2: write 309 MB/sec
- echo 262144 > /proc/sys/net/core/rmem_default
- echo 8388608 > /proc/sys/net/core/rmem_max
- echo 262144 > /proc/sys/net/core/wmem_default
- echo 8388608 > /proc/sys/net/core/wmem_max
- echo 300000 > /proc/sys/net/core/netdev_max_backlog
- echo 8388608 > /proc/sys/net/core/optmem_max
- sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
- sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
- sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
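The network settings above mix writes to /proc/sys with sysctl -w calls; both address the same kernel parameters, since a /proc/sys path maps to a sysctl key by replacing slashes with dots. As a minimal sketch, the values from the slide can be rendered as sysctl.conf-style lines. Keeping them in a configuration file is an assumption here, not something the slides describe, and the helper names are illustrative.

```python
# Values as given on the slide; paths under /proc/sys map to dotted sysctl keys.
PROC_SETTINGS = {
    "/proc/sys/net/core/rmem_default": "262144",
    "/proc/sys/net/core/rmem_max": "8388608",
    "/proc/sys/net/core/wmem_default": "262144",
    "/proc/sys/net/core/wmem_max": "8388608",
    "/proc/sys/net/core/netdev_max_backlog": "300000",
    "/proc/sys/net/core/optmem_max": "8388608",
}
SYSCTL_SETTINGS = {
    "net.ipv4.tcp_rmem": "10000000 10000000 10000000",
    "net.ipv4.tcp_wmem": "10000000 10000000 10000000",
    "net.ipv4.tcp_mem": "10000000 10000000 10000000",
}


def proc_path_to_key(path):
    """Convert /proc/sys/net/core/rmem_max to net.core.rmem_max."""
    prefix = "/proc/sys/"
    if not path.startswith(prefix):
        raise ValueError(f"not a /proc/sys path: {path}")
    return path[len(prefix):].replace("/", ".")


def render_sysctl_conf():
    """Return all settings as 'key = value' lines (nothing is written to disk)."""
    lines = [f"{proc_path_to_key(p)} = {v}" for p, v in PROC_SETTINGS.items()]
    lines += [f"{k} = {v}" for k, v in SYSCTL_SETTINGS.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_sysctl_conf())
```

The script only prints the lines; applying them still requires root and is left to the administrator.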
13. High Speed I/O - Part 3
- [root@storm2 root]# time ttcp -t -b 6000000 -l 102400 storm1-10g < /raid/8gb-a
- ttcp-t: buflen=102400, nbuf=2048, align=16384/0, port=5001, sockbufsize=6000000 tcp -> storm1-10g
- ttcp-t: socket
- ttcp-t: sndbuf
- ttcp-t: connect
- ttcp-t: 8589934592 bytes in 42.80 real seconds = 195978.14 KB/sec
- ttcp-t: 83887 I/O calls, msec/call = 0.52, calls/sec = 1959.80
- ttcp-t: 0.0user 22.2sys 0:42real 52% 0i+0d 0maxrss 0+25pf 17854622csw
- Ttcp disk to disk: 191 Mbytes/sec
- Three walls:
  - CPU: 100% seen
  - 3Ware I/O controller (140 MB/sec instead of 4*50, 375 MB/sec instead of 4*140)
  - 10Gbit Intel card using ixgb-1.0.65 driver (2.3 Gb/sec)
- Ongoing:
  - Tuning process affinity (using /usr/bin/run)
  - Interrupt affinity (IRQ of 3-ware and 10GbE set to CPUs, e.g. /proc/irq/24/smp_affinity)
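The smp_affinity file mentioned above takes a hexadecimal bitmask in which bit n selects CPU n. The sketch below builds such masks and prints the corresponding commands without executing them. IRQ 24 comes from the slide; IRQ 25 and the CPU assignments are hypothetical and only illustrate the idea of pinning the storage and network interrupts to different CPUs.

```python
def cpu_mask(cpus):
    """Return the hex bitmask for the given CPU indices (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


# Hypothetical plan: IRQ 24 (from the slide) on CPU 0, an assumed IRQ 25 on CPU 1.
PLAN = {24: [0], 25: [1]}

if __name__ == "__main__":
    for irq, cpus in PLAN.items():
        # Printed only; writing to /proc/irq requires root on the target host.
        print(f"echo {cpu_mask(cpus)} > /proc/irq/{irq}/smp_affinity")
```

The real IRQ numbers for the 3-ware and 10GbE cards would have to be read from /proc/interrupts on the machine in question.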
14. Misc. Developments
- Build a cheap hot-swap RAID 5 system with Serial ATA drives
- 1 Promise FastTrak S150 SX4 controller: $233 Can
- 3 Promise SuperSwap 1100 drive enclosures for SATA/150: $112 Can
- 3 Maxtor 120 GB S-ATA drives (6Y120M0): $145 Can
- Test on cheap 1.7 GHz Celeron, Intel D845GVSLR, 256 MB memory
- Red Hat 9.0 base (won't work on updated kernels)
- Read large file: 46.8 Mbytes/sec
- Write large file: 46.5 Mbytes/sec
- Able to pull a disk while active; auto rebuilds in 75 min when replaced
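Two numbers follow from the figures above: the usable capacity of a three-drive RAID 5 set and the average rate implied by a 75-minute rebuild. The sketch assumes decimal gigabytes (1 GB = 1000 MB) and that a rebuild rewrites one whole 120 GB member drive; neither assumption is stated on the slide.

```python
def raid5_usable_gb(drives, size_gb):
    """Usable capacity of a RAID 5 set: one drive's worth of space holds parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * size_gb


def rebuild_rate_mb_per_sec(size_gb, minutes):
    """Average rate needed to rewrite one member drive in the given time."""
    return size_gb * 1000 / (minutes * 60)


if __name__ == "__main__":
    print(f"usable capacity: {raid5_usable_gb(3, 120)} GB")
    print(f"implied rebuild rate: {rebuild_rate_mb_per_sec(120, 75):.1f} MB/sec")
```

Under these assumptions the rebuild runs at a bit over half of the measured large-file read and write speeds.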
15. Misc. Developments
- Remote power on/off using networked power bars (www.servertech.com)
16. Mail at TRIUMF
17. IMP Webmail
18. Squirrel Webmail