Title: Oxford University Particle Physics Site Report
1. Oxford University Particle Physics Site Report
- Pete Gronbech
- Systems Manager
3. Particle Physics Strategy: The Server / Desktop Divide
- Approx. 200 Windows 2000 desktop PCs, with Exceed used to access the central Linux systems.
4. Central Physics Computing Services
- E-mail hubs
  - In the last year 7.3M messages were relayed, 73 were rejected and 5 were viruses.
  - Anti-virus and anti-spam measures are increasingly important on the e-mail hubs. Some spam inevitably leaks through, and clients need to deal with this in a more intelligent way.
- Windows Terminal Servers
  - Use is still increasing: 250 users in the last three months, out of 750 staff/students. Now Win2k and 2003.
  - Introduced an 8-CPU server (TermservMP). A much more powerful system, but still awaiting updated versions of some applications which will run properly on the OS.
- Web / Database
  - New web server (Windows 2003) in service.
  - New web applications for lecture lists, computer inventory, admissions and finals.
- Exchange Servers
  - Running two new servers using Exchange 2003 on Windows Server 2003. Much better web interface, support for mobile devices (OMA) and for tunnelling through firewalls.
- Desktops
  - Windows XP Pro is the default OS for new desktops and laptops.
5. Linux
- Central Unix systems are Linux based.
- Red Hat Linux 7.3 is the standard.
- Treat Linux as just another Unix, and hence a server OS to be managed centrally.
- Wish to avoid badly managed desktop PCs running Linux.
- Linux based file server (April 2002).
- General purpose Linux server installed August 2002.
- Batch farm installed.
6. CDF and General Purpose Systems
[Diagram: overview of the Linux systems on a 1Gb/s network. General purpose systems: pplx1, pplx2, pplx3, pplxgen and the file server pplxfs1 (RH7.3), plus the CDF machine morpheus (Fermi 7.3.1). Minos DAQ: ppminos1, ppminos2. Cresst DAQ: ppcresst1, ppcresst2. Atlas DAQ: ppatlas1, atlassbc. Grid development testbed: grid, tbwn01, pptb01, pptb02, tblcfg, se, ce. All other nodes run RH7.3.]
7. The Linux File Server
pplxfs1: 8 x 146GB SCSI disks, dual 1GHz PIII, 1GB RAM.
8. New Eonstor IDE RAID array added in April 04. 16 x 250GB disks gives approx. 4TB for around 6k. This is our second foray into IDE storage. So far so good.
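As a quick sanity check, the quoted figures work out as follows (a minimal sketch; the cost figure is taken from the slide as "6k", its currency symbol apparently lost in transcription):

```python
# Back-of-the-envelope check of the Eonstor array figures quoted above:
# 16 x 250GB disks for roughly 4TB at around 6k.
DISKS = 16
DISK_GB = 250
COST = 6000  # "around 6k" from the slide; currency symbol lost, units assumed

raw_tb = DISKS * DISK_GB / 1000          # raw capacity in TB
cost_per_gb = COST / (DISKS * DISK_GB)   # cost per raw GB, same units as COST

print(f"raw capacity: {raw_tb:.1f} TB")       # 4.0 TB
print(f"cost per raw GB: {cost_per_gb:.2f}")  # 1.50
```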
9. General Purpose Linux Server pplxgen
pplxgen is a dual 2.2GHz Pentium 4 Xeon based system with 2GB RAM, running Red Hat 7.3. It was brought on line at the end of August 2002. It provides interactive login facilities for code development and test jobs; long jobs should be sent to the batch queues. Memory is to be upgraded to 4GB next week.
10. The PP batch farm, running Red Hat 7.3 with OpenPBS, can be seen below pplxgen. This service became fully operational in Feb 2003. An additional 4 worker nodes were installed in October 2003; these are 1U servers and are mounted at the top of the rack. Miscellaneous other nodes bring the total to 21 CPUs available to PBS.
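Long jobs are directed to the OpenPBS queues rather than run interactively on pplxgen; a submission script for such a farm typically looks like the sketch below. The queue name "pplx" and the resource limits are hypothetical examples, not the farm's actual configuration.

```python
# Generate a minimal OpenPBS job script of the kind users would submit
# with `qsub` instead of running long jobs on the interactive server.
# Queue name and limits below are illustrative assumptions.
job_script = """#!/bin/sh
#PBS -N analysis-job
#PBS -q pplx
#PBS -l nodes=1:ppn=1,walltime=12:00:00
cd $PBS_O_WORKDIR
./run_analysis
"""

print(job_script)
# Submit with `qsub job.pbs`; monitor the queue with `qstat`.
```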
11. http://www-pnp.physics.ox.ac.uk/ganglia-webfrontend-2.5.4/
12. CDF Linux Systems
Morpheus is an IBM x370 8-way SMP system: 700MHz Xeons with 8GB RAM and 1TB of Fibre Channel disks. Installed August 2001; purchased as part of a JIF grant for the CDF group. It runs Fermi Red Hat 7.3.1 and uses CDF software, developed at Fermilab and Oxford, to process data from the CDF experiment.
13. Second round of CDF JIF tender: Dell cluster - MATRIX
10 dual 2.4GHz P4 Xeon servers running Fermi Linux 7.3.1 and SCALI cluster software. Installed December 2002.
Approx. 7.5TB of SCSI RAID 5 disks are attached to the master node; each shelf holds 14 x 146GB disks. These are shared via NFS with the worker nodes. OpenPBS batch queuing software is used.
14. Plenty of space in the second rack for expansion of the cluster. An additional disk shelf with 14 x 146GB disks, plus an extra node, was installed in Autumn 2003.
15. Oxford Tier 2 centre for LHC
Two racks, each containing 20 Dell dual 2.8GHz Xeons with SCSI system disks, and a 1.6TB SCSI disk array in each rack. Systems will be loaded with LCG2 software. The SCSI disks and Broadcom Gigabit Ethernet caused some problems with installation; slow progress is being made.
16. Problems of Space, Power and Cooling
The second rack is currently temporarily located in the theoretical physics computer room. A proposal for a new purpose-built computer room on Level 1 (underground) is in progress: false floor, large air conditioning units and power for approx. 20-30 racks to be provided. Air cooling can supply at most 1200W/sq m, yet a rack full of 1U servers can create 10kW of heat. Water cooling??
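The cooling arithmetic behind the water-cooling question can be made explicit: dividing a full rack's heat load by the available air cooling per square metre gives the cooled floor area each rack effectively needs.

```python
# Sanity check on the cooling figures above: a rack of 1U servers can
# dissipate ~10kW, while air cooling supplies at most 1200W per square
# metre, so each full rack needs several square metres of floor to itself.
RACK_HEAT_W = 10_000       # heat load of a rack full of 1U servers
COOLING_W_PER_M2 = 1200    # maximum air cooling per square metre

floor_area_per_rack = RACK_HEAT_W / COOLING_W_PER_M2
print(f"{floor_area_per_rack:.1f} sq m of cooled floor per full rack")  # 8.3
```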
17. OLD grid development systems: EDG testbed setup, currently running 2.1.13.
18. Tape backup is provided by a Qualstar TLS4480 tape robot with 80 slots and dual Sony AIT-3 drives; each tape can hold 100GB of data. Installed Jan 2002. Netvault 7.1 software from BakBone is used, running on morpheus, for backup of both CDF and particle physics systems. Main user disks are backed up every weekday night; data disks are not generally backed up, BUT weekly backups to the OUCS HFS service provide some security.
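The robot's raw capacity follows directly from the figures above (native tape capacity; compression would raise it, but 100GB per tape is the figure quoted):

```python
# Capacity check for the Qualstar TLS4480 robot described above:
# 80 slots of 100GB AIT-3 tapes.
SLOTS = 80
TAPE_GB = 100  # native capacity per tape, as quoted on the slide

total_tb = SLOTS * TAPE_GB / 1000
print(f"total robot capacity: {total_tb:.0f} TB")  # 8 TB
```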
19. Network Access
[Diagram: the Campus Backbone Router connects to SuperJanet 4 at 2.4Gb/s. Backbone Edge Routers hang off the campus backbone at 1Gb/s, feeding departments at 100Mb/s. Physics traffic passes through the OUCS firewall and the Physics Firewall to the Physics Backbone Router, connected at 100Mb/s.]
20. Physics Backbone Upgrade to Gigabit, Autumn 2002
[Diagram: the Physics Backbone Router connects at 1Gb/s to the Physics Firewall and to a server switch carrying the Linux and Win 2k servers at 1Gb/s, and links Particle Physics, the Clarendon Lab, and the Astro / Atmos / Theory areas at 1Gb/s, with desktops attached at 100Mb/s.]
21. Network
- Gigabit network installed for the physics backbone.
- Most PP servers are now interconnected via gigabit.
- Many switches have been upgraded to provide 100Mb/s to almost every port, with gigabit uplinks to the core network.
- The connection to campus remains at 100Mb/s; the campus upgrade to a 10Gb/s core is not expected till the end of 2004.
- The Virtual Private Network (VPN) server is getting increased usage. It overcomes problems getting some protocols through firewalls and allows authorised users to get into the Physics network from remote sites, but it has its own security risks.
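To illustrate why the 100Mb/s campus link matters, a rough transfer-time estimate (assuming an ideal, uncontended link with no protocol overhead, and an illustrative 1TB dataset):

```python
# Time to move a terabyte of experiment data over the 100Mb/s campus
# link, ignoring protocol overhead and contention. The 1TB figure is an
# illustrative assumption, not a quoted workload.
LINK_MBPS = 100   # campus connection, megabits per second
DATA_TB = 1.0     # example transfer size

seconds = DATA_TB * 1e12 * 8 / (LINK_MBPS * 1e6)
hours = seconds / 3600
print(f"{hours:.1f} hours to move {DATA_TB} TB at {LINK_MBPS}Mb/s")  # 22.2
```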
22. Network Security
- Constantly under threat from worms and viruses. Boundary firewalls don't solve the problem entirely, as people bring infections in on laptops.
- New firewall based on stateful inspection; policy is now "default closed". Some teething problems as we learnt what protocols were required, but there has been a very significant improvement in security.
- The main firewall passes an average of 5.8GB/hour (the link saturates at peak). It rejects 26,000 connections per hour (7 per second). Mischievous connects are rejected at 1,500/hour, one every 2.5 secs; during the Blaster worm this reached 80/sec.
- Additional firewalls installed to protect the Atlas construction area and to protect us from attacks via dialup or VPN.
- Need better control over how laptops access our network. Migrating to a new Network Address Translation system so that all portables connect through a managed gateway.
- Have made it easier to keep anti-virus software (Sophos) up to date via simply connecting to a web page. It is important that everyone managing their own machines takes advantage of this. Very useful for both laptops and home systems.
- Keeping OSs patched is a major challenge. It is easier when machines are all inside one management domain, but is still very time consuming. Compare this, though, to the one to a few man-months of IT support staff effort needed to clean a successful worm out of the network.
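The firewall rates quoted above can be cross-checked directly from the hourly figures:

```python
# Cross-checking the firewall figures quoted above: 26,000 rejected
# connections per hour is indeed about 7 per second, and 1,500
# mischievous connects per hour is about one every 2.4s, consistent
# with the "one every 2.5 secs" quoted on the slide.
rejects_per_hour = 26_000
mischievous_per_hour = 1_500

rejects_per_sec = rejects_per_hour / 3600
seconds_between_mischievous = 3600 / mischievous_per_hour

print(f"{rejects_per_sec:.1f} rejects per second")            # 7.2
print(f"one mischievous connect every {seconds_between_mischievous:.1f}s")
```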
23. Goals for 2004 (Computing)
- Continue to improve network security
  - Need better tools for OS patch management.
  - Need users to help with their private laptops:
    - Use automatic updates (e.g. Windows Update).
    - Update anti-virus software regularly.
  - Segment the network by levels of trust.
  - All the above without adding an enormous management overhead!
- Reduce the number of OSs
  - Remove the last NT4 machines and Exchange 5.5.
  - Digital Unix and VMS are very nearly gone.
  - Getting closer to standardising on RH 7.3, especially as the EDG software is now heading that way.
- Still finding it very hard to support laptops, but we now have a standard clone and recommend IBM laptops.
- What version of Linux to run? Currently all 7.3, but what next?
- Looking into Single Sign-On for PP systems.