Title: Outsourcing University Services
1. Outsourcing University Services
Future of National Computer Grid Services in the UK
- Dr Rhys Newman
- University of Oxford
- NeSC 22nd Feb 2007
2. My Background
- Academic researching Computer Grid Technology (more on this later)
- I work in the physics department, although I am a software engineer by trade
- Spent 6 months project-leading a small-scale computer room build for Oxford Physics
- Spent 2 further years on the committee overseeing a computer room build for the expansion of the Oxford Supercomputer
- Spent most of last year campaigning for outsourcing CPU provision in the light of the above experiences
- Am a director of a University spin-out company which aims to bring grid computing technology to market
- This experience has meant I have looked in detail at the economic argument for grid computing (at least in connection with CPU usage), and compared it in particular with outsourcing CPU time and building your own computer facility
- I therefore feel well qualified to comment on the
issue of Outsourcing your CPU
3. Current Status in the UK
- The UK maintains about 7% of the top 500 supercomputer power (6% in 2006).
- It has held this share even though the total CPU power has increased by 35x since 2000.
- To get into the top 500 you'll need about 1000 processor cores at 2.4GHz or better
- A cluster similar to Cambridge's recent supercomputer, but equivalent to the number 1 supercomputer in GFlops:
  - 18,000 dual-core Xeon machines
  - Cost over £30 million to buy (computers only, at £2500 each retail)
  - Consume over 15MW and cost £8 million in electricity to run per year
  - Would provide approximately 1 billion GHzHrs per year
  - Would need a machine room the size of a football pitch
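The GHzHr figure for this hypothetical cluster can be checked with simple arithmetic. The sketch below assumes each machine contributes 2 cores at 3.0 GHz and runs flat out all year; those per-machine figures are assumptions chosen for illustration, not values stated on the slide. The electricity calculation uses only the slide's own 15MW and £8 million.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def ghz_hours_per_year(machines, cores_per_machine, ghz_per_core, utilisation=1.0):
    """Aggregate GHzHrs delivered in one year by a homogeneous cluster."""
    return machines * cores_per_machine * ghz_per_core * HOURS_PER_YEAR * utilisation


def implied_pence_per_kwh(power_mw, annual_cost_pounds):
    """Electricity price implied by a constant load and an annual bill."""
    kwh_per_year = power_mw * 1000 * HOURS_PER_YEAR
    return annual_cost_pounds * 100 / kwh_per_year


# Assumed: 2 cores per machine at 3.0 GHz (illustrative, not from the slide).
cluster = ghz_hours_per_year(18_000, cores_per_machine=2, ghz_per_core=3.0)
print(f"{cluster / 1e9:.2f} billion GHzHrs per year")
print(f"{implied_pence_per_kwh(15, 8_000_000):.1f}p per kWh")
```

Under these assumptions the cluster delivers roughly 0.95 billion GHzHrs per year, in line with the "approximately 1 billion" quoted, and the electricity bill implies a price of about 6p per kWh.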
4. What do Academic Users want?
- More computing!!!
- Typically x86 Linux based
  - Power processors coming in artificially in 2006 due to Blue Gene upgrade.
- Clusters with raw CPU grunt rather than special hardware or interconnect.
  - Important for cost
  - Infiniband appearing before 10Gbit Ethernet?
5. Correlation of GFlops and GHzHrs
- Clock speed (and so GHzHrs) correlates with performance within machine families
[Figure: SpecInt2000 vs GHz, one panel each for 2004, 2005 and 2006]
6. To GHzHr or not to GHzHr?
- Despite the flaws in this measurement of power, observe the following prices on Dell.co.uk
- The average is 1.36p/GHzHr (2.53p if you use Hire Purchase).
- These values were 1.15p/GHzHr 6 months ago (1.39p on HP)
- This suggests a de facto acceptance of GHzHrs as a basis of price
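A price per GHzHr can be derived from a list price by spreading the purchase cost over the GHzHrs the machine delivers in its working life. The sketch below is one way to do that. The 3-year lifetime, 4 cores and 2.4 GHz in the example are assumed values for illustration, not the Dell configurations behind the averages on this slide; the £2500 is the retail figure quoted earlier in the talk.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def pence_per_ghzhr(price_pounds, cores, ghz_per_core, lifetime_years):
    """Purchase price spread over every GHzHr delivered during the machine's life."""
    total_ghzhrs = cores * ghz_per_core * HOURS_PER_YEAR * lifetime_years
    return price_pounds * 100 / total_ghzhrs


# Assumed configuration: 4 cores at 2.4 GHz, written off over 3 years.
rate = pence_per_ghzhr(2500, cores=4, ghz_per_core=2.4, lifetime_years=3)
print(f"{rate:.2f}p per GHzHr")
```

With these assumed figures the hardware comes out at about 1p per GHzHr. Halving the lifetime or the core count doubles the rate, which is one reason quoted prices vary with the exact configuration.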
7. How to get the most GHzHrs for £
- Buy your own computers, build your room and run a computing facility
- Rent computers and hosting from an external provider
- Use grid computing to extract value from existing machines
8. Option 1: Build your own Facility
- Advantages
  - You get good PR when it opens
  - You can get exactly the equipment you want... well, almost
- Disadvantages
  - The project risks of building and commissioning such facilities are surprisingly large
  - No flexibility: run at 100% all the time or waste the investment
  - All the responsibility: uptime, hardware failures, hardware refresh, being everything to everyone!
  - Real cost
9. The costs of a computer room
- Cost of computer hardware is 1.0p/GHzHr
- However, a "bare bones" facility calculation for a 1000 dual-CPU node cluster shows
  - £1.3 million running costs per year (£500k in electricity alone)
  - GHzHr rate 1.27p to 1.43p
  - £4.6 to £6.3 million startup costs
  - This build has no UPS or other high availability features
- This is the cost Universities should be able to pass on to internal users
- However, inherent inefficiencies in the internal process mean this rapidly becomes more than 5p/GHzHr, if the university can find the initial capital in the first place!
- Anecdotally, one institution believes it is possible to charge 10p/GHzHr and expect
  - Their academics to pay it
  - The research councils to accept the charge on the FEC project sheet
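One way to see how a bare-bones rate can drift towards 5p/GHzHr is to look at utilisation. The sketch below treats the facility's costs as entirely fixed, so the rate per GHzHr actually used is the full-load rate divided by the fraction of capacity in use. Treating all costs as fixed, and attributing the gap to utilisation alone, are simplifying assumptions made here for illustration; the slide does not break down the inefficiencies it refers to.

```python
def effective_rate(full_load_rate_pence, utilisation):
    """Rate per GHzHr actually used when costs do not shrink with usage."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return full_load_rate_pence / utilisation


def break_even_utilisation(full_load_rate_pence, target_rate_pence):
    """Utilisation below which the effective rate exceeds the target rate."""
    return full_load_rate_pence / target_rate_pence


# Full-load rates are the 1.27p and 1.43p quoted on the slide.
for full_load in (1.27, 1.43):
    for utilisation in (1.0, 0.5, 0.25):
        rate = effective_rate(full_load, utilisation)
        print(f"{full_load}p at {utilisation:.0%} utilisation: {rate:.2f}p per GHzHr")

print(f"{break_even_utilisation(1.27, 5.0):.0%}")
```

Under this simple model a facility costed at 1.27p/GHzHr passes 5p/GHzHr once utilisation falls to about a quarter of capacity.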
10. The chain of events...
- More and more research areas need substantial computing resources, more than any department can contemplate
- The University steps in to provide a central computing facility: more efficient on the surface
- You now need a computer room built to modern spec (as you'll need 1000 CPUs)
- This almost always requires a substantial building project
- Building projects are notoriously late and over budget (a typical fully costed build multiplies initial quotes by 2)
- Every department is now at the mercy of the progress of this central project, which becomes politically more risky to control as costs overrun and deadlines are missed
- The computing facility comes online and has to recover much larger costs than anticipated from departments (and their research projects): almost inevitable now we have FEC
- The late project has had an academic opportunity cost which is difficult to quantify (but has almost certainly cost research grants), and the overpricing needed has made it unattractive to use
- The University steps in to force use (user charge bumped up with general fund support)
- Nobody is happy, costs are high and research has suffered
- Even worse if certain high-spec users of computing have persuaded the University to spend extra money on hardware which they need, as this results in central funds sponsoring a particular group's work at the expense of other uses
11. Option 2: Outsource CPU Resources
- Advantages
  - You can get the best value from a large number of suppliers (competition is fierce)
  - You can often get your resources online in less than a week
  - You carry no building-project risk, and the hardware maintenance, infrastructure resilience and operational hassle are no longer yours
  - You are flexible: grow and shrink as necessary
- Disadvantages
  - You don't get as large a choice of hardware
12. Price Comparison for GHzHrs
- Some prices from the web for dedicated hosting
- 5p was the norm 6 months ago
- Worthy mention: www.VCompute.com
  - 5p/GHzHr but in conventional cluster arrangement with 8GB RAM on each node, happy to supply over 10000 nodes
- Reasons for Variation
  - Different RAM
  - Pentium/XEON/AMD
  - Bandwidth restrictions
  - HDD size
  - Additional services
13. Option 3: Use Existing Resources Better
- Grid technology can enable the thousands of machines in an institution to be utilised much more effectively
- This is a real resource which is going to waste: an estimated 100 billion globally per annum.
- Office machines can be used to soak up the more conventional computing tasks, leaving only specialist tasks for special machines
- Specialist machines can be smaller and come back into the departmental remit, where they belong!
14. Potential locked away
- How many computers in Oxford?
  - Oxford University has 5000 staff
  - 50000 registered IP addresses
  - Suggests 10000 modern machines available
- How many in the UK academic sector?
  - 168 institutions employing 160000 staff
  - Assume 200000 decent machines (2GHz or better) available and connected to the LAN
  - 3.5 billion GHzHrs total, 2.6 billion outside office hours
  - Total incremental cost of this is dominated by the extra electricity: £50000 per year per institution
  - Equivalent to 0.3p/GHzHr
  - For a UK-wide cost of £8m/year, we could have 2 Darwin machines
  - Equivalent to number 13 in the top 500
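The UK-wide estimate can be reproduced approximately from the figures on this slide. The sketch below counts each machine as a single 2 GHz core and takes office hours to be a 40-hour week for 52 weeks; both are assumptions made here, since the slide does not say how office hours were defined.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def desktop_grid_estimate(machines, ghz_per_machine, office_hours_per_year,
                          institutions, electricity_per_institution_pounds):
    """Rough capacity and incremental cost of a desktop grid."""
    total = machines * ghz_per_machine * HOURS_PER_YEAR
    off_hours = machines * ghz_per_machine * (HOURS_PER_YEAR - office_hours_per_year)
    annual_cost = institutions * electricity_per_institution_pounds
    return {
        "total_ghzhrs": total,
        "off_hours_ghzhrs": off_hours,
        "annual_cost_pounds": annual_cost,
        # Cost is spread over the off-hours capacity only.
        "pence_per_ghzhr": annual_cost * 100 / off_hours,
    }


est = desktop_grid_estimate(
    machines=200_000,
    ghz_per_machine=2.0,              # assumed: one 2 GHz core per machine
    office_hours_per_year=8 * 5 * 52,  # assumed: 40-hour week
    institutions=168,
    electricity_per_institution_pounds=50_000,
)
for name, value in est.items():
    print(f"{name}: {value:,.2f}")
```

These inputs give about 3.5 billion GHzHrs in total, roughly 2.7 billion outside office hours, an annual electricity cost of £8.4 million and a rate of about 0.31p/GHzHr, all close to the figures quoted.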
15. Grid Technology: Nereus
- Any proposed technology which attempts to exploit these idle machines must
  - Support Windows primarily (90% of all computers run Windows, not Linux)
  - Not require admin privileges to run
  - Be bulletproof, to protect users and owners from each other (and limit support calls)
  - Be simple and easy to install
- My particular interest and project: Nereus
  - In development for 2 years, currently in beta
  - Testing phase set to begin within weeks on many thousands of machines (ironically not in academia and not in the UK!)
  - Solves the above issues in a way not addressed by any current grid middleware
16. Recommendations
- Do not
  - Build any more computer rooms at an institution level
  - Waste money on large special hardware
  - Wait any longer to catch up with the rest of the world in computing resources
- Do
  - Outsource computing resources to specialist providers
  - Soak up the existing resources in institutions using grid technology
  - Let special projects buy their special hardware for their own use as before
- Finally, a request for a composite National Grid Service
  - We can build an academic grid using Nereus which pools the idle time of all UK institutions: a resource of global capability
  - Can the NGS supply conventional clusters and also manage a desktop grid deployment to ensure the right users get the best resources per £?
  - Can anyone suggest a means to fund the desktop grid part in the UK? A small outlay will have massive benefits