HPCC Status, 4/17/2009

1
HPCC Status, 4/17/2009
  • Buy-in
  • Scheduling

2
Changes in HPCC world
  • SGI bankruptcy and subsequent purchase by
    Rackable
  • Apparent end of Western Scientific
  • Sun is rumored to be up for sale (IBM the latest)
  • Economy is stressing many companies

3
Changes in OUR HPCC world
  • Construction continues, should be done by May
  • Working on a number of software issues, in
    particular networking
  • Systems have been running fairly busy, in the
    90-95% range recently, and over 75% overall.

4
Recent issues
  • SGI SMP
  • Green lost memory DIMM, went down
  • White (Green frontend) went down
  • Weird power loss (50 seconds, 300 kVA offline)
  • Construction/planned downtime Thursday

5
Recent changes
  • new Lustre file system online
  • new user file system online (allows for Samba
    mounts!)
  • NFS v4 running on Brody and infrastructure (helps
    with networking problems)
  • better testing suite to find problems sooner

6
Coming soon
  • a single disk image (same copy of the OS) is
    being developed to be run on every system. It
    will make using the different clusters much
    easier
  • the environment will be the same on every system
    (cluster, fat nodes, whatever)
  • ssh test-amd05; an Intel version will be
    available soon.

7
GLCPC
  • www.greatlakesconsortium.org
  • recently ran a survey, which 7 MSU members filled
    out (thanks!)
  • will hold summer sessions, likely delivered
    remotely to the various institutions
  • Really want to find out who will use the Blue
    Waters machine and what they need to know to do
    so.
  • Please let me know if you have any questions.

8
Staff plus issues
  • Staff will present some of the current issues the
    Center is working on

9
Home Directory Storage
Ed Kryda, Manager
  • Currently 100TB available / 50GB default quota
  • Sun X4540 customized
  • Performance: 200 MB/s write, 1 GB/s read max
  • Initial reliability issues
  • NFS v4
  • Samba/CIFS file sharing!
  • snapshots

10
Lustre Storage
Greg Mason, System Administrator
  • Old Lustre retirement 5/1/09 (/mnt/lustre)
  • Eventually repurposed
  • New Shared Scratch Space (/mnt/ls09)
  • 33 TB
  • /mnt/lustre_scratch_2009
  • /mnt/scratch
  • ONLY TEMPORARY FILES
  • Future automatic deletion (sketched below)
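
As a rough illustration of what future automatic deletion
could look like, the hypothetical sketch below removes
scratch files untouched for more than 30 days. The 30-day
cutoff is an assumption for illustration, not announced
HPCC policy.

# Hypothetical scratch-cleanup pass; the 30-day cutoff is
# an illustrative assumption, not HPCC policy.
import os
import time

SCRATCH = "/mnt/scratch"      # shared scratch mount point
CUTOFF = 30 * 24 * 3600       # assumed 30-day age limit, in seconds
now = time.time()

for root, dirs, files in os.walk(SCRATCH):
    for name in files:
        path = os.path.join(root, name)
        try:
            if now - os.path.getmtime(path) > CUTOFF:
                os.remove(path)   # delete files untouched past the cutoff
        except OSError:
            pass                  # file vanished or unreadable; skip it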

11
User Education and Assistance
Dirk Colbry, Academic Specialist
http://wiki.hpcc.msu.edu/
  • Research Collaborations
  • System Level Debugging
  • System Level Testing
  • University Level Training Classes
  • Research Group Level Training Classes
  • Face-to-Face Individual Training and Debugging
  • Up-to-date Documentation

12
Better testing
Jim Leikert, System Administrator
  • New scripts for testing node health (see the
    sketch below)
  • New measures to keep jobs in line
  • Job state messages
  • Slowly being rolled out
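
A minimal sketch of the kind of node-health check such
scripts might run; the thresholds and the resources checked
here are illustrative assumptions, not the actual HPCC
scripts.

# Hypothetical node-health check; thresholds are illustrative.
import os
import shutil

def node_healthy(min_free_gb=5, max_load_per_core=2.0):
    """Return True if local disk space and load average look sane."""
    free_gb = shutil.disk_usage("/tmp").free / 2**30   # free space on /tmp
    load1, _, _ = os.getloadavg()                      # 1-minute load average
    cores = os.cpu_count() or 1
    return free_gb >= min_free_gb and load1 / cores <= max_load_per_core

if __name__ == "__main__":
    # A nonzero exit code would mark the node as unhealthy.
    raise SystemExit(0 if node_healthy() else 1)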

13
User vignettes
Kelly Osborn, Administrative Assistant
  • improves our public face
  • currently have 12 vignettes
  • looking for additional research to showcase
  • kosborn@msu.edu

14
SMP and White
Andy Keen, System Administrator
  • SMP is off support, has gone down twice, being
    repaired by hand
  • White was down to two processors
  • The SMP's days are numbered
  • Need to transition to newer fat nodes
  • It will require recompilation to use the new
    library links (queue: brody_4s)

15
Shorter-term issues
  • Discussion items

16
Buy replacement SMP nodes
  • We have previously discussed buying replacement
    nodes
  • Sweet spot is a box with 32 cores and 256 GB of
    memory
  • Would like to buy on the order of 4-5 of these as
    replacements for the SMP
  • Note you would have to recompile!
  • same OS image as the clusters however!
  • Your opinions? We'd like to buy soon.

17
More storage
  • Rolling our own has been a lot of work.
  • Transition to NFS v4 has improved performance and
    reliability, but we need more storage
  • Continue with the cheaper, expandable version or
    go with a turnkey solution (such as NetApp)?

18
(Diagram: hardware hierarchy, from rack to chassis to
nodes to processors/sockets to cores, with examples.)
19
Job Scheduling Example
(Diagram: a queue of jobs ID1-ID5, each listing a number
of cores, a duration, and a priority, shown against the
current jobs and the new schedule on Node 1; a small job
is backfilled into the idle gap before the current time.
A sketch of the backfill idea follows.)
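
To make the backfill idea concrete, here is a hypothetical
sketch: jobs are considered in priority order, and a
lower-priority job may jump ahead only if it fits into the
idle gap before the highest-priority waiting job is due to
start. The job IDs and numbers are illustrative, not the
scheduler's actual algorithm.

# Hypothetical backfill sketch: gap_cores and gap_hours describe the
# idle capacity before the top-priority waiting job can start.
def pick_backfill(queue, gap_cores, gap_hours):
    """Return jobs (in priority order) that fit the idle gap without
    delaying the highest-priority waiting job."""
    chosen = []
    for job in sorted(queue, key=lambda j: j["priority"], reverse=True):
        if job["cores"] <= gap_cores and job["hours"] <= gap_hours:
            chosen.append(job)
            gap_cores -= job["cores"]
    return chosen

queue = [
    {"id": "ID1", "cores": 4, "hours": 2, "priority": 10},
    {"id": "ID2", "cores": 8, "hours": 48, "priority": 50},
    {"id": "ID3", "cores": 1, "hours": 1, "priority": 5},
]
print(pick_backfill(queue, gap_cores=6, gap_hours=4))   # ID1 and ID3 fit the gap
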
20
Isolating long running jobs
  • Working now on isolating long-running jobs.
  • Long-running jobs clog the nodes, especially
    long-running, single-CPU jobs.
  • Users would prefer to run on a single node for
    better efficiency

21
Current Scheduling Problem
(Diagram: a one-week window of node schedules from the
current time, with nodes tied up by long jobs.)
  • Long single-core jobs take over nodes.
  • Mid-sized jobs (8-64 cores) cannot be reliably
    scheduled on dedicated nodes.
  • Very large core-count jobs cannot be scheduled at
    all.

22
Changing the scheduling of long jobs
  • We propose grouping long-running jobs in the
    system
  • Could involve capping the number
  • For example, reserve ¼ of each cluster (128
    256 for 384)
  • Improve scheduling of larger jobs, with
    potentially few side effects (see the sketch
    below)
  • Discussion?
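
A toy sketch of the capping idea above: once the reserved
share is full, no more long jobs are admitted. The node
count is an illustrative assumption; the 1/4 share comes
from the example on this slide.

# Hypothetical cap on long-running jobs: at most a quarter of the
# cluster's nodes may hold them at once. Numbers are illustrative.
CLUSTER_NODES = 128
LONG_JOB_CAP = CLUSTER_NODES // 4      # reserve 1/4 of the cluster

def admit_long_job(nodes_requested, long_nodes_in_use):
    """Admit a long job only while the long-job share has room."""
    return long_nodes_in_use + nodes_requested <= LONG_JOB_CAP

print(admit_long_job(8, long_nodes_in_use=20))    # True: 28 <= 32
print(admit_long_job(16, long_nodes_in_use=20))   # False: 36 > 32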

23
Discussion, buy-in priority
24
Reinstitute buy-in
  • Would like to reinstitute buy-in, users buying
    nodes to be run by the HPCC
  • the recent renovations allow for expansion of the
    center's facilities for shared HPCC
    infrastructure (no user hosting!)
  • we believe there are many users with equipment
    money who would like to buy in

25
(Diagram repeated: hardware hierarchy, from rack to
chassis to nodes to processors/sockets to cores, with
examples.)
26
Users will buy chassis
  • Increment of a chassis for purchase
  • Price to be determined, but roughly $1,000/core
  • box will be 8 or 16 cores, depending on deals and
    prices
  • example deal: 8-core Nehalem, 48 GB memory, about
    $8,000 (varies)
  • better deals with larger purchases

27
HPCC will provide
  • support the hardware, networking, disks, power
    and cooling
  • software, OS, access
  • 3 or 5 years (need feedback)
  • most support contracts are 3 years; it could be
    5, but there are issues with this

28
HPCC will also purchase chassis
  • HPCC does have some funds to purchase general use
    nodes as well
  • For the next 5 years the HPCC will continue to
    expand within the bounds of the ICER budget.
  • However, the ICER budget is a sliding scale,
    providing more support and less hardware over
    time.

29
Priority scheduling of buy-in
  • These are points of discussion
  • need your feedback.
  • A couple of models, all of which allow unused
    nodes to get scheduled for larger jobs while
    still giving buy-in users access

30
First, really two systems
  • HPCC provides public nodes for anyone with an
    HPCC account to schedule
  • first come, first served (mostly)
  • The researchers who buy in would have reserved
    access to their nodes, plus the slack of other
    buy-in users
  • no general scheduling in this part of the system
    (mostly)

31
In the buy-in system, three issues
  • How quickly
  • How many
  • How long

32
How quickly: Purdue model
  • guarantee access to the number of purchased nodes
    within X hours (could be 1 hour, 4 hours, 8
    hours); Purdue is now at 4
  • Buy-in users can get more nodes than they
    purchased if they don't run longer than 4 hours
    (1 hour, 8 hours, etc.)
  • cannot guarantee that a big job will start within
    some time period, but the time slice above
    provides an opportunity (see the sketch below)
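
A hypothetical reading of the Purdue model as code: within
the purchased allocation a job is always eligible, and
beyond it only jobs shorter than the guarantee window may
borrow extra cores. The 4-hour window and the function
names are assumptions for illustration.

# Hypothetical "Purdue model" check; the 4-hour window is illustrative.
GUARANTEE_HOURS = 4

def buyin_job_allowed(cores_requested, cores_purchased, walltime_hours):
    """Within the purchased allocation a job is always eligible; beyond
    it, only short jobs (<= the guarantee window) may borrow cores."""
    if cores_requested <= cores_purchased:
        return True
    return walltime_hours <= GUARANTEE_HOURS

print(buyin_job_allowed(64, 32, walltime_hours=3))    # True: short enough to borrow
print(buyin_job_allowed(64, 32, walltime_hours=24))   # False: too long to exceed purchase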

33
How many: Dial-in nodes
  • Users can dial in how many of their purchased
    nodes they need within some time slice (1 day, 1
    week, etc.)
  • Dialing in low earns higher priority or future
    credit, and the released nodes become available
    for larger jobs beyond what was purchased (see
    the sketch below)
  • Must have a reasonable time slice to get good
    scheduling (a week?)
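
One hypothetical way to reward dialing in low: a scheduling
bonus that scales with the share of purchased nodes
released for the coming time slice. The numbers and the
bonus formula are assumptions, not a proposed policy.

# Hypothetical dial-in bonus: releasing more of your purchased nodes
# for the coming slice earns a larger scheduling bonus.
def dial_in_bonus(nodes_purchased, nodes_dialed_in, base_bonus=1000):
    """Bonus grows with the share of purchased nodes left for others."""
    released = max(nodes_purchased - nodes_dialed_in, 0)
    return base_bonus * released // max(nodes_purchased, 1)

print(dial_in_bonus(32, 8))    # 750: most nodes released this slice
print(dial_in_bonus(32, 32))   # 0: nothing released, no bonus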

34
How long: Dial-in area
  • Buy-in users get their nodes 24x7
  • Could use the area model within some time slice.
    For example:
  • You bought 100 cores x 168 hours (a 1-week time
    slice)
  • You could use 200 cores x 84 hours (then wait 84
    hours before you schedule again)
  • The budget only resets every time slice (see the
    sketch below)
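
The area model is simply cores x hours: below is a
hypothetical budget check using the 100 cores x 168 hours
example from this slide. The function and variable names
are illustrative.

# Area model: a buy-in of 100 cores over a 168-hour slice is a budget
# of 16,800 core-hours that can be spent in any shape within the slice.
BUDGET_CORE_HOURS = 100 * 168          # 16,800 core-hours per weekly slice

def fits_budget(used_core_hours, cores, hours):
    """True if the new request still fits this slice's core-hour budget."""
    return used_core_hours + cores * hours <= BUDGET_CORE_HOURS

print(fits_budget(0, 200, 84))      # True: 200 x 84 = 16,800 exactly
print(fits_budget(16800, 1, 1))     # False: the slice's budget is spent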

35
Others on buy-in nodes
  • We would still like to keep utilization up on
    buy-in nodes, so it is possible that general
    users get access to those nodes under two
    conditions:
  • very short jobs (especially single-CPU)
  • pre-emption

36
Interested?
  • Contact Kelly Osborn at kosborn@msu.edu
  • Required information:
  • Account Number
  • Approximate Amount (unit amount unknown)
  • Deadlines on spending?
  • Contact name

37
Short, single-CPU jobs
  • Very short jobs can be used as backfill in the
    scheduler to fill holes
  • if jobs are short, no one has to wait very long
    (say, 5 minutes)
  • only if there is slack in the schedule
  • low priority

38
Preemption
  • Jobs that label themselves as preemptible get
    very high priority and can run anywhere at any
    time
  • preemptible means that they can be stopped at any
    time
  • once stopped, re-queued at high priority
  • The user must recover the state of the stopped
    job! (see the checkpoint sketch below)
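
Since the user must recover the state of a stopped job, a
checkpoint/restart pattern along these lines is the usual
approach. The file name, loop, and checkpoint interval are
illustrative assumptions, not an HPCC-provided mechanism.

# Hypothetical checkpoint/restart pattern for a preemptible job:
# progress is saved periodically so a requeued run resumes where
# the preempted run stopped.
import json
import os

CHECKPOINT = "state.json"              # illustrative checkpoint file name

def load_state():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"step": 0}                 # fresh start if no checkpoint exists

def save_state(state):
    with open(CHECKPOINT, "w") as f:
        json.dump(state, f)

state = load_state()
for step in range(state["step"], 1000):
    # ... one unit of work would go here ...
    state["step"] = step + 1
    if step % 50 == 0:                 # checkpoint every 50 steps
        save_state(state)
save_state(state)                      # final checkpoint at completion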

39
What about non-buy-in researchers?
  • Wolfgang