NOW and Beyond - PowerPoint PPT Presentation

Provided by: DavidE2

Transcript and Presenter's Notes

Title: NOW and Beyond


1
NOW and Beyond
  • Workshop on Clusters and Computational Grids for
    Scientific Computing
  • David E. Culler
  • Computer Science Division
  • Univ. of California, Berkeley
  • http://now.cs.berkeley.edu/

2
NOW Project Goals
  • Make a fundamental change in how we design and
    construct large-scale systems
  • market reality
  • 50%/year performance growth => cannot allow 1-2
    year engineering lag
  • technological opportunity
  • single-chip "Killer Switch" => fast, scalable
    communication
  • Highly integrated building-wide system
  • Explore novel system design concepts in this new
    cluster paradigm

3
Berkeley NOW
  • 100 Sun UltraSparcs
  • 200 disks
  • Myrinet SAN
  • 160 MB/s
  • Fast comm.
  • AM, MPI, ...
  • Ether/ATM switched external net
  • Global OS
  • Self Config

4
Landmarks
  • Top 500 Linpack Performance List
  • MPI, NPB performance on par with MPPs
  • RSA 40-bit Key challenge
  • World Leading External Sort
  • Inktomi search engine
  • NPACI resource site

5
Taking Stock
  • Surprising successes
  • virtual networks
  • implicit co-scheduling
  • reactive IO
  • service-based applications
  • automatic network mapping
  • Surprising unsuccesses
  • global system layer
  • xFS file system
  • New directions for Millennium
  • Paranoid construction
  • Computational Economy
  • Smart Clients

6
Fast Communication
  • Fast communication on clusters is obtained
    through direct access to the network, as on MPPs
  • Challenge is to make this general purpose
  • system implementation should not dictate how it
    can be used

7
Virtual Networks
  • Endpoint abstracts the notion of being "attached
    to the network"
  • Virtual network is a collection of endpoints that
    can name each other.
  • Many processes on a node can each have many
    endpoints, each with own protection domain.

8
How are they managed?
  • How do you get direct hardware access for
    performance with a large space of logical
    resources?
  • Just like virtual memory
  • active portion of large logical space is bound to
    physical resources

[Figure: host memory holding the endpoints of Process 1 through Process n, alongside the processor and the NIC memory of the network interface]

9
Network Interface Support
  • NIC has endpoint frames
  • Services active endpoints
  • Signals misses to driver
  • using a system endpoint

[Figure: NIC endpoint frames, Frame 0 through Frame 7, with transmit and receive queues and an endpoint-miss signal]

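The virtual-memory analogy on the two slides above can be illustrated with a small sketch. It assumes eight frames (the figure labels run from Frame 0 to Frame 7) and a least-recently-used replacement policy. The slides do not name a policy, so LRU is an assumption, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class EndpointFrameCache:
    """Binds a large set of logical endpoints to a few NIC frames.

    Hypothetical model: LRU replacement is an assumption, not from the slides.
    """

    def __init__(self, num_frames=8):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # endpoint id -> frame index, oldest first
        self.misses = 0

    def access(self, endpoint_id):
        """Return the frame holding endpoint_id, loading it on a miss."""
        if endpoint_id in self.resident:
            self.resident.move_to_end(endpoint_id)  # mark as recently used
            return self.resident[endpoint_id]
        # Endpoint miss: this is where the NIC would signal the driver.
        self.misses += 1
        if len(self.resident) < self.num_frames:
            frame = len(self.resident)  # a frame is still free
        else:
            _, frame = self.resident.popitem(last=False)  # evict the LRU endpoint
        self.resident[endpoint_id] = frame
        return frame
```

Only the active endpoints occupy frames. An endpoint that has been evicted costs one miss the next time it is used, much like a page fault.
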
10
Communication under Load
=> Use of networking resources adapts to demand.
11
Implicit Coscheduling
  • Problem: parallel programs are designed to run in
    parallel => huge slowdowns with local scheduling
  • gang scheduling is rigid, fault prone, and
    complex
  • Coordinate schedulers implicitly using the
    communication in the program
  • very easy to build, robust to component failures
  • inherently service on-demand, scalable
  • Local service component can evolve.

12
Why it works
  • Infer non-local state from local observations
  • React to maintain coordination
  • observation => implication => action
  • fast response => partner scheduled => spin
  • delayed response => partner not scheduled => block
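
The observation/implication/action rule above amounts to a two-phase wait: spin briefly, then block. The sketch below is one way to express it, assuming a `threading.Event` stands in for the arrival of a reply. The function name and the spin threshold are illustrative, not taken from the NOW implementation.

```python
import threading
import time


def wait_for_reply(reply_event, spin_limit_s=0.001, block_timeout_s=1.0):
    """Spin briefly for a reply, then block. Returns (got_reply, action)."""
    deadline = time.monotonic() + spin_limit_s
    while time.monotonic() < deadline:
        if reply_event.is_set():
            # Fast response: the partner is probably scheduled, so spinning paid off.
            return True, "spin"
    # Delayed response: the partner is probably not scheduled, so give up the CPU.
    got_reply = reply_event.wait(timeout=block_timeout_s)
    return got_reply, "block"
```

Each node decides using only what it observes locally, so no global scheduler has to be built or kept alive.
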

13
Example
  • Range of granularity and load imbalance
  • spin-wait: 10x slowdown

14
I/O Lessons from NOW sort
  • Complete system on every node: powerful basis for
    data-intensive computing
  • complete disk sub-system
  • independent file systems
  • MMAP, not read; MADVISE
  • full OS => threads
  • Remote I/O (with fast comm.) provides same
    bandwidth as local I/O.
  • I/O performance is very temperamental
  • variations in disk speeds
  • variations within a disk
  • variations in processing, interrupts, messaging,
    ...

15
Reactive I/O
  • Loosen data semantics
  • e.g., unordered bag of records
  • Build flows from producers (e.g., disks) to
    consumers (e.g., summation)
  • Flow data to where it can be consumed

[Figure: static parallel aggregation vs. adaptive parallel aggregation]

16
Performance Scaling
  • Allows more data to go to faster consumer
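
A small deterministic simulation can show why letting data flow to whichever consumer is free helps when consumers differ in speed. It is a sketch under simple assumptions: each consumer has a fixed cost per record, and the function names are hypothetical. It does not model the NOW implementation.

```python
import heapq


def static_finish_time(num_records, costs):
    """Each consumer gets an equal share up front; the slowest one sets the pace."""
    share, extra = divmod(num_records, len(costs))
    return max((share + (1 if i < extra else 0)) * c for i, c in enumerate(costs))


def adaptive_finish_time(num_records, costs):
    """Each record goes to whichever consumer becomes free first.

    Returns (finish_time, records_per_consumer).
    """
    free_at = [(0.0, i) for i in range(len(costs))]  # (time free, consumer index)
    heapq.heapify(free_at)
    counts = [0] * len(costs)
    finish = 0.0
    for _ in range(num_records):
        t, i = heapq.heappop(free_at)
        done = t + costs[i]
        counts[i] += 1
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish, counts
```

With 100 records and per-record costs of 1.0 and 4.0, the static split finishes at time 200, while the adaptive flow finishes at time 80 and sends 80 of the records to the faster consumer.
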

17
Service Based Applications
[Figure: Transcend Transcoding Proxy, showing service requests, front-end service threads, user profile database, manager, caches, and physical processors]
  • Application provides services to clients
  • Grows/Shrinks according to demand, availability,
    and faults

18
On the other hand
  • Glunix
  • offered much that was not available elsewhere
  • interactive use, load balancing, (partial)
    transparency
  • straightforward master-slaves architecture
  • millions of jobs served, reasonable scalability,
    flexible partitioning
  • crash-prone, inscrutable, unaware
  • xFS
  • very sophisticated: cooperative caching, network
    RAID
  • integrated at vnode layer
  • never robust enough for real use
  • Both are hard, outstanding problems

19
Lessons
  • Strength of clusters comes from
  • complete, independent components
  • incremental scalability (up and down)
  • nodal isolation
  • Performance heterogeneity and change are
    fundamental
  • Subsystems and applications need to be reactive
    and self-tuning
  • Local intelligence plus simple, flexible composition

20
Millennium
  • Campus-wide cluster of clusters
  • PC based (Solaris/x86 and NT)
  • Distributed ownership and control
  • Computational science and internet systems testbed

21
Paranoid Construction
  • What must work for RSH, DCOM, RMI, read, ...?
  • A page of C to safely read a line from a socket!
  • => carefully controlled set of cluster system
    ops
  • => non-blocking with timeout and full error
    checking
  • even if this needs a watcher thread
  • => optimistic with fail-over of implementation
  • => global capability at physical level
  • => indirection used for transparency must track
    fault envelope, not just provide mapping
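
The "page of C" remark can be made concrete with a sketch of a paranoid line reader: non-blocking, bounded by a deadline, with every failure reported to the caller. It is written in Python rather than C for brevity. The function name, default timeout, and length limit are assumptions for illustration.

```python
import select
import socket
import time


def read_line(sock, timeout_s=5.0, max_len=4096):
    """Read one newline-terminated line without ever blocking indefinitely.

    Returns (line_bytes, None) on success or (None, reason) on any failure.
    """
    buf = bytearray()
    deadline = time.monotonic() + timeout_s
    sock.setblocking(False)
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None, "timeout"
        try:
            readable, _, _ = select.select([sock], [], [], remaining)
        except (OSError, ValueError) as exc:
            return None, f"select failed: {exc}"
        if not readable:
            return None, "timeout"
        try:
            chunk = sock.recv(1)  # one byte at a time: never read past the newline
        except BlockingIOError:
            continue  # spurious wakeup; go back and wait again
        except OSError as exc:
            return None, f"recv failed: {exc}"
        if not chunk:
            return None, "peer closed connection"
        if chunk == b"\n":
            return bytes(buf), None
        buf += chunk
        if len(buf) > max_len:
            return None, "line too long"
```

Reading a byte at a time is slow but keeps the sketch short. A production version would buffer, which is part of why the careful version runs to a page.
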

22
Computational Economy Approach
  • System has a supply of various resources
  • Demand for resources revealed in price
  • distinct from the cost of acquiring the resources
  • User has unique assessment of value
  • Client agent negotiates for system resources on
    the user's behalf
  • submits requests, receives bids or participates
    in auctions
  • selects resources of highest value at least cost
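
One way to read "highest value at least cost" is that the client agent picks the offer with the largest surplus, meaning the user's private value minus the asking price. That reading is an assumption, and the names below are hypothetical.

```python
def select_offer(offers, value_of):
    """Pick the offer with the largest surplus (user's value minus asking price).

    offers: list of (resource_name, price) pairs received as bids.
    value_of: function mapping a resource name to this user's private value.
    Returns the chosen (resource_name, price), or None if nothing is worth its price.
    """
    best = None
    best_surplus = 0.0
    for name, price in offers:
        surplus = value_of(name) - price
        if surplus > best_surplus:
            best, best_surplus = (name, price), surplus
    return best
```

Because `value_of` belongs to the user, two clients can see the same offers and sensibly choose different resources.
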

23
Advantages of the Approach
  • Decentralized load balancing
  • according to the user's perception of importance,
    not the system's
  • adapts to system and workload changes
  • Creates incentive to adopt efficient modes of use
  • maintain resources in usable form
  • avoid excessive usage when needed by others
  • exploit under-utilized resources
  • maximize flexibility (e.g., migratable,
    restartable applications)
  • Establishes user-to-user feedback on resource
    usage
  • basis for exchange rate across resources
  • Powerful framework for system design
  • Natural for client to be watchful, proactive, and
    wary
  • Generalizes from resources to services
  • Rich body of theory ready for application

24
Resource Allocation
[Figure: an allocator fed by a stream of (partial, delayed, or incomplete) resource status information and a stream of (incomplete) client requests]
  • Traditional approach allocates requests to
    resources to optimize some system utility
    function
  • e.g., put work on least loaded, most free mem,
    short queue, ...
  • Economic approach views each user as having a
    distinct utility function
  • e.g., can exchange resource and have both happy!
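
The remark that two users can exchange resources and both end up happier follows directly from their having distinct utility functions. The sketch below assumes linear utilities over two resources. The weights, bundles, and function names are invented for illustration.

```python
def utility(weights, bundle):
    """Linear utility: how much this user values a bundle of resources."""
    return sum(weights[name] * amount for name, amount in bundle.items())


def swap(bundle_a, bundle_b, give_a, give_b):
    """A hands `give_a` to B and receives `give_b`; returns the new bundles."""
    new_a, new_b = dict(bundle_a), dict(bundle_b)
    for name, amount in give_a.items():
        new_a[name] -= amount
        new_b[name] += amount
    for name, amount in give_b.items():
        new_b[name] -= amount
        new_a[name] += amount
    return new_a, new_b
```

Suppose A weights CPU at 3 and memory at 1, B weights them the other way round, and each starts with 4 units of both. If A trades 2 units of memory for 2 units of B's CPU, each user's utility rises from 16 to 20. An allocator optimizing a single system-wide utility function has no way to see that gain.
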

25
Pricing and all that
  • What's the value of a CPU-minute, a MB-sec, a
    GB-day?
  • Many iterative market schemes
  • raise price till load drops
  • Auctions avoid setting a price
  • Vickrey (second-price sealed-bid) auction will
    cause resources to go to where they are most
    valued at the lowest price
  • In self-interest to reveal true utility function!
  • Small problem: auctions are awkward for most real
    allocation problems
  • Big problem: people (and their surrogates) don't
    know what value to place on computation and
    storage!
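
The second-price sealed-bid rule mentioned above is short enough to write out. This is a minimal sketch for a single indivisible resource. The function name and the tie-breaking rule (the earliest submitted bid wins a tie) are assumptions.

```python
def vickrey_auction(bids):
    """Second-price sealed-bid auction for one resource.

    bids: dict mapping bidder name to a sealed bid.
    Returns (winner, price): the highest bidder wins but pays the second-highest bid.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids to set a second price")
    # sorted() is stable, so equal bids keep their submission order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price
```

The winner's payment does not depend on the winner's own bid, so shading a bid below true value can only lose the auction, never lower the price. That is the sense in which revealing the true utility is in each bidder's self-interest.
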

26
Smart Clients
  • Adopt the NT view: "everything is two-tier, at least"
  • UI stays on the desktop and interacts with
    computation in the cluster of clusters via
    distributed objects
  • Single-system image provided by wrapper
  • Client can provide complete functionality
  • resource discovery, load balancing
  • request remote execution service
  • Flexible applications will monitor availability
    and adapt.
  • Higher-level services: 3-tier optimization
  • directory service, membership, parallel startup

27
Everything is a service
  • Load-balancing
  • Brokering
  • Replication
  • Directories
  • => they need to be cost-effective or client will
    fall back to self support
  • if they are cost-effective, competitors might
    arise
  • Useful applications should be packaged as
    services
  • their value may be greater than the cost of
    resources consumed

28
Conclusions
  • We've got the building blocks for very
    interesting clustered systems
  • fast communication, authentication, directories,
    distributed object models
  • Transparency and uniform access are convenient,
    but...
  • It is time to focus on exploiting the new
    characteristics of these systems in novel ways.
  • We need to get real serious about availability.
  • Agility (wary, reactive, adaptive) is
    fundamental.
  • Gronky F77 + MPI, no-I/O codes will seriously
    hold us back
  • Need to provide a better framework for cluster
    applications