7a.1 - PowerPoint PPT Presentation

Provided by: BarryWi6

1
Computational Grids
2
Computational Problems
  • Problems that have lots of computations and
    usually lots of data.

3
Demand for Computational Speed
  • Continual demand for greater computational speed
    from a computer system than is currently possible
  • Areas requiring great computational speed include
    numerical modeling and simulation of scientific
    and engineering problems.
  • Computations must be completed within a
    reasonable time period.

4
Grand Challenge Problems
  • One that cannot be solved in a reasonable amount
    of time with today's computers. Obviously, an
    execution time of 10 years is always
    unreasonable.
  • Examples
  • Modeling large DNA structures
  • Global weather forecasting
  • Modeling motion of astronomical bodies.

5
Weather Forecasting
  • Atmosphere modeled by dividing it into
    3-dimensional cells.
  • Calculations of each cell repeated many times to
    model passage of time.

6
Global Weather Forecasting Example
  • Suppose global atmosphere divided into cells of
    size 1 mile × 1 mile × 1 mile to a height of 10
    miles - about 5 × 10^8 cells.
  • Suppose each calculation requires 200 floating
    point operations. In one time step, 10^11 floating
    point operations necessary.
  • To forecast weather over 7 day period using
    1-minute intervals (about 10^4 time steps), a
    computer operating at 1 Gflops (10^9 floating
    point operations/s) takes 10^6 seconds or over
    10 days.
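
The arithmetic on this slide can be checked with a short sketch. The figures are the ones quoted above; the variable and function names are illustrative only.

```python
# Figures quoted on the slide; names are illustrative.
CELLS = 5e8            # 1-mile cubes up to a height of 10 miles
FLOPS_PER_CELL = 200   # floating point operations per cell per time step
STEPS = 7 * 24 * 60    # 7 days at 1-minute intervals
RATE = 1e9             # 1 Gflops


def forecast_seconds(cells=CELLS, flops_per_cell=FLOPS_PER_CELL,
                     steps=STEPS, rate=RATE):
    """Return the run time in seconds for the whole forecast."""
    flops_per_step = cells * flops_per_cell   # 10^11 per time step
    return flops_per_step * steps / rate


seconds = forecast_seconds()
days = seconds / 86400
print(f"{seconds:.3g} s = {days:.1f} days")
```

The result is about 10^6 seconds, a little under 12 days, which matches the "over 10 days" estimate.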

7
Modeling Motion of Astronomical Bodies
  • Each body attracted to each other body by
    gravitational forces. Movement of each body
    predicted by calculating total force on each
    body.
  • With N bodies, N - 1 forces to calculate for each
    body, or approximately N^2 calculations in total.
  • (N log2 N for an efficient approximate
    algorithm.)
  • After determining new positions of bodies,
    calculations repeated.
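
A minimal sketch of the direct N^2 force calculation described above, assuming point masses in 3-D space and Newtonian gravity. The function name and data layout are illustrative and not taken from the slides.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def total_forces(masses, positions):
    """Return (forces, evaluations): total force on each body and the
    number of pairwise force evaluations, which is N * (N - 1)."""
    n = len(masses)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    evaluations = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            delta = [positions[j][k] - positions[i][k] for k in range(3)]
            dist = math.sqrt(sum(d * d for d in delta))
            magnitude = G * masses[i] * masses[j] / (dist * dist)
            for k in range(3):
                # Force on body i points toward body j.
                forces[i][k] += magnitude * delta[k] / dist
            evaluations += 1
    return forces, evaluations


masses = [1e10, 2e10, 3e10]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
forces, evaluations = total_forces(masses, positions)
print(evaluations)  # 3 bodies -> 3 * 2 = 6 evaluations
```

After the forces are known, a simulation would update velocities and positions and then call the function again for the next time step.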

8
  • A galaxy might have, say, 10^11 stars.
  • Even if each calculation done in 1 µs (extremely
    optimistic figure), one iteration using the
    N log2 N algorithm needs about 3.7 × 10^12
    calculations, i.e. roughly 3.7 × 10^6 seconds
    (about six weeks).
  • More than 11 years for 100 iterations. Typically
    require millions of iterations.

9
  • Astrophysical N-body simulation by Scott Linssen
    (undergraduate UNC-Charlotte student).

10
High Performance Computing (HPC)
  • Traditionally, achieved by using multiple
    computers together - parallel computing.
  • Simple idea! -- Using multiple computers (or
    processors) simultaneously should be able to
    solve the problem faster than a single computer.

11
Using multiple computers or processors
  • Key concept - dividing problem into parts that
    can be computed simultaneously.
  • Parallel programming - programming a computing
    platform consisting of more than one processor or
    computer.
  • Concept very old (50 years).

12
High Performance Computing
  • Long History
  • Multiprocessor system of various types (1950s
    onwards)
  • Supercomputers (1960s-80s)
  • Cluster computing (1990s)
  • Grid computing (2000s) ??

Maybe, but let's first look at how to achieve HPC.
13
Speedup Factor
  • S(p) = ts / tp, where
  • ts is execution time on a single processor
  • tp is execution time on a multiprocessor with p
    processors.
  • S(p) gives increase in speed by using
    multiprocessor.
  • ts should use the best sequential algorithm for
    a single processor. Parallel algorithm usually
    different.

14
Maximum Speedup
  • Maximum speedup is usually p with p processors
    (linear speedup).
  • Possible to get superlinear speedup (greater than
    p) but usually a specific reason such as
  • Extra memory in multiprocessor system
  • Non-deterministic algorithm

15
Maximum Speedup: Amdahl's Law
16
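  • If f is the fraction of the computation that
    must be performed serially, the speedup factor
    is given by
    S(p) = ts / (f × ts + (1 - f) × ts / p) = p / (1 + (p - 1) f)
  • This equation is known as Amdahl's law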

17
Speedup against number of processors
  • Even with infinite number of processors, max.
    speedup limited to 1/f .
  • Example: With only 5% of computation being
    serial, max. speedup 20, irrespective of number
    of processors.
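
The 1/f limit can be seen numerically with a small sketch of Amdahl's law in its usual form, S(p) = p / (1 + (p - 1) f), where f is the serial fraction. The function name is illustrative.

```python
def amdahl_speedup(p, f):
    """Speedup with p processors when a fraction f of the work is serial."""
    return p / (1 + (p - 1) * f)


# With 5% serial work the speedup approaches, but never reaches, 1/f = 20.
for p in (1, 10, 100, 1000, 10**6):
    print(p, round(amdahl_speedup(p, 0.05), 3))
```

Adding processors beyond a few hundred gains very little here, because the serial 5% dominates the run time.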

18
Superlinear Speedup Example: Searching
  • (a) Searching each sub-space sequentially

19
  • (b) Searching each sub-space in parallel

20
  • Question
  • What is the speed-up now?

21
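  • Worst case for sequential search when solution
    found in last sub-space search. Then parallel
    version offers greatest benefit, i.e. if the
    sequential search of the whole space takes ts
    and the solution is found a time Δt into the
    last of p sub-spaces,
    S(p) = ((p - 1)/p × ts + Δt) / Δt
    which tends to infinity as Δt tends to zero.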

22
  • Least advantage for parallel version when
    solution found in first sub-space search of the
    sequential search, i.e. both versions find the
    solution after the same time and S(p) = 1.
  • Actual speed-up depends upon which subspace holds
    solution but could be extremely large.
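
The two extremes can be illustrated with a step-counting sketch. It assumes an idealized model that is not in the slides: every comparison costs one step, the search space divides evenly into p sub-spaces, and the parallel searchers advance in lockstep with no communication cost.

```python
def sequential_search_steps(space, target):
    """Steps taken when the sub-spaces are searched one after another."""
    for steps, item in enumerate(space, start=1):
        if item == target:
            return steps
    raise ValueError("target not in search space")


def parallel_search_steps(space, target, p):
    """Steps taken when p searchers each scan one sub-space in lockstep.

    Assumes len(space) is divisible by p.
    """
    chunk = len(space) // p
    subspaces = [space[i * chunk:(i + 1) * chunk] for i in range(p)]
    for step in range(chunk):
        if any(sub[step] == target for sub in subspaces):
            return step + 1
    raise ValueError("target not in search space")


def search_speedup(space, target, p):
    return (sequential_search_steps(space, target)
            / parallel_search_steps(space, target, p))


space = list(range(1000))
print(search_speedup(space, 750, 4))  # target at start of last sub-space
print(search_speedup(space, 0, 4))    # target at start of first sub-space
```

With the target at the start of the last sub-space the speedup is far greater than p = 4; with the target at the very start of the space the parallel version gains nothing.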

23
Types of Parallel Computers
  • Two principal types
  • 1. Single computer containing multiple processors
    - main memory is shared, hence called Shared
    memory multiprocessor
  • 2. Multiple computer system

24
Conventional Computer
  • Consists of a processor executing a program
    stored in a (main) memory
  • Each main memory location located by its address
    within a single memory space.

25
Shared Memory Multiprocessor
  • Extend single processor model - multiple
    processors connected to multiple memory modules
  • Each processor can access any memory module

26
  • Examples
  • Dual Pentiums
  • Quad Pentiums

27
Programming Shared Memory Multiprocessors
  • Threads - programmer decomposes program into
    parallel sequences (threads), each being able to
    access variables declared outside threads.
    Example: Pthreads
  • Use sequential programming language with
    preprocessor compiler directives, constructs, or
    syntax to declare shared variables and specify
    parallelism. Examples: OpenMP (an industry
    standard), UPC (Unified Parallel C) -- needs
    compilers.
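
The threads model can be sketched with Python's standard `threading` module, used here as a stand-in for Pthreads. The shared counter, the lock, and the function names are illustrative. In CPython the global interpreter lock means this shows the programming model rather than a real speedup.

```python
import threading

counter = 0               # shared variable declared outside the threads
lock = threading.Lock()   # protects the shared variable


def worker(increments):
    """Each thread updates the same shared counter."""
    global counter
    for _ in range(increments):
        with lock:        # without the lock, updates could be lost
            counter += 1


def run(num_threads=4, increments=10000):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(increments,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # wait for every thread to finish
    return counter


print(run())  # 4 threads x 10000 increments each
```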

28
  • Parallel programming language with syntax to
    express parallelism. Compiler creates executable
    code -- not now common.
  • Use parallelizing compiler to convert regular
    sequential language programs into parallel
    executable code - also not now common.

29
Multiple Computers: Message-passing multicomputer
  • Complete computers connected through an
    interconnection network

30
Networked Computers as a Computing Platform
  • Became a very attractive alternative to expensive
    supercomputers and parallel computer systems for
    high-performance computing in 1990s.
  • Several early projects. Notable:
  • Berkeley NOW (network of workstations)
    project.
  • NASA Beowulf project.

31
Key Hardware Advantages
  • Very high performance workstations and PCs
    readily available at low cost.
  • Latest processors can easily be incorporated into
    the system as they become available.

32
Programming Clusters
  • Usually based upon explicit message-passing.
  • Common approach -- a set of user-level libraries
    for message passing. Examples:
  • Parallel Virtual Machine (PVM) - late 1980s.
    Became very popular in mid 1990s.
  • Message-Passing Interface (MPI) - standard
    defined in 1990s and now dominant.

33
Beowulf Clusters
  • Name given to a group of interconnected
    commodity computers designed to achieve high
    performance with low cost.
  • Typically using commodity interconnects
    (high-speed Ethernet).
  • Typically Linux OS.
  • The name Beowulf comes from the cluster project
    at NASA Goddard Space Flight Center.

34
Cluster Interconnects
  • Originally fast Ethernet on low cost clusters
  • Gigabit Ethernet - easy upgrade path
  • More Specialized/Higher Performance
  • Myrinet - 2.4 Gbits/sec - disadvantage: single
    vendor
  • Infiniband - may be important as Infiniband
    interfaces may be integrated on next generation
    PCs

35
Dedicated cluster with a master node
36
WCU Department of Mathematics and CS leo I
cluster (now dismantled)
Being replaced with Pentium IVs and Gigabit
Ethernet.
37
Message-Passing Programming using User-level
Message Passing Libraries
  • Two primary mechanisms needed
  • 1. A method of creating separate processes for
    execution on different computers
  • 2. A method of sending and receiving messages

38
Multiple Program Multiple Data Model (MPMD)
39
Single Program Multiple Data Model (SPMD)
  • Different processes merged into one program.
  • Control statements select different parts for
    each processor to execute.
  • All executables started together - static process
    creation
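
A minimal sketch of the SPMD idea: one program, with control statements on the process rank selecting what each process does. Threads stand in for separate processes here, and the names `spmd_program` and `run_spmd` are illustrative rather than part of any library.

```python
import threading


def spmd_program(rank, size, data, results):
    """The single program every process runs; rank selects the branch."""
    if rank == 0:
        # Master part of the program.
        results[rank] = f"master coordinating {size - 1} workers"
    else:
        # Worker part: each worker sums its own slice of the data.
        chunk = data[rank - 1::size - 1]
        results[rank] = sum(chunk)


def run_spmd(size, data):
    """Start all copies together (static process creation)."""
    results = [None] * size
    threads = [threading.Thread(target=spmd_program,
                                args=(rank, size, data, results))
               for rank in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_spmd(4, list(range(1, 10))))
```

With MPI the same structure appears as a single executable in which each process branches on its own rank.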

40
Single Program Multiple Data Model (SPMD)
41
Multiple Program Multiple Data Model (MPMD)
  • Separate programs for each processor.
  • One processor executes master process.
  • Other processes started from within master
    process - dynamic process creation.

42
Multiple Program Multiple Data Model (MPMD)
43
Point-to-point send and receive routines
Passing a message between processes using send()
and recv() library calls
44
Synchronous Message Passing
  • Routines that return when message transfer
    completed.
  • Synchronous send routine
  • Waits until complete message can be accepted by
    the receiving process before sending the message.
  • Synchronous receive routine
  • Waits until the message it is expecting arrives.

45
Synchronous send() and recv() using 3-way protocol
46
  • Synchronous routines intrinsically perform two
    actions
  • They transfer data and
  • They synchronize processes.
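
The synchronizing behaviour can be sketched as a toy rendezvous channel built from Python's `threading` and `queue` modules. It follows a request, acknowledgment, message sequence as one possible reading of a 3-way protocol. `SyncChannel` is a hypothetical single-use class, not a real message-passing API.

```python
import queue
import threading
import time


class SyncChannel:
    """Single-use channel for one synchronous send/recv pair."""

    def __init__(self):
        self._request = threading.Event()     # sender: request to send
        self._ack = threading.Event()         # receiver: ready to receive
        self._slot = queue.Queue(maxsize=1)   # carries the message

    def send(self, data):
        self._request.set()     # 1. request to send
        self._ack.wait()        # 2. wait for the acknowledgment
        self._slot.put(data)    # 3. transfer the message
        self._slot.join()       # return only after the receiver took it

    def recv(self):
        self._request.wait()    # wait until a sender is present
        self._ack.set()         # acknowledge: message can be accepted
        data = self._slot.get()
        self._slot.task_done()
        return data


def demo():
    channel = SyncChannel()
    log = []

    def sender():
        channel.send("x = 42")
        log.append("send returned")

    t = threading.Thread(target=sender)
    t.start()
    time.sleep(0.05)            # receiver arrives late; sender must wait
    log.append("recv called")
    received = channel.recv()
    t.join()
    return received, log


print(demo())
```

The log always shows the receive being called before the send returns, which is the synchronization the slide refers to.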

47
Asynchronous Message Passing
  • Do not wait for actions to complete before
    returning.
  • More than one version depending upon semantics
    for returning.
  • Usually require local storage for messages.
  • They do not synchronize processes and allow
    processes to move forward sooner. Must be used
    with care.

48
MPI Definitions of Blocking and Non-Blocking
  • Blocking - return after their local actions
    complete, though message transfer may not have
    been completed.
  • Non-blocking - return immediately.
  • Assumes data storage not modified by subsequent
    statements prior to being used for transfer, and
    it is left to the programmer to ensure this.

49
How message-passing routines return before
transfer completed
Message buffer needed between source and
destination to hold message
50
Asynchronous (blocking) routines changing to
synchronous routines
  • Buffers only of finite length and a point could
    be reached when send routine held up because all
    available buffer space exhausted.
  • Then, send routine will wait until storage
    becomes available again - i.e. the routine then
    behaves as a synchronous routine.
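
The effect of a finite message buffer can be sketched with a bounded `queue.Queue` standing in for the buffer. The helper `try_send` and the buffer size of 2 are illustrative assumptions.

```python
import queue


def try_send(buffer, message):
    """Return True if the message fits in the buffer and the send could
    return at once, False if a blocking send would have to wait."""
    try:
        buffer.put_nowait(message)
        return True
    except queue.Full:
        return False


buffer = queue.Queue(maxsize=2)   # finite buffer space

results = [try_send(buffer, m) for m in ("a", "b", "c")]
print(results)                    # the third send finds the buffer full

first = buffer.get_nowait()       # receiver takes one message
retry = try_send(buffer, "c")     # space is available again
print(first, retry)
```

A blocking send that used `buffer.put(message)` instead would simply wait at the third message until the receiver freed a slot.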

51
Message Tag
  • Used to differentiate between different types of
    messages being sent.
  • Message tag is carried within message.

52
Message Tag Example
To send data x with message tag 5 from source
process 1 to destination process 2, and assign it
to y
53
Wild Card
  • If message tag matching not required, wild card
    message tag used.
  • Then, recv() will match with any send().
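
Tag matching and the wild card can be sketched with a toy in-memory model. The `Process` class, its `send`/`recv` signatures, and the `ANY_TAG` sentinel are hypothetical and only mimic the style of a message-passing library. Here `recv` raises an error instead of waiting when nothing matches.

```python
ANY_TAG = object()   # wild card: match a message with any tag


class Process:
    def __init__(self, rank, world):
        self.rank = rank
        self.world = world
        self.inbox = []            # pending (source, tag, data) messages
        world[rank] = self

    def send(self, data, dest, tag):
        # The tag travels with the message.
        self.world[dest].inbox.append((self.rank, tag, data))

    def recv(self, source, tag=ANY_TAG):
        for i, (src, msg_tag, data) in enumerate(self.inbox):
            if src == source and (tag is ANY_TAG or msg_tag == tag):
                del self.inbox[i]
                return data
        raise LookupError("no matching message")


world = {}
p1 = Process(1, world)
p2 = Process(2, world)

x = 3.14
p1.send(x, dest=2, tag=5)        # process 1 sends x with tag 5
y = p2.recv(source=1, tag=5)     # process 2 receives it into y

p1.send("status", dest=2, tag=7)
anything = p2.recv(source=1)     # wild card tag matches the tag-7 message
print(y, anything)
```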

54
Collective Message Passing Routines
  • Have routines that send message(s) to a group of
    processes or receive message(s) from a group of
    processes
  • Higher efficiency than separate point-to-point
    routines although not absolutely necessary.

55
Broadcast
Sending same message to all processes concerned
with problem.
56
Scatter
Sending each element of an array in root process
to a separate process. Contents of ith location
of array sent to ith process.
57
Gather
Having one process collect individual values from
set of processes.
58
Reduce
Gather operation combined with arithmetic/logical
operation. Example: Values gathered and added
together
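
The data movement of the four collective routines can be sketched with plain lists, where index i holds what process i has. These functions only model the semantics and are not a real message-passing library.

```python
import functools
import operator


def broadcast(value, num_processes):
    """Every process ends up with the same value from the root."""
    return [value for _ in range(num_processes)]


def scatter(array, num_processes):
    """The ith element of the root's array goes to the ith process."""
    if len(array) != num_processes:
        raise ValueError("need exactly one element per process")
    return list(array)


def gather(local_values):
    """The root collects one value from each process, in rank order."""
    return list(local_values)


def reduce_values(local_values, op=operator.add):
    """Gather combined with an arithmetic/logical operation."""
    return functools.reduce(op, local_values)


per_process = scatter([10, 20, 30, 40], 4)
print(broadcast("go", 4))
print(gather(per_process))
print(reduce_values(per_process))               # values added together
print(reduce_values(per_process, op=max))       # another combining op
```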
59
Grid Computing
  • A grid is a form of multiple computer system.
  • For solving computational problems, it could be
    viewed as the next step after cluster computing,
    and the same programming techniques used.

Why is this not necessarily true?
60
  • Communication is VERY expensive: sending data
    across the network costs millions of cycles
  • Links unreliable
  • Bandwidth shared with other users
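
The "millions of cycles" figure can be made concrete with a back-of-envelope sketch. The latencies and the 2 GHz clock rate below are assumed round numbers chosen for illustration, not measurements from the slides.

```python
def cycles_lost(latency_seconds, clock_hz):
    """Processor cycles that elapse while one message is in flight."""
    return latency_seconds * clock_hz


CLOCK_HZ = 2e9   # assumed 2 GHz processor

# Assumed one-way latencies: cluster interconnect vs. wide-area link.
cluster = cycles_lost(100e-6, CLOCK_HZ)   # 100 microseconds
grid = cycles_lost(50e-3, CLOCK_HZ)       # 50 milliseconds

print(f"cluster: {cluster:.0f} cycles, grid: {grid:.0f} cycles")
```

Under these assumptions a single wide-area message costs on the order of 10^8 cycles, hundreds of times more than on a local cluster, which is why a grid favors problems with very little communication.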

61
Computational Strategies
  • As a computing platform, a grid favors situations
    with absolute minimum communication between
    computers.
  • Next class will look at these strategies and
    details of MPI programming.