Time Warp on the Blue Gene

Provided by: akintay

1
Time Warp on the Blue Gene
  • Analysis of Time Warp on the Blue Gene
    Supercomputer.

2
Overview
  • Port the ROSS kernel from shared memory to the
    Blue Gene/L
  • Compare the performance of ROSS-MPI with ROSS-SMP
    on a shared memory platform
  • Investigate the performance of ROSS-MPI on the
    Blue Gene/L

3
Motivation
  • Distributed memory computers scale to larger
    sizes than shared memory computers.
  • Models exist that are beyond the capability of
    shared memory computers.
  • The Internet contains 500,000,000 hosts (ISC).
    Modelling the Internet on shared memory often
    requires compromises in model design.

4
Background
  • Time Warp is an optimistic synchronisation
    protocol for parallel discrete event simulation.
  • ROSS is an efficient Time Warp implementation.
  • Blue Gene/L is a distributed memory
    supercomputer, capable of 500 TFLOPS.

5
Time Warp Architecture, Recap
  • Logical Processes (LPs) are tasks, the basic
    unit of work.
  • Processor Elements (PEs) are mapped to
    processors.
  • Global Virtual Time (GVT) is the earliest
    timestamp of any unprocessed or in-transit event
    in the system.
  • Events older than GVT can be removed from memory.
  • Events that execute out of order can be rolled
    back.
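
The rollback and memory-reclamation rules above can be sketched in a few lines of Python. This is an illustrative toy, not ROSS code: the class name `LogicalProcess`, its methods, and the order-dependent state update are all assumptions made for the example, and it uses simple state saving to undo events rather than whatever mechanism the ROSS kernel uses.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy LP (hypothetical). Its state update depends on event order."""
    state: int = 0
    lvt: float = 0.0                               # local virtual time
    processed: list = field(default_factory=list)  # (timestamp, payload, state_before)
    rollbacks: int = 0

    def _apply(self, timestamp, payload):
        # Save the state before the event so the event can be undone later.
        self.processed.append((timestamp, payload, self.state))
        self.state = self.state * 2 + payload
        self.lvt = timestamp

    def execute(self, timestamp, payload):
        """Process an event optimistically, rolling back first if it is a straggler."""
        undone = []
        if timestamp < self.lvt:
            undone = self.rollback(timestamp)
        self._apply(timestamp, payload)
        for ts, pl in undone:                      # re-execute in timestamp order
            self._apply(ts, pl)

    def rollback(self, timestamp):
        """Undo every processed event later than `timestamp` and return them."""
        self.rollbacks += 1
        undone = []
        while self.processed and self.processed[-1][0] > timestamp:
            ts, pl, state_before = self.processed.pop()
            self.state = state_before
            undone.append((ts, pl))
        self.lvt = self.processed[-1][0] if self.processed else 0.0
        undone.reverse()
        return undone

    def fossil_collect(self, gvt):
        """Drop saved history older than GVT; it can never be rolled back."""
        self.processed = [e for e in self.processed if e[0] >= gvt]


lp = LogicalProcess()
lp.execute(1.0, 1)
lp.execute(3.0, 3)
lp.execute(2.0, 2)      # straggler: the event at t=3.0 is undone and re-executed
lp.fossil_collect(2.5)  # history before GVT=2.5 is discarded
```

After the straggler arrives, the LP ends in the same state it would have reached had the three events been executed in timestamp order.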

6
Approach
  • Implement the Processor Elements (PEs) as
    separate MPI tasks.
  • Split or remove global data structures.
  • Replace Fujimoto's GVT algorithm with an
    efficient distributed memory algorithm.

7
Remote Events
  • Processor Elements (PEs) are mapped to MPI tasks.
  • Events between Processor Elements are sent as
    MPI messages, called remote events.
  • Remote events increase memory consumption.
  • Remote events lead to ROSS-MPI consuming memory
    during GVT computation.
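
A minimal sketch of the local/remote split described above, in plain Python with no MPI. The names `ProcessorElement`, `schedule`, `receive`, and `flush` are hypothetical, and the `outbox` queue merely stands in for MPI send buffers. It is meant only to show why remote events hold extra memory until they are delivered.

```python
from collections import deque


class ProcessorElement:
    """Toy PE (hypothetical): owns some LPs and buffers events bound for other PEs."""

    def __init__(self, rank, lp_to_pe):
        self.rank = rank
        self.lp_to_pe = lp_to_pe   # maps LP id -> rank of the owning PE
        self.local_queue = []      # events for LPs hosted on this PE
        self.outbox = deque()      # (dest_rank, event) pairs awaiting transmission
        self.sent = 0
        self.received = 0

    def schedule(self, dest_lp, timestamp):
        event = (timestamp, dest_lp)
        dest_rank = self.lp_to_pe[dest_lp]
        if dest_rank == self.rank:
            self.local_queue.append(event)          # local event: no message needed
        else:
            self.outbox.append((dest_rank, event))  # remote event: buffered copy
            self.sent += 1

    def receive(self, event):
        self.local_queue.append(event)
        self.received += 1


def flush(pes):
    """Stand-in for the network: deliver every buffered remote event."""
    for pe in pes:
        while pe.outbox:
            dest_rank, event = pe.outbox.popleft()
            pes[dest_rank].receive(event)


lp_to_pe = {0: 0, 1: 0, 2: 1, 3: 1}
pes = [ProcessorElement(0, lp_to_pe), ProcessorElement(1, lp_to_pe)]
pes[0].schedule(dest_lp=1, timestamp=1.0)   # stays on PE 0
pes[0].schedule(dest_lp=2, timestamp=2.0)   # becomes a remote event for PE 1
buffered_before_flush = len(pes[0].outbox)
flush(pes)
```

The `sent` and `received` counters are kept because a count of outstanding messages is exactly what a GVT computation needs to know.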

8
GVT Computation Algorithm
  • Obtains a consistent cut by performing a global
    sum over the number of outstanding messages.
  • The transient message problem is avoided, as
    there are no outstanding messages when GVT is
    computed.
  • The simultaneous reporting problem is avoided, as
    GVT computation is delayed until all messages
    are received.
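
The idea above can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the ROSS-MPI implementation: the function `compute_gvt`, the dictionary layout of each PE, and the `network` list are invented for the example. In an MPI program the two reductions would presumably map onto `MPI_Allreduce` with `MPI_SUM` and `MPI_MIN`.

```python
def compute_gvt(pes, network):
    """Return GVT for a list of simulated PEs (hypothetical data layout).

    pes:     list of dicts with keys 'sent', 'received', 'pending'
             ('pending' is a list of timestamps of unprocessed events)
    network: list of (dest_index, timestamp) messages still in flight;
             assumed to contain every message counted as outstanding
    """
    while True:
        # Stand-in for a global sum over every PE's message counters.
        outstanding = sum(pe["sent"] - pe["received"] for pe in pes)
        if outstanding == 0:
            break
        # Messages are still in transit: deliver one, then sum again.
        dest, timestamp = network.pop(0)
        pes[dest]["pending"].append(timestamp)
        pes[dest]["received"] += 1
    # Stand-in for a global minimum over every PE's earliest pending event.
    return min(min(pe["pending"], default=float("inf")) for pe in pes)


pes = [
    {"sent": 1, "received": 0, "pending": [5.0]},
    {"sent": 0, "received": 0, "pending": [7.0]},
]
network = [(1, 3.0)]   # a transient message with a timestamp below every local minimum
gvt = compute_gvt(pes, network)
```

Had the minimum been taken while the message was still in flight, the result would have been 5.0, which is too high. Waiting for the outstanding count to reach zero yields 3.0.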

9
Workloads, PCS and PHOLD
  • PHOLD, a synthetic benchmark that exercises the
    performance of Time Warp. Workload is due to
    event scheduling. Characterised by a random
    communication pattern.
  • PCS, a model of a mobile phone network. An
    example of a typical model.
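
For readers unfamiliar with PHOLD, the following sequential Python sketch shows the general shape of the workload. It is an assumption-laden illustration rather than the benchmark used in the talk: `run_phold` and its parameters are hypothetical, timestamp increments are drawn from an exponential distribution with mean 1, and "remote" here simply means "sent to a different LP" because the mapping of LPs onto PEs is omitted.

```python
import heapq
import random


def run_phold(num_lps, events_per_lp, end_time, remote_fraction=1.0, seed=0):
    """Sequential PHOLD-style loop (hypothetical). Returns (processed, remote, population)."""
    rng = random.Random(seed)
    pending = []                                   # min-heap of (timestamp, lp)
    for lp in range(num_lps):
        for _ in range(events_per_lp):
            heapq.heappush(pending, (rng.expovariate(1.0), lp))

    processed = remote = 0
    while pending and pending[0][0] < end_time:
        now, lp = heapq.heappop(pending)
        processed += 1
        # Each processed event schedules exactly one new event, so the
        # event population stays constant; the work is the scheduling itself.
        if num_lps > 1 and rng.random() < remote_fraction:
            dest = rng.choice([i for i in range(num_lps) if i != lp])
            remote += 1
        else:
            dest = lp
        heapq.heappush(pending, (now + rng.expovariate(1.0), dest))
    return processed, remote, len(pending)


processed, remote, population = run_phold(
    num_lps=8, events_per_lp=4, end_time=10.0, remote_fraction=0.1, seed=1
)
```

Lowering `remote_fraction` reduces the share of events sent to another LP, which is the knob a modified PHOLD would turn to limit remote traffic.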

10
Performance under SMP

11
Performance under SMP
  • All events were randomly scheduled at another LP.
  • ROSS-MPI's performance decreased with processor
    count.

12
Performance under SMP
  • Performance improvement is sub-linear with
    increasing processor count.

13
PHOLD, Scaling

14
PHOLD, Scaling
  • With a population of 8 million events, peak
    performance of 300 million events/sec with 4096
    processors.
  • With a population of 16 million events,
    performance of 500 million events/sec with 8192
    processors.
  • PHOLD was modified so that only 10 percent of
    events are remote.
  • Results are preliminary.
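
As a rough reading of the two data points above, the per-processor event rate can be worked out directly. The helper `per_processor_rate` is hypothetical. Note that the event population doubles along with the processor count, so this is not a fixed-size speedup measurement.

```python
def per_processor_rate(events_per_sec, processors):
    """Average event rate contributed by each processor."""
    return events_per_sec / processors


rate_4096 = per_processor_rate(300e6, 4096)   # 73242.1875 events/sec per processor
rate_8192 = per_processor_rate(500e6, 8192)   # 61035.15625 events/sec per processor

# Doubling the processors raised total throughput by 500/300, about 1.67x,
# so each processor delivers about 5/6 of its previous rate.
throughput_gain = 500e6 / 300e6
relative_efficiency = rate_8192 / rate_4096
```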

15
Conclusion
  • ROSS-MPI can simulate larger problems than
    ROSS-SMP.
  • ROSS-SMP is noticeably faster than ROSS-MPI on
    shared memory.
  • The scalability and performance of ROSS-MPI on
    the Blue Gene/L is encouraging.

16
Summary
  • Port the ROSS kernel to the Blue Gene/L
  • Compare the performance of ROSS-MPI with ROSS-SMP
    on a shared memory platform
  • Investigate the performance of ROSS-MPI on the
    Blue Gene/L