JavaTile: CMP-simulation with a twist

1
JavaTile: CMP-simulation with a twist
  • Dan Greenfield
  • Computer Architecture Group

Internal Presentation, 16th February 2007
2
Aim of Talk
  • Introduce JavaTile
  • Show benefits and problems of approach
  • Spark interest in collaboration
  • Invite expertise from multiple areas to solve CMP
    problems

3
Quick Background: Exciting Times!
  • Cisco 188-core (50 BIPS) [2]
  • Intel 80-core (1 TFLOPS) [1]
4
Parts of a CMP
  • Q: How well does each of the components run?
  • Q: How well does the network run?

From Pestana et al. 2004 [3]
5
Parts of a CMP continued
  • Real Q: How well do applications run?

6
Motivations
  • Need more realistic NoC traffic
  • Current methods: synthetic, limited applications,
    low PE count, coarse-grain, OO Superscalar
    internals
  • How is the network used?
  • What is needed in NoC for future CMP?
  • Want System-level view of performance, power and
    fault-tolerance
  • Most current metrics concern the NoC and 'guess'
    what this means at the system level
  • Want to explore solutions at all levels

7
Some Existing CMP Approaches
  • SimpleScalar-based CMP simulator
  • Hydra 4 MIPS-core CMP simulator
  • CMP-SIM (extension of SimpleScalar)
  • SESC Superscalar (1.5MIPS on 3GHz P4)
  • GEMS (based on the commercial Simics)
  • ML-RSIM (Sparc RSIM-based)

8
Java Virtual Machine
  • Platform with standard library
  • Virtual processor executing the Java instruction
    set ('bytecode')
  • Compilable to native platform

9
Java Advantages
  • A widely deployed standard platform
  • Its 'machine code' is itself Object Oriented with
    type information
  • Amenable to static code analysis
  • Tools to run efficiently, or compile to native
    executable
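
As a small illustration of the static-analysis point, the sketch below scans a javap-style disassembly and flags instructions that touch objects, statics, or other methods. The class name HeapAccessScan, the input format, and above all the choice of which mnemonics count as "may leave the local tile" are assumptions made for this example, not something the slides specify. Array loads and stores are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of JavaTile.
final class HeapAccessScan {
    // Assumption: these mnemonics are the ones treated as potentially
    // reaching beyond the local processing element.
    private static final Set<String> FLAGGED_OPS = new HashSet<>(Arrays.asList(
        "getfield", "putfield", "getstatic", "putstatic",
        "invokevirtual", "invokespecial", "invokestatic", "invokeinterface",
        "new", "newarray", "anewarray", "monitorenter", "monitorexit"));

    private HeapAccessScan() {}

    // Each line is expected as "<offset>: <mnemonic> [operands]".
    // Lines that do not match that shape are ignored.
    static List<String> flag(List<String> listing) {
        List<String> flagged = new ArrayList<>();
        for (String line : listing) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && FLAGGED_OPS.contains(parts[1])) {
                flagged.add(line.trim());
            }
        }
        return flagged;
    }
}
```

Because the mnemonic alone identifies the kind of access, a pass like this needs no knowledge of the program's control flow, which is part of what makes bytecode convenient to analyse.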

10
JavaTile Processing Element
11
JavaTile System
12
Bytecode Instrumentation
  • Hook into all instructions that may cause NoC
    traffic

Fibonacci2()
  Code:
     0: bipush 0
     2: bipush -33
     4: invokestatic #23    //Method monitor/Monitor.methodStart:(II)V
     7: sipush -29729
    10: sipush 0
    13: invokestatic #26    //Method monitor/Monitor.jumpMarker:(II)V
    16: aload_0
    17: sipush 1
    20: invokestatic #30    //Method monitor/Monitor.syncCycleCount:(I)V
    23: invokespecial #32   //Method java/lang/Object."<init>":()V
    26: sipush -29729
    29: sipush 4
    32: invokestatic #35    //Method monitor/Monitor.postMethodCall:(II)V
    35: return
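
To make the hook calls in the listing easier to follow, here is a minimal sketch of what a receiving monitor class could look like. Only the four method names and their descriptors ((II)V or (I)V) come from the listing. The parameter names, the meaning given to each argument, the method bodies, and the helper class HookReplay are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for monitor.Monitor; bodies are assumed.
final class Monitor {
    private static long cycles = 0;
    private static final List<String> events = new ArrayList<>();

    private Monitor() {}

    // (II)V in the listing; argument meanings are assumed.
    static synchronized void methodStart(int a, int b) {
        events.add("methodStart " + a + " " + b);
    }

    // (II)V in the listing; assumed to mark a position in the code.
    static synchronized void jumpMarker(int a, int b) {
        events.add("jumpMarker " + a + " " + b);
    }

    // (I)V in the listing; assumed to add a cycle cost to a running total.
    static synchronized void syncCycleCount(int delta) {
        cycles += delta;
    }

    // (II)V in the listing; assumed to record the return from a call.
    static synchronized void postMethodCall(int a, int b) {
        events.add("postMethodCall " + a + " " + b);
    }

    static synchronized long cycles() { return cycles; }
    static synchronized List<String> events() { return new ArrayList<>(events); }
    static synchronized void reset() { cycles = 0; events.clear(); }
}

// Replays the hook sequence of the instrumented constructor, using the
// constants that the listing pushes before each call.
final class HookReplay {
    private HookReplay() {}

    static void replayConstructor() {
        Monitor.methodStart(0, -33);      // offsets 0-4
        Monitor.jumpMarker(-29729, 0);    // offsets 7-13
        Monitor.syncCycleCount(1);        // offsets 17-20
        // offset 23: the original invokespecial of Object.<init> runs here
        Monitor.postMethodCall(-29729, 4); // offsets 26-32
    }
}
```

Keeping the hooks as static methods with int arguments matches the listing: each call costs only a few constant pushes and one invokestatic at the instrumented site.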
13
Current Flow
14
Problems
  • Garbage Collection
  • Local memory vs global memory allocation
  • Passing by pointers (ownership)
  • Push versus Pull
  • No Inlining
  • Auto-Parallelization
  • Debugging

15
Auto-Parallelization
  • Software Pipelining
      • e.g. MIT RAW Compiler [4]
      • e.g. Princeton DSWP (Decoupled SWP) [5]
  • Thread-Level Speculation
      • Loop-level (e.g. Stanford Jrpm) [6]
      • Method-level (e.g. SableSpMT) [7]
  • Affine Partitioning
      • e.g. Incorporated in Stanford SUIF [8]
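
For readers unfamiliar with the pipelining idea, the sketch below splits one loop into two stages that run on separate threads and communicate through a bounded queue. It is written by hand and is not the output of the RAW compiler or of DSWP. The class name PipelineSketch, the sum-of-squares workload, the queue capacity, and the sentinel value are all assumptions chosen for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical two-stage pipeline; not taken from JavaTile.
final class PipelineSketch {
    // Sentinel marking end of stream; assumed never to occur in the input.
    private static final int END = Integer.MIN_VALUE;

    private PipelineSketch() {}

    static long sumOfSquares(int[] input) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        long[] result = new long[1];

        // Stage 1: walks the input and forwards each element.
        Thread producer = new Thread(() -> {
            try {
                for (int v : input) {
                    queue.put(v);
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 2: does the arithmetic on whatever stage 1 has sent so far.
        Thread consumer = new Thread(() -> {
            try {
                long sum = 0;
                for (int v = queue.take(); v != END; v = queue.take()) {
                    sum += (long) v * v;
                }
                result[0] = sum;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            // join() makes the consumer's write to result[0] visible here.
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stages", e);
        }
        return result[0];
    }
}
```

The queue is the only point of contact between the stages, so on a tiled CMP it is the part that would turn into network traffic.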

16
References
  • [1] Intel Polaris, from IDF 2006 slides, photo at
    http://www.tomshardware.com
  • [2] W. Eatherton, The Push of Network Processing
    to the Top of the Pyramid, Keynote Slides at
    http://www.cesr.ncsu.edu/ancs/slides/eatherton/Keynote.pdf
  • [3] Pestana et al., Cost-Performance Trade-Offs in
    Networks on Chip: A Simulation-Based Approach,
    DATE 2004
  • [4] Waingold et al., Baring It All to Software:
    Raw Machines, Computer Vol 30, No 9, 1997
  • [5] Ottoni et al., Automatic Thread Extraction
    with Decoupled Software Pipelining, MICRO 2005
  • [6] Chen et al., The Jrpm System for Dynamically
    Parallelizing Sequential Java Programs, IEEE
    Micro Vol 23, No 6, Nov/Dec 2003
  • [7] Pickett et al., SableSpMT: a software
    framework for analysing speculative
    multithreading in Java, PASTE Workshop 2006
  • [8] Lim et al., An affine partitioning algorithm
    to maximize parallelism and minimize
    communications, ACM SIGARCH 1999