UltraSparc IV - PowerPoint PPT Presentation

1
UltraSparc IV
  • Tolga TOLGAY

2
OUTLINE
  • Introduction
  • History
  • What is new?
  • Chip Multithreading
  • Pipeline
  • Cache
  • Branch Prediction
  • Conclusion

3
INTRODUCTION
  • SPARC = Scalable Processor Architecture
  • Open processor architecture
  • Sun UltraSparc implements the SPARC V9 architecture
  • RISC architecture
  • 64-bit addresses and data
  • Superscalar

4
HISTORY
  • SPARC development begins: 1984
  • First SPARC processor: 1986
  • SuperSparc: 1992
  • UltraSparc I: 1995
  • UltraSparc II: 1997
  • UltraSparc III: 2001
  • UltraSparc IV: 2004
  • UltraSparc IV+: 2005
  • UltraSparc T1: 2005

5
WHAT IS NEW?
  • What is new in UltraSparc IV
  • CMT (Chip Multithreading)
  • New registers added for the CMT enhancement
  • MCU registers and Sun Fireplane Interconnect
    registers are shared.
  • Floating-point unit enhancements
  • 16 MB L2 cache with 128-byte line size, shared by
    two processors.
  • L2 cache uses an LRU replacement strategy
  • New write-cache index-hashing feature

6
Chip Multithreading (CMT)
  • Two UltraSparc III cores on one die.
  • The two mirrored cores share:
      • System bus
      • DRAM controller
      • Off-die L2 cache
      • Fireplane registers
  • Also called Chip Multiprocessing

7
Chip Multithreading
8
Chip Multithreading
  • The aim is to increase performance without
    increasing clock speed.
  • Mirroring the cores creates a hot spot where the
    floating-point units sit next to each other.
  • How to avoid the hot spot:
      • Heat towers in the copper interconnect

9
Chip Multithreading
10
Core
  • Further core improvements
  • Improved instruction fetch and store bandwidth
  • Improved data prefetching
  • FPU can handle more unexpected and underflow
    cases, reducing exceptions
  • On-die write cache enhanced with a hashed index to
    better handle multiple writes

11
Pipeline
  • Because UltraSparc IV contains two UltraSparc III
    cores, it uses the same pipeline.
  • 4-way superscalar architecture.
  • 14-stage pipeline.

12
Pipeline Stages
13
Pipeline Stages
Stage  Definition
A      Address Generation
P      Preliminary Fetch
F      Fetch Instructions from I-Cache
B      Branch Target Computation
I      Instruction Group Formation
J      Instruction Group Staging
R      Register Access
E      Execute
C      Cache
M      Miss Detect
W      Write
X      Extend
T      Trap
D      Done
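
To make the stage order concrete, the sketch below steps one instruction through the fourteen stages listed above, one stage per cycle. The helper name `stage_at` and the assumption of an ideal pipeline with no stalls are illustrative and do not come from the slides.

```python
# Stage letters in pipeline order, as listed in the table above.
STAGES = ["A", "P", "F", "B", "I", "J", "R", "E", "C", "M", "W", "X", "T", "D"]


def stage_at(issue_cycle, cycle):
    """Return the stage an instruction occupies at `cycle`, or None.

    Assumes an ideal pipeline: one stage per cycle, no stalls or replays.
    """
    offset = cycle - issue_cycle
    if 0 <= offset < len(STAGES):
        return STAGES[offset]
    return None


if __name__ == "__main__":
    # An instruction whose fetch address is generated in cycle 0
    # reaches Execute in cycle 7 and Done in cycle 13.
    for cycle in (0, 7, 13):
        print(cycle, stage_at(0, cycle))
```

Under these assumptions the result of an instruction becomes architecturally visible thirteen cycles after its address is generated.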
14
Pipeline Stages
15
Pipeline Stages
  • Stage A: Address Generation
      • Generates and selects the fetch address
      • Address can be selected from several sources
  • Stage P: Preliminary Fetch
      • Starts fetching from the I-Cache
      • Accesses the branch predictor
  • Stage F: Fetch
      • Second half of the I-Cache access
      • At the end of the stage, up to four instructions
        may be latched
  • Stage B: Branch Target Computation
      • Analyzes the instructions
      • Calculates the branch target address

16
Pipeline Stages
  • Stage I: Instruction Group Formation
      • Instructions are grouped into the instruction queue
  • Stage J: Instruction Group Staging
      • A group of instructions is dequeued and sent to
        the R stage
  • Stage R: Dispatch and Register Access
      • Dependency calculation
      • Dependency resolution

17
Pipeline Stages
  • Stage E: Integer Instruction Execution
      • First stage of the execution pipelines
      • Integer instructions -> A0 and A1 pipelines
      • Branch instructions -> Branch pipeline
      • Other instructions -> MS pipeline
  • Stage C: Cache
      • Integer pipelines write results back
      • SIU results are produced
      • First stage for floating-point instructions

18
Pipeline Stages
  • Stage M: Miss
      • Data cache misses are determined
      • Second stage for floating-point instructions
  • Stage W: Write
      • MS pipeline results are written
      • Third stage for floating-point instructions
      • D-cache miss requests are sent to the L2 cache
  • Stage X: Extend
      • Final stage for floating-point instructions
      • Results from floating-point instructions are
        ready for bypass

19
Pipeline Stages
  • Stage T: Trap
      • Traps are signalled
      • After a trap, the results of the affected
        instructions are invalidated
  • Stage D: Done
      • Integer results are written into the architectural
        register file
      • Floating-point results are written to the
        floating-point register file
      • Results become visible to any traps generated
        by younger instructions

20
Pipeline Rules
  • Grouping rules
      • A group is a collection of instructions that do
        not prevent each other from executing in parallel
      • Groups are formed before the R stage
  • Grouping is needed because
      • The execution order must be maintained
      • Each pipeline runs a subset of instructions
      • Instructions may require helpers
  • Execution order: in-order execution
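
The grouping rule can be sketched as a greedy, in-order pass: an instruction joins the current group unless the group is full or the instruction reads or writes a register that an earlier member of the group writes. This is a simplified illustration only. The tuple format, the name `form_groups`, and the register-only dependency check are assumptions; the real rules also depend on which pipeline (A0, A1, Branch, MS) each instruction needs.

```python
GROUP_WIDTH = 4  # 4-way superscalar: at most four instructions per group


def form_groups(instructions, width=GROUP_WIDTH):
    """Split an in-order instruction list into dependency-free groups.

    Each instruction is a (name, dest_reg, src_regs) tuple; dest_reg may be None.
    """
    groups, current, written = [], [], set()
    for name, dest, srcs in instructions:
        depends = any(s in written for s in srcs) or dest in written
        if current and (len(current) == width or depends):
            groups.append(current)       # close the group, keep program order
            current, written = [], set()
        current.append(name)
        if dest is not None:
            written.add(dest)
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    program = [
        ("add", "r1", ["r2", "r3"]),
        ("sub", "r4", ["r5", "r6"]),
        ("and", "r7", ["r1", "r4"]),   # reads r1 and r4, so it starts a new group
        ("or",  "r8", ["r2", "r5"]),
    ]
    print(form_groups(program))
```

Because groups are closed rather than reordered, program order is preserved, which matches the in-order execution noted on the slide.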

21
Cache Organization
  • Total L1 cache size is doubled because of the dual cores.
  • Data Cache: 64 KB x 2
  • Instruction Cache: 32 KB x 2
  • L2 Cache: 16 MB, off-chip, shared
  • No L3 cache

22
Cache Organization
23
Cache Organization
  • Data Cache
      • 64 KB Level 1 cache per core
  • Instruction Cache
      • 32 KB Level 1 cache per core
      • 4-way set associative

24
Cache Organization
  • Prefetch Cache
      • One of the L1 caches
      • 2 KB SRAM: 32 entries x 64 bytes
      • Uses the LRU replacement algorithm
      • The aim is to fetch data before it is needed
      • Reduces main memory access latency
      • Two read ports of 8 bytes each and one write port
        of 16 bytes per cycle
      • Hardware prefetch
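
A minimal model of LRU replacement using the prefetch cache's geometry (32 lines of 64 bytes) is sketched below. Treating the cache as fully associative is an assumption made to keep the example short, since the slides do not state the associativity, and the class name `LruLineCache` is hypothetical.

```python
from collections import OrderedDict

LINE_SIZE = 64    # bytes per line (from the slide)
NUM_LINES = 32    # 32 x 64 bytes = 2 KB


class LruLineCache:
    """Fully associative line cache with least-recently-used eviction."""

    def __init__(self, num_lines=NUM_LINES, line_size=LINE_SIZE):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()   # line number -> True, oldest first

    def access(self, address):
        """Touch `address`; return True on a hit, False on a miss."""
        line = address // self.line_size
        if line in self.lines:
            self.lines.move_to_end(line)     # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)   # evict the least recently used
        self.lines[line] = True
        return False


if __name__ == "__main__":
    cache = LruLineCache()
    print(cache.access(0x1000))   # miss: line not yet present
    print(cache.access(0x1008))   # hit: same 64-byte line
```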

25
Cache Organization
  • Write Cache
      • Reduces the bandwidth consumed by store traffic
      • 2 KB cache
      • Handles multiprocessor and on-chip cache
        consistency
      • Improves error recovery
      • Optionally uses a hashed index
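
The slides do not give the hash function used by the write cache, so the following is only a sketch of why a hashed index helps. It assumes 64-byte lines, 32 index positions, and a simple XOR fold of address bits; all three are illustrative choices, not the documented hardware behaviour.

```python
LINE_SIZE = 64     # assumed line size in bytes
NUM_SETS = 32      # assumed: 2 KB / 64 B, one line per index
INDEX_BITS = 5     # log2(NUM_SETS)


def plain_index(address):
    """Index taken directly from the low bits of the line number."""
    return (address // LINE_SIZE) % NUM_SETS


def hashed_index(address):
    """Index formed by XOR-ing the low index bits with the next-higher bits."""
    line = address // LINE_SIZE
    low = line % NUM_SETS
    high = (line >> INDEX_BITS) % NUM_SETS
    return low ^ high


if __name__ == "__main__":
    # Stores that are exactly 2 KB apart collide under plain indexing
    # but land on different indices once the upper bits are folded in.
    addresses = [i * LINE_SIZE * NUM_SETS for i in range(8)]
    print(sorted({plain_index(a) for a in addresses}))
    print(sorted({hashed_index(a) for a in addresses}))
```

With plain indexing, the eight strided stores compete for one entry; with the folded index they occupy eight different entries, which is the kind of conflict reduction the hashed-index option is aimed at.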

26
Cache Organization
  • L2 Cache
      • 16 MB SRAM shared by two processors
      • Separate L2 cache tags
      • Two-way set associative
      • LRU replacement policy
      • 128-byte line size
  • UltraSparc IV+ has an on-die Level 2 cache with
    an off-die Level 3 cache
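
The UltraSparc IV L2 parameters on this slide (16 MB, two-way set associative, 128-byte lines) determine the number of lines, the number of sets, and how an address splits into offset and index bits. The helper name `cache_geometry` is illustrative.

```python
CACHE_BYTES = 16 * 1024 * 1024   # 16 MB
WAYS = 2                         # two-way set associative
LINE_BYTES = 128                 # 128-byte lines


def cache_geometry(cache_bytes, ways, line_bytes):
    """Return (lines, sets, offset_bits, index_bits) for a set-associative cache.

    Assumes all sizes are powers of two.
    """
    lines = cache_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1   # log2 for powers of two
    index_bits = sets.bit_length() - 1
    return lines, sets, offset_bits, index_bits


if __name__ == "__main__":
    print(cache_geometry(CACHE_BYTES, WAYS, LINE_BYTES))
    # prints (131072, 65536, 7, 16)
```

So the cache holds 131,072 lines in 65,536 sets, with 7 offset bits and 16 index bits per address.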

27
Branch Prediction
  • Branch Predictor
      • Small SRAM, accessed in a single cycle
      • Output is connected to the P stage
      • Branch determination is made in the B stage
      • On a misprediction, fetch returns to the A stage
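
The slides describe the predictor only as a small single-cycle SRAM. As a generic illustration of what such a table can hold, the sketch below uses 2-bit saturating counters indexed by low program-counter bits. The counter scheme, the table size, and the class name `TwoBitPredictor` are assumptions, not details taken from the presentation.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low PC bits."""

    def __init__(self, entries=1024):
        self.entries = entries
        self.table = [1] * entries   # 0-1 predict not taken, 2-3 predict taken

    def _index(self, pc):
        return (pc >> 2) % self.entries   # SPARC instructions are 4 bytes wide

    def predict(self, pc):
        """Return True if the branch at `pc` is predicted taken."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


if __name__ == "__main__":
    bp = TwoBitPredictor()
    pc = 0x4000
    print(bp.predict(pc))    # initial state is weakly not taken
    bp.update(pc, True)
    print(bp.predict(pc))    # one taken outcome flips the prediction
```

In pipeline terms, `predict` corresponds to the lookup started in the P stage and `update` to the correction applied once the branch outcome is known.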

28
Conclusion
  • UltraSparc IV is a milestone as it is the first
    dual-core chip of the UltraSparc family
  • Sun continues to develop UltraSparc
      • UltraSparc IV+
      • UltraSparc T1

29
References
  • UltraSparc IV Users Manual, Sun Microsystems
  • UltraSparc IV Whitepaper, Sun Microsystems
  • UltraSparc IV Mirrors Predecessor, Kevin Krewell
  • Implementation and Productization of a 4th
    Generation 1.8GHz Dual-Core SPARC V9
    Microprocessor, Anand Dixit, Jason Hart, ...
  • UltraSparc III Users Manual, Sun Microsystems

30
References
  • Web Sites
  • http://web.cs.unlv.edu/cs219/group3/index.html
  • http://bwrc.eecs.berkeley.edu/CIC/archive/cpu_history.html#SPARC
  • http://www.arcade-eu.org/overview/2005/sparcIV.html
  • http://www.top500.org/orsc/2006/sparcIV.htm
  • http://www.sparc.org/history.html

31
Questions...