Future of the Microprocessors
1
Future of the Microprocessors
  • Billion-Transistor Architectures
  • IEEE Computer, September 1997

2
Billion-Transistor Architectures
  • Future Trends
  • Hardware trends and physical limits
  • In its 1994 road map, the Semiconductor Industry
    Association predicted that
  • by 2010, chips would have 800 million transistors,
    thousands of pins, 1,000-bit buses, and clock
    speeds over 2 GHz
  • power dissipation around 180 W
  • On-chip wires are becoming much slower relative
    to logic gates
  • impossible to maintain one global clock over the
    entire chip
  • sending a signal across a billion-transistor chip
    can take as many as 20 cycles
  • System software
  • Will hardware alone continue to extract
    parallelism?
  • Compatibility with legacy software
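The wire-delay numbers above can be made concrete with a little arithmetic, using only the figures on this slide (a 2 GHz clock and a 20-cycle cross-chip signal):

```python
# Rough arithmetic behind the wire-delay bullet (figures from the slide).
clock_hz = 2e9                  # clock speed over 2 GHz
cycle_ns = 1e9 / clock_hz       # one cycle = 0.5 ns
cross_chip_cycles = 20          # cycles for a signal to cross the chip
cross_chip_ns = cross_chip_cycles * cycle_ns

print(f"cycle time: {cycle_ns} ns")               # 0.5 ns
print(f"cross-chip latency: {cross_chip_ns} ns")  # 10.0 ns
```

Ten nanoseconds to cross the die is why a single global clock domain becomes untenable.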

3
  • Future workloads
  • Architectural design is driven by the dominant
    anticipated workload
  • multimedia workloads
  • Design, verification, and testing
  • complexity requires hundreds of engineers
  • validation and testing account for 40 to 50
    percent of an Intel chip's design cost and 6
    percent of its transistors
  • Economies of scale
  • Fabrication plants cost 2 billion dollars (a
    factor of ten more than a decade ago)
  • need larger markets, hence the mass marketing of
    computer chips

4
Future Architectures
  • Advanced superscalar processors
  • Simultaneous multithreaded processors
  • Vector IRAM processors
  • Raw (configurable) processors
  • Superspeculative processors
  • Trace (multiscalar) processors
  • Chip multiprocessors

Trends
  • Wire delays become dominant, forcing HW to be
    more distributed
  • System software (compilers) becomes better at
    exploiting parallelism
  • Workloads come to contain more exploitable
    parallelism
  • Design and validation costs become more limiting
5
Advanced Superscalar
  • One Billion Transistors, One Uniprocessor, One
    Chip
  • U of Michigan
  • Billion transistor processors will be much as
    they are today
  • Bigger, faster, and wider
  • Out-of-order fetching, Multi-Hybrid branch
    predictors, and trace caches
  • Large, out-of-order-issue instruction window
    (2,000 instructions), clustered banks of
    functional units
  • The current uniprocessor model can provide
    sufficient performance and use a billion
    transistors effectively without changing the
    programming model or discarding software
    compatibility
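As a rough illustration of what a large out-of-order issue window does, the sketch below scans a window and picks out the instructions whose operands have no dependence on earlier, not-yet-issued instructions. The instruction encoding and function name are invented for this example:

```python
# Minimal sketch of out-of-order issue: scan an instruction window and
# select every instruction whose source registers are not written by an
# earlier, not-yet-issued instruction (and whose destination does not
# collide with an earlier pending write). Illustrative only.

def issuable(window):
    """Return indices of instructions that can issue this cycle."""
    pending_writes = set()   # dest registers of earlier unissued instrs
    ready = []
    for i, (dest, srcs) in enumerate(window):
        if not (set(srcs) & pending_writes) and dest not in pending_writes:
            ready.append(i)
        pending_writes.add(dest)
    return ready

# window entries: (destination register, source registers)
window = [
    ("r1", ("r2", "r3")),   # r1 = r2 + r3
    ("r4", ("r1", "r5")),   # depends on r1 -> must wait
    ("r6", ("r7", "r8")),   # independent -> issues in parallel
]
print(issuable(window))      # [0, 2]
```

A real 2,000-entry window does this dependence check in hardware, every cycle, across clustered banks of functional units.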

6
One Billion Transistors, One Uniprocessor, One Chip
  • 60 M transistors for the execution core
  • 240 M for the trace cache
  • 48 M for the branch predictor
  • 32 M for the data cache
  • 640 M for the L2 cache
7
Superspeculative
  • Superspeculative Microarchitecture for Beyond AD
    2000
  • CMU
  • Billion-transistor uniprocessor
  • Massive speculation at all levels to improve
    performance
  • Trace caches and advanced branch prediction
  • Without this much speculation, future processors
    will be limited by true dependences
  • Their investigations found large speedups on code
    that has traditionally not been amenable to
    extracting ILP
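The "advanced branch prediction" these proposals build on starts from the classic two-bit saturating counter. A minimal sketch (the state encoding is assumed, not taken from the slides):

```python
# Sketch of the classic two-bit saturating-counter branch predictor.
# States 0-1 predict not-taken, states 2-3 predict taken; each outcome
# nudges the counter one step, so a single anomalous branch does not
# flip the prediction.

class TwoBitPredictor:
    def __init__(self):
        self.state = 1  # start weakly not-taken

    def predict(self):
        return self.state >= 2  # True means "predict taken"

    def update(self, taken):
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

p = TwoBitPredictor()
outcomes = [True, True, False, True]   # actual branch behavior
hits = 0
for taken in outcomes:
    hits += p.predict() == taken
    p.update(taken)
print(hits, "of", len(outcomes), "predicted correctly")  # 2 of 4
```

Production predictors combine many such structures (hence "Multi-Hybrid"), but the saturating counter is the common building block.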

8
Superspeculative Microarchitecture for Beyond AD
2000
9
Simultaneous Multithreading
  • Simultaneous Multithreading (SMT) Processor
  • Wide-issue, multithreaded superscalar processor
  • multiple instruction issues per cycle
  • per-thread HW state (registers, PC, and so on)
  • Exploits all types of parallelism
  • within a thread
  • among threads
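The workload SMT targets can be sketched as several independent threads, each with internal parallelism. This sketch shows only the programming model; CPython threads do not actually run CPU-bound work in parallel, whereas an SMT processor would fill its issue slots from all of them at once:

```python
# Illustrative sketch of thread-level parallelism: independent threads,
# each containing instruction-level parallelism within it. An SMT core
# issues instructions from several such threads in the same cycle.
from concurrent.futures import ThreadPoolExecutor

def worker(data):
    # independent arithmetic inside one thread (intra-thread parallelism)
    return sum(x * x for x in data)

chunks = [range(0, 100), range(100, 200), range(200, 300)]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(worker, chunks))
print(results)
```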

10
Trace
  • Trace Processors Moving to Fourth-Generation
    Microarchitectures
  • U of Wisconsin-Madison
  • Multiple, distributed on-chip processor cores
  • Each of the cores simultaneously executes a
    different trace
  • All but one core execute their traces
    speculatively, using branch prediction to select
    the traces that follow the one currently executing
  • It does not require explicit compiler support
  • Relies heavily on replication, hierarchy, and
    prediction

12
Vector IRAM
  • Vector IRAM
  • U of California, Berkeley
  • Intelligent RAM (IRAM)
  • To increase the on-chip memory capacity by using
    DRAM instead of SRAM
  • The resulting on-chip memory provides high
    capacity and high memory bandwidth
  • enables cost-effective vector processing
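What a vector instruction buys can be shown in miniature: one operation applied across a whole vector, so memory bandwidth, which IRAM's on-chip DRAM supplies, becomes the resource that matters. A toy element-wise "vector add" (names invented):

```python
# Sketch of the vector model: one "instruction" operates on whole
# vectors, amortizing fetch/decode over many elements. With IRAM's
# high on-chip bandwidth, such operations stream data efficiently.

def vadd(a, b):
    """One 'vector add' instruction: element-wise over whole operands."""
    return [x + y for x, y in zip(a, b)]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(vadd(a, b))   # [11.0, 22.0, 33.0, 44.0]
```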

14
A Single-Chip Multiprocessor
  • Single-Chip Multiprocessor
  • Stanford University
  • Multiple (four to 16) simple, fast processors on
    one chip
  • each processor is tightly coupled to a small,
    fast, level-one cache
  • all processors share a larger level-two cache
  • the processors run either one parallel job or
    many independent tasks
  • Simpler design, faster validation, cleaner
    functional partitioning, and higher theoretical
    peak performance
  • Compilers will have to make code explicitly
    parallel
  • Old ISAs will be incompatible with this
    architecture
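Since the code must be made explicitly parallel, a minimal sketch of one job split across four "cores", modeled here as OS processes (the chunking scheme is invented for the example):

```python
# Sketch of explicitly parallel code for a chip multiprocessor: the
# programmer/compiler partitions one job across four workers, each
# standing in for a simple on-chip core with its own L1 cache.
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    # split one job across four "cores"
    chunks = [(0, 250), (250, 500), (500, 750), (750, 1000)]
    with Pool(processes=4) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)   # same as sum(range(1000)) = 499500
```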

16
Raw Processor
  • Baring It All to Software: Raw Machines
  • MIT
  • The most radical architecture
  • Highly parallel architecture with hundreds of
    very simple processors, each coupled to a small
    portion of the on-chip memory
  • Each processor or tile
  • a small bank of configurable logic, allowing
    synthesis of complex operations directly in
    configurable HW
  • Depends on compiler efficacy
  • does not use a traditional instruction set
    architecture
  • all units are told explicitly what to do by the
    compiler
  • the compiler even schedules most of the intertile
    communication
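A toy version of such compile-time scheduling, with all names invented: each operation is statically assigned a tile and a cycle, so no hardware arbitration happens at run time:

```python
# Sketch of Raw-style static scheduling: the compiler, not hardware,
# decides which tile runs which operation and in which cycle. Here a
# trivial round-robin policy places independent operations.

def schedule(ops, n_tiles):
    """Map each operation to a (tile, cycle) pair at compile time."""
    plan = {}
    for i, op in enumerate(ops):
        plan[op] = (i % n_tiles, i // n_tiles)  # (tile id, cycle)
    return plan

ops = ["add", "mul", "load", "store", "sub"]
print(schedule(ops, n_tiles=4))
```

A real Raw compiler must also schedule the intertile communication between dependent operations; this sketch assumes the operations are independent.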
