Microprocessors - PowerPoint PPT Presentation

About This Presentation
Title:

Microprocessors

Slides: 29
Provided by: montgome
Learn more at: https://cs.nyu.edu
Transcript and Presenter's Notes

Title: Microprocessors


1
Microprocessors
  • Introduction to PowerPC Architecture
  • History and Interesting Tidbits

2
Outline
  • Motorola has a long tradition as the leading
    provider of embedded technologies and has produced
    revolutionary microprocessor and microcontroller
    solutions
  • And Motorola continues to build on that tradition
    of leadership and innovation with the
    ever-expanding family of microprocessors that
    implement the PowerPC instruction set
    architecture
  • In these slides, we'll take a look at just how
    the PowerPC got to be in the place it is today.

3
Background of POWER1
  • Part of IBM's first attempt at making a real
    workstation
  • POWER: Performance Optimization With Enhanced
    RISC
  • IBM redefined RISC to mean "Reduced Instruction
    Set Cycle"
  • Unlike classic RISC design, the POWER1 would be a
    complex processor
  • This meant more high level instructions and more
    memory-data processors
  • This goes against initial RISC philosophy!

4
POWER1 Branch unit
  • Had three units fed by the instruction cache:
    branch, integer, and floating point
  • Branch unit unusually complex
  • Contained a program counter, condition code (CC)
    register, and loop register
  • CC register has 8 fields
  • First 2 reserved for fixed and float ops
  • The 7th reserved for vector operations
  • The rest could be set separately
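
The eight-field layout can be sketched as follows. This assumes the PowerPC convention of a 32-bit condition register split into eight 4-bit fields, numbered from the most significant end. The field width, numbering, and function names are assumptions for illustration and are not taken from the slides.

```python
FIELD_BITS = 4   # assumed width of each condition field
NUM_FIELDS = 8   # eight fields, as stated on the slide


def _shift_for(index: int) -> int:
    """Bit position of a field, with field 0 at the most significant end."""
    if not 0 <= index < NUM_FIELDS:
        raise ValueError("field index must be 0..7")
    return (NUM_FIELDS - 1 - index) * FIELD_BITS


def get_cc_field(cc: int, index: int) -> int:
    """Return one 4-bit field of a 32-bit condition register value."""
    return (cc >> _shift_for(index)) & 0xF


def set_cc_field(cc: int, index: int, value: int) -> int:
    """Return a copy of `cc` with one field replaced; other fields are untouched."""
    shift = _shift_for(index)
    mask = 0xF << shift
    return (cc & ~mask & 0xFFFFFFFF) | ((value & 0xF) << shift)
```

Because each field is read and written independently, one unit can update its own field without disturbing a result another unit left in a different field.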

5
POWER1 Branch unit cont.
  • Loop register is a counter for decrement and
    branch on zero loops with no branch penalty
  • Branch unit could dispatch multiple instructions
    while itself executing a program control op (up
    to four ops at once, and out of order)
  • This made it one of the first superscalar
    CPUs!
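
A loop driven by a dedicated count register can be modelled roughly like this. It is a behavioural sketch only: the function name is hypothetical, and the model says nothing about how the hardware hides the branch penalty.

```python
def run_counted_loop(count: int, body) -> int:
    """Model a loop controlled by a dedicated loop (count) register.

    The register is loaded once. Each pass runs the body, decrements the
    register, and leaves the loop when the register reaches zero.
    Returns the number of times the body ran.
    """
    if count <= 0:
        raise ValueError("count must be positive in this simplified model")
    loop_register = count
    executed = 0
    while True:
        body(executed)
        executed += 1
        loop_register -= 1        # decrement ...
        if loop_register == 0:    # ... and exit on zero
            break
    return executed
```

Since the loop condition lives in its own register, the branch unit can resolve it without waiting on the integer unit, which is what makes a penalty-free loop plausible.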

6
Integer/Float units
  • Thirty-two 32-bit registers for the integer unit
    and all load/store operations
  • Register R0 treated as a constant zero for some
    instructions
  • Used an MQ register for extended precision
    multiply/divides
  • Similar to the MIPS HI/LO registers
  • Thirty two 64-bit registers for floating point
    unit
  • Performed only double precision operations
  • Used a condition bit to catch float errors (no
    exceptions!)
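
The "R0 as constant zero" rule mentioned above can be illustrated with a base-plus-displacement address calculation. This is a minimal sketch assuming the rule applies to the base register of a load/store address, as it does in PowerPC; the function name and register-file representation are hypothetical.

```python
def effective_address(regs: list[int], ra: int, displacement: int) -> int:
    """Compute base + displacement, reading register 0 as the constant 0."""
    base = 0 if ra == 0 else regs[ra]   # R0 is not read; it means "no base"
    return (base + displacement) & 0xFFFFFFFF
```

With this rule, an absolute address can be formed without first loading zero into a register.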

7
MQ register
  • The MQ Register is 36 bits
  • During a multiply instruction, MQ contains the
    multiplier
  • During a divide instruction, MQ receives the
    quotient
  • It can be shifted right or left, independently,
    or combined with AC into a 72-bit register

8
PowerPC
  • Born out of a desire to produce a version of the
    POWER that would succeed both the Motorola 68000
    and the Intel 80x86
  • Most notable changes:
  • Elimination of the MQ register
  • Replaced by separate upper- and lower-half
    instructions (able to execute simultaneously)
  • Some complex instructions were removed
  • Emulated on the new PowerPC
  • Support for 32-bit floating point
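
The upper-half and lower-half split for multiplication can be sketched as two independent operations, each producing one 32-bit half of the 64-bit product. PowerPC provides `mullw` (low word) and `mulhwu` (high word, unsigned) for this. The Python function names below are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def mul_low(a: int, b: int) -> int:
    """Lower 32 bits of the product (the role played by mullw)."""
    return ((a & MASK32) * (b & MASK32)) & MASK32


def mul_high_unsigned(a: int, b: int) -> int:
    """Upper 32 bits of the unsigned 64-bit product (the role of mulhwu)."""
    return ((a & MASK32) * (b & MASK32)) >> 32
```

Neither function depends on the other's result or on shared state such as an MQ register, which is why the two halves can be computed at the same time.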

9
PowerPC 601 (G1)
  • Meant to bridge the POWER1 and PowerPC features
  • Geared towards consumers using workstations
    rather than the high end
  • Essentially the same as the POWER1 except for a
    unified 32K cache (rather than separate I/D caches)
  • Held onto many of the legacy instructions from the
    POWER1

10
The POWER2 is RISCy
  • The big selling point of the POWER2 was its
    ability to handle six instructions at one time
  • However, it came with the caveat "under ideal
    conditions"
  • They couldn't be just any old instructions -- to
    maintain that performance, the POWER2 had to mix
    exactly two integer instructions, two
    floating-point instructions, and two branch or
    condition-code instructions
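
The "ideal conditions" constraint can be expressed as a simple check on a group of six instructions. This sketch only encodes the mix stated on the slide; the category labels and function name are hypothetical.

```python
from collections import Counter

# The mix the slide says is needed to sustain six instructions at once.
IDEAL_MIX = Counter({"integer": 2, "float": 2, "branch_or_cc": 2})


def is_ideal_group(kinds: list[str]) -> bool:
    """Return True if the instruction kinds match the ideal mix exactly."""
    return Counter(kinds) == IDEAL_MIX
```

For example, a group of six integer instructions fails the check, so it could not all issue together under this rule.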

11
POWER2 cont.
  • Other additions to the POWER2 were:
  • Quad-word load and store instructions
  • Hardware square root instruction
  • New instructions for conversion of floating-point
    values to integers
  • Like the POWER1, this was targeted to high end
    systems, leaving average users to use the PowerPC

12
PowerPC 603 (G2)
  • Separated the load/store ops from the integer
    unit
  • Split the branch unit into a fetch/branch unit, a
    dispatch unit, and a completion/exception unit
  • Added a rename buffer in the dispatch unit for
    speculative execution using renamed integer
    and float registers

13
The little processor that couldn't
  • Strategy for reducing the size of the 603:
  • Use a split cache design (instead of a more
    complex unified cache)
  • Remove "unused" or legacy instructions
  • Reduced the cost and the power draw, so 603s
    could be made much cheaper
  • Had a slight performance penalty (per MHz), but
    the chips could be clocked higher -- which
    would more than make up for it
  • A good idea, but marketing can be unpredictable

14
603's Marketing Blunder
  • The 603 was compared to the 601 and other high
    end machines
  • In MHz per dollar, the 603 beat out the 601
  • But simply comparing MHz to MHz, the 601 was
    generally faster
  • So buyers got the impression that they were
    getting ripped off
  • A case of mistaken expectations!

15
603: The Energizer processor
  • Despite initial marketing problems, this
    processor became prolific and had far more
    variants than any other PowerPC
  • 603e (603 / Stretch): used to solve cache size
    problems
  • 603ev, 603p (Valiant), 603r, 603er (Goldeneye):
    manufacturing optimization

16
PowerPC 604
  • The G2 processors were split into two different
    families (the 603s and the 604s).
  • The 604s were meant to be the bad boys of the
    desktop - power and cost were not as important as
    pure blinding speed.
  • Unlike the 601 and 603, the 604 can execute as
    many as 4 instructions simultaneously

17
PowerPC 604 float support
  • 604s also had tweaks to improve their ability to
    run out of their larger L2 cache
  • Floating point units can become very dependent on
    cache and memory performance
  • The results:
  • 20% faster than the 603 at integer
  • roughly 70% faster in floating point
  • Just over twice as fast as the Pentiums of the
    same time

18
Dynamic Branch Prediction
  • Processors take big performance penalties if they
    can't preload the cache
  • Being able to accurately "guess" the most likely
    used path can help keep the cache "preloaded" and
    increase processor performance
  • The 604 was the first mainstream processor to use
    "Dynamic Branch Prediction"
  • This greatly increased performance

19
G3s: The Next Generation
  • Initially, the plan had been to create a new chip
    solely based on the 604
  • But after the highly successful second generation
    of PowerPCs, IBM and Motorola decided to split
    up development and create more processors

20
740 (Arthur)
  • The first was the 603 derivative
  • This processor got some changes to the core (the
    way it executes instructions)
  • Optimized the processor for the Macintosh OS
  • This of course resulted in a large performance
    boost, even more so than the boosts offered by
    the new backside cache
  • The 740 was fast, extremely small, and efficient
  • It was outperforming Pentium IIs while using
    less than 1/5th the power and size

21
750 (Typhoon)
  • A variant of the 740 that has a fast method of
    access to the L2 backside cache
  • Allows higher performance
  • L2 cache runs much faster than most -- and at
    speeds up to the clock rate of the main processor
  • Cache system really speeds things up, but
    requires more electronics (and pins) than the 740
  • So while the chip cost isn't much more, the added
    cache can drive the cost of the system up (and
    increase the total power usage).
  • Still has very good performance per cost

22
Hardware Aside
  • Aluminum has long been the standard material used
    for semiconductor wiring
  • IBM managed to use copper technology in their
    G3s
  • The result?
  • Enhanced chip performance
  • Reduced die size and power consumption
  • The 750 was first created with a standard aluminum
    design operating at up to 300 MHz
  • Applying IBM's copper manufacturing process to
    the same chip, the 750 featured speeds of at
    least 400 MHz - a 33 percent clock speed
    improvement for the same chip!
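
The 33 percent figure follows directly from the two clock rates quoted above. The helper name below is hypothetical. Note that it measures the change in clock rate, which the slide treats as a stand-in for performance.

```python
def percent_gain(old_mhz: float, new_mhz: float) -> float:
    """Relative increase from old_mhz to new_mhz, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100.0


gain = percent_gain(300, 400)   # (400 - 300) / 300, about 33 percent
```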

23
Make room for 4th Generation
  • The 603-derived G3 performed very well with its
    backside cache, was very cheap to make, and was
    quite scalable just by adding more L2 cache (or
    faster L2 cache)
  • Apple killed the clones and focused its product
    lines, which reduced demand for so many
    different high-end desktop PPCs
  • As a result, the 604-derived G3s (code named
    Habanero) and some of the other flavors (like
    ones with better MP support) were scrapped in
    favor of focusing on the G4s
  • This makes sense: these other processors
    wouldn't have come out until basically the same
    time as the G4s anyway, and splitting into that
    many different development efforts would have
    been a waste of money

24
G4
  • In direct response to Intel's MMX instructions,
    AltiVec extensions were added to the G4 PowerPC
  • AltiVec adds a new set of 128-bit registers
  • Separate vector execution unit and instruction
    set, supported by the branch unit
  • Allows multimedia instructions to be executed in
    parallel with both int and float ops
  • Added an additional VRSAVE register to track which
    vector registers are being used
  • Reduces the number of registers that need to be
    saved
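
The idea behind VRSAVE can be sketched as a 32-bit mask with one bit per vector register, so that a context switch saves only the registers flagged as in use. The bit ordering here (register 0 in the most significant bit) and the function names are assumptions for illustration.

```python
NUM_VECTOR_REGS = 32


def _bit_for(reg: int) -> int:
    """Mask bit for a vector register, assuming register 0 is the top bit."""
    if not 0 <= reg < NUM_VECTOR_REGS:
        raise ValueError("vector register must be 0..31")
    return 1 << (NUM_VECTOR_REGS - 1 - reg)


def mark_used(vrsave: int, reg: int) -> int:
    """Return the mask with the bit for vector register `reg` set."""
    return vrsave | _bit_for(reg)


def registers_to_save(vrsave: int) -> list[int]:
    """List only the vector registers flagged as in use."""
    return [r for r in range(NUM_VECTOR_REGS) if vrsave & _bit_for(r)]
```

A program that touches two vector registers then costs two 128-bit saves on a context switch instead of thirty-two.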

25
G4 cont.
  • Supports a 2 Megabyte L2 Cache which can help
    performance over the previous 1 MB L2 limit.
  • The MPX bus (used on the G4) is asynchronous and
    allows for up to 4 outstanding accesses at the
    same time
  • The result is up to a 3-fold performance
    increase for memory bound operations.
  • This is why specs can be so deceptive. Without
    changing the speed of the bus at all,
    Apple/Motorola made it up to 3 times faster!
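
A rough model shows how overlapping accesses can raise throughput without a faster bus clock. Everything here is a simplification: the function names are hypothetical, and the latency and transfer figures in the usage note are made-up illustrative numbers, not MPX measurements.

```python
def cycles_serial(accesses: int, latency: int, transfer: int) -> int:
    """Total cycles when each access must finish before the next starts."""
    return accesses * (latency + transfer)


def cycles_overlapped(accesses: int, latency: int, transfer: int,
                      max_outstanding: int) -> float:
    """Steady-state estimate when up to `max_outstanding` accesses overlap.

    Data transfers still share one bus, so an access completes at best every
    `transfer` cycles; the outstanding-access limit may make it slower.
    """
    per_access = max(transfer, (latency + transfer) / max_outstanding)
    return latency + transfer + (accesses - 1) * per_access
```

With a hypothetical 8-cycle latency and 4-cycle transfer, 100 accesses take 1200 cycles serially and about 408 cycles with 4 outstanding, close to a 3-fold gain at the same bus clock.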

26
Conclusion
  • Obviously, the PowerPC architecture will play a
    part in embedded technology for years to come
    (due to low cost and energy use)
  • As far as personal computers and workstations go,
    the PowerPCs generally outperform their Pentium
    counterparts
  • However, much of what's holding the PowerPC back
    is consumer obsession with MHz

27
MHz vs. Mega Bucks
  • Only weeks ago, Motorola announced at a
    semiconductor conference that it would soon start
    shipping G4 processors operating close to the
    1GHz mark. During his conference call, Jobs
    indicated that Apple would be working closely
    with Motorola to bridge the MHz gap, and
    introduce faster chips into the G4 systems. And
    in a rare preview of the future, Jobs indicated
    that new, faster G4 systems would begin shipping
    within the next 6 months.
  • - G4 Store Special Report

28
Works Cited
  • http://www.g4store.com/news/
  • http://www.mackido.com/Hardware/
  • http://developer.apple.com/technotes/
  • http://www.byte.com/art/9401/sec7/art2.htm
  • http://www3.sk.sympatico.ca/jbayko/cpu5.html
  • http://www.mot.com/SPS/PowerPC/