Provided by: Michae1764
1
Chapter 1 Microcomputers and Microprocessors
  • Microprocessor Evolution and Performance

2
Contents
  • Introduction to microcomputer system
  • Microprocessor evolution
  • the INTEL processor family
  • Microprocessor performance

3
Introduction to Microcomputer
  • A microcomputer can be viewed as a machine with
  • I/O devices for input/output,
  • a microprocessor for processing,
  • memory units for storage,
  • buses for connecting the above components
  • In the 1970s, a microcomputer was normally understood
    to be a computer considerably smaller than a
    minicomputer, possibly using ROM for program
    storage

4
Basic hardware units
  • Input
  • e.g. keyboard, mouse
  • Microprocessor
  • e.g. 8085, 8086, mc68000 microprocessors
  • Memory
  • e.g. RAM, hard disk
  • Output
  • e.g. monitor, printer

5
Buses
  • Buses: external connections to the input/output units
  • Major buses
  • Address bus: address of the memory location
    containing an instruction or data
  • Data bus: contents of memory locations
  • Control bus: synchronization and handshaking
    between components

6
General Architecture
[Figure: block diagram with the labels memory unit (primary memory, secondary memory), microprocessing unit, input unit, output unit]
7
Processor History
  • Vacuum Tubes to ICs

8
First Generation Computers
  • Vacuum tube technology
  • Large room, air-conditioned
  • Tube lifetime: 3,000 hours
  • Useless machine?
  • 1951: first UNIVAC I (UNIVersal Automatic Computer)
    delivered
  • 1952: prediction of the presidential election on CBS
  • 1952: IBM Model 701 Data Processing System

9
Second Generation Computers
  • The Transistor Is Born (Solid-State Era)
  • 1948: invention of the bipolar transistor
  • 1956: Nobel Prize in Physics awarded to Drs. William Shockley,
    John Bardeen and Walter H. Brattain (Bell Labs)
  • 1954: Bell Labs' all-transistorized computer
    (TRADIC)
  • 800 transistors
  • Much less heat
  • More reliable and less costly

10
Second Generation Computers
  • Mainframe Computers
  • 1958: IBM's first transistorized computers, the 7070/7090
  • 1959: 1401 (business-oriented model)
  • Built on circuit boards mounted into rack panels,
    or frames
  • Main frame (mainframe): the CPU portion of the
    computer
  • Popular with business and industry

11
Third Generation Computers
  • Invention of the IC: 1959
  • Dr. Robert Noyce (Fairchild) and Jack Kilby (TI)
  • Kilby: fabricated resistors, capacitors and
    transistors on a germanium wafer, and connected
    these parts with fine gold wires
  • Noyce: isolated individual components with
    reverse-biased diodes, and deposited an adherent
    metal film over the circuit, thus connecting the
    components
  • First IC: a 2-transistor multivibrator
  • By the mid-1960s, memory chips with 1,000 components
    were common

12
Third Generation Computers
  • 1964: IBM 360 Series (32-bit)
  • The first to use IC technology
  • A family of 6 compatible computers
  • 40 different I/O and auxiliary storage devices
  • Memory capacity: 16K words to over 1 MB
  • 16 x 32-bit registers
  • 24-bit address bus
  • 128-bit data bus

13
Third Generation Computers
  • 1964: IBM 360 Series (32-bit)
  • 375,000 computations per second
  • (<< the 150 MIPS of a Pentium 100)
  • $5 billion development cost
  • IBM became the leading mainframe company

14
Minicomputer
  • 1960s: Space Race between the US & USSR
  • IC industry boom
  • A tremendous demand by scientists and engineers
    for an inexpensive computer that they could
    operate by themselves
  • 1965: DEC PDP-8 (by Edson de Castro's group)
  • Low-cost ($25,000) minicomputer
  • 12-bit
  • 16-bit PDP-11
  • Supermini

15
Microprocessors CPU on a Chip
  • 1968: INTEL (Integrated Electronics)
  • Founded by Robert Noyce and Gordon Moore
    (Fairchild)
  • Original goal: the semiconductor memory market
  • 1969: customized ICs for Busicom calculators
  • Ted Hoff and Stan Mazor proposed a 4-bit CPU on a
    single chip, plus ROM and RAM chips

16
Microprocessors CPU on a Chip
  • 1971: 4000 family
  • By Federico Faggin
  • 4001: 2K-bit ROM with 4-bit I/O port
  • 4002: 320-bit RAM, 4-bit output port
  • 4003: 10-bit serial-in parallel-out shift
    register
  • 4004: 4-bit processor
  • Processor-on-a-chip: the microprocessor era

17
Microprocessors CPU on a Chip
  • 1972: 8008, 8-bit
  • 1974: 8080, an improved version

18
Microprocessors CPU on a Chip
  • 8-bit CPUs
  • 16-bit address (64K)
  • MC6800: Motorola
  • 6502: MOS Technology (designed by former Motorola engineers)
  • Apple II, Apple DOS
  • Z-80: Zilog (founded by former Intel engineers)
  • Z-80 cards on Apple II, CP/M

19
Microprocessors CPU on a Chip
  • 16-bit CPUs (late 1970s)
  • 8086, 80186, 80286: Intel
  • PC, PC-DOS, MS-DOS, SCO Unix
  • MC68000: Motorola
  • 16-bit instructions
  • Hardware multiply and divide
  • 20-bit address buses (1 MB)
  • Workstations: Sun-3

20
Microprocessors CPU on a Chip
  • 32-bit CPUs
  • 80386, 80486: Intel
  • MC68020, 68030: Motorola
  • 64-bit CPUs
  • Pentium, Pentium Pro (64-bit external data bus,
    32-bit internal registers; not regarded as
    64-bit CPUs in terms of internal register word
    length)

21
Microcomputers Computers Based on Microprocessors
  • 1975: MITS Altair 8800 (kit)
  • $399, i8080, programmed by depositing 1s/0s via
    front-panel switches
  • Other computers: boom
  • 8080: MITS,
  • 6800: SWTPC 6800,
  • Z-80: TRS-80,
  • 6502: Apple I, 8K, programmed with BASIC
  • Steve Jobs & Steve Wozniak, millionaires from PC
    COMs

22
Personal Computers the Open Architecture Era
  • 1981: IBM PC
  • A system board (mother board)
  • Intel 8088 processor
  • 16K memory
  • 5 expansion slots
  • Third-party vendors to supply various IO adapter
    cards
  • Open architecture
  • Computer with interchangeable components

23
Micro-controllers Microcomputers on a Chip
  • Microcontroller: a computer on a chip
  • Microprocessor, plus
  • On-chip memory, plus
  • Input/output ports
  • 1995: microcontrollers outsold microprocessors
    10:1
  • Embedded in various equipment
  • Thermostats, machine tools, communication,
    automotive,
  • Evolution: getting greater I/O capabilities
  • Intel MCS-51, MCS-96,

24
High-Performance Processors
  • Supercomputers
  • Aircraft design, global climate modeling,
    oil-bearing formation, molecular design of new
    drugs, financial behavior
  • CDC 6600, 7600: Seymour Cray
  • Cray-1: 1976, the first true supercomputer
  • ECL, 128 kW power consumption
  • 130 MFLOPS (Pentium 100: 150 MFLOPS)
  • $5.1 million

25
High-Performance Processors
  • Parallel Processors
  • Tens of gigaflops
  • Multi-processors wired by a common bus
  • Each is given a portion of the problem to solve
  • Hypercube: early 1980s
  • Cosmic Cube, iPSC (with i860/RISC chips)
  • 2D rectangular mesh architecture: multiple
    processors at each node
  • Intel teraflops computer with 4500 nodes, each
    powered by 2 Pentium Pro 200s

26
RISC vs. CISC
  • RISC: Reduced Instruction Set Computer (1980s)
  • A small number of fixed-length instructions
  • Simple addressing modes
  • A large number of registers
  • Instructions executed in one clock cycle
  • Intel i860 ("Cray on a Chip")
  • 82 instructions, each 32 bits long
  • Four addressing modes
  • 32 general-purpose registers

27
RISC vs. CISC
  • CISC: Complex Instruction Set Computer
  • A large number of variable-length instructions
  • Multiple addressing modes
  • A small number of registers
  • Multiple clock cycles to execute
  • Intel 8086
  • Over 3000 instruction forms, 1-6 bytes
  • 9 addressing modes
  • 8 general-purpose registers
  • Execution takes from 2 to 80 cycles

28
RISC vs. CISC
  • RISC
  • Control unit is much simpler (simpler
    instructions, execution in 1 CLK)
  • Faster execution with less total on-chip logic
  • Control unit takes 10% of chip area (vs. 50% for CISC)
  • More area for register file, data and instruction
    caches, FPU, and co-processor
  • PowerPC: 32-bit, by IBM, Apple, Motorola
  • SPARC: for Sun Microsystems workstations

29
Application-Specific Processors
  • DSP Chips
  • Mostly for analog signal processing
  • ADC-DSP-DAC architecture
  • Avoids processing analog signals with discrete
    circuits involving capacitors and inductors
  • DSPs perform complex mathematical functions
  • Digital filters, spectrum analysis

30
Application-Specific Processors
  • DSP Chip Architecture
  • Separate data/program areas: Harvard
    architecture
  • Hardware multipliers and adders, optimized to
    execute in a single cycle
  • Arithmetic pipelining: several instructions
    in progress at once
  • Hardware loop control
  • Multiple IO ports for communication with other
    processors

31
Summary of Processor History
  • 1940s: vacuum tubes, large and power-hungry
  • 1950s: transistor (1948-)
  • 1959: first IC (second industrial revolution)
  • 1960s: ICs widely used to build CPUs
  • 1971: Intel 4004 microprocessor (2300
    transistors)
  • Start of the microprocessor age
  • Late 1970s: 8080/85

32
Summary of Processor History
  • 1980: RISC (reduced instruction set computer)
  • CISC (complex instruction set computer) vs.
    RISC
  • CISC families: Intel 80x86 and Pentium; Motorola 68000
    series
  • Most of the other families covered here are RISC

33
Evolution of INTEL Processors
  • 4004 ('71) to Pentium Pro ('95-)

34
INTEL
  • Integrated Electronics
  • 1968: founded by Robert Noyce and Gordon Moore
  • IA: Intel Architecture (e.g., IA-16, IA-32, IA-64),
    since the 8008 ('72), has become the de facto standard
  • Evolution
  • Internal register sizes
  • External bus widths
  • Real, Protected, and Virtual 8086 modes

35
4-bit Processors
  • 4004
  • First microprocessor
  • Became available in 1971
  • 4-bit microprocessor
  • 4-bit registers, 4-bit data bus
  • Transistors: 2300
  • Min. feature size: 10 microns
  • Address bus: 10 bits/1K
  • 0.06 MIPS (@ 0.108 MHz)
  • No internal cache

36
8-bit Processors
  • 8008, 8080, 8085
  • Became available in 1972 (8008), 1974 (8080) and 1976 (8085)
  • 8-bit microprocessors

37
8086 IA standard
  • Became available in 1978
  • 16-bit data bus
  • 20-bit address bus (was 16-bit for 8080)
  • Memory organization: 16 segments of 64 KB (1 MB
    limit)
  • CPU reorganized into a BIU (bus interface unit) and
    an EU (execution unit)
  • Allows fetch and execution to proceed simultaneously
  • Internal registers expanded to 16 bits
  • Low/high bytes can be accessed separately

38
8086
  • Hardware multiply and divide instructions
  • External math co-processor
  • Instruction set compatible with 8080/8085 at the assembly-source level
  • 8086 defined the 80x86 architecture

39
8086
  • Not quite successful
  • 16-bit data bus: requires two separate 8-bit
    memory banks
  • Memory chips were expensive

40
8088 PC standard
  • Became available in 1979, almost identical to
    8086
  • 8-bit data bus for hardware compatibility with
    8080
  • 16-bit internal registers and internal data bus (same as
    8086)
  • 20-bit address bus (was 16-bit for 8080)
  • BIU redesigned
  • Memory organization: 16 segments of 64 KB (1 MB
    limit)
  • Two memory accesses for 16-bit data (less
    efficient)
  • But lower cost
  • 8088 used by the IBM PC (1981), 16K-64K, 4.77 MHz

41
80186, 80188 High Integration CPU
  • PC system
  • 8088 CPU + various supporting chips
  • Clock generator
  • 8251 serial I/O (RS-232)
  • 8253 timer/counter
  • 8255 PPI (programmable peripheral interface)
  • 8257 DMA controller
  • 8259 interrupt controller
  • 80186/80188 = 8086/8088 + supporting functions
  • Compatible instruction set (+ 9 new instructions)

42
80286
  • Became available in 1982
  • used in IBM AT computer (1984)
  • 16-bit data bus
  • Clock speed 25% faster than 8088, throughput 5
    times greater than 8088
  • 24-bit address bus (16 MB) (vs. 20-bit/1 MB for 8086)
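
The address-bus figures quoted on these slides all follow from one rule: n address lines reach 2^n byte locations. The sketch below only illustrates that arithmetic; the function name `addressable_bytes` is made up for this example and does not come from the slides.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given bus width."""
    return 2 ** address_bits


# Address-bus widths as quoted in the slides.
for cpu, bits in [("8080", 16), ("8086/8088", 20), ("80286", 24), ("80386DX", 32)]:
    kilobytes = addressable_bytes(bits) // 1024
    print(f"{cpu}: {bits}-bit address bus -> {kilobytes:,} KB")
```

Each extra 4 address lines multiplies the reachable memory by 16, which is why the step from the 8086 (20 lines) to the 80286 (24 lines) moves the limit from 1 MB to 16 MB.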

43
80286 Real vs. Protected Modes
  • Larger address space: 24-bit address bus
  • Real Mode vs. Protected Mode
  • Real Mode
  • Power-on default mode
  • Functions like an 8086: uses the 20 least
    significant address lines (1 MB)
  • Software compatible with the 8086/8088
  • 16 new instructions (for Protected Mode
    management)
  • Faster: the 286 is a redesigned processor, plus a higher
    clock rate (6-8 MHz)

44
80286 Real vs. Protected Modes
  • Protected Mode
  • Multi-program environment
  • Each program has a predetermined amount of memory
  • Addressed via segment selectors (physical
    addresses invisible): 16 MB addressable
  • Multiple programs loaded at once (within their
    respective segments), protected from read/write
    by each other

45
80286 Real vs. Protected Modes
  • Protected Mode
  • Cannot be switched back to Real Mode; this avoids
    illegal access by switching back and forth
    between modes
  • A faster 8086 only?
  • MS-DOS requires that all programs be run in Real
    Mode

46
Clock Speed
  • Electrical signals cannot change instantaneously
    (transition period required)
  • System clock provides timing signal for
    synchronization
  • Clock speed alone cannot be used to compare the performance of
    microprocessors with different designs or instruction sets
  • e.g., a 66 MHz Pentium is about twice as fast as a 66
    MHz 80486

47
80386DX (aka. 80386)
  • Available in 1985, a major redesign of the 8086/286
  • Compatibility commitment through 2000
  • 32-bit data and address buses (4 GB memory)
  • Real Address Mode: 1 MB visible, as in 286 Real Mode
  • Protected Virtual Address Mode
  • On-board MMU
  • Segmented tasks of 1 byte to 4 GB
  • Segment base, limit, attributes defined by a
    descriptor register
  • Page swapping: 4K pages, up to 64 TB virtual
    memory space
  • Windows, OS/2, Unix/Linux

48
80386DX (aka. 80386)
  • Virtual 8086 mode (a special Protected Mode
    feature) permits multiple 8086 virtual
    machines to multitask (each similar to Real Mode)
  • Windows (multiple MS-DOS sessions)
  • Clock rate
  • Max. 40 MHz, 2 clock pulses per R/W bus cycle
  • External memory cache to avoid wait states
  • Fast SRAM
  • 93% hit rate with 64K cache
  • Compatible instructions (14 new)

49
80386SX
  • 80386SX (for transition to 32-bit)
  • 16-bit data bus/32-bit register
  • 24-bit address bus

50
80486DX
  • 1989: a polished 386 with 6 new OS-level instructions
  • Virtually identical to the 386 in terms of
    compatibility
  • RISC design concepts
  • Fewer clock cycles per operation; a single clock
    cycle for the most frequently used instructions
  • Max. 50 MHz
  • 5-stage execution pipeline
  • Portions of 5 instructions execute at once

51
80486DX
  • Highly Integrated
  • On-board 8K memory cache
  • FPP (equivalent to the external 80387 coprocessor)
  • Twice as fast as a 386 at any given clock rate
  • 20 MHz 486 = 40 MHz 386

52
80486SX
  • 80486SX
  • NOT a 16-bit version for transition purposes
  • No coprocessor
  • On-chip 8K cache retained
  • For low-end applications
  • Max. 33 MHz only

53
80486DX2/DX4 Overdrive Chips
  • Processor speeds increased too fast
  • Redesigning the microcomputer to stay compatible
    becomes harder
  • Solution: separate internal speed from external
    speed, and improve performance independently
  • 80486DX2/DX4: the internal clock is twice/three times
    (NOT four times) the external clock, so the chip runs faster
    internally

54
80486DX2/DX4 Overdrive Chips
  • System board design is independent of processor
    upgrade (less expensive components are allowed)
  • Processor operates at its maximum data rate
    internally
  • Only the slower accesses to external data run at the
    system board rate
  • Internal cache offsets the speed gap
  • 486DX2-66: 66 MHz internal, 33 MHz external
  • 486DX4-100: 100 MHz internal, 33 MHz external (3x)
  • OverDrive sockets for upgrading 486DX/SX to
    486DX2/DX4 (with OverDrive socket pin-outs)

55
Pentium Superscalar Processor
  • Available in 1993
  • 32-bit architecture
  • Superscalar architecture
  • Scaling: shrinking the etchable feature size to
    increase the complexity of an IC (e.g., DRAM)
  • 10 microns (4004) to 0.13 microns (2001)
  • Superscalar: goes beyond simply scaling down
  • Two instruction pipelines, each with its own ALU,
    address generation circuitry, and data cache
    interface
  • Executes two different instructions simultaneously

56
Pentium Superscalar Processor
  • On-board cache
  • Separate 8K data and code caches to avoid access
    conflicts
  • FPP
  • 8-stage instruction pipeline
  • Optimized floating-point functions
  • 5x-10x the FLOPS of a 486
  • 2x the performance of a 486 at any clock rate

57
Pentium Superscalar Processor
  • Compatibility with 386/486
  • Internal 32-bit registers and address bus
  • Data bus expanded to 64 bits for a higher data
    transfer rate
  • Compare the 8088 to 386SX transition

58
Pentium Superscalar Processor
  • non-clone competition from AMD, Cyrix
  • development of brand identity by Intel

59
Pentium Pro Two Chips in One
  • Became available in 1995
  • Superscalar of degree 3
  • Can execute 3 instructions simultaneously
  • Optimized for 32-bit operating systems (e.g.,
    Windows NT, OS/2 Warp)
  • Two separate silicon dies in the same package
  • Processor: 0.35 µm, 5.5 million transistors
  • 256 KB (or 512 KB) Level 2 cache included in the package, 15.5
    million transistors in a smaller area

60
Pentium Pro Two Chips in One
  • On-board Level 2 cache
  • Simplifies system board design
  • Requires less space
  • Gains faster communication with the processor
  • Internal (Level 1) cache: 8K
  • Pentium Pro 133 = 2x Pentium 66 = 4x 486DX2-66

61
Pentium Pro: Dynamic Execution
  • Dynamic execution: reduces idle processor time by
    predicting instruction behavior
  • Multiple branch prediction: looks as far as 30
    instructions ahead to anticipate program branches
  • Data flow analysis: looks at upcoming
    instructions and determines whether they are available
    for processing, depending on other instructions,
    then determines an optimal execution sequence
  • Speculative execution: executes instructions in
    a different order from the one entered. Speculative results
    are stored until final states can be determined.

62
Processor Future
  • What's More from Moore's Law?

63
Moore's Law
  • In 1965, Gordon Moore predicted that
  • The number of transistors per integrated circuit
    would double every year (the rate is often quoted as
    every 18 months)
  • He forecast that this trend would continue
    through 1975
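
A doubling rule like this is easy to turn into a projection. The sketch below is only an illustration of the arithmetic: the function name is invented here, the doubling period is left as a parameter because different figures are quoted for it, and the 2300-transistor starting point is the 4004 figure given earlier in these slides.

```python
def projected_transistors(initial: float, years: float,
                          doubling_period_years: float) -> float:
    """Transistor count after `years`, doubling once per period."""
    return initial * 2 ** (years / doubling_period_years)


# Starting from the 4004's 2300 transistors, compare two doubling periods.
for period in (1.0, 1.5):
    count = projected_transistors(2300, 6, period)
    print(f"doubling every {period} years: about {count:,.0f} after 6 years")
```

Small changes in the assumed doubling period compound quickly, which is why the exact period matters when the rule is used for forecasting.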

64
Moore's Law
65
Other Microprocessors
  • Motorola family
  • from 6809 through 68040 (the Apple II itself used the MOS Technology 6502)
  • PowerPC
  • joint venture between Apple, IBM, and Motorola
  • RISC Processors
  • DEC Alpha, MIPS, Sun SPARC, etc.

66
CISC vs. RISC
  • CISC (Complex Instruction Set Computer)
  • CISC processors have a large versatile
    instruction set that supports many complex
    addressing modes
  • move complexity from software to hardware
  • RISC (Reduced Instruction Set Computer)
  • RISC processors have a small instruction set
  • move complexity from hardware to software

67
Microprocessor Performance
  • Two main factors
  • Response time
  • the time between the start and completion of a
    task, also referred to as execution time
  • Throughput
  • the total amount of work done in a given time

68
MIPS
  • Million Instructions Per Second
  • MIPS = (Instruction count) / (Execution time in
    seconds x 10^6)
  • It rates performance as inversely proportional to execution
    time
  • Faster machines have a higher MIPS rating
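
As a worked example of the MIPS definition above, the following sketch computes the rating for one program run. The function name `mips` and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def mips(instruction_count: int, execution_time_s: float) -> float:
    """Millions of instructions per second for one program run."""
    return instruction_count / (execution_time_s * 1e6)


# Hypothetical run: 12 million instructions completed in 0.5 seconds.
rating = mips(12_000_000, 0.5)
print(f"{rating:.1f} MIPS")  # prints "24.0 MIPS"
```

Halving the execution time for the same instruction count doubles the rating, which is the inverse relationship the slide describes.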

69
Some Problems of MIPS
  • Cannot compare computers with different
    instruction sets, since the instruction count
    will certainly differ
  • MIPS varies between programs on the same computer

70
iCOMP
  • An index provided by Intel for comparison of
    performance of their 32-bit microprocessors
  • Based on a variety of performance components that
    represent integer mathematics, graphics, etc.
  • Combine results of a set of software application
    benchmarks

72
Chapter 2: Computer Codes, Programming, and
Operating Systems
  • Number Systems
  • Computer Codes
  • Programming
  • Operating Systems

73
Number Systems
  • Decimal Base 10
  • Binary Base 2
  • Octal Base 8
  • Hexadecimal Base 16

74
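Base Conversion 2 ↔ 10
  • Binary to Decimal
  • $D = \sum_{i=0}^{n-1} b_i \times 2^i$
  • Decimal to Binary
  • Repeated subtraction
  • $D' = \sum_{i=0}^{m-1} b_i \times 2^i = D - 2^m$ (with $b_m = 1$)
  • $D' < D$, $m' < m$ ($m$ is the largest exponent such that $2^m \le D$, so $b_m = 1$)
  • Long division
  • $D' = \lfloor D/2 \rfloor$, $b_i = D \bmod 2$, $D' < D$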
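
The three procedures on this slide can be sketched in a few lines each. This is a minimal illustration; the function names are invented here and are not from the slides. Both decimal-to-binary routines should agree with each other and with Python's built-in `bin`.

```python
def binary_to_decimal(bits: str) -> int:
    """Sum b_i * 2**i, where bit i counts from the right-hand end."""
    n = len(bits)
    return sum(int(b) << (n - 1 - k) for k, b in enumerate(bits))


def to_binary_by_subtraction(d: int) -> str:
    """Repeatedly subtract the largest power of two that still fits."""
    if d == 0:
        return "0"
    m = d.bit_length() - 1  # largest m with 2**m <= d
    bits = []
    for exp in range(m, -1, -1):
        if d >= (1 << exp):
            bits.append("1")
            d -= 1 << exp
        else:
            bits.append("0")
    return "".join(bits)


def to_binary_by_division(d: int) -> str:
    """Divide by 2 repeatedly; the remainders are the bits, LSB first."""
    if d == 0:
        return "0"
    bits = []
    while d > 0:
        d, remainder = divmod(d, 2)
        bits.append(str(remainder))
    return "".join(reversed(bits))


print(to_binary_by_subtraction(13), to_binary_by_division(13))
print(binary_to_decimal("1101"))
```

Repeated subtraction produces the bits from the most significant end, while long division produces them from the least significant end, hence the final reversal.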

76
MCS-51 Program Development
[Figure: MCS-51 program development flow. Labels: Program, Editor, .ASM, Assembler (X8051), .OBJ, Linker (Link), .HEX, Target, .SYM, Symbol Converter (CVTSYM), .SDT, ICE]
77
Chapter 3: 80x86 Processor Architecture
  • 8086/88
  • Segmented Memory
  • 80386
  • 80486
  • Pentium
  • Pentium Pro

78
The 8086 and 8088
  • Processor Model
  • Programming Model

79
8086 IA standard
  • Became available in 1978
  • 16-bit data bus
  • 20-bit address bus (was 16-bit for 8080)
  • Memory organization: 16 segments of 64 KB (1 MB
    limit)
  • CPU reorganized into a BIU (bus interface unit) and
    an EU (execution unit)
  • Allows fetch and execution to proceed simultaneously
  • Internal registers expanded to 16 bits
  • Low/high bytes can be accessed separately

80
8088 PC standard
  • Became available in 1979, almost identical to
    8086
  • 8-bit data bus for hardware compatibility with
    8080
  • 16-bit internal registers and internal data bus (same as
    8086)
  • 20-bit address bus (was 16-bit for 8080)
  • BIU redesigned
  • Memory organization: 16 segments of 64 KB (1 MB
    limit)
  • Two memory accesses for 16-bit data (less
    efficient)
  • But lower cost
  • 8088 used by the IBM PC (1981), 16K-64K, 4.77 MHz

81
80186, 80188 High Integration CPU
  • PC system
  • 8088 CPU + various supporting chips
  • Clock generator
  • 8251 serial I/O (RS-232)
  • 8253 timer/counter
  • 8255 PPI (programmable peripheral interface)
  • 8257 DMA controller
  • 8259 interrupt controller
  • 80186/80188 = 8086/8088 + supporting functions
  • Compatible instruction set (+ 9 new instructions)

82
8086 Processor Model: BIU + EU
  • BIU
  • Memory & I/O address generation
  • EU
  • Receives code and data from the BIU
  • Not connected to the system buses
  • Executes instructions
  • Saves results in registers, or passes them to the BIU for
    memory and I/O

83
8086 Processor Model
[Figure: 8086 processor model. Labels: EU, BIU (address generation and bus control), instruction queue]
84
Fetch and Execution Cycle
  • BIU + EU allows the fetch and execution cycles to
    overlap
  • 0. System boot: instruction queue is empty
  • 1. IP -> BIU -> address bus; IP++
  • 2. Mem[IP-1] -> instruction queue [tail]
  • 3a. InstrQ[head] -> EU -> execution
  • 3b. Mem[IP++] -> InstrQ[tail]
  • Maybe multiple instructions
  • Repeat 3a + 3b (overlapped)

85
Waiting Conditions Memory Access
  • BIU + EU execute (almost) continuously without
    waiting
  • Waiting condition: accessing memory locations
    not in the queue
  • BIU suspends instruction fetch
  • Issues external memory address
  • Resumes instruction fetch and execution

86
Waiting Conditions Jump
  • Next instruction is a jump
  • Instructions in the queue are discarded
  • EU waits for the instruction at the jump
    target to be fetched by the BIU
  • Resume execution

87
Waiting Conditions Long Instructions
  • A long instruction is being executed
  • Instruction queue full
  • BIU waits
  • Resumes instruction fetch after the EU pulls one or two
    bytes from the queue

88
BIU: 8088 vs. 8086
  • BIU is the major difference
  • 8088
  • Data bus: 8-bit (vs. 16-bit for 8086)
  • Instruction queue: 4 bytes (vs. 6 bytes for 8086)
  • Only 30% slower than 8086
  • If the queue is kept full

89
8086 Programming Model
90
8086 Programming Model
  • Data Group
  • AX (AH:AL): Accumulator
  • BX (BH:BL): Base
  • CX (CH:CL): Counter
  • DX (DH:DL): Data

91
8086 Programming Model
  • Segment Group
  • CS: Code Segment
  • DS: Data Segment
  • ES: Extra Segment
  • SS: Stack Segment
  • Segment registers
  • Base addresses of particular segments

92
8086 Programming Model
  • Pointer/Index Group
  • IP: Instruction Pointer → CS
  • SI: Source Index → DS
  • DI: Destination Index → ES
  • SP: Stack Pointer → SS
  • Index registers
  • Index (offset) or pointer relative to a base address

93
8086 Flag Word
  • Flag L (bits 7-0)

  SF  ZF  X  AF  X  PF  X  CF

PF: (Even) Parity Flag (1 = even number of 1s in the
low-order 8 bits of the result)
AF: Aux. Carry: carry/borrow on bit 3 (low nibble
of AL)
ZF: Zero Flag (1 = result is zero)
SF: Sign Flag (0 = positive, 1 = negative)
94
8086 Flag Word
  • Flag H (bits 15-8)

  X  X  X  X  OF  DF  IF  TF

TF: Trap flag (single-step after the next
instruction; cleared by the single-step interrupt)
IF: Interrupt-Enable: enables maskable interrupts
DF: Direction flag: auto-decrement (1) or
increment (0) the index on string operations
OF: Overflow: signed result cannot be expressed
within the bits of the destination operand
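
To make the flag definitions concrete, the sketch below derives the status flags for an 8-bit addition, in the way an 8086 `ADD AL, imm8` would set them. It is an illustrative model, not an emulator: the function name is invented here, and CF (the carry flag, which appears in the Flag L layout but is not described on the slide) is included on the assumption that it means carry out of bit 7.

```python
def add8_flags(a: int, b: int) -> dict:
    """Add two 8-bit values and derive the resulting status flags."""
    raw = a + b
    result = raw & 0xFF
    return {
        "result": result,
        "CF": int(raw > 0xFF),                               # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),          # even number of 1s
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),           # carry out of bit 3
        "ZF": int(result == 0),                              # result is zero
        "SF": result >> 7,                                   # copy of bit 7
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # signed overflow
    }


print(add8_flags(0x7F, 0x01))  # +127 + 1 overflows the signed range
print(add8_flags(0xFF, 0x01))  # wraps to zero with a carry out
```

The two sample calls show why OF and CF are separate flags: the first overflows as a signed value without a carry, the second carries without a signed overflow.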
95
Segmented Memory
  • Linear vs. Segmented
  • Linear Addressing
  • The entire memory is regarded as a whole
  • the entire memory space is available all the time
  • Segmented
  • memory is divided into segments
  • A process is limited to accessing its designated segments
    at a given time

96
8086 Memory Organization
  • Even and Odd Memory Banks
  • 16-bit data bus → two-byte / two one-byte accesses
  • Allows the processor to work on bytes or on words
    (16-bit)
  • I/O operations are normally conducted in bytes
  • Can handle odd-length instructions
  • Single byte instructions
  • Multiple byte (and very long) instructions

97
8086 Memory Organization
  • Memory Space
  • 20-bit address bus
  • Linearly, 1M bytes directly addressable
  • Memory Banks
  • Can read 16-bit data (512K words) from the even- and
    odd-addressed banks simultaneously
  • → needs two memory banks in parallel
  • → BHE control line allows addressing the even bank, the odd
    bank, or both

98
Memory Organization Alignment
  • Endianness
  • One way to model a multi-byte CPU register
  • AX = AH:AL
  • Two ways to store operands in memory
  • Big-endian CPU (IBM 370, M68, SPARC)
  • High-order-byte-first (HOBF)
  • Maps the highest-order byte of the internal
    register → lowest (1st) memory byte address
  • Operand address = address of MSB
  • MOV R1, N → byte at N (1st byte in memory) = MSB of
    register

99
Memory Organization Alignment
  • Little-endian CPU (DEC, Intel)
  • Low-order-byte-first (LOBF)
  • Maps the lowest-order byte of the register → 1st memory
    byte
  • Operand address = address of LSB (1st memory byte)
  • MOV AX, N → byte at N (1st byte in memory) = LSB of
    register
  • AL ← [N], AH ← [N+1]
  • Configurable
  • Can switch between big-/little-endian, or
  • Provide instructions that convert 16-/32-bit
    data between the two byte orderings (80486)
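
The two byte orderings can be observed directly with Python's `int.to_bytes`. The sketch below stores the 16-bit value 0x1234 both ways and then models the little-endian load of AX described above; the helper names are invented for this example.

```python
def store_word(value: int, byteorder: str) -> bytes:
    """Lay out a 16-bit value in memory as 'little' or 'big' endian."""
    return value.to_bytes(2, byteorder)


def load_ax_little_endian(memory: bytes, n: int) -> tuple[int, int]:
    """Model MOV AX, [N] on a little-endian CPU; returns (AH, AL)."""
    al = memory[n]      # first byte in memory is the low-order byte
    ah = memory[n + 1]  # next byte is the high-order byte
    return ah, al


little = store_word(0x1234, "little")
big = store_word(0x1234, "big")
print(little.hex(), big.hex())  # prints "3412 1234"

ah, al = load_ax_little_endian(little, 0)
print(f"AH={ah:02X} AL={al:02X}")  # prints "AH=12 AL=34"
```

A byte-swap instruction of the kind mentioned for the 80486 amounts to converting between these two layouts.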

100
8086 Memory Organization
  • Aligned operand
  • Operand aligned at even-byte (word/dword)
    boundaries
  • Allows single access to read/write one operand
  • Through internal shift/swap mechanism, if
    necessary
  • Misaligned words
  • Word operand does not start at an even address
  • Needs 2 bus cycles to read/write the word (8086)
  • Issues two addresses to access the two
    even-aligned words containing the operand in
    order to access the operand
  • Slower, but transparent to the programmer

101
8086 Memory Organization
  • 8088
  • always 2 cycles for word operations
  • Aligned or not
  • Because of 8-bit external data bus
  • Single memory bank is sufficient
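
The alignment rules on the last two slides reduce to a small decision function. This is a simplified model under the assumptions stated on the slides (one bus cycle per bus-width transfer, no wait states); the function name and parameters are hypothetical.

```python
def word_bus_cycles(address: int, data_bus_bits: int = 16) -> int:
    """Bus cycles needed to transfer one 16-bit word operand."""
    if data_bus_bits == 8:
        # 8088: the 8-bit external bus always moves one byte per cycle.
        return 2
    # 8086: an even-addressed word uses both banks in a single cycle;
    # an odd-addressed word straddles two even-aligned words.
    return 1 if address % 2 == 0 else 2


for addr in (0x1000, 0x1001):
    print(f"8086 word at {addr:#06x}: {word_bus_cycles(addr)} cycle(s)")
print(f"8088 word at 0x1000: {word_bus_cycles(0x1000, data_bus_bits=8)} cycle(s)")
```

In both cases the extra cycle is handled by the bus interface unit, so the program sees the same result and only the timing differs.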

102
8086 Memory Map
  • Memory map: how memory space is allocated
  • ROM area: boot, BIOS
  • RAM: OS/user apps & data
  • Unused
  • Reserved for future hardware/software uses
  • Dedicated to specific system interrupt and reset
    functions, etc.

103
Segment Registers
  • 64K memory segments x 16
  • 16-bit offset each
  • CS, DS, ES, SS

104
Logical and Physical Addresses
  • Physical: 20-bit
  • Logical: 16-bit
  • 16-byte segment boundaries
  • Address translation
  • E.g., CS:IP
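
Because segments start on 16-byte boundaries, the 8086 forms a 20-bit physical address by shifting the 16-bit segment value left by 4 bits and adding the 16-bit offset. The sketch below shows that translation for a CS:IP pair; the function name and the sample register values are made up for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Translate a 16-bit segment:offset pair to a 20-bit physical address."""
    # Segment * 16 gives the segment base; the mask models the 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


cs, ip = 0x1234, 0x0010
print(f"{cs:04X}:{ip:04X} -> {physical_address(cs, ip):05X}")  # prints "1234:0010 -> 12350"
```

Different segment:offset pairs can name the same physical byte; for example 1235:0000 also resolves to 12350.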

105
80286
  • First with Protected Mode
  • Review of 286 Protected Mode next

106
80286
  • Became available in 1982
  • used in IBM AT computer (1984)
  • 16-bit data bus
  • Clock speed 25% faster than 8088, throughput 5
    times greater than 8088
  • 24-bit address bus (16 MB) (vs. 20-bit/1 MB for 8086)

107
80286 Real vs. Protected Modes
  • Larger address space: 24-bit address bus
  • Real Mode vs. Protected Mode
  • Real Mode
  • Power-on default mode
  • Functions like an 8086: uses the 20 least
    significant address lines (1 MB)
  • Software compatible with the 8086/8088
  • 16 new instructions (for Protected Mode
    management)
  • Faster: the 286 is a redesigned processor, plus a higher
    clock rate (6-8 MHz)

108
80286 Real vs. Protected Modes
  • Protected Mode
  • Multi-program environment
  • Each program has a predetermined amount of memory
  • Addressed via segment selectors (physical
    addresses invisible): 16 MB addressable
  • Multiple programs loaded at once (within their
    respective segments), protected from read/write
    by each other

109
80286 Real vs. Protected Modes
  • Protected Mode
  • Cannot be switched back to Real Mode; this avoids
    illegal access by switching back and forth
    between modes
  • A faster 8086 only?
  • MS-DOS requires that all programs be run in Real
    Mode

110
80386 Model
  • Refines 286 Protected Mode
  • Expand to 32-bit registers
  • New Virtual 8086 Mode

111
80386 Review
112
80386DX (aka. 80386)
  • Available in 1985, a major redesign of the 8086/286
  • Compatibility commitment through 2000
  • 32-bit data and address buses (4 GB memory)
  • Real Address Mode: 1 MB visible, as in 286 Real Mode
  • Protected Virtual Address Mode
  • On-board MMU
  • Segmented tasks of 1 byte to 4 GB
  • Segment base, limit, attributes defined by a
    descriptor register
  • Page swapping: 4K pages, up to 64 TB virtual
    memory space
  • Windows, OS/2, Unix/Linux

113
80386DX (aka. 80386)
  • Virtual 8086 mode (a special Protected Mode
    feature) permits multiple 8086 virtual
    machines to multitask (each similar to Real Mode)
  • Windows (multiple MS-DOS sessions)
  • Clock rate
  • Max. 40 MHz, 2 clock pulses per R/W bus cycle
  • External memory cache to avoid wait states
  • Fast SRAM
  • 93% hit rate with 64K cache
  • Compatible instructions (14 new)

114
80386SX
  • 80386SX (for transition to 32-bit)
  • 16-bit data bus/32-bit register
  • 24-bit address bus

115
80386 Real vs. Protected Modes
  • Larger address space: 32-bit address bus (4 GB)
  • Real Mode vs. Protected Mode (refined from 286)
  • Real Mode
  • Power-on default mode
  • Functions like an 8086: (1) uses only the 20 least
    significant address lines (1 MB); (2) segmented
    memory retained (64K)
  • Software compatible with 286
  • New Real Mode features
  • Access to the 32-bit register set
  • Two new segment registers: FS, GS

116
80386 Real vs. Protected Modes
  • Protected Mode
  • New addressing mechanism vs. Real Mode
  • Supports protection levels
  • Segment size: 1 byte to 4 GB (not fixed at 64K)
  • Segment register: a pointer into a descriptor table,
  • not a base address

117
80386 Real vs. Protected Modes
  • Protected Mode
  • descriptor table (8 byte per entry)
  • 32-bit base address of segment
  • segment size
  • access rights
  • memory address = base address (in table) + offset
    (in instruction)

118
80386 Real vs. Protected Modes
  • Protected Mode
  • Paging mechanism
  • Maps a 32-bit linear address (base + offset)
    -> physical address (page frame address)
  • (4K page frames in system memory)
  • 64 TB of virtual memory
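
Two numbers on this slide can be checked with simple arithmetic. With 4K pages, the low 12 bits of a linear address are the offset within the page and the remaining 20 bits select the page. The 64 TB figure is reproduced below under an assumption that is not stated in the slides: a task can name 2^14 segments (8192 through the global table plus 8192 through a local table), each up to 4 GB. The names in the sketch are invented for illustration.

```python
PAGE_SIZE = 4 * 1024  # 4K page frames, as stated on the slide


def split_linear_address(linear: int) -> tuple[int, int]:
    """Split a 32-bit linear address into (page number, offset in page)."""
    return linear >> 12, linear & (PAGE_SIZE - 1)


# Assumption (not from the slides): 2**14 selectable segments of up to 4 GB.
VIRTUAL_BYTES = (2 ** 14) * (2 ** 32)

page, offset = split_linear_address(0x12345678)
print(f"page {page:#x}, offset {offset:#x}")
print(f"virtual space: {VIRTUAL_BYTES // 2 ** 40} TB")
```

Only the page number is translated by the paging unit; the offset passes through unchanged into the physical address.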

119
80386 Real vs. Protected Modes
  • Protected Mode
  • Protection mechanism
  • tasks/data/instructions are assigned a privilege
    level (PL)
  • less-privileged tasks cannot access tasks or
    data segments at a more privileged level
  • running programs are protected from one
    another

120
80386 Real vs. Protected Modes
  • Two Ways to Run 8086 Programs
  • Real Mode
  • Virtual 8086 Mode
  • Virtual 8086 Mode
  • runs multiple 8086 + other 386 (Protected Mode)
    programs independently
  • each sees 1 MB (mapped via paging to anywhere in
    the 4 GB space)
  • running V8086 & Protected Mode simultaneously

121
80386 Processor Model
386
122
80386 Processor Model: BIU + CPU + MMU
  • BIU
  • control 32-bit address and data buses
  • keep instruction queue full (16 bytes)
  • Address pipelining
  • address of next memory location is output halfway
    through current bus cycle
  • more address decode time
  • slower memory chip is OK
  • easier to keep up with faster (2 CLK) bus cycle
    of 386

123
80386 Processor Model BIU
  • dynamic data bus sizing
  • switches between 16-/32-bit data bus on the fly
  • accommodates external 16-bit memory cards or I/O
    devices
  • adjusts bus timing to use only the least
    significant 16 bits

124
80386 Processor Model BIU
  • External memory
  • 4 memory banks (4 x 8 = 32 bits)
  • BE0-BE3 for bank selection
  • access byte, word, or double word
  • aligned operands: 1 bus cycle
  • misaligned operands (crossing a 4-byte boundary):
    2 bus cycles
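
To make the bank-selection idea concrete, the sketch below lists the bus cycles a transfer would need and which of the four byte banks each cycle selects. The function name and the bit-mask encoding (bit n set means bank n is selected) are assumptions of this sketch; on the actual pins the enables are active low, and the order in which the processor performs the two cycles of a misaligned transfer is not modeled.

```python
def bus_cycles(address: int, size: int):
    """Return [(doubleword_address, bank_mask), ...], one entry per bus cycle."""
    cycles = []
    end = address + size
    dword = address & ~3  # start of the 4-byte group containing the address
    while dword < end:
        mask = 0
        for lane in range(4):  # lane n corresponds to byte enable BEn
            if address <= dword + lane < end:
                mask |= 1 << lane
        cycles.append((dword, mask))
        dword += 4
    return cycles


if __name__ == "__main__":
    print(bus_cycles(0x1000, 4))  # aligned double word
    print(bus_cycles(0x1002, 4))  # double word crossing a 4-byte boundary
```

An operand that stays within one 4-byte group needs a single cycle even when it does not start at a multiple of 4; the second cycle is needed only when the operand spills into the next group.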

125
80386 Processor Model CPU
  • CPU = IU (instruction unit) + EU (execution unit)
  • fetching and execution overlap
  • IU
  • retrieves instructions from queue
  • decodes them
  • stores them in decoded queue
  • EU = ALU + registers (32-bit)
  • executes decoded instructions

126
80386 Processor Model MMU
  • Segmentation unit
  • Real mode: generates the 20-bit physical address
  • Protected mode: stores base/size/rights in
    descriptor registers
  • caches descriptors from the tables held in RAM
  • faster operations
  • Paging Unit
  • determines physical addresses associated with
    active segments (divided into 4K pages)
  • virtual memory support to allow larger programs

127
80386 Programming Model
  • General Purpose Registers
  • Data and Address Groups
  • Status and Control Flags
  • VM, RF, NT, IOPL
  • Segment Group

128
80386 Programming Model
  • Special purpose Registers

129
80386 Programming Model
  • Memory Management
  • segment descriptors
  • keep base, size, access rights
  • 3 types of tables: global (GDT), local (LDT),
    interrupt (IDT)
  • addressing
  • index (into a table) + RPL
  • base + offset (from instruction)
  • Paging
  • TLB
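
The "index + RPL" addressing item can be illustrated by decoding a 16-bit segment selector. The sketch assumes the standard x86 selector layout (13-bit index, one table-indicator bit choosing GDT or LDT, 2-bit RPL); the function names are hypothetical.

```python
def decode_selector(selector: int):
    """Split a 16-bit selector into (table index, table name, RPL)."""
    index = (selector >> 3) & 0x1FFF                 # which descriptor
    table = "LDT" if selector & 0b100 else "GDT"     # table indicator bit
    rpl = selector & 0b11                            # requested privilege level
    return index, table, rpl


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor inside its table (8 bytes per entry)."""
    index, _, _ = decode_selector(selector)
    return index * 8


if __name__ == "__main__":
    print(decode_selector(0x001B), descriptor_offset(0x001B))
```

With 8-byte entries, the descriptor's byte offset is simply the selector with its low three bits cleared.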

130
80386 Programming Model
  • Protection (PL)
  • task: CPL
  • instruction: RPL
  • data segment: DPL
  • Gates
  • special descriptors that allow lower PL tasks
    to access higher PL tasks
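
A minimal sketch of the data-segment privilege check implied by CPL, RPL, and DPL. It assumes the x86 numbering in which 0 is the most privileged level and 3 the least, and it covers only ordinary data-segment access; gates and code-segment transfers follow different rules that are not modeled here. The function name is hypothetical.

```python
def can_access_data(cpl: int, rpl: int, dpl: int) -> bool:
    """True if a task at CPL, using a selector with RPL, may load a data
    segment whose descriptor carries DPL."""
    # The weaker (numerically larger) of CPL and RPL is the effective level.
    effective = max(cpl, rpl)
    # Access is granted only if the effective level is at least as
    # privileged as the segment requires.
    return effective <= dpl


if __name__ == "__main__":
    print(can_access_data(cpl=3, rpl=3, dpl=0))  # application touching OS data
    print(can_access_data(cpl=0, rpl=0, dpl=3))  # OS touching application data
```

Including RPL in the check lets privileged code that acts on behalf of a caller be limited to what that caller could access.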

131
80486 Review
132
80486DX
  • 1989: a polished 386 with 6 new OS-level instructions
  • virtually identical to 386 in terms of
    compatibility
  • RISC design concepts
  • fewer clock cycles per operation, a single clock
    cycle for most frequently used instructions
  • Max 50MHz
  • 5 stage execution pipeline
  • Portions of 5 instructions execute at once

133
80486DX
  • Highly Integrated
  • On board 8K memory cache
  • FPP (equivalent to external 80387 co-processor)
  • Twice as fast as 386 at any given clock rate
  • 20 MHz 486 ≈ 40 MHz 386

134
80486SX
  • 80486SX
  • NOT a 16-bit version for transition purposes
    (unlike the 386SX)
  • no coprocessor
  • internal 8K cache retained
  • For low-end applications
  • Max. 33 MHz only

135
80486DX2/DX4 Overdrive Chips
  • Processor speed increased too fast
  • Redesign of microcomputer for compatibility
    becomes harder
  • Solution: separate internal speed from external
    speed, and improve performance independently
  • 80486DX2/DX4: internal clock is twice/three times
    (NOT four times) the external clock, so the chip
    runs faster internally

136
80486DX2/DX4 Overdrive Chips
  • System board design is independent of processor
    upgrade (less expensive components are allowed)
  • Processor operates at maximum speed and data rate
    internally
  • Only slow accesses to external data operate at
    system board rate
  • Internal cache offsets the speed gap
  • 486DX2-66: 66 MHz internal, 33 MHz external
  • 486DX4-100: 100 MHz internal, 33 MHz external (3x)
  • OverDrive sockets for upgrading 486DX/SX to
    486DX2/DX4 (with OverDrive socket pin-outs)
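
The clock relationship can be written as a one-line calculation. The table of multipliers comes from the slides (DX2 = 2x, DX4 = 3x); the dictionary and function names are hypothetical.

```python
# Internal-to-external clock multipliers as given on the slides.
MULTIPLIER = {"DX": 1, "DX2": 2, "DX4": 3}


def internal_clock_mhz(model: str, external_mhz: float) -> float:
    """Internal core clock for a given system board (external) clock."""
    return MULTIPLIER[model] * external_mhz


if __name__ == "__main__":
    for model in ("DX", "DX2", "DX4"):
        print(model, internal_clock_mhz(model, 33))
```

With a nominal 33 MHz board this gives 66 MHz for the DX2 and 99 MHz for the DX4; the marketed 100 MHz figure corresponds to a 33.3 MHz external clock.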

137
486 Processor Features
  • 386 features
  • Real/Protected Modes
  • Memory Management
  • PLs
  • registers and bus sizes
  • New features
  • 6 OS instructions
  • 8K/16K onboard cache (cache was external on the 386)

138
486 Processor Features
  • A better 386
  • 5 stage instruction pipeline
  • IF/ID/EX -> PF/D1/D2/EX/WB
  • PF: prefetch instructions -> queue (2 x 16 bytes)
  • D1: determine opcode
  • D2: determine memory address of operands
  • EX: execute indicated operation
  • WB: update registers

139
486 Processor Features
  • Reduced Instruction Cycle Times
  • 5 stage instruction pipeline (e.g., Fig. 3.18)
  • instruction cycle times
  • 8086: 4 CLK
  • 80386: 2 CLK
  • 80486: 1 CLK (close to RISC)
  • about 2X faster than 386
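
The effect of the 5-stage pipeline on throughput can be sketched with an idealized timing model. It assumes one instruction enters the pipeline every clock and never stalls, which real code does not always achieve; the function names are hypothetical, and the stage names are those used in these slides.

```python
STAGES = ["PF", "D1", "D2", "EX", "WB"]


def pipeline_schedule(n_instructions: int):
    """For each instruction, map stage name -> clock cycle (1-based)."""
    return [
        {stage: i + s + 1 for s, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    ]


def total_cycles(n_instructions: int) -> int:
    """Clocks needed to finish n instructions in an ideal pipeline."""
    if n_instructions == 0:
        return 0
    # The first instruction needs one clock per stage; each later
    # instruction completes one clock after its predecessor.
    return len(STAGES) + n_instructions - 1


if __name__ == "__main__":
    for row in pipeline_schedule(3):
        print(row)
    print(total_cycles(100))
```

For long instruction streams the total approaches one clock per instruction, which is the sense in which the 486 reaches "1 CLK" even though each instruction still passes through five stages.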

140
486 Processor Model: 386 + FPU + Cache
  • 386 units retained: BIU, CPU, MMU
  • new: FPU (80387) + Cache (8K/16K)
  • FPU
  • 387 onboard
  • 0.8 micron process -> transistor count increased
    (275K -> 1 million)
  • simplified system board design
  • speeds up FP operations

141
(No Transcript)
142
486 Processor Model Cache
  • Cache (8K; 16K on DX4)
  • Function: bridge the processor-memory bandwidth gap
  • 8088: 4.77 MHz
  • 80486: 50 MHz
  • Pentium: 100 MHz
  • Pentium Pro: 133 MHz
  • Main memory (DRAM) is relatively slow
  • Fast static RAMs (SRAM) used as cache

143
486 Processor Model Cache
  • Organization
  • 8K
  • 4-way set associative
  • 4 direct mapped caches wired in parallel
  • each block maps to a set of 4 lines
  • unified: data and code in the same cache
  • write-through: update both cache and memory on
    write operations

144
486 Processor Model Cache
  • locality (why do caches help?)
  • spatial locality: e.g., array of data
  • temporal locality: e.g., loops in code
  • operations on hit/miss
  • 128-bit cache lines
  • 32-bit x N to capture locality (N = 4)
  • 128 bits = 16 bytes
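
Putting the figures from these slides together (8K cache, 4-way set associative, 16-byte lines) gives 128 sets. The sketch below shows one way a 32-bit address could be split into tag, set index, and byte offset under those figures; the constant and function names are hypothetical.

```python
CACHE_BYTES = 8 * 1024   # 8K cache
WAYS = 4                 # 4-way set associative
LINE_BYTES = 16          # 128-bit cache line
SETS = CACHE_BYTES // (WAYS * LINE_BYTES)


def split_address(address: int):
    """Return (tag, set index, byte offset within the line)."""
    offset = address % LINE_BYTES
    set_index = (address // LINE_BYTES) % SETS
    tag = address // (LINE_BYTES * SETS)
    return tag, set_index, offset


if __name__ == "__main__":
    print(SETS)
    print([hex(part) for part in split_address(0x12345678)])
```

Addresses that differ by a multiple of 2 KB (16 bytes x 128 sets) fall into the same set and compete for its four lines; the tag distinguishes them.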

145
486 Processor Model Cache
  • Mapping
  • memory -> many-to-many -> cache
  • Data RAM: holds memory data
  • Tag RAM: holds memory address information
  • 3 methods of mapping
  • fully associative: memory block to any cache line
  • direct mapped: memory block to one specific line
  • thrashing
  • set associative: memory block to a set of cache
    lines

146
486 Processor Model Cache
  • Replacement policy (LRU)
  • valid bits: are all 4 lines in use?
  • NO -> use any unused line
  • YES -> find one to replace
  • LRU bits: which line is least recently used
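
The replacement decision for one 4-line set can be sketched as below. This models exact LRU with an ordered list, which is a simplification: hardware typically tracks recency with a few LRU bits per set and may only approximate this ordering. The class and method names are hypothetical.

```python
class CacheSet:
    """One set of a set-associative cache, holding up to `ways` tags."""

    def __init__(self, ways: int = 4):
        self.ways = ways
        self.lines = []  # tags in use, least recently used first

    def access(self, tag: int) -> bool:
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.lines:
            # Hit: move the tag to the most recently used position.
            self.lines.remove(tag)
            self.lines.append(tag)
            return True
        if len(self.lines) == self.ways:
            # All lines valid: replace the least recently used one.
            self.lines.pop(0)
        # Otherwise an unused line is available; either way, fill it.
        self.lines.append(tag)
        return False


if __name__ == "__main__":
    cache_set = CacheSet()
    for tag in [1, 2, 3, 4, 1, 5, 2, 1]:
        print(tag, "hit" if cache_set.access(tag) else "miss")
```

In the example sequence, re-using tag 1 before tag 5 arrives protects it from eviction, so tag 2 is the line replaced.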

147
(No Transcript)
148
(No Transcript)
149
Pentium Review
150
Pentium Superscalar Processor
  • available in 1993
  • 32-bit architecture
  • Superscalar architecture
  • Scaling: scaling down etchable feature size to
    increase complexity of IC (e.g., DRAM)
  • 10 microns (4004) to 0.13 microns (2001)
  • Superscalar: go beyond simply scaling down
  • Two instruction pipelines each with own ALU,
    address generation circuitry, data cache
    interface
  • Execute two different instructions simultaneously
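
A toy model of two-pipeline issue. It assumes a deliberately simple pairing rule (two adjacent instructions issue together only if the second neither reads nor overwrites the register the first writes); the actual pairing rules of the processor are more restrictive and are not reproduced here. The instruction encoding and function name are hypothetical.

```python
def issue_cycles(program) -> int:
    """program: list of (dest_register, source_registers) tuples.
    Returns the number of issue cycles with at most two instructions each."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program):
            dest1, _ = program[i]
            dest2, sources2 = program[i + 1]
            # Pair only if the second instruction is independent of the first.
            if dest1 not in sources2 and dest1 != dest2:
                i += 2
                cycles += 1
                continue
        i += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    program = [
        ("a", ("x", "y")),
        ("b", ("x", "z")),
        ("c", ("a", "b")),
        ("d", ("c",)),
    ]
    print(issue_cycles(program))
```

The first two instructions are independent and share a cycle; the last two form a dependency chain, so the second pipeline sits idle for them.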

151
Pentium Superscalar Processor
  • Onboard cache
  • Separate 8K data and code caches to avoid access
    conflicts
  • FPP
  • Instruction pipeline: 8 stages
  • Optimized floating point functions
  • 5x-10x the FLOPS of 486
  • 2x the performance of 486 at any clock rate

152
Pentium Superscalar Processor
  • Compatibility with 386/486
  • Internal 32-bit registers and address bus
  • Data bus expanded to 64 bits for higher data
    transfer rate
  • Compare 8088 to 386SX transition

153
Pentium Superscalar Processor
  • non-clone competition from AMD, Cyrix
  • development of brand identity by Intel

154
Pentium Pro Review
155
Pentium Pro: Two Chips in One
  • Became available in 1995
  • Superscalar of degree 3
  • Can execute 3 instructions simultaneously
  • Optimized for 32-bit operating systems (e.g.,
    Windows NT, OS/2 Warp)
  • Two separate silicon dies in the same package
  • Processor: 0.35 micron, 5.5 million transistors
  • 256 KB (or 512 KB) Level 2 cache included in the
    package, 15.5 million transistors in a smaller area

156
Pentium Pro: Two Chips in One
  • On-board Level 2 cache
  • Simplifies system board design
  • Requires less space
  • Gains faster communication with processor
  • Internal (level 1) cache: 8K code + 8K data
  • Pentium Pro 133 ≈ 2x Pentium 66 ≈ 4x 486DX2-66

157
Pentium Pro: Dynamic Execution
  • Dynamic execution: reduce idle processor time by
    predicting instruction behavior
  • Multiple Branch Prediction: looks as far as 30
    instructions ahead to anticipate program branches
  • Data Flow Analysis: looks at upcoming
    instructions and determines whether they are
    ready for processing, depending on other
    instructions. Determines the optimal execution
    sequence.
  • Speculative Execution: executes instructions in
    a different order than entered. Speculative results
    are stored until final states can be determined.
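
The data flow analysis idea can be sketched as a scheduler that issues any instruction whose inputs are ready, regardless of program order. This is an illustrative model only: it tracks read-after-write dependencies, assumes every instruction takes one cycle, ignores branches and the final in-order commit of results, and uses an issue width of 3 taken from the "degree 3" figure in the slides. The instruction encoding and function name are hypothetical.

```python
def dataflow_schedule(program, width: int = 3):
    """program: list of (dest_register, source_registers) tuples.
    Returns a list of cycles, each a list of instruction indices issued."""
    producer = {}   # register -> index of the latest earlier writer
    depends_on = []
    for i, (dest, sources) in enumerate(program):
        depends_on.append({producer[s] for s in sources if s in producer})
        producer[dest] = i

    done = set()
    pending = list(range(len(program)))
    cycles = []
    while pending:
        # Ready = all producing instructions finished in an earlier cycle.
        ready = [i for i in pending if depends_on[i] <= done][:width]
        cycles.append(ready)
        done.update(ready)
        pending = [i for i in pending if i not in ready]
    return cycles


if __name__ == "__main__":
    program = [
        ("a", ("m",)),       # 0
        ("b", ("a",)),       # 1: needs the result of 0
        ("c", ("n",)),       # 2
        ("d", ("p",)),       # 3
        ("e", ("c", "d")),   # 4: needs the results of 2 and 3
    ]
    print(dataflow_schedule(program))
```

In the example, instructions 2 and 3 run ahead of instruction 1 because their inputs are available first, so five instructions finish in two cycles instead of being held up behind the dependency on instruction 0.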