Chapter 2: Data Manipulation - PowerPoint PPT Presentation

Provided by: rfo7

1
Chapter 2: Data Manipulation

Here, we concentrate on the processing aspects of
Computer Architecture
CPU and machine language
ALU and types of operations
Communication with Main Memory, I/O and Secondary
Storage Devices
And we will briefly examine different kinds of
machine architectures
RISC vs. CISC
Pipelining and Parallel Processing

2
The Central Processing Unit

CPU consists of 3 components
control unit -- performs the machine cycle and
decodes instructions
arithmetic/logic unit -- executes all arithmetic
and logic operations
registers -- for short-term storage of data and
instructions
general-purpose registers to store data
special-purpose registers to store information
that the CPU needs to perform the machine cycle
next instruction location
the current instruction
status flags

The CPU is referred to as
the brain of the computer
responsible for processing all instructions
within the computer
instructions are machine language instructions
performs the machine cycle

3
Machine Language

The most primitive of programming languages
Each instruction is a single operation for the
CPU such as an add, store or branch
Three kinds of instructions
data transfer
load, store, move: from/to main memory to/from
registers
arithmetic/logic
+, -, *, /, AND, OR, NOT, XOR, shift, rotate, =,
<, >, <>
control
branch and jump
The set of machine instructions is known as the
instruction set of the computer, and is part of
the architectural design of the computer

4
A Typical Machine Language

The book offers a typical machine language but
it is actually very simplified
16 instructions
Load register with datum from memory
Load register with immediate datum
Move from one register to another
Store from register to memory
Add integer, add floating point
AND, OR, XOR
Rotate right by X bits
Jump if register value is equal to 0
Halt
Details are found in appendix C

5
More on this Machine Language

This sample machine is shown in figure 2.4, p. 86
Instructions are stored in 2 bytes or 4 hex
digits
Digit 1 is the opcode (16 different operations)
Digits 2-4 are operand fields and differ from
instruction to instruction
For Add, OR, AND, XOR, digits 3-4 are the two
source registers, digit 2 is the destination
register
For Load and Store, digit 2 is the register to be
loaded or stored, digits 3-4 are the 1 byte
address or the 1 byte datum

For Move, digit 2 is ignored, digit 3 is the
source register and digit 4 is the destination
register
For Rotate, digit 2 is the source register, digit
3 is ignored, and digit 4 is the number of bits
to rotate
For Jump, digit 2 is the register to test and
digits 3-4 are the 1 byte location to jump to
For Halt, digits 2-4 are ignored
Notice that for Loads and Stores, we only have 1
byte to specify an address, so our machine will
only address memory locations 0-255
Also note that each register is denoted by 4
bits, so we will only have 16 registers (numbered
0-15)
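
The field layout above can be sketched in a few lines of Python. This is only an illustration: the function name decode and the dictionary keys are made up here and do not come from the book.

```python
def decode(instruction):
    """Split a 2-byte (4 hex digit) instruction into its fields.

    The name 'decode' and the key names are illustrative assumptions.
    """
    if not 0 <= instruction <= 0xFFFF:
        raise ValueError("instruction must fit in 2 bytes")
    return {
        "opcode": (instruction >> 12) & 0xF,  # digit 1
        "digit2": (instruction >> 8) & 0xF,   # digit 2, usually a register
        "digit3": (instruction >> 4) & 0xF,   # digit 3
        "digit4": instruction & 0xF,          # digit 4
        "byte": instruction & 0xFF,           # digits 3-4 as a 1 byte address or datum
    }


fields = decode(0x156C)
print(fields["opcode"], fields["digit2"], hex(fields["byte"]))  # 1 5 0x6c
```

Because each register field is a single hex digit, the masks with 0xF make the 16-register limit visible, and the 0xFF mask shows why only addresses 0-255 can be reached.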

6
Machine Cycle

The machine cycle consists of 3 parts
Fetch
fetch the instruction pointed to by the Program
Counter (PC)
for this machine, we fetch 2 consecutive bytes
store the result in the Instruction Register (IR)
increment the PC to point at the next instruction
For this machine, we increment the PC by 2
Decode the instruction in the IR
determine the operation specified by the opcode
of the instruction, and the location of the data
Source and Destination registers, Main Memory
location
Execute the instruction
Perform the operation
data movement or the ALU operation
If Halt, then stop, otherwise return to the Fetch
part

7
Executing Our Sample Program

The program counter currently stores A0
fetch: instruction 15 and 6C, increment PC to A2
decode: 156C means load register 5 with the
value stored in memory cell 6C
execute: copy the value from 6C to register 5
Continue with the instruction at A2
fetch: instruction 16 and 6D, increment PC to A4
decode: load register 6 with value at 6D
execute: perform the load

Third instruction at A4
fetch: 50 and 56, increment PC to A6
decode: add values in registers 5 and 6, store
in 0
execute: perform the add
Fourth instruction at A6
fetch: 30 and 6E, increment PC to A8
decode: store value in register 0 at memory
location 6E
execute: perform the copy from register to
memory
Fifth instruction at A8
fetch: C0 and 00, increment PC to AA
decode: halt
execute: stop program
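
The fetch-decode-execute trace of this program can be reproduced with a small simulator. The sketch below is a minimal one under stated assumptions: it implements only the four opcodes that appear in the trace (1 = load from memory, 3 = store, 5 = add, C = halt), it assumes 8-bit registers, and the values placed in cells 6C and 6D are made up for the example.

```python
def run(memory, pc):
    """Run the machine cycle from address pc until a Halt; return the registers."""
    registers = [0] * 16  # registers 0-15, assumed to hold 8 bits each
    while True:
        # Fetch: read 2 consecutive bytes into the IR, then advance the PC by 2.
        ir = (memory[pc] << 8) | memory[pc + 1]
        pc += 2
        # Decode: digit 1 is the opcode, digits 2-4 are the operand fields.
        opcode = ir >> 12
        d2, d3, d4 = (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
        address = ir & 0xFF
        # Execute: only the opcodes used by the sample program are handled.
        if opcode == 0x1:    # load register d2 from memory
            registers[d2] = memory[address]
        elif opcode == 0x3:  # store register d2 to memory
            memory[address] = registers[d2]
        elif opcode == 0x5:  # add registers d3 and d4 into register d2
            registers[d2] = (registers[d3] + registers[d4]) & 0xFF
        elif opcode == 0xC:  # halt
            return registers
        else:
            raise NotImplementedError(f"opcode {opcode:X} is not part of this sketch")


memory = [0] * 256
program = [0x15, 0x6C, 0x16, 0x6D, 0x50, 0x56, 0x30, 0x6E, 0xC0, 0x00]
memory[0xA0:0xA0 + len(program)] = program  # program starts at A0
memory[0x6C] = 7  # assumed operand values, not from the slides
memory[0x6D] = 5
registers = run(memory, 0xA0)
print(memory[0x6E])  # 12
```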

8
ALU Instructions

In chapter 1, we introduced binary addition, AND,
OR, NOT and XOR
In section 2.4 (pages 95-98), the book covers
these in more detail
Aside from addition, AND, OR, NOT, XOR, we have
Shift and Rotate
Shift: move the data one bit to the left or
right
Left shift: 10110101 → 01101010
notice that the first 1 falls off and a 0 is
added on the right
Rotate: same as shift but the bits wrap around
so that none fall off as in the Shift
Right rotate: 01100011 → 10110001
Shift and Rotate can move data more than 1
position
For instance, left shift by 4: 00000010 → 00100000
Shift is an easy way to multiply or divide by 2
Rotate is an easy way to manipulate a binary
number to gain access to a given bit
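
The 8-bit examples on this slide can be checked with a short Python sketch. The function names are chosen here for illustration, and an 8-bit width is assumed to match the slide's bit patterns.

```python
MASK = 0xFF  # assume 8-bit values, as in the examples above


def shift_left(value, count=1):
    """Bits pushed past the left end fall off; zeros enter on the right."""
    return (value << count) & MASK


def shift_right(value, count=1):
    """Bits pushed past the right end fall off; zeros enter on the left."""
    return (value & MASK) >> count


def rotate_right(value, count=1):
    """Bits that leave the right end wrap around to the left end."""
    count %= 8
    value &= MASK
    return ((value >> count) | (value << (8 - count))) & MASK


print(format(shift_left(0b10110101), "08b"))    # 01101010
print(format(rotate_right(0b01100011), "08b"))  # 10110001
print(format(shift_left(0b00000010, 4), "08b")) # 00100000
```

Shifting left by one position doubles an unsigned value as long as no 1 bit falls off the end, which is why shift is a cheap way to multiply or divide by 2.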

9
Using Logic Functions

AND and OR can be used to mask a value
Imagine that you want to know if a given value is
positive or negative. You can do this by
determining whether the first bit is a 0 or 1:
AND the value with 10000000
Load R1, value
Load R2, 80
AND R0, R1, R2
R0 is 0 if R1 is positive, 80 if R1 is negative
Jump R0, location
OR can be used to set bits in a number
XOR is often used to complement a number
10101010 XOR 11111111 → 01010101, which complements
10101010
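
The three masking ideas on this slide can be sketched in Python as follows. The helper names are illustrative assumptions, and values are treated as 8-bit patterns whose leftmost bit is the sign bit.

```python
SIGN_BIT = 0b10000000  # hex 80, the mask loaded into R2 above


def is_negative(value):
    """AND with the sign mask: the result is 0 for positive, 80 for negative."""
    return (value & SIGN_BIT) != 0


def set_bits(value, mask):
    """OR forces every bit that is 1 in the mask to 1."""
    return (value | mask) & 0xFF


def complement(value):
    """XOR with all 1s flips every bit."""
    return (value ^ 0xFF) & 0xFF


print(is_negative(0b10110101))                   # True
print(format(set_bits(0b00000001, 0x80), "08b")) # 10000001
print(format(complement(0b10101010), "08b"))     # 01010101
```

Note that XOR with 00000000 leaves a value unchanged; it is XOR with 11111111 that produces the complement.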

10
CPU/Memory Interface

We have seen how the CPU functions. How does the
CPU communicate with memory for a load or store?
A device called the bus is used to transport data
between CPU and memory

Reading from memory
CPU places memory address on the address bus
CPU sends read request to memory
Memory retrieves data from address and sends the
data on the data bus to CPU
Writing to memory
CPU places memory address on address bus and data
on data bus
CPU sends write request to memory
memory reads data and stores it
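
The read and write sequences above can be modeled with a toy bus. This is a sketch only: the classes Bus and Memory and the functions cpu_read and cpu_write are hypothetical names, and a real bus is hardware signalling rather than method calls.

```python
class Bus:
    """Toy bus with the three subparts: address, data and control lines."""

    def __init__(self):
        self.address = 0
        self.data = 0
        self.control = None  # "read" or "write" request


class Memory:
    def __init__(self, size=256):
        self.cells = [0] * size

    def respond(self, bus):
        """Memory reacts to whatever request is currently on the bus."""
        if bus.control == "read":
            bus.data = self.cells[bus.address]   # send data back on the data bus
        elif bus.control == "write":
            self.cells[bus.address] = bus.data   # take data off the data bus


def cpu_read(bus, memory, address):
    bus.address = address   # CPU places the address on the address bus
    bus.control = "read"    # CPU sends the read request
    memory.respond(bus)
    return bus.data


def cpu_write(bus, memory, address, value):
    bus.address = address   # address on the address bus
    bus.data = value        # data on the data bus
    bus.control = "write"   # CPU sends the write request
    memory.respond(bus)


bus, memory = Bus(), Memory()
cpu_write(bus, memory, 0x6E, 12)
print(cpu_read(bus, memory, 0x6E))  # 12
```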

11
More on the Bus

The bus is actually just a set of parallel wires
Each wire transmits 1 bit (1 electrical current)
from one location to another
Use 8 wires for 1 byte, 16 wires for 2 bytes, etc
The size of the data bus is usually the word
size of the computer
Bus is broken into subparts
address
data
control
Bus is extended to connect CPU and Memory to
other devices (I/O, Storage)

See also figure 2.8, p. 100

12
Communicating with other devices

Modern devices have their own controllers:
processors that have their own registers and
instructions
CPU commands a device to activate and lets it
perform its task without supervision
Controllers are often placed on cards that are
plugged into the available slots in the back of
the computer's motherboard
One type of controller is in charge of memory,
providing direct memory access (DMA) so that
devices can communicate directly with main memory
without requiring the CPU's attention
Alleviates the Von Neumann bottleneck
See figure 2.8, p. 100

13
CPU and I/O Communication

Another problem is how can the CPU address a
given device (or controller) when the device
might be placed in any number of expansion slots
on the motherboard?
Memory-mapped I/O uses memory as an intermediary
between CPU and the device
A given memory location is specified by the
Operating System so that the CPU will perform all
communications with the device at that given
location
This location is known as the port for that
device
See figure 2.9, p. 102
Additionally, a block of memory can be used as a
buffer to temporarily store information being
passed between the two devices such as with disk
storage or video output

14
Communication Rates

Rates vary greatly between devices
Communication rates are in terms of bits per
second and describe the rate of transfer
MODEMs might operate on the order of 1200 bits
per second (bps), 2400 bps, 9600 bps, or up to
57.6 Kbps
Networks operate on the order of millions of bits
per second (10 Mbps or 100 Mbps)
Main memory and disk drives might operate at
even faster speeds
Two forms of communication: parallel and serial
Parallel -- 1 byte at a time (or 1 word at a
time)
Serial -- 1 bit at a time (MODEMs for instance)
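
To get a feel for how far apart these rates are, the sketch below computes how long a transfer would take at several of the rates listed above. The 1,000,000-byte file size is an assumed example, and the calculation ignores protocol overhead.

```python
def transfer_seconds(num_bytes, bits_per_second):
    """Time to move num_bytes at the given rate (8 bits per byte, no overhead)."""
    return num_bytes * 8 / bits_per_second


FILE_BYTES = 1_000_000  # assumed example size, not from the slides

rates = {
    "9600 bps modem": 9_600,
    "57.6 Kbps modem": 57_600,
    "10 Mbps network": 10_000_000,
    "100 Mbps network": 100_000_000,
}
for name, bps in rates.items():
    print(f"{name}: {transfer_seconds(FILE_BYTES, bps):.2f} seconds")
```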

15
Comparing CPUs and Power

How do we compare CPUs in terms of their
computing power?
Commonly, we cite the speed of the machine cycle,
which operates in 1 (or a few) clock cycles
The clock is measured in terms of number of
pulses per second, or in Megahertz (millions of
pulses/sec)
Unfortunately, machine cycles are not equivalent
across machines
One clock cycle might encompass the entire
fetch-decode-execute process whereas another
might only be the execute phase
Some computers overlap the fetch-decode-execute
process

16
CISC vs. RISC

One major distinction between CPUs is their
instruction set
Simple or reduced instruction set (RISC) or
complex instruction set (CISC)?
A CISC CPU can do a lot in 1 instruction, but the
instruction takes longer to perform
A RISC CPU is cheaper with a slower clock speed,
but possibly faster
So, which one is better?
Examples
Pentium chips are CISC
PowerPC chips are RISC

With the advent of integrated circuits in the
1960s
it became easier to add complexity to instruction
sets
CISC computers were developed to make it easier
for the programmer
In the 1980s, it was discovered that RISC may
outperform CISC
cheaper hardware means you can add more
processing elements
simpler instructions can be executed faster
CISC -- fewer instructions, each executed more slowly
RISC -- more instructions, each executed faster
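
The trade-off between instruction count and time per instruction can be made concrete with the usual estimate: run time = instructions x clock cycles per instruction / clock rate. The sketch below uses made-up numbers purely for illustration; they are not measurements of any real Pentium or PowerPC chip.

```python
def run_time_seconds(instruction_count, cycles_per_instruction, clock_hz):
    """Estimated run time = instructions * cycles per instruction / clock rate."""
    return instruction_count * cycles_per_instruction / clock_hz


CLOCK_HZ = 100_000_000  # assume both hypothetical machines run at 100 MHz

# Hypothetical CISC machine: fewer instructions, each taking more cycles.
cisc = run_time_seconds(1_000_000, 4, CLOCK_HZ)
# Hypothetical RISC machine: more instructions, each taking fewer cycles.
risc = run_time_seconds(2_000_000, 1, CLOCK_HZ)

print(f"CISC: {cisc:.3f} s, RISC: {risc:.3f} s")
```

With these assumed numbers the RISC machine finishes first even though it executes twice as many instructions. Different numbers can reverse the result, which is why the slide asks which one is better rather than answering.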

17
Benchmarks

Since the clock speed does not give a good
indication of the processor's speed, how do we
compare processors?
Determine the actual time it takes to execute
some common applications
These applications are known as benchmarks
Suites of benchmarks exist to test out different
aspects of a processor's power
floating point operations
vector (1-d array) and matrix (2-d array)
operations
looping
graphics and communications
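
The idea of timing a fixed workload can be sketched in Python with time.perf_counter from the standard library. The two workloads below are tiny stand-ins invented for this example; real benchmark suites run far larger and more carefully chosen programs.

```python
import time


def benchmark(task, repeats=5):
    """Return the best wall-clock time in seconds over several runs of task()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        best = min(best, time.perf_counter() - start)
    return best


def floating_point_task():
    """Stand-in for a floating point benchmark: sum a series of reciprocals."""
    total = 0.0
    for i in range(1, 100_000):
        total += 1.0 / i
    return total


def matrix_task(size=30):
    """Stand-in for a matrix (2-d array) benchmark: multiply a matrix by itself."""
    a = [[i + j for j in range(size)] for i in range(size)]
    return [[sum(a[i][k] * a[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


for name, task in [("floating point", floating_point_task), ("matrix", matrix_task)]:
    print(f"{name}: {benchmark(task):.6f} seconds")
```

Taking the best of several runs reduces the influence of other activity on the machine; the printed times will differ from computer to computer, which is the point of a benchmark.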

18
More on the Machine Cycle

Notice that our machine cycle has three
components that can be performed using three
different units of hardware
Fetch -- bus, main memory (or cache)
Decode -- control unit
Execute -- ALU
Given that two hardware units are inactive when
the third is being used, maybe we can take
advantage of this?

19
Pipelining

This observation leads to pipelining, overlapping
multiple machine cycles
Fetch instr1   Decode instr1   Execute instr1
               Fetch instr2    Decode instr2    Execute instr2
                               Fetch instr3     Decode instr3
In this case, we might speed up the processing by
3 times!
The idea is to maximize throughput (amount
performed in some unit of time), not speed up a
single operation
There are problems with pipelining, for instance,
a branch instruction means everything in the pipe
is useless
CISC is very difficult to pipeline, but RISC is
not; this is one reason that RISC might be faster
than CISC
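
The "3 times" figure can be checked with a little arithmetic. The sketch below assumes an idealized pipeline: three stages (fetch, decode, execute), one clock cycle per stage, and no branches or other stalls. The function names are illustrative.

```python
def cycles_unpipelined(n, stages=3):
    """Each instruction finishes all of its stages before the next one starts."""
    return n * stages


def cycles_pipelined(n, stages=3):
    """The first instruction takes 'stages' cycles; each later one adds 1 cycle."""
    return stages + n - 1 if n > 0 else 0


def speedup(n, stages=3):
    return cycles_unpipelined(n, stages) / cycles_pipelined(n, stages)


for n in (3, 10, 1000):
    print(n, cycles_unpipelined(n), cycles_pipelined(n), round(speedup(n), 3))
```

For only 3 instructions the speedup is 9 / 5 = 1.8; it approaches 3 as the number of instructions grows, so 3 times is an upper bound that a branch-free stream gets close to but never reaches.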

20
Multiprocessing

Another idea is to add processors to the computer
1 task can be distributed across processors
SIMD (single instruction multiple data)
if you have to do the same computation to a
number of data, do them all in parallel
MIMD (multiple instruction multiple data)
either execute several different programs at once
on different processors, or subdivide the program
into parallel parts and execute each on a
separate processor
this can be very difficult and leads to concepts of
parallel programming
Two difficulties are dynamic load balancing and
scaling