Computer Architecture and Organization - PowerPoint PPT Presentation

Slides: 35
Provided by: milesmurdo
Transcript and Presenter's Notes

1
Computer Architecture and Organization
Miles Murdocca and Vincent Heuring
Chapter 10 Advanced Computer Architecture
2
Chapter Contents
  • 10.1 Parallel Architecture
  • 10.2 Superscalar Machines and the PowerPC
  • 10.3 VLIW Machines, and the Itanium
  • 10.4 Case Study: Extensions to the Instruction
    Set: The Intel MMX/SSEX and Motorola AltiVec
    SIMD Instructions
  • 10.5 Programmable Logic Devices and Custom ICs
  • 10.6 Unconventional Architectures

3
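Parallel Speedup and Amdahl's Law
  • In the context of parallel processing, speedup
    can be computed as the ratio of sequential
    execution time to parallel execution time

$$S = \frac{T_{\text{sequential}}}{T_{\text{parallel}}}$$

Amdahl's law, for p processors and a fraction f
of unparallelizable code

$$S = \frac{1}{f + \frac{1 - f}{p}}$$

For example, if f = 10% of the operations must
be performed sequentially, then speedup can be no
greater than 10 regardless of how many processors
are used.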
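The bound can be checked numerically. The following is a minimal Python sketch, assuming the usual form of Amdahl's law, S = 1 / (f + (1 - f)/p); the function name `amdahl_speedup` is illustrative and does not come from the slides.

```python
def amdahl_speedup(f: float, p: int) -> float:
    """Speedup on p processors when a fraction f of the work is sequential.

    Assumes the standard form of Amdahl's law: S = 1 / (f + (1 - f) / p).
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if p < 1:
        raise ValueError("p must be at least 1")
    return 1.0 / (f + (1.0 - f) / p)


if __name__ == "__main__":
    # With f = 0.10 the speedup approaches, but never reaches, 1/f = 10.
    for p in (1, 10, 100, 1000, 1_000_000):
        print(f"p = {p:>9}: speedup = {amdahl_speedup(0.10, p):.3f}")
```

As p grows, the (1 - f)/p term vanishes and the speedup tends to 1/f, which is where the limit of 10 comes from.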
4
Efficiency and Throughput
  • Efficiency is the ratio of speedup to the
    number of processors used. For a speedup of 5.3
    with 10 processors, the efficiency is

$$\frac{5.3}{10} = 0.53 = 53\%$$

Throughput is a measure of how much computation
is achieved over time, and is of special concern
for I/O-bound and pipelined applications. For the
case of a four-stage pipeline that remains
filled, in which each pipeline stage completes
its task in 10 ns, the average time to complete
an operation is 10 ns even though it takes 40 ns
to execute any one operation. The overall
throughput for this situation is then

$$\frac{1\ \text{operation}}{10\ \text{ns}} = 10^{8}\ \text{operations per second}$$
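The two worked examples on this slide reduce to simple arithmetic. The sketch below restates them in Python; the function names are illustrative only, and the pipeline is assumed to stay full with equal stage times, as the slide describes.

```python
def efficiency(speedup: float, processors: int) -> float:
    """Efficiency = speedup / number of processors."""
    return speedup / processors


def pipeline_latency(stages: int, stage_time_s: float) -> float:
    """Time for one operation to pass through every stage."""
    return stages * stage_time_s


def pipeline_throughput(stage_time_s: float) -> float:
    """Operations completed per second by a pipeline that remains filled."""
    return 1.0 / stage_time_s


if __name__ == "__main__":
    ns = 1e-9
    print("efficiency:", efficiency(5.3, 10))
    print("latency (s):", pipeline_latency(4, 10 * ns))
    print("throughput (ops/s):", pipeline_throughput(10 * ns))
```

Latency depends on the number of stages, while steady-state throughput depends only on the stage time.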
5
Flynn Taxonomy
Classification of architectures according to
the Flynn taxonomy: (a) SISD; (b) SIMD; (c) MIMD;
(d) MISD.
6
Network Topologies
Network topologies: (a) crossbar; (b) bus; (c)
ring; (d) mesh; (e) star; (f) tree; (g) perfect
shuffle; (h) hypercube.
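Two of these topologies have compact address-based definitions. The figure's node labels are not in the transcript, so this sketch assumes the usual conventions: hypercube neighbors differ in exactly one address bit, and a perfect shuffle connects each node to the left rotation of its address. The function names are illustrative.

```python
def hypercube_neighbors(node: int, dimensions: int) -> list[int]:
    """Neighbors of a node in a hypercube: flip each address bit in turn."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def perfect_shuffle(node: int, address_bits: int) -> int:
    """Destination of a node under a perfect shuffle: rotate the address left by one bit."""
    mask = (1 << address_bits) - 1
    return ((node << 1) | (node >> (address_bits - 1))) & mask


if __name__ == "__main__":
    # 3-dimensional hypercube (8 nodes) and an 8-node perfect shuffle.
    for n in range(8):
        print(f"{n:03b}: hypercube -> {hypercube_neighbors(n, 3)}, "
              f"shuffle -> {perfect_shuffle(n, 3):03b}")
```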
7
Crossbar
Internal organization of a crossbar.
8
Crosspoint Settings
(a) Crosspoint settings for connections 0 → 3
and 3 → 0; (b) adjusted settings to accommodate
connection 1 → 1.
9
Three-Stage Clos Network
10
12-Channel Three-Stage Clos Network with n = p = 6
11
12-Channel Three-Stage Clos Network with n = p = 2
12
12-Channel Three-Stage Clos Network with n = p = 4
13
12-Channel Three-Stage Clos Network with n = p = 3
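The four 12-channel configurations can be compared by counting crosspoints. The transcript does not define n and p, so this sketch assumes the conventional reading: n is the number of inputs on each input-stage crossbar and p is the number of middle-stage crossbars. Under that assumption there are N/n input crossbars of size n × p, p middle crossbars of size (N/n) × (N/n), and N/n output crossbars of size p × n.

```python
def clos_crosspoints(channels: int, n: int, p: int) -> int:
    """Crosspoints in a symmetric three-stage Clos network.

    Assumed meaning of the parameters (not stated in the transcript):
      channels: total inputs (and outputs), N
      n: inputs per input-stage crossbar
      p: number of middle-stage crossbars
    """
    if channels % n != 0:
        raise ValueError("n must divide the number of channels")
    groups = channels // n                # crossbars in the input and output stages
    outer = 2 * groups * n * p            # input stage plus output stage
    middle = p * groups * groups          # p crossbars, each groups x groups
    return outer + middle


if __name__ == "__main__":
    N = 12
    print("single crossbar:", N * N)
    for k in (6, 2, 4, 3):
        print(f"n = p = {k}:", clos_crosspoints(N, k, k))
```

For a network this small, the savings over a single 12 × 12 crossbar are modest and depend on the choice of n and p; the advantage of the multistage arrangement grows with the number of channels.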
14
C function computes $(x^2 + y^2) \times y^2$
15
Dependency Graph
(a) Control sequence for C program; (b)
dependency graph for C program.
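The idea behind the dependency graph can be sketched in Python. The original C source is not in the transcript, so this reads the slide's expression as (x^2 + y^2) × y^2 and uses hypothetical names (`func`, `DEPENDENCIES`, `schedule_levels`). Operations at the same level have no dependence on one another and could run in parallel.

```python
def func(x: int, y: int) -> int:
    """Compute (x^2 + y^2) * y^2 as a sequence of simple operations."""
    x2 = x * x          # independent of y2
    y2 = y * y          # independent of x2
    total = x2 + y2     # needs both squares
    return total * y2   # needs the sum and y2


# Each operation maps to the operations whose results it consumes.
DEPENDENCIES = {
    "x2": [],
    "y2": [],
    "sum": ["x2", "y2"],
    "result": ["sum", "y2"],
}


def schedule_levels(deps: dict[str, list[str]]) -> dict[str, int]:
    """Earliest step at which each operation can run, given unlimited processors."""
    level: dict[str, int] = {}

    def depth(node: str) -> int:
        if node not in level:
            level[node] = 1 + max((depth(d) for d in deps[node]), default=0)
        return level[node]

    for node in deps:
        depth(node)
    return level


if __name__ == "__main__":
    print(func(3, 4))
    print(schedule_levels(DEPENDENCIES))
```

Four operations complete in three steps here because the two squarings share a level.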
16
Matrix Multiplication
(a) Problem setup for $Ax = b$; (b) equations for
computing the $b_i$.
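Each b_i is the dot product of row i of A with x, so the b_i can be computed independently of one another. This is a minimal pure-Python sketch of that computation; the function name `mat_vec` is illustrative.

```python
def mat_vec(A: list[list[float]], x: list[float]) -> list[float]:
    """Return b where b[i] = sum over j of A[i][j] * x[j]."""
    for row in A:
        if len(row) != len(x):
            raise ValueError("each row of A must have len(x) entries")
    # Every b[i] depends only on row i and x, so rows could be assigned
    # to separate processors without any communication between them.
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    x = [5, 6]
    print(mat_vec(A, x))
```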
17
Matrix Multiplication Dependency Graph
18
The PowerPC 601 Architecture
19
128-Bit IA-64 Instruction Word
Each 41-bit instruction consists of three
register addresses (each 7 bits, allowing 128 possible
registers), a predicate register (6 bits), and the
opcode and flags or general-purpose register (14
bits, varies by instruction).
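The field widths can be made concrete by unpacking a bundle with shifts and masks. A 128-bit IA-64 bundle holds three 41-bit instruction slots plus a 5-bit template (3 × 41 + 5 = 128). The exact bit positions below are an assumption for illustration (template in the low 5 bits, predicate in the low 6 bits of each slot, then three 7-bit register fields, then the 14 remaining bits); real encodings vary by instruction type. All names are hypothetical.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5


def split_bundle(bundle: int) -> tuple[int, list[int]]:
    """Split a 128-bit bundle into its template and three 41-bit slots.

    Assumed layout: template in bits 0-4, slots in ascending bit order.
    """
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slot_mask = (1 << SLOT_BITS) - 1
    slots = [(bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & slot_mask for i in range(3)]
    return template, slots


def split_instruction(instr: int) -> dict[str, int]:
    """Split a 41-bit slot into the fields named on the slide (assumed positions)."""
    return {
        "predicate": instr & 0x3F,                 # 6 bits
        "r1": (instr >> 6) & 0x7F,                 # 7 bits, 128 registers
        "r2": (instr >> 13) & 0x7F,                # 7 bits
        "r3": (instr >> 20) & 0x7F,                # 7 bits
        "opcode_and_flags": (instr >> 27) & 0x3FFF,  # remaining 14 bits
    }


if __name__ == "__main__":
    instr = (0x1234 << 27) | (100 << 20) | (50 << 13) | (7 << 6) | 5
    bundle = (instr << (TEMPLATE_BITS + 2 * SLOT_BITS)) | (instr << TEMPLATE_BITS) | 0x10
    template, slots = split_bundle(bundle)
    print(template, [split_instruction(s) for s in slots])
```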
20
Itanium Instruction Types
21
Allowable Combinations of IA-64 Instruction Types
Assigned to Instruction Slots
22
IA-64 Instruction Issues
Maximum number of IA-64 instructions that can
be executed for each pairing of bundles.
23
Intel MMX (MultiMedia eXtensions)
Vector addition of eight bytes by the Intel
PADDB mm0, mm1 instruction.
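The effect of PADDB can be emulated in Python as a rough model. PADDB adds the eight byte lanes independently with wraparound (no carry passes between lanes, and no saturation). Representing a 64-bit register as a list of eight integers is purely for illustration, and `paddb` is a hypothetical helper name.

```python
def paddb(mm0: list[int], mm1: list[int]) -> list[int]:
    """Emulate PADDB mm0, mm1: lane-wise byte addition with wraparound.

    Each argument models a 64-bit MMX register as eight values in 0..255.
    The returned list is the new value of mm0.
    """
    if len(mm0) != 8 or len(mm1) != 8:
        raise ValueError("an MMX register holds exactly eight bytes")
    # Masking with 0xFF discards the carry, so lanes never affect each other.
    return [(a + b) & 0xFF for a, b in zip(mm0, mm1)]


if __name__ == "__main__":
    mm0 = [250, 1, 2, 3, 4, 5, 6, 255]
    mm1 = [10, 1, 1, 1, 1, 1, 1, 1]
    print(paddb(mm0, mm1))
```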
24
Intel and Motorola Vector Registers
Intel aliases the floating point registers as
MMX registers. This means that the Pentium's eight
64-bit floating-point registers do double duty as
MMX registers. Motorola implements 32 128-bit
vector registers as a new set, separate and
distinct from the floating-point registers.
25
MMX and AltiVec Arithmetic Instructions
26
Comparing Two MMX Byte Vectors for Equality
27
Conditional Assignment of an MMX Byte Vector
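The figures for these two slides are not in the transcript, so the following Python sketch shows the commonly used idiom rather than the slides' exact example. A byte-wise equality compare (as PCMPEQB does) yields 0xFF in each equal lane and 0x00 elsewhere. That mask then selects between two vectors with AND, AND-NOT, and OR, with no branch. Registers are modeled as lists of eight bytes, and the helper names are hypothetical.

```python
def pcmpeqb(a: list[int], b: list[int]) -> list[int]:
    """Lane-wise equality: 0xFF where the bytes match, 0x00 where they differ."""
    return [0xFF if x == y else 0x00 for x, y in zip(a, b)]


def select_bytes(mask: list[int], if_set: list[int], if_clear: list[int]) -> list[int]:
    """Branch-free conditional assignment: (mask AND if_set) OR (NOT mask AND if_clear)."""
    return [(m & t) | (~m & 0xFF & f) for m, t, f in zip(mask, if_set, if_clear)]


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8]
    b = [1, 0, 3, 0, 5, 0, 7, 0]
    mask = pcmpeqb(a, b)
    print(mask)
    # Keep a's byte where the lanes matched, otherwise substitute 0x99.
    print(select_bytes(mask, a, [0x99] * 8))
```

Because every lane is handled by the same bitwise operations, all eight conditional assignments complete without a data-dependent branch.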
28
A PAL Device
PLAs and PALs are similar except that the OR
gates in a PAL have a fixed number of inputs and
the inputs are not programmable. PALs are more
prevalent than PLAs because they are easier to
manufacture and are less complex.
29
Complex Programmable Logic Device
CPLDs are PAL-like or PLA-like blocks that can be
combined with programmable interconnections.
Commercial CPLDs may contain as many as 200,000
equivalent gates and have over 3,000 macrocells.
30
Field Programmable Gate Array
Unlike CPLDs, which employ large logic blocks and
fewer interconnection options, FPGAs employ small
logic blocks that can be programmably
interconnected.
31
Quantum Computing
Single-particle interference experiment.
32
Multi-Valued Logic
Truth tables for binary and ternary comparison
functions
33
Neural Networks
Model of a living neuron, and model of an
artificial neuron (below).
34
Artificial Neural Network Example
Two simple, feed-forward neural networks with
inputs, weights, and thresholds as shown.
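The weights and thresholds in the figure are not in the transcript, so the values below are hypothetical. This sketch assumes a simple threshold neuron that outputs 1 when the weighted sum of its inputs reaches the threshold. It combines three such neurons into a small two-layer feed-forward network computing XOR.

```python
def neuron(inputs: list[float], weights: list[float], threshold: float) -> int:
    """Threshold neuron: fire (1) if the weighted input sum is >= threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def and_net(a: int, b: int) -> int:
    """Single neuron acting as AND (illustrative weights 1, 1 and threshold 2)."""
    return neuron([a, b], [1, 1], 2)


def or_net(a: int, b: int) -> int:
    """Single neuron acting as OR (illustrative weights 1, 1 and threshold 1)."""
    return neuron([a, b], [1, 1], 1)


def xor_net(a: int, b: int) -> int:
    """Two-layer feed-forward network: XOR = OR and not AND."""
    hidden = [or_net(a, b), and_net(a, b)]
    return neuron(hidden, [1, -1], 1)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and_net(a, b), or_net(a, b), xor_net(a, b))
```

A single threshold neuron cannot compute XOR, which is why the last function needs a hidden layer.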