
Provided by: wayne74


Title: Embedded System HW

1
Embedded System HW
2
Why use microprocessors?

Alternatives: field-programmable gate arrays
(FPGAs), custom logic, etc.
Microprocessors are often very efficient: can use
same logic to perform many different functions.
Microprocessors simplify the design of families
of products.

3
The performance paradox

Microprocessors use much more logic to implement
a function than does custom logic.
But microprocessors are often at least as fast:
heavily pipelined
large design teams
aggressive VLSI technology.

4
Power

Custom logic is a clear winner for low power
devices.
Modern microprocessors offer features to help
control power consumption.
Software design techniques can help reduce power
consumption.

5
Microprocessor varieties

Microcontroller: includes I/O devices, on-board
memory.
Digital signal processor (DSP): microprocessor
optimized for digital signal processing.
Typical embedded word sizes: 8-bit, 16-bit,
32-bit.

6
Many Types of Programmable Processors

Past
Microprocessor
Microcontroller
DSP
Graphics Processor

Now / Future
Network Processor
Sensor Processor
Cryptoprocessor
Game Processor
Wearable Processor
Mobile Processor

7
Application-Specific Instruction Processors
(ASIPs)

Processors with instruction-sets tailored to
specific applications or application domains
instruction-set generation as part of synthesis
Pluses:
customization yields lower area, power etc.
Minuses:
higher h/w & s/w development overhead
(design, compilers, debuggers)
higher time to market

8
Reconfigurable SoC
Other examples: Atmel's FPSLIC (AVR +
FPGA), Altera's Nios (configurable RISC on a PLD)

Triscend's A7 CSoC

9
Instruction Sets
10
von Neumann architecture

Memory holds data, instructions.
Central processing unit (CPU) fetches
instructions from memory.
Separate CPU and memory distinguishes
programmable computer.
CPU registers help out: program counter (PC),
instruction register (IR), general-purpose
registers, etc.

11
CPU + memory
[Figure: the CPU (PC = 200, IR = ADD r5,r1,r3) connected to memory by address and data lines; memory location 200 holds ADD r5,r1,r3.]
12
Harvard architecture
[Figure: the CPU (with its PC) connected to a data memory and to a separate program memory, each through its own address and data lines.]
13
von Neumann vs. Harvard

Harvard can't use self-modifying code.
Harvard allows two simultaneous memory fetches.
Most DSPs use Harvard architecture for streaming
data:
greater memory bandwidth
more predictable bandwidth.

14
RISC vs. CISC

Complex instruction set computer (CISC):
many addressing modes
many operations.
Reduced instruction set computer (RISC):
load/store
pipelinable instructions.

15
Instruction set characteristics

Fixed vs. variable length.
Addressing modes.
Number of operands.
Types of operands.

16
Programming model

Programming model: registers visible to the
programmer.
Some registers are not visible (IR).

17
Multiple implementations

Successful architectures have several
implementations:
varying clock speeds
different bus widths
different cache sizes
etc.

18
ARM Architecture

Advanced RISC Machines (1990)
(Acorn and Apple Computer)

19
ARM Architecture

ARM versions.
ARM assembly language.
ARM programming model.

20
ARM versions

ARM architecture has been extended over several
versions.
We will concentrate on ARMv5

21
Evolution of the ARM architecture versions
22
ARMv6 Improvement

Memory management
Multiprocessing
Multimedia support: SIMD capability

23
Evolution of the ARM architecture
ARM11
24
Introduction

To allow very small, yet high-performance
implementations:
RISC
Large uniform register file
Load/store architecture
Simple addressing modes
Uniform and fixed-length instruction fields
Auto-increment and auto-decrement addressing modes
Conditional execution of almost all instructions

25
ARM assembly language

Fairly standard assembly language
LDR r0,[r8] ; a comment
label ADD r4,r0,r1

26
Programming Model
27
ARM data types

Byte: 8 bits
Halfword: 16 bits
Must be aligned to two-byte boundaries
Word: 32 bits
Must be aligned to four-byte boundaries
ARM addresses can be 32 bits long.
Address refers to byte.
Address 4 starts at byte 4.
Can be configured at power-up as either little-
or big-endian mode.

28
Processor modes

User (usr): normal program execution mode
FIQ (fiq): supports a high-speed data transfer or channel process
IRQ (irq): used for general-purpose interrupt handling
Supervisor (svc): a protected mode for the OS
Abort (abt): implements VM and/or memory protection
Undefined (und): supports software emulation of HW coprocessors
System (sys): runs privileged OS tasks
fiq, irq, svc, abt, und: exception modes

29
Registers
[Figure: the ARM register set: r0 to r15 (r14 is the link register, r15 is the PC) plus the 32-bit CPSR (bits 31 to 0); registers are marked as unbanked or banked.]
30
(No Transcript)
31
Endianness

Relationship between bit and byte/word ordering
defines endianness

[Figure: a 32-bit word in both orderings. Little-endian: bit 31 down to bit 0, holding byte 3, byte 2, byte 1, byte 0. Big-endian: bit 0 up to bit 31, holding byte 0, byte 1, byte 2, byte 3.]
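
The two byte orders can be tried out with a minimal Python sketch. It uses only the built-in `int.to_bytes` and `int.from_bytes`; the value 0x11223344 is an arbitrary illustration and does not come from the slides.

```python
# Store the same 32-bit word in both byte orders (the value is illustrative).
word = 0x11223344

# Little-endian: byte 0 (lowest address) holds the least significant 8 bits.
little = word.to_bytes(4, byteorder="little")
# Big-endian: byte 0 (lowest address) holds the most significant 8 bits.
big = word.to_bytes(4, byteorder="big")

print(little.hex())  # 44332211
print(big.hex())     # 11223344

# Reading the bytes back with the matching order recovers the word.
assert int.from_bytes(little, "little") == word
assert int.from_bytes(big, "big") == word
```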
32
ARM status bits

Every arithmetic, logical, or shifting operation
may set CPSR (current program status register)
bits:
N (negative), Z (zero), C (carry), V (overflow).
Examples:
-1 + 1 = 0: NZCV = 0110.
2^31 - 1 + 1 = -2^31: NZCV = 1001.
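
The flag settings for a 32-bit addition can be checked with a small model. This is a sketch, not ARM's definition: the helper name `add_flags` is hypothetical, and it covers addition only.

```python
MASK32 = 0xFFFFFFFF

def add_flags(a, b):
    """Add two 32-bit values; return (result, 'NZCV') as a flag-setting add would."""
    a &= MASK32
    b &= MASK32
    full = a + b                      # unbounded Python sum
    result = full & MASK32            # what fits in the 32-bit register
    n = result >> 31                  # N: sign bit of the result
    z = int(result == 0)              # Z: result is zero
    c = int(full > MASK32)            # C: carry out of bit 31
    # V: both operands have the same sign and the result's sign differs.
    v = ((a ^ result) & (b ^ result)) >> 31
    return result, f"{n}{z}{c}{v}"

print(add_flags(-1, 1))          # (0, '0110')
print(add_flags(2**31 - 1, 1))   # (2147483648, '1001')
```

The second case sets N and V together: the bit pattern 0x80000000 reads as negative, and the signed sum has overflowed.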

33
ARM data processing operand addressing

Instruction syntax
<opcode>{<cond>}{S} <Rd>, <Rn>, <shifter-operand>
<shifter-operand> has 11 options

34
Condition field

Almost all ARM instructions can be conditionally executed

35
ARM data processing operand addressing
Data processing immediate shift
Data processing register shift
Data processing 32-bit immediate
36
Shifter operand

Immediate
8-bit constant and a 4-bit rotate (0, 2, 4, ..., 30)
mov r0, #0
add r9, r9, #1
Register operand
mov r2, r0
Shifted register operand
ASR, LSL, LSR, ROR, RRX (by one bit)
mov r2, r0, LSL #2 ; shift r0 left by 2, write
to r2 (r2 = r0 x 4)
sub r10, r9, r8, LSR #4 ; r10 = r9 - r8/16
sub r10, r9, r8, ROR r3 ; r10 = r9 - (r8 rotated by
value of r3)
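
Whether a constant fits the immediate form (an 8-bit value rotated right by an even amount) can be checked with a short search. This is an illustrative sketch; `encode_immediate` is a hypothetical helper, not an assembler API.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF

def encode_immediate(value):
    """Return (imm8, rot4) with ror32(imm8, 2 * rot4) == value, or None."""
    value &= 0xFFFFFFFF
    for rot4 in range(16):
        # Rotating left by 2*rot4 undoes the rotate right applied at decode time.
        imm8 = ror32(value, (32 - 2 * rot4) % 32)
        if imm8 <= 0xFF:
            return imm8, rot4
    return None

print(encode_immediate(0xFF000000))  # (255, 4): 0xFF rotated right by 8
print(encode_immediate(0x101))       # None: needs a 9-bit window
```

Constants that return None cannot be written as a single immediate operand and have to be built some other way, for example loaded from memory.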

37
ARM data-processing

AND
EOR
SUB: Rd = Rn - shifter operand
RSB: Rd = shifter operand - Rn
ADD
ADC (add with carry)
SBC (subtract with carry)
RSC (reverse SBC)

TST: update flags after Rn AND shifter operand
TEQ
CMP
CMN (compare negated)
ORR (logical OR)
MOV
BIC
MVN (move NOT)

38
ARM data-processing

Shift, rotate: done through the shifter operand
LSL, LSR: logical shift left/right
ASR: arithmetic shift right
ROR: rotate right
RRX: rotate right extended with C

39
Data operation varieties

Logical shift
fills with zeroes.
Arithmetic shift
fills with sign extension
RRX performs 33-bit rotate, including C bit from
CPSR above sign bit.
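
The difference between these shift varieties shows up on a value with the sign bit set. This is a minimal sketch for shift amounts 1 to 31; the function names mirror the ARM mnemonics but are otherwise hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeroes enter from the right."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeroes enter from the left."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32              # reinterpret the pattern as a negative number
    return (x >> n) & MASK32

def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry):
    """33-bit rotate right by one: C enters bit 31, old bit 0 becomes the new C."""
    x &= MASK32
    return ((carry & 1) << 31) | (x >> 1), x & 1

print(hex(lsr(0x80000000, 4)))  # 0x8000000
print(hex(asr(0x80000000, 4)))  # 0xf8000000
print(rrx(0x00000001, 1))       # (2147483648, 1)
```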

40
Load and Store instructions

Two types
Load and store a 32-bit word or an 8-bit unsigned byte
Load and store halfword, and load signed byte
Addressing modes
Base register
Any one of the GPRs (including the PC)
Offset
Three formats

41
Addressing modes

Offset
Immediate: unsigned number (12 bits or 8 bits)
Register: GPR (not the PC)
Scaled register: shifted by an immediate value
LSL, LSR, ASR, ROR, RRX
Three ways to form the memory address
EA = base register + or - offset
Offset
Pre-indexed
Post-indexed

42
Addressing modes

Base-plus-offset addressing:
LDR r0,[r1,#16]
Loads from location r1+16
Pre-indexing increments base register:
LDR r0,[r1,#16]!
Post-indexing fetches, then does offset:
LDR r0,[r1],#16
Loads r0 from r1, then adds 16 to r1.
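
The three ways of forming the address differ only in when the base register is updated. The sketch below models registers and memory as plain dictionaries; the helper names and memory contents are hypothetical.

```python
def ldr_offset(regs, mem, rd, rn, offset):
    """LDR rd,[rn,#offset]: load from rn + offset, base register unchanged."""
    regs[rd] = mem[regs[rn] + offset]

def ldr_pre(regs, mem, rd, rn, offset):
    """LDR rd,[rn,#offset]!: update the base first, then load from it."""
    regs[rn] += offset
    regs[rd] = mem[regs[rn]]

def ldr_post(regs, mem, rd, rn, offset):
    """LDR rd,[rn],#offset: load from the old base, then update the base."""
    address = regs[rn]
    regs[rn] = address + offset
    regs[rd] = mem[address]

mem = {0x100: 111, 0x110: 222}    # illustrative memory contents
for op in (ldr_offset, ldr_pre, ldr_post):
    regs = {"r0": 0, "r1": 0x100}
    op(regs, mem, "r0", "r1", 16)
    print(op.__name__, regs["r0"], hex(regs["r1"]))
# ldr_offset 222 0x100
# ldr_pre 222 0x110
# ldr_post 111 0x110
```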

43
Load and store

LDR
LDRB
LDRH
LDRSB (signed byte)
LDRSH (signed halfw)

STR
STRB
STRH

44
Examples

LDR R1, [R0] ; load R1 from the address in R0
LDR R8, [R3, #4] ; EA = R3 + 4
LDR R8, [R3, #-4] ; EA = R3 - 4
STRB R10, [R7, -R4] ; EA = R7 - R4
LDR R11, [R3, R5, LSL #2] ; EA = R3 + (R5 x 4)
LDR R3, [R9], #4 ; EA = R9, R9 = R9 + 4
(post-indexed)
LDR R1, [R0, #2]! ; EA = R0 + 2, R0 = R0 + 2
(pre-indexed)
LDR R0, [PC, #0x40] ; load R0 from PC + 0x40 (=
address of the instruction + 8 + 0x40)

45
Load and store multiple

Addressing modes
IA: increment after
IB: increment before
DA: decrement after
DB: decrement before

46
Load and store multiple

LDM
STM
Examples
LDMIA r0, {r5-r8} ; load multiple r5-r8 from the
address in r0
STMDA r1!, {r2, r5, r7-r9, r11} ; update r1
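
The four block-transfer modes differ in where the block of words sits relative to the base register. The following is a rough model under stated assumptions: registers are keyed by number, memory is a dictionary of word addresses, and the helper names are hypothetical. In every mode the lowest-numbered register uses the lowest address.

```python
def block_addresses(base, count, mode):
    """Return (lowest address used, written-back base) for IA, IB, DA or DB."""
    size = 4 * count
    if mode == "IA":
        return base, base + size
    if mode == "IB":
        return base + 4, base + size
    if mode == "DA":
        return base - size + 4, base - size
    if mode == "DB":
        return base - size, base - size
    raise ValueError(mode)

def ldm(regs, mem, mode, rn, reglist, writeback=False):
    """Load each register in reglist from consecutive words."""
    start, new_base = block_addresses(regs[rn], len(reglist), mode)
    for i, r in enumerate(sorted(reglist)):
        regs[r] = mem[start + 4 * i]
    if writeback:
        regs[rn] = new_base

def stm(regs, mem, mode, rn, reglist, writeback=False):
    """Store each register in reglist to consecutive words."""
    start, new_base = block_addresses(regs[rn], len(reglist), mode)
    for i, r in enumerate(sorted(reglist)):
        mem[start + 4 * i] = regs[r]
    if writeback:
        regs[rn] = new_base

# STMDA r1!, {r2, r5, r7-r9, r11} with r1 = 0x2000 (illustrative values)
regs = {n: 100 + n for n in range(16)}
regs[1] = 0x2000
mem = {}
stm(regs, mem, "DA", 1, [2, 5, 7, 8, 9, 11], writeback=True)
print(hex(min(mem)), hex(max(mem)), hex(regs[1]))  # 0x1fec 0x2000 0x1fe8
```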

47
Branch instructions

Conditional branch forwards or backwards up to 32
MB:
Sign-extending the 24-bit imm_data to 32 bits
Shifting the result left two bits
Adding this to the PC (the address of the branch + 8)
Approximately +/-32 MB
B, BL
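
The three steps above can be written out as a small calculation. This is a sketch with a hypothetical function name; the addresses are illustrative.

```python
def branch_target(instr_address, imm24):
    """Target of a B/BL located at instr_address with 24-bit field imm24."""
    imm24 &= 0xFFFFFF
    if imm24 & 0x800000:            # sign-extend the 24-bit field
        imm24 -= 1 << 24
    offset = imm24 * 4              # shift left two bits
    # The PC reads as the address of the branch + 8.
    return (instr_address + 8 + offset) & 0xFFFFFFFF

print(hex(branch_target(0x8000, 0x000000)))  # 0x8008
print(hex(branch_target(0x8000, 0xFFFFFE)))  # 0x8000 (a branch to itself)

# Largest forward offset: (2**23 - 1) * 4 bytes, just under 32 MB.
print((2**23 - 1) * 4)  # 33554428
```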

48
Examples

B label
BCC label ; branch if carry flag is clear
BEQ label ; branch if zero flag is set
MOV PC,#0 ; branch to location zero
BL func ; subroutine call
MOV PC,LR ; return
MOV LR,PC
LDR PC,=func

49
ARM ADR pseudo-op

Cannot refer to an address directly in an
instruction.
Generate value by performing arithmetic on PC.
ADR pseudo-op generates instruction required to
calculate address:
ADR r1,FOO

50
Examples

start MOV r0,#10
ADR r4,start ; => SUB r4,pc,#0xc
start = pc - 4 - 8 = pc - 12 = pc - 0xc
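
The offset in this example can be recomputed for any label. A minimal sketch, assuming the PC reads as the address of the current instruction + 8; the function name and the addresses are hypothetical.

```python
def adr_offset(instr_address, label_address):
    """Value to add to pc so that an ADR at instr_address yields label_address."""
    pc_value = instr_address + 8      # what the instruction sees when it reads pc
    return label_address - pc_value

# 'start' at 0x8000, the ADR one instruction later at 0x8004
print(adr_offset(0x8004, 0x8000))  # -12, i.e. SUB r4,pc,#0xc
```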

51
Example C assignments

C:
x = (a + b) - c;
Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b, reusing r4
LDR r1,[r4] ; get value of b
ADD r3,r0,r1 ; compute a+b
ADR r4,c ; get address for c
LDR r2,[r4] ; get value of c

52
C assignment, contd.

SUB r3,r3,r2 ; complete computation of x
ADR r4,x ; get address for x
STR r3,[r4] ; store value of x

53
Example C assignment

C:
y = a*(b+c);
Assembler:
ADR r4,b ; get address for b
LDR r0,[r4] ; get value of b
ADR r4,c ; get address for c
LDR r1,[r4] ; get value of c
ADD r2,r0,r1 ; compute partial result
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a

54
C assignment, contd.

MUL r2,r2,r0 ; compute final value for y
ADR r4,y ; get address for y
STR r2,[r4] ; store y

55
Example C assignment

C:
z = (a << 2) | (b & 15);
Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
MOV r0,r0,LSL #2 ; perform shift
ADR r4,b ; get address for b
LDR r1,[r4] ; get value of b
AND r1,r1,#15 ; perform AND
ORR r1,r0,r1 ; perform OR

56
C assignment, contd.

ADR r4,z ; get address for z
STR r1,[r4] ; store value for z

57
Example if statement

C:
if (a < b) { x = 5; y = c + d; } else x = c - d;
Assembler:
; compute and test condition
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b
LDR r1,[r4] ; get value for b
CMP r0,r1 ; compare a < b
BGE fblock ; if a >= b, branch to false block

58
If statement, contd.

; true block
MOV r0,#5 ; generate value for x
ADR r4,x ; get address for x
STR r0,[r4] ; store x
ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value of d
ADD r0,r0,r1 ; compute y
ADR r4,y ; get address for y
STR r0,[r4] ; store y
B after ; branch around false block

59
If statement, contd.

; false block
fblock ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value for d
SUB r0,r0,r1 ; compute c-d
ADR r4,x ; get address for x
STR r0,[r4] ; store value of x
after ...

60
Example Conditional instruction implementation

; true block
MOVLT r0,#5 ; generate value for x
ADRLT r4,x ; get address for x
STRLT r0,[r4] ; store x
ADRLT r4,c ; get address for c
LDRLT r0,[r4] ; get value of c
ADRLT r4,d ; get address for d
LDRLT r1,[r4] ; get value of d
ADDLT r0,r0,r1 ; compute y
ADRLT r4,y ; get address for y
STRLT r0,[r4] ; store y

61
Conditional instruction implementation, contd.

; false block
ADRGE r4,c ; get address for c
LDRGE r0,[r4] ; get value of c
ADRGE r4,d ; get address for d
LDRGE r1,[r4] ; get value for d
SUBGE r0,r0,r1 ; compute c-d
ADRGE r4,x ; get address for x
STRGE r0,[r4] ; store value of x

62
Example FIR filter

C:
for (i=0, f=0; i<N; i++)
f = f + c[i]*x[i];
Assembler:
; loop initiation code
MOV r0,#0 ; use r0 for i
MOV r8,#0 ; use separate index for arrays
ADR r2,N ; get address for N
LDR r1,[r2] ; get value of N
MOV r2,#0 ; use r2 for f

63
FIR filter, contd.

ADR r3,c ; load r3 with base of c
ADR r5,x ; load r5 with base of x
; loop body
loop LDR r4,[r3,r8] ; get c[i]
LDR r6,[r5,r8] ; get x[i]
MUL r4,r4,r6 ; compute c[i]*x[i]
ADD r2,r2,r4 ; add into running sum
ADD r8,r8,#4 ; add one word offset to array
index
ADD r0,r0,#1 ; add 1 to i
CMP r0,r1 ; exit?
BLT loop ; if i < N, continue
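
The loop can be traced in a higher-level model that keeps one variable per register. This is a sketch only: the function name is hypothetical, Python integers do not wrap at 32 bits, and, like the assembly, the loop tests at the bottom and so assumes N >= 1.

```python
def fir(c, x):
    """Sum of c[i] * x[i], mirroring the register use of the assembly loop."""
    n = len(c)       # r1: value of N
    i = 0            # r0: loop counter i
    index = 0        # r8: byte offset into the arrays
    f = 0            # r2: running sum f
    while True:
        ci = c[index // 4]    # LDR r4,[r3,r8]: words are 4 bytes apart
        xi = x[index // 4]    # LDR r6,[r5,r8]
        f += ci * xi          # MUL then ADD into the running sum
        index += 4            # ADD r8,r8,#4
        i += 1                # ADD r0,r0,#1
        if not i < n:         # CMP r0,r1 / BLT loop
            break
    return f

print(fir([1, 2, 3], [4, 5, 6]))  # 32
```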

64
Nested subroutine calls

Nesting/recursion requires coding convention:
f1 LDR r0,[r13] ; load arg into r0 from stack
; call f2()
STR r14,[r13]! ; store f1's return adrs
STR r0,[r13]! ; store arg to f2 on stack
BL f2 ; branch and link to f2
; return from f1()
SUB r13,#4 ; pop f2's arg off stack
LDR r15,[r13]! ; restore register and return

65
Summary

Load/store architecture
Most instructions are RISCy, operate in single
cycle.
Some multi-register operations take longer.
Almost all instructions can be executed conditionally.

66
MPC850

Integrated Communication Microprocessor

67
Reference Manuals

MPC850 Family User Manual
PowerPC Programming Environment Manual
Course Home Page:
http://calab.kaist.ac.kr/maeng/cs310/micro02.htm
Motorola Home Page:
http://e-www.motorola.com

68
Overview

Versatile, one-chip, integrated communication
processor
Embedded PowerPC core
Versatile memory controller
Communication processor module (CPM)
Serial communication controllers (SCCs)
One USB
Etc.

69
(No Transcript)
70
Embedded PowerPC core

Single issue, 32-bit version
Branch folding and prediction
2-Kbyte I-cache, 1-Kbyte D-cache
2-way set-associative
Physically addressed
MMUs with 8-entry TLBs
4K, 16K, 256K, 512K, and 8MB page sizes

71
Other Features

Dynamic data bus sizing: 8-, 16-, 32-bit
CPU clock: 0-80 MHz
System Integration Unit (SIU)
Memory Controller
General Purpose timer
CPM, SCCs, SMCs, etc.

72
PowerPC Architecture
73
PowerPC instruction set

Overview
Operand Conventions
PowerPC Registers and programming model
Addressing Modes
Instruction Set
Cache model
Exception Model
Memory management model

74
PowerPC Architecture

Motorola, IBM, Apple Computer
POWER architecture: RS/6000 family
64-bit architecture with a 32-bit subset
Three levels of the architecture
Flexibility: degrees of SW compatibility
UISA (User instruction set architecture)
VEA (Virtual environment architecture)
OEA (Operating environment architecture)

75
Features not defined by the PowerPC Architecture

For flexibility
System bus interface signals
Cache design
The number and the nature of execution units
Other internal micro-architecture issues

76
Endianness

Relationship between bit and byte/word ordering
defines endianness

[Figure: a 32-bit word in both orderings. Little-endian (ARM, Intel): bit 31 down to bit 0, holding byte 3, byte 2, byte 1, byte 0. Big-endian (PowerPC, IBM, Motorola): bit 0 up to bit 31, holding byte 0, byte 1, byte 2, byte 3.]
77
Programming Model Registers
78
(No Transcript)
79
PowerPC programming model - Register Set

User Model UISA (32-bit architecture)

General-purpose registers: GPR0 to GPR31 (32 bits each)
Floating-point registers: FGPR0 to FGPR31 (64 bits each)
Condition register: CR (32)
FP status and control register: FPSCR (32)
XER register: XER (32)
Link register: LR (64/32)
Count register: CTR (64/32)
80
Condition Registers (CR)

For testing and branching

[Figure: the 32-bit condition register, bits 0 to 31, divided into eight 4-bit fields CR0 to CR7; one field is marked FP.]
CRn field: set by compare instructions
CR0: set for all integer instrs.
Bit 0: Negative (LT)
Bit 1: Positive (GT)
Bit 2: Zero (EQ)
Bit 3: Summary overflow (SO)
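
How a 32-bit integer result maps onto the four CR0 bits can be sketched as follows. The function name is hypothetical, and the `so` argument stands in for the summary-overflow bit, which the PowerPC architecture copies from the XER. Bit 0 is the leftmost bit of the field.

```python
def cr0_field(result, so=0):
    """4-bit CR0 value (LT, GT, EQ, SO) for a 32-bit integer result."""
    result &= 0xFFFFFFFF
    # Interpret the 32-bit pattern as a signed number.
    signed = result - (1 << 32) if result & 0x80000000 else result
    lt = int(signed < 0)     # bit 0: negative
    gt = int(signed > 0)     # bit 1: positive
    eq = int(signed == 0)    # bit 2: zero
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)

print(format(cr0_field(0xFFFFFFFF), "04b"))  # 1000
print(format(cr0_field(5), "04b"))           # 0100
print(format(cr0_field(0), "04b"))           # 0010
```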
81
XER Register (XER)
82
XER Register (XER), contd
83
Link Register (LR), Count Register (CTR)
bclrx (bc to link register) Branch with link
update
84
Counter Register

Loop count

85
VEA Register Set Time Base
86
OEA Register Set
87
Machine State Register (MSR)
88
(No Transcript)
89
(No Transcript)
90
Addressing Modes

Effective Address Calculation
Register indirect with immediate index mode
Register indirect with index mode
Register indirect mode
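
The three effective-address calculations can be sketched as below. The function names are hypothetical. The sketch relies on the PowerPC convention, not spelled out on this slide, that a base field of 0 means the value zero rather than the contents of GPR0.

```python
MASK32 = 0xFFFFFFFF

def base_value(gpr, ra):
    """(rA|0): register number 0 in the base field stands for the value 0."""
    return 0 if ra == 0 else gpr[ra]

def ea_immediate_index(gpr, ra, d):
    """Register indirect with immediate index: base + sign-extended 16-bit d."""
    d &= 0xFFFF
    if d & 0x8000:
        d -= 0x10000
    return (base_value(gpr, ra) + d) & MASK32

def ea_index(gpr, ra, rb):
    """Register indirect with index: base + contents of rB."""
    return (base_value(gpr, ra) + gpr[rb]) & MASK32

def ea_indirect(gpr, ra):
    """Register indirect: the base alone."""
    return base_value(gpr, ra) & MASK32

gpr = [0] * 32
gpr[1], gpr[2] = 0x1000, 0x20            # illustrative register contents
print(hex(ea_immediate_index(gpr, 1, 0xFFFC)))  # 0xffc
print(hex(ea_index(gpr, 1, 2)))                 # 0x1020
print(hex(ea_indirect(gpr, 1)))                 # 0x1000
```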

91
Register Indirect with Immediate Index Addressing
92
Register Indirect with Index
93
Register Indirect
94
Instruction Formats

4 bytes long and word-aligned
Bits 0-5 always specify the primary opcode
Extended opcode
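
Because PowerPC numbers bits from the most significant end, bits 0-5 are the top six bits of the 32-bit word. The sketch below pulls fields out of an instruction word; the helper names are hypothetical and the sample word 0x38600001 (addi r3,0,1) is an illustration not taken from the slides.

```python
def field(instruction, first_bit, last_bit):
    """Extract bits first_bit..last_bit, where bit 0 is the most significant."""
    width = last_bit - first_bit + 1
    return (instruction >> (31 - last_bit)) & ((1 << width) - 1)

def primary_opcode(instruction):
    """Bits 0-5 of the instruction word."""
    return field(instruction, 0, 5)

word = 0x38600001                  # addi r3,0,1
print(primary_opcode(word))        # 14
print(field(word, 6, 10))          # 3  (destination register)
print(field(word, 16, 31))         # 1  (immediate)
```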

95
Instruction set

Integer
Floating-point
Load and store
Flow control
Processor control
Memory synchronization
Memory control
External control

96
Integer Instructions

Arithmetic, compare, logical, rotate and shift
Integer arithmetic, shift, rotate, and string
move
May update or read values from the XER
The CR may be updated if the Rc bit is set.
Example: addic vs. addic. (the trailing dot marks the form that records into the CR)

97
(No Transcript)
98
(No Transcript)
99
(No Transcript)
100
Integer Compare

Compare algebraically (signed) or logically (unsigned)
crfD can be omitted if the result is to be placed
in CR0
crfD field: the target CR field
The L bit has no effect on 32-bit operations

101
Integer compare, contd
102
Integer Logical
103
Integer Logical, contd
104
Rotate and Shift Instructions

SH: specifies the number of bits to rotate
MB: mask start
ME: mask stop
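
How SH, MB and ME combine can be seen in a model of a rotate-left-then-AND-with-mask instruction such as rlwinm. This is a sketch with hypothetical helper names; bit 0 is the most significant bit, and a mask with MB greater than ME wraps around.

```python
MASK32 = 0xFFFFFFFF

def mask(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB), wrapping if mb > me."""
    ones_from_mb = MASK32 >> mb
    ones_to_me = (MASK32 << (31 - me)) & MASK32
    if mb <= me:
        return ones_from_mb & ones_to_me
    return ones_from_mb | ones_to_me

def rotl32(x, sh):
    """Rotate a 32-bit value left by sh bits."""
    x &= MASK32
    sh %= 32
    return ((x << sh) | (x >> (32 - sh))) & MASK32

def rlwinm(rs, sh, mb, me):
    """Rotate rs left by sh, then keep only the bits selected by the mask."""
    return rotl32(rs, sh) & mask(mb, me)

print(hex(rlwinm(0x12345678, 8, 24, 31)))  # 0x12: extracts the top byte
print(hex(rlwinm(0x12345678, 4, 0, 27)))   # 0x23456780: shift left by 4
```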

105
Integer Rotate
106
Integer Shift
107
Load and Store

Integer load and store
Integer load and store with byte-reverse
Integer load and store multiple
FP load and store
Memory synchronization

108
(No Transcript)
109
(No Transcript)
110
(No Transcript)
111
(No Transcript)
112
Branch and Flow Control

EA calculation
Branch relative
Branch conditional to relative address
Branch to absolute address
Branch conditional to absolute address
Branch conditional to link register
Branch conditional to count register

113
Branch Relative
114
Branch conditional to relative
115
Branch to Absolute
116
Branch conditional to absolute
117
Branch conditional to LR
118
Branch conditional to count register
119
Conditional Branch control
120
Branch Instructions
121
CR logical Instructions
122
Trap, System Linkage
123
Processor Control
124
(No Transcript)
125
Memory Synchronization
126
Example