Title: COMP 3221 Microprocessors and Embedded Systems Lecture 4: Memory Access http://www.cse.unsw.edu.au/~cs3221
1. COMP 3221 Microprocessors and Embedded Systems
Lecture 4: Memory Access
http://www.cse.unsw.edu.au/~cs3221
- March, 2004
- Modified from Notes by Saeid Nooshabadi
2. Overview
- Memory Access in Assembly
- Data Structures in Assembly
3. Review: Instruction Set (ARM 7TDMI)
- Set of instructions that a processor can execute
- Instruction Categories
- Data Processing or Computational (Logical and Arithmetic)
- Load/Store (Memory Access, or transferring data between memory and registers)
- Control Flow (Jump and Branch)
- Floating Point
- Coprocessor
- Memory Management
- Special
Registers
4. Review: ARM Instructions So Far
- add, sub, mov
- and, bic, orr, eor
- Data Processing Instructions with shift and rotate
- lsl, lsr, asr, ror
- Multiplications
- mul, mla, umull, umlal, smull, smlal
5. Assembly Operands: Memory
- C variables map onto registers; what about large data structures like arrays?
- Memory, 1 of the 5 components of a computer, contains such data structures
- But ARM arithmetic instructions only operate on registers, never directly on memory.
- Data transfer instructions transfer data between registers and memory
- Memory to register
- Register to memory
6. Data Transfer: Memory to Reg (1/4)
- To transfer a word of data, we need to specify two things
- Register: specify this by number (r0 - r15)
- Memory address: more difficult
- Think of memory as a single one-dimensional array, so we can address it simply by supplying a pointer to a memory address.
- Other times, we want to be able to offset from this pointer.
7. Data Transfer: Memory to Reg (2/4)
- To specify a memory address to copy from, specify two things
- A register which contains a pointer to memory
- A numerical offset (in bytes), or a register which contains an offset
- The desired memory address is the sum of these two values.
- Example: [v1, #8]
- specifies the memory address pointed to by the value in v1, plus 8 bytes
- Example: [v1, v2]
- specifies the memory address pointed to by the value in v1, plus v2
8. Data Transfer: Memory to Reg (3/4)
- Load Instruction Syntax
- 1 2, [3, 4]
- where
- 1) operation name
- 2) register that will receive value
- 3) register containing pointer to memory
- 4) numerical offset in bytes, or another shifted index register
- Instruction Name
- ldr (meaning Load Register, so 32 bits or one word are loaded at a time from memory to register)
9. Data Transfer: Memory to Reg (4/4)
- Example: ldr a1, [v1, #8]
- This instruction will take the pointer in v1, add 8 bytes to it, and then load the value from the memory pointed to by this calculated sum into register a1
- Notes
- v1 is called the base register
- 8 is called the offset
- The offset is generally used in accessing elements of an array: the base register points to the beginning of the array
- Example: ldr a1, [v1, v2]
- This instruction will take the pointer in v1, add an index offset in register v2 to it, and then load the value from the memory pointed to by this calculated sum into register a1
- Notes
- v1 is called the base register
- v2 is called the index register
- The index is generally used in accessing elements of an array using a variable index: the base register points to the beginning of the array
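The base-plus-offset address calculation can be sketched in C. This is only an illustrative model, not ARM code: memory is a plain byte array, `load_word` is a hypothetical helper name, and a little-endian byte order is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "ldr rd, [base, offset]" on a flat byte array.
   The effective address is base + offset; 4 bytes are read from it,
   least significant byte first (little-endian assumption). */
uint32_t load_word(const uint8_t *memory, uint32_t base, uint32_t offset)
{
    uint32_t address = base + offset;      /* effective address */
    assert(address % 4 == 0);              /* word accesses should be aligned */
    return (uint32_t)memory[address]
         | (uint32_t)memory[address + 1] << 8
         | (uint32_t)memory[address + 2] << 16
         | (uint32_t)memory[address + 3] << 24;
}
```

Only the sum matters: a base of 0 with offset 8 and a base of 4 with offset 4 select the same word.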
10. Data Transfer: Other Mem to Reg Variants (1/2)
- Pre Indexed Load Example
- ldr a1, [v1, #12]!
- This instruction will take the pointer in v1, add 12 bytes to it, and then load the value from the memory pointed to by this calculated sum into register a1.
- Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Pre Indexed Load Example
- ldr a1, [v1, v2]!
- This instruction will take the pointer in v1, add an index offset in register v2 to it, and then load the value from the memory pointed to by this calculated sum into register a1.
- Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
11. Data Transfer: Other Mem to Reg Variants (2/2)
- Post Indexed Load Example
- ldr a1, [v1], #12
- This instruction will load the value from the memory pointed to by the value in register v1 into register a1.
- Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Post Indexed Load Example
- ldr a1, [v1], v2
- This instruction will load the value from the memory pointed to by the value in register v1 into register a1.
- Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
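The difference between the pre-indexed and post-indexed forms is only the order of the base update and the load. A minimal C sketch, assuming memory is modelled as an array of words and the base register as a byte address held in a variable (both function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Models "ldr rd, [base, #offset]!": update base first, then load. */
uint32_t load_pre_indexed(const uint32_t *memory, uint32_t *base, int32_t offset)
{
    *base += (uint32_t)offset;            /* base <- base + offset */
    assert(*base % 4 == 0);               /* model handles aligned words only */
    return memory[*base / 4];             /* load from the updated address */
}

/* Models "ldr rd, [base], #offset": load first, then update base. */
uint32_t load_post_indexed(const uint32_t *memory, uint32_t *base, int32_t offset)
{
    assert(*base % 4 == 0);               /* model handles aligned words only */
    uint32_t value = memory[*base / 4];   /* load from the original address */
    *base += (uint32_t)offset;            /* base <- base + offset */
    return value;
}
```

Starting from the same base, both calls leave the base register with the same final value; they differ in which word was loaded.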
12. Data Transfer: Reg to Memory (1/2)
- Also want to store value from a register into memory
- Store instruction syntax is identical to Load instruction syntax
- Instruction Name
- str (meaning Store from Register, so 32 bits or one word are stored from register to memory at a time)
13. Data Transfer: Reg to Memory (2/2)
- Example: str a1, [v1, #12]
- This instruction will take the pointer in v1, add 12 bytes to it, and then store the value from register a1 into the memory address pointed to by the calculated sum.
- Example: str a1, [v1, v2]
- This instruction will take the pointer in v1, add register v2 to it, and then store the value from register a1 into the memory address pointed to by the calculated sum.
14. Data Transfer: Other Reg to Mem Variants (1/2)
- Pre Indexed Store Example
- str a1, [v1, #12]!
- This instruction will take the pointer in v1, add 12 bytes to it, and then store the value from register a1 into the memory address pointed to by the calculated sum.
- Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Pre Indexed Store Example
- str a1, [v1, v2]!
- This instruction will take the pointer in v1, add register v2 to it, and then store the value from register a1 into the memory address pointed to by the calculated sum.
- Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
15. Data Transfer: Other Reg to Mem Variants (2/2)
- Post Indexed Store Example
- str a1, [v1], #12
- This instruction will store the value from register a1 into the memory address pointed to by register v1.
- Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Post Indexed Store Example
- str a1, [v1], v2
- This instruction will store the value from register a1 into the memory address pointed to by register v1.
- Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
16. Pointers v. Values
- Key Concept: A register can hold any 32-bit value. That value can be a (signed) int, an unsigned int, a pointer (memory address), etc.
- If you write add v3, v2, v1 then v1 and v2 had better contain values
- If you write ldr a1, [v1] then v1 had better contain a pointer
- Don't mix these up!
17. Addressing: Byte vs. halfword vs. word
- Every word in memory has an address, similar to an index in an array
- Early computers numbered words like C numbers elements of an array
- Memory[0], Memory[1], Memory[2], ...
- Computers needed to access 8-bit bytes and halfwords (2 bytes/halfword) as well as words (4 bytes/word)
- Today machines address memory as bytes, hence
- Halfword addresses differ by 2
- Memory[0], Memory[2], Memory[4], ...
- Word addresses differ by 4
- Memory[0], Memory[4], Memory[8], ...
- The number in brackets is called the address of a word
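The spacing of addresses follows directly from the element size. A small C sketch (the function name `byte_offset` is an assumption made for illustration) computes the byte offset of an element for byte, halfword and word arrays:

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of element 'index' in an array whose elements are
   'element_size' bytes each (1 = byte, 2 = halfword, 4 = word). */
size_t byte_offset(size_t index, size_t element_size)
{
    assert(element_size == 1 || element_size == 2 || element_size == 4);
    return index * element_size;
}
```

For element 8 this gives offsets of 8, 16 and 32 bytes for byte, halfword and word arrays respectively.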
18. Compilation with Memory
- What offset in ldr to select my_Array[8] in C?
- 4 x 8 = 32 to select my_Array[8]: byte v. word
- Compile by hand using registers: g = h + my_Array[8];
- g: v1, h: v2, v3: base address of my_Array
- 1st transfer from memory to register
- ldr v1, [v3, #32]    ; v1 gets my_Array[8]
- Add 32 to v3 to select my_Array[8], put into v1
- Next add it to h and place in g
- add v1, v2, v1    ; v1 = h + my_Array[8]
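19. Same thing in pictures
[Figure: memory drawn from address 0 to 0xFFFFFFFF; v3 points to my_Array[0] and my_Array[8] lies 32 bytes above it; registers v1 (g), v2 (h) and v3 are shown alongside]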
20. Compile with variable index
- What if array index not a constant? g = h + my_Array[i];
- g: v1, h: v2, i: v3, v4: base address of my_Array
- To load my_Array[i] into a register, first turn i into a byte address: multiply by 4
- How to multiply using adds?
- i + i = 2i, 2i + 2i = 4i
- mov a1, v3        ; a1 = i
- add a1, a1, a1    ; a1 = 2i
- add a1, a1, a1    ; a1 = 4i
- Better alternative: mov a1, v3, lsl #2
21. Compile with variable index, cont.
- Now load my_Array[i], located at the address of my_Array[0] plus 4i, into register v1
- ldr v1, [v4, a1]    ; v1 = my_Array[i]
- Finally add h to it and put the sum in g
- add v1, v1, v2    ; g = h + my_Array[i]
22. Compile with variable index: Summary
- C statement
- g = h + my_Array[i];
- Compiled ARM assembly instructions
- mov a1, v3, lsl #2    ; a1 = 4i
- ldr v1, [v4, a1]    ; v1 = my_Array[i]
- Finally add h to it and put the sum in g
- add v1, v1, v2    ; g = h + my_Array[i]
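The same three steps (scale the index, load, add) can be written out explicitly in C to show what the compiler normally hides behind `my_Array[i]`. This is a sketch; the function and parameter names are illustrative, with `base`, `i` and `h` standing in for v4, v3 and v2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Computes h + base[i] using an explicit byte offset, one C statement per
   ARM instruction in the summary. */
int32_t add_indexed_element(const int32_t *base, uint32_t i, int32_t h)
{
    assert(base != NULL);
    uint32_t offset_in_bytes = i << 2;               /* mov a1, v3, lsl #2 */
    int32_t element;
    memcpy(&element, (const unsigned char *)base + offset_in_bytes,
           sizeof element);                          /* ldr v1, [v4, a1]   */
    return h + element;                              /* add v1, v1, v2     */
}
```

Shifting left by 2 is the multiply-by-4 that converts a word index into a byte offset.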
23. Compile with variable index: Example
- Compile this into ARM code
- B_Array[i] = h + A_Array[i];
- h: v1, i: v2, v3: base address of A_Array, v4: base address of B_Array
24. Compile with variable index: Example (Solution)
- Compile this C code into ARM
- B_Array[i] = h + A_Array[i];
- h: v1, i: v2, v3: base address of A_Array, v4: base address of B_Array
- mov a1, v2, lsl #2    ; a1 = 4i
- ldr a2, [v3, a1]    ; v3 + a1 = addr of A_Array[i]; a2 = A_Array[i]
- add a2, a2, v1    ; a2 = h + A_Array[i]
- str a2, [v4, a1]    ; v4 + a1 = addr of B_Array[i]; B_Array[i] = a2
25. COMP3221 Reading Materials (Week 4)
- Week 4: Steve Furber, ARM System-on-Chip Architecture, 2nd Ed, Addison-Wesley, 2000, ISBN 0-201-67519-6. We use chapters 3 and 5
- ARM Architecture Reference Manual, on CD-ROM
26. Notes about Memory
- Pitfall: Forgetting that sequential word addresses in machines with byte addressing do not differ by 1.
- Many an assembly language programmer has toiled over errors made by assuming that the address of the next word can be found by incrementing the address in a register by 1 instead of by the word size in bytes.
- So remember that for both ldr and str, the sum of the base address and the offset must be a multiple of 4 (to be word aligned)
27. More Notes about Memory: Alignment (1/2)
- ARM requires that all words start at addresses that are multiples of 4 bytes
- Called Alignment: objects must fall on an address that is a multiple of their size.
- Some machines, like Intel, allow non-aligned accesses
28. More Notes about Memory: Alignment (2/2)
- A non-aligned word access causes the bytes to be rotated to the right within the word
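The rotation can be sketched in C. The model below assumes the behaviour usually described for the ARM7TDMI: the load fetches the aligned word containing the address and rotates it right by 8 bits for each byte of misalignment. The function names are hypothetical and memory is modelled as an array of words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value right by 'bits' (taken modulo 32). */
uint32_t rotate_right(uint32_t value, unsigned bits)
{
    bits &= 31u;
    return bits == 0 ? value : (value >> bits) | (value << (32u - bits));
}

/* Sketch of a word load from a possibly non-aligned byte address:
   the low two address bits select the rotation, not a different word. */
uint32_t load_word_rotated(const uint32_t *memory, uint32_t address)
{
    assert(memory != NULL);
    uint32_t aligned_word = memory[address / 4];
    return rotate_right(aligned_word, 8u * (address & 3u));
}
```

With 0xAABBCCDD stored at address 4, a load from address 5 in this model yields 0xDDAABBCC rather than a word that straddles two locations.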
29. Role of Registers vs. Memory
- What if more variables than registers?
- Compiler tries to keep the most frequently used variables in registers
- Writing the less common ones to memory is called spilling
- Why not keep all variables in memory?
- Smaller is faster: registers are faster than memory
- Registers are more versatile
- ARM Data Processing instructions can read 2 operands, operate on them, and write 1 per instruction
- ARM data transfer instructions only read or write 1 operand per instruction, with no operation
30. Overview
- Word/ Halfword/ Byte Addressing
- Byte ordering
- Signed Load Instructions
- Instruction Support for Characters
31. Review: Assembly Operands: Memory
- C variables map onto registers; what about large data structures like arrays?
- Memory, 1 of the 5 components of a computer, contains such data structures
- But ARM arithmetic instructions only operate on registers, never directly on memory.
- Data transfer instructions transfer data between registers and memory
- Memory to register
- Register to memory
32. Review: Data Transfer Memory ↔ Reg
- Similar instructions exist for str
- Example: ldr a1, [v1, #12]!
- Pre Indexed Load: the load uses address v1 + 12, and v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Example: ldr a1, [v1, v2]!
- Pre Indexed Load: the load uses address v1 + v2, and v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
- Example: ldr a1, [v1], #12
- Post Indexed Load: the load uses address v1; subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
- Example: ldr a1, [v1], v2
- Post Indexed Load: the load uses address v1; subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
33. Review: Memory Alignment
- ARM requires that all words start at addresses that are multiples of 4 bytes
- Called Alignment: objects must fall on an address that is a multiple of their size.
- Some machines, like Intel, allow non-aligned accesses
34. Data Transfer: More Mem to Reg Variants (1/2)
- Load Byte Example
- ldrb a1, [v1, #12]
- This instruction will take the pointer in v1, add 12 bytes to it, and then load the byte value from the memory pointed to by this calculated sum into register a1.
- Load Byte Example
- ldrb a1, [v1, v2]
- This instruction will take the pointer in v1, add an index offset in register v2 to it, and then load the byte value from the memory pointed to by this calculated sum into register a1.
35. Data Transfer: More Mem to Reg Variants (2/2)
- Load Half Word Example
- ldrh a1, [v1, #12]
- This instruction will take the pointer in v1, add 12 bytes to it, and then load the half word value from the memory pointed to by this calculated sum into register a1.
- Load Half Word Example
- ldrh a1, [v1, v2]
- This instruction will take the pointer in v1, add an index offset in register v2 to it, and then load the half word value from the memory pointed to by this calculated sum into register a1.
36. Data Transfer: More Reg to Mem Variants (1/2)
- Store Byte Example
- strb a1, [v1, #12]
- This instruction will take the pointer in v1, add 12 bytes to it, and then store the least significant byte of register a1 into the memory address pointed to by the calculated sum.
- Store Byte Example
- strb a1, [v1, v2]
- This instruction will take the pointer in v1, add register v2 to it, and then store the least significant byte of register a1 into the memory address pointed to by the calculated sum.
37. Data Transfer: More Reg to Mem Variants (2/2)
- Store Half Word Example
- strh a1, [v1, #12]
- This instruction will take the pointer in v1, add 12 bytes to it, and then store the least significant half word of register a1 into the memory address pointed to by the calculated sum.
- Store Half Word Example
- strh a1, [v1, v2]
- This instruction will take the pointer in v1, add register v2 to it, and then store the least significant half word of register a1 into the memory address pointed to by the calculated sum.
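A short C sketch shows which register bits reach memory in a byte or half word store. Memory is modelled as a byte array, the helper names are hypothetical, and a little-endian layout is assumed for the half word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models strb: only bits 0..7 of the register are written. */
void store_byte(uint8_t *memory, uint32_t address, uint32_t reg)
{
    assert(memory != NULL);
    memory[address] = (uint8_t)(reg & 0xFFu);
}

/* Models strh: only bits 0..15 of the register are written,
   low byte at the lower address (little-endian assumption). */
void store_halfword(uint8_t *memory, uint32_t address, uint32_t reg)
{
    assert(memory != NULL);
    memory[address]     = (uint8_t)(reg & 0xFFu);
    memory[address + 1] = (uint8_t)((reg >> 8) & 0xFFu);
}
```

Storing a register holding 0x12345678 writes 0x78 for a byte store and 0x78, 0x56 for a half word store; neighbouring bytes are left untouched.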
38. Compilation with Memory (Byte Addressing)
- What offset in ldrb to select my_Array[8] (defined as char) in C?
- 1 x 8 = 8 to select my_Array[8]: byte
- Compile by hand using registers: g = h + my_Array[8];
- g: v1, h: v2, v3: base address of my_Array
- 1st transfer from memory to register
- ldrb v1, [v3, #8]    ; v1 gets my_Array[8]
- Add 8 to v3 to select my_Array[8], put into v1
- Next add it to h and place in g
- add v1, v2, v1    ; v1 = h + my_Array[8]
39. Compilation with Memory (Half Word Addressing)
- What offset in ldrh to select my_Array[8] (defined as halfword) in C?
- 2 x 8 = 16 to select my_Array[8]: byte v. halfword
- Compile by hand using registers: g = h + my_Array[8];
- g: v1, h: v2, v3: base address of my_Array
- 1st transfer from memory to register
- ldrh v1, [v3, #16]    ; v1 gets my_Array[8]
- Add 16 to v3 to select my_Array[8], put into v1
- Next add it to h and place in g
- add v1, v2, v1    ; v1 = h + my_Array[8]
40. More Notes about Memory: Word
- How are bytes numbered in a word?
- Gulliver's Travels: Which end of the egg to open?
- Cohen, D. On holy wars and a plea for peace (data transmission). Computer, vol. 14, no. 10, Oct. 1981, pp. 48-54.
- Little Endian: the word address is the address of the least significant byte (Intel 80x86, DEC Alpha)
- Big Endian: the word address is the address of the most significant byte (HP PA, IBM/Motorola PowerPC, SGI, Sparc)
- ARM is Little Endian by default; however, it can be made Big Endian by configuration.
41. Endianness Example
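As one possible illustration of the two byte orders, the C sketch below writes the same 32-bit word into four consecutive bytes each way. The function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
void store_little_endian(uint8_t bytes[4], uint32_t value)
{
    for (int i = 0; i < 4; i++)
        bytes[i] = (uint8_t)(value >> (8 * i));
}

/* Big endian: most significant byte at the lowest address. */
void store_big_endian(uint8_t bytes[4], uint32_t value)
{
    for (int i = 0; i < 4; i++)
        bytes[i] = (uint8_t)(value >> (8 * (3 - i)));
}
```

For the word 0x12345678, the little-endian layout is 78 56 34 12 from the lowest address upward, and the big-endian layout is 12 34 56 78.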
42. Code Example
- Write a segment of code that adds together elements x to x+(n-1) of an array, where element x = 0 is the first element of the array.
- Each element of the array is word sized (i.e. 32 bits).
- The segment should use post-indexed addressing.
- At the start of your segment, you should assume that
- a1 points to the start of the array.
- a2 = x
- a3 = n
43. Code Example: Sample Solution
- add a1, a1, a2, lsl #2    ; Set a1 to address of element x
- add a3, a1, a3, lsl #2    ; Set a3 to address of element x+n (one past the last element to add)
- mov a2, #0    ; Initialise accumulator
- loop
- ldr a4, [a1], #4    ; Access element and move to next
- add a2, a2, a4    ; Add contents to accumulator
- cmp a1, a3    ; Have we reached element x+n?
- blt loop    ; If not, repeat for next element
- ; on exit sum contained in a2
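For comparison, the same loop can be sketched in C, with a pointer that is dereferenced and then advanced in the way the post-indexed ldr does. The function name is hypothetical, and like the assembly loop it assumes n is at least 1.

```c
#include <assert.h>
#include <stdint.h>

/* Sums elements x .. x+n-1 of a word array. Assumes n >= 1, because the
   body runs once before the end test, as in the assembly version. */
int32_t sum_range(const int32_t *array, uint32_t x, uint32_t n)
{
    assert(n >= 1);
    const int32_t *current = array + x;   /* address of element x     */
    const int32_t *end = current + n;     /* address of element x + n */
    int32_t sum = 0;                      /* accumulator              */
    do {
        sum += *current++;                /* load, then step to next  */
    } while (current < end);              /* reached element x + n?   */
    return sum;
}
```

The pointer arithmetic `array + x` already scales by the element size, which is what `lsl #2` does by hand in the assembly.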
44. Sign Extension and Load Byte / Load Half Word
- ARM instruction (ldrsb) automatically extends the sign of the byte for load byte.
- ARM instruction (ldrsh) automatically extends the sign of the half word for load half word.
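What sign extension does to the loaded value can be sketched in C. The helpers below are illustrative models only (the names are assumptions), with memory as a byte array and a little-endian layout assumed for the half word.

```c
#include <assert.h>
#include <stdint.h>

/* Models ldrb: the byte is zero-extended to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *memory, uint32_t address)
{
    return (uint32_t)memory[address];
}

/* Models ldrsb: bit 7 of the byte is copied into bits 8..31. */
int32_t load_byte_signed(const uint8_t *memory, uint32_t address)
{
    uint32_t byte = memory[address];
    return (int32_t)(byte ^ 0x80u) - 0x80;
}

/* Models ldrsh: bit 15 of the half word is copied into bits 16..31. */
int32_t load_halfword_signed(const uint8_t *memory, uint32_t address)
{
    uint32_t half = (uint32_t)memory[address]
                  | (uint32_t)memory[address + 1] << 8;
    return (int32_t)(half ^ 0x8000u) - 0x8000;
}
```

The byte 0xFF therefore loads as 255 with the unsigned model and as -1 with the signed one.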
45. Instruction Support for Characters
- ARM (and most other instruction sets) include instructions to operate on bytes
- Load byte (ldrb) loads a byte from memory, placing it in the rightmost 8 bits of a register; store byte (strb) does the reverse
- Declare byte variables in C as char
- Assume x, y are declared char, with x in memory at [v1, #4] and y at [v1, #0]. What is ARM code for x = y; ?
- ldrb a1, [v1, #0]    ; a1 = y
- strb a1, [v1, #4]    ; transfer y to x
46. Strings in C: Example
- String: simply an array of char
    void strcpy (char x[], char y[])
    {
        int i = 0;                        /* declare, initialize i */
        while ((x[i] = y[i]) != '\0')     /* 0 */
            i = i + 1;                    /* copy and test byte */
    }
- function
- i, addr. of x[0], addr. of y[0]: v1, a1, a2; func ret addr.: lr
    strcpy  mov  v1, #-1          ; i = -1
    L1      add  v1, v1, #1       ; i = i + 1
            ldrb a3, [a2, v1]     ; a3 = y[i]
            strb a3, [a1, v1]     ; x[i] = y[i]
            cmp  a3, #0
            bne  L1               ; y[i] != 0 ? goto L1
            mov  pc, lr           ; return
47. Strings in C: Example using pointers
- String: simply an array of char
    void strcpy2 (char *px, char *py)
    {
        while ((*px++ = *py++) != '\0')   /* 0 */
            ;                             /* copy and test byte */
    }
- function
- addr. of x[0], addr. of y[0]: v2, v3; func ret addr.: lr
    strcpy
    L1      ldrb a1, [v3], #1     ; a1 = *py, py = py + 1
            strb a1, [v2], #1     ; *px = *py, px = px + 1
            cmp  a1, #0
            bne  L1               ; *py != 0 ? goto L1
            mov  pc, lr           ; return
- Ideally the compiler optimizes the code for you
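48. Block Copy Transfer (1/5)
- Consider the following code
    str a1, [v1], #4
    str a2, [v1], #4
    str a3, [v1], #4
    str a4, [v1], #4
- Memory contents, lowest address first: a1, a2, a3, a4
- Consider the following code
    str a1, [v1, #4]!
    str a2, [v1, #4]!
    str a3, [v1, #4]!
    str a4, [v1, #4]!
- Memory contents, lowest address first: a1, a2, a3, a4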
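49. Block Copy Transfer (2/5)
- Consider the following code
    str a1, [v1], #-4
    str a2, [v1], #-4
    str a3, [v1], #-4
    str a4, [v1], #-4
- Memory contents, lowest address first: a4, a3, a2, a1
- Consider the following code
    str a1, [v1, #-4]!
    str a2, [v1, #-4]!
    str a3, [v1, #-4]!
    str a4, [v1, #-4]!
- Memory contents, lowest address first: a4, a3, a2, a1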
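The four store sequences with base update differ only in the sign of the step and in whether v1 changes before or after each store. A C sketch under those assumptions, with memory modelled as an array of words and `store_four` as a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* 'step' is +4 or -4 bytes. 'before' selects pre-indexed (update base, then
   store) or post-indexed (store, then update base). '*base' is a byte
   address into 'memory' and is left updated, as v1 would be. */
void store_four(uint32_t *memory, uint32_t *base, int32_t step, int before,
                const uint32_t values[4])
{
    assert(step == 4 || step == -4);
    for (int i = 0; i < 4; i++) {
        if (before)
            *base += (uint32_t)step;      /* str rX, [v1, #step]! */
        memory[*base / 4] = values[i];
        if (!before)
            *base += (uint32_t)step;      /* str rX, [v1], #step  */
    }
}
```

Incrementing sequences leave the first value at the lowest address, while decrementing sequences leave the last value there; the before/after choice shifts the whole block by one word.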
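50. Block Copy Transfer (3/5)
- Consider the following code
    str a1, [v1]
    str a2, [v1, #4]
    str a3, [v1, #8]
    str a4, [v1, #12]
- Memory contents, lowest address first: a1, a2, a3, a4
- Consider the following code
    str a1, [v1, #4]
    str a2, [v1, #8]
    str a3, [v1, #12]
    str a4, [v1, #16]
- Memory contents, lowest address first: a1, a2, a3, a4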
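51. Block Copy Transfer (4/5)
- Consider the following code
    str a1, [v1]
    str a2, [v1, #-4]
    str a3, [v1, #-8]
    str a4, [v1, #-12]
- Memory contents, lowest address first: a4, a3, a2, a1
- Consider the following code
    str a1, [v1, #-4]
    str a2, [v1, #-8]
    str a3, [v1, #-12]
    str a4, [v1, #-16]
- Memory contents, lowest address first: a4, a3, a2, a1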
52. Block Data Transfer (5/5)
- Similarly we have
- LDMIA: Load Multiple Increment After
- LDMIB: Load Multiple Increment Before
- LDMDA: Load Multiple Decrement After
- LDMDB: Load Multiple Decrement Before
For details see Chapter 3, pages 61-62, of Steve Furber, ARM System-on-Chip Architecture, 2nd Ed, Addison-Wesley, 2000, ISBN 0-201-67519-6.
54. And in Conclusion (1/2)
- In ARM Assembly Language
- Registers replace C variables
- One Instruction (simple operation) per line
- Simpler is Better
- Smaller is Faster
- Memory is byte-addressable, but ldr and str access one word at a time.
- Access bytes and halfwords using ldrb, ldrh, ldrsb and ldrsh
- A pointer (used by ldr and str) is just a memory address, so we can add to it or subtract from it (using offset).
55. And in Conclusion (2/2)
- New Instructions
- ldr, str
- ldrb, strb
- ldrh, strh
- ldrsb, ldrsh