Title: Using the Assembler
1. Using the Assembler
- Chapter 4
- Operators and Expressions
- JMP and LOOP Instructions
- Indirect Addressing
- Using a Link Library
2. Producing the .lst and .map files
- With MASM 6.11, the ML command assembles and links
a .ASM file:
- ml hello.asm
- It produces a .OBJ file and a .EXE file
- Option /Zi produces debugging info for CodeView (CV)
- Option /Fl produces a source listing (.lst file)
- Option /Fm produces a map file (.map file)
- Ex: to produce all of these, type (with spaces between the options):
- ml /Zi /Fl /Fm hello.asm
3. Examining the .lst and .map files
- hello.lst
- The R suffix of an address indicates that it is
relocatable (resolved at loading time)
- With the .model small directive, MASM aligns the
segments in memory in the following way:
- the stack segment is aligned at the next available
paragraph boundary (ie, at a physical address
divisible by 10h)
- the code and data segments are aligned at the
next available word boundary (ie, at a physical
address divisible by 2)
- The (relative) starting physical addresses of the
segments are found in the hello.map file
4. Alignment of the data segment
- The offset address of the first data is generally
not 0 if the data segment is loaded on a word
boundary
- Ex: if the code part of hello.asm takes 11h bytes
and starts at 30000h, the first data will start
at 30012h and DS will contain 3001h (the segment
base 30010h is the paragraph boundary just below 30012h)
- .data
- message db "Hello, world!",0dh,0ah,'$'
- The offset address of message will be 2 instead
of 0 (check this with CodeView):
- mov bx, offset message ; BX = 2
- mov bx, offset message+1 ; BX = 3
- ...
5. Alignment of the data segment (cont.)
- TASM, however, will align the data segment at the
first paragraph boundary (with the .model small
directive)
- So the offset address of the first data in the
data segment will always be 0
- (as indicated in section 4.1.4 of the
textbook)
6. Memory Models supported by MASM and TASM
7. Processor Directives
- By default, MASM and TASM only enable the assembly
of the 8086 instructions
- The .386 directive enables the assembly of 386
instructions
- instructions can then use 32-bit operands,
including 32-bit registers
- Place this directive just after .model
- Same principle for all the x86 family
- Ex: use the .586 directive to enable the assembly
of Pentium instructions
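The placement rule above can be illustrated with a minimal sketch of a complete program. The variable name total and the value loaded are hypothetical, and the program assumes a DOS environment (exit through INT 21h, function 4Ch).

```asm
.model small
.386                    ; placed just after .model: 386 instructions enabled
.stack 100h
.data
total dd 0              ; hypothetical 32-bit variable
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov eax, 12345678h  ; 32-bit register: rejected without .386
    add eax, 1
    mov total, eax      ; 32-bit memory operand
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```

Removing the .386 line should make the assembler reject every instruction that names EAX.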
8. Using a Link Library
- A link library is a file containing compiled
procedures. Ex: irvine.lib contains procedures
for doing I/O and string-to-number conversion
(see table 5 and appendix E). Ex:
- Readint reads a signed decimal string from the
keyboard and stores the corresponding 16-bit
signed number into AX
- Writeint_signed displays the signed decimal
string representing the signed number in AX
- To use these procedures in intIO.asm:
- ml irvine.lib intIO.asm
9. The EXTRN directive
- A program must use the EXTRN directive whenever it
uses a name that is defined in another file. Ex:
in intIO.asm we have
- extrn Readint:proc, Writeint_signed:proc
- For externally defined variables we use either
byte, word or dword. Ex:
- extrn byteVar:byte, wordVar:word
- For an externally defined constant, we use abs:
- extrn true:abs, false:abs
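As a rough sketch of what intIO.asm might look like, the program below reads a number and echoes it. It assumes Readint and Writeint_signed behave exactly as described on the previous slide (value returned and passed in AX); the variable name value is hypothetical, and the actual intIO.asm may differ.

```asm
.model small
.stack 100h
.data
value dw ?              ; hypothetical variable holding the number read
.code
extrn Readint:proc, Writeint_signed:proc   ; defined in irvine.lib
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    call Readint        ; assumed: signed number returned in AX
    mov value, ax
    mov ax, value
    call Writeint_signed ; assumed: displays the signed number in AX
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```

The program only links if irvine.lib is given to ML along with the source file.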
10. The LOOP instruction
- The easiest way to repeat a block of statements a
specific number of times:
- LOOP label
- where the label must precede LOOP by less than
127 bytes of code (the jump distance is encoded
as a signed 8-bit displacement)
- LOOP produces the following sequence of events:
- 1) CX is decremented by 1
- 2) IF (CX = 0) THEN go to the instruction following
LOOP, ELSE go to label
- LOOPD uses the 32-bit ECX register as the loop
counter (.386 directive and up)
11. The LOOP instruction (cont.)
- Ex: the following will print all the ASCII codes
(starting with 00h):
- mov cx,128
- mov dl,0
- mov ah,2
- next:
- int 21h
- inc dl
- loop next
- If CX were initialized to zero:
- after executing the block for the 1st time, CX
would be decremented by 1 and thus contain FFFFh
- the loop would thus be repeated 65535 more times
(64K iterations in total)!!
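One common way to guard against the CX = 0 case is to test CX before entering the loop. The sketch below uses the JCXZ instruction (jump if CX is zero), which is not covered in these slides; the variable count and the label done are hypothetical names.

```asm
.model small
.stack 100h
.data
count dw 0              ; hypothetical counter; 0 is the problematic case
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov cx, count
    jcxz done           ; CX = 0: skip the loop entirely
    mov dl, 'A'         ; first character to print
    mov ah, 2           ; DOS function 2: display the character in DL
next:
    int 21h
    inc dl
    loop next
done:
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```

With count set to 0 the program prints nothing; with count set to 3 it should print ABC.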
Break ...
12. Indirect Addressing
- Up to now we have only used direct operands
- such an operand is either the immediate value we
want to use or a register/variable that contains
the value we want to use
- But to manipulate a set of values stored in a
large array, we need an operand that can index
(and run along) the array
- An operand that contains the offset address of
the data we want to use is called an indirect
operand
- To specify to the assembler that an operand is
indirect, we enclose it between [ ]
13. Indirect Addressing (cont.)
- Ex: if SI = 100h and the word located at offset 100h
contains the value 1234h, the following loads AX
with 1234h:
- mov ax,[si] ; AX = 1234h if SI = 100h
- In contrast, the following loads AX with 100h:
- mov ax,si ; AX = 100h if SI = 100h
- In conclusion:
- mov ax,si loads AX with the content of SI
- mov ax,[si] loads AX with the word pointed to by
SI
14. Ex: summing the elements of an array
- .data
- Arr dw 12,26,43,13,97,16,73,41
- count = ($ - Arr)/2 ; number of elements
- .code
- mov ax,0 ; AX holds the sum
- mov si,offset Arr
- mov cx,count
- L1:
- add ax,[si]
- add si,2 ; go to the next word
- loop L1
15. Indirect Addressing (cont.)
- For 16-bit registers:
- only BX, BP, SI, DI can be used as indirect
operands
- For 32-bit registers:
- EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI can be
used as indirect operands
- Caution when using 32-bit registers in real mode
(only the 1st MB is addressable):
- mov ebx, 1000000h
- mov ax, [ebx] ; outside real-mode address space
16. Indirect Addressing (cont.)
- The default segment used for the offset:
- it is SS whenever BP, EBP or ESP is used
- it is DS whenever the other registers are used
- This can be overridden by the : operator
- mov ax, [si] ; offset from DS
- mov ax, es:[si] ; offset from ES
- mov ax, [bp] ; offset from SS
- mov ax, cs:[bp] ; offset from CS
- With indirect addressing, the type is adjusted
according to the destination operand
- mov ax,[edi] ; 16-bit operand
- mov ch,[ebx] ; 8-bit operand
- mov eax,[si] ; 32-bit operand
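The effect of the default segment can be seen in a minimal sketch like the following, where BP holds the offset of a variable in the data segment. The array name Arr and its values are hypothetical, and the sketch assumes DS is initialized by hand so that SS and DS hold different segment values.

```asm
.model small
.stack 100h
.data
Arr dw 1111h, 2222h     ; hypothetical data
.code
main proc
    mov ax, @data       ; initialize DS (SS still points to the stack segment)
    mov ds, ax
    mov bp, offset Arr
    mov ax, ds:[bp]     ; DS override: AX = 1111h
                        ; a plain [bp] would read from SS:BP instead
    mov si, offset Arr
    mov bx, [si+2]      ; SI defaults to DS: BX = 2222h
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```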
17. Base and Index Addressing
- Base registers (BX and BP), index registers (SI
and DI) and 32-bit registers can be used with
displacements (ie, constant and/or variable)
- If A is a variable, the following forms are
permitted:
- mov ax, [bp+4]
- mov ax, 4[bp] ; same as above
- mov ax, [si+A]
- mov ax, A[si] ; same as above
- mov ax, A[edx+4]
18. Base and Index Addressing (cont.)
- Example of using displacements:
- .data
- A db 2,4,6,8,10
- .code
- mov si,3
- mov dl, A[si+1] ; DL = 10
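A variable used as a displacement also gives a compact way to run along an array: the index register then holds the element position rather than an address. The following is a minimal sketch that sums the five bytes of A; the label L1 and the choice of AL as the accumulator are illustrative.

```asm
.model small
.stack 100h
.data
A db 2,4,6,8,10
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov al, 0           ; AL holds the sum
    mov si, 0           ; SI = index of the current element
    mov cx, 5           ; number of elements
L1:
    add al, A[si]       ; displacement A + index SI
    inc si              ; next byte
    loop L1             ; at exit AL = 30 (1Eh)
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```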
19. Base-Index Addressing
- Base-index addressing is used when both a base
and an index register are used as an indirect
operand
- When two 16-bit registers are used as indirect
operands, one must be a base register and the
other must be an index register
- mov ah, [bp+bx] ; invalid, both are base
- mov ah, [si+di] ; invalid, both are index
- mov ah, [bp+si] ; valid, segment is in SS
- mov ah, [bx+si] ; valid, segment is in DS
20. Base-Index Addressing (cont.)
- A two-dimensional array example:
- .data
- rowsize = 3
- arr db 10h, 20h, 30h
- db 0Ah, 0Bh, 0Ch
- .code
- mov bx, rowsize ; choose 2nd row
- mov si, 2 ; choose 3rd column
- mov al, arr[bx+si] ; AL = 0Ch
- mov al, arr[bx][si] ; equivalent form
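The same addressing form can walk along a whole row. The sketch below sums the 2nd row of arr; the constant name row and the label L1 are hypothetical, and the row offset is computed by the assembler as row * rowsize.

```asm
.model small
.stack 100h
.data
rowsize = 3
row = 1                 ; hypothetical: 0-based index of the 2nd row
arr db 10h, 20h, 30h
    db 0Ah, 0Bh, 0Ch
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov bx, row*rowsize ; BX = offset of the chosen row within arr
    mov si, 0           ; SI = column index
    mov al, 0           ; AL holds the sum
    mov cx, rowsize     ; one iteration per column
L1:
    add al, arr[bx+si]
    inc si
    loop L1             ; at exit AL = 0Ah+0Bh+0Ch = 21h
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```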
21. Base-Index Addressing with 32-bit registers
- Both of the 32-bit registers can be base or index
(previous restriction is lifted)
- mov ax, [ecx+edx] ; permitted, both are index
- mov ax, [ebx+edx] ; permitted, base and index
- mov ax, [ebx][edx] ; same as above
- The 1st register determines the segment used:
- mov ax,[esi+ebp] ; offset from DS
- mov ax,[ebp+esi] ; offset from SS
- We can also add displacements:
- mov dh, A[esi+edi+2]
Break ...
22. The OFFSET Operator
- The OFFSET operator returns the distance of a label or
variable from the beginning of its segment
- Example (assuming bList is the first data, at
offset 0 of the data segment):
- .data
- bList db 10h, 20h, 30h, 40h
- wList dw 1000h, 2000h, 3000h
- .code
- mov al, bList ; AL = 10h
- mov di, offset bList ; DI = 0000
- mov bx, offset bList+1 ; BX = 0001
23. The SEG Operator
- The SEG operator returns the segment part of a
label's or variable's address
- Example:
- push ds
- mov ax, seg array
- mov ds, ax
- mov bx, offset array
- .
- pop ds
24. The PTR Operator (directive)
- Sometimes the assembler cannot figure out the
type of the operand. Ex:
- mov [bx],1
- should value 01h be moved to the location pointed
to by BX, or should it be value 0001h?
- The PTR operator forces the type:
- mov byte ptr [bx], 1 ; moves 01h
- mov word ptr [bx], 1 ; moves 0001h
- mov dword ptr [bx], 1 ; moves 00000001h
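The difference between the byte and word forms shows up in how many bytes of memory are overwritten. Here is a minimal sketch using a hypothetical 4-byte buffer named buf, initially filled with FFh:

```asm
.model small
.stack 100h
.data
buf db 0FFh, 0FFh, 0FFh, 0FFh   ; hypothetical buffer
.code
main proc
    mov ax, @data           ; initialize DS
    mov ds, ax
    mov bx, offset buf
    mov byte ptr [bx], 1    ; 1 byte written:  buf = 01 FF FF FF
    mov word ptr [bx+2], 1  ; 2 bytes written: buf = 01 FF 01 00
    mov ax, 4C00h           ; return to DOS
    int 21h
main endp
end main
```

The word store writes the low-order byte 01h first and the high-order byte 00h after it.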
25. The LABEL directive
- It gives a name and a size to an existing storage
location. It does not allocate storage.
- It must be used in conjunction with byte, word,
dword, qword...
- .data
- val16 label word
- val32 dd 12345678h
- .code
- mov eax,val32 ; EAX = 12345678h
- mov ax,val32 ; error (operand sizes differ)
- mov ax,val16 ; AX = 5678h
- val16 is just an alias for the first two bytes of
the storage location val32 (the low-order word is
stored first)
26. The TYPE Operator
- It returns the size, in bytes, of a single element
of a variable
- .data
- var1 dw 1, 2, 3
- var2 dd 4, 5, 6
- .code
- mov bx, type var1 ; BX = 2
- mov bx, type var2 ; BX = 4
- Handy for array processing. Ex: if SI points to
an element of var2, then to make SI point to the
next element, we can simply write
- add si, type var2
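Put together, the TYPE operator lets a loop step through an array without hard-coding the element size. The following sketch sums the three doublewords of var2; the constant name count and the label L1 are hypothetical, and the .386 directive is needed for the 32-bit accumulator.

```asm
.model small
.386                    ; 32-bit registers are used below
.stack 100h
.data
var2 dd 4, 5, 6
count = ($ - var2) / type var2  ; hypothetical constant: 12 / 4 = 3 elements
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov eax, 0          ; EAX holds the sum
    mov si, offset var2
    mov cx, count
L1:
    add eax, [si]       ; operand size taken from EAX (32 bits)
    add si, type var2   ; advance by one element (4 bytes)
    loop L1             ; at exit EAX = 15
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```

Changing var2 to a dw array (and EAX to AX) would require no change to the add si line.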
27. The LENGTH, SIZE Operators
- The LENGTH operator counts the number of
individual elements in a variable that has been
defined using DUP
- .data
- var1 dw 1000
- var2 db 10, 20, 30
- array dw 32 dup(0)
- .code
- mov ax, length var1 ; AX = 1
- mov ax, length var2 ; AX = 1 (not defined with DUP)
- mov ax, length array ; AX = 32
- The SIZE operator is equivalent to LENGTH * TYPE
(ex: size array = 32 * 2 = 64)
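Because SIZE gives a byte count, it is convenient as the loop counter of a byte-by-byte operation. The sketch below clears an array one byte at a time; the initial value 0FFFFh and the label L1 are illustrative choices.

```asm
.model small
.stack 100h
.data
array dw 32 dup(0FFFFh) ; 32 words, initially all ones
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov cx, size array  ; LENGTH * TYPE = 32 * 2 = 64 bytes
    mov si, offset array
L1:
    mov byte ptr [si], 0 ; clear one byte
    inc si
    loop L1             ; 64 iterations clear the whole array
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```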
28. Sign and Zero Extend Instructions
- MOVZX (move with zero-extend)
- MOVSX (move with sign-extend)
- Both move the source into a destination of larger
size (valid only for 386 and later processors)
- immediate operands are not allowed
- mov bl, 07h
- mov bh, 80h
- movzx ax,bh ; AX = 0080h
- movsx dx,bh ; DX = FF80h
- movsx ax, bl ; AX = 0007h
- movzx ecx,dx ; ECX = 0000FF80h
- movsx ecx,dx ; ECX = FFFFFF80h
- movsx ecx,ax ; ECX = 00000007h
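A typical use of MOVSX is to widen signed bytes before adding them, so that the sum cannot overflow 8 bits. The following is a minimal sketch; the array name vals, its contents and the label L1 are hypothetical.

```asm
.model small
.386                    ; MOVSX requires a 386 or later
.stack 100h
.data
vals db 100, 100, -50   ; signed bytes; their sum (150) does not fit in 8 bits
.code
main proc
    mov ax, @data       ; initialize DS
    mov ds, ax
    mov dx, 0           ; DX holds the 16-bit sum
    mov si, 0           ; SI = index of the current byte
    mov cx, 3           ; number of elements
L1:
    movsx ax, vals[si]  ; sign-extend the byte into AX
    add dx, ax
    inc si
    loop L1             ; at exit DX = 150 (0096h)
    mov ax, 4C00h       ; return to DOS
    int 21h
main endp
end main
```

Using MOVZX here instead would treat -50 (CEh) as 206 and give a wrong signed sum.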