Appendix C: Assembly Language Programming - PowerPoint PPT Presentation

Provided by: markt2
1
Appendix C Assembly Language Programming
  • CS 271 Computer Architecture
  • Indiana University Purdue University Fort Wayne

2
Machine and assembly language
  • Machine language
  • Used to program the ISA level of a computer
    system
  • ISA level is just above the microarchitecture
    level
  • Instructions consist of strings of 1s and 0s
  • Writing programs is very difficult and tedious
  • Assembly language
  • An easier-to-use symbolic representation of
    machine language
  • Uses mnemonics for operations
  • ADD, MUL, MOV, CMP, PUSH, etc.
  • Also includes calls to operating system service
    routines
  • An assembly language program is translated into
    machine language by a program called an assembler

3
Intel 8088 assembly language
  • Every distinct processor has its own assembly
    language
  • However, most assembly languages are similar
  • The 8088 processor was used in the original IBM
    PC
  • 8088 assembly language programs run on modern
    Pentium 4 processors
  • Most of the Pentium's core instructions are the
    same as the 8088's
  • However, they act on 32-bit registers instead of
    16-bit registers
  • Learning about computer architecture is
    facilitated by learning an assembly language
  • Intel 8088 assembly language is a good choice and
    also is a gentle introduction to Pentium assembly
    language programming

4
8088 sample assembly language program
(a) An assembly language program (b) The
corresponding tracer display
5
8088 sample assembly language program
  • In the sample program make note of . . .
  • Constant definitions (used by the assembler)
  • _EXIT = 1, _WRITE = 4, _STDOUT = 1
  • Pseudoinstructions (commands to the assembler)
  • .SECT .TEXT ! Activity section
  • .SECT .DATA ! Data section
  • .ASCII "Hello World\n" ! 12-byte string
    initialization
  • Labels (converted to memory addresses by
    assembler)
  • start, hw, de
  • Instructions (translated into machine language)
  • MOV, PUSH, ADD, SUB
  • Operating system call SYS

6
Intel 8088 assembly language
  • To program the 8088, it is necessary to have
    detailed knowledge of the instruction set
    architecture
  • Processor and fetch / execute cycle
  • Registers
  • Memory and addressing
  • The instruction set

7
8088 processor and fetch / execute cycle
  • The fetch / execute cycle involves special
    registers
  • Program Counter (PC)
  • Also known as Instruction Pointer (IP)
  • The PC always contains the address of the next
    instruction
  • Fetch-execute cycle
  • 1. Fetch the next instruction from the memory
    location referred to by the PC register
  • 2. Increment the PC
  • 3. Decode the fetched instruction
  • 4. Fetch any needed data from memory and/or
    processor registers
  • 5. Execute the instruction
  • 6. Store the results of the instruction in memory
    and/or registers
  • 7. Go back to step 1 and repeat
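
The cycle can be sketched as a small Python loop. This is only an illustrative model: the three-instruction machine (LOAD, ADD, HALT) and the function name run are assumptions made for the example, not part of the 8088 instruction set.

```python
# Toy model of the fetch / execute cycle. The three-instruction machine
# (LOAD, ADD, HALT) is hypothetical; it is not the 8088 instruction set.
def run(program):
    regs = {"AX": 0, "PC": 0}
    while True:
        instr = program[regs["PC"]]   # step 1: fetch from the address in PC
        regs["PC"] += 1               # step 2: increment the PC
        op, operand = instr           # step 3: decode
        if op == "HALT":
            return regs
        if op == "LOAD":              # steps 4-6: get data, execute, store
            regs["AX"] = operand & 0xFFFF
        elif op == "ADD":
            regs["AX"] = (regs["AX"] + operand) & 0xFFFF
        # step 7: loop back to step 1


result = run([("LOAD", 5), ("ADD", 7), ("HALT", None)])
print(result)
```

Note that the PC already points past an instruction while that instruction executes, which is why it "always contains the address of the next instruction".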

8
8088 registers
  • There is a set of 14 registers
  • Each register is 16-bits wide

9
8088 registers
  • The registers AX, BX, CX, and DX are general
    registers
  • The High and Low bytes of each can be accessed
  • E.g., AH and AL are the high and low bytes of AX
  • AX is the accumulator
  • Used for results and as the target of many
    instructions
  • BX is the base register
  • Used as a pointer to a memory address for some
    instructions
  • CX is the counter register
  • Used for a loop counter
  • Automatically decremented
  • Loop ends when CX reaches 0
  • DX is the data register
  • Used with AX to hold the high-order bits when a
    32-bit long word is needed
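
As a rough illustration of how the byte halves and the DX/AX pairing relate to the 16-bit registers, the following Python sketch does the bit arithmetic. The helper names split_word and join_long are made up for this example.

```python
def split_word(ax):
    """Return (AH, AL): the high and low bytes of a 16-bit register value."""
    return (ax >> 8) & 0xFF, ax & 0xFF


def join_long(dx, ax):
    """Combine DX (high-order word) and AX (low-order word) into 32 bits."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)


ah, al = split_word(0x12AB)
long_value = join_long(0x0001, 0x0002)
print(hex(ah), hex(al), hex(long_value))
```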

10
8088 registers
  • Maximum memory is 1 MB
  • Each byte is addressable
  • 20-bit addresses are needed to address 1 MB
    (2^20 bytes)
  • 16-bit registers can address only 2^16 bytes = 64 KB
  • Segment registers point to the base address of a
    64 KB area of memory known as a segment
  • A segment register gives the 16 high-order bits
    of the 20-bit base address
  • Remaining 4 low-order bits are all zero
  • Base addresses must be evenly divisible by
    2^4 = 16
  • Segment registers are CS, DS, SS, and ES
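
The address arithmetic can be checked with a short Python sketch. The function name physical_address is an assumption for this example; the final mask models the 20 address lines, so a sum that overflows 20 bits wraps around.

```python
def physical_address(segment, offset):
    """Shift the 16-bit segment value left 4 bits and add the 16-bit offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # keep 20 bits


addr = physical_address(0x1234, 0x0010)
top = physical_address(0xFFFF, 0x000F)
print(hex(addr), hex(top))
```

Shifting left by 4 bits is the same as multiplying by 16, which is why every segment base is evenly divisible by 16.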

11
8088 segment registers
  • CS is the code segment register
  • Points to the code segment containing program
    instructions
  • DS is the data segment register
  • Points to the data segment containing program
    data
  • SS is the stack segment register
  • Points to the system stack segment used for
    subroutine linkage
  • ES points to the extra segment
  • Can be used whenever another segment is needed

12
The 8088 system stack
  • The system stack . . .
  • Holds stack frames and temporary variables
  • Grows toward smaller addresses
  • Only 2-byte words are allowed at even addresses
  • A stack frame is created each time a subroutine is
    called for holding . . .
  • Return address
  • Parameters
  • Local variables
  • Temporary variables are also pushed onto and
    popped from the stack as the subroutine runs

13
8088 pointer and index registers
  • There are four registers in this group: SP, BP,
    SI, DI
  • SP is the stack pointer
  • An index that is added to SS to point to the top
    of the system stack
  • For a PUSH or CALL, the SP is decremented
  • For a POP, the SP is incremented
  • BP is the base pointer
  • Contains an index to a location within the stack
  • Typically points to the beginning of the current
    subroutine's stack frame

14
8088 pointer and index registers
  • SI is the source index register
  • DI is the destination index register
  • SI and DI are often used . . .
  • with BP to address data on the stack
  • with BX to compute the address of a data location
    in memory
  • An additional index register is the PC (also
    called IP)
  • The PC indexes into the code segment to
    address the next instruction in a program
  • The programmer has no direct control over the PC

15
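Segments
  • [Figure: memory layout. A combined stack and data
    segment (64 KB), based at SS = DS, holds the data at
    its low end and the stack at its high end; BP marks
    the current stack frame and SP the top of the stack.
    A separate code segment (64 KB), based at CS, holds
    the program code, with the PC pointing into it.]
  • Typically, the stack segment and the data segment
    are the same
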
16
Condition codes and flags
  • The flag register is actually a set of 1-bit
    registers
  • Also called condition code register
  • Some of the bits are set according to the result
    of arithmetic instructions
  • Z set if result is 0
  • S set if result is negative
  • O set if overflow occurred
  • C set by a carry
  • P set according to the parity of the result
  • Other bits control processor operation
  • I bit enables interrupts
  • T bit enables tracing mode for debugging
  • D bit controls direction of string operations
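
A minimal Python sketch of how the arithmetic flags could be derived for a 16-bit addition. The function name add16_flags is hypothetical. The parity rule used here (P is set when the low byte of the result has an even number of 1 bits) is the convention of the 8086 family.

```python
def add16_flags(a, b):
    """Add two 16-bit values; return the 16-bit result and the Z/S/C/O/P flags."""
    total = a + b
    result = total & 0xFFFF
    flags = {
        "Z": result == 0,                      # result is zero
        "S": bool(result & 0x8000),            # sign bit of the result
        "C": total > 0xFFFF,                   # carry out of bit 15
        # overflow: operands share a sign but the result's sign differs
        "O": bool(~(a ^ b) & (a ^ result) & 0x8000),
        "P": bin(result & 0xFF).count("1") % 2 == 0,
    }
    return result, flags


signed_case = add16_flags(0x7FFF, 0x0001)
carry_case = add16_flags(0xFFFF, 0x0001)
print(signed_case)
print(carry_case)
```

The first call overflows as a signed addition (32767 + 1) without a carry; the second carries out of bit 15 but does not overflow as a signed addition (-1 + 1).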

17
Data
  • The 8088 supports 4 data types
  • 1-byte byte
  • 2-byte word
  • 4-byte long
  • binary coded decimal (not supported by
    interpreter)
  • The 8088 is little endian
  • The low-order part of a word is stored in the
    lower address
  • A long is stored in the DX:AX combination with
    the low-order word in AX
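
Little-endian byte order can be seen with Python's standard struct module, where the "<" prefix requests little-endian packing. This is only an illustration of the byte order, not 8088 code.

```python
import struct

# "<H" = little-endian unsigned 16-bit word, "<I" = little-endian 32-bit long
word_bytes = struct.pack("<H", 0x1234)
long_bytes = struct.pack("<I", 0x12345678)

print(word_bytes.hex(), long_bytes.hex())
```

In both cases the low-order byte comes first, that is, at the lower address.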

18
Addressing
  • Addressing refers to techniques (addressing
    modes) for representing the locations of data
    elements in memory or in registers
  • An operand is an assembly language code used to
    represent a data element
  • An effective address is an address in a memory
    segment
  • Parentheses around a register indicate the
    register is a pointer to the effective address
  • In describing addressing, the symbol # indicates
    a numerical value or label

19
Addressing
  • Instructions can have 0, 1, or 2 operands
  • The operands of two-address instructions are
    typically called destination and source
  • Example MOV AX, BX
  • AX is the destination
  • BX is the source
  • This instruction replaces the contents of AX by a
    copy of BX
  • Sometimes an operand is implicit (not mentioned)
  • Example MULB BL
  • This multiplies AL by BL (1 byte each) and stores
    the 16-bit result in AX

20
Addressing modes
  • Register addressing example AL and CL
  • MOV CL, AL
  • The operand is simply the name of a byte or word
    register

21
Data segment addressing modes
  • Direct addressing examples: (20)
  • ADD CX, (20)
  • The word in the data segment at index 20 is added
    to CX
  • Involves addresses 20 and 21
  • ADDB CL, (20)
  • The byte in the data segment at index 20 is added
    to CL
  • Register indirect addressing example (SI)
  • MOV CX, (SI)
  • Move the data segment word pointed to by SI into
    CX

22
Data segment addressing modes
  • Register displacement addressing example 20(SI)
  • MOVB AL, 20(SI)
  • If SI contains 17, then the effective address is
    byte 37
  • Move the data segment byte at effective address
    37 into AL
  • Register with index addressing example (BX)(DI)
  • PUSH (BX)(DI)
  • The effective address is the sum of the BX and DI
    registers
  • PUSH the data segment word at the effective
    address on the system stack

23
Data segment addressing modes
  • Register index displacement addressing example
    20(BX)(DI)
  • This combines the previous two modes
  • PUSH 20(BX)(DI)
  • The effective address is the sum of the BX and DI
    registers plus 20
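
The data segment modes all reduce to adding up to three 16-bit quantities. A small Python sketch, assuming a hypothetical helper effective_address and made-up register contents:

```python
def effective_address(disp=0, base=0, index=0):
    """Sum displacement, base register and index register, modulo 2^16."""
    return (disp + base + index) & 0xFFFF


# Hypothetical register contents, chosen only for illustration.
regs = {"BX": 0x0100, "SI": 17, "DI": 0x0004}

ea_indirect = effective_address(index=regs["SI"])                    # (SI)
ea_disp = effective_address(disp=20, index=regs["SI"])               # 20(SI)
ea_index = effective_address(base=regs["BX"], index=regs["DI"])      # (BX)(DI)
ea_full = effective_address(20, regs["BX"], regs["DI"])              # 20(BX)(DI)

print(ea_indirect, ea_disp, hex(ea_index), hex(ea_full))
```

The result is an offset within the data segment; the segment base is added separately to form the 20-bit physical address.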

24
Stack segment addressing modes
25
Stack segment addressing modes
  • Except for direct addressing, all the data
    segment modes carry over for the stack segment
  • However . . .
  • The BP pointer is used in place of BX
  • Neither SI nor DI may be used in indirect or
    displacement modes
  • The names of the stack segment addressing modes
    are
  • Base pointer indirect
  • Base pointer displacement
  • Base pointer with index
  • Base pointer index displacement

26
Immediate addressing
  • With immediate addressing, a source operand is a
    constant byte or word
  • Example
  • MOV AX, 23
  • The AX register is loaded with a decimal 23

27
Implied addressing
  • The operand is implicit in the instruction itself
  • Example PUSH AX
  • This decrements SP by 2 and copies AX to the
    location pointed to by SP
  • Example CLC
  • Clears the carry flag

28
The 8088 instruction set
  • There are various groups of instructions
  • Data transfer
  • Arithmetic
  • Logical
  • Shift and rotate
  • Test and bit flag
  • Looping
  • Repetitive string operations
  • Jump and call

29
Notation
  • Operand type
  • r - a register
  • e - an effective address in memory
  • # - immediate data
  • label
  • string
  • Direction, if relevant or
  • Status flags indicates the flag is
    affected
  • MOV(B) indicates both . . .
  • word version MOV
  • byte version MOVB

30
The 8088 instruction set
  • Move (actually copy), exchange, and stack
    instructions
  • Arithmetic (addition / subtraction,
    multiplication / division)

31
The 8088 instruction set
  • More arithmetic
  • Logical
  • Shift and rotate

32
The 8088 instruction set
  • Test and bit flag
  • Looping (destination label must be within 128
    bytes of PC)
  • Repetitive string operations REPx (used with next
    group)
  • Jump and call

33
Jump, call, and return instructions
  • CALL and unconditional JMP may be near or far
  • A near jump is within the current code segment
  • A far jump . . .
  • is anywhere within the 20-bit address space
  • A new value for CS must be supplied
  • Conditional jumps
  • Must be within 128 bytes of the PC
  • Otherwise, for example, replace
        JZ farlabel
  • by
        JNZ ahead
        JMP farlabel
    ahead: - - -
  • This is done automatically by the assembler

34
Conditional jumps
  • Usually depend on the values in status flags
  • Status flags are set by a prior TEST or CMP
    instruction
  • For signed operations . . .
  • Use greater than or less than
  • For unsigned operations . . .
  • Use above or below
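
Why two families of conditional jumps are needed: the same 16-bit pattern compares differently depending on whether it is read as signed or unsigned. A short Python sketch (the helper to_signed16 is made up for this example):

```python
def to_signed16(x):
    """Interpret a 16-bit pattern as a two's-complement signed value."""
    return x - 0x10000 if x & 0x8000 else x


a, b = 0xFFFF, 0x0001

is_above = a > b                               # unsigned view: 65535 vs 1
is_greater = to_signed16(a) > to_signed16(b)   # signed view: -1 vs 1

print(is_above, is_greater)
```

An "above" jump such as JA would be taken for these operands, while a "greater than" jump such as JG would not.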

35
Conditional jumps
36
Subroutine call and return instructions
  • Parameters (arguments) are pushed onto the stack
    in reverse order prior to the call
  • The subroutine call instruction CALL . . .
  • Pushes the PC onto the stack
  • This is the return address
  • Loads the PC with the label or effective address
  • The return instruction RET . . .
  • Pops the return address from the stack and stores
    it in the PC
  • Execution thus continues at the instruction
    immediately after the CALL instruction

37
Subroutine calls
  • RET with an immediate operand
  • Adds that many bytes to the SP, after popping the
    return address, to eliminate arguments from the
    stack
  • To access arguments, the subroutine should
  • Push BP onto the stack
  • Copy SP into BP
  • Now . . .
  • Return address is at BP + 2
  • Argument 2 is at BP + 6
  • Local variable 2 is at BP - 4
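
The frame offsets can be verified with a small Python model of a stack that grows toward smaller addresses in 2-byte steps. The Stack class, the starting SP value, and the string placeholders are all assumptions made for this sketch.

```python
class Stack:
    """Word stack growing toward smaller addresses, as on the 8088."""

    def __init__(self, sp=0x1000):       # starting SP is arbitrary
        self.mem = {}
        self.sp = sp

    def push(self, value):
        self.sp -= 2                      # PUSH decrements SP by 2 ...
        self.mem[self.sp] = value         # ... then stores at the new top

    def pop(self):
        value = self.mem[self.sp]
        self.sp += 2                      # POP increments SP by 2
        return value


s = Stack()
s.push("arg2")                # caller pushes arguments in reverse order
s.push("arg1")
s.push("return address")      # pushed by CALL
s.push("old BP")              # subroutine: PUSH BP
bp = s.sp                     # subroutine: copy SP into BP
s.push("local1")
s.push("local2")

frame = {off: s.mem[bp + off] for off in (6, 4, 2, 0, -2, -4)}
print(frame)
```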

38
Subroutine calls
  • To clean up local variables and temporary results
    on the stack before returning . . .
  • Copy BP into SP
  • Pop the old BP
  • Good subroutine practice
  • Caller should assume AX and DX will change
  • The caller should stack them prior to pushing
    arguments as needed
  • The subroutine should stack any registers it will
    change and restore them prior to returning

39
System calls and function calls
  • System calls invoke operating system services
  • To invoke . . .
  • Push the needed arguments in reverse order
  • Push the call number
  • Execute SYS

40
System calls
  • After a system call
  • Return values are left in AX or DX:AX
  • Arguments remain on the stack
  • Caller should adjust SP accordingly
  • It is good practice to define system call numbers
    as constants at the beginning of an assembler
    program
  • _OPEN and _CREAT have 2 arguments
  • The name argument is the effective address of the
    start of a string for the file name
  • The second argument is 0, 1, or 2
  • 0: open for reading
  • 1: open for writing
  • 2: open for both reading and writing
  • The return integer in AX is a file descriptor to
    be used for reading, writing, and closing the file

41
System calls
  • Some files are automatically opened
  • Standard input (descriptor 0)
  • Standard output (descriptor 1)
  • Standard error output (descriptor 2)
  • _READ and _WRITE have 3 arguments
  • File descriptor
  • The starting address of a buffer to hold data
  • Number of bytes to transfer
  • _CLOSE involves only the file descriptor

42
Function calls
  • _GETCHAR reads one character from standard input
    and puts it in AL (with AH set to zero)
  • _PUTCHAR writes a byte to standard output
  • _PRINTF outputs formatted information
  • The first argument is the address of a format
    string
  • %d converts an integer to a decimal string
  • %x and %o convert to hex and octal, respectively
  • There should be one argument on the stack for
    each value expected by the format string
  • %s indicates a null-terminated string with
    effective address on the stack
  • Format string "x = %d and y = %d\n" prints 2
    numbers followed by a line feed
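
The conversion characters follow the C printf conventions, which Python's % operator also implements. The following sketch only demonstrates the conversions themselves; it does not call the interpreter's _PRINTF routine, and the values are made up.

```python
# One value is consumed per conversion, in order.
line = "x = %d and y = %d\n" % (3, 4)
hex_oct = "%x %o" % (255, 8)
text = "%s" % "done"

print(line, end="")
print(hex_oct, text)
```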

43
Function calls
  • _SPRINTF is like _PRINTF except the formatted
    string is sent to a buffer in memory
  • _SSCANF is the reverse of _SPRINTF and . . .
  • Reads a string containing integers in decimal,
    octal, or hex from a buffer
  • Converts the values according to a format string
  • Places the converted values into memory locations
    indicated by additional arguments

44
The assembler
  • An assembler is a program that translates an
    assembly language program into machine language
  • The assembly language program is written with
    mnemonics such as ADD and AX together with labels
    and constant definitions to represent computer
    activity in symbolic form
  • The output of the assembler is an object file
  • The object file must be combined with the object
    files of any needed system subroutines
  • A linker program performs this task
  • The result is a single executable binary file
  • The executable binary file may be loaded into
    memory and executed

45
The assembler
  • An assembler typically makes two passes through a
    program in order to translate it
  • Pass 1 builds a symbol table
  • A symbol table associates the identifiers used
    for labels and constant definitions with numbers
  • Constants can be entered directly
  • Labels represent addresses and must be calculated
  • Label calculation
  • The assembler maintains an internal location
    counter that keeps track of the number of bytes
    allocated so far for data and instructions
  • When a label appears, it is given the current
    value of the location counter

46
The assembler
  • Pass 2 does code generation
  • The value of every symbol is known at the
    beginning of pass 2
  • Each instruction is read again and . . .
  • If an instruction refers to a label, the symbol
    table is consulted
  • The numerical equivalent is written into the
    object file
  • The assembler also initializes data in any data
    section
  • This results from pseudoinstructions such as . . .
  • message: .ASCII "Hello World"
  • table: .WORD 11, 19, 26
  • Note: constant definitions, labels, and
    documentation are not carried over into the
    object file
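
The two passes can be sketched in Python for a toy source format. Everything here is an assumption made for illustration: the instruction sizes in SIZES, the one-operand syntax, and the function names parse and assemble. It is not how as88 is implemented.

```python
SIZES = {"MOV": 3, "JMP": 3, "NOP": 1}   # hypothetical instruction sizes


def parse(line):
    """Split 'label: OP operand' into (label, op, operand); parts may be None."""
    label = None
    if ":" in line:
        label, line = line.split(":", 1)
        label = label.strip()
    parts = line.split()
    op = parts[0] if parts else None
    operand = parts[1] if len(parts) > 1 else None
    return label, op, operand


def assemble(lines):
    symbols, location = {}, 0
    for line in lines:                     # pass 1: build the symbol table
        if "=" in line:                    # constant definition
            name, value = line.split("=")
            symbols[name.strip()] = int(value)
            continue
        label, op, _ = parse(line)
        if label:
            symbols[label] = location      # label gets the location counter
        if op:
            location += SIZES[op]
    output = []
    for line in lines:                     # pass 2: generate "code"
        if "=" in line:
            continue                       # constants are not carried over
        _, op, operand = parse(line)
        if op:
            output.append((op, symbols.get(operand, operand)))
    return symbols, output


program = ["COUNT = 3", "start: MOV COUNT", "JMP done", "NOP", "done: NOP"]
symbols, code = assemble(program)
print(symbols)
print(code)
```

The forward reference JMP done shows why a second pass is needed: the address of done is not yet known when the jump is first read.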

47
The as88 assembler
  • To assemble prog.s, enter as88 prog at a command
    prompt
  • Program comments start with ! and continue until
    end-of-line
  • Program sections
  • .SECT .TEXT
  • For processor instructions placed in the code
    segment
  • .SECT .DATA
  • To reserve memory in the data segment and
    initialize it
  • .SECT .BSS
  • Block Started by Symbol section
  • To reserve memory in the data segment that is not
    initialized
  • A program may have many occurrences of each
    section
  • However, .TEXT must be first, .DATA second, and
    .BSS third
  • The linker arranges the sections in the code and
    data segments
  • Each section has its own location counter

48
The as88 assembler
  • Labels
  • Any instruction or data word may begin with a
    label
  • A label all by itself is associated with the next
    instruction or word
  • Global labels
  • Alphanumeric identifier followed by a colon, such
    as here:
  • These must all be unique and not keywords or
    mnemonics
  • Must appear at the start of each section
  • Local labels
  • Single digit followed by a colon, such as 5:
  • Instruction JMP 3f jumps forward to the closest
    3: label
  • Instruction JMP 2b jumps backward to the closest
    2: label

49
The as88 assembler
  • Constant symbols may be defined
  • TABLESIZE = 100
  • System defined values usually begin with an
    underscore
  • _WRITE = 4
  • Numerical values
  • Decimal
  • Default, as in 1234
  • Hex
  • Starts with 0x, as in 0x713
  • Constants and labels
  • Only the first 8 characters are significant
  • Arithmetic operations are allowed by the
    assembler, such as +, -, *, and /
  • For grouping, use square brackets [ and ] instead
    of parentheses

50
The as88 assembler
  • Pseudoinstructions
  • Directives to the assembler
  • .BYTE, .WORD, and .LONG expect comma-separated
    list of constant expressions

51
The as88 assembler
  • Pseudoinstructions
  • .ASCIZ and .ASCII
  • Represent the string supplied in double quotes
  • .ASCIZ appends an additional zero byte
  • Escape symbols are allowed in strings

52
The as88 assembler
  • Pseudoinstructions
  • .SPACE n increments the location counter by n
  • Used in the .BSS section to reserve memory area
  • .ALIGN 2 and .ALIGN 4 advance the location
    counter to the first address evenly divisible by
    2 or 4, respectively
  • Used before the .WORD and .LONG
    pseudoinstructions
  • .EXTRN identifier requests that the identifier be
    made available to the linker for external
    references
  • Used, for example, when identifier is the entry
    point of a subroutine that will be called from a
    separately assembled program

53
The as88 tracer
  • After assembly of program prog.s, enter . . .
  • s88 prog to run the program
  • t88 prog to trace or debug the program
  • Tracer windows are indicated below

54
The as88 tracer
  • Tracer commands

Each command must be followed by a carriage
return (the Enter key). An empty box indicates
that just a carriage return is needed. Commands
with no Address field listed above have no
address. The symbol # represents an integer
offset.
55
The as88 tracer
  • Tracer commands

56
Examples
  • To understand techniques for programming in
    assembly language study five of the examples
    found in the textbook
  • Hello World example, HlloWrld.s, pp. 736-739
  • General registers example, genReg.s, pp. 740-741
  • Vector product example, vecprod.s, pp. 741-744
  • Debugging the arrayprt.s program, pp. 744-747
  • Dispatch table example, jumptbl.s, pp. 750-752
  • The description of the code leads you through
  • Study and understand the program code

57
Hello World example, HlloWrld.s
(a) An assembly language program (b) The
corresponding tracer display
58
General registers example, genReg.s
  • (a) Part of a program.
  • (b) The tracer register window after line 7 has
    been executed.
  • (c) The tracer register window after 7 loop
    iterations (note the DX:AX pair)

59
Vector product example, vecprod.s
60
Vector product example (continued)
61
Vector product example (continued)
  • Execution of vecprod.s when it reaches line 28
    for the first time.

62
Debugging the arrayprt.s program
63
Dispatch table example, jumptbl.s
  • A program demonstrating a multiway branch using a
    dispatch table.

64
Dispatch table example (continued)