Title: What is Assembly Language?
Introduction
- Chapter 1
- What is Assembly Language?
- Data Representation
Table 1. Software Hierarchy Levels
What is Assembly Language?
- A low-level, processor-specific programming language designed to match the processor's machine instruction set
- Each assembly language instruction matches exactly one machine language instruction
- We study here Intel's 80x86 (and Pentium) processors
Why learn Assembly Language?
- To learn how high-level language code gets translated into machine language
  - i.e. to learn the details hidden in HLL code
- To learn the computer's hardware
  - by direct access to memory, video controller, sound card, keyboard
- To speed up applications
  - direct access to hardware (e.g. writing directly to I/O ports instead of doing a system call)
  - good ASM code is faster and smaller: rewrite the critical areas of code in ASM
Assembly Language Applications
- Application programs are rarely written completely in assembly language
  - only time-critical parts are written in ASM
  - Ex 1: an interface subroutine (called from HLL programs) is written in ASM for direct hardware access
  - Ex 2: device drivers (called from the OS)
- ASM is often used for embedded systems (programs stored in PROM chips)
  - computer cartridge games, microcontrollers (automobiles, industrial plants...), telecommunication equipment
- Very fast and compact, but processor-specific
Table 2. Comparison of Assembly Language and High-Level Languages
Machine Language
- An assembler is a program that converts ASM code into machine language code
  - mov al,5 (Assembly Language)
  - 1011000000000101 (Machine Language)
  - the most significant byte is the opcode for "move into register AL"
  - the least significant byte is the operand 5
- Directly programming in machine language offers no advantage (over Assembly)...
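The split between opcode byte and operand byte can be checked with a few lines of Python. This is only an illustrative sketch: the helper name split_bytes is made up here, and the bit string is the one shown on the slide.

```python
# The 16-bit machine code shown on the slide for "mov al,5".
machine_code = "1011000000000101"

def split_bytes(bits):
    """Split a bit string into 8-bit groups, left to right (hypothetical helper)."""
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]

opcode_bits, operand_bits = split_bytes(machine_code)
opcode = int(opcode_bits, 2)    # the byte that selects "move into AL"
operand = int(operand_bits, 2)  # the value to be moved

print(f"opcode  = {opcode_bits}b = {opcode:02X}h")
print(f"operand = {operand_bits}b = {operand}")
```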
Binary Numbers/Storage Size
- Binary numbers are used to store both code and data
- On Intel's x86:
  - byte = 8 bits (smallest addressable unit)
  - word = 2 bytes
  - doubleword = 2 words
  - quadword = 2 doublewords
Data Representation
- Even if we know that a block of memory contains data, to obtain its value we need to choose an interpretation
- Ex: memory content 0100 0001 can represent either
  - the number 2^6 + 1 = 65
  - or the ASCII code of the character A
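As a small illustration of the two interpretations, the same bit pattern can be read in Python either as an integer or as a character code. The variable names are chosen here for the example only.

```python
content = 0b01000001  # the memory content 0100 0001 from the slide

as_number = content          # numeric interpretation
as_character = chr(content)  # ASCII interpretation

print(as_number)     # 65
print(as_character)  # A
```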
Data Representation
- Number Systems
- Binary/Octal/Decimal/Hexadecimal
- Converting between various number systems
- Signed/Unsigned Interpretation
- Two's Complement
- Addition/Subtraction
- Character Storage
Number Systems
- A written number is meaningful only with respect to a base
- To tell the assembler which base we use:
  - Hexadecimal 25 is written as 25h
  - Octal 25 is written as 25o or 25q
  - Binary 1010 is written as 1010b
  - Decimal 1010 is written as 1010 or 1010d
- You are expected to know how to convert from one base to another (see appendix A)
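The suffix rules above can be mimicked in a short Python sketch. The function parse_asm_literal and the table SUFFIX_BASE are hypothetical names invented for this example; they are not part of any assembler, and the sketch handles only the suffixes listed on the slide.

```python
# Hypothetical table: radix suffix from the slide -> numeric base.
SUFFIX_BASE = {"h": 16, "o": 8, "q": 8, "b": 2, "d": 10}

def parse_asm_literal(text):
    """Return the value of a literal written with an optional radix suffix."""
    text = text.strip().lower()
    base = SUFFIX_BASE.get(text[-1])
    if base is None:
        return int(text, 10)      # no suffix: decimal
    return int(text[:-1], base)   # drop the suffix, parse in its base

for literal in ("25h", "25o", "25q", "1010b", "1010", "1010d"):
    print(literal, "->", parse_asm_literal(literal))
```

Looking only at the last character works here because a hexadecimal literal always ends in h, so a trailing b or d is never mistaken for a hex digit.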
Binary Numbers
- Digits are 1 and 0
  - 1 = true
  - 0 = false
- MSB = most significant bit
- LSB = least significant bit
- Bit numbering
Converting between various number systems
- Converting Binary to Decimal
- Converting Decimal to Binary
- Converting Binary to Hexadecimal
- Converting Hexadecimal to Decimal
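The four conversions listed above can be written out by hand in Python. This is a minimal sketch of the usual textbook procedures, not necessarily the ones given in appendix A; all function names are chosen here for illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_decimal(bits):
    """Scan left to right: double the running value and add each bit."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value

def decimal_to_binary(value):
    """Divide by 2 repeatedly; the remainders are the bits, LSB first."""
    if value == 0:
        return "0"
    bits = ""
    while value > 0:
        bits = str(value % 2) + bits
        value //= 2
    return bits

def binary_to_hex(bits):
    """Pad to a multiple of 4 bits, then map each 4-bit group to one hex digit."""
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[binary_to_decimal(g)] for g in groups)

def hex_to_decimal(digits):
    """Scan left to right: multiply the running value by 16 and add each digit."""
    value = 0
    for digit in digits.upper():
        value = value * 16 + HEX_DIGITS.index(digit)
    return value
```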
Signed and Unsigned Interpretation
- When a memory block contains a number, to obtain its value we must choose either
  - the signed interpretation: in that case the most significant bit (msb) represents the sign
    - Positive number (or zero) if msb = 0
    - Negative number if msb = 1
  - the unsigned interpretation: in that case all the bits are used to represent a magnitude (i.e. a positive number, or zero)
Signed Integers
- The highest bit indicates the sign: 1 = negative, 0 = positive
- If the highest digit of a hexadecimal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D
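The hexadecimal rule is just the sign-bit test in disguise, which can be checked with a short Python sketch. It assumes two-digit (one-byte) hexadecimal strings; both function names are invented for this example.

```python
def is_negative_by_hex_digit(hex_byte):
    """Apply the slide's rule: highest hex digit > 7 means negative."""
    return int(hex_byte[0], 16) > 7

def is_negative_by_msb(hex_byte):
    """Test the most significant bit of the 8-bit value directly."""
    return (int(hex_byte, 16) & 0x80) != 0

examples = ["8A", "C5", "A2", "9D"]
for h in examples:
    print(h, is_negative_by_hex_digit(h), is_negative_by_msb(h))
```

The two tests agree because a hex digit greater than 7 is exactly a 4-bit group whose leading bit is 1.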
Two's Complement Notation
- Used to represent negative numbers
- The two's complement of a positive number X, denoted by NEG(X), is obtained by complementing all its bits and adding 1
  - NEG(X) = NOT(X) + 1
- Ex: NEG(10) = NOT(10) + 1
  - = NOT(0000 1010b) + 1
  - = (1111 0101b) + 1 = 1111 0110b = -10
- It follows that X + NEG(X) = 0
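The NEG(10) example can be reproduced in Python. Python integers are not fixed-width, so this sketch assumes 8-bit values and masks every result to one byte; the names not8 and neg8 are chosen here to mirror NOT and NEG.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 1111 1111b, keeps results within one byte

def not8(x):
    """Complement all 8 bits of x."""
    return ~x & MASK

def neg8(x):
    """Two's complement of x: complement all bits, then add 1."""
    return (not8(x) + 1) & MASK

print(f"{not8(10):08b}")  # 11110101
print(f"{neg8(10):08b}")  # 11110110
```

Adding x and neg8(x) and masking to 8 bits gives 0, which is the identity stated in the last bullet.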
Forming the Two's Complement
- Negative numbers are stored in two's complement notation
- Represents the additive inverse
Note that 00000001 + 11111111 = 00000000 (the carry out of the highest bit is discarded)
Binary Subtraction
- To perform the difference X - Y, the machine executes the addition X + NEG(Y)
- Ex: 12 - 3 = 9 on 8 bits (the carry out of the highest bit is discarded)

    0 0 0 0 1 1 0 0          0 0 0 0 1 1 0 0
  - 0 0 0 0 0 0 1 1   ->   + 1 1 1 1 1 1 0 1
                           -----------------
                             0 0 0 0 1 0 0 1

Practice: Subtract 0101 from 1001.
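Subtraction by adding the two's complement can be sketched in Python as follows. The bit width is an explicit parameter because Python integers are unbounded; the function names are illustrative only.

```python
def neg(y, bits=8):
    """Two's complement of y on the given number of bits."""
    mask = (1 << bits) - 1
    return (~y + 1) & mask

def subtract(x, y, bits=8):
    """Compute x - y as x + NEG(y), discarding the carry out of the top bit."""
    mask = (1 << bits) - 1
    return (x + neg(y, bits)) & mask

print(f"{subtract(0b00001100, 0b00000011):08b}")  # 00001001
```

Calling subtract with bits=4 is one way to check an answer to the practice exercise.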
Maximum and Minimum Values
- The msb of a signed number is used for its sign
  - fewer bits are left for its magnitude
- Ex: for a signed byte
  - smallest positive (or zero) = 0000 0000b = 0
  - largest positive = 0111 1111b = 127
  - largest negative = 1111 1111b = -1
  - smallest negative = 1000 0000b = -128
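The four boundary patterns can be decoded with a short Python sketch, assuming 8-bit two's complement; signed_byte is a name made up for this example.

```python
def signed_byte(bits):
    """Interpret an 8-bit pattern as a two's complement signed integer."""
    value = int(bits.replace(" ", ""), 2)
    # Patterns with the msb set stand for value - 2^8.
    return value - 256 if value >= 128 else value

for pattern in ("0000 0000", "0111 1111", "1111 1111", "1000 0000"):
    print(pattern, "=", signed_byte(pattern))
```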
Ranges of Unsigned Integers
Standard sizes
What is the largest unsigned integer that may be stored in 20 bits?
Ranges of Signed Integers
The highest bit is reserved for the sign. This limits the range.
Practice: What is the largest positive value that may be stored in 20 bits?
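Both the unsigned and the signed range follow from the number of bits n: unsigned values run from 0 to 2^n - 1, and two's complement signed values from -2^(n-1) to 2^(n-1) - 1. A minimal Python sketch, with function names chosen for this example:

```python
def unsigned_range(bits):
    """Smallest and largest unsigned value on the given number of bits."""
    return 0, 2**bits - 1

def signed_range(bits):
    """Smallest and largest two's complement value on the given number of bits."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1

for name, bits in (("byte", 8), ("word", 16), ("doubleword", 32), ("quadword", 64)):
    print(name, unsigned_range(bits), signed_range(bits))
```

Passing 20 as the bit count is one way to check answers to the two 20-bit questions.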
Signed/Unsigned Interpretation (again)
- To obtain the value of a number we need to choose an interpretation
- Ex: memory content 1111 1111 can represent either
  - -1 if a signed interpretation is used
  - 255 if an unsigned interpretation is used
- Only the programmer can provide an interpretation of the content of memory
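Python's built-in int.from_bytes makes the programmer's choice explicit through its signed argument, which gives a compact illustration of this slide. The variable names are chosen for the example.

```python
raw = bytes([0b11111111])  # the memory content 1111 1111

# Same byte, two interpretations; byte order is irrelevant for a single byte.
as_unsigned = int.from_bytes(raw, byteorder="little", signed=False)
as_signed = int.from_bytes(raw, byteorder="little", signed=True)

print(as_unsigned, as_signed)
```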
Character Storage Systems
- Character sets
  - Standard ASCII (0 to 127)
  - Extended ASCII (0 to 255)
  - ANSI (0 to 255)
  - Unicode, 16-bit form (0 to 65,535)
- Null-terminated String
  - Array of characters followed by a null byte
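A null-terminated string can be modelled in Python with a bytes object. This is only a sketch of the storage layout; the helper names to_c_string and from_c_string are hypothetical.

```python
def to_c_string(text):
    """Encode text as ASCII bytes followed by a single null byte."""
    return text.encode("ascii") + b"\x00"

def from_c_string(buffer):
    """Read bytes up to (not including) the first null byte."""
    end = buffer.index(0)
    return buffer[:end].decode("ascii")

stored = to_c_string("ABC")
print(stored.hex())  # 41424300
```

Anything stored after the null byte is ignored when reading the string back, which is why the terminator is enough to mark its end.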
ASCII vs Extended ASCII
- The ASCII code (from 00h to 7Fh)
  - Only codes from 20h to 7Eh represent printable characters. The rest are control codes (used for printing, transmission).
- Extended ASCII character set (codes 80h to FFh)
  - Varies from one system to another
  - MS-DOS usage: accented characters, Greek symbols and some graphic characters
The ASCII character set
- CR = carriage return (MS-DOS: move to beginning of line)
- LF = line feed (MS-DOS: move directly one line below)
- SPC = blank space
Text Files
- These are files containing only ASCII characters
- But different conventions are used for indicating an end-of-line
  - MS-DOS: <CR><LF>
  - UNIX: <LF>
  - MAC (classic Mac OS): <CR>
- This is the origin of many problems encountered during transfers of text files from one system to another
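Converting between the three end-of-line conventions is a matter of byte replacement. The sketch below works on raw bytes and uses function names invented for this example; the order of the replacements matters, since the two-byte CR LF pair must be handled before a lone CR.

```python
CR, LF = b"\r", b"\n"

def to_unix(data):
    """Convert CR LF pairs and lone CR bytes to a single LF."""
    return data.replace(CR + LF, LF).replace(CR, LF)

def to_dos(data):
    """Convert any of the three conventions to CR LF."""
    return to_unix(data).replace(LF, CR + LF)

mixed = b"dos line\r\nmac line\runix line\n"
print(to_unix(mixed))
```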
Strings and numbers
- A string is stored as an array of characters
  - A 1-byte ASCII code is stored for each char
- Hence, we can store the number 123 either in numerical form or as the string "123"
  - The string form is best for display
  - The numerical form is best for computations
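The difference between the two forms shows up in the stored bytes. The Python sketch below prints both and converts the string form to the numerical form digit by digit; ascii_to_int is a hypothetical helper that assumes its input contains only the ASCII digits 0 to 9.

```python
as_string = "123".encode("ascii")  # three bytes: 31h 32h 33h
as_number = 123                    # fits in one byte: 7Bh

print(as_string.hex())      # 313233
print(f"{as_number:02x}")   # 7b

def ascii_to_int(digits):
    """Convert ASCII digit codes to a number, one digit at a time."""
    value = 0
    for code in digits:
        value = value * 10 + (code - ord("0"))
    return value
```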