Title: Association for Computing Machinery: Intro to Intel Assembly Language
1. Association for Computing Machinery: Intro to Intel Assembly Language
- By Michael Kornbluh
- kornbluh_at_acm.jhu.edu
- March 27, 2003
2. Layout of Talk
- Pros/Cons of Assembly
- Intel vs. MIPS
- The general-purpose registers
- NASM instruction syntax
- The Instructions Themselves
- Interfacing C and Assembly
- Comparisons
- Optimizing
- Demonstrations
- Where to find more information
3. Pros of Programming In Assembly
- Needed for things a high-level language cannot express, or that are processor-specific, e.g. disabling interrupts.
- You know exactly what the computer is doing.
- Learn ASM, learn your chip.
- SPEED!
- Cool tricks.
- Ultra-tight code for time-critical sections and slow processors.
4. Cons of Assembly
- It takes many lines of code to do quite a small amount of work (decreased productivity).
- It can allow the most horrible spaghetti code ever. Assembly code can get tangled in ways that would make GOTO blush.
- Annoying side effects of instructions (can make debugging horrible).
- Compilers are getting better every day, and they know the architecture.
- ASM is processor-specific.
- No error checking (e.g. type checking).
5. Intel vs. MIPS
- Intel assembly has higher-level instructions. E.g. pushing is one instruction.
- Is RISC (like MIPS) faster?
  - Each instruction completes faster
  - But it takes more instructions to do things
- Does it matter? Intel is more widely supported.
6. The Registers
Also, there are some segment registers, but you shouldn't touch those.
Newer processors have such great innovations as floating-point registers, etc. But we're only talking about the basics today.
7. Commonly-Used Flags
- Some parts of EFLAGS (the register that holds all flags):
  - C: true if the last math operation carried
  - Z: true if the last math operation gave a zero
  - O: true if the last math operation overflowed
  - S: true if the result of the last operation was negative
  - I: true if interrupts are enabled
- Flag instructions:
  - stc: set the carry flag to true
  - clc: clear the carry flag (set it to false)
  - Similarly sti, cli, etc.
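
To make the flag definitions concrete, here is a rough C model of what a 32-bit add does to C, Z, S and O. The `Flags` struct and the `add32` function are made-up names for illustration only; the real processor sets these bits in hardware, and this sketch covers addition only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of four EFLAGS bits (names invented for this sketch). */
typedef struct {
    bool carry;     /* C */
    bool zero;      /* Z */
    bool sign;      /* S */
    bool overflow;  /* O */
} Flags;

/* Add two 32-bit values the way a 32-bit add would, and fill in the flags. */
uint32_t add32(uint32_t a, uint32_t b, Flags *f)
{
    uint32_t r = a + b;              /* wraps around modulo 2^32 */
    f->carry = r < a;                /* the unsigned result did not fit */
    f->zero = (r == 0);
    f->sign = (r >> 31) != 0;        /* top bit set: negative if read as signed */
    /* Signed overflow: both inputs have the same sign, the result has the other. */
    f->overflow = ((~(a ^ b) & (a ^ r)) >> 31) != 0;
    return r;
}
```

For instance, adding 1 to 0xFFFFFFFF gives zero with C and Z set, while adding 1 to 0x7FFFFFFF sets S and O but not C.
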
8. NASM Instruction syntax (1)
- Just the name, e.g. nop
- Name, then argument, e.g. call 1337
- Destination, then source, e.g. mov eax, 5 means let eax = 5.
- Another example: mov esi, ebx means let esi = ebx.
- Destination AND arg1, then arg2.
  - e.g. add eax, edx means let eax = eax + edx. Thus, eax is an arg, and it is also where the result is stored. A lot of instructions do this.
9. NASM Instruction syntax (2)
- For registers or numbers, just type them. E.g. add ecx, 5
- For memory locations, put them in brackets. E.g. mov [72], eax means move the number in eax into the variable at address 72.
- You can even put registers in brackets: mov [eax], bh means let the variable pointed to by eax be loaded with the value in bh.
- You can't access memory twice in one instruction, so mov [8], [3] is illegal.
- Source and destination must be the same size.
- Advanced: e.g. mov [eax*8+ebx+78], ecx (but let's not worry about that yet)
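
If the bracket notation feels abstract, a small C model may help. This is only a sketch: the `mem` array stands in for memory, and `store32`, `load32` and `effective_address` are hypothetical helper names, not anything NASM provides. There is no bounds checking, so addresses are assumed to stay inside the array.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory for illustration; real memory is much bigger. */
static uint8_t mem[256];

/* Models mov [addr], eax : copy a 32-bit value into memory at addr. */
void store32(uint32_t addr, uint32_t value)
{
    memcpy(&mem[addr], &value, sizeof value);
}

/* Models mov eax, [addr] : read a 32-bit value back out of memory. */
uint32_t load32(uint32_t addr)
{
    uint32_t value;
    memcpy(&value, &mem[addr], sizeof value);
    return value;
}

/* Models the address arithmetic inside [index*scale+base+disp]. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t disp)
{
    return index * scale + base + disp;
}
```

With eax = 2 and ebx = 10, the operand [eax*8+ebx+78] would refer to address `effective_address(10, 2, 8, 78)`, which is 104.
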
10. NASM Instruction syntax (3)
- You must specify the size you're transferring if it's not obvious to the assembler.
- For example, mov eax, ebx is obviously moving 32 bits, because eax and ebx are 32 bits. However, mov [7], 3 is illegal, because the variable at address 7 could be a byte, or whatever.
- So: byte = 8 bits, word = 16 bits, dword = 32 bits, etc. (Ones bigger than dword are not used quite so often.)
- So, write mov word [7], 3 or mov dword [7], 3, depending on how many bits that variable is.
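
Why the size matters can be seen in a short C sketch: the same value 3 overwrites one, two or four bytes depending on the size keyword. The function name `store_sized` and the byte array are assumptions made up for this example; bytes are written least significant first, as x86 does.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the low `size` bytes of value to mem[addr], lowest byte first.
   size 1 models "byte", 2 models "word", 4 models "dword". */
void store_sized(uint8_t *mem, size_t addr, uint32_t value, size_t size)
{
    for (size_t i = 0; i < size; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}
```

Calling `store_sized(mem, 7, 3, 2)` models mov word [7], 3: it changes the bytes at addresses 7 and 8 and leaves address 9 alone, whereas the dword form would also clear addresses 9 and 10.
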
11. Instructions (general)
- mov: copies a value from the second arg to the first arg. E.g. mov eax, ebx copies the value in ebx into eax. (mov should really be called copy, since that's what it does. Oh well.)
- add: adds its two args together and stores the answer in the first one. add ebx, ecx means let ebx = ebx + ecx.
- sub: works just like add, but subtracts.
- cmp: like sub, but doesn't store the result anywhere. (We'll see why it's still useful later.)
- and: takes the bitwise AND of both args, and stores the answer in the first arg. So, and eax, edx means let eax = eax & edx. or and xor work the same way.
- mul and div are complicated, so we'll ignore them for now.
- push: pushes its argument onto the stack. E.g. push eax
- pop: pops a value off the stack into the argument. E.g. pop eax
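
For push and pop, a toy stack in C shows the bookkeeping. This is a sketch under assumptions: the `Stack` struct, its size, and the function names are invented here, the stack pointer is an array index rather than a byte address, and there is no overflow checking. Like the x86 stack, it grows toward lower positions.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical stack model: esp is the index of the top element. */
typedef struct {
    uint32_t slots[STACK_WORDS];
    int esp;
} Stack;

/* An empty stack has esp just past the last slot. */
void stack_init(Stack *s)
{
    s->esp = STACK_WORDS;
}

/* Models push: move esp down one slot, then store the value there. */
void push(Stack *s, uint32_t value)
{
    s->slots[--s->esp] = value;
}

/* Models pop: read the top value, then move esp back up one slot. */
uint32_t pop(Stack *s)
{
    return s->slots[s->esp++];
}
```

Values come back in the reverse of the order they went in, which is why a push eax / push ebx pair is undone by pop ebx / pop eax.
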
12. Instructions (control flow)
- call: calls a function. E.g. call 500 calls the function at memory address 500.
- ret: returns from a function. Works like return.
- jmp: like goto. jmp 100 goes to address 100.
- Conditional jumps: only jump if a condition is met. E.g. jz 100 jumps only if the last instruction produced a (z)ero result.
- Use cmp and conditional jumps to do ifs (as we'll see later).
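
Since jmp is like goto, a loop built from jumps can be mimicked in C with goto. The sketch below is an illustration, not generated assembly: `sum_down` is a hypothetical function, the variables named eax and ecx are ordinary C variables standing in for registers, and the labels stand in for jump targets.

```c
/* Adds n + (n-1) + ... + 1 using only goto, the way jz/jmp would. */
unsigned sum_down(unsigned n)
{
    unsigned eax = 0;            /* running total */
    unsigned ecx = n;            /* loop counter */

    if (ecx == 0) goto done;     /* cmp ecx, 0 then jz done: nothing to add */
again:
    eax += ecx;                  /* add eax, ecx */
    ecx -= 1;                    /* sub ecx, 1: Z is set when ecx reaches 0 */
    if (ecx == 0) goto done;     /* jz done */
    goto again;                  /* jmp again */
done:
    return eax;                  /* the result is left in "eax" */
}
```
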
13. Interfacing C and Assembly (Calling a Function)
- Push arguments onto the stack from last to first, and get rid of them later.
- printf(stringPointer, 5, 7) becomes:
    push dword 7
    push dword 5
    push dword stringPointer
    call printf
    pop eax
    pop eax
    pop eax
- The three pop eax instructions would probably be optimized to just adjust ESP directly (add esp, 12). Also, a register besides eax is fine.
- Always keep the stack even! (Every push should have a pop.)
- There are a whole bunch of ways to pass an argument in assembly; you're not restricted to how C does it.
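
The last-to-first order and the cleanup step can be traced with a toy model in C. Everything here is an assumption for illustration: the `Stack` struct, `arg`, and `call_first_minus_last` are invented names, the callee is faked inline, and the return address that call would push is left out.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical stack model: esp is the index of the top element. */
typedef struct {
    uint32_t slots[STACK_WORDS];
    int esp;
} Stack;

void stack_init(Stack *s) { s->esp = STACK_WORDS; }

void push(Stack *s, uint32_t value) { s->slots[--s->esp] = value; }

/* What a callee sees: argument n (0 = first) counted from the top of the stack. */
uint32_t arg(const Stack *s, int n) { return s->slots[s->esp + n]; }

/* Caller side of f(a, b, c): push last to first, "call", then clean up. */
uint32_t call_first_minus_last(Stack *s, uint32_t a, uint32_t b, uint32_t c)
{
    push(s, c);                               /* last argument first */
    push(s, b);
    push(s, a);                               /* first argument ends up on top */
    uint32_t result = arg(s, 0) - arg(s, 2);  /* stand-in for the callee's work */
    s->esp += 3;                              /* same effect as three pops */
    return result;
}
```

Because the first argument is pushed last, it sits on top of the stack, and after the cleanup esp is back where it started.
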
14. Interfacing C and Assembly (Returning a value in a Function)
- Put the return value in EAX before returning.
- Thus, return 42 becomes:
    mov eax, 42
    ret
15. Interfacing C and Assembly (linking ASM and C)
- The C side:
    #include <iostream>
    extern "C" int asmFunc();   //prototype; it's defined in the ASM file
    extern "C" int cFunc()      //extern "C" keeps the symbol name unmangled
    {
        return 84;
    }
    int main()
    {
        std::cout << asmFunc(); //call asmFunc, which returns 91 in EAX.
    }
- The ASM side:
    extern cFunc    ; means that cFunc is defined outside the ASM file
    global asmFunc  ; means that other files can use the symbol asmFunc
    asmFunc:
        call cFunc  ; calls cFunc (in cfile.c), which returns 84 in EAX
16. Interfacing C and Assembly (dropping ASM right into a C file)
- Much easier than dealing with linking ASM and C. This is how the Linux kernel uses ASM.
- But you have to use AT&T (not NASM) syntax.
- Just type asm() with the instructions, as a string, in the parentheses.
- e.g.
    //this function returns 42
    int giveAnswer()
    {
        asm("movl $42, %eax"); //same as mov eax, 42
    }
17. Comparison
- Use cmp and conditional jumps:
    cmp dword [someVariable], 3
    jne 200
- This will CoMPare someVariable to 3.
- jne will jump to 200 if they're not equal. (jne = Jump when Not Equal)
- This works with jg (greater than), jl (less than), je (equal), etc.
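
One cmp can feed several conditional jumps. The C sketch below imitates that pattern with goto; `compare_to_three` and the label names are hypothetical, and each `if (...) goto` line stands for the jump named in its comment.

```c
/* Returns -1, 0 or 1 when someVariable is less than, equal to,
   or greater than 3. */
int compare_to_three(int someVariable)
{
    int result;

    /* cmp someVariable, 3 happens once; the jumps below all test its outcome. */
    if (someVariable < 3) goto less;      /* jl less */
    if (someVariable > 3) goto greater;   /* jg greater */
    result = 0;                           /* fell through both jumps: equal */
    goto done;                            /* jmp done */
less:
    result = -1;
    goto done;                            /* jmp done */
greater:
    result = 1;
done:
    return result;
}
```
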
18. Optimizing
- Use registers as much as possible.
- Use as few jumps as possible, since they mess up the pipeline.
- For example, try to set up conditional jumps to fall through more often than they jump.
- Don't use archaic instructions like loop.
- Short instructions are good: less time is spent getting instructions from memory. (But this rule doesn't usually apply to archaic instructions, which should still be avoided.)
19. Demonstrations
20. Where to get more information
- http://webster.cs.ucr.edu/Page_asm/ArtOfAsm.html: where I learned assembly.
- http://nasm.sourceforge.net: to download the NASM assembler.
- http://developer.intel.com: for Intel's official information.