Title: Alphanumeric shellcode for ARM
2. Conventions for this presentation
- We will count bits from bit 0 onwards
- Least significant bit is on the right (bit 0)
- Most significant bit is on the left
- Example
- 0000 0001
- Bit 0 is set to 1, Bits 1-7 are set to 0
- For a word (32 bits, 4 bytes), bits are numbered from 31 (leftmost) down to 0 (rightmost)
3. Overview
- Shellcode
- Alphanumeric shellcode
- ARM architecture
- Alphanumeric shellcode for ARM
- Building our shellcode
- Conclusion
4. Shellcode
- When exploiting a vulnerability, attackers will often want to execute new code
- They must inject code into memory
- Code must be in binary form (machine code) so that the processor can execute it
- Such binary code used for exploitation is called shellcode
- \x6a\x68\x68\x2f\x62\x61\x73\x68\x2f\x62\x69
- \x67\x89\xe3\x31\xd2\x52\x53\x89\xe1\x31\xc0
- \xb0\x0b\xcd\x80
6. Alphanumeric shellcode
- Attackers sometimes need to bypass filters
- Encoding/decoding
- URL, UTF
- Surviving reg-exp filters
- e.g. [a-zA-Z0-9]
- Exists for x86
- x86 instructions are 1 to 17 bytes long
- Using 1- or 2-byte instructions, most interesting x86 things can be done
- ARM instructions are always 32 bits (4 bytes)
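A reg-exp filter of this kind can be sketched in a few lines of Python. The name `is_alphanumeric` and the sample inputs are illustrative assumptions, not part of the talk; the point is only that ordinary machine code is rejected while a purely alphanumeric byte string passes.

```python
import re

# Hypothetical filter: accept input only if every byte is in [a-zA-Z0-9].
ALNUM = re.compile(rb"[a-zA-Z0-9]*")


def is_alphanumeric(payload: bytes) -> bool:
    """Return True when payload consists solely of ASCII letters and digits."""
    return ALNUM.fullmatch(payload) is not None


# x86 bytes for `xor eax, eax` followed by `int 0x80`: rejected by the filter.
print(is_alphanumeric(b"\x31\xc0\xcd\x80"))
# An arbitrary string of letters and digits: accepted by the filter.
print(is_alphanumeric(b"PYj0X40"))
```

Alphanumeric shellcode is machine code for which a check like this returns True.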
7. Why did we do this?
- Needed to demonstrate the impact of a vulnerability on a mobile device
- Inserted a buffer overflow in a web browser
- Browser surfs to a page, buffer overflow triggers
- Executes shellcode
- Problem
- Browser would UTF-8-decode the shellcode
- This mangles the shellcode
- Alphanumeric shellcode would prevent this
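The mangling effect can be illustrated with Python's own UTF-8 decoder. This is a sketch under the assumption that the browser replaces invalid sequences with U+FFFD and re-encodes the text; the actual behaviour of the browser used in the talk is not described in the slides, and `roundtrip` is a hypothetical helper name.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode as UTF-8 (invalid bytes become U+FFFD), then encode again."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Last four bytes of the x86 example: mov al, 0x0b ; int 0x80
raw = b"\xb0\x0b\xcd\x80"
print(roundtrip(raw) == raw)            # binary shellcode does not survive

# Pure ASCII letters and digits are valid UTF-8 and pass through unchanged.
alnum = b"abc019XYZ"
print(roundtrip(alnum) == alnum)
```

The lone 0xB0 byte is not a valid UTF-8 start byte, so it is replaced and the payload changes length and content.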
8. Overview
- Shellcode
- Alphanumeric shellcode
- ARM architecture
- Registers
- Addressing modes
- Conditional execution
- Thumb mode
- Alphanumeric shellcode for ARM
- Building our shellcode
- Conclusion
9. ARM architecture
- Registers: 16 general purpose registers
- r0-r15
- As on Intel, some have special meanings
- r11 is also fp: frame pointer
- r13 is also sp: stack pointer
- r14 is also lr: link register (for return addresses)
- r15 is also pc: program counter
- Conventions
- r0-r3 are used for arguments and temporaries
- r4-r11, r13, r14 are permanent (should be restored on function exit)
- r12 is temporary
11. ARM architecture
- Many different addressing modes
- Data processing (ADD, ...)
- Load/store word or unsigned byte
- Miscellaneous loads and stores
- Load/store multiple
- Load/store coprocessor
13. ARM data processing addressing mode
- Most instructions look like
- <opcode>{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
- opcode is the instruction being called
- cond is for conditional execution (discussed later)
- S means write back to the status register (discussed later)
- Rd: destination register
- Rn: operand register, part of the operation
- Example
- ADD r0, r1, #20
- r0 = r1 + 20
14. ARM data processing addressing mode
- <opcode>{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
- shifter_operand: 12 bits large, 11 possible modes
- #<immediate>: 8-bit immediate, optionally rotated right by an even amount given by a 4-bit rotate field
- <Rm>: a register is used as operand
- <Rm>, LSL/LSR #<shift_imm>: register shifted left/right by shift_imm (a 5-bit field)
- <Rm>, LSL/LSR <Rs>: register shifted left/right by register Rs
- <Rm>, ASR #<shift_imm>/<Rs>: register shifted right (arithmetically) by shift_imm or Rs (preserves sign bit)
- <Rm>, ROR #<shift_imm>/<Rs>: register rotated right by shift_imm or Rs
- <Rm>, RRX: register rotated right by 1 bit, replacing the free bit with the carry flag and the carry flag with the rotated bit
15. ARM data processing addressing mode
- Examples
- ADD r0, r1, #20 -> r0 = r1 + 20
- ADD r0, r1, r2 -> r0 = r1 + r2
- ADD r0, r1, r2, LSL #3 -> r0 = r1 + (r2 << 3)
- ADD r0, r1, r2, LSR r3 -> r0 = r1 + (r2 >> r3)
16. ARM load/store word/byte addressing mode
- Loads and stores will generally look like
- LDR{<cond>}{B}{T} <Rd>, <addressing_mode>
- cond is for conditional execution (discussed later)
- If B is specified, then only 1 byte is loaded, otherwise 1 word
- If T is specified and the instruction is used in privileged mode, it will be executed as if in user mode
- Useful for determining whether user mode has access to memory
- Not important for us
- Rd is the register that the value will be loaded to or stored from
17. ARM load/store word/byte addressing mode
- LDR{<cond>}{B}{T} <Rd>, <addressing_mode>
- addressing_mode
- [<Rn>, #+/-<imm_12>]{!}: Rn contains the base address, optionally with a 12-bit immediate used as offset. If ! is specified, the result of the calculation is written back to Rn
- [<Rn>, +/-<Rm>]{!}: Rn base address, Rm offset
- [<Rn>, +/-<Rm>, <shift> #<shift_imm>]{!}: Rn base, Rm offset, shifted (LSL, LSR, ASR, ROR or RRX) by shift_imm
- The same 3 modes exist post-indexed
- [<Rn>], #+/-<imm_12>: use Rn for the address, add/subtract imm_12 to Rn after the load/store and store the new address in Rn
- [<Rn>], +/-<Rm> and [<Rn>], +/-<Rm>, <shift> #<shift_imm>
18. ARM load/store word/byte addressing mode
- Examples
- LDR r3, [r4, #-48] -> r3 = *(r4 - 48)
- LDRB r3, [r4, #-48] -> r3 = *(r4 - 48) AND 0x000000FF (if positive)
- LDR r3, [r4, #-48]! -> r4 = r4 - 48; r3 = *r4
- LDR r3, [r4, r5] -> r3 = *(r4 + r5)
- LDR r3, [r4], #-48 -> r3 = *r4; r4 = r4 - 48
19. ARM load/store multiple addressing mode
- Loads/stores multiple registers at one time
- LDM{<cond>}<addressing_mode> <Rn>{!}, <registers>
- <Rn>: base register containing the address of the location to load from
- !: write the calculated address back to Rn
- <registers>: 16 bits denoting whether each register should be loaded/stored or not (e.g. if r0 must be loaded, bit 0 will be set)
- <addressing_mode>: need a way of finding the location of the next register to load
- IA (increment after): Rn contains the address of the first register to load/store, then Rn = Rn + 4, next register loaded/stored, then Rn = Rn + 4, etc.
- IB (increment before): Rn = Rn + 4, then load/store
- DA (decrement after): load/store, then Rn = Rn - 4
- DB (decrement before): Rn = Rn - 4, then load/store
20. ARM load/store multiple addressing mode
- Examples
- LDMIA r5!, {r0, r1, r2, r6, r8, lr}
- r0 = *r5
- r1 = *(r5+4)
- r2 = *(r5+8)
- r6 = *(r5+12)
- r8 = *(r5+16)
- lr = *(r5+20)
- r5 = r5+24
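The increment-after behaviour can be emulated with a small Python model. This is only a sketch: `ldmia`, the dictionary-based register file and the memory contents are assumptions made for illustration, and the register list is expected in ascending register order, as the hardware would process it. With six registers loaded, the base register advances by 24 bytes.

```python
def ldmia(regs, mem, base, reg_list, writeback=True):
    """Emulate LDMIA base{!}, {reg_list}: load words from ascending addresses."""
    addr = regs[base]
    for name in reg_list:          # lowest register comes from the lowest address
        regs[name] = mem[addr]
        addr += 4
    if writeback:                  # the "!" suffix: store the final address in base
        regs[base] = addr
    return regs


# Hypothetical memory contents: six consecutive words starting at 0x1000.
mem = {0x1000 + 4 * i: 100 + i for i in range(6)}
regs = {"r5": 0x1000}
ldmia(regs, mem, "r5", ["r0", "r1", "r2", "r6", "r8", "lr"])
print(regs)
```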
22. Conditional execution
- ARM instructions can be executed conditionally
- Based on status flags
- Example
- if (err != 0)
- printf("Error\n")
- else
- printf("OK!\n")
- r1 contains err
- cmp r1, #0
- ldrne r0, ERROR
- blne printf
- ldreq r0, OK
- bleq printf
23. Conditional execution
- 16 different condition codes
- EQ/NE: Equal/Not equal
- CS/HS: Carry set/Unsigned higher or same
- CC/LO: Carry clear/Unsigned lower
- MI/PL: Negative/Positive or zero
- VS/VC: Overflow/No overflow
- HI/LS: Unsigned higher/Unsigned lower or same
- GE/LT: Signed greater than or equal/Signed less than
- GT/LE: Signed greater than/Signed less than or equal
- AL: Always (unconditional)
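How each condition code is decided from the N, Z, C and V status flags can be written out as a lookup table. The function name `condition_passes` is an illustrative assumption; the flag tests follow the standard ARM definitions of the condition codes listed above.

```python
def condition_passes(cond, n, z, c, v):
    """Return True if an instruction with this condition code would execute."""
    table = {
        "EQ": z,                     "NE": not z,
        "CS": c,                     "CC": not c,
        "MI": n,                     "PL": not n,
        "VS": v,                     "VC": not v,
        "HI": c and not z,           "LS": (not c) or z,
        "GE": n == v,                "LT": n != v,
        "GT": (not z) and n == v,    "LE": z or n != v,
        "AL": True,
    }
    return bool(table[cond])


# With the negative flag set, MI executes and PL does not.
print(condition_passes("MI", n=True, z=False, c=False, v=False))
print(condition_passes("PL", n=True, z=False, c=False, v=False))
```

Each pair in the table is complementary: for any flag state exactly one of the two conditions holds.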
25. Thumb mode
- Special mode that uses 16-bit instructions
- Active when the T bit is set in the status register
- Only present in T versions of ARM for older versions
- Mandatory since ARMv6
- Entered by setting the last bit of the address and then calling BX
- SUB r6, pc, #-1
- BX r6
- Exit by having the last bit set to 0 when calling BX
- BX pc
26. Overview
- Shellcode
- Alphanumeric shellcode
- ARM architecture
- Alphanumericness in ARM
- Properties of alphanumeric values
- ARM instructions
- Building our shellcode
- Conclusion
27. Properties of alphanumeric values
- What properties do alphanumeric values share?
- Alphanumeric
- a-z
- A-Z
- 0-9
- Let's take a look at the ASCII table
- 0 - 9: 0x30 - 0x39
- A - Z: 0x41 - 0x5A
- a - z: 0x61 - 0x7A
- Let's look for patterns
28-31. ASCII values
[Figure: four slides showing the ASCII table; the images are not reproduced in this transcript]
32. Alphanumeric ASCII values
- In summary
- Bit 7 is always 0
- Bit 6 or bit 5 is always set (or both)
- If bit 6 is 0
- Bits 5 and 4 must be set
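These bit properties can be checked mechanically. The sketch below (the helper name `has_alnum_bit_pattern` is an assumption) tests the three rules for every alphanumeric byte. Note that the rules are necessary but not sufficient: a byte such as 0x3A (`:`) also satisfies them without being alphanumeric.

```python
import string

# All 62 alphanumeric ASCII byte values.
ALNUM = (string.digits + string.ascii_uppercase + string.ascii_lowercase).encode("ascii")


def has_alnum_bit_pattern(byte):
    """Check the three bit rules from the summary for a single byte value."""
    bit7 = (byte >> 7) & 1
    bit6 = (byte >> 6) & 1
    bit5 = (byte >> 5) & 1
    bit4 = (byte >> 4) & 1
    if bit7:                                   # bit 7 must be 0
        return False
    if not (bit6 or bit5):                     # bit 6 or bit 5 must be set
        return False
    if not bit6 and not (bit5 and bit4):       # if bit 6 is 0, bits 5 and 4 must be set
        return False
    return True


print(all(has_alnum_bit_pattern(b) for b in ALNUM))
```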
34. ARM Instructions
- 32 bit instructions
- 4 bytes to get alphanumeric
- Instruction layout based on type
- Data processing
- Load/store unsigned byte or word
- Load/store multiple
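35. ARM Instructions
- Data processing
- Base layout
- Layout with immediate
- Layout with registers
[Figure: bit-layout diagrams for the three data-processing layouts; only the field labels Shft and Rm survive in this transcript]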
36. ARM Instructions
- Load/store byte/word with immediate offset
- Load/store byte/word with register offset (shifted or not)
- Load/store multiple
[Figure: bit-layout diagrams for these instruction types; not recoverable from this transcript]
37. Alphanumeric addressing modes
- Immediate
- <Rm>
- <Rm>, LSL #<shift_imm>
- <Rm>, LSL <Rs>
[Figure: bit-layout diagrams for these modes; surviving field labels: Rotate, immediate, Rm, Shift_imm, Rs]
38. Alphanumeric addressing modes
- <Rm>, LSR #<shift_imm>
- <Rm>, LSR <Rs>
- Usable modes
- Immediate with imm 0x30-0x39, 0x41-0x5A, 0x61-0x7A
- <Rm>, LSR <Rs>, with Rm < r10
- <Rm>, ASR <Rs>
- <Rm>, ASR #<shift_imm>, with Rm != r0
- <Rm>, ROR #<shift_imm>
- <Rm>, ROR <Rs>, with Rm < r11
- <Rm>, RRX
39. Alphanumeric addressing modes
- Addressing modes are similar for load/store byte
- The same modes are usable where the modes are similar
- [<Rn>, #+/-<imm_12>]{!}
- [<Rn>, -<Rm>, <shift> #<shift_imm>]{!}
- <shift>
- ASR with Rm != r0
- ROR
- RRX
- Load/store multiple
- Increment addressing mode sets bit 23, so only decrement can be used
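40. Conditional execution
- The condition code occupies bits 31-28 of every instruction
- Can't use AL (always)
- If we use two complementary conditions, one of them always holds, so every instruction can still be given a usable condition prefix
- VS/VC (overflow/no overflow) is hard to use
- MI/PL (negative/positive) is easy to manipulate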
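Which condition codes can appear in an alphanumeric instruction follows from the top byte of the instruction word: the 4-bit condition is its high nibble. The sketch below enumerates the conditions for which at least one low nibble yields an alphanumeric byte. It looks at the top byte in isolation, so it is an upper bound; `usable_conditions` is an illustrative name, and the list order follows the standard ARM condition encoding (EQ = 0 up to AL = 14, with 15 reserved).

```python
import string

ALNUM = set((string.digits + string.ascii_letters).encode("ascii"))

# Index in this list = 4-bit condition field value (bits 31-28).
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]


def usable_conditions():
    """Conditions whose top byte can be alphanumeric for some value of bits 27-24."""
    usable = []
    for code, name in enumerate(CONDITIONS):
        if any(((code << 4) | low) in ALNUM for low in range(16)):
            usable.append(name)
    return usable


print(usable_conditions())
```

AL (1110) puts the top byte in the range 0xE0-0xEF, which is why it can never be used, while MI/PL and VS/VC are the only complementary pairs that both survive.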
41. Instructions
- Going through all 147 instructions of ARM
- Remove all instructions that are not alphanumeric
- Remove all instructions for specific architectures
- 18 instructions left: B/BL, CDP, EOR, LDC, LDM (1), LDM (2), LDR, LDRB, LDRBT, LDRT, MCR, MRC, RSB, STM (2), STRB, STRBT, SUB, SWI
- MRC/MCR: move to/from coprocessor, privileged on ARM<v6 -> discard
- CDP/LDC: command for coprocessor, not on all -> discard
- B/BL: branch, must branch at least 12MB -> discard
42. Instructions
- 13 instructions left, grouping them together gives 7
- EOR: exclusive OR
- LDM: load multiple registers from consecutive memory
- LDR: load value from memory to register
- STM: store multiple registers to consecutive memory
- STR: store value from register to memory
- SUB: subtract
- SWI: software interrupt (i.e. system call)
- Some restrictions on usage
44. Building our shellcode
- Different steps needed to build our shellcode
- When shellcode starts up we do not know anything
- No idea what the status flags are
- No idea what the values in registers are
- First step: get a known value in a register
- MOV is not alphanumeric
- Can't do SUB/EOR with two registers: not alphanumeric
- EOR/SUB with immediate is possible, but we don't know the values of the registers
- Can't arithmetically get a known value in a register
45. Getting a known value in a register
- We know PC points to our shellcode: it's executing
- We know what our shellcode is
- Load from memory using PC as base register
- Can't use PC directly in LDR (not alphanumeric)
- SUB with PC works: SUB r3, pc, #48
- 48 = 0x30, the smallest alphanumeric value that we can use
- Next: LDRB r3, [r3, #-48]
- Now we have loaded the byte at PC - 96 into r3
- In our case it will be 56
- We can now use r3 as base register: subtract 56 to load 0, subtract 57 to load -1, subtract 55 to load 1, etc.
46. Status flags
- We must execute conditionally: the AL condition is not alphanumeric
- Use MI/PL
- What is the status when our code starts up?
- Duplicate all instructions until we can set the status flags
- SUBMI r3, pc, #48 ; copy PC (minus 48) to r3
- SUBPL r3, pc, #48
- LDRPLB r3, [r3, #-48] ; load known value from mem
- LDRMIB r3, [r3, #-48] ; r3 now contains 56
- SUBMIS r5, r3, #57 ; subtract 57 from r3: r5 = -1
- SUBPLS r5, r3, #57 ; negative flag is set after this
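The effect of duplicating every instruction under MI and PL can be checked with a tiny Python model of the six instructions. This is a sketch under stated assumptions: the `pc` value is arbitrary, the memory load is modelled as always returning the known byte 56, and only the N flag is tracked. Whatever the initial N flag is, exactly one instruction of each pair runs and the sequence ends with r5 = -1 and N set.

```python
MASK = 0xFFFFFFFF
KNOWN_BYTE = 56      # value the shellcode is said to load from its own bytes


def run_prologue(n_flag, pc=0x8000):
    """Run the duplicated MI/PL sequence starting from an arbitrary N flag."""
    regs = {"r3": 0x12345678, "r5": 0x12345678}   # unknown garbage at start
    flags = {"N": n_flag}
    executed = []

    def sub_r3_pc():                 # SUB r3, pc, #48
        regs["r3"] = (pc - 48) & MASK

    def ldrb_r3():                   # LDRB r3, [r3, #-48]
        regs["r3"] = KNOWN_BYTE      # assumption: that address holds the known byte

    def subs_r5():                   # SUBS r5, r3, #57 (updates the N flag)
        regs["r5"] = (regs["r3"] - 57) & MASK
        flags["N"] = bool(regs["r5"] >> 31)

    program = [("MI", sub_r3_pc), ("PL", sub_r3_pc),
               ("PL", ldrb_r3), ("MI", ldrb_r3),
               ("MI", subs_r5), ("PL", subs_r5)]
    for cond, op in program:
        if (cond == "MI") == flags["N"]:   # MI runs when N is set, PL when clear
            op()
            executed.append(cond)
    return regs, flags, executed


for start in (False, True):
    print(start, run_prologue(start))
```

After this point the N flag is known to be set, so later instructions no longer need to be duplicated.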
47. Writing to r0 - r2
- All interesting shellcode must do system calls
- To do a system call, we must pass arguments to the system call
- Arguments are stored in r0 - r3
- Can't write to r0 - r2 directly with SUB, LDR or EOR (not alphanumeric)
- Use STM and LDM
- Put correct values in other registers
- Store registers in memory with STM
- Load registers from memory with LDM into r0, r1 and r2
48. Writing to r0 - r2
- Example
- SUBPL r7, SP, #48
- STMPLDB r7, {r0, r4, r5, r6, r8, lr}
- SUBPL r5, SP, #48
- LDMPLFA r5!, {r0, r1, r2, r6, r8, lr}
- Stores r4, r5 and r6 to the stack (the rest are dummies, to make the instruction alphanumeric)
- Decrement before (store) versus decrement after (load) misaligns the two register lists by one word: r4 will load to r0, r5 -> r1, r6 -> r2
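The one-word misalignment between a decrement-before store and a decrement-after load can be traced with a small Python model. The helpers `stmdb`, `ldmda` and `demo`, the stack pointer value and the register contents are all assumptions for illustration; LDMFA is modelled as LDMDA, its equivalent for loads. Unwritten stack memory is read as 0 here, standing in for unknown data.

```python
REG_ORDER = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
             "r8", "r9", "r10", "r11", "r12", "sp", "lr", "pc"]


def stmdb(regs, mem, base, reg_list):
    """STMDB base, {regs}: store below base, lowest register at lowest address."""
    ordered = sorted(reg_list, key=REG_ORDER.index)
    addr = regs[base] - 4 * len(ordered)
    for name in ordered:
        mem[addr] = regs[name]
        addr += 4


def ldmda(regs, mem, base, reg_list, writeback=False):
    """LDMDA base{!}, {regs}: highest register is loaded from base itself."""
    ordered = sorted(reg_list, key=REG_ORDER.index)
    start = regs[base] - 4 * len(ordered) + 4
    addr = start
    for name in ordered:
        regs[name] = mem.get(addr, 0)      # unknown stack contents modelled as 0
        addr += 4
    if writeback:
        regs[base] = start - 4             # base - 4 * number of registers


def demo():
    regs = {name: 0 for name in REG_ORDER}
    regs.update(sp=0x7FFF0000, r4=0x44, r5=0x55, r6=0x66)   # hypothetical values
    mem = {}
    regs["r7"] = regs["sp"] - 48                             # SUBPL r7, SP, #48
    stmdb(regs, mem, "r7", ["r0", "r4", "r5", "r6", "r8", "lr"])
    regs["r5"] = regs["sp"] - 48                             # SUBPL r5, SP, #48
    ldmda(regs, mem, "r5", ["r0", "r1", "r2", "r6", "r8", "lr"], writeback=True)
    return regs


result = demo()
print(hex(result["r0"]), hex(result["r1"]), hex(result["r2"]))
```

The store places r4 one word above r0's slot, and the load starts one word higher than the store did, so the old r4, r5 and r6 end up in r0, r1 and r2.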
49. Self-modification
- No branch instruction makes it hard to do anything useful
- Thumb mode: only need to make 2 bytes alphanumeric, and it has a branch instruction
- Problems
- BX is not alphanumeric, can't enter Thumb mode
- Thumb mode SWI instruction is not alphanumeric
- Need self-modification
- Can't just overwrite: the instruction cache will still execute the original code
50. Flush the cache?
- Possible using MCR, but privileged in ARM < v6
- Possible using SWI
- mov r0, #0
- mov r1, #-1
- mov r2, #0
- swi 0x9F0002
- -> flush(0, -1, 0)
- Flushes the instruction cache for the range 0-0xFFFFFFFF
- Problem: 0x9F0002 is not alphanumeric, can't just call it
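Why the SWI argument is a problem can be seen by encoding the instruction. In ARM mode SWI is the condition in bits 31-28, 1111 in bits 27-24 and a 24-bit immediate below that. The sketch assumes a little-endian target and uses the MI condition (0100) from the earlier slides; `encode_swi` and `non_alnum_bytes` are illustrative helper names.

```python
import string
import struct

ALNUM = set((string.digits + string.ascii_letters).encode("ascii"))


def encode_swi(cond, imm24):
    """Encode SWI as it would sit in little-endian memory."""
    word = (cond << 28) | (0xF << 24) | (imm24 & 0xFFFFFF)
    return struct.pack("<I", word)


def non_alnum_bytes(code):
    """Return the byte values in code that are not letters or digits."""
    return [b for b in code if b not in ALNUM]


MI = 0x4
flush = encode_swi(MI, 0x9F0002)
print(flush.hex(), [hex(b) for b in non_alnum_bytes(flush)])

# A SWI with an alphanumeric placeholder argument passes the filter and can be
# patched at run time.
placeholder = encode_swi(MI, 0x414141)
print(placeholder, non_alnum_bytes(placeholder))
```

Only the top byte (0x4F, the letter O) of the real cache-flush call is alphanumeric; the three argument bytes are not.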
51. Flush the cache
- The SWI handler reads the SWI instruction word as data, not as an instruction
- Reads via the data cache
- Our instructions are not in the data cache normally
- Do self-modification to overwrite the argument to SWI
- First overwrite the instructions we want to modify (BX for Thumb, SWI calls in Thumb mode)
- Then overwrite the argument to SWI
- Load values in r0, r1, r2 with LDM/STM
- Execute SWI: instruction cache flush
52. Shellcode
53. Thumb mode
- Many more instructions and registers available
- ADC: Add with Carry
- ADD: Add
- ASR: Arithmetic Shift Right
- BX: Branch and Exchange
- CMP: Compare
- LDR: Load Register
- MOV: Move
- MUL: Multiply
- NEG: Negate
- ORR: Logical Or
- STR: Store Register
- SUB: Subtract
- TST: Test
54. Thumb Shellcode
55. Shellcode
56. Shellcode
58. Conclusion
- Alphanumeric shellcode on ARM is possible
- Described in detail in
- Y. Younan, P. Philippaerts. Alphanumeric RISC ARM Shellcode. Phrack 66, June 2009.
- Alphanumeric ARM shellcode (even without Thumb mode) is even Turing complete
- Proof will be presented next week at ACM CCS in Chicago
- Y. Younan, P. Philippaerts, F. Piessens, S. Lachmund, T. Walter. Filter-resistant code injection on ARM. ACM Conference on Computer and Communications Security, Nov. 2009.
59. Backup
60. Turing completeness
- Arithmetic operations
- ADD can be simulated with SUB
- Load negative value into register using SUB
- Can't directly subtract two registers from each other (must perform rotation)
- Can rotate with a register containing 0
- Multiplication and division follow from repeated ADD or SUB
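The arithmetic argument can be sketched in Python with 32-bit wrap-around. This models only the arithmetic identities (a + b = a - (0 - b), and multiplication as repeated addition); it ignores the encoding constraints mentioned above, such as the rotation needed to subtract two registers. The function names are illustrative assumptions.

```python
MASK = 0xFFFFFFFF


def sub(a, b):
    """32-bit SUB: the only arithmetic primitive assumed here."""
    return (a - b) & MASK


def add_via_sub(a, b):
    """a + b computed as a - (0 - b)."""
    negated_b = sub(0, b)
    return sub(a, negated_b)


def mul_via_add(a, b):
    """a * b as b repeated additions of a (b is a small non-negative count)."""
    result = 0
    for _ in range(b):
        result = add_via_sub(result, a)
    return result


print(add_via_sub(5, 7), mul_via_add(6, 7))
```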
61. Turing completeness
- Bitwise operations
- Rotate and shift
- Right shift/right rotate is possible (using a SUB instruction)
- Left shift is possible by multiplying by 2
- Left rotate can be emulated by a right rotation over a larger area
- Exclusive OR
- Available as an instruction (EOR)
- NOT
- EOR with -1
62. Turing completeness
- AND
- For every bit
- Left shift by (31 minus location of current bit)
- Right shift so bit becomes least significant
- Multiply registers
- AND is performed over those bits
- Add result to register containing final result
- OR
- Follows from NOT and AND and application of De Morgan's law
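The bit-by-bit AND construction can be sketched as follows. One step is an assumption on our part: the slides say the per-bit product is added to the final result, and this sketch shifts the product back to its original bit position before adding. NOT is modelled as XOR with -1 (all ones), and OR follows from De Morgan's law. All function names are illustrative.

```python
MASK = 0xFFFFFFFF


def isolate_bit(x, i):
    """Left shift by (31 - i) to drop higher bits, then right shift to bit 0."""
    shifted_left = (x << (31 - i)) & MASK
    return shifted_left >> 31


def and_via_multiply(a, b):
    """AND built from shifts, multiplication of single bits, and addition."""
    result = 0
    for i in range(32):
        product = isolate_bit(a, i) * isolate_bit(b, i)   # 1 only if both bits set
        result += product << i    # assumption: move the bit back into position
    return result


def not32(x):
    """NOT as exclusive OR with -1 (0xFFFFFFFF)."""
    return x ^ MASK


def or_via_de_morgan(a, b):
    """a OR b = NOT(NOT a AND NOT b)."""
    return not32(and_via_multiply(not32(a), not32(b)))


print(hex(and_via_multiply(0xF0F0F0F0, 0xFF00FF00)))
print(hex(or_via_de_morgan(0xF0F0F0F0, 0xFF00FF00)))
```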
63. Turing completeness
- Memory access
- Possible using LDR and STR
- Control flow
- Unconditional branches
- Dummy instructions are placed here
- Need self-modifying code, must overwrite location of branch instruction with correct bytes for branch
- Conditional branches
- Rewrite condition as conditions that only test for PL and MI
- System calls
- Argument to SWI is not alphanumeric -> self-modification needed
64. Turing completeness
- Self-modification
- Problem: instruction cache prevents self-modification
- Must flush instruction cache before executing modified code
- Need system call (SWI) to perform instruction cache flush
- Argument to SWI is not alphanumeric
- However, argument to SWI is read by SWI as data
- So
- Rewrite code to what it should be
- Rewrite argument to SWI
- Execute SWI
- Correct code will be executed
65. Turing completeness
- Implemented an interpreter for a simple language: BrainFck
- Assumes array of memory available with a pointer pointing to it
- > increases pointer to point to next memory location
- < decreases pointer
- + increases value of memory location pointer is pointing to
- - decreases value
- . write memory location to screen
- , read from user and write to memory location
- [ loop if memory pointed to is not 0
- ] continue loop if memory pointed to is not 0
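The semantics of the eight commands can be made concrete with a short interpreter. This is a Python sketch of the language only, not the authors' alphanumeric ARM implementation; the tape size, 8-bit wrapping cells and reading 0 at end of input are assumptions, and `run_bf` is a hypothetical name.

```python
def run_bf(program, input_bytes=b"", tape_size=30000):
    """Interpret a BF program and return the bytes it writes."""
    tape = [0] * tape_size
    ptr = 0
    pc = 0
    out = bytearray()
    inp = iter(input_bytes)

    # Pre-compute the matching bracket for every [ and ].
    stack, jump = [], {}
    for i, ch in enumerate(program):
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i

    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(tape[ptr])
        elif ch == ",":
            tape[ptr] = next(inp, 0)
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]          # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]          # jump back to the loop start
        pc += 1
    return bytes(out)


# 8 * 8 + 1 = 65, the ASCII code of "A".
print(run_bf("++++++++[>++++++++<-]>+."))
```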
66. Turing completeness
- Why implement an interpreter for BF?
- BF is Turing complete
- This means that any program that can be written in another language that is Turing complete (i.e. all regular programming languages) can also be written in BF
- Because we implement an interpreter for a Turing complete language, meaning we can execute any program written in that language, we can prove that we are also Turing complete