Title: errata
2errata
- EuSecWest website references IOActive
- No longer IOActive employee
- Independent contractor with IOActive
- Research is my own work and does not relate to IOActive
- No one at IOActive is doing similar research
3What?!
- Interpreters serve as an abstraction layer
- Conceptually similar to the VMs used in managed languages (e.g. Java or .NET)
- Attacks against interpreted languages typically focus on traditional web-app attacks
- Poison NULL byte complications
- SQL injection
- Et cetera
- Traditionally thought of as being immune to problems that plague other languages, e.g. buffer overflows
- /* PSz 12 Nov 03
-  * Be proud that perl(1) may proclaim:
-  *   Setuid Perl scripts are safer than C programs
-  * Do not abandon (deprecate) suidperl. Do not advocate C wrappers.
-  */
4Reason for Rhyme
- Usage of web and managed applications is only going to increase
- Gap between how these are attacked
- Protections against application-based attacks are C-centric
- Stack cookies
- Higher layers of abstraction may have their own call stacks
- Heap cookies protect against heap memory corruption
- Many languages implement their own allocators that lack cookies
- The unlink() macro sanity-checks the forward/backward pointers
- Many languages implement their own allocators that lack sanity checks
- Linked lists are often implemented on top of a block of memory
- NX protects against execution
- Byte code is read/interpreted, not executed
- ASLR protects against return-to-libc, et cetera
- Still valid
5But, the future of insecurity?
- Hacking community is largely content with the world as it is
- World is changing
- Most OSs ship with some hardening nowadays
- GCC ships with SSP
- Visual Studio 2005 is pretty effective
- Interpreted/managed language use is on the rise
- We don't get to choose what the applications we break are written in
- Adapt or die
- Maybe not the future
- But I'm at least thinking about it
6Goals & Prior Art
- Goals
- Memory Corruption bug in interpreter
- Attack interpreted language metadata
- Return into interpreted language bytecode
- Stefan Esser
- Hardened PHP, Month of PHP bugs, et cetera
- Mark Dowd
- Leveraging the ActionScript Virtual Machine
7Damn the torpedoes
- In mid-April 2008 Google rolled out AppEngine
- "AppEngine enables you to build web applications on the same scalable systems that power Google applications"
- AKA "Here's a Python interpreter, you can't break us."
- A flagship example is the shell application
- Literally a web-based interface to the interpreter
- Interpreter runs in a restricted environment
- All file-based I/O is (supposed to be) disabled and a Google-specific datastore API is provided
- Subprocesses, threads, et cetera disabled
- No sockets
- Many modules disabled or modified
- Perfect target.
8Abba Zabba, you my only friend
- Having direct access to the interpreter allows a lot of flexibility
- Stopping address space leaks becomes incredibly problematic (sys._getframe()?)
- Attacker can manipulate interpreter state to match necessary conditions
- Situation is the same one that shared hosting providers have faced for years
- Except now it's Google, they're pushing this for enterprise use, and the attack surface has been acknowledged
9But interpreted languages don't have buffer overflows
- More common than expected
- CVE-2008-1679: Multiple integer overflows in imageop module
- CVE-2008-1887: Signedness issues in PyString_FromStringAndSize()
- CVE-2008-1721: Signedness issues cause buffer overflow in zlib module
- CVE-2008-XXXX: Integer overflow leads to buffer overflow in Unicode processing
- CVE-2008-XXXX: Integer overflow leads to buffer overflow in Buffer objects
- Et cetera
- Interpreter code still relatively virgin
- Many in Python due to extensive use of signed integers
10On 0-day
- Over next few slides several bugs are discussed
- Some are reported and patched
- Some are reported and unpatched
- Some are undisclosed and unpatched
- Not all bugs are equal
- Most occur in unusual circumstances
- Most require direct interpreter access
- Others are typically unexploitable (e.g. memcpy() of 4G)
- Most of the undisclosed ones were found in a very short period of time
- Point is, they exist and they're not hard to find
11Ethics of 0-day
- Arguments would be easier to take seriously if contracts didn't have clauses like this
12When good APIs go bad
- Patched in CVS, broken in Python versions up to 2.5.2
- Also in PyBytes_FromStringAndSize() and PyUnicode_FromStringAndSize()

    PyObject *
    PyString_FromStringAndSize(const char *str, Py_ssize_t size)
    {
        register PyStringObject *op;
        assert(size >= 0);
        if (size == 0 && (op = nullstring) != NULL) {
        ...
        if (size == 1 && str != NULL &&
            (op = characters[*str & UCHAR_MAX]) != NULL)
        {
        ...
        /* Inline PyObject_NewVar */
        op = (PyStringObject *)PyObject_MALLOC(sizeof(PyStringObject) + size);
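The arithmetic behind this signedness issue can be sketched in Python by emulating C integer widths. This is only an illustration under stated assumptions: a 32-bit `size_t`, asserts compiled out, and `HEADER_SIZE` as a made-up stand-in for `sizeof(PyStringObject)`; the helper names are hypothetical and not part of CPython.

```python
# Emulates the C arithmetic in PyString_FromStringAndSize() on a 32-bit build.
# HEADER_SIZE is an assumed stand-in for sizeof(PyStringObject).
HEADER_SIZE = 20
SIZE_T_MASK = 0xFFFFFFFF  # 32-bit size_t


def malloc_request(size):
    """Bytes requested from the allocator for a signed Py_ssize_t `size`."""
    # sizeof(PyStringObject) + size, evaluated as an unsigned size_t.
    return (HEADER_SIZE + size) & SIZE_T_MASK


def copy_length(size):
    """Length seen by a later consumer once `size` is converted to size_t."""
    return size & SIZE_T_MASK


if __name__ == "__main__":
    for size in (16, -1, -16):
        print(size, malloc_request(size), hex(copy_length(size)))
```

With a negative `size` the allocation request shrinks below the header size while any later unsigned use of the same value becomes enormous, which is the mismatch the slide points at.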
13Where the wild things roam..
- Currently reported but unpatched
- Like the previous example, causes faults in numerous places, including core data types

    #define PyMem_New(type, n) \
      ( assert((n) <= PY_SIZE_MAX / sizeof(type)) , \
        ( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
    #define PyMem_NEW(type, n) \
      ( assert((n) <= PY_SIZE_MAX / sizeof(type)) , \
        ( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )

    #define PyMem_Resize(p, type, n) \
      ( assert((n) <= PY_SIZE_MAX / sizeof(type)) , \
        ( (p) = (type *) PyMem_Realloc((p), (n) * sizeof(type)) ) )
    #define PyMem_RESIZE(p, type, n) \
      ( assert((n) <= PY_SIZE_MAX / sizeof(type)) , \
        ( (p) = (type *) PyMem_REALLOC((p), (n) * sizeof(type)) ) )
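The only guard in these macros is an assert(), which disappears when the interpreter is built with NDEBUG. A rough Python emulation of what is left, assuming a 32-bit `size_t` and a 4-byte `Py_UNICODE` (both assumptions, as are the helper names):

```python
# Emulates (n) * sizeof(type) as computed by PyMem_NEW on a 32-bit build.
SIZE_MAX_32 = 0xFFFFFFFF
SIZEOF_PY_UNICODE = 4  # assumed UCS-4 build


def alloc_bytes(n, elem_size=SIZEOF_PY_UNICODE):
    """Bytes actually requested: the product wraps modulo 2**32."""
    return (n * elem_size) & SIZE_MAX_32


def assert_would_fire(n, elem_size=SIZEOF_PY_UNICODE):
    """True when the debug-only assert() would have caught the request."""
    return n > SIZE_MAX_32 // elem_size


if __name__ == "__main__":
    for n in (10, 0x40000001):
        print(hex(n), alloc_bytes(n), assert_would_fire(n))
```

A request for 0x40000001 elements wraps to a 4-byte allocation; only a debug build would notice.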
140xbadc0ded
- Reported, but currently unpatched

    static int
    unicode_resize(register PyUnicodeObject *unicode, Py_ssize_t length)
    {
        ...
        oldstr = unicode->str;
        PyMem_RESIZE(unicode->str, Py_UNICODE, length + 1);
        ...
        unicode->str[length] = 0;

    static
    PyUnicodeObject *_PyUnicode_New(Py_ssize_t length)
    {
        ...
        /* Unicode freelist & memory allocation */
        if (unicode_freelist) {
            ...
            if ((unicode->length < length) &&
                unicode_resize(unicode, length) < 0) {
            ...
        else {
            unicode->str = PyMem_NEW(Py_UNICODE, length + 1);
150xbadc0ded
- Reported and patched in CVS, versions up to 2.5.2 are vulnerable

    static PyObject *
    PyZlib_unflush(compobject *self, PyObject *args)
    {
        int err, length = DEFAULTALLOC;
        PyObject *retval = NULL;
        unsigned long start_total_out;

        if (!PyArg_ParseTuple(args, "|i:flush", &length))
            return NULL;
        if (!(retval = PyString_FromStringAndSize(NULL, length)))
            return NULL;
        ...
        start_total_out = self->zst.total_out;
        self->zst.avail_out = length;
        self->zst.next_out = (Byte *)PyString_AS_STRING(retval);
160xbadc0ded
- Currently undisclosed and unpatched

    static PyObject *
    array_fromunicode(arrayobject *self, PyObject *args)
    {
        Py_UNICODE *ustr;
        Py_ssize_t n;

        if (!PyArg_ParseTuple(args, "u#:fromunicode", &ustr, &n))
            return NULL;
        ...
        if (n > 0) {
            Py_UNICODE *item = (Py_UNICODE *) self->ob_item;
            if (self->ob_size > PY_SSIZE_T_MAX - n) {
                return PyErr_NoMemory();
            }
            PyMem_RESIZE(item, Py_UNICODE, self->ob_size + n);
            if (item == NULL) {
                PyErr_NoMemory();
                return NULL;
            }
            self->ob_item = (char *) item;
            self->ob_size += n;
            self->allocated = self->ob_size;
            memcpy(item + self->ob_size - n, ustr, n * sizeof(Py_UNICODE));
170xbadc0ded
- Currently undisclosed and unpatched

    static PyObject *
    encoder_encode_stateful(MultibyteStatefulEncoderContext *ctx,
                            PyObject *unistr, int final)
    {
        PyObject *ucvt, *r = NULL;
        Py_UNICODE *inbuf, *inbuf_end, *inbuf_tmp = NULL;
        Py_ssize_t datalen, origpending;
        ...
        datalen = PyUnicode_GET_SIZE(unistr);
        origpending = ctx->pendingsize;

        if (origpending > 0) {
            if (datalen > PY_SSIZE_T_MAX - ctx->pendingsize) {
                ...
            inbuf_tmp = PyMem_New(Py_UNICODE, datalen + ctx->pendingsize);
            ...
            memcpy(inbuf_tmp, ctx->pending,
                   Py_UNICODE_SIZE * ctx->pendingsize);
18ya dig?
- Currently undisclosed and unpatched

    static PyObject *
    posix_execv(PyObject *self, PyObject *args)
    {
        ...
        if (!PyArg_ParseTuple(args, "etO:execv",
                              Py_FileSystemDefaultEncoding,
                              &path, &argv))
            return NULL;
        if (PyList_Check(argv)) {
            argc = PyList_Size(argv);
        }
        else if (PyTuple_Check(argv)) {
            argc = PyTuple_Size(argv);
        }
        argvlist = PyMem_NEW(char *, argc+1);
        ...
        for (i = 0; i < argc; i++) {
            if (!PyArg_Parse((*getitem)(argv, i), "et",
                             Py_FileSystemDefaultEncoding,
                             &argvlist[i]))
            ...
19Goals review
- Goal 0: memory corruption bugs
- Bugs are just as prevalent as in other traditional applications
- Some of them are pretty silly
- Lots are still easy to spot and require very little in the way of deep thinking
- Some are exploitable, some require specific circumstances, others are just bugs
- Goal 1: attack interpreter-level metadata...
20Python Call stack
- Simple test program

    #!/usr/bin/python
    import time
    while 1:
        time.sleep(500)

    gdb> r

    Program received signal SIGINT, Interrupt.
    [Switching to Thread 0x2b9d5bdd4d00 (LWP 28588)]
    0x00002b9d5bb4f043 in select () from /lib/libc.so.6
    gdb> bt
    #0  0x00002b9d5bb4f043 in select () from /lib/libc.so.6
    #1  0x00002b9d5bdd869f in time_sleep
    #2  0x0000000000486097 in PyEval_EvalFrameEx
    #3  0x0000000000488002 in PyEval_EvalCodeEx
    #4  0x00000000004882a2 in PyEval_EvalCode
    #5  0x00000000004a969e in PyRun_FileExFlags
    #6  0x00000000004a9930 in PyRun_SimpleFileExFlags
21Bytecode flow overview
22Python Objects
- Most (interesting) object types start with a reference to PyObject_VAR_HEAD
- i.e.

    typedef struct _xyz {
        PyObject_VAR_HEAD
        ...

- The PyObject_VAR_HEAD macro expands to members that:
- Contain the object's reference count
- Contain pointers to the next/previous in-use object (doubly linked list)
- Contain a pointer to the object's type
- This point is way more important than it may at first seem
23PyCodeObject
    /* Bytecode object */
    typedef struct {
        PyObject_HEAD
        int co_argcount;        /* #arguments, except *args */
        int co_nlocals;         /* #local variables */
        int co_stacksize;       /* #entries needed for evaluation stack */
        int co_flags;           /* CO_..., see below */
        PyObject *co_code;      /* instruction opcodes */
        PyObject *co_consts;    /* list (constants used) */
        PyObject *co_names;     /* list of strings (names used) */
        PyObject *co_varnames;  /* tuple of strings (local variable names) */
        PyObject *co_freevars;  /* tuple of strings (free variable names) */
        PyObject *co_cellvars;  /* tuple of strings (cell variable names) */

        PyObject *co_filename;  /* string (where it was loaded from) */
        PyObject *co_name;      /* string (name, for reference) */
        int co_firstlineno;     /* first source line number */
        PyObject *co_lnotab;    /* string (encoding addr<->lineno mapping) */
        void *co_zombieframe;   /* for optimization only (see frameobject.c) */
    } PyCodeObject;
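Most of these members are exposed to Python code as read-only attributes of a code object, so they can be inspected without a debugger. A small sketch, written for a current CPython 3 rather than the 2.5.2 discussed on the slides (the `add` function is just an example):

```python
def add(a, b):
    total = a + b
    return total


code = add.__code__  # the PyCodeObject behind the function

summary = {
    "co_argcount": code.co_argcount,    # number of positional arguments
    "co_nlocals": code.co_nlocals,      # number of local variables
    "co_stacksize": code.co_stacksize,  # evaluation stack entries needed
    "co_varnames": code.co_varnames,    # local variable names
    "co_name": code.co_name,            # name, for reference
    "co_code_len": len(code.co_code),   # raw bytecode, a bytes object
}

if __name__ == "__main__":
    for field, value in summary.items():
        print(field, value)
```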
24PyEval_EvalCodeEx()
- PyEval_EvalCode() is a simple wrapper around PyEval_EvalCodeEx()
- Uses default arguments for the last seven parameters, passing NULL or 0
- Takes a PyCodeObject as a parameter
- Creates a PyFrameObject
- Sets up local/global/et cetera variables
- Serves essentially to set up the environment
25PyFrameObject
    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;      /* previous frame, or NULL */
        PyCodeObject *f_code;       /* code segment */
        PyObject *f_builtins;       /* builtin symbol table (PyDictObject) */
        PyObject *f_globals;        /* global symbol table (PyDictObject) */
        PyObject *f_locals;         /* local symbol table (any mapping) */
        PyObject **f_valuestack;    /* points after the last local */
        /* Next free slot in f_valuestack.  Frame creation sets to f_valuestack.
           Frame evaluation usually NULLs it, but a frame that yields sets it
           to the current stack top. */
        PyObject **f_stacktop;
        PyObject *f_trace;          /* Trace function */
        ...
        PyThreadState *f_tstate;
        int f_lasti;                /* Last instruction if called */
        ...
        int f_iblock;               /* index in f_blockstack */
        PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
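Several of these members can be walked from Python through sys._getframe(). The following is a minimal sketch for a current CPython 3; `inner` and `outer` are example names only:

```python
import sys


def inner():
    frame = sys._getframe()  # the frame object executing inner()
    return (
        frame.f_code.co_name,         # f_code: code object being run
        frame.f_back.f_code.co_name,  # f_back: the caller's frame
        frame.f_globals is globals(), # f_globals: the module's global dict
    )


def outer():
    return inner()


if __name__ == "__main__":
    print(outer())
```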
26PyFrameObject destruction
- As frames go out of scope, frame_dealloc() is called to destroy them
- During destruction, only locals, exception and debugging information is cleared
- Frame can end up in the PyCodeObject's zombie frame, the free list, or just destroyed

    void
    frame_dealloc(PyFrameObject *f)
    {
        ...
        for (p = f->f_localsplus; p < valuestack; p++)
            Py_CLEAR(*p);
        ...
        Py_CLEAR(f->f_locals);
        Py_CLEAR(f->f_trace);
        Py_CLEAR(f->f_exc_type);
        Py_CLEAR(f->f_exc_value);
        Py_CLEAR(f->f_exc_traceback);

        co = f->f_code;
        if (co->co_zombieframe == NULL)
            co->co_zombieframe = f;
        else if (numfree < MAXFREELIST) {
            ++numfree;
            f->f_back = free_list;
            free_list = f;
        }
27Zombies attack!!1
    PyFrameObject *
    PyFrame_New(PyThreadState *tstate, PyCodeObject *code,
                PyObject *globals, PyObject *locals)
    {
        ...
        if (code->co_zombieframe != NULL) {
            f = code->co_zombieframe;
            code->co_zombieframe = NULL;
            assert(f->f_code == code);
        }
        else {
            ...
        f->f_stacktop = f->f_valuestack;
        f->f_builtins = builtins;
        Py_XINCREF(back);
        f->f_back = back;
        Py_INCREF(code);
        Py_INCREF(globals);
        f->f_globals = globals;
        ...
        return f;
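The zombie-frame life cycle can be modelled in a few lines of Python. This is a toy model, not CPython's implementation: the `Frame` and `Code` classes, the list-based free list and the `MAXFREELIST` value are assumptions made for illustration.

```python
MAXFREELIST = 200  # assumed cap on the free list
free_list = []


class Code:
    def __init__(self, name):
        self.name = name
        self.co_zombieframe = None  # cached frame, reused on the next call


class Frame:
    def __init__(self, code):
        self.f_code = code
        self.f_back = None
        self.f_locals = {}


def frame_dealloc(frame):
    """Clear only the locals, then park the frame instead of freeing it."""
    frame.f_locals = None
    code = frame.f_code
    if code.co_zombieframe is None:
        code.co_zombieframe = frame
    elif len(free_list) < MAXFREELIST:
        free_list.append(frame)


def frame_new(code, back=None):
    """Prefer the zombie frame; everything left in it is trusted as-is."""
    if code.co_zombieframe is not None:
        frame = code.co_zombieframe
        code.co_zombieframe = None
        assert frame.f_code is code
    else:
        frame = Frame(code)
    frame.f_back = back
    frame.f_locals = {}
    return frame
```

The point of the model is that the pointer cached in the code object is picked up again without revalidation, so whatever it refers to between two calls becomes the next frame.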
28Unleashing your zombie army..
- Attacking the zombie frame is not always necessary, and doing so may not make sense
- Many heap overflows give direct control of the byte stream
- Many others allow direct control of the argument stack, or both
- Plenty of instances where you don't hit either
- Zombie frame is useful for pointer-sized writes anywhere in memory
- On smaller overflows, it is fairly typical to corrupt members of an object
- Many objects' destructors use linked lists with unprotected unlinking functionality
29PyEval_EvalFrameEx()
- Implements state machine for processing bytecode

    #define INSTR_OFFSET()  ((int)(next_instr - first_instr))
    #define NEXTOP()        (*next_instr++)
    #define NEXTARG()       (next_instr += 2, (next_instr[-1]<<8) + next_instr[-2])

    PyObject *
    PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
    {
        ...
        first_instr = (unsigned char*) PyString_AS_STRING(co->co_code);
        ...
        next_instr = first_instr + f->f_lasti + 1;
        stack_pointer = f->f_stacktop;
        ...
        for (;;) {
            ...
            f->f_lasti = INSTR_OFFSET();
            ...
            opcode = NEXTOP();
            oparg = 0;   /* allows oparg to be stored in a register because
                            it doesn't have to be remembered across a full loop */
            if (HAS_ARG(opcode))
                oparg = NEXTARG();
            switch (opcode) {
30Important variables
- first_instr
- Taken directly from f->f_code->co_code
- Determines first instruction in PyCodeObject/bytecode to be executed
- Local (stack-based) variable in PyEval_EvalFrameEx()
- Points to heap data
- next_instr
- Derived from first_instr; starts out pointing to the same location
- Incremented by one to three bytes per opcode
- Dictates next instruction in bytecode to be interpreted
- Local (stack-based) variable in PyEval_EvalFrameEx()
- Points to heap data
- stack_pointer
- Derived from f->f_stacktop
- Determines next argument to given opcode
- Makes up data stack
- Local (stack-based) variable in PyEval_EvalFrameEx()
- Points to heap data
31Only handguns and tequila allow bigger mistakes faster

    #define JUMPTO(x)       (next_instr = first_instr + (x))

    while (why != WHY_NOT && f->f_iblock > 0) {
        PyTryBlock *b = PyFrame_BlockPop(f);

        assert(why != WHY_YIELD);
        if (b->b_type == SETUP_LOOP && why == WHY_CONTINUE) {
            PyFrame_BlockSetup(f, b->b_type, b->b_handler,
                               b->b_level);
            why = WHY_NOT;
            JUMPTO(PyInt_AS_LONG(retval));
            Py_DECREF(retval);
            break;
        }
        ...
        if (b->b_type == SETUP_LOOP && why == WHY_BREAK) {
            why = WHY_NOT;
            JUMPTO(b->b_handler);
            break;
        }
        if (b->b_type == SETUP_FINALLY ||
            (b->b_type == SETUP_EXCEPT &&
             why == WHY_EXCEPTION)) {
            ...
            why = WHY_NOT;
            JUMPTO(b->b_handler);
            break;
        }
    } /* unwind stack */
32Other interpreter targets..
- Python's debugging functionality allows for tracing of the application
- Whether the currently executing byte-code is being traced is determined by a member in the stack frame
- Tracing allows for calls before opcode execution, function entry, exception handling, et cetera
- Many objects have functionality that can be abused with small amounts of memory corruption
- Be creative: they haven't been beaten on like the various libcs, so they haven't hardened their implementations
33Goal Review
- Goal 1: Attack interpreter-level metadata
- In most cases overwriting a PyCodeObject or the stack_pointer is trivial
- In others, attacking the zombie frame allows for an interesting and humorous exercise
- Python's exception handling can also be abused
- Objects are unhardened
- This allows us to bypass much of the hardening functionality in existence, e.g. stack/heap cookies, unlink() hardening, et cetera
- Goal 2: Return into byte-code
34opcodes
- Python opcodes are a single char
- As of 2.5.2 there are 103 opcodes
- Opcodes take an optional 16-bit modifier
- Can be thought of as something like a sub-opcode
- Arguments/parameters are pointed to by stack_pointer
- Thus, parameters need to be placed on the stack first, then the opcode in question called
- i.e.

    >>> def test():
    ...     print 'PsychoAlphaDiscoBetaBioAquaDoLoop'
    ...
    >>> __import__('dis').dis(test)
      2           0 LOAD_CONST               1 ('PsychoAlphaDiscoBetaBioAquaDoLoop')
                  3 PRINT_ITEM
                  4 PRINT_NEWLINE
                  5 LOAD_CONST               0 (None)
                  8 RETURN_VALUE
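The same inspection works on a current CPython 3, although the opcode names, numbering and encoding differ from the 2.5.2 listing (print is a function call there, for instance). A hedged sketch using the standard dis module; the `listing` helper is a name introduced here:

```python
import dis


def test():
    print("PsychoAlphaDiscoBetaBioAquaDoLoop")


def listing(func):
    """Return (offset, opname, argval) for each instruction of func."""
    return [(ins.offset, ins.opname, ins.argval)
            for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    for offset, opname, argval in listing(test):
        print(offset, opname, argval)
```

The exact output depends on the interpreter version, but the string constant is always pushed onto the value stack before the opcode that consumes it.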
35Our House..
- Easiest method is to abuse the support for run-time functions (lambdas)
- Opcode is MAKE_FUNCTION

    case MAKE_FUNCTION:
        v = POP(); /* code object */
        x = PyFunction_New(v, f->f_globals);
        Py_DECREF(v);
        /* XXX Maybe this should be a separate opcode? */
        if (x != NULL && oparg > 0) {
            v = PyTuple_New(oparg);
            if (v == NULL) {
                Py_DECREF(x);
                x = NULL;
                break;
            }
            while (--oparg >= 0) {
                w = POP();
                PyTuple_SET_ITEM(v, oparg, w);
            }
            err = PyFunction_SetDefaults(x, v);
            Py_DECREF(v);
        }
        PUSH(x);
        break;
36In the middle of our street..
- Python natively generates code that has a STORE_FAST/LOAD_FAST
- Don't think they're necessary
- Pressed for time, so didn't investigate whether they were necessary or not

    #define GETLOCAL(i)     (fastlocals[i])
    #define SETLOCAL(i, value)  do { PyObject *tmp = GETLOCAL(i); \
                                     GETLOCAL(i) = value; \
                                     Py_XDECREF(tmp); } while (0)

    case LOAD_FAST:
        x = GETLOCAL(oparg);
        if (x != NULL) {
            Py_INCREF(x);
            PUSH(x);
            goto fast_next_opcode;
        }
        ...
        break;

    case STORE_FAST:
        v = POP();
        SETLOCAL(oparg, v);
        goto fast_next_opcode;
37Let there be light.
- Now that we've built the function and set up the argument stack, it's time to call it
- Accomplished via the CALL_FUNCTION opcode

    case CALL_FUNCTION:
    {
        PyObject **sp;
        PCALL(PCALL_ALL);
        sp = stack_pointer;
    #ifdef WITH_TSC
        x = call_function(&sp, oparg, &intr0, &intr1);
    #else
        x = call_function(&sp, oparg);
    #endif
        stack_pointer = sp;
        PUSH(x);
        if (x != NULL)
            continue;
        break;
    }
38Calling the call_function() function
    static PyObject *
    call_function(PyObject ***pp_stack, int oparg
    #ifdef WITH_TSC
                  , uint64* pintr0, uint64* pintr1
    #endif
                  )
    {
        int na = oparg & 0xff;
        int nk = (oparg>>8) & 0xff;
        int n = na + 2 * nk;
        PyObject **pfunc = (*pp_stack) - n - 1;
        PyObject *func = *pfunc;
        PyObject *x, *w;

        if (PyCFunction_Check(func) && nk == 0) {
            ...
            if (flags & METH_NOARGS && na == 0) {
                C_TRACE(x, (*meth)(self,NULL));
            }
            else if (flags & METH_O && na == 1) {
                PyObject *arg = EXT_POP(*pp_stack);
                C_TRACE(x, (*meth)(self,arg));
                Py_DECREF(arg);
            }
            ...
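The first lines of call_function() decode the 16-bit oparg and locate the callable on the value stack. The same arithmetic in Python, with the stack modelled as slot indices (the helper names are introduced here for illustration):

```python
def decode_call_oparg(oparg):
    """Split a CALL_FUNCTION oparg as call_function() does."""
    na = oparg & 0xFF          # number of positional arguments
    nk = (oparg >> 8) & 0xFF   # number of keyword arguments
    n = na + 2 * nk            # stack slots used: each keyword is name + value
    return na, nk, n


def function_slot(stack_top, oparg):
    """Index of the callable: pfunc = (*pp_stack) - n - 1."""
    _, _, n = decode_call_oparg(oparg)
    return stack_top - n - 1


if __name__ == "__main__":
    print(decode_call_oparg(0x0102), function_slot(10, 0x0102))
```

Since the slot is computed purely from oparg and the current stack pointer, controlling either one decides which pointer is treated as the function object.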
39calling the call_function() function
    static PyObject *
    call_function(PyObject ***pp_stack, int oparg
    #ifdef WITH_TSC
                  , uint64* pintr0, uint64* pintr1
    #endif
                  )
    {
        ...
        if (PyCFunction_Check(func) && nk == 0) {
            ...
        } else {
            if (PyMethod_Check(func) && PyMethod_GET_SELF(func) != NULL) {
                PyObject *self = PyMethod_GET_SELF(func);
                ...
            } else
                ...
            if (PyFunction_Check(func))
                x = fast_function(func, pp_stack, n, na, nk);
            ...
40A final note about call_function()
- Often after overflow the code returned into is call_function()
- call_function() cleans up the argument stack after calling the function

    while ((*pp_stack) > pfunc) {
        w = EXT_POP(*pp_stack);
        Py_DECREF(w);
        PCALL(PCALL_POP);
    }

- Py_DECREF() will almost certainly cause a destructor to get called
- If the data stack_pointer pointed to was corrupted, this will be the first place it's felt
- Unless you're ready for a ret-into-libc type attack, make sure that w points to valid memory that has a value greater than 1
41PyCodeObjects don't grow on trees you know!
- Where to get a PyCodeObject?
- Two options, dependent on context
- AppEngine, et al
- x = unicode(compile("print 'zdravstvoyte mir!'", "<string>", "exec"))
- x will contain a string along the lines of:
- <code object <module> at 0x4e058f1c37daf18, file "<string>", line 1>
- Now stack_pointer just needs to point at 0x4e058f1c37daf18
- Less controlled environments
- Use compile() to obtain code object
- References in header need to be updated, e.g. PyCodeObject->co_code
- Requires address space leak
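On a current CPython 3 the string form of a code object still embeds its address, so the AppEngine-style leak can be reproduced with repr(). A minimal sketch; `leaked_address` is a helper name introduced here, and the regular expression assumes CPython's usual "at 0x..." repr format:

```python
import re


def leaked_address(obj):
    """Extract the address CPython embeds in an object's repr, if any."""
    match = re.search(r" at (0x[0-9a-fA-F]+)", repr(obj))
    return int(match.group(1), 16) if match else None


code = compile("print('zdravstvoyte mir!')", "<string>", "exec")

if __name__ == "__main__":
    print(repr(code))
    print(hex(leaked_address(code)))
```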
42But..
- Returning into bytecode in AppEngine doesn't make much sense?
- Already have control of the interpreter
- Return into same restricted environment
- Not exactly true, but ret-into-libc or similar eventually becomes necessary
- Ret-into-libc requires address space info
- Non-AppEngine attacks require address space info
43Tell me about your mother..
- One reason for return into byte-code on AppEngine: the PRINT_EXPR opcode

    case PRINT_EXPR:
        v = POP();
        w = PySys_GetObject("displayhook");
        if (w == NULL) {
            PyErr_SetString(PyExc_RuntimeError,
                            "lost sys.displayhook");
            err = -1;
            x = NULL;
        }
        if (err == 0) {
            x = PyTuple_Pack(1, v);
            if (x == NULL)
                err = -1;
        }
        if (err == 0) {
            w = PyEval_CallObject(w, x);
            Py_XDECREF(w);
            if (w == NULL)
                err = -1;
        }
        Py_DECREF(v);
        Py_XDECREF(x);
        break;
44All we had to do was ask..
- Typical results of memory leak
- \x00\x00\x00\x00\x00\x00\x00\x0f\x00\x00\x80\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\xa0\x91\x81\x00\x00\x00\x00\x00\x00\x90\x91\x81\x00\x00\x00\xb0\x91\x81\x12\x1e\x01\x00\x00\x02\
- Leaking heap addresses, in this example 0x8191XXXX
- If stack_pointer is controlled, it can point anywhere in the address space
- Leak is really only bounded by how much byte-code you can get into the stream
- Objects used to verify typing information are statically allocated and use fixed offsets; thus once you know the low-order bytes, you can spot them easily
- Problematic because it prints to standard out, which can be redirected, but doing so post-overflow is a lot of trouble
- Good strategy is to take advantage of the fact that we're dealing with a web app that can crash an infinite number of times
45Loose APIs sink ships.
- Python is one of those overly helpful languages
- Pretty much all objects can be printed out
- When you print an object, the address of the object is leaked
- Object can also be converted to a string, where the same string that gets printed ends up in the string, e.g. unicode(compile()) gives you the address of a PyCodeObject
- Leak PyFrameObject addresses
- sys._getframe() returns frame object
- sys._current_frames() returns a dictionary with each thread's current frame
- Builtin function id()
- Each object has a unique id
- This is accomplished by using the address of the object for the id
- e.g. id(None) yields the address of the None object (think about that in the context of obtaining type object addresses)
- Builtin functions dir() and getattr()
- dir() allows you to enumerate the attributes of an object
- getattr() allows you to obtain an attribute's value
- Useful when function pointers cannot be avoided
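That id() really is the object's address on CPython can be checked by turning the number back into an object with ctypes. The sketch below targets a current CPython 3 and is CPython-specific; `object_at` and `public_attributes` are helper names introduced here.

```python
import ctypes
import sys


def object_at(address):
    """Reinterpret a CPython object address as the object itself."""
    return ctypes.cast(address, ctypes.py_object).value


def public_attributes(obj):
    """Enumerate attributes with dir() and read them with getattr()."""
    return {name: getattr(obj, name, None)
            for name in dir(obj) if not name.startswith("_")}


if __name__ == "__main__":
    sample = ["leak me"]
    print(hex(id(sample)), object_at(id(sample)))
    print(sorted(public_attributes(sys._getframe())))
```

Passing an arbitrary integer to object_at() will crash the interpreter, which is the same property that makes these addresses valuable to an attacker.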
46Goals in review
- Goal 2: return into interpreter byte-code
- Easier to accomplish than initially thought
- Process of executing byte-code is more problematic due to type checks
- Successful exploitation absolutely requires address space leaks
- Python provides us with nice opcodes to allow leaking
- Easiest method is to use the MAKE_FUNCTION/CALL_FUNCTION combination as a bootstrap mechanism
- Restricted interpreters require return-into-C code to break out of
- Returning into byte-code provides these advantages
- Non-executable memory is not an obstacle: byte-code is interpreted, not executed
- Because it's not executed, we can use it to dump address space info
4722
- Overall process
- Obtain address space information via memory corruption and executing the PRINT_EXPR opcode
- Obtain addresses that were valid at one time and fix up data with addresses
- Craft a PyCodeObject either in the address space or in the shellcode; update PyCodeObject header information if injecting into address space
- Execute the following opcodes: MAKE_FUNCTION/STORE_FAST/LOAD_FAST/CALL_FUNCTION
- Return into PyCodeObject
- From PyCodeObject return into executable memory
48Other available opcodes
- LOAD_CONST: loads constant onto argument stack using oparg index into consts member of PyCodeObject
- POP_TOP: removes member from top of argument stack, decreases reference
- ROT_TWO/ROT_THREE/ROT_FOUR: rotates position of argument stack, moving 2, 3 or 4 arguments around
- DUP_TOPX: duplicates either 2 or 3 pointers on argument stack
- STORE_SLICE+X: some code paths do not have type checks and instead create a new object, allows object creation
- PRINT_ITEM_TO: allows redirection of output, pops output data from variable stack, falls through to PRINT_ITEM which may redirect to stdout
- LOAD_LOCALS: places f->f_locals onto argument stack, great opcode to prefix PRINT_X opcodes
- YIELD_VALUE: takes return value from argument stack, sets f->f_stacktop to point to stack_pointer
- POP_BLOCK: obtains PyTryBlock from argument stack, decrements references for each record
- STORE_GLOBAL / LOAD_GLOBAL: same as with other STORE_X/LOAD_X opcodes, except operates on globals
49More opcodes
- JUMP_ABSOLUTE: like JUMP_FORWARD, except the target is an offset from first_instr rather than relative to the current position
- FOR_ITER: makes function pointer call from pointer retrieved from argument stack, good once address space layout is known
- EXTENDED_ARG: advances to next opcode, obtains another 16-bit oparg and combines it with the existing 16-bit oparg; combined with JUMP_ABSOLUTE allows byte-code to exist anywhere (on 32-bit machines)
- LOAD_CLOSURE: places pointer on argument stack from different section of heap memory
- BUILD_X: several opcodes, allows for creation of Tuples, Lists, Maps and Slices
- JUMP_FORWARD: advances position in bytecode by oparg bytes
- Other opcodes perform type checks or have callbacks
- Many of the ones with type checks can be coaxed into usage by first building an object via one of the BUILD_X opcodes
- Once address space is known, opcodes with callbacks are not dangerous
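How EXTENDED_ARG widens a jump target to 32 bits can be sketched as follows. The opcode numbers are assumed from CPython 2.x's opcode.h (EXTENDED_ARG = 143, JUMP_ABSOLUTE = 113), and the helper names are introduced here for illustration:

```python
import struct

EXTENDED_ARG = 143    # assumed CPython 2.x opcode numbers
JUMP_ABSOLUTE = 113


def combine_extended_arg(extended, oparg):
    """EXTENDED_ARG's own oparg becomes the high 16 bits of the next oparg."""
    return (extended << 16) | oparg


def encode_far_jump(target):
    """Emit EXTENDED_ARG hi16; JUMP_ABSOLUTE lo16 for a 32-bit offset."""
    high, low = (target >> 16) & 0xFFFF, target & 0xFFFF
    # Each instruction is one opcode byte plus a little-endian 16-bit oparg.
    return struct.pack("<BHBH", EXTENDED_ARG, high, JUMP_ABSOLUTE, low)


if __name__ == "__main__":
    print(hex(combine_extended_arg(0x0001, 0x2345)))
    print(encode_far_jump(0x00012345).hex())
```

With a 32-bit offset added to first_instr, the jump target can be placed anywhere in a 32-bit address space, which is the property the slide relies on.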
50python.fin()
- Bugs in Python exist and are easy to find
- Data structures and general metadata are easy to abuse
- Byte-code is position independent and thus easy to make
- Because of its PIC nature, the argument stack exists elsewhere
- Ownership can be transferred with one or the other, but made more difficult
- Hardest part of returning into byte-code is ASLR
- Python is really helpful there.
51What about PERL?
- PERL has bugs too
- Reading PERL is an exercise in patience
- Friend: "I still maintain that PERL was not written... It was found... On a crashed UFO"
- Yeah, it is that bad
- Be careful when looking into the abyss...
52Ugh, wtf?
- Just an example (from 5.8.8)

    int
    perl_parse(pTHXx_ XSINIT_t xsinit, int argc, char **argv, char **env)
    {
        ...
    #ifdef PERL_FLEXIBLE_EXCEPTIONS
        CALLPROTECT(aTHX_ pcur_env, &ret,
                    MEMBER_TO_FPTR(S_vparse_body),
                    env, xsinit);
    #else
        JMPENV_PUSH(ret);
    #endif
53Are you kidding me?
- If PERL_FLEXIBLE_EXCEPTIONS is defined

    #define CALLPROTECT CALL_FPTR(PL_protect)
    #define CALL_FPTR(fptr) (*fptr)
    #define PL_protect (aTHX->Tprotect)
    #define aTHX PERL_GET_THX
    #define aTHX_ aTHX,

    #ifdef USE_5005THREADS
    #  define PERL_GET_THX ((struct perl_thread *)PERL_GET_CONTEXT)
    #else
    #  ifdef MULTIPLICITY
    #    define PERL_GET_THX ((PerlInterpreter *)PERL_GET_CONTEXT)
    #  endif
    #endif
54Larry Wall is trying to kill me
- And some more

    #ifndef PERL_GET_CONTEXT
    #define PERL_GET_CONTEXT ((void *)NULL)
    #define PERL_GET_CONTEXT Perl_get_context()
    #define MEMBER_TO_FPTR(name) name

- So, depending on conditional compilation, it expands to

    Perl_get_context()->Tprotect((struct perl_thread *)Perl_get_context(),
                                 pcur_env,
                                 &ret,
                                 S_vparse_body,
                                 env,
                                 xsinit);
55You outsourced to the guy who wrote procmail didn't you?!
- If PERL_FLEXIBLE_EXCEPTIONS is not defined

    #define JMPENV_PUSH(v) JMPENV_PUSH_ENV(*(JMPENV*)pcur_env, v)
    #define JMPENV_PUSH_ENV(ce, v) \
        STMT_START { \
            if (!(ce).je_noset) { \
                DEBUG_1(Perl_deb(aTHX_ "Setting up jumplevel %p, was %p\n", \
                                 ce, PL_top_env)); \
                JMPENV_PUSH_INIT_ENV(ce, NULL); \
                EXCEPT_SET_ENV(ce, PerlProc_setjmp((ce).je_buf, SCOPE_SAVES_SIGNAL_MASK)); \
                (ce).je_noset = 1; \
            } \
            else \
                EXCEPT_SET_ENV(ce, 0); \
            JMPENV_POST_CATCH_ENV(ce); \
            (v) = EXCEPT_GET_ENV(ce); \
        } STMT_END
56sigh
- Does do { } while(0) not work somewhere??

    #if defined(__GNUC__) && !defined(PERL_GCC_BRACE_GROUPS_FORBIDDEN) && !defined(__cplusplus)
    #  define STMT_START (void)(
    #  define STMT_END   )
    #else
    #  if (VOIDFLAGS) && (defined(sun) || defined(__sun__)) && !defined(__GNUC__)
    #    define STMT_START if (1)
    #    define STMT_END   else (void)0
    #  else
    #    define STMT_START do
    #    define STMT_END   while (0)
    #  endif
    #endif
57
- We're just gonna skip expanding DEBUG_1() and guess that it probably deals with debugging output

    #define JMPENV_PUSH_INIT_ENV(ce, THROWFUNC) \
        STMT_START { \
            (ce).je_throw = (THROWFUNC); \
            (ce).je_ret = -1; \
            (ce).je_mustcatch = FALSE; \
            (ce).je_prev = PL_top_env; \
            PL_top_env = &(ce); \
            OP_REG_TO_MEM; \
        } STMT_END

    #define PL_top_env (aTHX->Ttop_env)

    #ifdef OP_IN_REGISTER
    #  define OP_REG_TO_MEM PL_opsave = op
    ...
    #else
    #  define OP_REG_TO_MEM NOOP
58Anyone wanna bet how many slides it takes to explain one line of PERL?
- Almost there

    #define EXCEPT_SET_ENV(ce, v) ((ce).je_ret = (v))
    #define JMPENV_POST_CATCH_ENV(ce) \
        STMT_START { \
            OP_MEM_TO_REG; \
            PL_top_env = &(ce); \
        } STMT_END
    #define EXCEPT_GET_ENV(ce) ((ce).je_ret)
59Huzzah! One line expanded!
- Seven slides later...
- Two of over a dozen possible conditional compilations were explored
- We've successfully decoded one line of PERL
- Except we haven't; now we have to find out where the function pointers get initialized
- And of course, discover where the heck op came from
- I don't think an hour is long enough to cover just PERL, much less PERL and Python
600xbadc0ded
- Undisclosed and unpatched

    do {
        i_img *im = read_one_tiff(tif, 0);
        if (!im)
            break;
        if (++*count > result_alloc) {
            if (result_alloc == 0) {
                result_alloc = 5;
                results = mymalloc(result_alloc * sizeof(i_img *));
            }
            else {
                i_img **newresults;
                result_alloc *= 2;
                newresults = myrealloc(results, result_alloc * sizeof(i_img *));
                if (!newresults) {
                    i_img_destroy(im); /* don't leak it */
                    break;
                }
                results = newresults;
            }
        }
        results[*count-1] = im;
    } while (TIFFSetDirectory(tif, ++dirnum));
610xbadc0ded
- Undisclosed and unpatched

    static i_img *
    read_one_rgb_tiled(TIFF *tif, int width, int height, int allow_incomplete) {
        ...
        uint32 *raster = NULL;
        ...
        uint32 tile_width, tile_height;
        i_color *line;
        ...
        TIFFGetField(tif, TIFFTAG_TILEWIDTH, &tile_width);
        TIFFGetField(tif, TIFFTAG_TILELENGTH, &tile_height);
        ...
        raster = (uint32*)_TIFFmalloc(tile_width * tile_height * sizeof (uint32));
        ...
        line = mymalloc(tile_width * sizeof(i_color));
        for( row = 0; row < height; row += tile_height ) {
            for( col = 0; col < width; col += tile_width ) {
                /* Read the tile into an RGBA array */
                if (myTIFFReadRGBATile(&img, col, row, raster)) {
                ...
620xbadc0ded
- Undisclosed and unpatched

    i_img *
    read_one_rgb_lines(TIFF *tif, int width, int height, int allow_incomplete) {
        i_img *im;
        uint32 *raster = NULL;
        uint32 rowsperstrip, row;
        i_color *line_buf;
        int alpha_chan;
        int rc;

        im = make_rgb(tif, width, height, &alpha_chan);
        ...
        rc = TIFFGetField(tif, TIFFTAG_ROWSPERSTRIP, &rowsperstrip);
        ...
        raster = (uint32*)_TIFFmalloc(width * rowsperstrip * sizeof (uint32));
        ...
        line_buf = mymalloc(sizeof(i_color) * width);
        for( row = 0; row < height; row += rowsperstrip ) {
            uint32 newrows, i_row;
            if (!TIFFReadRGBAStrip(tif, row, raster)) {