SI485H: Stack Based Binary Exploits and Defenses (F15)

Lec. 09: Shell Code in x86

1 System Calls in x86

To review, let's recall how system calls work in x86. To invoke a system call, we load its number and arguments into registers and then issue a trap instruction, or software interrupt. On Linux, the interrupt for a system call is 0x80 (the instruction int 0x80). The arguments are passed through the registers:

  • eax : the system call number specifying which system call. For example, 0x4 is write, 0x1 is exit.
  • ebx : the first argument
  • ecx : the second argument
  • edx : the third argument
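The same convention (a call number plus up to three arguments) can be sketched from C with the libc syscall() wrapper, which loads the registers and issues the trap for us. This is only a minimal sketch, assuming Linux and glibc. The helper name raw_write is hypothetical, and SYS_write is used rather than a literal 4 because the numeric values differ between 32-bit x86 and other architectures.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes unistd.h declare syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>       /* SYS_write: the system call number */
#include <unistd.h>            /* syscall() */

/* Hypothetical helper: invoke write through the generic system call entry.
 * On 32-bit x86 this corresponds to:
 *   eax = SYS_write, ebx = fd, ecx = buf, edx = count, then int 0x80.
 * Returns the number of bytes written, or -1 on error (errno is set). */
long raw_write(int fd, const void *buf, size_t count)
{
    return syscall(SYS_write, fd, buf, count);
}
```

For example, raw_write(1, "hi\n", 3) writes three bytes to stdout and returns 3, while a bad file descriptor such as -1 makes it return -1.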

For example, when we were examining how the system call executed for starting a shell with execve, we saw that this call

char * args[] = {"/bin/sh",NULL};
execve(args[0], args, NULL);

was transformed into a stack with the following settings

                           ___________.--> "/bin/sh"
                          /           |                  
esp + 0x20 -> |   0x80bea08  |<-.   .-'
esp + 0x1c -> |              |  |  /                  
esp + 0x18 -> |              |  | |    eax: 0xb       
esp + 0x14 -> |              |  | |    ebx: 0x80bea08 
esp + 0x10 -> |     0x00     |  | |    ecx: [esp+0x20]
esp + 0xc  -> |  [esp+0x20]  | -' |    edx: 0x00          
esp + 0x8  -> |  0x80bea08   | ---'                      
esp + 0x4  -> |  ret addr    |
esp + 0x0  -> |  saved ebx   |
              '--------------'

From this, we can write down the final value of each register at the time of the interrupt:

  • eax : 0xb
  • ebx : 0x80bea08 : a memory reference to the string "/bin/sh"
  • ecx : [esp+0x20] : a memory reference to the memory reference to "/bin/sh" (double pointer)
  • edx : 0x00 : NULL, the third argument to execve (no environment is passed)
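To make the pointer relationships concrete, here is a minimal C sketch that records what each register would hold for the execve call above. The struct execve_regs and the function execve_setup are hypothetical names used only for illustration; nothing here actually executes a shell.

```c
#include <stddef.h>

/* Hypothetical struct for illustration: the value each register holds
 * just before int 0x80 for execve on 32-bit x86 Linux. */
struct execve_regs {
    long eax;          /* system call number */
    const char *ebx;   /* first argument: pointer to the path string */
    char **ecx;        /* second argument: pointer to the argv array */
    char **edx;        /* third argument: pointer to envp (NULL here) */
};

/* Fill in the register values for execve(args[0], args, NULL). */
struct execve_regs execve_setup(char **args)
{
    struct execve_regs r;
    r.eax = 0xb;       /* execve is system call 0xb on 32-bit x86 */
    r.ebx = args[0];   /* address of "/bin/sh" */
    r.ecx = args;      /* address of the array holding that address */
    r.edx = NULL;      /* no environment */
    return r;
}
```

Dereferencing ecx once yields the same address stored in ebx, which is what "double pointer" means here.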

2 x86 Programming

Now that we know what shell code looks like when we compile a program from C, our goal is to write our own program in x86 that will perform the necessary system call for executing a shell. We'll do this using the NASM assembler and the GNU linker to build the executable, but we first need to get a grip on x86 programming generally. So we start with Hello World!

2.1 Hello x86 programming

We can write straight up assembly programs. Typically these files are described as ASM files, and have the .asm extension. Here's a hello world program to get us started:

SECTION .data                   ;data section to store string Hello, World\n
        hello:   db "Hello, World!",0x0a,0x00 ;newline and null terminated


SECTION .text                   ; Code section
        global _start           ; Make label available to linker

_start:

        mov edx,0x10            ;number of bytes to write

        mov ecx,hello           ;memory reference to write

        mov ebx,0x1             ;write to stdout

        mov eax,0x4             ;system call number 4 for write
        int 0x80                ;interrupt

        mov ebx,0               ;exit value
        mov eax,1               ;exit systemcall number
        int 0x80                ;interrupt

First off, main() doesn't exist in assembly; it is a C-language construct. The real starting point of any program is _start, the special label that tells the linker where execution begins. This label needs to be declared global so that the linker can find it, and it must be placed in the .text section, which is where the code goes.

Also within the text section is the main functionality of the program to set up the registers for the system call. Recall that the arguments to the write() system call are:

ssize_t write(int fd, const void *buf, size_t count);

The system call number for write() is 4, so before the first interrupt the registers must be set as follows:

  • eax : 0x4 : system call number for write
  • ebx : 0x1 : write to stdout file descriptor 1
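  • ecx : hello : reference into the data segment to the string "Hello, World!\n\0"
  • edx : 0x10 : write up to 16 bytes starting at the reference. The string occupies only 15 bytes (13 characters, the newline, and the NULL), so 0xe would write exactly "Hello, World!\n"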

You may also notice the following line in the .data section:

hello:   db "Hello, World!",0x0a,0x00

This line creates a label in asm called hello. At that label's location in memory is the sequence of bytes representing "Hello, World!" followed by 0x0a (the newline character) and 0x00 (the NULL byte). The db (define byte) directive is a convenient way to add bytes to the program under a name whose address is determined later, when the program is assembled and linked.
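A quick way to check how many bytes that db line lays down is to mirror it in C, where the string literal supplies the trailing 0x00 for us. This is a small sketch; the helper names hello_stored_bytes and hello_visible_bytes are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Same bytes as the db line: 13 characters, 0x0a, and a terminating 0x00. */
static const char hello[] = "Hello, World!\n";

/* Bytes stored in memory, including the terminating NULL byte. */
size_t hello_stored_bytes(void)
{
    return sizeof hello;
}

/* Bytes worth writing to the terminal: everything before the NULL byte. */
size_t hello_visible_bytes(void)
{
    return strlen(hello);
}
```

The stored size is 15 bytes and the visible length is 14 bytes, so a count of 14 (0xe) passed to write() would cover exactly the text and its newline.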

Lastly, we perform another system call at the end of the program to exit. The system call number of exit() is 1, and the exit status passed in ebx is 0, indicating a successful exit.
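The value placed in ebx is what the parent process later sees as the exit status. The following minimal sketch, assuming Linux and glibc, forks a child that exits through the raw exit system call and reports the status the parent observes. The helper name child_exit_status is hypothetical, and SYS_exit stands in for the literal 1 used on 32-bit x86.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes unistd.h declare syscall() */
#endif
#include <sys/syscall.h>       /* SYS_exit */
#include <sys/types.h>         /* pid_t */
#include <sys/wait.h>          /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>            /* fork(), syscall() */

/* Hypothetical helper: fork a child that exits via the raw exit system
 * call, then return the status the parent observes (or -1 on error). */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        syscall(SYS_exit, code);   /* child: does not return */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling child_exit_status(0) corresponds to the mov ebx,0 in the program above; any other small value comes back to the parent unchanged.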

2.2 Assembling and Executing an ASM Program

To build our program we use nasm, the Netwide Assembler. We first assemble our asm program into an ELF object file, and then link that file using ld to create the executable. This is pretty much exactly what happens under the hood of gcc.

To assemble using nasm

nasm -f elf helloworld.asm -o helloworld.o

The -f elf indicates the output format should be in elf. The object file output is the same as any object file we've worked with this semester.

Next, we can link our object file with ld to produce the executable:

ld helloworld.o -o helloworld

Then, we run our program:

user@si485H-base:demo$ nasm -f elf helloworld.asm -o helloworld.o; ld helloworld.o -o helloworld
user@si485H-base:demo$ ./helloworld 
Hello, World!