Lec. 09: Shell Code in x86
1 System Calls in x86
To review, let's recall how system calls work in x86. To invoke a system call, we first must issue a trap instruction, or interrupt. The interrupt for a system call is 0x80. The arguments to a system call are passed through the registers:
eax: the system call number, specifying which system call to invoke. For example, 0x4 is write and 0x1 is exit.
ebx: the first argument
ecx: the second argument
edx: the third argument
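The register convention above can be tried from C without writing any assembly. The sketch below is an illustration and not part of the lecture's code: it uses the C library's syscall() wrapper, which places the system call number and the arguments where the kernel expects them and then traps into the kernel on our behalf. The helper name raw_write is hypothetical. The SYS_write constant from <sys/syscall.h> only has the value 0x4 when the program is built for 32-bit x86 (for example with gcc -m32); on other targets the number differs, although the idea is the same.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write: the system call number (eax) */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: invoke write by number instead of calling write().
 * On 32-bit x86 the kernel sees
 *   eax = SYS_write, ebx = fd, ecx = buf, edx = count
 * Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);   /* guard against an obviously bad buffer */
    return syscall(SYS_write, fd, buf, count);
}
```

Calling raw_write(1, "hi\n", 3) from a main() function should print hi and return 3, assuming standard output is open.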
For example, when we were examining how the system call executed for starting a shell with execve, we saw that this call
    char * args[] = {"/bin/sh",NULL};
    execve(args[0], args, NULL);
was transformed into a stack with the following settings
                           ___________.--> "/bin/sh"
                          /           |
esp + 0x20 -> | 0x80bea08    |<-.  .-'
esp + 0x1c -> |              |  | /
esp + 0x18 -> |              |  | |      eax: 0xb
esp + 0x14 -> |              |  | |      ebx: 0x80bea08
esp + 0x10 -> | 0x00         |  | |      ecx: [esp+0x20]
esp + 0xc  -> | [esp+0x20]   | -' |      edx: 0x00
esp + 0x8  -> | 0x80bea08    | ---'
esp + 0x4  -> | ret addr     |
esp + 0x0  -> | saved ebx    |
              '--------------'
From this, we can write down the final values the registers hold when the interrupt is issued:
eax: 0xb : the system call number for execve
ebx: 0x80bea08 : a memory reference to the string "/bin/sh"
ecx: [esp+0x20] : a memory reference to the memory reference to "/bin/sh" (a double pointer)
edx: 0x00 : NULL, the third argument
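To make the double pointer concrete, here is a small C sketch. It is an illustration only, and the struct and function names are hypothetical. It takes the same args array and records the values that would be loaded into ebx, ecx, and edx. It deliberately does not call execve, so running it will not replace the process with a shell.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the three execve arguments as they would sit
 * in ebx, ecx, and edx just before the interrupt. */
struct execve_regs {
    const char *ebx;    /* pointer to the string "/bin/sh"            */
    char *const *ecx;   /* pointer to args: a pointer to a pointer    */
    char *const *edx;   /* environment pointer, NULL in this example  */
};

/* Build the register view for execve(args[0], args, NULL). */
struct execve_regs execve_register_view(char *const args[])
{
    struct execve_regs r;
    assert(args != NULL);
    r.ebx = args[0];    /* first argument: the path string itself */
    r.ecx = args;       /* second argument: address of the array  */
    r.edx = NULL;       /* third argument: no environment         */
    return r;
}
```

Dereferencing ecx once yields the same address that ebx holds, which is exactly the relationship between esp+0x20 and 0x80bea08 in the stack picture.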
2 x86 Programming
Now that we know what shell code looks like when we compile a program from C, our goal is to write our own program in x86 that performs the system call needed to execute a shell. We'll do this using the NASM assembler and the GNU linker to build the executable, but we first need to get a grip on x86 programming generally. So we start with Hello World!
2.1 Hello x86 programming
We can write straight up assembly programs. Typically these files are described as ASM files, and have the .asm extension. Here's a hello world program to get us started:
    SECTION .data                            ;data section to store string Hello, World\n
        hello: db "Hello, World!",0x0a,0x00  ;newline and null terminated

    SECTION .text                            ;code section
        global _start                        ;make label available to linker

    _start:
        mov edx,0x10      ;number of bytes to write
        mov ecx,hello     ;memory reference to write
        mov ebx,0x1       ;write to stdout
        mov eax,0x4       ;system call number 4 for write
        int 0x80          ;interrupt

        mov ebx,0         ;exit value
        mov eax,1         ;exit system call number
        int 0x80          ;interrupt
First off, main() doesn't exist in assembly; it is a C-language construct. The real starting point of any program is _start. It is the special label that indicates where the program begins executing. This label needs to be declared globally so that the linker can find it, and it must be declared in the .text section, which is where the code goes.
Also within the text section is the main functionality of the program, which sets up the registers for the system call. Recall that the arguments to the write() system call are:
ssize_t write(int fd, const void *buf, size_t count);
The system call number for write() is 4, so the registers must be set as follows before the first interrupt:
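eax: 0x4 : system call number for write
ebx: 0x1 : write to the stdout file descriptor, 1
ecx: hello : reference into the data segment to the string "Hello, World!\n\0"
edx: 0x10 : write 16 bytes starting at the reference. Note that the string, with its newline and null terminator, occupies only 15 bytes; a count of 0xe (14) would write exactly "Hello, World!\n".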
You may also notice the following line in the .data section:
hello: db "Hello, World!",0x0a,0x00
This directive creates a label in asm called hello. At that label's location in memory sits the sequence of bytes representing "Hello, World!" followed by 0x0a (the newline character) and 0x00 (the null terminator). This is an easy convention for adding bytes to the program under a name whose address is determined later, when the program is assembled and linked.
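For comparison, a rough C counterpart of the db line is a byte array initialized from a string literal, which supplies the trailing 0x00 automatically. This is only a sketch: the array name simply mirrors the label, and the two helper functions are hypothetical. Counting the bytes shows that the text plus newline is 14 bytes and that the terminator brings the total to 15.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bytes as: hello: db "Hello, World!",0x0a,0x00
 * The string literal appends the 0x00 terminator for us. */
static const char hello[] = "Hello, World!\n";

/* Checked at compile time: 13 characters + newline + terminator. */
static_assert(sizeof hello == 15, "unexpected size for hello");

/* Total bytes stored, including the null terminator. */
size_t hello_total_bytes(void) { return sizeof hello; }

/* Bytes that are visible output: everything before the terminator. */
size_t hello_visible_bytes(void) { return strlen(hello); }
```

The visible byte count is the natural value for the count argument of write() when only the printable text and the newline should reach the terminal.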
Lastly, we perform another system call at the end of the program to exit. The system call number of exit() is 1, and the return value in ebx is 0, indicating a successful exit.
2.2 Compiling and Executing an ASM Program
To compile our program we use nasm, the Netwide Assembler. We first assemble our asm program into an ELF object file. Next, we link that object file using ld to create the executable. This is pretty much what happens under the hood of gcc.
To assemble using nasm:
    nasm -f elf helloworld.asm -o helloworld.o
The -f elf option indicates that the output format should be ELF. The resulting object file is the same kind of object file we've worked with all semester.
Next, we can link our object file with ld to produce the executable:
ld helloworld.o -o helloworld
Then, we run our program:
    user@si485H-base:demo$ nasm -f elf helloworld.asm -o helloworld.o; ld helloworld.o -o helloworld
    user@si485H-base:demo$ ./helloworld
    Hello, World!