Lec. 03: Disassembling a Program

1. A binary program … what is it?
2. objdump and readelf basics
3. x86 the Processor Register State

1 A binary program … what is it?

Last week we looked a lot at writing programs in C, compiling them into binaries, and then running them. This week, we peel back the covers further and look right at the binary files themselves. We will examine what exactly a binary is, how it is formatted, and how we parse or disassemble its contents.

2 `objdump` and `readelf` basics

For this entire class, we will pick apart a simple helloworld program:

#include <stdio.h>

int main(int argc, char *argv[]){

    char hello[15]="Hello, World!\n";
    char * p;

    for(p = hello; *p; p++){

        putchar(*p);         

    }

    return 0;
}

Let's compile the program to create a binary:

user@si485H-base:demo$ gcc helloworld.c -o helloworld

Now if we use the file command we can see what kind of file the binary is.

user@si485H-base:demo$ file helloworld
helloworld: ELF 32-bit LSB  executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=2b27688b97f10f626f1ff62c232d7a2298d6afa1, not stripped

We see that it is actually an ELF file, which stands for Executable and Linkable Format. We will work exclusively with binaries in ELF.

2.1 ELF Files and ELF Headers

All ELF files have a header describing the different sections and general information. We can read the header information for our helloworld program using readelf -h:

user@si485H-base:demo$ readelf -h helloworld
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048370
  Start of program headers:          52 (bytes into file)
  Start of section headers:          4472 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         30
  Section header string table index: 27

Most of this information isn't too useful, but let me point out some key things.

There is a magic number! The magic number says, hey, this is an ELF file: the bytes 7f 45 4c 46 are 0x7f followed by "ELF" in ASCII. The bytes right after it encode the class, the byte order, and the ELF version.
The class is ELF32, so it's 32 bit
The machine is Intel 80386, i.e., 32-bit x86, as expected
The entry point for the file is address 0x8048370, which is the first instruction of the _start function, the startup code that eventually calls main.

Everything else is not super useful for our purposes. Another nice thing we can do with readelf is look at all the sections, which are regions of the binary used for different purposes.

user@si485H-base:demo$ readelf -S helloworld
There are 30 section headers, starting at offset 0x1178:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        08048154 000154 000013 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            08048168 000168 000020 00   A  0   0  4
  [ 3] .note.gnu.build-i NOTE            08048188 000188 000024 00   A  0   0  4
  [ 4] .gnu.hash         GNU_HASH        080481ac 0001ac 000020 04   A  5   0  4
  [ 5] .dynsym           DYNSYM          080481cc 0001cc 000060 10   A  6   1  4
  [ 6] .dynstr           STRTAB          0804822c 00022c 000068 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          08048294 000294 00000c 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         080482a0 0002a0 000030 00   A  6   1  4
  [ 9] .rel.dyn          REL             080482d0 0002d0 000008 08   A  5   0  4
  [10] .rel.plt          REL             080482d8 0002d8 000020 08   A  5  12  4
  [11] .init             PROGBITS        080482f8 0002f8 000023 00  AX  0   0  4
  [12] .plt              PROGBITS        08048320 000320 000050 04  AX  0   0 16
  [13] .text             PROGBITS        08048370 000370 0001f2 00  AX  0   0 16
  [14] .fini             PROGBITS        08048564 000564 000014 00  AX  0   0  4
  [15] .rodata           PROGBITS        08048578 000578 000008 00   A  0   0  4
  [16] .eh_frame_hdr     PROGBITS        08048580 000580 00002c 00   A  0   0  4
  [17] .eh_frame         PROGBITS        080485ac 0005ac 0000b0 00   A  0   0  4
  [18] .init_array       INIT_ARRAY      08049f08 000f08 000004 00  WA  0   0  4
  [19] .fini_array       FINI_ARRAY      08049f0c 000f0c 000004 00  WA  0   0  4
  [20] .jcr              PROGBITS        08049f10 000f10 000004 00  WA  0   0  4
  [21] .dynamic          DYNAMIC         08049f14 000f14 0000e8 08  WA  6   0  4
  [22] .got              PROGBITS        08049ffc 000ffc 000004 04  WA  0   0  4
  [23] .got.plt          PROGBITS        0804a000 001000 00001c 04  WA  0   0  4
  [24] .data             PROGBITS        0804a01c 00101c 000008 00  WA  0   0  4
  [25] .bss              NOBITS          0804a024 001024 000004 00  WA  0   0  1
  [26] .comment          PROGBITS        00000000 001024 00004d 01  MS  0   0  1
  [27] .shstrtab         STRTAB          00000000 001071 000106 00      0   0  1
  [28] .symtab           SYMTAB          00000000 001628 000440 10     29  45  4
  [29] .strtab           STRTAB          00000000 001a68 000274 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Again, a bunch of this information isn't too useful for us, but might be later. Some key things to look at:

The .bss section is listed, this is the same as bss in the program memory layout
There is also .data section, same as the program memory layout
Finally, there is a .text section, same as before. And notice that it is at address 0x08048370, which is the same as the entry point address in the header: the start of the program's instructions.

2.2 Getting at the assembly with `objdump`

Now that we have some idea of how the file is formatted, it would be nice to get down into the details of the machine instructions themselves. For this, we'll use objdump or "object dump". Simply, we can call it on the binary executable like so:

user@si485H-base:demo$ objdump -d helloworld 

helloworld:     file format elf32-i386


Disassembly of section .init:

080482f8 <_init>:
 80482f8:	53                   	push   %ebx
 80482f9:	83 ec 08             	sub    $0x8,%esp
 80482fc:	e8 9f 00 00 00       	call   80483a0 <__x86.get_pc_thunk.bx>
 8048301:	81 c3 ff 1c 00 00    	add    $0x1cff,%ebx
 8048307:	8b 83 fc ff ff ff    	mov    -0x4(%ebx),%eax
 804830d:	85 c0                	test   %eax,%eax
 804830f:	74 05                	je     8048316 <_init+0x1e>
 8048311:	e8 2a 00 00 00       	call   8048340 <__gmon_start__@plt>
 8048316:	83 c4 08             	add    $0x8,%esp
 8048319:	5b                   	pop    %ebx
 804831a:	c3                   	ret    

Disassembly of section .plt:

08048320 <__stack_chk_fail@plt-0x10>:
 8048320:	ff 35 04 a0 04 08    	pushl  0x804a004
 8048326:	ff 25 08 a0 04 08    	jmp    *0x804a008
 804832c:	00 00                	add    %al,(%eax)
	...

It's going to dump a lot of stuff, but if we look more carefully further down, we'll see one header that looks familiar, main:

0804841d <main>:
 804841d:	55                   	push   %ebp
 804841e:	89 e5                	mov    %esp,%ebp
 8048420:	83 e4 f0             	and    $0xfffffff0,%esp
 8048423:	83 ec 30             	sub    $0x30,%esp
 8048426:	c7 44 24 1d 48 65 6c 	movl   $0x6c6c6548,0x1d(%esp)
 804842d:	6c 
 804842e:	c7 44 24 21 6f 2c 20 	movl   $0x57202c6f,0x21(%esp)
 8048435:	57 
 8048436:	c7 44 24 25 6f 72 6c 	movl   $0x646c726f,0x25(%esp)
 804843d:	64 
 804843e:	66 c7 44 24 29 21 0a 	movw   $0xa21,0x29(%esp)
 8048445:	c6 44 24 2b 00       	movb   $0x0,0x2b(%esp)
 804844a:	8d 44 24 1d          	lea    0x1d(%esp),%eax
 804844e:	89 44 24 2c          	mov    %eax,0x2c(%esp)
 8048452:	eb 17                	jmp    804846b <main+0x4e>
 8048454:	8b 44 24 2c          	mov    0x2c(%esp),%eax
 8048458:	0f b6 00             	movzbl (%eax),%eax
 804845b:	0f be c0             	movsbl %al,%eax
 804845e:	89 04 24             	mov    %eax,(%esp)
 8048461:	e8 aa fe ff ff       	call   8048310 <putchar@plt>
 8048466:	83 44 24 2c 01       	addl   $0x1,0x2c(%esp)
 804846b:	8b 44 24 2c          	mov    0x2c(%esp),%eax
 804846f:	0f b6 00             	movzbl (%eax),%eax
 8048472:	84 c0                	test   %al,%al
 8048474:	75 de                	jne    8048454 <main+0x37>
 8048476:	b8 00 00 00 00       	mov    $0x0,%eax
 804847b:	c9                   	leave  
 804847c:	c3                   	ret    
 804847d:	66 90                	xchg   %ax,%ax
 804847f:	90                   	nop

This is the assembly for the main function. Reading across from left to right, the leftmost column is the address the instruction is loaded at, then come the actual bytes of the instruction, and finally the instruction itself: the operation name and its operands.

The first thing you might notice about the instruction itself is that it is really, really hard to read. That's because it is AT&T syntax, which, simply, sucks! We will use an alternative format called Intel syntax, which is much, much nicer. For that, we need to pass an argument to objdump:

user@si485H-base:demo$ objdump -M intel -d helloworld 

helloworld:     file format elf32-i386


Disassembly of section .init:

080482f8 <_init>:
 80482f8:	53                   	push   ebx
 80482f9:	83 ec 08             	sub    esp,0x8
(... snip ...)

0804841d <main>:
 804841d:	55                   	push   ebp
 804841e:	89 e5                	mov    ebp,esp
 8048420:	83 e4 f0             	and    esp,0xfffffff0
 8048423:	83 ec 30             	sub    esp,0x30
 8048426:	c7 44 24 1d 48 65 6c 	mov    DWORD PTR [esp+0x1d],0x6c6c6548
 804842d:	6c 
 804842e:	c7 44 24 21 6f 2c 20 	mov    DWORD PTR [esp+0x21],0x57202c6f
 8048435:	57 
 8048436:	c7 44 24 25 6f 72 6c 	mov    DWORD PTR [esp+0x25],0x646c726f
 804843d:	64 
 804843e:	66 c7 44 24 29 21 0a 	mov    WORD PTR [esp+0x29],0xa21
 8048445:	c6 44 24 2b 00       	mov    BYTE PTR [esp+0x2b],0x0
 804844a:	8d 44 24 1d          	lea    eax,[esp+0x1d]
 804844e:	89 44 24 2c          	mov    DWORD PTR [esp+0x2c],eax
 8048452:	eb 17                	jmp    804846b <main+0x4e>
 8048454:	8b 44 24 2c          	mov    eax,DWORD PTR [esp+0x2c]
 8048458:	0f b6 00             	movzx  eax,BYTE PTR [eax]
 804845b:	0f be c0             	movsx  eax,al
 804845e:	89 04 24             	mov    DWORD PTR [esp],eax
 8048461:	e8 aa fe ff ff       	call   8048310 <putchar@plt>
 8048466:	83 44 24 2c 01       	add    DWORD PTR [esp+0x2c],0x1
 804846b:	8b 44 24 2c          	mov    eax,DWORD PTR [esp+0x2c]
 804846f:	0f b6 00             	movzx  eax,BYTE PTR [eax]
 8048472:	84 c0                	test   al,al
 8048474:	75 de                	jne    8048454 <main+0x37>
 8048476:	b8 00 00 00 00       	mov    eax,0x0
 804847b:	c9                   	leave  
 804847c:	c3                   	ret    
 804847d:	66 90                	xchg   ax,ax
 804847f:	90                   	nop

0804846d <main>:
 804846d:	55                   	push   ebp
 804846e:	89 e5                	mov    ebp,esp
 8048470:	83 e4 f0             	and    esp,0xfffffff0
 8048473:	83 ec 30             	sub    esp,0x30
 8048476:	65 a1 14 00 00 00    	mov    eax,gs:0x14
 804847c:	89 44 24 2c          	mov    DWORD PTR [esp+0x2c],eax
 8048480:	31 c0                	xor    eax,eax
 8048482:	c7 44 24 1d 48 65 6c 	mov    DWORD PTR [esp+0x1d],0x6c6c6548
 8048489:	6c 
 804848a:	c7 44 24 21 6f 2c 20 	mov    DWORD PTR [esp+0x21],0x57202c6f
 8048491:	57 
 8048492:	c7 44 24 25 6f 72 6c 	mov    DWORD PTR [esp+0x25],0x646c726f
 8048499:	64 
 804849a:	66 c7 44 24 29 21 0a 	mov    WORD PTR [esp+0x29],0xa21
 80484a1:	c6 44 24 2b 00       	mov    BYTE PTR [esp+0x2b],0x0
 80484a6:	8d 44 24 1d          	lea    eax,[esp+0x1d]
 80484aa:	89 44 24 18          	mov    DWORD PTR [esp+0x18],eax
 80484ae:	eb 17                	jmp    80484c7 <main+0x5a>
 80484b0:	8b 44 24 18          	mov    eax,DWORD PTR [esp+0x18]
 80484b4:	0f b6 00             	movzx  eax,BYTE PTR [eax]
 80484b7:	0f be c0             	movsx  eax,al
 80484ba:	89 04 24             	mov    DWORD PTR [esp],eax
 80484bd:	e8 9e fe ff ff       	call   8048360 <putchar@plt>
 80484c2:	83 44 24 18 01       	add    DWORD PTR [esp+0x18],0x1
 80484c7:	8b 44 24 18          	mov    eax,DWORD PTR [esp+0x18]
 80484cb:	0f b6 00             	movzx  eax,BYTE PTR [eax]
 80484ce:	84 c0                	test   al,al
 80484d0:	75 de                	jne    80484b0 <main+0x43>
 80484d2:	b8 00 00 00 00       	mov    eax,0x0
 80484d7:	8b 54 24 2c          	mov    edx,DWORD PTR [esp+0x2c]
 80484db:	65 33 15 14 00 00 00 	xor    edx,DWORD PTR gs:0x14
 80484e2:	74 05                	je     80484e9 <main+0x7c>
 80484e4:	e8 47 fe ff ff       	call   8048330 <__stack_chk_fail@plt>
 80484e9:	c9                   	leave  
 80484ea:	c3                   	ret    
 80484eb:	66 90                	xchg   ax,ax
 80484ed:	66 90                	xchg   ax,ax
 80484ef:	90                   	nop
 (... snip ...)

2.3 Disassembling with `gdb`

Another way to get the disassembled code is using gdb, the GNU debugger, which also does a ton of other tasks that we will look at later. To start, run the program under the debugger:

user@si485H-base:demo$ gdb helloworld 
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from helloworld...(no debugging symbols found)...done.
(gdb)

This will print out a disclaimer and leave you at a gdb prompt. Now, you can type disassemble main to disassemble the main function. If you set up the alias already as suggested in the resource page, you can shorten that to ds for disassemble:

(gdb) ds main
Dump of assembler code for function main:
   0x0804841d <+0>:	push   ebp
   0x0804841e <+1>:	mov    ebp,esp
   0x08048420 <+3>:	and    esp,0xfffffff0
   0x08048423 <+6>:	sub    esp,0x30
   0x08048426 <+9>:	mov    DWORD PTR [esp+0x1d],0x6c6c6548
   0x0804842e <+17>:	mov    DWORD PTR [esp+0x21],0x57202c6f
   0x08048436 <+25>:	mov    DWORD PTR [esp+0x25],0x646c726f
   0x0804843e <+33>:	mov    WORD PTR [esp+0x29],0xa21
   0x08048445 <+40>:	mov    BYTE PTR [esp+0x2b],0x0
   0x0804844a <+45>:	lea    eax,[esp+0x1d]
   0x0804844e <+49>:	mov    DWORD PTR [esp+0x2c],eax
   0x08048452 <+53>:	jmp    0x804846b <main+78>
   0x08048454 <+55>:	mov    eax,DWORD PTR [esp+0x2c]
   0x08048458 <+59>:	movzx  eax,BYTE PTR [eax]
   0x0804845b <+62>:	movsx  eax,al
   0x0804845e <+65>:	mov    DWORD PTR [esp],eax
   0x08048461 <+68>:	call   0x8048310 <putchar@plt>
   0x08048466 <+73>:	add    DWORD PTR [esp+0x2c],0x1
   0x0804846b <+78>:	mov    eax,DWORD PTR [esp+0x2c]
   0x0804846f <+82>:	movzx  eax,BYTE PTR [eax]
   0x08048472 <+85>:	test   al,al
   0x08048474 <+87>:	jne    0x8048454 <main+55>
   0x08048476 <+89>:	mov    eax,0x0
   0x0804847b <+94>:	leave  
   0x0804847c <+95>:	ret    
End of assembler dump.

If your output is in AT&T syntax, then issue the following command to have gdb output in Intel syntax:

(gdb) set disassembly-flavor intel

I'll mostly work with gdb disassembled output because it's more nicely formatted, IMHO. Our next task is understanding what the hell is going on?!?!

3 x86 the Processor Register State

3.1 x86 Instruction Set

Let's start with a simple item, what is x86? It is an assembly instruction set, in effect a programming language for the processor. We can represent x86 in terms of its bytes (as seen in the objdump output) or in a human readable form (as seen in the Intel or AT&T syntax).

You may have worked with an instruction set previously, such as MIPS. MIPS is a RISC instruction set, or Reduced Instruction Set Computing, which has the advantage that all instructions are the same size, 32 bits.

x86 is a CISC instruction set, or Complex Instruction Set Computing, and it has the property that instruction sizes are not constant. They can vary from a single byte up to 15 bytes, depending on the instruction. You may wonder, why in the world would anything be designed this way? The answer is market inertia and backwards compatibility. As Intel chips dominated the market, more and more binaries were x86.

Today, another instruction set has become very relevant: ARM, originally the Acorn RISC Machine. And, as the name indicates, it is a RISC instruction set and thus is bringing back a bit of sanity to the instruction set world. ARM is also the architecture of choice on many mobile devices, so it will be relevant for quite some time.

However, we will not be working with ARM in this class, just x86, and we will only be using a very small set of the x86 instructions. You can read more about x86 in the extensive online resources, and when we encounter an unfamiliar instruction, we will look it up.

3.2 Anatomy of an Instruction

An instruction, in the human readable format, has the following format:

operation <dst>, <src>

The operation name is the kind of operation that will be performed. For example, it could be an add or mov or and. The <dst> is where the result will be stored, which is typically a register. The <src> is where the data to operate on is read from; the operation may also use the current value of <dst>. The <src> is optional, depending on the instruction.

If we take a few operations from our sample code:

0x0804841d <+0>:	push   ebp
0x0804841e <+1>:	mov    ebp,esp
0x08048420 <+3>:	and    esp,0xfffffff0

The first command, push, takes one argument and places that argument on the stack, adjusting the stack pointer. In this case, it pushes the value of the base pointer stored in the register ebp onto the stack. The second command, mov, takes two arguments and will move a value from one location to another, much like assignment. Here it copies the value in the stack pointer esp into the base pointer ebp. Finally, the last command is a bitwise and operation taking two arguments. It will perform a bitwise and on the <dst> with the <src> and store the result in <dst>. In this context, the and command clears the lowest 4 bits of the stack pointer, which rounds it down to a multiple of 16 bytes. Compilers such as gcc keep the stack 16-byte aligned, largely because some instructions (e.g., SSE) expect aligned data, and so you'll see this sequence a lot in assembly.

We will take a closer look at these instructions again in a second, but before we do that, we need to understand these registers and what they are used for in more detail.

3.3 Processor Registers

Registers are special storage spaces on the processor that store the state of the program. Some registers are used for general purpose storage of intermediate values, while others are used to keep track of the execution state, e.g., what the next instruction is.

Here are the standard registers you will encounter. There are some others, but we'll explain them when we come across them:

esp: 32-bit register to store the stack pointer
ebp: 32-bit register to store the base pointer
eax: 32-bit general purpose register, sometimes called the "accumulator"
ecx: 32-bit general purpose register
ebx: 32-bit general purpose register
edx: 32-bit general purpose register
esi: 32-bit general purpose register, the "source index", mostly used for loading
edi: 32-bit general purpose register, the "destination index", mostly used for storing

Each of the general purpose registers can be referenced either by its full 32-bit value or by some subset of that. For example, eax refers to the full 32-bit register, ax refers to the lowest 16 bits of eax, al is the lowest 8 bits, and ah is the next 8 bits (bits 8 through 15). The same pattern holds for ebx, ecx, and edx. Depending on the kind of data the register is storing, we may reference different parts.

3.4 The Base Pointer and Stack Pointer

Two registers will be referenced more than any other: the base and stack pointer. These registers maintain the memory reference state for the current execution, with reference to the current function frame. A function frame is a portion of memory on the stack that stores the information for the current function's execution, including local data and return addresses. The base pointer and the stack pointer define the top and bottom of the function frame.

The structure of a function frame looks like this:

           <- 4 bytes ->
          .-------------.    
          |    ...      |    higher address
ebp+0x8 ->| func args   |
ebp+0x4 ->| return addr |       
    ebp ->| saved ebp   |
ebp-0x4 ->|             |
   :      :             :              
   '      '             '
            local vars
   .      .             .
   :      :             :
esp+0x4 ->|             |
    esp ->|             |    lower address
          '-------------'

Moving from higher addresses to lower addresses, the top of the frame stores the function arguments. These are typically referenced at positive offsets from the ebp register. For example, the first argument is at ebp+0x8, with later arguments moving upwards from there.

The second item in the function frame is the return address at ebp+0x4. The value stored in this memory is the address of the instruction to execute after this function returns, that is, the instruction right after the call to this function. We will spend a LOT OF TIME talking about this later.

Finally, there is the saved ebp. This is the base pointer of the calling function. We need to save this value so that the calling function's stack frame can be restored once this function completes.

The stack pointer references the lowest allocated address of the stack, which is the bottom of the diagram above (and, since the stack grows downward, what is usually called the top of the stack). Addresses below this point are considered un-allocated. However, it's pretty easy to allocate more space: we just subtract from the stack pointer.

3.5 Managing the Stack Frame and the Stack

Now that we have some context for the registers, let's take a look at the first set of instructions in our code:

0x0804841d <+0>:	push   ebp
0x0804841e <+1>:	mov    ebp,esp
0x08048420 <+3>:	and    esp,0xfffffff0
0x08048423 <+6>:	sub    esp,0x30

Let's analyze these four instructions. The push instruction will push a value onto the stack, and in this case it is the previous base pointer, i.e., the saved base pointer. Next, the base pointer is set to the stack pointer (mov), and then the stack pointer is rounded down to a 16-byte boundary (and). Finally, subtracting from the stack pointer allocates the rest of the stack frame, which is 0x30 bytes long or 48 bytes (don't forget about hex).

3.6 Referencing, De-Referencing, and Setting Memory

The next set of instructions initializes the memory of the stack. Let's switch back to the C code to see this in C first before we look at it in assembly.

char hello[15]="Hello, World!\n";

The string "Hello, World!\n" is stored on the stack in a 15-byte character array. In assembly, this looks like this:

0x08048426 <+9>:	mov    DWORD PTR [esp+0x1d],0x6c6c6548
0x0804842e <+17>:	mov    DWORD PTR [esp+0x21],0x57202c6f
0x08048436 <+25>:	mov    DWORD PTR [esp+0x25],0x646c726f
0x0804843e <+33>:	mov    WORD PTR [esp+0x29],0xa21
0x08048445 <+40>:	mov    BYTE PTR [esp+0x2b],0x0

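If you squint at the <src> of these instructions, you'll recognize that this is ASCII. If you don't believe me, check out the ASCII table. Because x86 is little endian, the lowest byte of each value is stored first, so 0x6c6c6548 lands in memory as 48 65 6c 6c, or "Hell". The DWORD PTR, WORD PTR, and BYTE PTR prefixes are dereference operations that also say how many bytes to access: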

BYTE PTR[addr] : byte-pointer : de-reference one byte at the address
WORD PTR[addr] : word-pointer : de-reference the two bytes at the address
DWORD PTR[addr] : double word-pointer : de-reference the four bytes at the address

Another way to look at these instructions in C would be like this (don't program like this, though):

char hello[15];
//                      l l e H  
* ((int *) (hello)) = 0x6c6c6548;      // set hello[0]->hello[3]
//                          W   , o
* ((int *) (hello + 4)) = 0x57202c6f; // set hello[4]->hello[7]
//                          d l r o      
* ((int *) (hello + 8)) = 0x646c726f; // set hello[8]->hello[11]
//                             \n !
* ((short *) (hello + 12)) = 0x0a21;  // set hello[12]->hello[13]
//                         \0
* ((char *) (hello+14)) = 0x00;  // set hello[14]

The next two instructions are a bit different:

0x0804844a <+45>:	lea    eax,[esp+0x1d]
0x0804844e <+49>:	mov    DWORD PTR [esp+0x2c],eax

lea stands for load effective address and is a shortcut to do a bit of math, calculating a pointer offset and storing it, without reading memory. If we look at what's next in the C program, we see that it is setting up the for-loop.

for(p = hello; *p; p++){

The first part of the for loop initializes the pointer p to reference the start of the string hello. From the previous code, the start of the string hello is at address esp+0x1d, and we want to set p to that address. This is a two-step process:

The actual address must be computed using addition from esp and stored. lea eax,[esp+0x1d] will calculate the address and store it in eax.
The value in eax must be stored in the memory reserved for p, which is at address esp+0x2c; the mov instruction accomplishes that.

At this point, everything is set up. And for reference, remember that p is stored at address esp+0x2c.

3.7 Loops, Jumps, and Condition Testing

Now, we've reached the meat of the program: the inner loop. We can follow the execution at this point by following the jumps.

0x08048452 <+53>: jmp    0x804846b <main+78>      # -----------.
0x08048454 <+55>: mov    eax,DWORD PTR [esp+0x2c] # <-------.  |
0x08048458 <+59>: movzx  eax,BYTE PTR [eax]       #         |  |
0x0804845b <+62>: movsx  eax,al                   #         |  |
0x0804845e <+65>: mov    DWORD PTR [esp],eax      #         |  |  //loop body
0x08048461 <+68>: call   0x8048310 <putchar@plt>  #         |  |
0x08048466 <+73>: add    DWORD PTR [esp+0x2c],0x1 #         |  |
0x0804846b <+78>: mov    eax,DWORD PTR [esp+0x2c] # <-------+--'
0x0804846f <+82>: movzx  eax,BYTE PTR [eax]       #         |    //exit condition
0x08048472 <+85>: test   al,al                    #         |
0x08048474 <+87>: jne    0x8048454 <main+55>      #  -------'

A jmp instruction changes the instruction pointer to the destination specified. It is unconditional: an explicit hard jump. Following that jump in the code, we land on the exit-condition check, which ends with the following three instructions:

0x0804846f <+82>: movzx  eax,BYTE PTR [eax]       
0x08048472 <+85>: test   al,al                    
0x08048474 <+87>: jne    0x8048454 <main+55>

It's easiest to start with the movzx instruction. Recall that at this point in the code, eax has the same value as p. You can see that to be the case in the previous instruction, mov eax,DWORD PTR [esp+0x2c], where esp+0x2c is the memory address for p.

The movzx instruction will dereference the address stored in eax, which is whatever p references, read one byte at that address, and write it to the lower 8 bits of eax while setting the upper 24 bits to zero. This is essentially the *p operation, which yields some character in hello, and so what we want to test is whether p references the NUL byte at the end of hello.

That test occurs in test al,al. The test instruction computes the bitwise and of its two operands, throws the result away, and records facts about it in a set of bit flags. Here we are testing the al register, which is the lower 8 bits of eax, where we stored the dereference of p. Since al and al is just al, the flags describe al itself, such as whether it is zero or whether its top bit is set. The one we care about is the ZF flag, or the zero flag. If al is zero then ZF is set to 1, which would be the case when p references the end of the hello string.

The jne command says to jump when not equal, which means jump when ZF is 0. If al is zero, ZF is 1, so we do not jump and fall out of the loop; otherwise we jump to the given address and continue the loop.

3.8 Function Calls

If we investigate the loop body, we find the following instructions:

0x08048454 <+55>: mov    eax,DWORD PTR [esp+0x2c] 
0x08048458 <+59>: movzx  eax,BYTE PTR [eax]       
0x0804845b <+62>: movsx  eax,al                   
0x0804845e <+65>: mov    DWORD PTR [esp],eax      
0x08048461 <+68>: call   0x8048310 <putchar@plt>

The first set of instructions, much like the test before, dereferences the pointer p:

Load the value of p, a memory address, into eax
Read the byte referenced by p into the lower 8 bits of eax, zeroing the remaining bits (movzx)
Sign-extend the lower 8 bits into the full 32 bits of eax (movsx), since the char is being converted to the int that putchar expects

At this point, eax stores a value like 0x00000048 (i.e., 'H') where the lowest byte is the character of interest and, for ASCII characters like these, the remaining bytes are 0.

This value is then written to the top of the stack, as referenced by esp, because we are about to make a function call. The arguments to functions are pushed onto the stack before a call. In this case, we allocated that stack space ahead of time so we don't need to push, but the argument is in the right place, at the top of the stack.

The next operation is a call which will execute the function putchar, conveniently told to us by gdb. Once that function completes, execution will continue to the point right after the call, which is the instruction add.

0x08048466 <+73>: add    DWORD PTR [esp+0x2c],0x1

Looking closely at this instruction, you see that this will increment the pointer p, and the instructions following test whether p now references zero. And the loop goes on … as the world turns.