Practical Reverse Engineering Solutions – Page 17
My go at the exercises on page 17
This blog post presents my solutions to exercises from the book Practical Reverse Engineering by Bruce Dang, Alexandre Gazet and Elias Bachaalany (ISBN: 1118787315). The book is my first contact with reverse engineering, so take my statements with a grain of salt. All code snippets are on GitHub. For an overview of my solutions consult this progress page.
Exercise 1
Given what you learned about CALL and RET, explain how you would read the value of EIP? Why can’t you just do MOV EAX, EIP?
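MOV EAX, EIP does not work because EIP is not a general-purpose register: it cannot be used as an operand of MOV, so the instruction cannot be encoded. There is usually no need to read EIP anyway, as the processor maintains it for you.
The CALL instruction pushes the return address, i.e., the value of EIP pointing at the instruction after the CALL, onto the stack before jumping to the function address. So the stack on entering the function looks like this: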

We can therefore get the value of EIP by jumping to a dummy function read_eip (thereby placing EIP at the top of the stack), and then copying the value from the stack memory to a register, i.e., EAX:
SECTION .data
SECTION .text
GLOBAL _start
_start:
nop
call read_eip
mov ebx,0
mov eax,1
int 080h
read_eip:
mov eax, [esp]
ret
Let’s test the code with gdb. The value of EIP before calling read_eip is 0x8048061:
$ nasm -f elf32 -g -F dwarf code.asm
$ ld -m elf_i386 -o code code.o
$ gdb -q code
Reading symbols from code...done.
(gdb) set disassemble-next-line on
(gdb) break *_start
Breakpoint 1 at 0x8048060: file code.asm, line 5.
(gdb) run
Starting program: /home/jb/pre/chapter_1/page_17/exercise_1/code
Breakpoint 1, _start () at code.asm:5
5 nop
=> 0x08048060 <_start+0>: 90 nop
(gdb) s
6 call read_eip
=> 0x08048061 <_start+1>: e8 0c 00 00 00 call 0x8048072 <read_eip>
(gdb) p/x $eip
$1 = 0x8048061
If we inspect EAX right after the function call, we get the value 0x8048066, which is now also the value of EIP:
(gdb) s
_start () at code.asm:7
7 mov ebx,0
=> 0x08048066 <_start+6>: bb 00 00 00 00 mov $0x0,%ebx
(gdb) p/x $eax
$3 = 0x8048066
(gdb) p/x $eip
$3 = 0x8048066
So we actually get the value EIP has after the CALL, which is 5 bytes greater than before the CALL. The CALL instruction itself is 5 bytes long: the opcode E8 followed by a 4-byte relative offset.
Exercise 2
Come up with at least two code sequences to set EIP to 0xAABBCCDD
I know three instructions that manipulate the EIP:
- RET
- JMP
- CALL
Version 1 – Based on RET
The RET instruction jumps to the address stored at the top of the stack, i.e., it pops the double word at ESP into EIP. So pushing the desired address on the stack and then executing RET should set the EIP:
SECTION .data
SECTION .text
GLOBAL _start
_start:
nop
push 0AABBCCDDh
ret
We can check with the GNU debugger:
(gdb) s
6 push 0AABBCCDDh
(gdb) p/x $eip
$1 = 0x8048061
(gdb) s
_start () at version_1.asm:7
7 ret
(gdb) s
0xaabbccdd in ?? ()
(gdb) p/x $eip
$2 = 0xaabbccdd
Version 2 – Based on JMP
Instead of pushing the address on the stack and using RET to jump to an address, doing a plain JMP also works:
SECTION .data
SECTION .text
GLOBAL _start
_start:
nop
jmp 0AABBCCDDh
Again let’s check with the GNU debugger:
(gdb) s
6 jmp 0AABBCCDDh
(gdb) p/x $eip
$1 = 0x8048061
(gdb) s
0xaabbccdd in ?? ()
(gdb) p/x $eip
$2 = 0xaabbccdd
Version 3 – Based on CALL
CALL works similarly to JMP; compared to version 2, it additionally (and unnecessarily) pushes the return address onto the stack:
SECTION .data
SECTION .text
GLOBAL _start
_start:
nop
call 0AABBCCDDh
In the GNU debugger:
(gdb) s
6 call 0AABBCCDDh
(gdb) p/x $eip
$1 = 0x8048061
(gdb) s
0xaabbccdd in ?? ()
(gdb) p/x $eip
$2 = 0xaabbccdd
Exercise 3
In the example function, addme, what would happen if the stack pointer were not properly restored before executing RET?
You can see the addme function below, with the referenced instruction highlighted:
SECTION .data
SECTION .text
GLOBAL _start
_start:
nop
mov eax, 7
mov ecx, 5
_before:
push eax
push ecx
call add_me
add esp, 8
_after:
mov ebx,0
mov eax,1
int 080h
add_me:
push ebp
mov ebp, esp
movsx eax, word [ebp+8]
movsx ecx, word [ebp+0Ch]
add eax, ecx
mov esp, ebp
pop ebp
retn
The restore (mov esp, ebp) is part of the function epilogue, which is standard for C-style functions. Resetting ESP ensures that any values placed on the stack within the function, but not cleaned up, don’t interfere with the RET instruction. If, for instance, the function pushed a value on the stack but never removed it, the epilogue would pop the wrong values: POP EBP would pick up the leftover value, and RET would take the next double word (the saved EBP) as the return address instead of the saved EIP. Restoring ESP prevents this. But if the function properly cleans the stack, there is no need to back up and restore ESP. In the present add_me function there are no instructions that modify ESP between the prologue and the epilogue. So there is no need to restore ESP, and removing the instruction has no effect.
Here’s validation with the GNU debugger, first with the restore instruction:
$ gdb -q addme_with_restore
Reading symbols from addme_with_restore...done.
(gdb) break *_before
Breakpoint 1 at 0x804806b: file addme_with_restore.asm, line 9.
(gdb) break *_after
Breakpoint 2 at 0x8048075: file addme_with_restore.asm, line 14.
(gdb) run
Starting program: /home/baderj/chapter 1/page 17/exercise 3/addme_with_restore
Breakpoint 1, _before () at addme_with_restore.asm:9
9 push eax
(gdb) p/x $esp
$1 = 0xffffd000
(gdb) c
Continuing.
Breakpoint 2, _after () at addme_with_restore.asm:14
14 mov ebx,0
(gdb) p/x $esp
$2 = 0xffffd000
and the same without the restore instruction:
Breakpoint 1, _before () at addme_without_restore.asm:9
9 push eax
(gdb) p/x $esp
$1 = 0xffffd000
(gdb) c
Continuing.
Breakpoint 2, _after () at addme_without_restore.asm:14
14 mov ebx,0
(gdb) p/x $esp
$2 = 0xffffd000
Exercise 4
In all of the calling conventions explained, the return value is stored in a 32-bit register (EAX). What happens when the return value does not fit in a 32-bit register? Write a program to experiment and evaluate your answer. Does the mechanism change from compiler to compiler?
I use the following C code:
#include <stdio.h>
struct data
{
    int n1;
    int n2;
};
struct data test_return(void) {
    struct data test_object;
    test_object.n1 = 7;
    test_object.n2 = 5;
    return test_object;
}
int main (int argc, char *argv[] )
{
    struct data ret;
    ret = test_return();
    int res = (ret.n1 + ret.n2);
    return res;
}
The struct contains two integer values (8 bytes in total when compiled with -m32) and is therefore bigger than 32 bits. I use gcc to compile the code:
gcc -fno-asynchronous-unwind-tables -masm=intel -Os -S -m32 code.c
The full output is on GitHub, here’s the function excerpt:
test_return:
push ebp
mov ebp, esp
mov eax, DWORD PTR [ebp+8]
mov DWORD PTR [eax], 7
mov DWORD PTR [eax+4], 5
pop ebp
ret 4
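- push ebp and mov ebp, esp are the standard function prologue.
- mov eax, DWORD PTR [ebp+8] loads the value at [EBP+8] from the stack into EAX.
- The two mov DWORD PTR instructions store the values 7 and 5 at the location that EAX points to, i.e., the address found at [EBP+8].
- pop ebp and ret 4 are the function epilogue; ret 4 also removes the 4-byte address argument from the stack.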
The return value is placed in memory at the address the caller passes on the stack, which the function finds at [EBP+8]. So in order to use the function, the caller needs to reserve space for the struct in memory and push its address onto the stack before calling the function. Compiling the C code with the -Os flag produces assembly code where the function is never called (since the return value is always 12). To see the call, I recompiled the code with -O0. The function now contains unnecessary mov statements, but is in essence the same (see GitHub for full output).

The main function now does call the function:
main:
push ebp
mov ebp, esp
sub esp, 20
lea eax, [ebp-8]
mov DWORD PTR [esp], eax
call test_return
The instruction sub esp, 20 reserves 20 bytes on the stack. The next two instructions compute the address EBP-8 and store it at the top of the stack. The following image shows how the stack changes for the six instructions above:

The value at the top of the stack is the address of the stack memory at ESP+12 (since ESP = EBP-20, the address EBP-8 equals ESP+12). The stack before the function epilogue, i.e., after mov DWORD PTR [eax+4], 5, looks like the right-hand side of the above image. Inside test_return, EAX is loaded from [EBP+8] and therefore holds that address. Relative to the function's own EBP this is EBP+20, because the CALL and the push ebp each moved the stack pointer by 4 bytes. The function places the member n1 of the struct at EAX (= EBP+20) and the member n2 at EAX+4 (= EBP+24).
So, long story short, the function writes its return value to the stack memory provided by the caller and returns the address of that location to the caller in EAX. The caller has to reserve the necessary space on the stack and pass the address of that reserved space to the function. It therefore doesn’t need to check the returned address, since it knows the address already.
I got very similar results with Clang. Again, the caller reserves space for the structure and stores its address at the top of the stack (lea edx, dword ptr [ebp - 32] and mov dword ptr [esp], edx):
sub esp, 40
mov eax, dword ptr [ebp + 12]
mov ecx, dword ptr [ebp + 8]
lea edx, dword ptr [ebp - 32]
mov dword ptr [ebp - 4], 0
mov dword ptr [ebp - 8], ecx
mov dword ptr [ebp - 12], eax
mov dword ptr [esp], edx
call test_return
Clang moves more stuff on the stack, but that’s probably a matter of optimization. The function looks almost the same as for GCC:
push ebp
mov ebp, esp
sub esp, 8
mov eax, dword ptr [ebp + 8]
mov dword ptr [ebp - 8], 7
mov dword ptr [ebp - 4], 5
movsd xmm0, qword ptr [ebp - 8]
movsd qword ptr [eax], xmm0
add esp, 8
pop ebp
ret 4
Unlike GCC, Clang does not write the members directly through the pointer passed by the caller. It first reserves 8 bytes of local stack space with sub esp, 8 in line 3 and builds the struct there, at [EBP-8] and [EBP-4], i.e., at lower addresses than EBP. It then copies all 8 bytes at once through XMM0 to the address in EAX, which is the pointer the caller passed at [EBP+8].
