execve system call with x86 assembly
If there's one thing I learned today, it's that assembly is pretty hard to debug. There's no easy way to just print something, so I must part with my old comrades, console.log and printf. Time to get acquainted with a new friend, gdb.
I'm continuing the journey into processes. Yesterday I learned about forking, a mechanism where a new process comes into existence when an existing process clones itself, copying its page tables (which map virtual memory to physical memory) into a new address space. However, forking doesn't do much other than create this new address space and copy over the heap and stack. In most cases, the program running inside a newly forked process is expected to be replaced by a different program.
When you run an executable from a command line shell, for example, the process that is running the shell forks itself, and the forked child is then replaced by the executable:
$ ls
foo bar baz
$
In the above example, we execute ls from the command line. Once ls has completed, we see the command prompt again.
This all happens with three distinct system calls. The first, fork, we discussed yesterday. The system call that executes a new program, thereby replacing the current one, is called exec. Finally, a parent process can perform a wait system call, temporarily pausing execution until its child process(es) have completed.
Let's quickly see what happens if we use the plain exec call without forking a process. First, let's make a really simple program in C that we will use as the executable that replaces an existing process:
#include <stdio.h>
#include <unistd.h>

int main() {
    int seconds = 5; // Set sleep time in seconds
    printf("Sleeping for %d seconds...\n", seconds);
    sleep(seconds); // sleep() takes seconds in Linux
    printf("Awake now!\n");
    return 0;
}
We can compile it into an executable with:
gcc -o sleep sleep.c
Yes, I know this isn't assembly. For simplicity, I want the main executable simple and readable.
OK, now that we have our C program, let's write a separate program that loads and runs it. We'll do this with the execve system call. In the 32-bit x86 syscall table this is number 11 (0xb), and it takes three arguments:
const char *filename - a pointer to the name of the file in memory
const char *const *argv - a pointer to the arguments for the exec call
const char *const *envp - a pointer to the environment of the caller (we can ignore this for now)
For the first argument we will use ./sleep. Note that this works as long as we run our program from the directory that contains sleep. We could also use the absolute path of a binary (e.g. /usr/bin/ls). The second is the argument list. Since ./sleep takes no other arguments, the list holds just the name of the executable. If we look at the man pages for exec, we'll also find that each string must be null-terminated and that the argv and envp arrays must each end with a null pointer. This is very typical in system calls such as these.
Let's store these arguments as initialized variables at the top of our program:
section .data
    filename db './sleep',0
    arg1     db './sleep',0
    argv     dd arg1,0
    envp     dd 0
In the main section of our program we'll set the arguments of the execve system call in the respective registers. We set eax to 11 (0xb in hex), ebx to the first argument, and so on:
    mov eax, 0x0b
    mov ebx, filename
    mov ecx, argv
    mov edx, envp
    int 0x80
If execve succeeds it never returns, so anything after int 0x80 only runs when the call fails. We should exit the process in that case. Let's put it all together:
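section .data
    filename db './sleep',0   ; path of the program to execute
    arg1     db './sleep',0   ; argv[0]
    argv     dd arg1,0        ; argument array, ends with a null pointer
    envp     dd 0             ; empty environment: just the null pointer

section .text
    global _start

_start:
    mov eax,0x0b        ; syscall number for execve
    mov ebx,filename
    mov ecx,argv
    mov edx,envp
    int 0x80

    ; exit the program (only reached if execve failed)
    mov eax,1           ; syscall number for exit
    mov ebx,1           ; exit status 1
    int 0x80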
Let's assemble the program and link the object file so we can see what happens when we run it in gdb:
nasm -f elf32 simple_exec.s -o simple_exec.o
ld -m elf_i386 -o simple_exec simple_exec.o
Inspect with gdb:
gdb simple_exec
(gdb) layout asm
(gdb) break _start
(gdb) run
We can stepi to the int instruction, then step one instruction further and bang! gdb reports that the process is suddenly executing an entirely new program. The kernel has replaced the process's memory image, loading the new program's code into the existing process.
syscalls
In the program above the entire process was replaced, but this isn't ideal. We need to create new processes by forking and then execute programs inside them. We also want to make the parent process wait for child processes to complete.