Exec: a whole new program

Understanding the `execve` system call with x86 assembly

September 25, 2024

If there's one thing I learned today, it's that assembly is pretty hard to debug. There's no easy way to just print something, so I must part with my old comrades, console.log and printf. Time to get acquainted with a new friend, gdb.

I'm continuing the journey into processes. Yesterday I learned about forking, a mechanism where a new process comes into existence when an existing process clones itself: the kernel copies the parent's memory table, which maps virtual memory to physical memory, into a new address space. However, forking doesn't do much beyond that. The child gets its own address space with a copy of the parent's heap and stack, and it keeps running the same program. The expectation is that most newly forked processes will soon replace that program with a different one.

When you run an executable from a command line shell, for example, the process that is running the shell forks itself, and the forked child is then replaced by the executable:

$ ls
foo    bar    baz
$

In the above example, we execute ls from the command line. Once ls has completed, we see the command prompt again.

This all happens with three distinct system calls. The first, fork, we discussed yesterday. The system call that executes a new program, thereby replacing the current one, is called exec. Finally, a parent process can perform a wait system call, which pauses its execution until its child process(es) have finished.
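
To make the fork, exec, wait sequence concrete, here's a minimal sketch in C of roughly what a shell does for each command. This is my own illustration, not how any particular shell is implemented: the function name run_and_wait is hypothetical, the child gets an empty environment for simplicity, and exiting with 127 on a failed exec is a choice I borrowed from the usual shell convention for "command not found".

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `path` in the child, wait in the parent.
 * Returns the child's exit status, or -1 if something went wrong. */
int run_and_wait(const char *path, char *const argv[]) {
    char *const envp[] = {NULL}; /* assumption: empty environment */

    pid_t child = fork();
    if (child == -1) {
        return -1; /* fork failed, no child was created */
    }

    if (child == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, envp);
        _exit(127); /* only reached if execve failed */
    }

    /* Parent: block until the child has finished. */
    int status;
    if (waitpid(child, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Assuming /bin/sh exists on your system, calling run_and_wait("/bin/sh", args) with args set to {"sh", "-c", "exit 7", NULL} should return 7, while a path that doesn't exist should return 127.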

Basic exec functionality

Let's quickly see what happens if we use the plain exec call without forking a process. First, let's make a really simple program in C that we will use as our executable that will replace an existing process:

#include <stdio.h>
#include <unistd.h>

int main() {
    int seconds = 5; // Set sleep time in seconds
    printf("Sleeping for %d seconds...\n", seconds);
    sleep(seconds); // sleep() takes seconds in Linux
    printf("Awake now!\n");
    return 0;
}

We can compile it into an executable with:

gcc -o sleep sleep.c

Yes, I know this isn't assembly. I want the target executable to stay simple and readable.

OK, now that we have our C program, let's write a separate program that loads and runs it. We'll do this with the execve system call. In the 32-bit x86 syscall table this is number 11, or 0xb, and it takes three arguments:

const char *filename - a pointer to the null-terminated path of the file to execute
const char *const *argv - a pointer to an array of argument strings for the new program, ending in a null pointer
const char *const *envp - a pointer to an array of environment strings, ending in a null pointer (we can ignore this for now)

For the first argument we will use ./sleep. Note that this works because we'll run our program from the same directory as the sleep binary. We could also use the absolute path of a binary (e.g. /usr/bin/ls). The second is the argument list. Since ./sleep takes no other arguments, the list contains only the name of the executable itself, which by convention is always the first argument. If we look at the man page for execve, we'll also find that every string needs a null terminator and that the argv and envp arrays must each end with a null pointer. This is very typical in system calls such as these.

Let's store these arguments as initialized variables at the top of our program:

section .data
	filename db './sleep',0
	arg1 db './sleep',0
	argv dd arg1,0
	envp dd 0
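
For comparison, here's how I'd sketch the same data layout in C. The names sleep_argv, empty_envp and vector_length are my own, made up for this example. The helper walks an array until it hits the null pointer, which is how the end of argv and envp is found, since no length is passed alongside them.

```c
#include <stddef.h>

/* Same layout as the .data section: one string, one pointer array
 * ending in NULL, and an environment array that is just the NULL. */
static char sleep_path[] = "./sleep";
char *const sleep_argv[] = {sleep_path, NULL};
char *const empty_envp[] = {NULL};

/* Hypothetical helper: count entries up to (not including) the NULL. */
size_t vector_length(char *const vec[]) {
    size_t n = 0;
    while (vec[n] != NULL) {
        n++;
    }
    return n;
}
```

With these definitions vector_length(sleep_argv) is 1 and vector_length(empty_envp) is 0. Without the trailing null pointer, the loop would keep reading whatever memory comes next.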

In the main section of our program we'll set the arguments of the execve system call in the respective registers. We set eax to 11 or 0xb in hex, ebx to the first argument, and so on:

mov eax, 0x0b
mov ebx, filename
mov ecx, argv
mov edx, envp
int 0x80

On success execve never returns, so any instruction after int 0x80 runs only if the call failed. In that case we should exit the process. Let's put it all together:

section .data
	filename db './sleep',0
	arg1 db './sleep',0
	argv dd arg1,0
	envp dd 0
section .text
	global _start

_start:
	mov eax,0x0b
	mov ebx,filename
	mov ecx,argv
	mov edx,envp
	int 0x80
	
	; only reached if execve failed: exit with status 1
	mov eax,1
	mov ebx,1
	int 0x80
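
The "code after execve only runs on failure" idea is easier to poke at from C, so here's a small sketch. The function name try_exec is hypothetical. It passes the path as the only argument and an empty environment, and if execve comes back at all, it returns the errno value explaining why.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: try to replace the current process with `path`.
 * If this function returns at all, execve failed, and the return
 * value is the errno describing the failure. */
int try_exec(const char *path) {
    char *const argv[] = {(char *)path, NULL};
    char *const envp[] = {NULL};

    execve(path, argv, envp);

    /* Still here, so the old program is still running. */
    return errno;
}
```

Calling try_exec with a path that doesn't exist returns ENOENT. Calling it with a valid executable never returns, because the calling program is gone by then.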

Once we assemble the program and link the object file, we can see what happens when we run it in gdb:

nasm -f elf32 simple_exec.s -o simple_exec.o
ld -m elf_i386 -o simple_exec simple_exec.o

Inspect with gdb:

gdb simple_exec
(gdb) layout asm
(gdb) break _start
(gdb) run

We can stepi to the int instruction, then step one instruction further and bang! gdb reports that the process is now executing an entirely new program. The kernel has thrown away the old contents of the address space and loaded the new program's code and data into the existing process.
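
One way to convince yourself that it's the same process with a new program inside is to check the PID. Here's a rough sketch in C, under a few assumptions of mine: /bin/sh exists, and `echo $$` makes the shell print its own PID. The function name exec_keeps_pid is made up. The fork and the pipe are only there so the result can be observed from outside; the interesting part is that the PID returned by fork matches the PID the exec'd program reports.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if the program exec'd in the child
 * reports the same PID that fork() returned, 0 if not, -1 on error. */
int exec_keeps_pid(void) {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    pid_t child = fork();
    if (child == -1) return -1;

    if (child == 0) {
        /* Child: send stdout into the pipe, then replace ourselves. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        char *const argv[] = {"sh", "-c", "echo $$", NULL};
        char *const envp[] = {NULL};
        execve("/bin/sh", argv, envp);
        _exit(127); /* only reached if execve failed */
    }

    /* Parent: read the PID printed by the new program. */
    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(child, NULL, 0);
    if (n <= 0) return -1;

    return atol(buf) == (long)child;
}
```

If exec created a new process, the two PIDs would differ. They match because the kernel swaps the program inside the process that already exists.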

Looking forward: combining process `syscalls`

In the program above the entire process was replaced, which isn't ideal when the original program still has work to do. Instead, we want to fork a new process and call exec in the child. We also want to make the parent process wait for its child processes to complete.
