From the previous post, we know how to make an x86-64 ELF program start, but how do we make it stop? Let's make a program that will exit, say with exit code 110.
In C, this would be easy to do:
$ cat >eleventy.c <<'EOF'
#include <stdlib.h>
int main(void) { exit(110); }
EOF
$ gcc eleventy.c -o eleventy
$ ./eleventy; echo $?
110
Our goal is to write some assembly code to do this, so we'll have to unwrap a few layers of niceness that the C standard library provides for us. The first stop is man 3 exit, which tells us that exit does a bunch of stuff and eventually calls _exit, with an underscore in front. The next stop is man 2 _exit, which tells us:
C library/kernel differences
In glibc up to version 2.3, the _exit() wrapper
function invoked the kernel system call of the same
name. Since glibc 2.3, the wrapper function
invokes exit_group(2), in order to terminate all of
the threads in a process.
I love these "C library/kernel differences" sections that are in the man pages for most system calls. Concise and useful!
Alright, so now we know we should call either the exit (number 60) or the exit_group (number 231) system call. (Those numbers come from either searching online or looking at the file arch/x86/entry/syscalls/syscall_64.tbl in the Linux kernel source tree.)
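If you'd rather not eyeball the table, the lookup can be sketched in a few lines of Python. The two-row excerpt is typed in by hand in the table's column format (number, ABI, name, entry point) and is only an illustration; the helper name syscall_number is made up for this post, and the entry-point column differs between kernel versions, so it is ignored.

```python
# Hand-copied excerpt in the format of arch/x86/entry/syscalls/syscall_64.tbl:
# <number> <abi> <name> <entry point>. The entry-point names are illustrative
# and vary between kernel versions; only the first three columns are used.
EXCERPT = """\
# 64-bit system call numbers (excerpt)
60   common  exit        sys_exit
231  common  exit_group  sys_exit_group
"""

def syscall_number(name, table=EXCERPT):
    """Return the number of the system call `name` (hypothetical helper)."""
    for line in table.splitlines():
        fields = line.split()
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        number, _abi, entry_name = fields[:3]
        if entry_name == name:
            return int(number)
    raise KeyError(name)

print(syscall_number("exit"), hex(syscall_number("exit_group")))
```

To run it against a real kernel tree, pass the contents of the actual file as the table argument.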
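Next up, the specification in x86-64-psABI-1.0.pdf tells us on page 148 how to actually make a system call in assembly: put the system call number in rax and the arguments in rdi, rsi, rdx, r10, r8 and r9, then execute the syscall instruction. The kernel clobbers rcx and r11, and the result comes back in rax, where a value between -4095 and -1 indicates an error and is -errno.
(Incidentally, this tells us that the whole errno thing in C is an abstraction added by the C library; in assembly, the error information is right there in the return value of the system call.)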
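To make the errno point concrete, here is a minimal sketch of what a C library wrapper has to do with the raw return value. It assumes the psABI rule that a value between -4095 and -1 in rax means an error and equals -errno; the function name decode_syscall_result is made up for illustration.

```python
import errno

def decode_syscall_result(rax):
    """Split a raw 64-bit rax value into (result, errno_name)."""
    # The register holds an unsigned 64-bit pattern; reinterpret it as signed.
    if rax >= 1 << 63:
        rax -= 1 << 64
    if -4095 <= rax <= -1:
        code = -rax
        # Fall back to the bare number if this platform has no name for it.
        return None, errno.errorcode.get(code, str(code))
    return rax, None

print(decode_syscall_result(3))             # an ordinary result, e.g. a file descriptor
print(decode_syscall_result((1 << 64) - 2)) # the bit pattern of -2
```

A real wrapper would store the code in errno and return -1 to the caller; the sketch just returns both pieces.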
Summarizing, the only code we need to exit with 110 is:
mov rax, 0xe7 # system call number 231 (decimal) = e7 (hex)
mov rdi, 0x6e # exit code 110 (decimal) = 6e (hex)
syscall
According to the AMD64 Architecture Programmer's Manual, Volume 3, page 421, it's pretty easy to encode the syscall instruction: just the two bytes 0f 05 and we're done.
It's a bit harder to encode the mov instructions, because there are so many ways of doing it. There are eight variants just for moving a constant into a register (page 225).
If you look closely at the encoding, though, you'll notice that the encodings for the 16-bit, 32-bit and 64-bit variants look the same on the binary level. That can't be right! How does the processor know the size of the operands? The answer is hidden in a few other tables and figures and sections in the manual, which specify how various instruction prefixes modify the meaning of instructions. Table 1-2 on page 8 tells us how:
Our program will be operating in the 64-bit submode of long mode, so to use a 64-bit operand size, we need a REX prefix with REX.W = 1. After reading the flow-chart on page 2, section 1.2.7 "REX Prefix", and section 2.5.2 "Opcode Syntax", we can finally figure out that mov rax, 0xe7 can be encoded as:
48 b8 e7 00 00 00 00 00 00 00
48 is the REX prefix, which specifies that we want wide operands, because it has the bit REX.W = 1.
b8, together with the bit REX.B = 0, encodes mov rax, {some 64-bit constant}.
e7 00 00 00 00 00 00 00 is the constant 0xe7, in little-endian format.
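The recipe above is mechanical enough to sketch in Python. This is only an illustration covering the eight registers that don't need REX.B; the helper names rex, mov_r64_imm64 and hexdump are made up for this post.

```python
# Register numbers 0..7 as used in the low three bits of the b8+r opcode.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte, laid out in binary as 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def mov_r64_imm64(reg, imm):
    """Encode mov reg, imm64 as REX.W prefix + (b8 + register number) + 8 bytes."""
    code = REG64.index(reg)
    return bytes([rex(w=1), 0xB8 + code]) + imm.to_bytes(8, "little")

def hexdump(data):
    return " ".join(f"{byte:02x}" for byte in data)

print(hexdump(mov_r64_imm64("rax", 0xE7)))
print(hexdump(mov_r64_imm64("rdi", 0x6E)))
```

The first line printed is the ten-byte encoding shown above; the second is the same thing for rdi, whose register number 7 turns the opcode byte into bf.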
But there's more than one way to do it! Let's take this as an invitation to do some premature optimization for code size. We can do the same thing in fewer bytes if we look at figure 2-3 and/or read the surrounding text:
This specifies that the 32-bit register eax lives in the lower half of the 64-bit register rax, and when we save a result to eax the processor sets the upper half of rax to zero automatically. Our constant has its upper half equal to zero, so we could use the encoding:
b8 e7 00 00 00
b8 without any REX prefix (so REX.W = 0 and REX.B = 0) encodes mov eax, {some 32-bit constant}.
e7 00 00 00 is 0xe7, again in little-endian.
Can we do even better? Our constant fits into a single byte, so we could hope to just stuff it into the 8-bit register al, which lives in the lowest byte of rax. This would be encoded as:
b0 e7
However, as figure 2-3 above tells us, the rest of rax doesn't get set to zero when we put something in al. We could do that ourselves, and the section on the mov instruction on page 224 tells us how: to zero a register, use xor with the same register as both destination and source.
After reading about the xor instruction and the encoding of ModRM bytes, we can encode xor eax, eax (which sets rax to zero) as:
31 c0
So this brings us down to four bytes for something roughly equivalent to mov rax, 0xe7:
31 c0 b0 e7
Can we do the same thing for the other value we have to set, mov rdi, 0x6e? Not really! Figure 2-3 tells us that the register dil, which lives in the lowest byte of register rdi, can only be used if we have a REX prefix (without one, the same register number means bh instead), which would bring us back up to five bytes:
31 ff 40 b7 6e
31 ff encodes xor edi, edi, which sets rdi to zero.
40 is a REX prefix with all of its W, R, X and B bits clear; its mere presence is what lets us use the register dil.
b7 together with the REX prefix encodes mov dil, {some 8-bit constant}. (With REX.B = 1, that is, a 41 prefix, it would select r15b instead.)
6e is the constant 0x6e.
However, it turns out that we can encode the instruction mov edi, eax with only two bytes:
89 c7
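For the two-register instructions used so far, the ModRM byte can be sketched as follows. This is a simplified illustration that only handles register-to-register operands among the eight 32-bit registers that need no REX prefix; the helper names are made up for this post.

```python
# Register numbers 0..7 as used in the reg and r/m fields of a ModRM byte.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def modrm_reg_reg(rm, reg):
    """ModRM byte for register-direct operands: mod=11, then reg, then r/m."""
    return 0b11_000_000 | (REG32.index(reg) << 3) | REG32.index(rm)

def xor_r32_r32(dst, src):
    # Opcode 31 /r is "xor r/m32, r32": destination in r/m, source in reg.
    return bytes([0x31, modrm_reg_reg(rm=dst, reg=src)])

def mov_r32_r32(dst, src):
    # Opcode 89 /r is "mov r/m32, r32": destination in r/m, source in reg.
    return bytes([0x89, modrm_reg_reg(rm=dst, reg=src)])

for encoded in (xor_r32_r32("eax", "eax"),
                xor_r32_r32("edi", "edi"),
                mov_r32_r32("edi", "eax")):
    print(" ".join(f"{byte:02x}" for byte in encoded))
```

The three lines printed are 31 c0, 31 ff and 89 c7, matching the hand-assembled bytes above.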
This means that we can do the following sequence in just 10 bytes:
31 c0 # xor eax, eax
b0 6e # mov al, 0x6e
89 c7 # mov edi, eax
b0 e7 # mov al, 0xe7
0f 05 # syscall
and the effect is essentially the same as the sequence we started with, which uses 22 bytes:
48 b8 e7 00 00 00 00 00 00 00 # mov rax, 0xe7
48 bf 6e 00 00 00 00 00 00 00 # mov rdi, 0x6e
0f 05 # syscall
(It's only essentially the same because the 10-byte version affects the status flags before doing the syscall, which we don't care about here. Infinite minutiae!)
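As a sanity check, here is a toy model of rax and rdi that runs both sequences from arbitrary starting contents and compares the results. It is a sketch, not an emulator: it knows only the four kinds of register write used in this post, ignores the flags, and the operation names and the GARBAGE value are made up for illustration.

```python
GARBAGE = 0xDEADBEEFDEADBEEF  # arbitrary starting contents, chosen for illustration

def run(program):
    """Apply (operation, register, operand) steps to a tiny model of rax and rdi."""
    regs = {"rax": GARBAGE, "rdi": GARBAGE}
    for op, reg, operand in program:
        if op == "mov64":      # mov r64, imm64: replaces all 64 bits
            regs[reg] = operand
        elif op == "xor32":    # xor r32 with itself: zero, zero-extended to 64 bits
            regs[reg] = 0
        elif op == "mov8":     # mov r8, imm8: replaces only the lowest byte
            regs[reg] = (regs[reg] & ~0xFF) | operand
        elif op == "mov32":    # mov r32, r32: copies the low 32 bits, zero-extends
            regs[reg] = regs[operand] & 0xFFFFFFFF
    return regs

long_version = [("mov64", "rax", 0xE7), ("mov64", "rdi", 0x6E)]
short_version = [
    ("xor32", "rax", None),   # xor eax, eax
    ("mov8", "rax", 0x6E),    # mov al, 0x6e
    ("mov32", "rdi", "rax"),  # mov edi, eax
    ("mov8", "rax", 0xE7),    # mov al, 0xe7
]
print(run(long_version) == run(short_version))
```

Dropping the xor32 step from short_version makes the comparison fail, because the upper bytes of the starting garbage survive the byte-sized writes.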
Wrap it all up in an ELF file and we have a program that can make a system call:
$ cat >make-eleventy <<'EOF'
#!/usr/bin/python3
s = """
7f 45 4c 46 02 01 01 00
00 00 00 00 00 00 00 00
02 00 3e 00 01 00 00 00
78 00 01 00 00 00 00 00
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00
01 00 40 00 00 00 00 00
01 00 00 00 05 00 00 00
78 00 00 00 00 00 00 00
78 00 01 00 00 00 00 00
00 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00
00 10 00 00 00 00 00 00
31 c0 b0 6e 89 c7 b0 e7
0f 05
"""
open('eleventy', 'wb').write(bytes([int(b, 16) for b in s.split()]))
EOF
$ chmod +x make-eleventy
$ ./make-eleventy
$ chmod +x eleventy
$ ./eleventy
$ echo $?
110
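If the wall of hex is hard to read, the same file can be assembled from named fields. The following is a sketch using Python's struct module; the field names in the comments are the ones from the ELF64 specification, while the function name build_elf and its base parameter are made up for this post.

```python
import struct

def build_elf(code, base=0x10000):
    """Return a minimal x86-64 ELF executable image wrapping `code`."""
    ehdr_size, phdr_size = 64, 56
    code_offset = ehdr_size + phdr_size  # 0x78: the code follows both headers
    # e_ident: magic, 64-bit class, little-endian, version 1, then padding.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident,
        2,                   # e_type: ET_EXEC
        0x3E,                # e_machine: EM_X86_64
        1,                   # e_version
        base + code_offset,  # e_entry
        ehdr_size,           # e_phoff: program header follows the ELF header
        0,                   # e_shoff: no section headers
        0,                   # e_flags
        ehdr_size,           # e_ehsize
        phdr_size,           # e_phentsize
        1,                   # e_phnum
        64,                  # e_shentsize
        0,                   # e_shnum
        0,                   # e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,                   # p_type: PT_LOAD
        5,                   # p_flags: PF_R | PF_X
        code_offset,         # p_offset
        base + code_offset,  # p_vaddr
        0,                   # p_paddr
        len(code),           # p_filesz
        len(code),           # p_memsz
        0x1000,              # p_align
    )
    return ehdr + phdr + code

image = build_elf(bytes.fromhex("31c0b06e89c7b0e70f05"))
print(len(image))
```

Writing image to a file and marking it executable should give the same 130 bytes that make-eleventy produces.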