This week I suddenly got a burning desire to write some x86 assembly code and have it run on my computer, in as minimal an environment as possible. I decided that it wouldn't be too much to ask my computer to run the code eb fe (that's two bytes of code, written in hexadecimal), which is just an infinite loop, the equivalent of label: goto label. No input, no output, not even an exit, just a single observable side-effect: running forever, or at least until ctrl-C.
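To see why those two bytes loop forever, here is a small sketch that decodes them by hand. It assumes the standard x86 encoding, where eb is a short jump whose one-byte operand is a signed displacement measured from the end of the instruction. The start address 0x10078 is only an illustrative choice (it is the entry point used further down); any address gives the same result.

```python
# Decode the two-byte program "eb fe" by hand.
code = bytes.fromhex("ebfe")

opcode, operand = code[0], code[1]
assert opcode == 0xEB  # JMP rel8: jump by a signed 8-bit displacement

# Interpret the operand as a signed 8-bit integer (two's complement).
displacement = int.from_bytes(bytes([operand]), "little", signed=True)

# The displacement is relative to the address of the *next* instruction.
start = 0x10078                       # illustrative address of the jmp itself
next_instruction = start + len(code)  # the jmp is two bytes long
target = next_instruction + displacement

print(displacement)  # -2
print(hex(target))   # 0x10078
```

The jump lands exactly on its own first byte, so the processor executes the same instruction again and again.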
As a first step to doing this, I installed the excellent QEMU project and spent a day trying to understand which command options to give it to disable most of the bells and whistles it gives you but still load a file with some binary code. Unsuccessfully.
The next day, I vaguely remembered having read an excellent blog post about creating tiny ELF files, so I decided that sticking my two bytes of assembly code in a valid ELF file might be easier. The smart thing would have been to immediately Google for this blog post and read it, tweaking as necessary, but instead I tried to look for specs and found:
man 5 elf for more details;
less /usr/include/elf.h for some of the numeric constants;
readelf -a /usr/bin/lefty and hd /usr/bin/lefty for a worked example.
Eventually I ended up with a file called make-loopy:
#!/usr/bin/python3
s = """
7f 45 4c 46 02 01 01 00
00 00 00 00 00 00 00 00
02 00 3e 00 01 00 00 00
78 00 01 00 00 00 00 00
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00
01 00 40 00 00 00 00 00
01 00 00 00 05 00 00 00
78 00 00 00 00 00 00 00
78 00 01 00 00 00 00 00
00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00
00 10 00 00 00 00 00 00
eb fe
"""
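import os
with open('loopy', 'wb') as f:
    f.write(bytes([int(b, 16) for b in s.split()]))
# a freshly created file is not executable, so set the mode explicitly
os.chmod('loopy', 0o755)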
It's just a Python script that writes a bunch of hex bytes to a file called loopy, and the glorious result is:
$ ./make-loopy && ./loopy
...wait for however long you like...
^C
$ readelf --file-header --program-headers loopy
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x10078
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 1
Size of section headers: 64 (bytes)
Number of section headers: 0
Section header string table index: 0
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000078 0x0000000000010078 0x0000000000000000
0x0000000000000002 0x0000000000000002 R E 1000
It works! It's a valid ELF file! Success! And it only took 120 well-crafted bytes of overhead.
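For the curious, the 120 bytes of overhead are one 64-byte ELF header followed by one 56-byte program header. The sketch below decodes both with Python's struct module; the field order follows the Elf64_Ehdr and Elf64_Phdr layouts from elf.h, and the variable names are my own shorthand rather than anything official.

```python
import struct

# The same 122 bytes that make-loopy writes out.
hex_dump = """
7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
02 00 3e 00 01 00 00 00 78 00 01 00 00 00 00 00
40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00 01 00 40 00 00 00 00 00
01 00 00 00 05 00 00 00 78 00 00 00 00 00 00 00
78 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
02 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 10 00 00 00 00 00 00 eb fe
"""
data = bytes.fromhex("".join(hex_dump.split()))

# Little-endian layouts of the 64-bit ELF header and program header.
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")

(ident, e_type, machine, version, entry, phoff, shoff, flags,
 ehsize, phentsize, phnum, shentsize, shnum, shstrndx) = ELF_HEADER.unpack_from(data, 0)

(p_type, p_flags, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_align) = PROGRAM_HEADER.unpack_from(data, phoff)

overhead = ELF_HEADER.size + phnum * PROGRAM_HEADER.size
code = data[p_offset:p_offset + p_filesz]

print("overhead:", overhead)        # 120
print("entry point:", hex(entry))   # 0x10078
print("code:", code.hex())          # ebfe
```

The decoded values agree with the readelf output above, which is a reassuring cross-check that the hex dump says what it is supposed to say.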
gdb isn't always helpful
As a side note, my first several attempts at this resulted in
$ ./make-loopy && ./loopy
Segmentation fault
and
$ ./make-loopy && ./loopy
Segmentation fault
and more
$ ./make-loopy && ./loopy
Segmentation fault
but sometimes
$ ./make-loopy && ./loopy
Segmentation fault (core dumped)
instead, before returning some more
$ ./make-loopy && ./loopy
Segmentation fault
so at some point in the process I decided to try using gdb instead of just poking at the hex values semi-randomly. And while gdb is sometimes very helpful in telling you exactly what's going on and letting you prod at your code and memory, in this case all I got was:
$ ./make-loopy && gdb ./loopy
Reading symbols from ./loopy...(no debugging symbols found)...done.
(gdb) run
Starting program: ./loopy
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) info registers
The program has no registers now.
(gdb) x 0x10000
0x10000: Cannot access memory at address 0x10000
Unfortunately, the executable was causing segfaults while being loaded, so there was no program state for gdb to tell me about. À l'impossible nul n'est tenu (no one is bound to do the impossible), so I can't really blame gdb.
If you're curious about the details, the following version of make-loopy is probably more informative and tweakable. You too could get your computer to run some lovingly-crafted bytes!
#!/usr/bin/python3.6
magic_bytes = "7f 45 4c 46" # "This is an ELF file"
elf_class = "02" # 64-bit architecture
byte_order = "01" # little-endian
file_type = "02 00" # an executable
architecture = "3e 00" # x86-64
# the memory address where execution starts
entry_point = "78 00 01 00 00 00 00 00"
# mark the program's memory as executable and readable
# (for writable as well, use "07 00 00 00")
code_segment_flags = "05 00 00 00"
# the file offset where the code starts
# (right after the headers)
code_offset = "78 00 00 00 00 00 00 00"
# there are two bytes of code
code_size = "02 00 00 00 00 00 00 00"
# this value works on my machine
alignment = "00 10 00 00 00 00 00 00"
s = f"""
{magic_bytes} {elf_class} {byte_order} 01 00
00 00 00 00 00 00 00 00
{file_type} {architecture} 01 00 00 00
{entry_point}
40 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
00 00 00 00 40 00 38 00
01 00 40 00 00 00 00 00
01 00 00 00 {code_segment_flags}
{code_offset}
{entry_point}
00 00 00 00 00 00 00 00
{code_size}
{code_size}
{alignment}
eb fe
"""
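import os
with open('loopy', 'wb') as f:
    f.write(bytes([int(b, 16) for b in s.split()]))
# a freshly created file is not executable, so set the mode explicitly
os.chmod('loopy', 0o755)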
I only marked a few of the ELF fields, but do look up the Wikipedia page if you're curious about the other values in there. Some points:
entry_point is 0x10078, written out in little-endian byte order.
entry_point and code_offset have to match modulo alignment: both leave a remainder of 0x78 when divided by 0x1000.
code_size appears twice in the file. That's because one of them is the code size in the file and the other one is the code size in memory once loaded. The code size in memory can be larger, in which case the end will be padded with zeros, but this only works if you change the code_segment_flags to make the program memory writable (as well as readable and executable).
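If you start tweaking these values, a quick sanity check before running the result can save a few segfaults. The sketch below tests the relationship between entry_point, code_offset and alignment, on the understanding that a loadable segment's virtual address and file offset must be congruent modulo its alignment. The helper names le_hex_to_int and is_congruent are made up for this example.

```python
def le_hex_to_int(field):
    """Convert a little-endian hex string such as "78 00 01 00" to an integer."""
    return int.from_bytes(bytes.fromhex(field.replace(" ", "")), "little")


def is_congruent(vaddr, offset, align):
    """True if the address and the file offset agree modulo the alignment."""
    return vaddr % align == offset % align


# The same strings as in make-loopy.
entry_point = "78 00 01 00 00 00 00 00"
code_offset = "78 00 00 00 00 00 00 00"
alignment = "00 10 00 00 00 00 00 00"

vaddr = le_hex_to_int(entry_point)
offset = le_hex_to_int(code_offset)
align = le_hex_to_int(alignment)

print(hex(vaddr), hex(offset), hex(align))
print(is_congruent(vaddr, offset, align))     # the values from make-loopy
print(is_congruent(0x10000, offset, align))   # a mismatched address, for contrast
```

With the values from make-loopy the check passes; moving the entry point to 0x10000 without also moving the code in the file makes it fail.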