Software Reverse Engineering: Ripping Apart Bomb Binary
Baby, I hate little secrets you keep from me x_x
Hello folks, starting with a “long time no see”. Anyway, this is an article series on reverse engineering the famous bomb binary (originally from the CMU Architecture class), compiled for the Intel x86-64 processor architecture, as a part of Arch1001 by Xeno Kovah in OST2. If you haven’t tried it yet, I recommend trying to defuse the bomb by yourself and referring to a writeup only when absolutely necessary. By the end of this article series, we will have crafted the source code for bomb.exe (at least for the parts relevant to defusing the bomb binary). Since it is too long a journey to fit in a single article, I’ve broken it down into smaller chunks (linked below in chronological order). This also allows me to demonstrate the art of software reverse engineering on both Linux (using GNU Gdb) and Windows (using Windbg) platforms.
This writeup is a part of the article series:
- Software Reverse Engineering: Ripping Apart Bomb Binary (Windbg: Windows)
- Software Reverse Engineering: Diffusing Phase 1 (GNU GDB: Linux)
- Software Reverse Engineering: Diffusing Phase 2 (Windbg: Windows)
- Software Reverse Engineering: Diffusing Phase 3 (Windbg: Windows)
- Software Reverse Engineering: Diffusing Phase 4 (Windbg: Windows)
- Software Reverse Engineering: Diffusing Phase 5 (Windbg: Windows)
- Software Reverse Engineering: Diffusing Phase 6 (GNU GDB: Linux)
Feel free to download either version of the bomb binary (Windows or Linux). Both of these can be found here. My reverse engineered solution (in the form of C pseudocode) can be found here (it was originally written entirely while reversing the Linux version of this binary using GNU gdb, in contrast to this article, which uses Windbg).
NOTE: This article doesn’t intend to teach x86-64 Intel assembly programming, debugger usage or any other tool; it is merely a walkthrough describing my approach to reversing in the Microsoft Windows ecosystem. Also, the way demonstrated here isn’t the most efficient way to RE a piece of software, as it relies solely on dynamic analysis to make the scenario a bit more challenging (a more efficient way would be to combine this methodology with the static analysis tools in your arsenal).
Prerequisites
- I recommend using a virtualization platform (Virtualbox or VMware player/workstation) to set up a Windows or Linux OS (depending on the version of the bomb binary you’re planning to defuse).
- Install a debugger on your platform. For the Linux version of the bomb binary, I would recommend GNU Gdb, which comes preinstalled on most modern Linux distributions (if not, on Debian/Ubuntu install it with sudo apt install gdb).
NOTE: For this challenge, I restrict myself to using debuggers alone, and therefore won’t be using any static analysis tool (like IDA Pro or Ghidra). Readers should feel free to follow a hybrid approach.
Poking that binary bomb
Unless we are dealing with a binary program that is known to have malicious attributes, it’s better to start with dynamic analysis and save some time. Launching bomb.exe, we see it greets us with a welcome message and hints at the 6 phases we have to defuse.
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
After greeting, it patiently waits for our input.
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Take that!
BOOM!!!
The bomb has blown up.
For an incorrect input string like the one we entered (Take that!), it immediately explodes and exits. So, the challenge is to find the correct input string for each phase such that the bomb doesn’t explode!
Analyzing hardcoded strings
I’ll be following a top-down approach (decomposition) while reversing this binary. To get an abstract idea of the program, we can start by going through the hardcoded strings inside bomb.exe (using strings.exe) that may be of interest to us.
...
%s: Error: Couldn't open %s
Usage: %s [<input_file>]
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Phase 1 defused. How about the next one?
That's number 2. Keep going!
Halfway there!
So you got that one. Try this one.
Good work! On to the next...
...
I am just a renegade hockey mom.
%d %d
Wow! You've defused the secret stage!
greatwhite.ics.cs.cmu.edu
angelshark.ics.cs.cmu.edu
makoshark.ics.cs.cmu.edu
passphrase
tmp1
tmp2
So you think you can stop the bomb with ctrl-c, do you?
Well...
OK. :-)
Invalid phase%s
%d %d %d %d %d %d
Error: Premature EOF on stdin
GRADE_BOMB
Error: Input line too long
***truncated***
BOOM!!!
The bomb has blown up.
%d %d %s
DrEvil
Curses, you've found the secret phase!
But finding it and solving it are quite different...
Congratulations! You've defused the bomb!
Stack around the variable '
' was corrupted.
The variable '
...
Notice the hardcoded greeting message we saw while starting bomb.exe. Another string, Usage: %s [<input_file>], hints that this program accepts a command line argument, which is probably a file location.
NOTE: To save time, it is given that input can be supplied either from standard input or from a file on disk. Choose whatever feels convenient to you.
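For the curious, the listing above is the kind of output a strings-style scan produces. Here is a minimal C sketch of the idea, assuming the tool simply reports runs of printable ASCII bytes of a minimum length. scan_printable_runs is a hypothetical helper of mine and not how strings.exe is actually implemented (the real tool also looks for Unicode strings, for instance).

```c
#include <stddef.h>
#include <stdio.h>

/* Count runs of at least min_len printable ASCII bytes in buf, and print
 * each run on its own line when out is not NULL.
 * Illustrative sketch only: not the algorithm used by strings.exe. */
size_t scan_printable_runs(const unsigned char *buf, size_t len,
                           size_t min_len, FILE *out)
{
    size_t found = 0;
    size_t start = 0;   /* index where the current candidate run began */
    size_t i;

    for (i = 0; i <= len; i++) {
        /* The end of the buffer is treated like a non-printable byte so
         * that a run touching the end is still reported. */
        int printable = (i < len) && buf[i] >= 0x20 && buf[i] <= 0x7e;
        if (printable)
            continue;

        if (min_len > 0 && i - start >= min_len) {
            found++;
            if (out != NULL) {
                fwrite(buf + start, 1, i - start, out);
                fputc('\n', out);
            }
        }
        start = i + 1;  /* the next run can only begin after this byte */
    }
    return found;
}
```

Reading bomb.exe into a buffer and calling scan_printable_runs(buf, size, 4, stdout) would print every printable run of four or more characters, format strings like %d %d included.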
Mama, what’s next ?
While reverse engineering any target, it’s often helpful to put yourself in the engineer/developer’s shoes (saves me plenty of time). There are several ways to proceed from here, including:
- Hooking library calls (using ltrace/LD_PRELOAD method on Linux) and then analyzing if the program compares any hardcoded string using any string comparison function.
- Brute-forcing, i.e. parsing the read-only data section and using the hardcoded strings (see the output above) as input to bomb.exe, expecting the passphrase to be one among this pool of RO data.
The problem is that we can’t be sure the correct string is hardcoded inside the binary, and even if it is, we can’t predict how it is compared (the author might have implemented their own algorithm rather than relying on library routines). Still, one can try!
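To see why hooking library calls may come up empty, here is a minimal sketch of what a hand-rolled comparison could look like. strings_differ is a hypothetical name and body; I’m not claiming this is what the bomb’s author wrote, only that a routine like this never calls strcmp(), so there is nothing for ltrace or an LD_PRELOAD shim to intercept.

```c
#include <stddef.h>

/* Hypothetical hand-rolled comparison: returns 0 when both strings are
 * identical and 1 otherwise. It makes no libc calls, so library-call
 * hooking would never observe the comparison. */
int strings_differ(const char *a, const char *b)
{
    size_t i = 0;

    /* Walk both strings while they agree and a has not ended. */
    while (a[i] != '\0' && a[i] == b[i])
        i++;

    /* Either a mismatch was found or a ended; the strings are equal only
     * if b ends at the same position. */
    return a[i] != b[i];
}
```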
Launching Windbg and dissecting bomb.exe
Since I like to eliminate assumptions, and since there are 6 phases in total (according to the greeting prompt) to defuse, I guess it’s time to get some insight into bomb.exe. Below are the steps I followed:
- Launch windbg and load debugging symbols (via the .reload /f command).
- Load the bomb binary by opening (Ctrl+e) it inside windbg.
- Set a breakpoint on main() (since programmer-written code most frequently starts with this symbol) by entering bp main, and continue execution by typing g (for go).
- After you hit the breakpoint, unassemble function main by entering uf main.
Below is the disassembly output for main(). The omitted first half of the main() disassembly checks for a user-supplied input file using argc/argv, pushes callee-saved registers, initializes the local stack buffer via rep stos, and executes some compiler-generated instructions influenced by compile-time flags.
NOTE: While there was more to the disassembly of main(), I’ve intentionally omitted the output and the explanation of some important concepts (like shadow store/space), as they are not relevant to defusing the phases.
From the disassembly above, we can see calls to each phase we need to defuse. The source code for main() would look something like below:
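A minimal sketch of that control flow in C, under a few loud assumptions: the status messages come from the strings output earlier, but the helper names read_line and phase_defused, the exit status 8, and every phase body are placeholders of mine (the phases are empty stubs that only count calls). The entry point is named bomb_main, a hypothetical name, so that the sketch can be compiled next to a test driver.

```c
#include <stdio.h>

/* Everything below is a hypothetical stand-in so the sketch compiles on
 * its own; in the real binary each helper is a function to reverse. */
static int phases_called = 0;

static void initialize_bomb(void) { /* real one installs a SIGINT handler */ }
static void phase_defused(void)   { /* assumed per-phase bookkeeping */ }

static const char *read_line(FILE *in)
{
    static char line[80];
    if (fgets(line, sizeof line, in) == NULL)
        line[0] = '\0';            /* the real bomb reacts far less kindly */
    return line;
}

static void phase_1(const char *s) { (void)s; phases_called++; }
static void phase_2(const char *s) { (void)s; phases_called++; }
static void phase_3(const char *s) { (void)s; phases_called++; }
static void phase_4(const char *s) { (void)s; phases_called++; }
static void phase_5(const char *s) { (void)s; phases_called++; }
static void phase_6(const char *s) { (void)s; phases_called++; }

/* Stand-in for main(): pick the input source, then run the six phases. */
int bomb_main(int argc, char *argv[])
{
    FILE *infile;

    if (argc == 1) {
        infile = stdin;                      /* no argument: read stdin */
    } else if (argc == 2) {
        infile = fopen(argv[1], "r");        /* one argument: input file */
        if (infile == NULL) {
            printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);
            return 8;                        /* exit status is assumed */
        }
    } else {
        printf("Usage: %s [<input_file>]\n", argv[0]);
        return 8;
    }

    initialize_bomb();
    printf("Welcome to my fiendish little bomb. You have 6 phases with\n");
    printf("which to blow yourself up. Have a nice day!\n");

    phase_1(read_line(infile)); phase_defused();
    printf("Phase 1 defused. How about the next one?\n");
    phase_2(read_line(infile)); phase_defused();
    printf("That's number 2. Keep going!\n");
    phase_3(read_line(infile)); phase_defused();
    printf("Halfway there!\n");
    phase_4(read_line(infile)); phase_defused();
    printf("So you got that one. Try this one.\n");
    phase_5(read_line(infile)); phase_defused();
    printf("Good work! On to the next...\n");
    phase_6(read_line(infile)); phase_defused();

    if (infile != stdin)
        fclose(infile);
    return 0;
}
```

Calling bomb_main(1, argv) reads the six lines from standard input, while bomb_main(2, argv) reads them from the file named by argv[1].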
Understanding initialize_bomb
Let’s start by understanding what initialize_bomb() does, as a warmup exercise. Setting a breakpoint on it (bp initialize_bomb), we trace into (t) this function.
At address, we see a call to signal(), which is prototyped as:
void __cdecl *signal(int sig, int (*func)(int, int));
which means this function merely registers a signal handler (2nd parameter, func) for a particular signal number (1st parameter, sig). So, how do we see what values are passed as parameters to this function?
Microsoft’s Fast-Call Calling Convention
According to Microsoft’s x64 Application Binary Interface (ABI), a four-register fastcall calling convention is followed by default. Below is quoted from Microsoft’s documentation itself:
Looking at the disassembly, we can see how Microsoft’s compiler complies with the x64 calling convention.
- The mov ecx, 2 instruction places the first integer parameter.
- lea rdx,[bomb!read_six_numbers+0xf0 (00007ff6`037730b0)] places the second parameter, i.e. a signal handler function pointer.
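As a quick reference for reading such call sites, the integer argument slots of the Microsoft x64 convention can be written down as a tiny lookup. ms_x64_int_arg_location is a hypothetical helper for illustration only; floating-point arguments (which use XMM0 to XMM3 instead) are left out.

```c
#include <stddef.h>

/* Microsoft x64 calling convention: the first four integer or pointer
 * arguments travel in RCX, RDX, R8 and R9 (left to right); any further
 * arguments are passed on the stack.
 * index is the zero-based position of the argument. */
const char *ms_x64_int_arg_location(size_t index)
{
    static const char *const regs[] = { "rcx", "rdx", "r8", "r9" };

    if (index < sizeof regs / sizeof regs[0])
        return regs[index];
    return "stack";
}
```

So for signal(sig, func), sig is expected in rcx and func in rdx. A 32-bit write such as mov ecx, 2 zero-extends into rcx, which is why the compiler can use the shorter encoding.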
Therefore, the exact call to signal() is:
signal (2, 00007ff6037730b0);
which means initialize_bomb() registers a handler function for signal #2, i.e. SIGINT (keyboard interrupt), which can be issued to console applications by pressing the key combination Ctrl+c. Below is the pseudocode for initialize_bomb().
void initialize_bomb () {
signal (2, sig_handler_func); // sig #2 : SIGINT
}
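If you’d like to convince yourself of what this pseudocode does, here is a small compilable sketch. The handler body is a stand-in that merely sets a flag (the real handler is reversed in the next section), and initialize_bomb_sketch and sigint_seen are hypothetical names.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is the type that is safe to write
 * from inside a signal handler. */
static volatile sig_atomic_t sigint_seen = 0;

/* Stand-in handler: it only records that SIGINT arrived. */
static void sig_handler_func(int sig)
{
    (void)sig;
    sigint_seen = 1;
}

/* Mirrors the pseudocode: register the handler for SIGINT (signal #2).
 * Returns 0 on success and -1 if signal() reports SIG_ERR. */
int initialize_bomb_sketch(void)
{
    if (signal(SIGINT, sig_handler_func) == SIG_ERR)
        return -1;
    return 0;
}
```

After initialize_bomb_sketch() returns 0, calling raise(SIGINT) (or pressing Ctrl+c in the console) runs the handler and sets sigint_seen to 1.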
Reversing the Signal Handler (Optional)
To ensure that the author-implemented signal handler routine doesn’t do any funny business, let’s have a look at its disassembly:
Assuming we know Microsoft x64 calling conventions, we should now be able to read what routines with what parameters are further called by this handler. Below is the pseudocode for your reference.
Epilogue
This article gives a bird’s eye view of the bomb binary (pseudocode for main) following a top-down approach, and hopefully conveys a basic idea of how to uncover what we want to uncover. Next, we will get into dissecting the phases relevant to solving this challenge. Let’s continue on to defuse phase 1.
Cheers,
Abhinav Thakur
(a.k.a compilepeace)
Connect on Linkedin