Yesterday, MalwareTech posted two shellcode challenges. I spent some time going through the first challenge and wrote a walkthrough of it here. I recommend reading it first, as some of the code is similar. This is a walkthrough of the second challenge using IDA Pro and Python.
Here is part 2 which follows on from shellcode1 (and turns up the heat a little). https://t.co/n26hPHyiaS
— MalwareTech (@MalwareTechBlog) May 24, 2018
Just like the first challenge, the ZIP file contains a README and the actual binary. The README says that this challenge is meant to be solved statically, without a debugger. Like challenge 1, the binary outputs the MD5 of the flag when run.
Jumping right into the code, I open up IDA and see that the start function contains a similar structure to the first challenge. Since I know that the binary will print out the MD5 of the flag when executed, I immediately look at the MessageBox call towards the bottom of the function, and see that the decrypted flag is stored in the variable var_28.
Moving back to the top of the function, the first thing the binary does is load a run of seemingly meaningless bytes into var_28 (which we know holds the flag by the end).
It then allocates space on the heap and stores four things there: LoadLibrary, GetProcAddress, var_28, and the number 36.
With the heap finished, the binary moves on to allocating some memory, copying the shellcode into it, and executing it. This is exactly how the last challenge executed its shellcode.
Now, I step into the shellcode and take a look at what’s being executed. Unlike the last challenge, the shellcode here is much larger and constitutes the majority of what the binary does. After analyzing it, I can break it down into three main parts: dynamic imports, file operations, and a decoder.
The first part is the dynamic imports. This is at the top of the shellcode, where it moves characters one at a time into variables. This technique is often used to hide strings from basic static analysis. After going through each one, I can see the hidden strings: msvcrt.dll, kernel32.dll, fopen, fread, fseek, fclose, GetModuleFileNameA, and rb. (If you’re following along and wondering how I see the characters: wherever IDA shows a hex value, select it and press R on your keyboard to have it displayed as an ASCII character.)
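To make the trick concrete, here is a minimal Python sketch of what pressing R does for a run of byte-sized values. The byte values are simply the ASCII codes for msvcrt.dll, one of the strings named above; they are typed out for illustration rather than copied from the binary, and decode_stack_string is a hypothetical helper name.

```python
def decode_stack_string(byte_values):
    """Turn a sequence of byte values into text, stopping at the first NUL."""
    chars = []
    for value in byte_values:
        if value == 0:  # C strings end at the NUL terminator
            break
        chars.append(chr(value))
    return "".join(chars)


# ASCII codes for "msvcrt.dll" followed by a terminator (illustrative values,
# not copied from the challenge binary).
immediates = [0x6D, 0x73, 0x76, 0x63, 0x72, 0x74, 0x2E, 0x64, 0x6C, 0x6C, 0x00]
print(decode_stack_string(immediates))
```

Because each character is moved separately, the string never appears as one contiguous run of bytes in the file, which is why a plain strings scan misses it.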
Now that the strings are known, I can discover how they are being used. The first thing that this code does is move LoadLibrary into ebp-4 and GetProcAddress into ebp-44. These two variables are then called multiple times in order to load more imports, all of which have to do with file operations.
With all of the imports added, the code moves on to the next part: the file operations. This section of code relies on the imports resolved above. It starts by getting the path to its own executable (GetModuleFileNameA), opens up the binary (fopen), seeks to offset 78 (fseek), reads the next 38 bytes of data into a buffer (fread), and finally closes the file when it’s done (fclose).
In order to find which bytes will be read into the buffer, I open the binary in a hex editor, go to offset 78, and copy out the next 38 bytes.
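The same bytes can be pulled out with a few lines of Python instead of a hex editor. This is only a sketch that mirrors the fopen, fseek, fread, fclose sequence described above, using the offset (78) and length (38) from this walkthrough and assuming the seek is measured from the start of the file, which is how the hex editor step treats it. read_span is a hypothetical helper, and the file name in the usage comment is an assumed name for the challenge binary.

```python
def read_span(path, offset=78, count=38):
    """Return `count` bytes from `path`, starting `offset` bytes into the file."""
    with open(path, "rb") as handle:  # "rb" matches the mode string the shellcode builds
        handle.seek(offset)           # fseek equivalent, measured from the start of the file
        return handle.read(count)     # fread equivalent; the file closes on leaving the block


# Usage (the file name is an assumption; point it at your copy of the binary):
# data = read_span("shellcode2.exe")
# print(list(data))
```

Printing the result as a list of integers gives values that can be pasted straight into a decoding script.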
With these bytes in the buffer, the code moves on to the last section: the decoder. This section iterates through the buffer and XORs each byte with the byte at the corresponding position in var_28, which I know ends up containing the flag.
At this point, I know what the encoded flag is and what it is being XOR’d with. I then write this short Python script (below) in order to decode the flag.
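# Paste the bytes initially loaded into var_28 (the encoded flag) as integers
flag = []
# Paste the bytes copied out of the binary with the hex editor as integers
buffer = []
final = ""
for x in range(min(36, len(flag), len(buffer))):
    # XOR each pair of bytes and keep the result as a character
    final += chr(buffer[x] ^ flag[x])
print(final)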
Finally, I copy the encoded flag from var_28 and the data held in the buffer into my script and run it to get the flag.