Lab 9 - Anti-RE Techniques (2)

In this lab, you will explore anti-reverse-engineering tricks that malware authors use to evade detection and slow down analysts.

Deliverables

Upload the following items to the Lab 9 Assignment on Canvas when finished.

  • The PDF file containing the required documentation

Part 1 - IP and URL Obfuscation

While performing behavioral analysis on a malware sample, you observe in Wireshark that the malware attempts to contact a specific IP address. Is it always the same IP address, or can the address change? One easy way to find out would be to look for the IP address as a string in the file. If you find that string and track its usage, you can confirm or reject your suspicion that the same IP address is always used.

Launch your REMnux virtual machine and copy over the 1-xxxx.zip malware. Unzip the malware.

Question Answer
Use the strings command-line utility to search the .exe file for ASCII strings. What is the first string found in the file?
Use the strings command-line utility to search the .exe file for ASCII strings. What is the last string found in the file?
Use the strings command-line utility to search the .exe file for ASCII strings. Do you see any IP addresses in the file?
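If you would rather script the search than scan the strings output by eye, the idea can be sketched in a few lines of Python. This is only an approximation of what strings does (runs of printable ASCII, four characters or longer), and the function names ascii_strings and find_ipv4 are made up for this illustration.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

# Dotted-quad pattern where each octet is limited to 0-255.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(r"\b(?:%s\.){3}%s\b" % (OCTET, OCTET))

def find_ipv4(data: bytes):
    """Return the extracted strings that contain something shaped like an IPv4 address."""
    return [(off, s) for off, s in ascii_strings(data) if IPV4.search(s)]

# Usage sketch (the filename is a placeholder):
# with open("malware.exe", "rb") as f:
#     for off, s in find_ipv4(f.read()):
#         print(hex(off), s)
```

Expect false positives: version numbers such as 1.2.3.4 have the same shape as an IP address, so every hit still needs a manual look.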

It appears that there is no obvious IP address string embedded in the file. But perhaps it is just obfuscated? Use the bbcrack tool, part of the Balbuzard toolkit. This script can extract “patterns of interest” from malware files, such as IP addresses, URLs, embedded files, and common strings. It tests a variety of obfuscation techniques, such as XOR, ROL (bitwise rotate left), and ADD, as well as combinations of those techniques. It also scores each technique heuristically, so that the malware analyst can prioritize the techniques that produce the most patterns of interest. The script writes an output file for each technique for manual review.

Usage: bbcrack -l 1 malware.exe to test malware.exe with one level of transformation.
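To build intuition for what a one-level brute force does, here is a minimal Python sketch. It is not bbcrack's implementation: the two patterns, the hit-count scoring, and the names apply_table, candidate_transforms, and rank_transforms are all assumptions made for illustration. It tries every single-byte XOR key, every ADD value, and every ROL count, then ranks the transformations by how many patterns of interest appear in the result.

```python
import re

# A tiny, illustrative subset of "patterns of interest".
PATTERNS = [
    re.compile(rb"https?://[\x21-\x7e]{4,}"),   # URL-like
    re.compile(rb"(?:\d{1,3}\.){3}\d{1,3}"),    # dotted-quad-like
]

def apply_table(data: bytes, fn) -> bytes:
    """Apply a byte -> byte function to every byte using a 256-entry lookup table."""
    table = bytes(fn(b) & 0xFF for b in range(256))
    return data.translate(table)

def candidate_transforms():
    """Yield (name, parameter, function) for single-level XOR, ADD, and ROL."""
    for key in range(1, 256):
        yield "xor", key, (lambda b, k=key: b ^ k)
        yield "add", key, (lambda b, k=key: b + k)
    for n in range(1, 8):
        # Rotate left within 8 bits; apply_table masks the result to one byte.
        yield "rol", n, (lambda b, n=n: (b << n) | (b >> (8 - n)))

def rank_transforms(data: bytes):
    """Return [(name, parameter, hit_count)] sorted by hit_count, best first."""
    scored = []
    for name, param, fn in candidate_transforms():
        decoded = apply_table(data, fn)
        hits = sum(len(p.findall(decoded)) for p in PATTERNS)
        scored.append((name, param, hits))
    return sorted(scored, key=lambda r: r[2], reverse=True)
```

Because each transformation maps one byte to one byte, a decoded string sits at the same offset in the transformed data as its obfuscated form does in the input.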

Question Answer
Use the bbcrack command-line utility to search the .exe file for obfuscated strings. Ignore the "identity" transformation. What is the name and score of the next highest-ranking transformation?
Tip: Look at the bottom of the output report, in the HIGHEST SCORES section.
Use the bbcrack command-line utility to search the .exe file for obfuscated strings. Ignore the "identity" transformation. How many heuristic-based matches were found with the highest-ranking (non-identity) transformation?
Tip: In the middle of the output report will be a section on each transformation. Look for a section starting with "Found 22 - Any word longer than 6 chars".

The bbcrack utility has placed output files in the current directory. Each file is the original binary with the transformation applied to every byte. As such, most of the file will be binary gibberish, but there should be a few plaintext strings sprinkled throughout. Find the file for your highest-ranking (non-identity) transformation, and then use the strings utility to view those strings.

Question Answer
What URL did you find in the output file?
What offset is this URL in the file?
Tip: Use the --radix=x argument to the strings utility to print the offset of each string as it exists in the file.
What affiliate ID (affid) did you find in the output file? (This may refer to the authors of the malware)
What offset is this affiliate ID in the file?
What obfuscation technique was used to hide these important strings?

Launch your Windows virtual machine and copy over the 1-xxxx.zip malware. Unzip the malware and load it into IDA. You want to find the original obfuscated URL string in IDA, and you have the offset of the string as it exists in the raw PE file. Use the Jump->Jump to File Offset feature to go to that raw offset into the PE file. Do not use the normal Jump->Jump to Address feature, because that expects a virtual address, that is, an address after the PE file has been mapped into memory. (Most notably, all addresses in this PE file are offset by the IMAGEBASE value of 0x140000000, plus additional offsets by section. Let IDA worry about the exact structure and address math.)
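For the curious, the address math that IDA performs can be sketched as follows. The conversion itself follows the standard PE section fields, but the section table in the usage note is invented for illustration and does not describe the lab sample; read the real values from the section headers.

```python
from typing import NamedTuple, Optional, Sequence

class Section(NamedTuple):
    name: str
    virtual_address: int   # RVA of the section once loaded (VirtualAddress)
    raw_pointer: int       # offset of the section in the file (PointerToRawData)
    raw_size: int          # bytes the section occupies in the file (SizeOfRawData)

def file_offset_to_va(offset: int, image_base: int,
                      sections: Sequence[Section]) -> Optional[int]:
    """Map a raw file offset to a virtual address, or None if no section contains it."""
    for s in sections:
        if s.raw_pointer <= offset < s.raw_pointer + s.raw_size:
            # Distance into the section is the same on disk and in memory.
            return image_base + s.virtual_address + (offset - s.raw_pointer)
    # Offsets in the PE headers or in overlay data are not handled by this sketch.
    return None
```

With a hypothetical .data section at RVA 0x3000 whose raw data starts at file offset 0x2200, file offset 0x2250 maps to 0x140000000 + 0x3000 + 0x50 = 0x140003050.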

Question Answer
Take a screenshot of IDA showing the obfuscated URL string as it exists in memory (IDA View-A)
Take a screenshot of IDA in text view showing the assembly code that will deobfuscate the URL string. Given the obfuscation technique you identified above, this code should be easy to recognize. Circle the code in question or crop your image. Note that the deobfuscation is being done twice, for two different strings. Be sure you are looking at the correct string.
Take a screenshot of IDA in graph view showing the assembly code that will deobfuscate the URL string. (This is the same code as above, but the graph view makes the loop obvious). Circle the code in question or crop your image.

You could follow the URL string further in IDA and track its usage throughout the malware.

Part 2 - String Obfuscation on Stack

Another way malware might attempt to hide a string is by assembling it, character by character, onto the stack. These "stack strings" are unlikely to appear in traditional string search tools because the string value is interspersed with assembly instructions.

Launch your Windows virtual machine and copy over the 2-xxxxx.zip malware. Unzip the malware and load it into IDA. Look at the region of code from 0x40133D to 0x4013B8. This code assembles a string character by character onto the stack, and then calls the function sub_401604() with pointers to locations of strings on the stack.

Question Answer
Take a screenshot of IDA showing the region of code from 0x40133D to 0x4013B8.
Take a screenshot of IDA showing the region of code from 0x40133D to 0x4013B8 after you have told IDA to display each hex value being moved onto the stack as an ASCII character.
Tip: Right-click on each hex value and choose the ASCII character representation, or press the 'R' shortcut key.
What is the complete string?
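The manual decoding above amounts to replaying the byte writes into a small model of the stack and then reading the bytes back in address order. A minimal Python sketch, assuming each write is a single-byte MOV of an immediate value; the function name rebuild_stack_string and the sample writes are hypothetical and are not taken from the lab sample:

```python
def rebuild_stack_string(writes):
    """Rebuild a string from (stack_offset, byte_value) pairs listed in execution order.

    Later writes to the same offset overwrite earlier ones, as they would on the
    real stack. Reading stops at the first NUL terminator.
    """
    memory = {}
    for offset, value in writes:
        memory[offset] = value & 0xFF
    out = bytearray()
    for offset in sorted(memory):       # lowest address first
        if memory[offset] == 0:
            break
        out.append(memory[offset])
    return out.decode("ascii", errors="replace")

# Writes in jumbled order, as a compiler or obfuscator might emit them.
example = [(-0x0E, 0x6C), (-0x10, 0x63), (-0x0D, 0x63), (-0x0F, 0x61), (-0x0C, 0x00)]
print(rebuild_stack_string(example))    # prints "calc"
```

The sketch assumes the characters occupy contiguous stack slots; gaps between offsets are silently skipped, so check the offsets in IDA before trusting the result.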

Manually decoding stack strings is "fun enough", especially when the character order and memory addresses are all jumbled up, but what if you didn't know where to look to find them? Fortunately, search tools exist to locate these patterns.

Launch your REMnux virtual machine and copy over the 2-xxxx.zip malware. Unzip the malware.

Question Answer
Use the strdeob.pl utility to search for potential stack strings in the malware that use a single MOV instruction to place a value onto the stack. What strings are found?
Tip: Usage syntax is: strdeob.pl malware.exe (note that the output is messy).
Use FLOSS - the FireEye Labs Obfuscated String Solver - to search for potential stack strings in the malware. What "stack strings" are found?
Tip: Usage syntax is: floss malware.exe > malware-output.txt (this writes the report to a text file). Open the text output and look for the section "FLOSS extracted n stackstrings" at the bottom of the report.

Note that FLOSS is much more than a simple search script for MOV assembly instructions and stack strings. Rather, it uses heuristics to identify blocks of code that could decode a string, and then emulates the behavior of the program to see what output those code sections could produce. The FLOSS algorithm is described as:

  1. Analyze control flow of malware to identify functions, basic blocks, etc.
  2. Use heuristics to find potential decoding routines
  3. Brute force emulate all code paths among basic blocks and functions
  4. Snapshot emulator state (registers, memory) at appropriate points
  5. Extract arguments to decoder functions from emulator snapshots
  6. Emulate decoder functions using extracted arguments and emulator state
  7. Diff memory state from before and after decoder emulation
  8. Extract human-readable strings from memory state difference
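Steps 7 and 8 of the list above are simple enough to sketch on their own. The Python below is not FLOSS code: it assumes you already have two equal-sized memory snapshots (FLOSS obtains them by emulation), and the names diff_regions and strings_from_diff are made up for this illustration.

```python
import re

def diff_regions(before: bytes, after: bytes):
    """Yield (offset, new_bytes) for each contiguous run of bytes that changed."""
    if len(before) != len(after):
        raise ValueError("snapshots must be the same size")
    start = None
    for i, (b, a) in enumerate(zip(before, after)):
        if a != b and start is None:
            start = i                       # a changed run begins here
        elif a == b and start is not None:
            yield start, after[start:i]     # the run ended at the previous byte
            start = None
    if start is not None:
        yield start, after[start:]

def strings_from_diff(before: bytes, after: bytes, min_len: int = 4):
    """Return (offset, text) for printable ASCII runs found only in changed memory."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    found = []
    for offset, chunk in diff_regions(before, after):
        for m in pattern.finditer(chunk):
            found.append((offset + m.start(), m.group().decode("ascii")))
    return found
```

Restricting the search to changed bytes is what separates decoded strings from the static ones that strings would already report. One limitation of this sketch: a decoded byte that happens to equal its encoded value splits the run in two.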

Part 3 - Something Something Malware

Launch your Windows virtual machine and copy over the 3-xxxxx.zip malware. Unzip the malware, and load it into PEStudio.

Question Answer
If one of your behavioral analysts told you "I think this is malware", do you see any artifacts in the original .exe that would support that conclusion?

Launch your REMnux virtual machine and copy over the 3-xxxxx.zip malware. Unzip the malware, and try some of our newly-learned string analysis tools to look for more artifacts. We are specifically looking for a way that the malware could be unpacking itself into something more malicious, despite not including many DLLs or importing library functions of interest. (An alternate method of discovery could involve using Process Monitor to track system calls, and ProcDOT to visualize them.)

Question Answer
strings: Are there any plain ASCII strings corresponding to previously-unseen Windows API functions or DLLs in this file? If so, what are those string(s)?
bbcrack: Are there any XOR, ROL, or ADD obfuscated strings corresponding to previously-unseen Windows API functions or DLLs in this file? If so, what are those strings?
strdeob.pl: Are there any stack strings corresponding to previously-unseen Windows API functions or DLLs in this file? If so, what are those strings?
floss: Are there any custom-obfuscated strings corresponding to previously-unseen Windows API functions or DLLs in this file? (Ignore the FLOSS static ASCII strings section; that is just what 'strings' would report.) If so, what are those strings?

Let's take one suspicious function that FLOSS found for us, RtlDecompressBuffer, and investigate how it is used in the malware. By its name alone, you could imagine it being used by an unpacking routine. To sidestep the obfuscation in getting to this point, let's investigate this function in the debugger.

Question Answer
Set a breakpoint on the API call of interest: SetBPX RtlDecompressBuffer. What happens when you try to run to that breakpoint?
Tip: Run past the EntryPoint breakpoint. Hit the run button as many times as you want - you'll end up stuck at the same place.

Since the malware is heavily obfuscated, perhaps the authors also employed anti-debugging tricks? Let's try the ScyllaHide plugin. Restart Execution, and go to Plugins->ScyllaHide->Load Profile->VMProtect x86/x64.

Question Answer
Keep your breakpoint on the API call of interest: RtlDecompressBuffer. What happens when you try to run to that breakpoint?
Tip: Run past the EntryPoint breakpoint, and if you get another exception, try to Run past it. This malware is a little buggy in recent versions of Windows 10.
Locate the UncompressedBuffer argument to the RtlDecompressBuffer function. This is a pointer to where the resulting, uncompressed data should be stored. Load this buffer in the dump. Prior to the execution of RtlDecompressBuffer, what value is each element of the buffer set to?
Tip: Find which element on the stack holds the argument you seek, and then right-click it and choose Follow in Dump.
Execute the RtlDecompressBuffer API call until you return to the main body of the malware executable. Take a screenshot of your Dump window, showing the buffer full of "Data" now.
Tip: Debug->Run to User Code
This buffer doesn't look useful to us. Perhaps it's been decompressed, but still needs to be deobfuscated? Choose Debug->Step Over to selectively advance through the program, while you ponder your life choices that brought you to this point. Take a screenshot of the Dump when the buffer suddenly contains something that is "obviously interesting".

Save the "obviously interesting" data to a file. Right-click on the Dump and choose Follow in Memory Map. In the Memory Map, don't click anything; just scroll up and down. The highlighted line in the memory map should correspond to the range of addresses you saw in the Dump. You could switch between the panels to verify this. Once you're sure you have the correct region, right-click on the highlighted line and choose Dump to File. Save the file to the desktop, and then open it in PEStudio.
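If the dumped region does not open cleanly, the executable may not start at the very first byte of the dump. A minimal Python sketch for locating candidate PE images in a dump file, relying only on the standard header layout ("MZ" at the start of the DOS header, a 4-byte e_lfanew field at offset 0x3C, and the "PE\0\0" signature it points to); the function name find_pe_headers is made up for this illustration:

```python
import struct

def find_pe_headers(data: bytes):
    """Return the offsets where a DOS header points to a valid PE signature."""
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(data):                 # room for a full DOS header
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe = pos + e_lfanew
            if data[pe:pe + 4] == b"PE\x00\x00":
                hits.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return hits

# Usage sketch (the filename is a placeholder):
# with open("dump.bin", "rb") as f:
#     print([hex(off) for off in find_pe_headers(f.read())])
```

A hit at a nonzero offset suggests trimming the leading bytes before loading the file into other tools. Keep in mind that an image dumped from memory is laid out by virtual address rather than by file offset, so some tools may still misread its sections.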

Question Answer
How many libraries were loaded in the original .exe?
How many libraries are loaded in this new file?
How many functions were imported in the original .exe?
How many functions were imported in this new file?
What artifacts do you see in the new file to support the statement of your behavioral analyst who said the file looks like ransomware?

More information on this particular malware sample is available online for those who are curious. It seems to be a bit glitchy in the latest updates to Windows 10 that we use. Perhaps a slightly earlier OS would work without the access violation errors previously encountered in the debugger? It's always good to have a variety of analysis VMs at your disposal: WinXP, Win 7, Win 10. Even better would be VMs both before and after major service packs.