VBInject and VBCrypt are names invented by the industry, but in the underground these tools are generally referred to as RunPE, so I'll use that name. Also it's easier to type.
What makes it so difficult to unpack and analyze a RunPE-packed executable? There are a few key differences between RunPE and traditional packers/crypters/protectors:
- The way that it works. The packed executable re-launches itself as a new process and then overwrites that process's memory. This is drastically different than most other packers which overwrite their own process's memory. Generic unpacking tools are often unequipped to deal with this.
- The unpacking code itself is written in VB6, which ends up as interpreted bytecode (p-code). The only way to figure out the unpacking algorithm is to reverse engineer this code. It's possible to build VB apps as native code, but for whatever reason, many RunPE variants will only run if they're built as p-code and not as native executables. This also makes it harder to reverse engineer... (There are now some versions that use VB.NET, which isn't p-code or native, but .NET CIL)
- It's ridiculously easy to modify and create new versions if you can read VB code, which doesn't exactly take a PhD. Because the packing/unpacking code can be modified so easily, you never know which algorithm will be used to extract the original EXE file without reverse engineering the VB code (which could be p-code, native assembly, or even .NET IL). It could be a simple XOR or an actual decryption or decompression routine. The VB overhead adds an extra layer of obfuscation to get in the way of reverse engineering.
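For the simplest case mentioned above (a plain XOR), an analyst can sometimes recover the key without reading any VB code. This is only a sketch under a strong assumption: the payload is XORed with a single repeating byte and its plaintext starts with the "MZ" signature. The function names are made up for illustration, and real variants may use anything else.

```python
def find_single_byte_xor_key(blob: bytes):
    """Return the one-byte XOR key that decodes the first two bytes to 'MZ', else None."""
    if len(blob) < 2:
        return None
    key = blob[0] ^ ord("M")          # key implied by the first byte
    if blob[1] ^ key == ord("Z"):     # confirm it with the second byte
        return key
    return None


def xor_decode(blob: bytes, key: int) -> bytes:
    """XOR every byte of blob with the same one-byte key."""
    return bytes(b ^ key for b in blob)
```

If the key check fails, the sample is using something other than a single-byte XOR and the VB code has to be reversed after all.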
There are TONS of variants of RunPE out there, and every wanna-be packer/crypter writer writes their own. It's VB6, anyone can do it, just search and replace a few strings and you have a custom copy. There are even RunPE generators that contain fabulous graphics:
This one advertises itself as a FUD RunPE generator. FUD means "fully undetectable" to the kids in the underground - it means no AV software will detect your beautiful malware (although packers advertised as FUD don't stay that way for long). All this thing does is spit out a copy of the RunPE source (a VB.NET version) with randomized class and function names, with some garbage code thrown in. It's enough to fool many AV products. I won't say which ones, to avoid insulting my friends who work at those companies.
So where did RunPE come from? Who knows... the first reference I saw to it was early 2009, it started getting really popular in mid-2009, and now everyone is using it. It's been evolving since the first public release, too. For example, the first version used Kernel32.WriteProcessMemory to write memory to the new process. Later versions used Ntdll.NtWriteVirtualMemory to screw up my breakpoints. Newer versions don't even use the regular VB6 DLL importing functionality and actually use thunks written in assembly code to load DLL modules and call external functions the hard way (check out NTRunPE, cNtPEL, etc). Every few months an even-more-obfuscated version will appear.
Behind all of the obfuscation, they all work about the same way:
- Decrypt/unpack/deobfuscate the original EXE file in memory (stored as a byte array in the VB code)
- Call CreateProcess() on a target EXE (usually the same EXE that's currently executing) using the CREATE_SUSPENDED flag. This maps the executable into memory and it's ready to execute, but the entry point hasn't executed yet.
- Call NtUnmapViewOfSection() to unmap the virtual address space used by the new process
- Call VirtualAllocEx() to re-allocate the memory in the process's address space to the correct size (the size of the new EXE)
- Call WriteProcessMemory() to write the PE headers and each section of the new EXE (unpacked in the first step) to the virtual address where it expects to be loaded (calling VirtualProtectEx() to set the protection flags that each section needs).
- Call SetThreadContext() and then ResumeThread() to start executing the new executable.
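From the analyst's side, the sequence above is a recognizable fingerprint. The sketch below checks whether a recorded API trace contains those steps in order. The trace format (a plain list of API names) and the function name are assumptions for illustration; they are not the output of any particular tool. The alias table folds the A/W and Nt/Zw variants into the names used in the list.

```python
# Steps from the list above, in the order RunPE performs them.
RUNPE_SEQUENCE = [
    "CreateProcess",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
]

# Lower-level or A/W variants mapped onto the names above.
ALIASES = {
    "CreateProcessA": "CreateProcess",
    "CreateProcessW": "CreateProcess",
    "ZwUnmapViewOfSection": "NtUnmapViewOfSection",
    "NtAllocateVirtualMemory": "VirtualAllocEx",
    "NtWriteVirtualMemory": "WriteProcessMemory",
    "NtSetContextThread": "SetThreadContext",
    "NtResumeThread": "ResumeThread",
}


def matches_runpe_pattern(trace):
    """True if every RunPE step occurs in order; unrelated calls may be interleaved."""
    calls = iter(ALIASES.get(name, name) for name in trace)
    # `step in calls` consumes the iterator up to the match, so this is an
    # ordered-subsequence test rather than a simple membership test.
    return all(step in calls for step in RUNPE_SEQUENCE)
```

A match is only a hint: the same calls in the same order do not prove the sample is RunPE, and a variant that skips or replaces a step will not match.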
To see how this works, look at the relevant code from the original RunPE:
This version didn't perform any encryption or obfuscating of the EXE code, all of the imports were visible from various PE tools, and it used MessageBox calls to report errors. Running strings on the binary would give you a good idea of what's going on. Overall, it's not very stealthy.
Now here's a more recent and highly obfuscated version:
You can see that the new version uses obfuscated DLL calls - so running strings or checking the imports with dumpbin won't give you much of a clue about what this thing is doing. Even running it through IDA with VB6 scripts or a VB6 decompiler won't help much.
That GetProcAddress call isn't making a DLL function call - it's actually traversing the module's import directory in memory. In VB code! That code is beyond the scope of this article, but you can find a version of it here (Google for cNtPEL if the link is dead).
So now we know how RunPE works. Because it extracts the code into a newly-launched process and not its own process, many auto-unpackers will fail to detect that the sample is even packed - it doesn't modify its own memory space. It requires some special techniques to unpack. It's not hard though - this is actually one of the easiest packers to unpack once you know how. That's because it doesn't require any rebuilding of import tables, finding the original entry point (OEP), fixing relocations, or realigning the sections.
The trick (using any debugger) is to first set a breakpoint on CreateProcessW. This will let us catch calls to CreateProcessA as well. This breakpoint won't work if someone writes a version of RunPE that talks directly to the CSRSS subsystem to launch a process, but let's just hope that never happens. After breaking on CreateProcessW, we can make sure that the 6th argument (at ESP+24) has bit 0x00000004 set - this is the value for the CREATE_SUSPENDED flag. If this flag isn't set, something else is going on other than RunPE.
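The flag check can be written down in a few lines. This sketch assumes a 32-bit stdcall target and that you have already copied the raw bytes starting at ESP out of the debugger at function entry; the helper names are made up for illustration.

```python
import struct

CREATE_SUSPENDED = 0x00000004


def stack_arg(stack: bytes, n: int) -> int:
    """Return the n-th (1-based) DWORD argument from bytes read at ESP on entry.

    [ESP] holds the return address, so argument n sits at ESP + 4*n.
    """
    return struct.unpack_from("<I", stack, 4 * n)[0]


def is_create_suspended(stack: bytes) -> bool:
    """Check the 6th argument (dwCreationFlags, at ESP+24) for CREATE_SUSPENDED."""
    return bool(stack_arg(stack, 6) & CREATE_SUSPENDED)
```

Testing the bit rather than comparing the whole value matters, because the caller may combine CREATE_SUSPENDED with other creation flags.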
Now we can step out of that function and set a breakpoint on NtWriteVirtualMemory (WriteProcessMemory calls that, so we'll catch versions that use that call too). The reason to wait until after CreateProcessW finishes is that NtWriteVirtualMemory is called a few times during the creation of a process - we only want to break when RunPE is calling it. When the breakpoint is set, continue executing. The next break should be when RunPE is writing the new EXE bytes over the new process that was created. The 3rd argument to NtWriteVirtualMemory (at ESP+12 on entry, counting the same way as ESP+24 for the 6th argument above) is a pointer to the buffer that's getting copied from. This will be the beginning of the EXE file. Examine the memory with your debugger to make sure it looks like an EXE file.
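"Looks like an EXE file" can be checked mechanically. A minimal sketch, assuming you have read at least the first few hundred bytes of the buffer out of the debugger (the function name is hypothetical): it checks the "MZ" signature, follows e_lfanew at offset 0x3C, and checks for the "PE\0\0" signature there.

```python
import struct


def looks_like_pe(buf: bytes) -> bool:
    """Cheap sanity check: DOS 'MZ' header whose e_lfanew points at 'PE\\0\\0'."""
    if len(buf) < 0x40 or buf[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    # A slice past the end of buf is simply shorter, so this comparison is safe.
    return buf[e_lfanew:e_lfanew + 4] == b"PE\0\0"
```

This only validates the two signatures. A buffer that passes can still be truncated or deliberately malformed.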
This EXE file is (probably) in disk format - it hasn't been mapped to memory. RunPE does the mapping itself. This means we can just dump the bytes to file. The tricky part is figuring out how many bytes to write. The quick and dirty method is to use the PE optional header's SizeOfImage field, but that will get you some extra bytes because the on-disk size is almost always less than SizeOfImage. My preferred method is a bit more elegant - I calculate the size of the PE file by using the PointerToRawData and SizeOfRawData members of the last section. It's possible that this can miss bytes in a PE file with stuff tacked on the end, but I haven't run into any PE files packed with RunPE that do this. In fact, the RunPE code itself stops copying the bytes at the end of the last section.
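The size calculation can be sketched as follows. It assumes the buffer holds at least the PE headers and section table in disk layout; the function name is made up for illustration. One deliberate difference from the description above: it takes the largest PointerToRawData + SizeOfRawData over all sections instead of reading only the last one, which gives the same answer whenever the section table is in file order.

```python
import struct


def pe_raw_size(buf: bytes) -> int:
    """On-disk size of a PE file: the end of the section that extends furthest."""
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if buf[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    # COFF header follows the 4-byte signature: NumberOfSections is at +6,
    # SizeOfOptionalHeader at +20 (both relative to e_lfanew).
    (num_sections,) = struct.unpack_from("<H", buf, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", buf, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size      # first section header
    end = 0
    for i in range(num_sections):
        # Each header is 40 bytes; SizeOfRawData is at +16, PointerToRawData at +20.
        size_raw, ptr_raw = struct.unpack_from("<II", buf, table + 40 * i + 16)
        end = max(end, ptr_raw + size_raw)
    return end
```

As noted above, anything appended after the last section (an overlay) is not counted, so the result is a lower bound on the original file size.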
After the 2nd time I had to go through this process, I wrote an IDC script for IDA to automate the process for me. It would be possible to do this in any scriptable debugger, the Win32 debugging API, or a toolkit like TitanEngine or EasyHook, but IDA has become my debugger of choice lately so I used that. Here's the script:
It will launch the debugger, set some breakpoints, then write the unpacked PE file (if found) to "unpacked.exe". This script works on all of the in-the-wild RunPE-packed malware I tried it on, but there are a lot of ways around it. That's why there's error checking and it stops the script if anything strange happens.
When the script finishes, the debugger is still running and both processes are suspended. It's up to you to terminate the debugged process (and the child process), but that part could easily be automated as well.