Exploit

There are some nice tutorials about malicious Office documents around the web, but as far as I’ve seen, the sample I dealt with uses an unusual method to hide the shellcode. Great tools like OfficeMalScanner and others are unable to handle this particular scenario, so here is the story of my adventure inside this RTF file.

Header of the malicious file

The first bytes of the file tell me something about the content: it’s a common header for an RTF document. First of all I tried RTFScan (which is part of OfficeMalScanner), without luck. The scanner is able to recognize an OLE document followed by object data, but it fails to retrieve any shellcode from it.

I quickly decided to get my hands dirty and inspect the content of the document with a hex editor. The aim of the analysis is to find the shellcode that is executed once the exploit triggers. Looking at the byte sequence I tried to locate some clues (e.g. a run of 0x90 bytes), but I failed miserably. There seems to be no trace of any piece of code resembling a shellcode.
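The kind of clue I was looking for can be sketched in a few lines of C. This is only an illustration of the idea of hunting for a NOP sled; the function name and the minimum run length are my own choices, and on this sample such a search finds nothing.

```c
#include <stddef.h>

/* Returns the offset of the first run of at least min_run consecutive
 * 0x90 bytes in buf, or -1 if there is no such run.
 * Hypothetical helper, not taken from any existing tool. */
long find_nop_sled(const unsigned char *buf, size_t len, size_t min_run)
{
    size_t run = 0;

    if (min_run == 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x90) {
            run++;
            if (run >= min_run)
                return (long)(i + 1 - run);  /* start of the run */
        } else {
            run = 0;
        }
    }
    return -1;
}
```

A run of a few 0x90 bytes can of course appear in harmless data too, so a hit is a hint to look closer and not a proof.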

Shellcode should be visible and not encrypted
With this concept in mind I started to cut some parts out of the RTF file. The idea was to isolate blocks of bytes, so that it would be easier to recognize the shellcode. I started by cutting the header of the file and all the parts containing sequences of strings, since they probably don’t represent executable code:

I can remove this sequence of chars

Doing so reduced the size of the file a lot, but there were still many bytes to check, so I decided to investigate the area around the OLE definition. You can easily locate it by searching for the byte sequence D0 CF 11 E0.
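If you prefer code to a hex editor, the search can be sketched like this. Object data in an RTF file is normally stored as hex text, so I look for the ASCII form "d0cf11e0" (in either case) and, separately, for the four raw bytes. Both function names are made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Offset of the ASCII hex text "d0cf11e0" (case-insensitive) inside
 * text, or -1 if it is not there. Hypothetical helper. */
long find_ole_hex_signature(const char *text, size_t len)
{
    static const char sig[] = "d0cf11e0";
    const size_t sig_len = sizeof sig - 1;

    for (size_t i = 0; i + sig_len <= len; i++) {
        size_t j = 0;
        while (j < sig_len &&
               tolower((unsigned char)text[i + j]) == sig[j])
            j++;
        if (j == sig_len)
            return (long)i;
    }
    return -1;
}

/* Offset of the raw bytes D0 CF 11 E0 inside buf, or -1. */
long find_ole_raw_signature(const unsigned char *buf, size_t len)
{
    static const unsigned char sig[4] = {0xD0, 0xCF, 0x11, 0xE0};

    for (size_t i = 0; i + sizeof sig <= len; i++)
        if (memcmp(buf + i, sig, sizeof sig) == 0)
            return (long)i;
    return -1;
}
```

Note that the text search only matches when the eight hex digits are contiguous; whitespace or other tricks between the digits would need a smarter scan.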

OLE Compound File signature

A few bytes further down I reached the objdata definition, which reveals the exploit type:

Exploit type: cve-2012-0158

Converting the sequence “4D53436F6D63746C4C69622E4C697374566965774374” to bytes gives the string “MSComctlLib.ListViewCt”. I’m almost sure the exploit takes advantage of an old vulnerability in mscomctl.ocx (CVE-2012-0158) to execute arbitrary code, but as far as I remember the string should be “MSComctlLib.ListViewCtrl.2“. Where are the other characters? I need ‘r’, ‘l’, ‘.’ and ‘2’.
The answer is inside the “\bin” definition following the incomplete string. In RTF the \binN keyword announces N bytes of raw binary data, and in this specific case N is 4. The 0x20 byte (a space) that follows the keyword is a delimiter and is not part of the 4 bytes. So the defined bytes are 72, 6c, 2e, 32, and they form the substring I need: “rl.2”. After these bytes and before the next bin definition there are some more characters. They are hex-encoded text, so they don’t stand for the literal string “00000000”; they simply represent the byte sequence 00, 00, 00, 00. This mix of raw and hex-encoded bytes is the key to the whole idea used to hide the shellcode.
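To make the mix of encodings concrete, here is a minimal sketch of a decoder that rebuilds the real bytes. It assumes the input is only the objdata payload (hex text plus “\binN ” runs) and nothing else; the function names are mine and this is not how RTFScan or any other tool actually works.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decodes hex text mixed with "\binN " raw runs into out.
 * Returns the number of bytes written, or (size_t)-1 when the input
 * is truncated or out is too small. Hypothetical helper. */
size_t decode_objdata(const char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t i = 0, n = 0;
    int hi = -1;                      /* pending high nibble, -1 if none */

    while (i < in_len) {
        if (in_len - i >= 4 && memcmp(in + i, "\\bin", 4) == 0) {
            size_t count = 0;
            i += 4;
            while (i < in_len && isdigit((unsigned char)in[i]))
                count = count * 10 + (size_t)(in[i++] - '0');
            if (i < in_len && in[i] == ' ')
                i++;                  /* delimiter space, not data */
            if (count > in_len - i || count > out_cap - n)
                return (size_t)-1;
            memcpy(out + n, in + i, count);   /* raw bytes, copied as-is */
            n += count;
            i += count;
            continue;
        }
        int v = hex_val((unsigned char)in[i++]);
        if (v < 0)
            continue;                 /* skip whitespace and the like */
        if (hi < 0) {
            hi = v;
            continue;
        }
        if (n >= out_cap)
            return (size_t)-1;
        out[n++] = (unsigned char)((hi << 4) | v);
        hi = -1;
    }
    return n;
}
```

Fed with the hex string above followed by `\bin4 rl.2` and `00000000`, it should give back the complete class name followed by four zero bytes.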

The obfuscation
With this particular behaviour in mind I looked at the file from another perspective, and it helped a lot. In the middle of the file there are a lot of “\bin” definitions. Here is the obfuscation in action:

Obfuscation of the shellcode

I’m not good with graphic tools but you should get the point. The shellcode is obfuscated inside the bin keyword declarations. Two opcodes of the shellcode are taken from the bytes inside “\bin”, and the other two come right after the bin definition. I’ve finally found the shellcode!
I don’t know how automatic recognition tools work, but I can now imagine why they are not able to identify suspicious instructions using specific signatures.

The shellcode
The first instruction of the shellcode is visible in the picture above. It’s a call to the procedure that starts with the following lines of code:

First lines from the shellcode

The initialization part contains standard code used to locate a specific dll inside the module list and to retrieve its base address. A checksum function (calculated over the name of the current module) is used to identify the right dll: kernel32. I’m sure you can predict the next task of the shellcode:
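For readers who have never seen this trick, here is a rough sketch of the idea. I’m using a rotate-right-by-13 additive hash over the upper-cased name, which is a common choice in shellcode but only an assumption here; I’m not claiming this sample uses exactly this checksum. The real code walks the loader’s module list, while this sketch just walks an array of names. Both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate-right-13 additive checksum over the upper-cased name.
 * ASSUMPTION: illustrative algorithm, not taken from the sample. */
uint32_t name_checksum(const char *name)
{
    uint32_t h = 0;

    for (; *name; name++) {
        unsigned char c = (unsigned char)*name;
        if (c >= 'a' && c <= 'z')
            c = (unsigned char)(c - 0x20);   /* fold to upper case */
        h = (h >> 13) | (h << 19);           /* rotate right by 13 */
        h += c;
    }
    return h;
}

/* Index of the first module whose checksum equals wanted, or -1. */
int find_module(const char *const *names, size_t count, uint32_t wanted)
{
    for (size_t i = 0; i < count; i++)
        if (name_checksum(names[i]) == wanted)
            return (int)i;
    return -1;
}
```

The point of the checksum is that the string “kernel32.dll” never has to appear inside the shellcode: only a 32-bit constant is stored and compared.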

Gets the needed functions from kernel32.dll

It gets the addresses of the functions that are used inside the shellcode. They are all from kernel32.dll; it doesn’t need anything else.

It tries to load the malicious RTF file in memory

It tries to identify the malicious RTF file

The content of the RTF file is loaded entirely into a dynamically allocated buffer. To better understand this snippet, try to imagine the real scenario: a vulnerable machine opens the RTF, the exploit triggers and the shellcode is executed. The RTF is already open, so the file handle already exists and has a concrete value. The shellcode author tries to guess the right value of that file handle. That’s why there are some checks inside the snippet: the author wants to be sure the right file is being loaded.
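The guessing game can be sketched as a simple loop over candidate handle values. Everything here is hypothetical: I assume the check is a comparison against a known file size, and I hide the actual query behind a callback (on Windows something like GetFileSize could play that role, but I’m not stating that this sample uses it). The `demo_probe` function only exists to make the sketch self-contained.

```c
#include <stddef.h>

/* Hypothetical probe: size of the file behind a handle value,
 * or -1 when the value is not a valid file handle. */
typedef long (*probe_fn)(unsigned long handle);

/* Tries handle values 4, 8, 12, ... up to max_handle (Windows kernel
 * handles are multiples of 4) and returns the first one whose size
 * matches expected_size, or 0 if none does. */
unsigned long guess_file_handle(probe_fn probe, long expected_size,
                                unsigned long max_handle)
{
    for (unsigned long h = 4; h <= max_handle; h += 4)
        if (probe(h) == expected_size)
            return h;
    return 0;
}

/* Stand-in probe for demonstration: pretends that only handle 0x2C
 * is a file and that it is 123456 bytes long. */
long demo_probe(unsigned long handle)
{
    return handle == 0x2C ? 123456L : -1L;
}
```

With the stand-in probe, `guess_file_handle(demo_probe, 123456, 0x1000)` finds 0x2C. A real check would likely look at more than the size, for example the first bytes of the file.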

It’s pretty easy to understand this part of the code directly from the dead listing, but if you want to keep reversing the shellcode a debugger is almost necessary. Why? Well, the decrypted malicious file contains a snippet that is called directly from the shellcode.
To debug the shellcode as in a real environment I wrote a little piece of code:
#include <windows.h>

char shellcode[] = "\xE8\x8B\x00...\x04\x89\xF2\xC3";

int main(int argc, char **argv)
{
    HANDLE hFile;

    // Open the RTF file, necessary to simulate a real scenario
    hFile = CreateFile("rtf.zai", GENERIC_READ, 0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    // Jump to the shellcode (MSVC inline assembly)
    __asm {
        mov eax, offset shellcode
        push eax
        ret
    }
    CloseHandle(hFile);
    return 0;
}
Now I can debug everything like in a real environment.

Back to the analysis: the RTF file in memory is decrypted using an algorithm (it’s not a single xor operation, but it’s not interesting per se). There’s not much to say about the last part of the shellcode; the most interesting thing is that part of the shellcode lives inside the decrypted RTF file. It’s really hard to get the entire shellcode from the malicious document: you might be able to use a static tool, but you definitely have to decrypt the file.

To sum up:
1: exploit triggers
2: shellcode starts
3: RTF file loaded inside dynamically allocated memory
4: decryption of the RTF file in memory
5: shellcode continues its execution from the decrypted RTF
6: machine infection

Machine infection
Here is the last task of the malicious document, and the most important one for a malware author: the infection. Two files are created; the first one is the malware and the other one is just a clean document. Both of them are created inside the temp directory: the malware has a random temporary file name and the document has the fixed name “cv.doc”.
Looking at the list of functions obtained from kernel32.dll by the shellcode, you can predict the sequence of calls used to create the two files (GetTempPathA, GetTempFileNameA/lstrcat, CreateFileA, WriteFile, CloseHandle and WinExec).

Once created, the malware is immediately started using WinExec. The same function is used to show the content of the cv.doc file, calling winword as a reader. The doc file is a clean version of the malicious RTF file: it does contain the OLE part, but the “\object\objocx” section is no longer inside the file (the exploit/shellcode part has been cut out).

RTF document content

That’s the content of the fake file, the one used to show something on the screen. According to Google Translate it should be the word “Instructions”, but I don’t care much about its meaning.

That’s all for now, I’ll blog about the malware analysis in a future post, stay tuned!

40296C search_right_handle:
40296C     cmp [ebp+hFileMappingObject], 0FFFFh ; hFileMappingObject is initially 0
402973     jnb short loc_4029C1
402975     xor eax, eax
402977     mov [ebp+var_28], eax
40297A     push eax ; dwNumberOfBytesToMap
40297B     push eax ; dwFileOffsetLow
40297C     push eax ; dwFileOffsetHigh
40297D     push FILE_MAP_ALL_ACCESS ; dwDesiredAccess
402982     push [ebp+hFileMappingObject] ; hFileMappingObject
402985     call ds:MapViewOfFile
40298B     mov [ebp+lpBaseAddress], eax
40298E     test eax, eax
402990     jz short MapView_fails
402992     lea ecx, [ebp+var_2C]
402995     push 0 ; ResultLength
402997     push 10h ; SectionInformationLength
402999     push ecx ; SectionInformation
40299A     push 0 ; SectionInformationClass
40299C     push [ebp+hFileMappingObject] ; SectionHandle
40299F     call NtQuerySection ; Retrieves information about the section object
4029A5     cmp [ebp+var_28], SEC_COMMIT
4029AC     jz short section_found
4029AE     push [ebp+lpBaseAddress] ; lpBaseAddress
4029B1     call ds:UnmapViewOfFile ; Wrong handle, unmap!
4029B7     xor eax, eax
4029B9     mov [ebp+lpBaseAddress], eax
4029BC
4029BC MapView_fails:
4029BC     inc [ebp+hFileMappingObject] ; Increments hFileMappingObject
4029BF     jmp short search_right_handle

4029CC     call RtlAllocateHeap_bridge ; Allocates memory space
4029D1     test eax, eax
4029D3     jz loc_402AA4
4029D9     mov [ebp+var_14], eax
4029DC     mov word ptr [eax+2], 1 ; palNumEntries
4029E2     mov word ptr [eax], 300h ; palVersion
4029E7     push eax ; Logical palette
4029E8     call CreatePalette ; Creates a logical palette

402A04 search_object:
402A04     mov eax, [ebp+lpBaseAddress]
402A07     add eax, [ebp+var_24]
402A0A     cmp [ebp+GDI_Structure], eax ; Is it inside the mapped memory?
402A0D     jnb short loc_402A37
402A0F     mov eax, [ebp+GDI_Structure]
402A12     xor ecx, ecx
402A14     mov cx, [eax+4]
402A18     mov edx, [ebp+Pid] ; pGdiEntry->ProcessID
402A1B     cmp ecx, edx
402A1D     jnz short try_next_structure
402A1F     xor ecx, ecx
402A21     mov cx, [eax+0Ah] ; pGdiEntry->nType
402A25     cmp ecx, 8 ; PAL_TYPE
402A28     jnz short try_next_structure
402A2A     mov eax, [eax] ; pGdiEntry->pKernelInfo
402A2C     mov [ebp+original_KernelInfo], eax ; Saves the original value
402A2F     jmp short loc_402A37
402A31 try_next_structure:
402A31     add [ebp+GDI_Structure], 10h ; Moves on the next structure to check
402A35     jmp short search_object

402A3E     call _RtlAllocateHeap_bridge
...
402A4E     push [ebp+hO]
402A51     pop dword ptr [eax] ; Stores handle obtained calling CreatePalette
402A53     mov dword ptr [eax+14h], 1
402A5A     push [ebp+hook_hidden_function] ; push 402AC0
402A5D     pop dword ptr [eax+3Ch] ; Stores the real function to call
...
402A6B     mov eax, [ebp+GDI_Structure] ; Address of the original structure to replace
402A6E     push [ebp+fake_structure] ; Push the fake structure address
402A71     pop dword ptr [eax] ; To tamper!

00402A73     push 0
00402A75     push [ebp+hO]
00402A78     call GetNearestPaletteIndex
...
BF94B4AF     mov esi, [ebp+8] ; esi -> fake structure created between 402A4E and 402A6E
...
BF94B4E0     call dword ptr [esi+3Ch] ; esi+3C points to hook_hidden_function!!!
