This is a sort of continuation of the previous post, the one about the malware able to infect right-handed people only.
It’s an MSN malware, one of the recent ones (as far as I remember I got it from Malware Domain List). I think there’s often something interesting inside a piece of malware, no matter what it does, and this is a perfect example!
The malware is not really interesting per se, but it has something I’ve never noticed before. It’s not a cool and dangerous new technique, but a coding behaviour. Look at the graph overview:
The image represents the content of one of the malware’s procedures. Nothing strange per se, except that it contains 657 instructions, too many for a simple malware. It’s a big routine and I was surprised at first, because you can do a lot of things with so many instructions. I started analysing the code: nothing is passed to the routine and nothing is returned to the caller. I thought it would be an important part of the malware, but I was disappointed by the real content of the routine. After a few seconds I realized what was really going on: 657 lines of code for doing something that would normally require around 50 lines…
The function contains a block of 17 instructions repeated 38 times. When I’m facing things like that I always have a little discussion with my brain. The questions are:
– why do you need to repeat each block 38 times?
– can’t you just use a while statement?
– is this a sort of anti-disassembling trick?
– can you produce such a procedure by setting some specific compiler options?
The repeated block contains the instructions below:
00402175   push    9                        ; Length of the string to decrypt
00402177   push    offset ntdll_dll         ; String to decrypt
0040217C   push    offset aM4l0x123456789   ; key: "M4L0X123456789"
00402181   call    sub_401050               ; decrypt "ntdll.dll"
00402186   add     esp, 0Ch
00402189   mov     edi, eax
0040218B   mov     edx, offset ntdll_dll
00402190   or      ecx, 0FFFFFFFFh
00402193   xor     eax, eax
00402195   repne scasb
00402197   not     ecx
00402199   sub     edi, ecx
0040219B   mov     esi, edi
0040219D   mov     eax, ecx
0040219F   mov     edi, edx
004021A1   shr     ecx, 2
004021A4   rep movsd
004021A6   mov     ecx, eax
004021A8   and     ecx, 3
004021AB   rep movsb
It’s only a decryption routine, nothing more. The string is decrypted by the “call 401050”; the rest of the code simply copies the decrypted string back into the right buffer.
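To make the block easier to read, here is a rough C equivalent of one repetition. It is only a sketch under a few assumptions: the names decrypt_string and decrypt_in_place are made up, the argument order is inferred from the three pushes, and the body of decrypt_string is a plain repeating-key XOR used as a stand-in, not the cipher implemented by sub_401050. The scasb/movsd/movsb tail looks like an inlined strcpy, so that is how it is written here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sub_401050. The real routine implements the
   malware's own cipher; a repeating-key XOR is used here only so that the
   sketch can run. Like the original call, it returns a pointer to the
   decrypted, NUL-terminated string. */
static char *decrypt_string(const char *key, const char *enc, size_t len)
{
    static char out[64];
    size_t key_len = strlen(key);
    size_t i;

    if (len >= sizeof out)
        len = sizeof out - 1;
    for (i = 0; i < len; i++)
        out[i] = (char)(enc[i] ^ key[i % key_len]);
    out[len] = '\0';
    return out;
}

/* One repeated block: decrypt, then copy the result over the encrypted
   buffer. buf must have room for len + 1 bytes. */
static void decrypt_in_place(const char *key, char *buf, size_t len)
{
    strcpy(buf, decrypt_string(key, buf, len));
}
```

With these helpers, the 20-line listing above boils down to a single call such as decrypt_in_place(key, ntdll_dll, 9).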
Ok, let’s try answering the initial questions.
According to some PE scanners the exe file was produced by Microsoft Visual C++ 6.0 SPx.
It’s possible to code the big procedure with a single loop (while, for, do-while) containing the snippet above. I don’t think the author used one of these statements because, as far as I know, it’s not possible to tell this compiler to unroll a loop into a sequence of identical blocks. At this point I have two options:
– he wrote the same block 38 times
– he defined a macro with the block’s instructions and repeated the macro 38 times
I wouldn’t code something like that, but the macro option seems to be the most probable choice.
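The difference between the two styles can be sketched in C as follows. Everything here is illustrative: DECRYPT, struct enc_string and decrypt_all are hypothetical names, and decrypt_string is again a repeating-key XOR stand-in for sub_401050, not the real cipher. The macro expands to a full copy of the block at every use, which would produce the kind of listing seen in the graph, while the table-driven loop keeps a single copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the decryption call (sub_401050): a repeating-key
   XOR, used only so the sketch runs. */
static char *decrypt_string(const char *key, const char *enc, size_t len)
{
    static char out[64];
    size_t key_len = strlen(key);
    size_t i;

    if (len >= sizeof out)
        len = sizeof out - 1;
    for (i = 0; i < len; i++)
        out[i] = (char)(enc[i] ^ key[i % key_len]);
    out[len] = '\0';
    return out;
}

static const char KEY[] = "M4L0X123456789";

/* Macro option: each use expands to its own copy of the block, so 38 uses
   give 38 copies in the binary. */
#define DECRYPT(buf, len) strcpy((buf), decrypt_string(KEY, (buf), (len)))

/* Loop option: one copy of the block, driven by a table of buffers. */
struct enc_string {
    char  *buf;   /* encrypted string, decrypted in place */
    size_t len;   /* length without the terminator */
};

static void decrypt_all(const struct enc_string *table, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        strcpy(table[i].buf,
               decrypt_string(KEY, table[i].buf, table[i].len));
}
```

The loop version needs the extra table, which might be one reason to prefer the macro, but that is just a guess.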
Is it an anti-disassembling trick? My answer is no, because it’s really easy to read such code. You don’t have to deal with variables used inside a for/while; to understand what’s going on you only have to compare three or four blocks.
I don’t have a valid answer to the questions I had at first…
Trying to find out some more info I studied the rest of the code. I was quite surprised to see another funny diagram.
This time the image represents the content of the procedure used to retrieve the addresses of the API functions. Again, no while/for/do-while statement. The rectangle in the upper part of the image is a sequence of calls to GetProcAddress, and the code below it is just a sequence of checks on the addresses obtained by GetProcAddress.
It’s a series of:
address = GetProcAddress(hDLL, "function_name");
followed by a series of:
if (!address) goto _error;
Apart from the absence of a loop there’s something more this time, something that I think reveals an unusual coding style: the author checks for errors at the end of the procedure. I always prefer to check return values as soon as I can; it’s not a rule, but it’s something that helps you avoid oversights and potential errors… The procedure has a little bug/oversight at the end: the author forgot to close an open handle. Just a coincidence?
Anyway, two procedures without a single loop. It seems the author avoided any kind of loop by choice. In case you still have some doubts, here’s another cool picture for you:
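For comparison, this is roughly what I mean by checking as soon as possible, written as a table-driven loop. It is a portable sketch, not the malware’s code: resolver_t is a hypothetical callback with the same shape as GetProcAddress(hDLL, "function_name"), and fake_resolve is a dummy that only exists so the loop can be tried outside Windows.

```c
#include <stddef.h>
#include <string.h>

typedef void (*proc_t)(void);

/* Hypothetical resolver with the shape of GetProcAddress(hDLL, name). */
typedef proc_t (*resolver_t)(void *module, const char *name);

/* Resolve names[0..count-1] into out[], checking each result right away.
   Returns 0 on success, -1 at the first name that cannot be resolved. */
static int resolve_all(resolver_t resolve, void *module,
                       const char *const *names, proc_t *out, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[i] = resolve(module, names[i]);
        if (out[i] == NULL)
            return -1;  /* one failure path: the caller cleans up here */
    }
    return 0;
}

/* Dummy resolver for experiments: fails only for the name "missing". */
static void dummy_proc(void) { }

static proc_t fake_resolve(void *module, const char *name)
{
    (void)module;
    return strcmp(name, "missing") == 0 ? NULL : dummy_proc;
}
```

With a single early exit it is harder to forget the cleanup, because there is only one place where the function can fail.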
The routine in the picture contains the code used to check whether the API(s) are patched or not. The check is done by comparing the first byte of each function with 0xE8 and 0xE9 (the call and jmp opcodes). If the functions are not patched the malware goes on, otherwise it ends. As you can see, no loops are used.
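The same check fits in a few lines when written with a loop. This is a minimal sketch with made-up names (looks_patched, any_patched); it only reproduces the first-byte comparison described above and works on plain byte buffers, so it can be tested without touching real API code.

```c
#include <stddef.h>

/* Returns 1 if the first byte at `entry` is 0xE8 (call rel32) or
   0xE9 (jmp rel32), 0 otherwise. */
static int looks_patched(const unsigned char *entry)
{
    return entry[0] == 0xE8 || entry[0] == 0xE9;
}

/* Returns 1 as soon as one of the entry points looks patched. */
static int any_patched(const unsigned char *const *entries, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (looks_patched(entries[i]))
            return 1;
    return 0;
}
```

Note that a check like this only catches hooks that start with one of those two opcodes.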
In summary: it’s not jungle code, it’s not an anti-disasm code and it’s not a specific compiler setting. I think it’s only a personal choice, but I would really like to know why the author used this particular style.
Do you have any suggestions?
Beyond the coding style, the malware has some more strange things. As pointed out by *asaperlo*, the code contains a buggy RC4 implementation (look at the comments of the previous blog post).
It also has a virtual machine check. The idea is pretty simple: the malware checks the name of the current user. If the name is “sandbox” or “vmware”, it assumes it is running inside a virtual machine…
This malware spawns another one (it’s encrypted inside the file); that might be material for another post.
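In C the idea looks more or less like this. It is only a sketch of the comparison: the function name is_vm_user is made up, the exact, case-sensitive match is an assumption, and how the malware obtains the user name is left out.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the user name matches one of the two names mentioned above.
   Assumption: exact, case-sensitive comparison. */
static int is_vm_user(const char *user_name)
{
    static const char *const suspects[] = { "sandbox", "vmware" };
    size_t i;

    for (i = 0; i < sizeof suspects / sizeof suspects[0]; i++)
        if (strcmp(user_name, suspects[i]) == 0)
            return 1;
    return 0;
}
```

Any analysis machine with a different user name passes a check like this.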
That’s a funny coded malware for sure!