I enjoy browsing crackmes.de, and I used to try a crackme from time to time. It’s not my intention to write a tutorial for this crackme, but after reading a comment from andrewl.us I decided to spend some words for this tutorial.
andrewl.us’s comment is pretty simple: “two solutions accepted, and neither uses a deobfuscator that emits a cleaner crackme”. That’s the point, nowadays only few people spend some time writing a detailed/original/complete solution, most of them prefer to say “I did it”… that’s sad and it’s not the spirit of crackmes.de!
Intro
The protection routine created by the author is really simple, but the obfuscation method applied makes our reversing session hard. Look here and you’ll understand how the code works:
406E2A jmp short loc_406E2D
...
406E2D nop
406E2E jmp loc_405B10
...
405B10 nop
405B11 nop
405B12 and al, 0FFh
405B14 jmp loc_4053CC
...
A lot of jumps and a lot of junk code. The real challenge is to identify the valid instructions of the crackme because most of them are only useless junk code. So, to get the valid instruction you can:
1. step all the crackme code using a debugger trying to identify valid and not valid instruction
2. deobfuscate the crackme in some way
Point #1 represents the easy way because you will surely find out the valid instructions checking the code instruction by instruction. In this case it’s a good approach because the crackme is only 35kb… Will you apply this method on a huge obfuscated file? That’s why I opt for point #2!
Due to the nature of the protected code I think it’s not possible to write a static program able to deobfuscate an exe protected with this method, but if you imagine the exe like a set of procedures combined all together you can produce something which is pretty near to the original untouched program.
To solve this crackme I’ll write an idc script able to extract valid instructions from a single procedure. Having this in mind you can then put everything all together obtaining a working deobfuscated crackme.
The deobfuscator
The idea is pretty simple: starting from a specific address the script traces through the necessary instructions trying to identify (and eventually remove) junk code. The script will show all the valid instructions inside the output area.
First of all I have to create the skeleton of the script.
static main()
{
do {
check_current_instruction;
if (!junk_instruction)
print_valid_instruction;
get_address_next_instruction;
}
while (valid_address_instruction);
}
The script checks every single instruction trying to understand if it’s valid or junk. I’ll show you later how to check it, now take a look at the snippet used to print a valid instruction:
static print_valid_instruction(address)
{
auto instrLen;
auto op1, op2;
Jump(address);
MakeUnkn(address, 0);
instrLen = MakeCode(address);
Message("\n%X: %s", address, GetMnem(address));
op1 = GetOpnd(address, 0);
if (op1 != 0x00)
Message(" %s", op1);
op2 = GetOpnd(address, 1);
if (op2 != 0x00)
Message(", %s", op2);
return instrLen;
}
The function receives the address of the current valid instruction printing it inside the output area.
I have used Jump, MakeUnkn and MakeCode because the code produced by IDA has a lot of undefined lines: with an undefined instruction is not always possible to get the right menmonic text associated to a specific opcode.
So, when you have the decoded instruction you can print it with the necessary operands. The function returns the length of the current instruction, used to get the address of the next one to check.
Nothing hard, let’s see how to identify the junk code. The idea is to isolate all the unnecessary instructions. First of all, to have a general idea of the crackme’s code take a look at this blocks:
.jmp:00405B10 90 nop
.jmp:00405B11 90 nop
.jmp:00405B12 24 FF and al, 0FFh <-- al remains the same
.jmp:00405B14 E9 B3 F8 FF FF jmp loc_4053CC
...
.jmp:00408969 40 inc eax <--
.jmp:0040896A 48 dec eax <-- eax remains the same
.jmp:0040896B 90 nop
.jmp:0040896C 90 nop
.jmp:0040896D E9 14 D4 FF FF jmp near ptr unk_405D86
...
.jmp:00405CC3 90 nop
.jmp:00405CC4 80 EB 00 sub bl, 0 <-- bl remains the same
.jmp:00405CC7 E9 34 0E 00 00 jmp near ptr unk_406B00
...
.jmp:00407163 50 push eax
.jmp:00407164 04 66 add al, 66h <-- add inside push/pop it's nonsense...
.jmp:00407166 58 pop eax
.jmp:00407167 E9 0C EB FF FF jmp near ptr unk_405C78
If you execute the snippets above you’ll find out that the state of the involved register remains the same, no changes are applied.
So, what can we put inside the junk code set? All the “jmp” instructions, “nop2, “and” with 0xFF and so on… If you look at the code I think you’ll surely find out what to mark as junk in a short time.
Just to understand it, here is a part of the script used to avoid some specific instructions:
opcode = Byte(currAddress); // Get the byte at the current address
...
else if ((opcode == 0x24) && (Byte(currAddress+1) == 0xFF)) // Avoid "and op1, 0xFF" instruction
currAddress = currAddress + 2; // Move to the next instruction
...
else if ((opcode == 0x60) && (Byte(currAddress+2) == 0x61)) // Avoid instructions from pushad to popad
currAddress = currAddress + 3;
...
else if (opcode == 0x90) // Avoid nop instruction
currAddress = currAddress + 1;
...
if (opcode == 0xE9) // Avoid jump instruction
currAddress = (currAddress + Dword(currAddress+1) + 5) & 0x00000000FFFFFFFF;
Instead of printing the instruction I simply avoid it moving the attention to the next one. You don’t need to use MakeCode or something else because the script works directly on the bytes code.
Ok, is it really all? Hmm, there is something more to add. You have to pay particular attention to a point: how can you manage a conditional jump?
The question deserves a reply because sooner or later the crackme will have a check and you’ll have to face a conditional jump. I decided to let you choose the path to follow! What does it mean?
I remove all the conditional jumps except the one which is after a compare instruction. I need to remember that a compare has been executed:
else if (opcode == 0x3B) { // Compare found, take it in mind!
cmp = 1;
currAddress = currAddress + print_valid_instruction(currAddress);
}
The variable “cmp” is used to take in mind that a compare instruction has been executed. cmp variable is originally initialized to -1. Here is how I handle a conditional jump instruction:
else if ((opcode == 0x74) || (opcode == 0x75)){
// Check to see if a cmp instruction has been executed
if (cmp == 1) {
Jump(currAddress);
jump = AskYN(0, "Want to jump?"); // Box with a question
if (jump == 0) // No jump
currAddress = currAddress + 2;
else // Jump
currAddress = (currAddress + Byte(currAddress+1) + 2) & 0x00000000FFFFFFFF;
cmp = -1; // Restore the original value
}
else
currAddress = (currAddress + Byte(currAddress+1) + 2) & 0x00000000FFFFFFFF;
}
That’s what I meant when I said “I decided to let you choose the path to follow”. Pressing Yes or No inside the appeared box you can decide the flow of the crackme code, it’s just like a live trace.
I decided to not print the conditional jump but if you think it will help you put it inside the set of valid instructions.
That’s how my deobfuscator works (get it from here). I don’t need anything else because with the output produced by the script I can study the entire protection routine via dead list without having to debug the program.
For those who wants to study the deobfuscated crackme using a debugger it’s possible to produce a clean crackme too. Instead of printing the result in the output area you could move the valid instructions somewhere inside the file. There are a lot of free bytes to use inside the exe; just choose a starting point and with some minor adjustments, as I suggested before combining all the deobfuscated procedures, you can have a working clean crackme version.
The protection routine
The algorithm used by the crackme is really simple, but it requires a brute-force approach…
4092BA: mov cx, 54h
408B0D: shl ecx, 10h
405507: mov cx, 5854h
405680: push ecx <-- push "TXT
4085F4: mov cx, 2E59h
407CC1: shl ecx, 10h
405B4C: mov cx, 454Bh
4087F4: push ecx <-- push "KEY."
406705: mov edx, esp <-- edx -> "KEY.TXT"
40881F: mov cx, 40h
407FAF: shl ecx, 10h
408603: mov cx, 812Fh
405435: push ecx <-- push 0x40812F: first instruction to execute after CreateFileA
4078B7: xor eax, eax
408EDE: push eax
407E92: mov al, 50h
408810: push eax <-- Flags & attributes: 0x50: FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_DEVICE
4070EB: mov al, 3
4052CF: push eax <-- OPEN_EXISTING
405B5C: mov al, 0
408441: push eax
406E3B: push eax
40868B: mov al, 50h
40610A: shl eax, 18h
408AF1: push eax <-- Desired access: 0x50000000: GENERIC_WRITE | GENERIC_ALL
4080CC: push edx <-- Filename: "KEY.TXT"
4092CE: call CreateFileA
4092D3: ret
From the readme file we already know that the protection routine is based on a keyfile, now we know the name of the file to produce: KEY.TXT
At the beginning of the tutorial I told you that it’s not possible to write a fully working static deobfuscator, and now you can understand why: look at the push at 405435. The address on the stack (0x40812F) is the point where the precedure will return after ret instruction at 4092D3. How can you identify the address as a possible return value? Well, maybe possible, but hard…
Ok, back to our protection routine: after executing CreateFileA function, the code pass from instruction 40812F:
4077E5: xor edx, edx
4068B9: dec edx
407FDC: cmp edx, eax <-- check
4092D9: call ExitProcess <-- here if check fails
The snippet above contains a first check which is done on the keyfile handle. If the file doesn’t exist the crackme ends. As you’ll see there are some more checks performed by the crackme, all of them end with a call to ExitProcess without a single error message.
To pass this check create the file and put some bytes inside. As you can guess the next piece of code will be used to read the content of the keyfile:
407FDC: cmp edx, eax
406372: xor edx, edx
406C87: push edx <-- push 0
40909F: push edx <-- push 0
405DD3: push edx <-- push 0
405D95: mov ebp, esp
406FFC: push edx <-- push 0
405B01: mov esi, esp
407426: push eax <-- push file handle
407235: mov cx, 40h
406263: shl ecx, 10h
406F38: mov cx, 5ADDh
40759B: push ecx <-- push 0x405ADD: first instruction to execute after ReadFile
40778A: push edx
4089C3: push esi <-- bytes read
407181: push 18h
4070DC: push ebp <-- buffer that receives the data
408E56: push eax <-- file handle
4092C8: call ReadFile
4092CD: retn
An obvious conseguence of CreateFile: ReadFile. It’s time to see how it will use the keyfile’s bytes.
405426: mov esp, ebp <-- ESP points to the contents of the keyfile
40764F: mov cx, 0F45Ah
40782F: shl ecx, 10h
405DA4: mov cx, 675Dh
4066E6: mov esi, ecx <-- ESI = 0xF45A675D
407B95: mov cx, 4DDAh
40892D: shl ecx, 10h
407CDF: mov cx, 0FA31h
406B4C: mov edi, ecx <-- EDI = 0x4DDAFA31
406083: pop eax <-- EAX = first 4 bytes of the file (Key_1)
405999: pop ecx <-- ECX = second 4 bytes of the file (Key_2)
405A04: push ecx
405EC3: push eax
407B59: xor eax, esi <--
4075AA: xor ecx, edi <-- operations based on Key_1 and Key_2
4067C8: add eax, ecx <--
406F47: mov cx, 0BAE3h
406F83: shl ecx, 10h
406BB4: mov cx, 0DC73h <-- ECX = 0xBAE3DC73
405544: cmp eax, ecx <-- first check
This first check is based on the first eight bytes of the keyfile. The scheme is:
(Key_1 ^ 0xF45A675D) + (Key_2 ^ 0x4DDAFA31) = 0xBAE3DC73
Nothing hard per se but without other informations if you want to solve this equation you are forced to write a brute force algorithm over key_1 and key_2. Really expansive in terms of time…
408DCE: mov cx, 0AADDh
40846E: shl ecx, 10h
406E48: mov cx, 357Dh
408324: mov esi, ecx <-- ESI = 0xAADD357D
408DEC: mov cx, 44FAh
405FED: shl ecx, 10h
405B2E: mov cx, 0FC3Ch
4059A9: mov edi, ecx <-- EDI = 0x44FAFC3C
4069F3: pop eax <-- EAX = Key_1
4074BB: pop ecx <-- ECX = Key_2
4085A9: push ecx
40919F: push eax
4051FC: xor eax, esi <--
408EB0: xor ecx, edi <-- operations based on Key_1 and Key_2
409143: add eax, ecx <--
407901: mov cx, 0F23Fh
40893C: shl ecx, 10h
406335: mov cx, 2C88h <-- ECX = 0xF23F2C88
406B87: cmp eax, ecx <-- second check
Another check over the keyfile’s content. As you can see it’s pretty similar to the previous one:
(Key_1 ^ 0xAADD357D) + (Key_2 ^ 0x44FAFC3C) = 0xF23F2C88
2 equations and 2 variables, we can reduce the time of the brute-agorithm because you can test all possible values of key_1 only (key_2 is obtained as a conseguence).
407E66: pop esi <-- ESI = Key_1
407FCD: pop edi <-- EDI = Key_2
407228: pop ebx <-- EBX = third 4 bytes of the file (Key_3)
408144: mov cx, 2164h
406AF1: shl ecx, 10h
406A4C: mov cx, 3172h <-- ECX = 0x21643172
40602A: cmp ecx, ebx <-- third check
4 more bytes join the party. This time no math or logic operations but a simple compare byte to byte: Key_3 must be “r1d!”. We know part of the keyfile, but this information doesn’t reduce our brute-time…
408C84: mov cx, 6F04h
406290: shl ecx, 10h
4091E8: mov cx, 530Ah <-- ECX = 0x6F04530A
406F65: xor ecx, edi <-- Key_2 ^ 0x6F04530A
406F57: push ecx
407514: mov cx, 0F0Fh
405408: shl ecx, 10h
40597B: mov cx, 101Bh <-- ECX = 0x0F0F101B
40819E: xor ecx, esi <-- Key_1 ^ 0x0F0F101B
406BD3: push ecx
408D0B: mov eax, esp
40638F: xor ebp, ebp
405284: push ebp
407AD2: push ebp <-- Caption: NULL
408EBF: push eax <-- Text:
406E95: push ebp <-- 0
4092D4: call MessageBoxA <-- CONGRATULATION BOX
4092D9: call ExitProcess
The final part of the crackme, it shows a congratulation message box. The text of the box is obtained using the correct value for Key_1 and Key_2.
To sum-up: there are 3 checks to pass, the first two are based on Key_1 and key_2 while the last check is over key_3 only. Key_3 values is fixed and I have to find the first two dwords (represented by Key_1 and Key_2) using a brute. The solution is not unique, you can provide some different keyfiles but I think the Valid solution is the one wich reveal the right text inside the message box. For me the valid text is “Success” that is obtained with keyfile = “Hello wor1d!” but who knows… :)
Final notes
It’s indeed an easy crackme, but it represents a good starting sample for those who have never played with an obfuscated target.