Crackmes.de – san01suke's SomeCrypto~01
- July 14, 2014
- reverse engineering
- crackmes
- The User Interface
- Looking for the Good Boy Message
- A First Look at loc_401000
- The Function Parameter
- Valid Serial Characters
- Valid Serial Length
- Copy String byte_403010 to byte_403140
- Check Character(s) in byte_403140
- Substitution Cipher (Very Gently Obfuscated)
- Wrapping up the Encryption Loop
- Hashing the Plaintext Message
- Finding the Plaintext Message
The author san01suke submitted three crackmes to www.crackmes.de on July 1st. This is my attempt to solve the first one, called SomeCrypto~01. You can view and download the crackme here. The short description simply says:
“Just solve this simple crackme”
− san01suke
The User Interface
The user interface has two input boxes that invite you to enter a name and a serial:
There are no buttons to validate the input; the crackme probably checks the input whenever the content of the text boxes changes or periodically with a timer event. There is no feedback when you enter a wrong name/serial combination.
Looking for the Good Boy Message
Let’s open the crackme in IDA and switch to the Strings subview with Shift+F12:
The string Success looks promising. Double-click it to jump to its definition in the disassembly:
Putting the cursor on the variable name and hitting X brings us to the only code location that references the success string:
.text:0040129D call loc_401000
.text:004012A2 add esp, 4
.text:004012A5 mov byte_403270, al
.text:004012AA test al, al
.text:004012AC jz short loc_4012CA
.text:004012AE mov ecx, [esp+100h+lpText]
.text:004012B2 push 0 ; uType
.text:004012B4 push offset Caption ; "Success"
Lines 4 and 5 check whether al is zero and, if it is, jump over the “Success” part. The register eax is probably set by the subroutine loc_401000, since it is the common register for return values. So the task is to find name/serial combinations for which the subroutine loc_401000 returns a non-zero value.
A First Look at loc_401000
Double-clicking the label loc_401000 brings us to the loc_401000 subroutine. Let’s have a quick look at the entire code before starting to analyze it in detail:
loc_401000: ; CODE XREF: DialogFunc+1CDp
.text:00401000 push ebp
.text:00401001 mov ebp, esp
.text:00401003 mov al, [ecx]
.text:00401005 sub esp, 20h
.text:00401008 push esi
.text:00401009 xor esi, esi
.text:0040100B test al, al
.text:0040100D jz loc_4010C6
.text:00401013 lea edx, [ebp-20h]
.text:00401016 sub edx, ecx
.text:00401018
.text:00401018 loc_401018: ; CODE XREF: .text:00401032j
.text:00401018 cmp al, 61h
.text:0040101A jl loc_4010C6
.text:00401020 cmp al, 7Ah
.text:00401022 jg loc_4010C6
.text:00401028 mov [edx+ecx], al
.text:0040102B mov al, [ecx+1]
.text:0040102E inc ecx
.text:0040102F inc esi
.text:00401030 test al, al
.text:00401032 jnz short loc_401018
.text:00401034 cmp esi, 1Ah
.text:00401037 jnz loc_4010C6
.text:0040103D xor eax, eax
.text:0040103F nop
.text:00401040
.text:00401040 loc_401040: ; CODE XREF: .text:0040104Fj
.text:00401040 mov cl, byte_403010[eax]
.text:00401046 mov byte_403140[eax], cl
.text:0040104C inc eax
.text:0040104D test cl, cl
.text:0040104F jnz short loc_401040
.text:00401051 xor ecx, ecx
.text:00401053 cmp byte_403140, cl
.text:00401059 jz short loc_401088
.text:0040105B jmp short loc_401060
.text:0040105B ; ---------------------------------------------------------------------------
.text:0040105D align 10h
.text:00401060
.text:00401060 loc_401060: ; CODE XREF: .text:0040105Bj
.text:00401060 ; .text:00401086j
.text:00401060 mov al, byte_403140[ecx]
.text:00401066 cmp al, 61h
.text:00401068 jl short loc_40107E
.text:0040106A cmp al, 7Ah
.text:0040106C jg short loc_40107E
.text:0040106E
.text:0040106E loc_40106E: ; DATA XREF: start:loc_4012D5w
.text:0040106E push cs
.text:0040106F mov esi, 5948AC0h
.text:00401074
.text:00401074 loc_401074: ; CODE XREF: .text:loc_401074j
.text:00401074 jg short near ptr loc_401074+1
.text:00401074 ; ---------------------------------------------------------------------------
.text:00401076 dw 0FFFFh
.text:00401078 ; ---------------------------------------------------------------------------
.text:00401078 mov byte_403140[ecx], dl
.text:0040107E
.text:0040107E loc_40107E: ; CODE XREF: .text:00401068j
.text:0040107E ; .text:0040106Cj
.text:0040107E inc ecx
.text:0040107F cmp byte_403140[ecx], 0
.text:00401086 jnz short loc_401060
.text:00401088
.text:00401088 loc_401088: ; CODE XREF: .text:00401059j
.text:00401088 or eax, 0FFFFFFFFh
.text:0040108B mov edx, offset byte_403140
.text:00401090 test ecx, ecx
.text:00401092 jz short loc_4010AD
.text:00401094
.text:00401094 loc_401094: ; CODE XREF: .text:004010ABj
.text:00401094 movzx esi, byte ptr [edx]
.text:00401097 xor esi, eax
.text:00401099 and esi, 0FFh
.text:0040109F shr eax, 8
.text:004010A2 xor eax, ds:dword_402058[esi*4]
.text:004010A9 inc edx
.text:004010AA dec ecx
.text:004010AB jnz short loc_401094
.text:004010AD
.text:004010AD loc_4010AD: ; CODE XREF: .text:00401092j
.text:004010AD not eax
.text:004010AF cmp eax, 0F891B218h
.text:004010B4 jnz short loc_4010C6
.text:004010B6 mov eax, [ebp+8]
.text:004010B9 mov dword ptr [eax], offset byte_403140
.text:004010BF mov al, 1
.text:004010C1 pop esi
.text:004010C2 mov esp, ebp
.text:004010C4 pop ebp
.text:004010C5 retn
The code isn’t overly long and looks reasonable. There’s one exception though in lines 56 to 58:
.text:00401074 ; ---------------------------------------------------------------------------
.text:00401076 dw 0FFFFh
.text:00401078 ; ---------------------------------------------------------------------------
IDA wasn’t able to convert these two bytes to code and displays them as data. So either the snippet has a data section inside the code, or the code is self-modifying. We will come back to this later.
The Function Parameter
The first nine lines of the function are:
loc_401000: ; CODE XREF: DialogFunc+1CDp
.text:00401000 push ebp
.text:00401001 mov ebp, esp
.text:00401003 mov al, [ecx]
.text:00401005 sub esp, 20h
.text:00401008 push esi
.text:00401009 xor esi, esi
.text:0040100B test al, al
.text:0040100D jz loc_4010C6
After the standard function prologue we find a reference to register ecx. Since the register isn’t set within the subroutine, it must be a function argument. If we go back to the caller we see that ecx holds the address esp+104h+var_80. This address is also used as the lpString argument to the GetDlgItemTextA call (line 5):
.text:00401281 lea edx, [esp+104h+var_80]
.text:00401288 push edx ; lpString
.text:00401289 push 3EAh ; nIDDlgItem
.text:0040128E push esi ; hDlg
.text:0040128F call edi ; GetDlgItemTextA
.text:00401291 lea eax, [esp+100h+lpText]
.text:00401295 push eax
.text:00401296 lea ecx, [esp+104h+var_80]
.text:0040129D call loc_401000
So ecx points to either the name or the serial. Which one it is can be determined with OllyDbg. Since ecx is only loaded at 00401296, set a breakpoint at 0040129D (the call to loc_401000), then run the executable and inspect the buffer that the ecx register points to:
So ecx points to the serial. The subroutine loc_401000 then reserves 20h bytes on the stack for local variables in line 5. After that it checks whether the first character of the serial is 0, i.e., whether the serial is an empty string. If the serial is empty, the subroutine jumps to loc_4010C6 and returns 0 (which means we failed). Let’s continue assuming the serial is a non-empty string.
Valid Serial Characters
These are the next few lines of the subroutine:
.text:00401013 lea edx, [ebp-20h]
.text:00401016 sub edx, ecx
.text:00401018
.text:00401018 loc_401018: ; CODE XREF: .text:00401032j
.text:00401018 cmp al, 61h
.text:0040101A jl loc_4010C6
.text:00401020 cmp al, 7Ah
.text:00401022 jg loc_4010C6
.text:00401028 mov [edx+ecx], al
.text:0040102B mov al, [ecx+1]
.text:0040102E inc ecx
.text:0040102F inc esi
.text:00401030 test al, al
.text:00401032 jnz short loc_401018
The code first loads the address of the local buffer at [ebp-20h] into edx, and then subtracts ecx. After that follow two comparisons of al with the constants 61h and 7Ah. The register al was set in line 4 (mov al, [ecx]) and holds the first character of the serial. The two constants are the ASCII codes for a and z respectively. So the two checks make sure that the character in al is one of the 26 lowercase letters. If not, the code jumps to loc_4010C6, which we already know returns 0 (failure). After the two checks, the code copies the character in al to [edx+ecx] (line 18). From lines 10 and 11 we know that edx holds the distance between [ebp-20h] and the start of the serial, so in the first iteration this memory location is in fact [ebp-20h], and it moves forward by one byte whenever ecx is incremented. Line 19 loads the next character from the serial, and line 20 advances the pointer ecx to this character. Line 21 increments esi, which was set to 0 in line 7. Line 22 checks whether the character is the null byte. If not, the code jumps back to loc_401018 for the next iteration. The snippet implements the loop:
char serial_copy[32] // in [ebp-20h]
esi = 0
DO
    c = serial[esi]
    IF NOT 'a' <= c <= 'z' THEN
        RETURN 0 // failure
    ENDIF
    serial_copy[esi] = c
    esi += 1
WHILE serial[esi] != '\0'
To summarize: the snippet copies the serial to [ebp-20h]. It also makes sure that the serial contains only lowercase letters.
Valid Serial Length
These lines come next:
.text:00401034 cmp esi, 1Ah
.text:00401037 jnz loc_4010C6
The first line compares esi to 1Ah = 26. The register esi holds the index of the null byte in the serial string, which corresponds to the length of the string. So the snippet checks whether the serial has exactly 26 letters. If it doesn’t, the snippet jumps to the well-known failure location loc_4010C6:
IF len(serial) != 26 THEN
    RETURN 0 // failure
ENDIF
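The two early checks (character range and length) can be mirrored in a few lines of Python. This is only a sketch of the logic described above; the function name serial_format_ok is made up for illustration and does not appear in the crackme.

```python
def serial_format_ok(serial: str) -> bool:
    """Mirror the two early checks: only 'a'..'z', and exactly 26 characters.

    Hypothetical helper for illustration; not part of the crackme.
    """
    if not serial:                      # empty serial -> failure
        return False
    for ch in serial:
        if not ('a' <= ch <= 'z'):      # cmp al, 61h / cmp al, 7Ah
            return False
    return len(serial) == 26            # cmp esi, 1Ah


print(serial_format_ok("abcdefghijklmnopqrstuvwxyz"))  # passes both checks
print(serial_format_ok("abc"))                         # too short
```

Note that these checks do not require the 26 letters to be distinct; any string of 26 lowercase letters gets past this stage.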
Copy String byte_403010 to byte_403140
The next lines are easy to decompile:
.text:0040103D xor eax, eax
.text:0040103F nop
.text:00401040
.text:00401040 loc_401040: ; CODE XREF: .text:0040104Fj
.text:00401040 mov cl, byte_403010[eax]
.text:00401046 mov byte_403140[eax], cl
.text:0040104C inc eax
.text:0040104D test cl, cl
.text:0040104F jnz short loc_401040
.text:00401051 xor ecx, ecx
.text:00401053 cmp byte_403140, cl
.text:00401059 jz short loc_401088
The snippet simply copies the null-terminated string in byte_403010 to byte_403140. Lines 36 and 37 also check whether the first character in byte_403140 is a null byte, i.e., whether the string is empty. If it is, then the code jumps to loc_401088. Since the source string byte_403010 is hard-coded and not empty, we can assume the jump is not taken. This is the pseudo-code for the snippet:
STRCPY(byte_403140, byte_403010) // copy string byte_403010 to byte_403140
IF byte_403140[0] == '\0' THEN
    GOTO loc_401088 // should never happen
ENDIF
Check Character(s) in byte_403140
This snippet comes next:
.text:0040105B jmp short loc_401060
.text:0040105B ; ---------------------------------------------------------------------------
.text:0040105D align 10h
.text:00401060
.text:00401060 loc_401060: ; CODE XREF: .text:0040105Bj
.text:00401060 ; .text:00401086j
.text:00401060 mov al, byte_403140[ecx]
.text:00401066 cmp al, 61h
.text:00401068 jl short loc_40107E
.text:0040106A cmp al, 7Ah
.text:0040106C jg short loc_40107E
.text:0040106E
Register ecx was set to 0 in line 35. So al is the first character in byte_403140. Lines 45 to 48 check whether it is a lowercase letter and jump to loc_40107E if not:
IF NOT 'a' <= byte_403140[ecx] <= 'z' THEN
    GOTO loc_40107E
ENDIF
Substitution Cipher (Very Gently Obfuscated)
If the jump is not taken, we continue with these very interesting lines:
.text:0040106E
.text:0040106E loc_40106E: ; DATA XREF: start:loc_4012D5w
.text:0040106E push cs
.text:0040106F mov esi, 5948AC0h
.text:00401074
.text:00401074 loc_401074: ; CODE XREF: .text:loc_401074j
.text:00401074 jg short near ptr loc_401074+1
.text:00401074 ; ---------------------------------------------------------------------------
.text:00401076 dw 0FFFFh
.text:00401078 ; ---------------------------------------------------------------------------
.text:00401078 mov byte_403140[ecx], dl
As noted before, the disassembly looks very odd:
- push cs doesn’t make sense at this point
- The constant in mov esi, 5948AC0h looks very arbitrary
- The jump in jg short near ptr loc_401074+1 has an invalid target
- The data section dw 0FFFFh is unexpected
So this part is probably modified at runtime into something more meaningful. A quick way to verify this is to set a breakpoint at .text:0040106E in OllyDbg and check how the code looks at runtime. Make sure you enter a valid serial (26 lowercase characters), otherwise you won’t even reach the breakpoint:
This confirms that the code looks different at runtime. The same location before running the exe looks like this:
So the bytes 0E BE C0 are changed to 0F BE C0 at runtime. Let’s set a breakpoint at 0040106E with “Memory, on write” to see which part of the code performs the modification. This is the location where the memory is modified:
The crackme uses a simple increment instruction to modify one byte inside our subroutine. Newer versions of IDA Pro allow you to patch the code to get the correct code:
.text:0040106E
.text:0040106E loc_40106E: ; DATA XREF: start:loc_4012D5w
.text:0040106E movsx eax, al
.text:00401071 mov dl, [ebp+eax-81h]
.text:00401078 mov byte_403140[ecx], dl
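The effect of the one-byte patch can also be followed by hand from the opcode bytes. The sketch below assumes that the ten bytes at 0040106E are 0E BE C0 8A 94 05 7F FF FF FF on disk; this byte sequence is reconstructed from the IDA listing (push cs, mov esi, 5948AC0h, jg, dw 0FFFFh) and was not dumped from the binary.

```python
import struct

# Bytes at 0040106E as stored in the file (assumption: reconstructed from the
# IDA listing, not read from the executable).
on_disk = bytes.fromhex("0E BE C0 8A 94 05 7F FF FF FF")

# The crackme increments the first byte at startup: 0E -> 0F.
patched = bytes([on_disk[0] + 1]) + on_disk[1:]

# 0F BE C0          -> movsx eax, al
# 8A 94 05 <disp32> -> mov dl, [ebp+eax+disp32]
disp32 = struct.unpack("<i", patched[6:10])[0]   # signed little-endian

print(patched[:3].hex())        # opcode bytes of movsx eax, al
print(hex(disp32))              # displacement of the mov

# The displacement splits into the buffer offset and the letter 'a':
print(disp32 == -0x20 - ord("a"))
```

The single incremented byte shifts the instruction boundaries, which is why the static disassembly shows four odd instructions where the runtime code has only two.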
The memory location [ebp+eax-81h] looks strange at first. But we know that at [ebp-20h] there is a copy of our serial, and -81h = -20h - 61h. So we can rewrite the location as serial_copy[eax - 61h]. Furthermore, 61h = 97 is the ASCII code of the letter “a”, which leads to the following pseudo code:
// al = byte_403140[ecx]
dl = serial_copy[eax - 'a']
So dl holds the nth character of the serial, where n is 0 for eax = ‘a’, 1 for ‘b’, …, 25 for ‘z’. This also shows why the serial needs to have 26 characters: it serves as the key for a substitution cipher.
Wrapping up the Encryption Loop
The next couple of lines finish the encryption loop. The lines simply increment the index in ecx and loop back to loc_401060 if the next character in byte_403140 is not the null byte:
.text:0040107E
.text:0040107E loc_40107E: ; CODE XREF: .text:00401068j
.text:0040107E ; .text:0040106Cj
.text:0040107E inc ecx
.text:0040107F cmp byte_403140[ecx], 0
.text:00401086 jnz short loc_401060
Here’s the entire decryption routine in pseudo code:
FOR i = 0 TO LEN(byte_403140) - 1 DO
    ch = byte_403140[i]
    IF 'a' <= ch <= 'z' THEN
        byte_403140[i] = serial_copy[ch - 'a']
    ENDIF
END FOR
All characters in byte_403140 that are not lowercase letters are left alone. All lower case letters are replaced with the character from the serial at index n, where n is zero if the character is ‘a’, is 1 if the character is ‘b’, etc.
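The same loop is easy to reproduce in Python, which is handy for trying out candidate serials without running the crackme. This is a minimal sketch; the function name apply_serial and the example serial (a plain shift by one letter) are made up for illustration.

```python
def apply_serial(text: str, serial: str) -> str:
    """Replace each lowercase letter by serial[letter - 'a']; keep the rest.

    Hypothetical helper mirroring the loop at loc_401060.
    """
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            out.append(serial[ord(ch) - ord('a')])
        else:
            out.append(ch)          # uppercase, digits, punctuation, spaces
    return ''.join(out)


# Example with an illustrative serial that maps a->b, b->c, ..., z->a.
shift_one = "bcdefghijklmnopqrstuvwxyza"
print(apply_serial("Hal, x-y z", shift_one))
```

The uppercase H and the punctuation pass through unchanged, exactly as the range checks in lines 45 to 48 dictate.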
Hashing the Plaintext Message
The next lines look quite complicated:
.text:00401088
.text:00401088 loc_401088: ; CODE XREF: .text:00401059j
.text:00401088 or eax, 0FFFFFFFFh
.text:0040108B mov edx, offset byte_403140
.text:00401090 test ecx, ecx
.text:00401092 jz short loc_4010AD
.text:00401094
.text:00401094 loc_401094: ; CODE XREF: .text:004010ABj
.text:00401094 movzx esi, byte ptr [edx]
.text:00401097 xor esi, eax
.text:00401099 and esi, 0FFh
.text:0040109F shr eax, 8
.text:004010A2 xor eax, ds:dword_402058[esi*4]
.text:004010A9 inc edx
.text:004010AA dec ecx
.text:004010AB jnz short loc_401094
.text:004010AD
.text:004010AD loc_4010AD: ; CODE XREF: .text:00401092j
.text:004010AD not eax
.text:004010AF cmp eax, 0F891B218h
.text:004010B4 jnz short loc_4010C6
.text:004010B6 mov eax, [ebp+8]
.text:004010B9 mov dword ptr [eax], offset byte_403140
.text:004010BF mov al, 1
.text:004010C1 pop esi
.text:004010C2 mov esp, ebp
.text:004010C4 pop ebp
.text:004010C5 retn
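These lines calculate a hash of the string in byte_403140 (the plaintext message); ecx still holds the length of the string from the previous loop and serves as the byte counter. If the hash is equal to 0F891B218h (see line 85), the function stores the address of byte_403140 in the variable passed by the caller and returns 1, which leads to the success message; otherwise we end up at the failure code at loc_4010C6. We can only assume that the hash 0F891B218h is produced when the decryption leads to the correct plaintext.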
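The loop has the shape of the common table-driven CRC-32: the register starts at 0FFFFFFFFh, each byte costs one table lookup and a shift by 8, and the result is inverted at the end. The sketch below assumes that dword_402058 holds the standard table for the reflected polynomial EDB88320h; the listing alone does not confirm this, so treat the code as an illustration of the loop structure rather than a verified reimplementation.

```python
import zlib


def make_table(poly: int = 0xEDB88320) -> list:
    """Build a 256-entry CRC table (assumed content of dword_402058)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ poly if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = make_table()


def crc32_like(data: bytes) -> int:
    eax = 0xFFFFFFFF                       # or eax, 0FFFFFFFFh
    for byte in data:
        esi = (byte ^ eax) & 0xFF          # movzx / xor esi, eax / and esi, 0FFh
        eax = (eax >> 8) ^ TABLE[esi]      # shr eax, 8 / xor eax, table[esi*4]
    return eax ^ 0xFFFFFFFF                # not eax


# With the standard table the result agrees with zlib's CRC-32.
print(crc32_like(b"123456789") == zlib.crc32(b"123456789"))
```

If the assumption about the table holds, a candidate plaintext can be tested offline by comparing crc32_like(plaintext) with 0xF891B218.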
Finding the Plaintext Message
It’s time to have a look at the string in byte_403010. The string is hard-coded and has the following value:
Ix lzctusdzetgc, ex n-fsb (nvfnujuvujsx-fsb) jn e fenjl lsatsxrxu sw ncaaruzjl qrc ehdszjugan pgjlg trzwszan nvfnujuvujsx. Ix fhslq ljtgrzn, ugrc ezr uctjlehhc vnrm us sfnlvzr ugr zrheujsxngjt fruprrx ugr qrc exm ugr ljtgrzurbu.
We know that all lowercase characters in this string stem from a simple substitution cipher. Entering the correct key as the serial should produce a meaningful plaintext and hopefully lead to the success message. Breaking substitution ciphers is pretty easy. I’m using the Python script break_simplesub from the Practical Cryptography webpage. The code uses a randomized optimization algorithm, so the output varies each time you run it. Here’s my output:
$ python break_simplesub.py
Substitution Cipher solver, you may have to wait several iterations for the correct result. Press ctrl+c to exit program.
best score so far: -1001.47696195 on iteration 1
    best key: RATQNHWGEDMLCZJBKFXUSIPOVY
    plaintext: VSLNMCTUJNICHMISERUPEYRETOTYTOUSRUPOEIRIEOLLUBCUSASTUGEMBBATNOLDAMIFJUNOTHBEWHOLHCANGUNBEEYRETOTYTOUSVSRFULDLOCHANETHAMINATMCOLIFFMYEAKTUURELYNATHANAFITOUSEHOCRATWAASTHADAMISKTHALOCHANTAPT
best score so far: -996.55579631 on iteration 3
    best key: UAESNQPXZIBFDJLVYGCRTWMKHO
    plaintext: JHOISUADMICURSCHELDKEPLEANAPANDHLDKNECLCENOODBUDHTHADVESBBTAINOFTSCYMDINARBEGRNORUTIVDIBEEPLEANAPANDHJHLYDOFONURTIEARTSCITASUNOCYYSPETWADDLEOPITARTITYCANDHERNULTAGTTHARTFTSCHWARTONURTIATKA
best score so far: -827.410210054 on iteration 9
    best key: EFLMRWDGJYQHAXSTOZNUVIPBCK
    plaintext: VNCRYPTOGRAPHYANSBOXSUBSTITUTIONBOXISABASICCOMPONENTOFSYMMETRICKEYALGORITHMSWHICHPERFORMSSUBSTITUTIONVNBLOCKCIPHERSTHEYARETYPICALLYUSEDTOOBSCURETHERELATIONSHIPBETWEENTHEKEYANDTHECIPHERTEXT
The plaintext is not very readable because the script does not handle special characters like spaces. However, you can clearly recognize meaningful text. The key used to encrypt the message is therefore EFLMRWDGJYQHAXSTOZNUVIPBCK. To decrypt it, we must enter the inverse of the key as the serial. The following Python script generates the decryption key, and also shows the resulting plaintext:
import string

# Ciphertext stored at byte_403010
crypt = ("Ix lzctusdzetgc, ex n-fsb (nvfnujuvujsx-fsb) jn e fenjl lsatsxrxu "
         "sw ncaaruzjl qrc ehdszjugan pgjlg trzwszan nvfnujuvujsx. Ix fhslq "
         "ljtgrzn, ugrc ezr uctjlehhc vnrm us sfnlvzr ugr zrheujsxngjt "
         "fruprrx ugr qrc exm ugr ljtgrzurbu.")

# Encryption key: plaintext letter (by position) -> ciphertext letter
key = 'EFLMRWDGJIQHAXSTOZNUVYPBCK'.lower()

# Invert the key: ciphertext letter -> plaintext letter
# (string.ascii_lowercase works in Python 2 and 3; string.lowercase is Python 2 only)
mapping = {}
for k, c in zip(key, string.ascii_lowercase):
    mapping[k] = c

msg = ""
for c in crypt:
    msg += mapping.get(c, c)

print("the key is: {}".format(''.join([mapping[x] for x in string.ascii_lowercase])))
print("the plaintext is: {}".format(msg))
It produces the following output:
$ python decrypt.py
the key is: mxygabhljizcdsqwkeoptufnvr
the plaintext is: In cryptography, an s-box (substitution-box) is a basic component of symmetric key algorithms which performs substitution. In block ciphers, they are typically used to obscure the relationship between the key and the ciphertext.
The message is therefore:
In cryptography, an s-box (substitution-box) is a basic component of symmetric key algorithms which performs substitution. In block ciphers, they are typically used to obscure the relationship between the key and the ciphertext.
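The key in the decryption script differs from the solver's best key in two positions (I and Y are swapped). This makes no difference as long as the affected letters never occur in lowercase in the ciphertext, because the substitution loop then never reads the corresponding serial positions. The following sketch, which only assumes the ciphertext quoted above, lists the lowercase letters that do not appear in it:

```python
import string

# Ciphertext from byte_403010, as quoted in the text.
crypt = ("Ix lzctusdzetgc, ex n-fsb (nvfnujuvujsx-fsb) jn e fenjl lsatsxrxu "
         "sw ncaaruzjl qrc ehdszjugan pgjlg trzwszan nvfnujuvujsx. Ix fhslq "
         "ljtgrzn, ugrc ezr uctjlehhc vnrm us sfnlvzr ugr zrheujsxngjt "
         "fruprrx ugr qrc exm ugr ljtgrzurbu.")

# Lowercase letters that never occur; the serial characters at these
# indices are never looked up by the substitution loop.
unused = sorted(set(string.ascii_lowercase) - set(crypt))
positions = [ord(ch) - ord('a') for ch in unused]

print(unused)
print(positions)
```

It reports i, k, o and y, so the serial characters at indices 8, 10, 14 and 24 should not influence the hash, and more than one serial should be accepted.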
If you enter the serial mxygabhljizcdsqwkeoptufnvr into the crackme, you should see the good boy message: