Thursday, May 5, 2016

PCDC 2016 RE Challenge Solutions - Part 2

This is a continuation of my previous post detailing the solutions to the 2016 PCDC reverse engineering challenges 1 and 2. This post will go over the solutions for challenges 3 and 4. If you were a competitor in this year's PCDC competition, you may not have seen all of the RE challenges. Challenges 1 and 2 were made for high school students, challenge 3 was added for college students, and challenge 4 was added for professional day. As a result, only the professionals saw all 4 challenges. This was designed to help balance the expected workload during the competition and provide increasing levels of difficulty for increasing levels of skill. So let's look at the remaining two challenge solutions.

Challenge 3 - exclusive

Like with all the other challenges, let's run this one and get an idea of what it does.


So this looks a little bit like our previous challenges, where we have to give it a valid input and it will tell us if that input is correct and print out the key. Let's open the challenge up in Immunity and take a look.


After loading the binary and analyzing the modules just like we did in the previous challenge, we can scroll down and see the instructions responsible for printing out the initial user prompt. We can also see how our input gets read into the program via the call to fgets(). The important thing to note is how fgets() gets its parameters. In 32-bit x86 assembly on Windows, which is what we are looking at, these parameters are passed on the stack following a calling convention called CDECL (C declaration). What this means is that the PUSH instructions are actually setting up the parameters to the fgets() function. To understand what these parameters are, let's look at this function.

Using MSDN as a resource, we can see that fgets() has the following signature:

char *fgets(char *str, int n, FILE *stream);

What this function is doing is reading at most n - 1 characters from the file stream stream (stopping early at a newline) and storing them, followed by a terminating null byte, in the memory buffer pointed to by str. If you look at Immunity, it even tells you which PUSH instruction is responsible for setting up which parameter to the fgets() call, and in this case it tells you that n is 0x16 (Immunity displays this as n = 16 (22.), meaning hex 16, decimal 22). Now for our purposes, the important parameter to consider is the address where this user-defined input gets stored. If you looked at the link specifying the CDECL calling convention, you'll see that parameters are pushed right to left, so the first parameter listed in the function (char *str) is the last one pushed onto the stack before the call to fgets(). We don't need to understand all the instructions, we just need to see the use of DWORD PTR SS:[EBP-34]. This is going to be a pointer to where our input will be stored. If we look through the disassembly a little more, we should see this address show up again, specifically in the 0x00401397 - 0x004013CE address range. Before we go any further, let's recap what we know:

- The program expects to get a 'serial number' which it then validates
- The program uses a call to fgets() to read in our input
- Our input is stored at DWORD PTR SS:[EBP-34]
- We see our stored address used within the assembly address range 0x00401397 - 0x004013CE

Now it's time to verify these. The best way to do this with a debugger is to set breakpoints. At this point, right-click on the address 0x00401397, go down to breakpoints and select toggle. Alternatively, click on the address and press F2 to toggle the breakpoint. This address is reached after we have entered the string into the program and before it looks like it's getting used. With the breakpoint turned on, our program will halt at that address when execution gets there. So with our breakpoint set, let's run the program. This can be done 3 different ways: 1. Press F9, 2. Go to the top menu and press the red play button, 3. Go to Debug -> Run. The first thing that will happen is that the program will break at what is called the program entry point. This is the address where the OS hands control to the program when it starts up. There are a lot of steps that go into getting from this point to the breakpoint we set, but we don't care about those at this time. Just press F9 again to continue. Now at this point, you should see the Windows command window. It should be waiting for you to enter your input. Go ahead and enter some string into this window like you did the first time you ran this challenge program.



At this point it will look like your program hangs, but this is Immunity pausing it so you can begin to dive deeper into the details of what's going on. Let's analyze the following snippet of code:



If we look at the first instruction, we see a CMP to 0x16. Remember from the first post that CMP is used to make comparisons, and remember from earlier in this post that this challenge reads into a 0x16 byte buffer. Now look at the second instruction, JNB SHORT exclusiv.004013D0. It is a conditional jump to the address 0x004013D0. That target address is interesting, but not as interesting as the instruction just before it at address 0x004013CE, a JMP SHORT exclusiv.00401397. That's the same address we set our breakpoint at! This is a loop! We compare some counter to 0x16; if it is less than 0x16, we continue execution, otherwise we jump out of the loop. It kind of looks like:

int x = 0;
while (x < 0x16) {
    loop_body();
}

Now we haven't figured out the loop body yet. But let's step through this an instruction at a time by pressing the F7 key and stop when we get to address 0x004013A5, the first XOR instruction. If you're following along with my example and entered in abcd1234 as your string to test, you should see something like this:


Now I've added some circles to draw your attention to a few places. So the instruction we stopped at is XOR EAX, 9C. This means we are performing the XOR operations on whatever value is in the EAX register with the hex value 0x9C. I've circled the EAX register, and we can see the value of 0x61. If you recall, our input string was abcd1234, the hexadecimal value for the ASCII character 'a' is 0x61. Interesting. Let's continue to the next XOR instruction at 0x004013B9.


Again, I've highlighted the value in the EAX register, 0x62 (ASCII value for 'b'), and the other value in the XOR operation is 0xDC. If you continue to step through, you'll see that we jump back to address 0x00401397, and loop through these series of instructions again. When you inspect the EAX register, you'll see the values 0x63 and 0x64 (ASCII 'c' and 'd' respectively). It is indeed our input string. Now, we know we are in a loop. In fact, we wrote out some pseudo-code for it. So let's update it a little:

int x = 0;
while (x < 0x16) {
    input[x] ^= 0x9C;
    input[x+1] ^= 0xDC;
    x++;
}

Hmmm, this doesn't seem quite right. So how is our loop being incremented and how do we know the index into our input? Let's look at the following details:

What we're looking at is how the input string is indexed using the EDX register, and how that register is incremented by 2 every loop iteration. So let's refine our code a little more:

int x = 0;
while (x < 0x16) {
    input[x] ^= 0x9C;
    input[x+1] ^= 0xDC;
    x += 2;
}
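As a quick sanity check on the refined pseudo-code, here is a minimal Python sketch of the same loop. The function name xor_serial is made up for illustration, and the sketch only transforms the characters that were typed (the real program walks the whole 0x16 byte buffer), so treat it as an approximation of the challenge binary rather than a copy of it.

```python
def xor_serial(data: bytes) -> bytes:
    """XOR even-indexed bytes with 0x9C and odd-indexed bytes with 0xDC."""
    out = bytearray(data)
    # Step by 2, mirroring the counter that the disassembly increments by 2.
    for x in range(0, len(out), 2):
        out[x] ^= 0x9C
        if x + 1 < len(out):  # guard for odd-length input
            out[x + 1] ^= 0xDC
    return bytes(out)


# The test string used in this walkthrough.
print(xor_serial(b"abcd1234").hex())  # fdbeffb8adeeafe8
```

The first two output bytes, fd and be, are what the debugger should show at the start of the buffer once the loop has run over abcd1234.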

Great. Now when we exit our loop, our input has been XOR'ed with the repeating two-byte key 0x9CDC (0x9C for even offsets, 0xDC for odd offsets). A little further down we see a call to the function memcmp(). Here is the MSDN documentation for that function. What it does is check whether 2 memory regions contain the same data for a given number of bytes. Again, Immunity has done some work for us and shows us that this memcmp() is using 2 pointers, s1 and s2, and comparing them up to 0x15 (21) bytes. Let's set a breakpoint at address 0x004013DA where memcmp() gets called. So, by looking at the previous instructions and what we learned from the loop we reverse engineered, we see that s2 is the string we entered after it has been XOR'ed. A good way to confirm this is to notice that the address of this string got loaded into the EAX register by the instruction at address 0x004013D2. So what we can do is right-click on the EAX register and click Follow in Dump. This will show a hex dump of our data:


And indeed, those are our input bytes after being XOR'ed. Check the first 2 bytes; 0x6162 XOR 0x9CDC = 0xFDBE. So what we are really interested in is the memory contents of the second pointer, s1, being used in the memcmp(). We can find that by doing the same thing we did in the previous step, but with the ECX register. So let's get the contents of that dump.


This is ultimately our goal: the 0x15 (21) bytes of this memory segment. This is what our input string gets compared to after being XOR'ed against the key 0x9CDC. So now let's update our pseudo-code:

char serial[0x15] = "\xff\xbd\xfa\xb9\xb1\xec\xad\xee\xaf\xe8\xb1\xe9\xaa\xeb\xa4\xe5\xb1\xbe\xf9\xb9\xfa";
char *input = (char *)malloc(0x16);
fgets(input, 0x16, stdin);

int x = 0;
while (x < 0x16) {
    input[x] ^= 0x9C;
    input[x+1] ^= 0xDC;
    x += 2;
}

if (memcmp(serial, input, 0x15) == 0) {
    win();
} else {
    fail();
}

Now, we know this is what happens with the memcmp() function by reading the documentation, seeing the TEST EAX, EAX instruction, and seeing that a successful comparison jumps over the failure messages. So the only thing left to do is solve for the input string. If you read the documentation on XOR, you know that it is reversible: XOR'ing a value with the same key twice gives back the original value. So all we have to do is XOR the serial we dumped out of memory with the key 0x9CDC. There are plenty of online tools to do this for you. Once you do it, you'll find that the input should be: cafe-01234-56789-beef. Let's give this a try.
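If you would rather not rely on an online tool, the decoding step can be sketched in a few lines of Python. The 21 bytes are the ones listed in the pseudo-code above; the names SERIAL, KEY and decode are made up for this sketch.

```python
# The 21 bytes dumped from the s1 pointer passed to memcmp().
SERIAL = bytes.fromhex("ffbdfab9b1ecadeeafe8b1e9aaeba4e5b1bef9b9fa")

# Repeating two-byte key: 0x9C for even offsets, 0xDC for odd offsets.
KEY = bytes([0x9C, 0xDC])


def decode(serial: bytes, key: bytes = KEY) -> bytes:
    """XOR each byte with the key byte at the same (repeating) offset."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(serial))


print(decode(SERIAL).decode("ascii"))  # cafe-01234-56789-beef
```

Because XOR is its own inverse, running decode() on the recovered string yields the dumped bytes again.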



Success!

Password: cafe-01234-56789-beef
Flag: FLAG{XOR_Encryption_Is_Super_Safe!!!}

Challenge 4 - hashbrowns

This challenge was made specifically for the competition on professional day. As such, I'm going to assume a slightly more advanced level of readership. The last challenge solution really went in depth by stepping through the explanation. This solution write-up isn't going to go into as much detail describing how a particular section of code does something. Instead, it will be left up to the reader to figure out all the details.

So let's start by running this program like we did the ones before it.



Ok. So we are given a hash and we have to find the input that matches this hash. So the way we do this is we need to find the hashing algorithm. Opening up the executable in Immunity, we should see the following:


So with this challenge, we see that there are a couple of internal functions with the CALL instructions to addresses within the hashbrowns executable image. Notably, these can be seen at 0x00401442 and 0x00401471. Let's look at the first function starting at address 0x00401310 by right-clicking on the address and choosing the follow option.



I've gone ahead and annotated the interesting aspects of this function. The first is that this is a loop and we see a call to strlen() in the loop. This leads us to believe that we are looping through a string for the length of the string. The other interesting pieces of information about this loop are the CMP instructions. We see each of them circled in the above picture and each comparison is to a hex value. The four values, 0x41, 0x5A, 0x61, and 0x7A, correspond to the ASCII characters 'A', 'Z', 'a', and 'z' respectively. This loop looks like it is looking for upper and lowercase alphabetical characters. We can further confirm this by analyzing the string argument to the printf() call at 0x00401453, "Found a character I can't hash." This clue would indicate that the program only accepts alphabetical characters. Let's test this hypothesis.
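Before running the test, here is a minimal Python sketch of the check this loop appears to perform, based only on the four comparison values seen in the disassembly. The function name is_hashable is made up, and the exact control flow of the binary is not reproduced here.

```python
def is_hashable(text: str) -> bool:
    """Return True if every character falls in 'A'-'Z' (0x41-0x5A) or 'a'-'z' (0x61-0x7A)."""
    return all(
        0x41 <= ord(ch) <= 0x5A or 0x61 <= ord(ch) <= 0x7A
        for ch in text
    )


print(is_hashable("HelloWorld"))  # True
print(is_hashable("abcd1234"))    # False: digits fall outside both ranges
```

If the hypothesis holds, any input for which this returns False should trigger the "Found a character I can't hash." message.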


We have confirmed our hypothesis. Let's move on to the next internal function at 0x00401390. Here is a picture of that function.



So again we can see that we are looping through the length of the string character by character. This is in fact the function that performs the hashing. Now, there were a few ways to solve this. The difficult way was to manually reverse engineer the actual algorithm represented by the instructions at addresses 0x004013C3 - 0x004013D7. The easier way was to look at the value 0x1505 and do some research on hashing algorithms. If you convert 0x1505 to decimal, you get 5381. A Google search with the keywords hash and 5381 should lead you directly to the DJB2 hashing algorithm. Here is what that algorithm looks like.
unsigned long djb2(unsigned char *str) {
    unsigned long hash = 5381;
    int c;
    while (c = *str++)
        hash = ((hash << 5) + hash) + c;

    return hash;
}
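Two small facts connect this listing to the disassembly: the constant 0x1505 is 5381 in decimal, and (hash << 5) + hash is simply hash * 33. The Python sketch below checks both. The helper names are made up, and the 32-bit mask is an assumption reflecting the 32-bit registers used by the challenge binary.

```python
MASK32 = 0xFFFFFFFF  # assumption: the hash lives in a 32-bit register

INITIAL = 0x1505
print(INITIAL)  # 5381


def step_shift(h: int, c: int) -> int:
    """One DJB2 round written with the shift, as in the C listing."""
    return (((h << 5) + h) + c) & MASK32


def step_multiply(h: int, c: int) -> int:
    """The same round written as a multiplication by 33."""
    return (h * 33 + c) & MASK32


# Both forms agree for an arbitrary character, here 'a'.
print(step_shift(INITIAL, ord("a")) == step_multiply(INITIAL, ord("a")))  # True
```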

When looking at this algorithm, we see the constant 0x1505 and the shift left by 5 (SHL EDX, 5). Now we need to make sure this is actually what we're after. Going back to the main part of our challenge, we see the following section of code.


What we see here is a call to our hashing function at 0x00401390, then a series of CMP instructions. Interestingly, we see a CMP against the value 0x28C7FAE4, which is the hash value the challenge asked us to match. If we look a little further, we can see that the jump following this CMP goes to a unique address that no other comparison jumps to. So these are the pieces we have so far:

  • The program expects a string input that will be hashed
  • That string gets hashed through the DJB2 hashing algorithm
  • The result of the hash must match 0x28C7FAE4

The way to solve this is to write a brute force script that enumerates alphabetical strings and hashes them using the DJB2 hashing algorithm until a match is found with the value 0x28C7FAE4.

This is what the DJB2 algorithm looks like in Python.

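def djb2(string):
    # 'value' avoids shadowing Python's built-in hash()
    value = 5381
    for x in string:
        value = ((value << 5) + value) + ord(x)
    # Truncate to 32 bits to match the 32-bit register in the binary
    return value & 0xFFFFFFFF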

The best way to solve this is to actually use a password list. Using the rockyou password list and a Python script based on the previous snippet of code, you would have found that the hash value matches the input string cyberdragoon. Here is the proof.
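The script used during the competition is not shown here, so what follows is only a minimal sketch of the wordlist approach under a few assumptions: the function names are made up, the candidate list is a tiny stand-in for a real password list, and non-alphabetical words are skipped because the program rejects them.

```python
from typing import Iterable, Optional

TARGET = 0x28C7FAE4  # hash value the challenge asks us to match


def djb2(text: str) -> int:
    """DJB2 truncated to 32 bits on every round."""
    value = 5381
    for ch in text:
        value = ((value << 5) + value + ord(ch)) & 0xFFFFFFFF
    return value


def crack(target: int, candidates: Iterable[str]) -> Optional[str]:
    """Return the first alphabetical candidate whose hash equals target."""
    for word in candidates:
        word = word.strip()  # drop trailing newlines when reading from a file
        if word.isascii() and word.isalpha() and djb2(word) == target:
            return word
    return None


# Tiny stand-in list; a real run would iterate over a password list instead.
sample = ["password", "iloveyou", "cyberdragoon", "dragon"]
print(crack(TARGET, sample))  # cyberdragoon
```

Since crack() accepts any iterable of strings, an open file handle for a wordlist can be passed in directly. Large lists such as rockyou contain non-UTF-8 bytes, so opening the file with encoding="latin-1" is one way to avoid decode errors.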



So here is the summary for RE challenge 4:
Password: cyberdragoon
Flag: FLAG{alls_good_when_the_hashing_is_easy}


Conclusion

I hope this year introduced an interesting new twist on the PCDC injects. If you're interested in working through the challenges on your own, I've put the executables on my GitHub account here. As always, I'm interested in hearing feedback, so if you have anything you'd like to see improved for next year, don't hesitate to let me know. See you in 2017!