“Maze” by Hack The Box

This crackme doesn’t require a lot of reversing skills but provides plenty of opportunities to learn new tools and techniques. At least I learned a few new things.

First of all, the archive contains three files: a Windows executable, a password-protected zip archive (the usual “hackthebox” password does not work), and a beautiful picture. Let’s check what’s inside the Windows executable:

This was the first thing I learned: it is possible to build an executable for different platforms that bundles a Python interpreter and all the necessary dependencies! Luckily for me, there’s no need to dig into its innards and unpack it by hand (although that would be an interesting exercise), since an unpacker already exists, and there is even a web version of it.

Running pyinstxtractor produces a lot of files, but only two of them look interesting: maze.pyc and obf_path.pyc, which maze.pyc imports. The other files seem to be dependencies, dependencies of dependencies, and some kind of glue for PyInstaller to work.

Writing a decompiler for pyc files also seems like an interesting challenge, but I cut corners and used one that already exists. Now, the fun begins.

The first thing maze.py does is check that the input string is Y0u_St1ll_1N_4_M4z3; it then calls the obf_path.obfuscate_route() function. Decompiling obf_path.pyc doesn’t help much: it loads a binary blob and executes it. The question is how to convert this blob back into Python code.

A quick Google search didn’t tell me how to generate pyc files from code objects, so I had to dig deeper. And that was the second thing I learned. There is a module named py_compile that generates bytecode from a source file, so it should definitely have something related to my question. It turned out that the only thing it does is call a separate module:

and then it writes this blob to the pyc file. I thought that the _code_to_timestamp_pyc function should do some complex things, but no, the pyc file format is quite simple:
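As a rough illustration of that layout, here is a minimal sketch that builds the same bytes by hand. It assumes the 16-byte header used by Python 3.7 and later (magic number, flags, source mtime, source size); older interpreters use a shorter header. The function name build_timestamp_pyc and the demo code object are mine, not part of the challenge.

```python
import importlib.util
import marshal
import struct


def build_timestamp_pyc(code, mtime=0, source_size=0):
    """Serialize a code object into the bytes of a timestamp-based .pyc file."""
    header = importlib.util.MAGIC_NUMBER                    # 4 bytes: bytecode version marker
    header += struct.pack("<I", 0)                          # flags: 0 = timestamp-based pyc
    header += struct.pack("<I", mtime & 0xFFFFFFFF)         # source modification time
    header += struct.pack("<I", source_size & 0xFFFFFFFF)   # source file size
    return header + marshal.dumps(code)                     # the code object itself


# Stand-in code object, only for demonstration.
demo = compile("answer = 6 * 7", "<demo>", "exec")
pyc_bytes = build_timestamp_pyc(demo)
```

Everything after the header is plain marshal data, which is why a code object is all that is needed to produce a file a decompiler will accept.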

So, for our task, we need to call this function directly by providing a code object from the obf_path.py file:
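A minimal sketch of that call might look like the following. It assumes the blob is a marshal-serialized code object and that the script runs on the same Python version that produced the blob, since marshal data is not portable across versions. The helper name code_blob_to_pyc, the output path, and the stand-in blob are hypothetical.

```python
import marshal
import os
import tempfile
from importlib._bootstrap_external import _code_to_timestamp_pyc


def code_blob_to_pyc(blob, out_path):
    """Turn a marshalled code object into a .pyc file that a decompiler can read."""
    code = marshal.loads(blob)            # assumption: the blob is marshal data
    data = _code_to_timestamp_pyc(code)   # private helper: header + marshalled code
    with open(out_path, "wb") as fh:
        fh.write(data)
    return out_path


# Stand-in blob for demonstration; the real one comes from obf_path.
stand_in_blob = marshal.dumps(compile("print('hello from the blob')", "<blob>", "exec"))
out = code_blob_to_pyc(stand_in_blob, os.path.join(tempfile.mkdtemp(), "layer1.pyc"))
```

Note that _code_to_timestamp_pyc is a private function of importlib, so its name and signature may change between Python releases.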

Decompiling the result with uncompyle6 creates another challenge: the next layer now uses lzma and zlib to decompress a binary blob and then executes it. Unpacking it and writing to standard output will produce Python code that looks a bit weird: between the long lines that assign a string value with a lot of repeated __regboss__ to a variable with the same name, there is another call to exec (loads(… )). I googled this string just out of curiosity, and it turns out that this code was generated by a “protector” named Regboss. Protector! Which compresses the code and assigns a variable with a long name around it! The world is going crazy.
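The decompression step of this layer can be sketched roughly as below. The order of the two calls is an assumption on my part (lzma on the outside, zlib inside); if the real layer nests them the other way round, the two calls simply swap. The sample blob is built on the spot so the sketch is self-contained.

```python
import lzma
import zlib


def unpack_layer(blob):
    """Undo one lzma + zlib layer (assumed order: lzma outermost)."""
    return zlib.decompress(lzma.decompress(blob))


# Hypothetical stand-in for the packed blob found in the decompiled layer.
sample = lzma.compress(zlib.compress(b"print('next layer')"))
recovered = unpack_layer(sample)
print(recovered.decode())
```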

In any case, decompiling the blob using the same technique with importlib will eventually give you the source code. Simplifying the logic a bit, it looks like this:

So, there are no flags here, but it looks like a hint for the next step:

Returning to where we started, maze.py has the following logic after the obf_path.obfuscate_route() function I just described:

  • unpack the enc_maze.zip archive with the hardcoded password
  • read the contents of the maze file from the archive
  • “decrypt” the content using the following logic and write the result to the dec_maze file:

The dec_maze file looks like an ELF binary, but none of the analysis tools can recognize it:

If you look closely, the file is corrupt: the first byte of the ELF magic should be 7f, plus some other fields in the header have strange values too. My idea was that it was not fully “decrypted”, or the decryption algorithm should be modified somehow to get the correct result.
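A quick way to confirm this without any analysis tool is to compare the first four bytes with the ELF magic, 7f 45 4c 46. The sketch below is only an illustration: the function names are mine, and the "broken" header is a made-up example, not the actual bytes of dec_maze.

```python
ELF_MAGIC = b"\x7fELF"  # 7f 45 4c 46


def looks_like_elf(header):
    """Return True if the given bytes start with the ELF magic number."""
    return header[:4] == ELF_MAGIC


def describe_magic(header):
    """Render the first four bytes as hex for easy comparison."""
    return " ".join(f"{b:02x}" for b in header[:4])


good = b"\x7fELF\x02\x01\x01\x00"    # start of a typical 64-bit little-endian ELF
broken = b"\x00ELF\x02\x01\x01\x00"  # hypothetical corrupted first byte

print(describe_magic(good), looks_like_elf(good))
print(describe_magic(broken), looks_like_elf(broken))
```

For the real file, the same check can be run on the first four bytes read from dec_maze in binary mode.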

And indeed, when I carefully read maze.py again, a problem emerged:

In other words, the second round of the “decryption” algorithm actually does nothing: it XORs the data with 0, resulting in the same value. My gut feeling said that I needed to use the “hint” code from the obf_path.obfuscate_route() function:

and voila:

Now we finally have a binary to analyze! During the disassembly round, the following logic was revealed (in pseudo-C):

In other words, the algorithm compares the sum of three adjacent characters in the input data with the corresponding integer in the encrypted_flag array. This means that 'H' + 'T' + 'B' should be equal to 0xDE, then 'T' + 'B' + '{' should be equal to 0x111, and so on.

To reverse this algorithm, we need to do the following:

  • get the current integer from encrypted_flag
  • subtract the sum of the two previous characters from it
  • the result is the current character we need
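The steps above can be sketched in Python as follows. The sketch assumes the first two characters of the flag are known ("HT", from the usual HTB{ prefix) and uses only the two sums quoted above; the full encrypted_flag array comes from the binary and is not reproduced here. The function name recover_flag is mine.

```python
def recover_flag(sums, prefix):
    """Rebuild the flag from sliding sums of three adjacent characters.

    sums[i] is flag[i] + flag[i + 1] + flag[i + 2]; prefix holds the first
    two characters, which must be known in advance.
    """
    chars = [ord(c) for c in prefix]
    for total in sums:
        # current character = sum - (two previous characters)
        chars.append(total - chars[-2] - chars[-1])
    return "".join(chr(c) for c in chars)


# Only the two values mentioned in the text: 0xDE and 0x111.
partial = recover_flag([0xDE, 0x111], "HT")
print(partial)  # HTB{
```

Feeding the whole encrypted_flag array to the same function yields the complete flag.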

In code it might look like this (in pseudo-Java now, because why not):

And that’s all we need.
The challenge can be found here.