"Maze" by Hack The Box

This crackme doesn’t require a lot of reversing skills but provides plenty of opportunities to learn new tools and techniques. At least I learned a few new things.

First of all, the archive contains three files: a Windows executable, a password-protected zip archive (the default “hackthebox” password does not open it), and a beautiful picture. Let’s check what’s inside the Windows executable:

$ diec 
PE64
    Linker: Microsoft Linker(14.36.32825)
    Compiler: Microsoft Visual C/C++(19.36.32825)[C]
    Tool: Visual Studio(2022 version 17.6)
    Packer: PyInstaller

This was the first thing I learned: there is a way to create an executable file for different platforms that bundles a Python interpreter and all the necessary dependencies! Luckily for me, there’s no need to dig into its innards and unpack it by hand (although that would be an interesting exercise), since an unpacker already exists, and there is even a web version.

Running pyinstxtractor produces a lot of files, but only two of them look interesting: maze.pyc and obf_path.pyc, which it uses as one of the imported files. The other files seem to be dependencies, dependencies of dependencies, and some kind of glue for PyInstaller to work.

Writing a decompiler for pyc files also seems like an interesting challenge, but I cut corners and used one that already exists. Now, the fun begins.

The first thing maze.py does is check that the input string is Y0u_St1ll_1N_4_M4z3; it then calls the obf_path.obfuscate_route() function. Decompiling obf_path.pyc doesn’t help much: it loads a binary blob and executes it. The question is how to convert this blob back to Python code.

Google didn’t readily answer how to generate pyc files from code objects, so I had to dig deeper. And that was the second thing I learned. There is a module named py_compile that generates bytecode from a source file, so it should definitely have something related to my question. It turned out that the only thing it does is call a separate module:

    if invalidation_mode == PycInvalidationMode.TIMESTAMP:
        source_stats = loader.path_stats(file)
        bytecode = importlib._bootstrap_external._code_to_timestamp_pyc(
            code, source_stats['mtime'], source_stats['size'])
    else:
        source_hash = importlib.util.source_hash(source_bytes)
        bytecode = importlib._bootstrap_external._code_to_hash_pyc(
            code,
            source_hash,
            (invalidation_mode == PycInvalidationMode.CHECKED_HASH),
        )

and then it writes this blob to the pyc file. I thought that the _code_to_timestamp_pyc function should do some complex things, but no, the pyc file format is quite simple:

def _code_to_timestamp_pyc(code, mtime=0, source_size=0):
    "Produce the data for a timestamp-based pyc."
    data = bytearray(MAGIC_NUMBER)
    data.extend(_pack_uint32(0))
    data.extend(_pack_uint32(mtime))
    data.extend(_pack_uint32(source_size))
    data.extend(marshal.dumps(code))
    return data

So, for our task, we need to call this function directly by providing a code object from the obf_path.py file:

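import importlib._bootstrap_external
from marshal import loads

# the marshalled code object embedded in obf_path.py (elided here)
v = loads(b'...')
pyc_data = importlib._bootstrap_external._code_to_timestamp_pyc(v)
with open('obf_path_unpacked.pyc', 'wb') as f:
    f.write(pyc_data)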
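
For reference, the same 16-byte header can be assembled by hand using only public modules. The sketch below is an illustration, not the exact script used here: `demo_source` and the `code_to_pyc` helper are made up for the example, and it assumes the interpreter running it is the same Python version that produced the marshalled blob, since both the magic number and the marshal format are version-specific.

```python
import importlib.util
import marshal
import struct


def code_to_pyc(code) -> bytes:
    """Wrap a code object in a timestamp-style pyc header (hypothetical helper)."""
    header = importlib.util.MAGIC_NUMBER    # 4 bytes, interpreter-specific
    header += struct.pack("<III", 0, 0, 0)  # flags, mtime, source size
    return header + marshal.dumps(code)


# Stand-in for the real blob: marshal a tiny code object, then load it back.
demo_source = "answer = 6 * 7"
blob = marshal.dumps(compile(demo_source, "<demo>", "exec"))
code = marshal.loads(blob)

pyc_data = code_to_pyc(code)

# Everything after the 16-byte header is the marshalled code object again.
namespace = {}
exec(marshal.loads(pyc_data[16:]), namespace)
print(len(pyc_data[:16]), namespace["answer"])
```

Zeroing the mtime and size fields is harmless for decompilation, because only the import system compares them against a source file.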

Decompiling the result with uncompyle6 creates another challenge: the next layer now uses lzma and zlib to decompress a binary blob and then executes it. Unpacking it and writing to standard output will produce Python code that looks a bit weird: between the long lines that assign a string value with a lot of repeated __regboss__ to a variable with the same name, there is another call to exec (loads(… )). I googled this string just out of curiosity, and it turns out that this code was generated by a “protector” named Regboss. Protector! Which compresses the code and assigns a variable with a long name around it! The world is going crazy.
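
Peeling such a layer usually comes down to replacing the exec call with a print. A minimal sketch, assuming the blob is zlib-compressed on the outside and lzma-compressed on the inside (the actual nesting order in the challenge may differ); `peel` and the sample payload are hypothetical:

```python
import lzma
import zlib


def peel(blob: bytes) -> str:
    """Undo one obfuscation layer and return the hidden source as text."""
    # Assumed order: zlib on the outside, lzma on the inside.
    return lzma.decompress(zlib.decompress(blob)).decode()


# Build a stand-in blob the same way, since the real one is not reproduced here.
source = "print('hello from the inner layer')"
blob = zlib.compress(lzma.compress(source.encode()))

recovered = peel(blob)
print(recovered)  # inspect the code instead of executing it
```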

In any case, decompiling the blob using the same technique with importlib will eventually give you the source code. Simplifying the logic a bit, it looks like this:

import sys
from time import sleep

index_file = "maze.png"
index = open(index_file, "rb").read()
seed = index[4817] + index[2624] + index[2640] + index[2720]
print("\n\nG00d!! you could escape the obfuscated path")
print("take this it may help you: ")
sleep(2)
print(f"\nseed({seed})\nfor i in range(300):\n    randint(32,125)\n")
print("Be Careful!!!! the route from here is not safe.")
sys.exit(0)

So, there are no flags here, but it looks like a hint for the next step:

seed(493)
for i in range(300):
    randint(32,125)

Returning to where we started, maze.py has the following logic after the obf_path.obfuscate_route() function I just described:

unpack the enc_maze.zip archive with the hardcoded password
read the contents of the maze file from the archive
“decrypt” the content using the following logic and write the result to the dec_maze file:

for i in range(0, len(data), 10):
    data[i] = (data[i] + 80) % 256
else:
    for i in range(0, len(data), 10):
        data[i] = (data[i] ^ key[i % len(key)]) % 256
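
The `for ... else` here is probably just how the decompiler rendered two consecutive loops. In Python the `else` branch of a loop runs once the loop finishes without hitting `break`, so the second loop simply runs after the first. A tiny sketch with made-up data:

```python
steps = []

for i in range(2):
    steps.append(("first", i))
else:
    # Runs after the loop above completes, because no `break` was executed.
    for i in range(2):
        steps.append(("second", i))

print(steps)
# [('first', 0), ('first', 1), ('second', 0), ('second', 1)]
```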

The dec_maze file looks like an ELF binary, but none of the analysis tools can recognize it:

$ xxd -l 64 dec_maze
00000000: 3f45 4c46 0201 0100 0000 5f00 0000 0000  ?ELF......_.....
00000010: 0300 3e00 6000 0000 8010 0000 0000 3d00  ..>.`.........=.
00000020: 4000 0000 0000 0000 4b31 0000 0000 0000  @.......K1......
00000030: 0000 3d00 4000 3800 0d00 4000 6d00 1c00  ..=.@.8...@.m...

$ file dec_maze
dec_maze: data

$ objdump -f ./dec_maze
objdump: ./dec_maze: file format not recognized

If you look closely, the file is corrupt: the first byte of the ELF magic should be 7f, plus some other fields in the header have strange values too. My idea was that it was not fully “decrypted”, or the decryption algorithm should be modified somehow to get the correct result.

And indeed, when I carefully read maze.py again, a problem emerged:

key = [0] * len(data)

In other words, the second round of the “decryption” algorithm actually does nothing: it XORs the data with 0, resulting in the same value. My gut feeling said that I needed to use the “hint” code from the obf_path.obfuscate_route() function:

import random

key = []
random.seed(493)
for i in range(300):
    key.append(random.randint(32,125))

and voila:

$ file dec_maze
dec_maze: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=fda317f523cc4b926eea4e2565e7b9e6390f5aff, for GNU/Linux 3.2.0, stripped
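
Putting the two rounds and the regenerated key together could look like the sketch below. The function names are hypothetical, and `encrypt` is only an assumed inverse added to sanity-check the round trip; it does not come from the challenge. The key range is at least consistent with the broken header: the first byte was 0x3f where ELF expects 0x7f, and the two differ by an XOR with 0x40 (64), which lies within 32..125.

```python
import random


def build_key(seed=493, length=300):
    """Rebuild the key from the hint: seed(493), then 300 calls to randint(32, 125)."""
    rng = random.Random(seed)
    return [rng.randint(32, 125) for _ in range(length)]


def decrypt(data, key):
    """Apply both rounds from maze.py; only every 10th byte is touched."""
    out = bytearray(data)
    for i in range(0, len(out), 10):
        out[i] = (out[i] + 80) % 256
    for i in range(0, len(out), 10):
        out[i] ^= key[i % len(key)]
    return bytes(out)


def encrypt(data, key):
    """Assumed inverse of decrypt, used here only for a self-check."""
    out = bytearray(data)
    for i in range(0, len(out), 10):
        out[i] = ((out[i] ^ key[i % len(key)]) - 80) % 256
    return bytes(out)


key = build_key()
sample = bytes(range(40))
roundtrip = decrypt(encrypt(sample, key), key)
print(roundtrip == sample)
```

For the real file, `data` would be the contents of the maze file from the archive, and the result of `decrypt` would be written to dec_maze.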

Now we finally have a binary to analyze! During the disassembly round, the following logic was revealed (in pseudo-C):

fgets(input, 64, stdin);
if (input[0] != 'H' || input[1] != 'T' || input[2] != 'B')
    return -1;

int length = strlen(input);
for (int i = 1, flag_index = 0; i < length; ++i, ++flag_index) {
    if (i + 1 == length)
        break;

    if ((input[i-1] + input[i] + input[i+1]) != encrypted_flag[flag_index])
        return -1;
}

return 0;

In other words, the algorithm compares the sum of three adjacent characters in the input data with some integer in the encrypted_flag array. This means that in this case 'H' + 'T' + 'B' should be equal to 0xDE, then 'T' + 'B' + '{' should be equal to 0x111, and so on.
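
The arithmetic is easy to check. A small sketch (the `window_sums` name is made up for illustration) that computes the sliding three-character sums for the known prefix:

```python
def window_sums(text):
    """Sum every run of three adjacent characters, as the check in the binary does."""
    return [
        ord(text[i - 1]) + ord(text[i]) + ord(text[i + 1])
        for i in range(1, len(text) - 1)
    ]


sums = window_sums("HTB{")
print([hex(s) for s in sums])  # ['0xde', '0x111']
```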

To reverse this algorithm, we need to do the following:

get the current integer from encrypted_flag
subtract the sum of the two previous characters from it
the result is the current character we need

In code it might look like this (in pseudo-Java now, because why not):

// encrypted_flag[i] == flag[i] + flag[i+1] + flag[i+2], as in the check above
var flag = new char[encrypted_flag.length + 2];
flag[0] = 'H';
flag[1] = 'T';

for (int i = 0; i < encrypted_flag.length; ++i) {
  var c = encrypted_flag[i] - (flag[i] + flag[i+1]);
  flag[i+2] = (char) c;
}
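
The same recovery can be sanity-checked in Python with a round trip. This is a sketch under the indexing of the pseudo-C check, where encrypted_flag[0] covers the first three characters; the `recover` name and the sample string are made up, and the real encrypted_flag values from the binary are not reproduced here.

```python
def recover(encrypted_flag, prefix="HT"):
    """Rebuild the text from its three-character sums, given the first two characters."""
    flag = [ord(c) for c in prefix]
    for total in encrypted_flag:
        # total == flag[-2] + flag[-1] + next character
        flag.append(total - flag[-2] - flag[-1])
    return "".join(chr(c) for c in flag)


# Made-up sample, encoded the way the binary is described to check its input.
sample = "HTB{n0t_th3_r34l_fl4g}"
sums = [ord(a) + ord(b) + ord(c) for a, b, c in zip(sample, sample[1:], sample[2:])]

recovered = recover(sums)
print(recovered == sample)
```

Two known characters are enough to start the chain; the third known one ('B') then works as a quick check that the indexing is right.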

And that’s all we need.
The challenge can be found here.