TLOP is the final challenge of the LabyREnth CTF random track. When you download it you’re given a file called TLOP.pyw. If you run it, it opens a program called APT Maker Pro - UNREGISTERED TRIAL VERSION. There’s a button labeled “Generate APT”, which informs you that you need to activate APT Maker Pro, and a button labeled “Activate APT Maker Pro!”, which brings up a dialog asking for the product key. The challenge here is to find the product key to activate the program, allowing you to generate the APT.

Let’s start reversing! TLOP.pyw is a pyw file, not a py file. The two are pretty much the same, except that on Windows pyw files don’t bring up a command line window, so the file can be renamed to py if you’d like to print debug info to STDOUT. Once you open up the file you’ll see ten lines of Python (get it? Ten Lines of Python? TLOP… it’s stupid). Most of the code after the imports simply sets up a Tkinter window, which is actually the splash screen. The last line is where it gets interesting:

exec marshal.loads(zlib.decompress(<longgggggggg gibberish string>))

So it’s an exec statement being passed the result of marshal.loads, which is loading some zlib’d data. Marshal is the Python module for serializing and deserializing builtin Python types. If we remove the exec and run the same line in the Python shell, we can see the result is a code object. exec statements in Python accept two types of input, strings of Python code and code objects which contain compiled Python bytecode. Code objects are most often seen used in .pyc files, which are compiled Python files which get generated when Python modules are imported. Pyc files contain a 32-bit magic number specifying the Python version, a 32-bit timestamp of the compilation, and a marshaled code object. Since we already have a marshaled code object, we can turn this into a .pyc file with the following code:

import py_compile
import zlib
o = open('stage1.pyc', 'wb')
o.write(py_compile.MAGIC)
o.write('\x00' * 4) # null timestamp
o.write(zlib.decompress(<longgggggggg gibberish string>))
o.close()
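To make the packing concrete, here is a minimal round-trip sketch. It is written for Python 3 (the challenge itself targets Python 2.7), and the `source` string, the `answer` name and the in-memory `pyc` buffer are stand-ins invented for illustration. It only mimics the Python 2 layout of an 8-byte header followed by a marshaled code object; real Python 3.7+ .pyc files use a 16-byte header.

```python
import io
import marshal
import zlib

# Stand-in for the hidden stage: any compiled module-level code works.
source = "answer = 6 * 7"
code = compile(source, "<stage>", "exec")

# Pack it the way the last line of TLOP.pyw expects to unpack it.
blob = zlib.compress(marshal.dumps(code))

# Python 3 spelling of: exec marshal.loads(zlib.decompress(blob))
namespace = {}
exec(marshal.loads(zlib.decompress(blob)), namespace)
print(namespace["answer"])

# Python 2-style .pyc layout: 4-byte magic, 4-byte timestamp, marshaled code.
PY27_MAGIC = b"\x03\xf3\r\n"  # Python 2.7 magic: 62211 little-endian plus "\r\n"
pyc = io.BytesIO(PY27_MAGIC + b"\x00" * 4 + zlib.decompress(blob))

pyc.read(8)  # skip magic and timestamp
restored = marshal.load(pyc)
print(type(restored).__name__)
```

The point is that the decompressed blob already is the body of a .pyc file, so prepending a header is all a decompiler needs.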

Using uncompyle2 we can decompile the stage1.pyc file we created.

Decompiled output

The resulting decompilation contains a class called AptMaker which contains most of the code for the UI. After the class there is an RC4 function, as well as another exec on the result of another marshal. We can build the exec’d object into another pyc file to analyze it.

import py_compile
import zlib
import base64
o = open('stage2.pyc', 'wb')
o.write(py_compile.MAGIC)
o.write('\x00' * 4) # null timestamp
o.write(zlib.decompress(base64.b64decode(<longgggggggg gibberish string>)))
o.close()

If we attempt to decompile the stage2.pyc with uncompyle2, we get the following error:

$ uncompyle2 stage2.pyc
# 2016.10.02 23:03:37 PDT
#Embedded file name: a
### Can't uncompyle stage2.pyc
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 197, in main
    uncompyle_file(infile, outstream, showasm, showast, deob)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 130, in uncompyle_file
    uncompyle(version, co, outstream, showasm, showast, deob)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/__init__.py", line 117, in uncompyle
    walker.gen_source(ast, customize)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 1406, in gen_source
    self.print_(self.traverse(ast, isLambda=isLambda))
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 492, in traverse
    self.preorder(node)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/spark.py", line 692, in preorder
    self.preorder(kid)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/spark.py", line 692, in preorder
    self.preorder(kid)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/spark.py", line 692, in preorder
    self.preorder(kid)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/spark.py", line 687, in preorder
    self.default(node)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 1180, in default
    self.engine(table[key], node)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 1130, in engine
    self.preorder(node[entry[arg]])
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/spark.py", line 685, in preorder
    func(node)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 878, in n_mkfunc
    self.make_function(node, isLambda=0)
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 1330, in make_function
    self.print_docstring(indent, code.co_consts[0])
  File "/usr/local/lib/python2.7/site-packages/uncompyle2/Walker.py", line 544, in print_docstring
    docstring = repr(docstring.expandtabs())[1:-1]
AttributeError: 'code' object has no attribute 'expandtabs'
# decompiled 0 files: 0 okay, 1 failed, 0 verify failed
# 2016.10.02 23:03:37 PDT

Throwing it at other Python decompilers will likely yield similar errors, so we’re going to have to take a different approach. Since it won’t decompile we can attempt to disassemble the bytecode. Python has a built-in module for disassembling Python bytecode called dis, which we can use to disassemble the pyc file we produced by running the following code:

import marshal
import dis
o = open('stage2.pyc', 'rb')
o.read(8)
c = marshal.load(o)
o.close()
dis.dis(c)

Which produces the following output:

  0 LOAD_CONST               0 (<code object verify_license at 0x610458, file "", line -1>)
  3 MAKE_FUNCTION            0
  6 STORE_NAME               0 (verify_license)
  9 LOAD_CONST               1 (None)
 12 RETURN_VALUE

The output is relatively simple. The first instruction, LOAD_CONST 0, pushes the constant at index 0 in the current code object onto the stack. After that we have MAKE_FUNCTION 0, which pops a code object off of the stack and turns it into a function object. The third instruction, STORE_NAME 0, pops the top item on the stack (the function we just created) and stores it under the name at index 0 in the code object, which in this case is “verify_license”. In practice, this sequence of instructions creates a function named verify_license. We can see that verify_license is referenced in our previous decompilation inside the “is_licensed” function:

    def is_licensed(self):
        return verify_license(self.license_key.zfill(25))

The last two instructions simply push the constant None to the stack and then return it. This is present in all Python code objects that don’t have an explicit return value, as all Python code objects must return something.
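The same pattern can be observed on any function definition. The sketch below is an illustration in Python 3, where the exact opcode sequence differs between versions, so it only checks that MAKE_FUNCTION and STORE_NAME appear. The body of this `verify_license` is a dummy, not the challenge's function.

```python
import dis

# A module whose only job is to define a function, like stage2.
source = "def verify_license(key):\n    return False\n"
module_code = compile(source, "<demo>", "exec")

# The function body lives in the module's constants as a nested code object.
nested = [c for c in module_code.co_consts if hasattr(c, "co_code")]
print(nested[0].co_name)

# Collect the opcode names of the module-level bytecode.
opnames = [ins.opname for ins in dis.get_instructions(module_code)]
print(opnames)

# Running the module code binds the function in the namespace.
namespace = {}
exec(module_code, namespace)
print(namespace["verify_license"]("A" * 100))
```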

Now that we know that all this code is doing is creating a function, we can actually run it and use the verify_license function from the Python shell by importing the stage2.pyc file. If we run the following code we can start to play around with the function from the shell:

>>> from stage2 import *
>>> verify_license('A' * 100)
False

From this we know that verify_license is a function which returns a boolean. If we disassemble the function itself we will see the following:

  0 LOAD_CONST               0 (<code object check_login at 0x2d46e0, file "shell.py", line -1>)
  3 LOAD_CONST               1 (None)
  6 DUP_TOP
  7 EXEC_STMT
  8 LOAD_NAME                0 ( )
 11 RETURN_VALUE

Let’s take a look at this instruction by instruction:

0 LOAD_CONST 0 (<code object check_login at 0x2d46e0, file "shell.py", line -1>)

First off it’s pushing constant 0, which is a code object, to the stack.

3 LOAD_CONST 1 (None)

The next thing it does is push constant 1, which is None, to the stack.

6 DUP_TOP

DUP_TOP duplicates the top item of the stack, so it pushes another None to the stack.

7 EXEC_STMT

EXEC_STMT is the equivalent of the “exec” keyword in regular Python. It takes three parameters, a code object or Python string, and two optional parameters containing global and local variables. In this case the code object is the one pushed at the start of the function, and the global and locals are not used, so those are the two Nones on the stack.

8 LOAD_NAME 0 ( )
11 RETURN_VALUE

This pushes the value stored for name 0 and returns it. Name 0 here appears to be “ “, which is not a valid Python name and is not referenced anywhere else in the function, so it’s safe to assume this gets set by the code run by the EXEC_STMT.
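Names like this cannot be typed in source code, but nothing stops bytecode from using them, because names are just strings in the code object's name table. A small sketch of the idea in Python 3.8+ follows. It uses `CodeType.replace`, which is not how the challenge was built as far as the writeup shows, and the `left`, `right` and `result` names are placeholders made up for the example.

```python
# Compile an ordinary statement, then swap its names for whitespace strings.
code = compile("result = left == right", "<hidden>", "exec")
rename = {"left": " ", "right": "  ", "result": "   "}
hidden = code.replace(co_names=tuple(rename[n] for n in code.co_names))

# The caller prepares the inputs under whitespace names...
namespace = {" ": "1", "  ": "1"}
exec(hidden, namespace)

# ...and reads the result back from a name no source file could spell.
print(namespace["   "])
```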

Since there’s not much code in here we can assume the meat of the code is inside the code object that gets exec’d. We can disassemble that code object by running the following:

>>> dis.dis(verify_license.func_code.co_consts[0])

Unlike the previous times we’ve run dis in here you’ll start seeing incredibly long output that looks something like this:

(a bird’s eye view of the disassembly)

If we actually let the disassembly run until it’s entirely finished we won’t actually get any useful information. The entire output is thousands of EXTENDED_ARG instructions with increasingly large arguments, followed by a JUMP_FORWARD to the same large value.

So what’s the issue? The Python runtime stores the arguments for Python instructions in a signed 32-bit integer called oparg. Python instructions that have arguments are 3 bytes long: 1 byte for the opcode and 2 bytes for the oparg value. The problem with this is that an instruction can only set the lower 16 bits of oparg, instead of the whole 32 bits. To get around this limitation, Python has an instruction called EXTENDED_ARG, which shifts its argument to the left by 16 bits, allowing you to set the upper 16 bits in one instruction and the lower 16 in the next. If you put multiple EXTENDED_ARG instructions in a row, the Python runtime will simply keep shifting the 32-bit integer that is oparg, and bits will fall off the end. However, if you disassemble that code with dis, oparg is stored in a Python number. Since dis uses a Python number instead of a fixed 32-bit integer, the number keeps on growing with every single EXTENDED_ARG instruction.
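The difference between the two behaviours can be modelled in a few lines. This is a simplified sketch rather than the actual interpreter or dis source: `runtime_oparg` and `dis_style_oparg` are hypothetical helper names, and the fixed-width register is emulated with a mask.

```python
def runtime_oparg(args):
    """Fold 16-bit args into a fixed 32-bit value; high bits fall off."""
    oparg = 0
    for arg in args:
        oparg = ((oparg << 16) | arg) & 0xFFFFFFFF
    return oparg


def dis_style_oparg(args):
    """Fold the same args into an unbounded Python integer."""
    oparg = 0
    for arg in args:
        oparg = (oparg << 16) | arg
    return oparg


# Four chained 16-bit arguments: only the last two survive at runtime.
args = [0x0001, 0x0002, 0x0003, 0x0004]
print(hex(runtime_oparg(args)))
print(hex(dis_style_oparg(args)))
```

With thousands of chained EXTENDED_ARG instructions the unbounded variant grows by 16 bits per instruction, which is why the dis output balloons.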

After coming across this and a few other issues with how dis handles funky bytecode, I wrote my own assembler/disassembler called pyasm (https://github.com/gabe-k/pyasm) which we can use to produce a slightly more useful disassembly.

If we run dispy.py on stage2.pyc, we will get stage2.pyasm. The code object we are looking at starts at line 11 in stage2.pyasm, with the actual instructions starting at line 100. By looking at the first bunch of instructions we can start to notice a pattern:

EXTENDED_ARG 101
EXTENDED_ARG 28169
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff
EXTENDED_ARG 356
EXTENDED_ARG 28169
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff
EXTENDED_ARG 28185
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff
EXTENDED_ARG 602
EXTENDED_ARG 28169
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff
EXTENDED_ARG 612
EXTENDED_ARG 28169
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff
EXTENDED_ARG 2660
EXTENDED_ARG 28169
EXTENDED_ARG 2305
133 * EXTENDED_ARG 0xffff

We can see there are consistently two or three EXTENDED_ARG instructions with different arg values, followed by 133 EXTENDED_ARG instructions using the arg value 0xFFFF. If we scroll down to the very end of the instructions at line 1390, we can see that the last two instructions deviate slightly from this pattern:

EXTENDED_ARG 65533
JUMP_FORWARD 52549

We can work out the actual argument for JUMP_FORWARD as 65533 << 16 | 52549, which comes out to 0xfffdcd45. Oparg is signed, so the value is actually -144059, which is a jump back to the second byte of the bytecode instead of the first byte, where execution normally starts. This is a form of instruction overlapping: since the second byte is the argument of the first instruction, the real code is hidden in the arguments of all of the EXTENDED_ARG instructions.
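The arithmetic can be double-checked with a short sketch. The `to_signed32` helper is a made-up name, and the calculation assumes the bytecode is 0x232BC bytes long with the JUMP_FORWARD as its final three bytes, so that execution resumes relative to the end of the bytecode.

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# EXTENDED_ARG supplies the upper 16 bits, JUMP_FORWARD the lower 16.
oparg = (65533 << 16) | 52549
signed = to_signed32(oparg)

# Assumption: the jump is the last instruction of a 0x232BC-byte code string,
# so the offset just past it equals the bytecode length.
next_pc = 0x232BC
target = next_pc + signed

print(hex(oparg), signed, target)
```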

If we open the stage2.pyc file in a hex editor we can delete the first byte of the bytecode, so that disassembly starts from the second byte. To do this we just delete the byte at 0x6D in stage2.pyc, then change the 32-bit int at 0x69 containing the length of the bytecode from 0x232BC to 0x232BB. Now if we disassemble the file we should see the following at the start of the code:

LOAD_NAME 0x9100 # license_key
NOP
JUMP_FORWARD 401
NOP

Now it’s starting to look more like normal bytecode. The first thing it does is load name 0x9100, which is “license_key”, and then jump forward 401 bytes. If we cut out the 401 bytes after the jump forward and disassemble again we get even more:

LOAD_NAME 0x9100 # license_key
NOP 
JUMP_FORWARD 401
LOAD_CONST 37121 # 0
NOP 
JUMP_FORWARD 401
NOP

Cool, we’re starting to get more. Now if we do this a few more times, we start to see some interesting stuff:

LOAD_NAME 0x9100 # license_key
NOP 
JUMP_FORWARD 401
LOAD_CONST 37121 # 0
NOP 
JUMP_FORWARD 401
BINARY_SUBSCR 
JUMP_FORWARD 401
STORE_NAME 37122 #   
NOP 
JUMP_FORWARD 401
LOAD_CONST 37122
NOP 
JUMP_FORWARD 401
LOAD_CONST 37130 # 542
NOP 
JUMP_FORWARD 401
BINARY_SUBSCR 
JUMP_FORWARD 401
STORE_NAME 37123 #    
NOP 
JUMP_FORWARD 401
LOAD_CONST 37131 # <code object <module> at 0x710800, file "cmp_eq.py", line -1>
NOP 
JUMP_FORWARD 401
LOAD_CONST 0x9100 # None
NOP 
JUMP_FORWARD 401
NOP 
JUMP_FORWARD 401
EXEC_STMT 
JUMP_FORWARD 401
NOP
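Rather than cutting 401 bytes by hand each time, the jumps can also be followed programmatically. The following is a minimal sketch in Python 3, not the pyasm tool: it rebuilds only the first six EXTENDED_ARG groups from the listing shown earlier (not the real stage2 bytecode), assumes the Python 2.7 opcode numbers, and uses hypothetical helper names `ext`, `group` and `trace`.

```python
import struct

# Python 2.7 opcode numbers needed for this fragment.
OPNAMES = {9: "NOP", 25: "BINARY_SUBSCR", 90: "STORE_NAME", 100: "LOAD_CONST",
           101: "LOAD_NAME", 110: "JUMP_FORWARD", 145: "EXTENDED_ARG"}
HAVE_ARGUMENT = 90  # opcodes at or above this value carry a 2-byte argument
EXTENDED_ARG, JUMP_FORWARD = 145, 110


def ext(arg):
    """Encode one EXTENDED_ARG: opcode byte plus little-endian 16-bit arg."""
    return struct.pack("<BH", EXTENDED_ARG, arg)


def group(*args):
    """A few carrier instructions followed by 133 filler EXTENDED_ARG 0xffff."""
    return b"".join(ext(a) for a in args) + ext(0xFFFF) * 133


code = (group(101, 28169, 2305) + group(356, 28169, 2305) + group(28185, 2305)
        + group(602, 28169, 2305) + group(612, 28169, 2305)
        + group(2660, 28169, 2305))


def trace(code, pc):
    """Decode from pc, following every JUMP_FORWARD instead of falling through."""
    out = []
    while pc < len(code):
        op = code[pc]
        pc += 1
        arg = None
        if op >= HAVE_ARGUMENT:
            arg = code[pc] | (code[pc + 1] << 8)
            pc += 2
        out.append((OPNAMES.get(op, "OP_%d" % op), arg))
        if op == JUMP_FORWARD:
            pc += arg  # relative to the offset after the instruction
    return out


# Start at the second byte, where the backwards jump lands.
listing = trace(code, 1)
for name, arg in listing:
    print(name if arg is None else "%s %d" % (name, arg))
```

Starting the trace at offset 0 instead would yield nothing but EXTENDED_ARG instructions, which is exactly what dis reports.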

If we strip out all the JUMP_FORWARD and NOP instructions it becomes easier to see what it’s doing:

LOAD_NAME 0x9100 # license_key
LOAD_CONST 37121 # 0
BINARY_SUBSCR
STORE_NAME 37122 #   
LOAD_CONST 37122
LOAD_CONST 37130 # 542
BINARY_SUBSCR
STORE_NAME 37123 #    
LOAD_CONST 37131 # <code object <module> at 0x710800, file "cmp_eq.py", line -1>
LOAD_CONST 0x9100 # None
EXEC_STMT

It’s taking the variable named “license_key”, and a constant with the value 0, and doing a BINARY_SUBSCR, which allows you to retrieve a value at an index, and then it is storing it in name 37122, which is a string of whitespace. This is equivalent to the following line of Python:

<whitespace> = license_key[0]

It then does the same thing, but instead of using the variable “license_key” it uses const 37122, which if we look is actually a PNG, and it gets the value at index 542 and stores it in a different name 37123, which is also whitespace. After that it loads const 37131, which is a code object, and does an exec. If we look at the code in const 37131 it’s fairly simple:

LOAD_NAME 0 #
LOAD_NAME 1 #
COMPARE_OP 2
STORE_NAME 2 #
LOAD_CONST 0 # None
RETURN_VALUE

It is loading the two whitespace-named variables that we just set up, comparing them, and storing the result of the comparison in a third whitespace-named variable. This is where the license key is actually being checked. We can make the program spit out its key by simply inserting a print statement in this code object:

LOAD_NAME 0 #   
LOAD_NAME 1 #
DUP_TOP
PRINT_ITEM
PRINT_NEWLINE    
COMPARE_OP 2
STORE_NAME 2 #  
LOAD_CONST 0 # None
RETURN_VALUE 

Now if we build the patched stage2.pyasm file with makepy and run verify_license again we can see it print out the correct key:

>>> from stage2solve import *
>>> verify_license('A' * 100)
1
_
W
4
n
n
A
_
b
3
_
T
h
3
_
v
E
R
y
_
b
3
S
T
!
False
>>>

So the license key is “1_W4nnA_b3_Th3_vERy_b3ST!”. If we go and plug that into the program we can see that it turns green and activates.

Yay! Now we can press the “Generate APT” button, which creates a file called “EVIL_MALWARE_CYBER_PATHOGEN.pyc”. If we run that it will scroll the ASCII art flag across the screen.

And there’s the flag! PAN{l1Ke_n0_oN3_ev3r_Wa5}!