Solving the Disobey 2020 puzzle bootloader using Unicorn and Ghidra

This challenge was part of the Disobey 2020 puzzle. A full write-up of the puzzle has been written by whois. This write-up covers solving the reverse engineering challenge with Unicorn and Ghidra. If you want to see how it was done with radare2, check out the write-up by Petteri.

The boot sector is a Master Boot Record (MBR). This can be figured out easily by searching for the boot signature (0x55 0xAA) at file offset 0x1FE.
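
The signature check can be sketched in a few lines of Python. This is only an illustration: the helper name has_boot_signature is made up here, and the sector is a synthetic 512-byte buffer rather than the real bootloader image.

```python
BOOT_SIGNATURE = b"\x55\xaa"
SIGNATURE_OFFSET = 0x1FE  # last two bytes of a 512-byte sector


def has_boot_signature(image: bytes) -> bool:
    """Return True if the first sector ends with the 0x55 0xAA signature."""
    return image[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == BOOT_SIGNATURE


# Synthetic sector used only for demonstration (not the challenge file)
sector = bytearray(512)
sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] = BOOT_SIGNATURE
print(has_boot_signature(bytes(sector)))
```

For the real file, the same function could be called on the bytes read from bootloader.bin.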

First, let's take a quick look at the bootloader using Ghidra. Because the CPU is in real mode, we have to select “x86 real mode 16-bit” as the language.

The MBR entry point is at address 0x7C00, so the memory block has to be moved there manually. The instructions also have to be disassembled manually.

First, the bootloader checks that the “CX” register of the CPU manufacturer ID contains “nt”, i.e. that the manufacturer ID is “GenuineIntel”. If the CPU is not Intel, it halts (0x7c56).
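
To see why “nt” ends up in CX: CPUID leaf 0 returns the 12-byte vendor string split across EBX, EDX and ECX (in that order), and CX is the low 16 bits of ECX. The short sketch below only reproduces that register layout in Python; it does not execute CPUID.

```python
import struct

vendor = b"GenuineIntel"

# CPUID leaf 0 lays the vendor string out as EBX, EDX, ECX (little-endian)
ebx, edx, ecx = struct.unpack("<3I", vendor)

cx = ecx & 0xFFFF  # low 16 bits of ECX
print(hex(cx), struct.pack("<H", cx))
```

The value 0x746e is the same constant that the emulator hook later writes to CX to spoof an Intel CPU.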

After that, before jumping to 0x7cb5, it patches the corresponding instruction to “INT 0x13”, which is used to load the data located after the MBR into memory.

At 0x7e5d, the CPU switches from 16-bit real mode to 32-bit protected mode by setting the Protection Enable (PE) bit in the CR0 register. Because of the mode change, the subsequent instructions are displayed incorrectly. For now, let's continue the analysis using Unicorn.

Unicorn is a CPU emulator framework based on QEMU that has bindings for various languages, such as Python. First, Unicorn and Capstone are initialized. Capstone is used to disassemble instructions.

from capstone import *
from unicorn import *
from unicorn.x86_const import *

md = Cs(CS_ARCH_X86, CS_MODE_16)  # Capstone
mu = Uc(UC_ARCH_X86, UC_MODE_16)  # Unicorn
md.detail = True

To execute the bootloader correctly, 0x10000 bytes of memory are allocated and the MBR is written to 0x7c00. Interrupts are processed and execution is tracked using the UC_HOOK_INTR and UC_HOOK_CODE hooks. The emulator stops after executing a fixed number of instructions, and the memory is dumped to disk for further analysis in Ghidra.

if __name__ == "__main__":

    with open("bootloader.bin", "rb") as f:
        file_data = f.read()

    img_base = 0x7c00
    entry_point = img_base

    try:
        # Initialize CPU emulator
        # Write image to the emulator's memory
        mem_size = 0x10000
        mu.mem_map(0,  mem_size)

        mu.mem_write(img_base, file_data[0:512])  # Write MBR

        # Set hooks
        mu.hook_add(UC_HOOK_CODE, hook_code)
        mu.hook_add(UC_HOOK_INTR, hook_intr)

        # Exec 100000 instructions
        mu.emu_start(entry_point, 0, count=100000)
        print("Emulation done")

    except UcError as e:
        print("ERROR: %s" % e)

    dump_memory(0, mem_size, "./bootloader_mem_dump.bin")
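
The dump_memory helper called above is not shown in this write-up. A minimal sketch of what it might look like, assuming it just reads a memory range from the emulator and writes it to a file. The optional emu parameter is hypothetical and was added here so the function can also be used without the global mu instance.

```python
def dump_memory(start, size, path, emu=None):
    """Write `size` bytes of emulator memory starting at `start` to `path`."""
    if emu is None:
        emu = mu  # assumption: fall back to the global Unicorn instance
    data = emu.mem_read(start, size)  # Uc.mem_read returns a bytearray
    with open(path, "wb") as f:
        f.write(bytes(data))
    return len(data)
```

Called as dump_memory(0, mem_size, "./bootloader_mem_dump.bin"), it would write the whole mapped region to disk.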

The following code processes the interrupts used to print characters and to read from the disk:

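# BIOS interrupt numbers and the INT 0x13 function code handled below
INT_PRINT = 0x10     # video services (AH=0x0e: teletype output)
INT_DISK = 0x13      # disk services
READ_SECTORS = 0x02  # AH=0x02: read sectors into memory
mu_stdout = b""      # collects the characters printed by the bootloader

def hook_intr(mu, intno, user_data):
    global mu_stdout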

    ah = mu.reg_read(UC_X86_REG_AH)
    al = mu.reg_read(UC_X86_REG_AL)
    dh = mu.reg_read(UC_X86_REG_DH)
    dl = mu.reg_read(UC_X86_REG_DL)
    cl = mu.reg_read(UC_X86_REG_CL)
    ch = mu.reg_read(UC_X86_REG_CH)
    es = mu.reg_read(UC_X86_REG_ES)
    bx = mu.reg_read(UC_X86_REG_BX)

    if intno == INT_PRINT:
        char_addr = mu.reg_read(UC_X86_REG_SI)
        if ah == 0x0e:  # print char
            mu_stdout += mu.mem_read(char_addr - 1, 1)
    elif intno == INT_DISK:
        if ah == READ_SECTORS:
            n_sectors = al
            head = dh
            drive = dl
            sector = cl
            cylinder = ch

            lba_start = (cylinder * 16 + head) * 63 + (sector - 1)
            lba_end = (cylinder * 16 + head) * 63 + ((sector + n_sectors) - 1)

            start_offset = lba_start * 512
            end_offset = lba_end * 512

            # Write results
            data = file_data[start_offset:end_offset]
            mu.mem_write(bx, data)

            mu.reg_write(UC_X86_REG_AL, 0)
            mu.reg_write(UC_X86_REG_AH, 0)

            eflags = mu.reg_read(UC_X86_REG_EFLAGS)
            mu.reg_write(UC_X86_REG_EFLAGS, eflags & ~(1 << 0))
    else:
        print("Unknown interrupt no: %xh" % intno)
        print("Stop emulator")
        mu.emu_stop()
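
The CHS to LBA conversion in the disk handler can be tried out on its own. The sketch below uses the same geometry as the hook (16 heads, 63 sectors per track, 512-byte sectors). The function names and the example request (cylinder 0, head 0, sector 2, four sectors) are made up for illustration.

```python
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512


def chs_to_lba(cylinder, head, sector):
    """Convert a CHS address to a zero-based LBA (CHS sectors start at 1)."""
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def read_range(cylinder, head, sector, n_sectors):
    """Return the (start, end) file offsets covered by a read request."""
    start = chs_to_lba(cylinder, head, sector) * SECTOR_SIZE
    return start, start + n_sectors * SECTOR_SIZE


# Hypothetical request: read 4 sectors starting right after the MBR
print(read_range(0, 0, 2, 4))
```

Sector 1 of cylinder 0, head 0 is the MBR itself, so a read starting at sector 2 begins at file offset 512.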

The following code is used to spoof the CPUID result and to switch the disassembler from real mode to protected mode. The address at which the mode changes is simply hardcoded. Executed instructions are printed to make the analysis easier.

def hook_code(mu, address, size, user_data):
    global md

    code = mu.mem_read(address, size)
    insns = list(md.disasm(
        bytes(code), address, count=1))

    if len(insns) == 0:
        print("Could not decode ins at 0x%x" % address)
        return

    insn = insns[0]
    address = insn.address
    size = insn.size
    mnemonic = insn.mnemonic
    op_str = insn.op_str

    print("0x%x:\t%s\t%s\tBytes: %s\tBP: %x, BX: %x, CX: %x, DX: %x, EAX: %x, ESI: %x, EDI: %x, CR0: %x" %
          (address, mnemonic, op_str, ''.join('{:02x}'.format(x) for x in code[:size]), mu.reg_read(UC_X86_REG_BP),
           mu.reg_read(UC_X86_REG_BX), mu.reg_read(
               UC_X86_REG_CX), mu.reg_read(UC_X86_REG_DX),
           mu.reg_read(UC_X86_REG_EAX), mu.reg_read(
               UC_X86_REG_ESI), mu.reg_read(UC_X86_REG_EDI),
           mu.reg_read(UC_X86_REG_CR0)))

    if (mnemonic == "cpuid"):
        mu.reg_write(UC_X86_REG_IP, address + size)  # Jump over
        mu.reg_write(UC_X86_REG_CX, 0x746e)  # Spoof CPUID

    if address == 0x7e65:
        # Switch Capstone from 16-bit real mode to 32-bit protected mode
        md = Cs(CS_ARCH_X86, CS_MODE_32)
        md.detail = True

The bootloader starts to loop indefinitely around 0x3028. Let's analyse the memory dump in Ghidra. By disassembling the dump, we can clearly see that the code destroys the data at 0x303F by overwriting it with 0xFF.

So let's dump the memory when the CPU reaches 0x3028:

if address == 0x3028:
    mu.emu_stop()

Now the data is intact, and we can see the following strings:

The code seen below first outputs the “place flag at 0x3000” string and then begins validating the flag bytes. We can clearly see that the first and the second bytes of the flag are 0x42.

Instructions have to be disassembled manually using the addresses of the executed instructions from Unicorn, and Ghidra does not find instruction boundaries well with only that information. Therefore, Unicorn is used to find the rest of the flag.

First, change the emulator to stop when the fail branch has been reached:

mu.emu_start(entry_point, 0x3114)

and place the known flag bytes at 0x3000:

if address == 0x304d:
    mu.mem_write(0x3000, b'\x42\x42\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')

Then the flag is calculated manually:

0x308a: mov     ax, word ptr [0x3004]  EAX: 41
0x3090: add     al, 0x32               EAX: 0
0x3092: cmp     al, 0x7b               EAX: 32
0x3094: jne     0x3114                 EAX: 32

al + 0x32 = 0x7b => 0x49

0x3096: jmp     0x309d      EAX: ff7b
0x309d: sub     ah, 0x20    EAX: ff7b
0x30a0: cmp     ah, 0x54    EAX: df7b
0x30a3: jne     0x3114      EAX: df7b

ah - 0x20 = 0x54 => 0x74

0x30a5: mov     ax, word ptr [0x3002]  BX: 7e00, EAX: 547b
0x30ab: push    ax                     BX: 7e00, EAX: 0
0x30ad: pop     bx                     BX: 7e00, EAX: 0
0x30af: and     ax, 0xf0f              BX: 0, EAX: 0
0x30b3: cmp     ax, 0x202              BX: 0, EAX: 0
0x30b7: jne     0x3114                 BX: 0, EAX: 0

(ax & 0xf0f) == 0x202

ax & 0xf = 0x2

(ax >> 8) & 0xf = 0x2

0x30b7: jne     0x3114      BX: 202, EAX: 202
0x30b9: xor     bx, 0x202   BX: 202, EAX: 202
0x30be: cmp     bx, 0x4040  BX: 0, EAX: 202
0x30c3: jne     0x3114      BX: 0, EAX: 202

bx = ax

bx & 0xf0 = 0x40

(bx >> 8) & 0xf0 = 0x40

bx ^ 0x202 = 0x4040 => 0x4242

0x30d1: mov     eax, dword ptr [0x3006] EAX: 202
0x30d6: inc     al                      EAX: 0
0x30d8: jne     0x30de                  EAX: 1
0x30de: cmp     al, 0x47                EAX: 1
0x30e0: jne     0x3114                  EAX: 1

1 + al = 0x47 => 0x46

0x30de: cmp     al, 0x47     EAX: 47
0x30e0: jne     0x3114       EAX: 47
0x30e2: shr     eax, 8       EAX: 47
0x30e5: cmp     al, 0x47     EAX: 0
0x30e7: je      0x3114       EAX: 0
0x30e9: cmp     al, 0x69     EAX: 0
0x30eb: je      0x30f1       EAX: 0

After the shift, al = (eax >> 8) & 0xff = 0x69

0x30eb: je      0x30f1      EAX: 69
0x30f1: shr     eax, 8      EAX: 69
0x30f4: cmp     ax, 0x7374  EAX: 0
0x30f8: jne     0x3114      EAX: 0

ax = 0x7374 => 0x74, 0x73 in memory (little-endian)

Finally, the flag is \x42\x42\x42\x42\x49\x74\x46\x69\x74\x73 = BBBBItFits
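
The arithmetic above can be double-checked with a short Python script that inverts each comparison from the trace and assembles the bytes in memory order. This is only a verification sketch of the manual calculation; it does not emulate the bootloader.

```python
flag = bytearray(10)

# Bytes 0-1 (0x3000, 0x3001): compared directly against 0x42
flag[0] = flag[1] = 0x42

# Word at 0x3002: bx ^ 0x202 == 0x4040, and (ax & 0xf0f) == 0x202
word = 0x4040 ^ 0x0202
assert word & 0x0F0F == 0x0202
flag[2:4] = word.to_bytes(2, "little")

# Byte at 0x3004: al + 0x32 == 0x7b
flag[4] = (0x7B - 0x32) & 0xFF
# Byte at 0x3005: ah - 0x20 == 0x54
flag[5] = (0x54 + 0x20) & 0xFF

# Dword at 0x3006: al + 1 == 0x47, the next byte is 0x69, the rest is 0x7374
flag[6] = (0x47 - 1) & 0xFF
flag[7] = 0x69
flag[8:10] = (0x7374).to_bytes(2, "little")

print(flag.decode("ascii"))
```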