Solving the Disobey 2020 puzzle bootloader using Unicorn and Ghidra
This challenge was part of the Disobey 2020 puzzle. A full write-up of the puzzle has been written by whois. This write-up covers solving the reverse engineering challenge using Unicorn and Ghidra. If you want to see how this can be done using radare2, check out the write-up by Petteri.
The boot sector is a Master Boot Record (MBR). This can be figured out easily by searching for the boot signature (0x55 0xAA) at file offset 0x1FE.
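The signature check described above can be sketched in a few lines of Python. This is only an illustration: the helper name `has_boot_signature` and the synthetic sector are assumptions, not part of the original tooling.

```python
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a 512-byte boot sector
SIGNATURE_OFFSET = 0x1FE      # byte 510 of the first sector


def has_boot_signature(image: bytes) -> bool:
    """Return True if the first sector ends with the 0x55 0xAA signature."""
    return image[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == BOOT_SIGNATURE


# Synthetic sector for illustration; in practice, read the real image file.
sector = bytes(510) + BOOT_SIGNATURE
print(has_boot_signature(sector))
```

Slicing instead of indexing keeps the check safe for files shorter than 512 bytes.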
First, let's take a quick look at the bootloader using Ghidra. Because the CPU starts in real mode, we have to select “x86 real mode 16-bit” as the language:
The MBR entry point is at address 0x7C00, so the memory block has to be moved there manually. The instructions also have to be disassembled manually.
First, the bootloader checks that the “CX” register of the CPU manufacturer ID contains “nt”, i.e. that the manufacturer ID is “GenuineIntel”. If the CPU is not Intel, it halts (0x7c56).
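To see why “nt” ends up in CX, recall that CPUID leaf 0 returns the 12-byte vendor string in EBX, EDX and ECX, in that order. The following sketch only does the arithmetic; the variable names are chosen here for illustration.

```python
import struct

VENDOR = b"GenuineIntel"

# CPUID leaf 0 returns the vendor string as three little-endian dwords
# in EBX ("Genu"), EDX ("ineI") and ECX ("ntel").
ebx, edx, ecx = struct.unpack("<3I", VENDOR)

# CX is the low 16 bits of ECX, i.e. the characters "nt".
cx = ecx & 0xFFFF
print(hex(ecx), hex(cx), cx.to_bytes(2, "little"))
```

The resulting CX value, 0x746e, is the value a hook would have to place in CX to pass the check.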
Next, before jumping to 0x7cb5, it patches the corresponding instruction to “INT 0x13”, which is used to load the data located after the MBR into memory.
At 0x7e5d, the CPU switches from 16-bit real mode to 32-bit protected mode by setting the Protection Enable (PE) bit in the CR0 register. Because of the mode change, the subsequent instructions are displayed incorrectly. For now, let's continue the analysis using Unicorn.
Unicorn is a CPU emulator framework based on QEMU with bindings for various languages, including Python. First, Unicorn and Capstone are initialized. Capstone is used to disassemble instructions.
from capstone import *
from unicorn import *
from unicorn.x86_const import *
md = Cs(CS_ARCH_X86, CS_MODE_16)  # Capstone
mu = Uc(UC_ARCH_X86, UC_MODE_16)  # Unicorn
md.detail = True
To execute the bootloader correctly, the MBR is mapped to 0x7c00 and 0x10000 bytes of memory are allocated. Processing interrupts and tracking execution is done using UC_HOOK_INTR and UC_HOOK_CODE hooks. The emulator stops after executing the defined amount of instructions and the memory is dumped to disk for further analysis in Ghidra.
if __name__ == "__main__":
    with open("bootloader.bin", "rb") as f:
        file_data = f.read()
    img_base = 0x7c00
    entry_point = img_base
    try:
        # Initialize CPU emulator
        # Write image to the emulator's memory
        mem_size = 0x10000
        mu.mem_map(0, mem_size)
        mu.mem_write(img_base, file_data[0:512])  # Write MBR
        # Set hooks
        mu.hook_add(UC_HOOK_CODE, hook_code)
        mu.hook_add(UC_HOOK_INTR, hook_intr)
        # Exec 100000 instructions
        mu.emu_start(entry_point, 0, count=100000)
        print("Emulation done")
    except UcError as e:
        print("ERROR: %s" % e)
    dump_memory(0, mem_size, "./bootloader_mem_dump.bin")
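The `dump_memory` helper is not shown in the listing. A minimal sketch of what it might look like follows; unlike the call above, it takes the emulator as an explicit fourth argument (an assumption made here so the example is self-contained), and `FakeEmulator` is a hypothetical stand-in so the sketch runs without Unicorn installed.

```python
import os
import tempfile


def dump_memory(start, size, path, emulator):
    """Write `size` bytes of emulator memory, starting at `start`, to `path`.

    `emulator` is assumed to expose mem_read(address, size), as Unicorn's
    Uc object does.
    """
    data = emulator.mem_read(start, size)
    with open(path, "wb") as f:
        f.write(bytes(data))


class FakeEmulator:
    """Hypothetical stand-in for Uc, backed by a plain bytearray."""

    def __init__(self, memory):
        self.memory = bytearray(memory)

    def mem_read(self, address, size):
        return self.memory[address:address + size]


# Demo: dump four bytes starting at offset 0x10 and read them back.
demo_path = os.path.join(tempfile.mkdtemp(), "dump.bin")
dump_memory(0x10, 4, demo_path, FakeEmulator(bytes(range(64))))
with open(demo_path, "rb") as f:
    dumped = f.read()
print(dumped.hex())
```

With the real emulator, the call would pass `mu` as the last argument, or the helper could read a global `mu` to match the three-argument call in the listing.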
The following code processes the interrupts used to print characters and to read from the disk:
# The constants below are not shown in the original listing; they are the standard BIOS values.
INT_PRINT = 0x10  # BIOS video services
INT_DISK = 0x13  # BIOS disk services
READ_SECTORS = 0x02  # INT 13h, AH=02h: read sectors
mu_stdout = bytearray()  # Collected teletype output (assumed initial value)

def hook_intr(mu, intno, user_data):
    global mu_stdout
    ah = mu.reg_read(UC_X86_REG_AH)
    al = mu.reg_read(UC_X86_REG_AL)
    dh = mu.reg_read(UC_X86_REG_DH)
    dl = mu.reg_read(UC_X86_REG_DL)
    cl = mu.reg_read(UC_X86_REG_CL)
    ch = mu.reg_read(UC_X86_REG_CH)
    es = mu.reg_read(UC_X86_REG_ES)
    bx = mu.reg_read(UC_X86_REG_BX)
    if intno == INT_PRINT:
        char_addr = mu.reg_read(UC_X86_REG_SI)
        if ah == 0x0e:  # print char
            mu_stdout += mu.mem_read(char_addr - 1, 1)
    elif intno == INT_DISK:
        if ah == READ_SECTORS:
            n_sectors = al
            head = dh
            drive = dl
            sector = cl
            cylinder = ch
            # CHS to LBA, assuming 16 heads and 63 sectors per track
            lba_start = (cylinder * 16 + head) * 63 + (sector - 1)
            lba_end = (cylinder * 16 + head) * 63 + ((sector + n_sectors) - 1)
            start_offset = lba_start * 512
            end_offset = lba_end * 512
            # Write results
            data = file_data[start_offset:end_offset]
            # ES is read above but not applied, so this only works while ES is 0
            mu.mem_write(bx, data)
            mu.reg_write(UC_X86_REG_AL, 0)
            mu.reg_write(UC_X86_REG_AH, 0)
            # Clear the carry flag to signal success
            eflags = mu.reg_read(UC_X86_REG_EFLAGS)
            mu.reg_write(UC_X86_REG_EFLAGS, eflags & ~(1 << 0))
    else:
        print("Unknown interrupt no: %xh" % intno)
        print("Stop emulator")
        mu.emu_stop()
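The disk read handler converts the cylinder/head/sector (CHS) address into a file offset. The conversion can be checked in isolation with the sketch below, which uses the same geometry as the hook (16 heads, 63 sectors per track); the function names are chosen here for illustration.

```python
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512


def chs_to_lba(cylinder, head, sector):
    """Convert a CHS address to a logical block address (sectors start at 1)."""
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def read_range(cylinder, head, sector, n_sectors):
    """Return the (start, end) byte offsets in the image for a sector read."""
    start = chs_to_lba(cylinder, head, sector) * SECTOR_SIZE
    return start, start + n_sectors * SECTOR_SIZE


# The MBR is sector 1 (LBA 0), so the data right after it starts at sector 2.
print(chs_to_lba(0, 0, 2))
print(read_range(0, 0, 2, 4))
```

Note that a real BIOS also packs the two high cylinder bits into bits 6 and 7 of CL. Both the hook and this sketch ignore that, which is fine as long as the bootloader only reads from low cylinders.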
The following code is used to spoof the CPUID result and to switch the disassembler from real mode to protected mode. The address at which to switch modes is simply hardcoded. Executed instructions are printed to make the analysis easier.
def hook_code(mu, address, size, user_data):
    global md
    code = mu.mem_read(address, size)
    insns = list(md.disasm(bytes(code), address, count=1))
    if len(insns) == 0:
        print("Could not decode ins at 0x%x" % address)
        return
    insn = insns[0]
    address = insn.address
    size = insn.size
    mnemonic = insn.mnemonic
    op_str = insn.op_str
    print("0x%x:\t%s\t%s\tBytes: %s\tBP: %x, BX: %x, CX: %x, DX: %x, EAX: %x, ESI: %x, EDI: %x, CR0: %x" %
          (address, mnemonic, op_str, ''.join('{:02x}'.format(x) for x in code[:size]),
           mu.reg_read(UC_X86_REG_BP), mu.reg_read(UC_X86_REG_BX),
           mu.reg_read(UC_X86_REG_CX), mu.reg_read(UC_X86_REG_DX),
           mu.reg_read(UC_X86_REG_EAX), mu.reg_read(UC_X86_REG_ESI),
           mu.reg_read(UC_X86_REG_EDI), mu.reg_read(UC_X86_REG_CR0)))
    if (mnemonic == "cpuid"):
        mu.reg_write(UC_X86_REG_IP, address + size)  # Jump over
        mu.reg_write(UC_X86_REG_CX, 0x746e)  # Spoof CPUID
    if address == 0x7e65:
        # Because of switch from 16 bit real mode to 32 bit protect mode
        md = Cs(CS_ARCH_X86, CS_MODE_32)
        md.detail = True
The bootloader starts to loop indefinitely around 0x3028. Let's analyse the memory dump in Ghidra. By disassembling the dump we can clearly see that the code destroys the data at 0x303F by overwriting it with 0xFF.
So let's dump the memory when the CPU reaches 0x3028:
if address == 0x3028:
    mu.emu_stop()
Now the data is intact, and we can see the following strings:
The code seen below first outputs the “place flag at 0x3000” string and then begins validating the flag bytes. We can clearly see that the first and the second bytes of the flag are 0x42.
The instructions have to be disassembled manually using the addresses of the executed instructions from Unicorn, and Ghidra does not find instruction boundaries well with only that information. For this reason, Unicorn is used to find the rest of the flag.
First, change the emulator to stop when the fail branch has been reached:
mu.emu_start(entry_point, 0x3114)
and place the known flag bytes at 0x3000:
if address == 0x304d:
    mu.mem_write(0x3000, b'\x42\x42\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
Then the flag is calculated manually:
0x308a: mov ax, word ptr [0x3004] EAX: 41
0x3090: add al, 0x32 EAX: 0
0x3092: cmp al, 0x7b EAX: 32
0x3094: jne 0x3114 EAX: 32
al + 0x32 = 0x7b => 0x49
0x3096: jmp 0x309d EAX: ff7b
0x309d: sub ah, 0x20 EAX: ff7b
0x30a0: cmp ah, 0x54 EAX: df7b
0x30a3: jne 0x3114 EAX: df7b
ah - 0x20 = 0x54 => 0x74
0x30a5: mov ax, word ptr [0x3002] BX: 7e00, EAX: 547b
0x30ab: push ax BX: 7e00, EAX: 0
0x30ad: pop bx BX: 7e00, EAX: 0
0x30af: and ax, 0xf0f BX: 0, EAX: 0
0x30b3: cmp ax, 0x202 BX: 0, EAX: 0
0x30b7: jne 0x3114 BX: 0, EAX: 0
(ax & 0xf0f) == 0x202
ax & 0xf = 0x2
(ax >> 8) & 0xf = 0x2
0x30b7: jne 0x3114 BX: 202, EAX: 202
0x30b9: xor bx, 0x202 BX: 202, EAX: 202
0x30be: cmp bx, 0x4040 BX: 0, EAX: 202
0x30c3: jne 0x3114 BX: 0, EAX: 202
bx = ax, bx ^ 0x202 = 0x4040 => bx = 0x4242 (the high nibble of each byte is 4)
0x30d1: mov eax, dword ptr [0x3006] EAX: 202
0x30d6: inc al EAX: 0
0x30d8: jne 0x30de EAX: 1
0x30de: cmp al, 0x47 EAX: 1
0x30e0: jne 0x3114 EAX: 1
1 + al = 0x47 => 0x46
0x30de: cmp al, 0x47 EAX: 47
0x30e0: jne 0x3114 EAX: 47
0x30e2: shr eax, 8 EAX: 47
0x30e5: cmp al, 0x47 EAX: 0
0x30e7: je 0x3114 EAX: 0
0x30e9: cmp al, 0x69 EAX: 0
0x30eb: je 0x30f1 EAX: 0
eax = 0x00000047, al after (eax >> 8) must be 0x69
0x30eb: je 0x30f1 EAX: 69
0x30f1: shr eax, 8 EAX: 69
0x30f4: cmp ax, 0x7374 EAX: 0
0x30f8: jne 0x3114 EAX: 0
ax = 0x7374 => bytes 0x74 0x73 (little-endian)
Finally, the flag is \x42\x42\x42\x42\x49\x74\x46\x69\x74\x73
= BBBBItFits
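The manual calculation above can be double-checked by inverting each comparison from the trace in Python. This is a sketch of the arithmetic only; the variable names are chosen here and refer to the memory locations and registers seen in the trace.

```python
import struct

# Bytes 0 and 1: compared directly against 0x42.
first_two = bytes([0x42, 0x42])

# Word at 0x3002: (w ^ 0x0202) == 0x4040 and (w & 0x0f0f) == 0x0202.
w3002 = 0x4040 ^ 0x0202
assert w3002 & 0x0F0F == 0x0202  # consistent with the first check

# Word at 0x3004: al + 0x32 == 0x7b and ah - 0x20 == 0x54.
al = (0x7B - 0x32) & 0xFF
ah = (0x54 + 0x20) & 0xFF

# Dword at 0x3006: byte 0 + 1 == 0x47, byte 1 == 0x69, bytes 2..3 == 0x7374.
b6 = (0x47 - 1) & 0xFF
b7 = 0x69
w3008 = 0x7374

# Words are stored little-endian, so pack them with "<H".
flag = (first_two
        + struct.pack("<H", w3002)
        + bytes([al, ah, b6, b7])
        + struct.pack("<H", w3008))
print(flag)
```

Packing the 16-bit values as little-endian is what turns 0x7374 into the trailing “ts”.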