Challenge Info
| Field | Details |
|---|---|
| Name | Diablo |
| Category | Reverse Engineering |
| Flag format | BITSCTF{} |
| Description | “I bought this program but I lost the license file…” |
| File | challenge — ELF 64-bit LSB PIE executable, x86-64, statically linked |
Flag
BITSCTF{l4y3r_by_l4y3r_y0u_unr4v3l_my_53cr375}
Read in leetspeak: “layer by layer you unravel my secrets” — perfectly describing the approach to solving this challenge: peeling off one protection layer at a time.
Vulnerability Summary
The challenge implements five stacked protection layers:
- UPX packing — real code is compressed inside the binary
- Anti-debug protection — checks
/proc/self/statusTracerPid, WSL detection, parent process name check (gdb, strace, ltrace, ida64, etc.) - Custom bytecode VM — program logic runs inside a hand-rolled virtual machine
- License format obfuscation — the license key bytes are hex-encoded (not raw binary)
- 10-byte repeating XOR encryption — flag characters are XOR’d with a cycling 10-byte key
The core vulnerability exploited is an unintentional hidden debug interface: each VM opcode handler calls getenv("OPCODE_NAME") at runtime and prints debug info if the environment variable is set. By setting PRINT_FLAG_CHAR=1, we force the VM to output each flag character as it is computed, without needing to reverse the full VM bytecode. Combined with a known-plaintext XOR attack using the known flag prefix BITSCTF{, the full 10-byte key can be recovered.
Step 1: Initial Reconnaissance
1.1 Identify the binary
$ file challenge
challenge: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV),
statically linked, no section header
Key observations:
- Statically linked + no section header: The binary has been stripped or packed. There is no symbol table for IDA/Ghidra to resolve function names. All analysis must be done on raw bytes.
- PIE executable: The load address changes every run (ASLR). Disable it in GDB with
set disable-randomization onto keep addresses reproducible.
1.2 Run strings to find clues
$ strings challenge | head -20
6UPX!P
/lib64
nux-x86-
strchr
fseek
atoimn
...
GLIBC_2.7
The very first meaningful string is UPX! — the magic marker of UPX (Ultimate Packer for eXecutables). UPX compresses the original binary and prepends a small decompression stub that inflates the real code into memory at runtime. This means:
- The actual program logic is compressed inside the binary and invisible to static analysis tools.
- IDA/Ghidra can only disassemble the decompression stub; the real functions are hidden.
- We must either unpack the binary statically (using
upx -d) or — if UPX is modified andupx -dfails — dump the decompressed code from memory at runtime using GDB.
1.3 Open with IDA — Confirm UPX
The file as.asm (exported from IDA Free) confirms this:
; String at LOAD:0x30CC:
aInfoThisFileIs db '$Info: This file is packed with the UPX executable packer http://upx.sf.net $',0Ah,0
; String at LOAD:0x311B:
aIdUpx396Copyri db '$Id: UPX 3.96 Copyright (C) 1996-2020 the UPX Team. All Rights Reserved. $',0Ah,0
IDA can only disassemble the UPX decompression stub at the entry point 0x2F88. The real program code is compressed data at this stage and IDA cannot interpret it.
The UPX entry stub (start at 0x2F88) calls into sub_2FFA, which implements the LZ77-based decompression loop. After this runs, the real code is mapped to new anonymous memory regions.
LOAD:0000000000002F88 start proc near
LOAD:0000000000002F88 push rax
LOAD:0000000000002F89 push rdx
LOAD:0000000000002F8A call loc_323E ; call decompressor setup
LOAD:0000000000002F8F push rbp
LOAD:0000000000002F90 push rbx
LOAD:0000000000002F91 push rcx
LOAD:0000000000002F92 push rdx
LOAD:0000000000002F93 add rsi, rdi
LOAD:0000000000002F96 push rsi
LOAD:0000000000002F97 mov rsi, rdi
LOAD:0000000000002F9A mov rdi, rdx
LOAD:0000000000002F9D xor ebx, ebx
LOAD:0000000000002F9F xor ecx, ecx
LOAD:0000000000002FA1 or rbp, 0FFFFFFFFFFFFFFFFh
LOAD:0000000000002FA5 call sub_2FFA ; LZ77 decompressor loop
Step 2: Run the Binary — Understand Behavior
2.1 Run without arguments
$ ./challenge
Welcome my DRM-protected application!
To sucessfully get in, you must present a valid license file.
Reverse engineer this binary to figure out the license format, and get the flag. :)
Good luck!
Usage: ./challenge <license file path>
The binary simulates a DRM-protected application that requires a license file. Supplying the correct file reveals the flag.
2.2 Run with a trivial license file
$ echo "test" > lic.txt
$ ./challenge lic.txt
[i] loaded license file
[!] invalid license format
The file is read successfully but fails the format check. We need to find the correct format.
2.3 Try the prefix LICENSE-
$ echo "LICENSE-" > lic.txt
$ ./challenge lic.txt
[i] loaded license file
processing... please wait...
[i] running program...
The flag lies here somewhere...
Key discovery: The prefix LICENSE- passes the format check! The binary runs an internal “program”, but the flag is not yet printed — something is still missing.
Step 3: Dump Runtime Memory with GDB
3.1 Why memory dump?
Because UPX compresses the real code, it only exists in memory after decompression. Static analysis tools see only the stub. To analyze the real logic we need to:
- Let the binary run (UPX decompresses itself).
- Pause execution after decompression, before the anti-debug check.
- Dump all readable memory regions to disk for offline analysis.
3.2 GDB dump script
The key insight is to catch syscall write — the very first output the binary writes (the “Welcome…” message) happens only after UPX has fully decompressed everything. This is a safe and reliable pause point.
dump_memory.gdb:
set disable-randomization on
set pagination off
set logging file /tmp/gdb_out.log
set logging on
file ./challenge
# Create a dummy license file
shell echo "test" > /tmp/lic_test.txt
set args /tmp/lic_test.txt
# The binary writes "Welcome..." via write() AFTER UPX decompression.
# Catch the first write syscall to pause at the decompressed state.
catch syscall write
run
# At this point UPX has fully decompressed and we're inside the real binary.
# Use embedded Python to dump all readable memory regions.
python
import gdb, os
inf = gdb.selected_inferior()
pid = inf.pid
print(f"[*] Process PID: {pid}")
maps_raw = open(f"/proc/{pid}/maps").read()
print("[*] Memory maps:")
print(maps_raw)
os.makedirs("/tmp/diablo_dump", exist_ok=True)
for line in maps_raw.splitlines():
parts = line.split()
if len(parts) < 2:
continue
perms = parts[1]
addr_range = parts[0]
start, end = [int(x, 16) for x in addr_range.split("-")]
size = end - start
if size == 0 or "r" not in perms:
continue
try:
data = bytes(inf.read_memory(start, size))
fname = f"/tmp/diablo_dump/seg_{start:016x}_{end:016x}_{perms}.bin"
with open(fname, "wb") as f:
f.write(data)
print(f"[+] Dumped {perms} {addr_range} ({size} bytes) -> {fname}")
except Exception as e:
print(f"[-] Failed {addr_range}: {e}")
end
quit
3.3 Resulting memory map
After decompression, the memory layout looks like this:
7ffff7c00000-7ffff7e05000 r-xp libc.so.6 (C standard library)
7ffff7fac000-7ffff7fad000 r--p challenge (original binary stub, only 4 KB)
7ffff7ff1000-7ffff7ff3000 r--p [anonymous] (read-only data)
7ffff7ff3000-7ffff7ff9000 r-xp [anonymous] (REAL CODE — 24 KB!)
7ffff7ff9000-7ffff7ffc000 r--p [anonymous] (RODATA — strings!)
7ffff7ffc000-7ffff7fff000 rw-p [anonymous] (DATA — heap, globals)
Regions of interest:
0x7ffff7ff3000(24 KB,r-xp) — The real code segment containing all program logic after decompression.0x7ffff7ff9000(12 KB,r--p) — The rodata segment containing all string constants referenced by the decompressed code.
Step 4: Analyze Strings in Rodata — Discover the Virtual Machine
4.1 Extract strings from the rodata dump
From the region at 0x7ffff7ff9000, we extract all meaningful strings:
# Anti-debug strings
'/proc/version' ← check if running under WSL
'Microsoft', 'microsoft' ← WSL marker: if present → detected
'/proc/self/status' ← read TracerPid field
'TracerPid:' ← if TracerPid != 0 → debugger detected
'/proc/%d/comm' ← read parent process name
'lldb', 'strace', 'ltrace', 'radare2', 'ida64', 'x64dbg', 'ollydbg', 'windbg'
'DEBUGGER DETECTED! LICENSING TERMS VIOLATED! >:('
# UI / UX strings
'Welcome my DRM-protected application!'
'Usage: ./challenge <license file path>'
'[i] loaded license file'
'[!] invalid license format'
'processing... please wait...'
'[i] running program...'
# License format
'LICENSE-' ← mandatory prefix
'%02x' ← sscanf format string → HEX DECODE!
'MYVERYREALLDRM' ← product/DRM name
# ===== VIRTUAL MACHINE STRINGS =====
'GET_LICENSE_BYTE: register out of bounds'
'GET_LICENSE_BYTE[%u] -> 0 (OOB/NULL)'
'PRINT_FLAG_CHAR'
'PRINT_CHAR: register out of bounds'
'[!] failed to create virtual machine instance.'
'Register dump'
'Register %02d - Decimal:%04d [Hex:%04X]'
'Register %02d - str: %s'
'Register %02d has unknown type!'
'Z-FLAG:true'
'Z-FLAG:false'
"The register doesn't contain a string"
"The register doesn't contain an integer"
'RAM allocation failure.'
'%04X - op_unknown(%02X)'
'Register out of bounds'
'Division by zero!'
'Reading from outside RAM'
'Writing outside RAM'
'stack overflow - stack is full'
'stack overflow - stack is empty'
4.2 Analysis — This Is a Custom Bytecode VM!
From these strings, we can reconstruct the VM architecture without reading a single line of real code:
| Component | Details |
|---|---|
| Registers | 10 registers (0–9), supporting both integer and string types |
| RAM | Dedicated VM memory with bounds checking |
| Stack | Separate stack with overflow and underflow detection |
| Z-FLAG | Zero flag used for conditional branching |
| Opcodes | Includes GET_LICENSE_BYTE, PRINT_FLAG_CHAR, arithmetic, memory operations |
| Debug mode | Each opcode handler calls getenv("OPCODE_NAME") to enable per-opcode tracing |
The critical detail — hidden debug interface: Every VM opcode handler checks if an environment variable with the opcode’s own name is set. If the variable exists (any value), the handler prints debug output to stdout. This is a developer debugging facility that was left in the binary. Crucially:
- Setting
PRINT_FLAG_CHAR=1will cause the VM to actually print each flag character as it is computed. - Setting
GET_LICENSE_BYTE=1will dump each license key byte access.
Step 5: Exploit the Debug Mode — Enable PRINT_FLAG_CHAR
5.1 The PRINT_FLAG_CHAR handler (reconstructed from Capstone disassembly)
Using Python + Capstone to disassemble the code segment dump, we find the handler at 0x7ffff7ff4785:
; PRINT_FLAG_CHAR opcode handler
; -----------------------------------------------
; Check if env var "PRINT_FLAG_CHAR" is set
lea rax, [rip + 0x4af2] ; rax = ptr to "PRINT_FLAG_CHAR" string
mov rdi, rax
call getenv ; rc = getenv("PRINT_FLAG_CHAR")
test rax, rax ; if NULL, env var not set
je skip_print ; → skip the print entirely
; env var is set → proceed to print the flag character
; Compute: flag_char = encrypted_byte XOR license_key[i % 10]
; ... (computation instructions) ...
mov edi, eax ; edi = flag character to print
call putchar ; print it to stdout
Takeaway: If we export PRINT_FLAG_CHAR=1 before running the binary, the VM will call putchar() for each flag character — effectively bypassing the need to reverse the XOR decryption logic.
5.2 Run with env vars enabled
$ echo "LICENSE-" > lic.txt
$ PRINT_FLAG_CHAR=1 GET_LICENSE_BYTE=1 ./challenge lic.txt
[i] loaded license file
processing... please wait...
[i] running program...
The flag lies here somewhere...
GET_LICENSE_BYTE[0] -> 0 (OOB/NULL)
ۼGET_LICENSE_BYTE[1] -> 0 (OOB/NULL)
¼GET_LICENSE_BYTE[2] -> 0 (OOB/NULL)
3GET_LICENSE_BYTE[3] -> 0 (OOB/NULL)
B...
Result: The VM prints encrypted bytes (interleaved with GET_LICENSE_BYTE debug lines). These bytes are the encrypted flag characters — they are not correct yet because the license key bytes are OOB (Out-Of-Bounds / NULL), meaning the XOR key bytes default to 0x00.
The pattern GET_LICENSE_BYTE[0] → [1] → … → GET_LICENSE_BYTE[9] → GET_LICENSE_BYTE[0] → … shows the VM reads exactly 10 license bytes in a cycle throughout the 46-character flag.
Step 6: Reverse the main() Function — Discover HEX DECODE
6.1 Disassemble main() with Capstone
We use Python + Capstone to disassemble the real code segment from the dumps:
from capstone import Cs, CS_ARCH_X86, CS_MODE_64
md = Cs(CS_ARCH_X86, CS_MODE_64)
with open("/tmp/diablo_dump/seg_7ffff7ff3000_7ffff7ff9000_r-xp.bin", "rb") as f:
code = f.read()
CODE_BASE = 0x7ffff7ff3000
# Disassemble from main() at 0x7ffff7ff3fcf
offset = 0x7ffff7ff3fcf - CODE_BASE
for insn in md.disasm(code[offset:offset+512], 0x7ffff7ff3fcf):
print(f"0x{insn.address:x}: {insn.mnemonic} {insn.op_str}")
6.2 Reconstructed main() logic
From the disassembly, we reconstruct the following pseudocode:
int main(int argc, char *argv[]) {
if (argc <= 1) {
// Print welcome message and usage
puts("Welcome my DRM-protected application!");
puts("Usage: ./challenge <license file path>");
return -1;
}
// 1. Read the license file into memory
char *file_data = read_file(argv[1]);
puts("[i] loaded license file");
// 2. Check mandatory prefix "LICENSE-"
if (file_data == NULL || strncmp(file_data, "LICENSE-", 8) != 0) {
puts("[!] invalid license format");
return -1;
}
// 3. Advance past the prefix
char *hex_data = file_data + 8; // pointer to hex string
size_t len = strlen(hex_data);
// 4. Strip trailing whitespace (\n, \r, space)
while (len > 0) {
char c = hex_data[len - 1];
if (c != '\n' && c != '\r' && c != ' ') break;
len--;
}
// 5. ★★★ HEX DECODE ★★★
// Each pair of hex digits encodes one raw key byte.
// The format string "%02x" is stored in rodata and used here.
size_t key_len = len / 2; // 2 hex chars → 1 byte
uint8_t *key = calloc(1, key_len + 1);
for (size_t i = 0; i < key_len; i++) {
sscanf(hex_data + i*2, "%02x", &key[i]); // ← "%02x" in rodata
}
// 6. Pass key bytes to the VM
setup_vm_license(key, key_len);
// 7. Anti-debug check
// Reads /proc/self/status → TracerPid field
// Reads /proc/version → checks for "Microsoft" (WSL)
// Reads /proc/<ppid>/comm → checks parent process name
if (check_debugger()) {
puts("DEBUGGER DETECTED! LICENSING TERMS VIOLATED! >:(");
return -1;
}
// 8. Trigger SIGILL → signal handler executes the VM
// (ud2 = undefined instruction, generates SIGILL)
__asm__("ud2"); // → SIGILL → installed handler runs VM
run_vm();
return 0;
}
6.3 The key discovery — HEX ENCODING
The line sscanf(hex_data + i*2, "%02x", &key[i]) is the most important finding.
The license file must contain hex-encoded bytes after the LICENSE- prefix — not raw binary data. For example:
LICENSE-41424344→ 4 key bytes =[0x41, 0x42, 0x43, 0x44]="ABCD"LICENSE-99f5671124d520d5f63c→ 10 key bytes (the correct key)
This explains why all our earlier tests with raw bytes after LICENSE- resulted in OOB/NULL readings — the raw chars were not valid two-digit hex sequences, so sscanf returned 0 for each.
Step 7: Confirm the XOR Cipher
7.1 Baseline — all-zero key
We craft a license where all 10 key bytes are 0x00:
$ echo "LICENSE-00000000000000000000" > lic.txt
$ PRINT_FLAG_CHAR=1 ./challenge lic.txt
This outputs 46 encrypted bytes. With key = 0x00, enc[i] XOR 0x00 = enc[i] — so the output is the raw encrypted flag bytes:
Encrypted hex: dbbc3342678166ae9a08e0c6154e46ac7fb9c245aa87386814a07fa0984ead83547d7bb8598ac30ffa87542611a8
7.2 Flip one bit — key[0] = 0x01
$ echo "LICENSE-01000000000000000000" > lic.txt
$ PRINT_FLAG_CHAR=1 ./challenge lic.txt
Only the first output byte changes: 0xDB → 0xDA.
7.3 Verification
0xDB XOR 0x01 = 0xDA ✓ (matches the observed output!)
Confirmed: The encryption is simple repeating XOR. The formula is:
flag_char[i] = encrypted_byte[i] XOR license_key[i % 10]
Where:
encrypted_byte[i]— a fixed byte hardcoded inside the VM bytecodelicense_key[i % 10]— one of the 10 key bytes, cycling with period 10flag_char[i]— thei-th character of the flag
7.4 Verify XOR at ALL positions
We use crack_key.py to verify that XOR holds across all 46 positions:
BINARY = "./challenge"
def get_output(key_bytes: bytes) -> bytes:
"""Run with hex-encoded key, return flag bytes after sentinel."""
hex_str = key_bytes.hex()
with open("/tmp/crack_lic.txt", "w") as f:
f.write(f"LICENSE-{hex_str}\n")
env = os.environ.copy()
env["PRINT_FLAG_CHAR"] = "1"
r = subprocess.run([BINARY, "/tmp/crack_lic.txt"],
capture_output=True, timeout=10, env=env)
marker = b"The flag lies here somewhere...\n"
idx = r.stdout.find(marker)
if idx >= 0:
return r.stdout[idx + len(marker):].rstrip()
return b""
# Baseline: all-zero key → raw encrypted bytes
enc = get_output(bytes(10))
# Test key: "ABCDEFGHIJ" (0x41..0x4A)
test_key = bytes([0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49, 0x4A])
out_test = get_output(test_key)
all_match = True
for i in range(len(enc)):
expected = enc[i] ^ test_key[i % 10]
actual = out_test[i]
if expected != actual:
all_match = False
print(f" MISMATCH pos {i}")
if all_match:
print(" ✓ XOR confirmed at ALL 46 positions!")
Output: ✓ XOR confirmed at ALL 46 positions!
Step 8: Crack the 10-Byte XOR Key
8.1 Encrypted bytes table
The 46 encrypted bytes (from the all-zero key run), grouped by their key index (position mod 10):
| Key idx | Pos 0 | Pos 10 | Pos 20 | Pos 30 | Pos 40 |
|---|---|---|---|---|---|
| 0 | 0xDB |
0xE0 |
0xAA |
0xAD |
0xFA |
| 1 | 0xBC |
0xC6 |
0x87 |
0x83 |
0x87 |
| 2 | 0x33 |
0x15 |
0x38 |
0x54 |
0x54 |
| 3 | 0x42 |
0x4E |
0x68 |
0x7D |
0x26 |
| 4 | 0x67 |
0x46 |
0x14 |
0x7B |
0x11 |
| 5 | 0x81 |
0xAC |
0xA0 |
0xB8 |
0xA8 |
| 6 | 0x66 |
0x7F |
0x7F |
0x59 |
— |
| 7 | 0xAE |
0xB9 |
0xA0 |
0x8A |
— |
| 8 | 0x9A |
0xC2 |
0x98 |
0xC3 |
— |
| 9 | 0x08 |
0x45 |
0x4E |
0x0F |
— |
8.2 Known-plaintext attack using BITSCTF{
We know the flag starts with BITSCTF{ (8 bytes). Since enc[i] XOR key[i] = flag[i], we can invert this: key[i] = enc[i] XOR flag[i]. This gives us key bytes 0 through 7 directly:
pos 0: enc=0xDB, flag='B'=0x42 → key[0] = 0xDB ^ 0x42 = 0x99
pos 1: enc=0xBC, flag='I'=0x49 → key[1] = 0xBC ^ 0x49 = 0xF5
pos 2: enc=0x33, flag='T'=0x54 → key[2] = 0x33 ^ 0x54 = 0x67
pos 3: enc=0x42, flag='S'=0x53 → key[3] = 0x42 ^ 0x53 = 0x11
pos 4: enc=0x67, flag='C'=0x43 → key[4] = 0x67 ^ 0x43 = 0x24
pos 5: enc=0x81, flag='T'=0x54 → key[5] = 0x81 ^ 0x54 = 0xD5
pos 6: enc=0x66, flag='F'=0x46 → key[6] = 0x66 ^ 0x46 = 0x20
pos 7: enc=0xAE, flag='{'=0x7B → key[7] = 0xAE ^ 0x7B = 0xD5
8.3 Partial decrypt to find key[8] and key[9]
With 8 of the 10 key bytes known, we can decrypt positions 0–7, 10–17, 20–27, 30–37, 40–45 (every position whose index mod 10 < 8). The remaining positions (index % 10 == 8 or 9) remain unknown and are shown as ?:
B I T S C T F { ? ? y 3 r _ b y _ l ? ? 3 r _ y 0 u _ u ? ? 4 v 3 l _ m y _ ? ? c r 3 7 5 }
0 1 2 3 4 5 6 7 8 9 ...
Partial decrypted string: BITSCTF{??y3r_by_l??3r_y0u_u??4v3l_my_??cr375}
8.4 Infer from leetspeak context
The pattern is clearly leetspeak (where certain letters are replaced with visually similar digits: a→4, e→3, s→5, o→0). Reading the partial result:
| Positions | Partial | Inference | Plain text |
|---|---|---|---|
| 8–14 | ??y3r_b |
l4y3r_b |
“layer” (a→4, e→3) |
| 18–24 | l??3r_y0 |
l4y3r_y0 |
“layer” again |
| 28–34 | u??4v3l_ |
unr4v3l_ |
“unravel” (a→4, e→3) |
| 38–44 | ??cr375} |
53cr375} |
“secrets” (e→3, s→5) |
Reconstructed full content: l4y3r_by_l4y3r_y0u_unr4v3l_my_53cr375
Therefore:
pos 8: flag char = 'l' = 0x6C → key[8] = enc[8] ^ 0x6C = 0x9A ^ 0x6C = 0xF6
pos 9: flag char = '4' = 0x34 → key[9] = enc[9] ^ 0x34 = 0x08 ^ 0x34 = 0x3C
8.5 Verify key[8] and key[9] at other positions
We cross-check by applying these key bytes to other positions with the same index:
key[8] = 0xF6:
pos 18: 0xC2 ^ 0xF6 = 0x34 = '4' ✓ (l4y3r → the 'a'→'4')
pos 28: 0x98 ^ 0xF6 = 0x6E = 'n' ✓ (unravel → 'n')
pos 38: 0xC3 ^ 0xF6 = 0x35 = '5' ✓ (secrets → 's'→'5')
key[9] = 0x3C:
pos 19: 0x45 ^ 0x3C = 0x79 = 'y' ✓ (l4y3r → 'y')
pos 29: 0x4E ^ 0x3C = 0x72 = 'r' ✓ (unr4v3l → 'r')
pos 39: 0x0F ^ 0x3C = 0x33 = '3' ✓ (53cr375 → 'e'→'3')
All positions check out perfectly! ✓
Step 9: Create the License File and Get the Flag
9.1 Full 10-byte XOR key
Key (10 bytes): 99 F5 67 11 24 D5 20 D5 F6 3C
Hex string: 99f5671124d520d5f63c
9.2 Create the license file
The license file format is LICENSE- followed by the key bytes hex-encoded:
$ echo "LICENSE-99f5671124d520d5f63c" > license.txt
9.3 Run the binary
$ PRINT_FLAG_CHAR=1 ./challenge license.txt
[i] loaded license file
processing... please wait...
[i] running program...
The flag lies here somewhere...
BITSCTF{l4y3r_by_l4y3r_y0u_unr4v3l_my_53cr375}
🎉 FLAG: BITSCTF{l4y3r_by_l4y3r_y0u_unr4v3l_my_53cr375}
Full Solver Script
final_solve.py — complete solve script from scratch:
#!/usr/bin/env python3
"""
Diablo CTF Solver
Derives the 10-byte XOR key using a known-plaintext attack against "BITSCTF{"
then verifies by running the binary.
"""
import subprocess, os
BINARY = "./challenge"
def get_output(key_bytes: bytes) -> bytes:
"""Run binary with hex-encoded key; return flag bytes after the sentinel line."""
hex_str = key_bytes.hex()
fname = "/tmp/final_lic.txt"
with open(fname, "w") as f:
f.write(f"LICENSE-{hex_str}\n")
env = os.environ.copy()
env["PRINT_FLAG_CHAR"] = "1"
r = subprocess.run([BINARY, fname], capture_output=True, timeout=10, env=env)
marker = b"The flag lies here somewhere...\n"
idx = r.stdout.find(marker)
if idx >= 0:
return r.stdout[idx + len(marker):].rstrip()
return b""
# Step 1: Get the 46 encrypted bytes (XOR with all-zero key = identity)
enc = get_output(bytes(10))
print(f"[*] Encrypted bytes ({len(enc)}): {enc.hex()}")
# Step 2: Known-plaintext attack with flag prefix "BITSCTF{"
prefix = b"BITSCTF{"
key = [enc[i] ^ prefix[i] for i in range(8)]
# key[0..7] = [0x99, 0xF5, 0x67, 0x11, 0x24, 0xD5, 0x20, 0xD5]
# Step 3: Partial decrypt to find key[8] and key[9]
partial = []
for i in range(len(enc)):
idx = i % 10
if idx < 8:
partial.append(chr(enc[i] ^ key[idx]))
else:
partial.append('?')
print(f"[*] Partial flag: {''.join(partial)}")
# → BITSCTF{??y3r_by_l??3r_y0u_u??4v3l_my_??cr375}
# Step 4: Infer key[8] and key[9] from leetspeak context
# pos 8 = 'l' (from "l4y3r"), pos 9 = '4' (from "l4y3r")
key.append(enc[8] ^ ord('l')) # key[8] = 0x9A ^ 0x6C = 0xF6
key.append(enc[9] ^ ord('4')) # key[9] = 0x08 ^ 0x34 = 0x3C
# Step 5: Full decryption
flag = ''.join(chr(enc[i] ^ key[i % 10]) for i in range(len(enc)))
print(f"\n[+] KEY: {bytes(key).hex()}")
print(f"[+] FLAG: {flag}")
# Step 6: Verify by running the binary with the real key
out = get_output(bytes(key))
print(f"\n[+] Binary verification: {out.decode(errors='replace')}")
# Expected output:
# [+] KEY: 99f5671124d520d5f63c
# [+] FLAG: BITSCTF{l4y3r_by_l4y3r_y0u_unr4v3l_my_53cr375}
Technical Summary
┌─────────────────────────────────┐
│ UPX Packed Binary │ ← Layer 1: Packing
│ (strings → UPX signature) │
└───────────────┬─────────────────┘
↓ GDB catch syscall write + memory dump
┌─────────────────────────────────┐
│ Anti-Debug Protection │ ← Layer 2: Anti-debug
│ TracerPid, /proc/version, │
│ parent process name check │
└───────────────┬─────────────────┘
↓ Bypass: run without GDB, use env vars instead
┌─────────────────────────────────┐
│ Custom Bytecode VM │ ← Layer 3: VM obfuscation
│ 10 registers, stack, RAM, │
│ Z-FLAG, custom opcodes │
└───────────────┬─────────────────┘
↓ PRINT_FLAG_CHAR env var → enable putchar output
┌─────────────────────────────────┐
│ License Format Obfuscation │ ← Layer 4: Hex encoding
│ LICENSE-<hex encoded bytes> │
│ sscanf("%02x", ...) decode │
└───────────────┬─────────────────┘
↓ Capstone disassembly → discover sscanf("%02x")
┌─────────────────────────────────┐
│ 10-byte Repeating XOR │ ← Layer 5: Encryption
│ flag[i] = enc[i] ^ key[i%10] │
└───────────────┬─────────────────┘
↓ Known-plaintext: BITSCTF{ + leetspeak reasoning
┌─────────────────────────────────┐
│ FLAG │
│ BITSCTF{l4y3r_by_l4y3r_y0u_ │
│ unr4v3l_my_53cr375} │
└─────────────────────────────────┘
Tools Used
| Tool | Purpose |
|---|---|
strings |
Identify UPX packing from signature string |
GDB + catch syscall write |
Pause after UPX decompression, dump all memory |
| Capstone (Python) | Disassemble x86-64 machine code from raw dumps |
Python subprocess |
Run binary with controlled inputs and env vars |
| Environment variables | Activate per-opcode VM debug mode (PRINT_FLAG_CHAR, GET_LICENSE_BYTE) |
| Known-plaintext XOR attack | Recover 10-byte key from known flag prefix BITSCTF{ |
| Leetspeak pattern analysis | Infer the 2 remaining unknown key bytes from context |
Key Lessons
-
stringsis never useless — even on a packed binary, the presence ofUPX!in the output reveals the packing mechanism. Always runstringsas a first step. -
Runtime analysis beats static analysis for packed/obfuscated binaries — catching a first
write()syscall with GDB gives you a perfect decompression pause point that works universally for UPX-packed binaries. -
Always check
getenv()calls — developer debug backdoors left in production code are one of the most valuable findings in reverse engineering. Scan forgetenvin every binary you analyze. -
Known-plaintext XOR attack is devastatingly effective — if you know any part of the plaintext (e.g., a flag format prefix), you can directly recover the corresponding key bytes without needing to find the key elsewhere.
-
Context-aware cryptanalysis — when you can decrypt most of a message, human language patterns (leetspeak, natural language, repeated words) let you infer the remaining unknown parts, closing the last gap in a partial key.