In this challenge we are given an obfuscation plugin for llvm and an obfuscated binary. The plugin accepts a 16-byte seed as a parameter and uses it for internal PRNG. Our goal is to analyze the binary and recover the seed from it.

The binary is basically a big state machine. Here is start of the main function:

.text:4004C0 main:           ; DATA XREF: _start+1D
.text:4004C0     push rbp
.text:4004C1     mov  rbp, rsp
.text:4004C4     sub  rsp, 4038Ch
.text:4004CB     mov  dword ptr [rbp-4], 0
.text:4004D2     mov  dword ptr [rbp-8], 46544330h
.text:4004D9     mov  dword ptr [rbp-0Ch], 4E52BCE7h
.text:4004E0
.text:4004E0 main_switch:       ; CODE XREF: .text:loc_7635AC
.text:4004E0     mov  eax, [rbp-0Ch]
.text:4004E3     mov  ecx, eax
.text:4004E5     sub  ecx, 8000DEEAh
.text:4004EB     mov  [rbp-10h], eax
.text:4004EE     mov  [rbp-14h], ecx
.text:4004F1     jz   loc_728049
.text:4004F7     jmp  $+5 .text:4004FC .text:4004FC loc_4004FC: ; CODE XREF: .text:00000000004004F7 .text:4004FC mov eax, [rbp-10h] .text:4004FF sub eax, 8001EACAh .text:400504 mov [rbp-18h], eax .text:400507 jz loc_73E38A .text:40050D jmp$+5
.text:400512
.text:400512 loc_400512:        ; CODE XREF: .text:000000000040050D
.text:400512     mov  eax, [rbp-10h]
.text:400515     sub  eax, 8003CC64h
.text:40051A     mov  [rbp-1Ch], eax
.text:40051D     jz   loc_5C0C95
.text:400523     jmp  $+5 In pseudocode, it could be something like this: state = 0x4E52BCE7 main_switch: switch(state) { case 0x8000DEEA: goto loc_728049; case 0x8001EACA: goto loc_73E38A; case 0x8003CC64: goto loc_5C0C95; ... } loc_xxxxxx: // do something state = 0x8003CC64 goto main_switch Note that the state ids are sorted as signed 32-bit integers. Also, during the case probing, some intermediate data is written to the stack (i.e. [rbp-14h] and lower), but there are no further reads from that area. By running the binary and checking for system calls with strace we can see that it does not do anything useful. Let’s look at the llvm plugin to see how the seed is used. #### lib0opsPass.so The plugin is written in C++ and exports a bunch of functions. The main function of our interest is Oops::OopsFlattening::flatten(). Though it’s quite big, we can quickly locate interesting parts by looking at cross-references from crypto-related functions. First it calls prng_seed to seed its internal PRNG state: Oops::CryptoUtils::prng_seed(cryptoutils, &seed_string); Then it uses the PRNG to generate 16 bytes: memset(&key, 0, 0x20uLL); LODWORD(cryptoutils_) = llvm::ManagedStatic<Oops::CryptoUtils>::operator->(&Oops::Oopscryptoutils, 0LL); Oops::CryptoUtils::get_bytes(cryptoutils_, &key, 16); if ( crc32('FTC0', &key, 3) != 0xF9E319A6 ) ... Note that the hardcoded crc32 check allows us to easily find the first 3 bytes of the generated key: 179, 197, 140. The key is then used to “scramble” values from a hardcoded array called plains: LODWORD(plains0) = plains[0]; LODWORD(v35) = llvm::ManagedStatic<Oops::CryptoUtils>::operator->(&Oops::Oopscryptoutils, v33); v36 = plains0; v37 = Oops::CryptoUtils::scramble32(v35, plains0, &key); ... v60 = counter++; plainsCUR = plains[v60]; LODWORD(v62) = llvm::ManagedStatic<Oops::CryptoUtils>::operator->(&Oops::Oopscryptoutils, v59); v63 = Oops::CryptoUtils::scramble32(v62, plainsCUR, &key); It’s not clear for now how these “scrambled” values are used later. IDA tells us that there are around values in the array: .data:2345E0 ; _DWORD plains[65806] .data:2345E0 plains dd 0F6172961h, 0CB973739h, 904F3728h, 0DB7194B9h, 81E0B166h ... Probably it is possible to look at the LLVM-related code and see how exactly these values are used. But in the obfuscated binary there are not so many random-looking words. The only ones which come to mind are the state ids! Let’s log all the state ids passed to the main_switch. Here is a simple gdb script for it: $ cat >cmd
set confirm off
set pagination off
break *0x04004e3
commands
p/x $eax cont end run$ gdb -x cmd -n ./0llvm </dev/null >log
$head -30 log GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1 ... Reading symbols from ./0llvm...(no debugging symbols found)...done. Breakpoint 1 at 0x4004e3 Breakpoint 1, 0x00000000004004e3 in main ()$1 = 0x4e52bce7

Breakpoint 1, 0x00000000004004e3 in main ()
$2 = 0x3ac545da Breakpoint 1, 0x00000000004004e3 in main ()$3 = 0xff97c58e

Breakpoint 1, 0x00000000004004e3 in main ()
$4 = 0xe83342dd$ tail log
Breakpoint 1, 0x00000000004004e3 in main ()
$65789 = 0xf1dbf041 Breakpoint 1, 0x00000000004004e3 in main ()$65790 = 0xdb9a21b8

Breakpoint 1, 0x00000000004004e3 in main ()
$65791 = 0xb02b5689 [Inferior 1 (process 10622) exited with code 027] (gdb) quit 65791 values! Quite close to 65801 found in IDA. The hypothesis seems to be true. But what does it give us now? #### Recovering key What we have found is that we have pairs of plaintext/ciphertext under the “scramble32” function. Let’s look at it closer: __int64 __fastcall Oops::CryptoUtils::scramble32( Oops::CryptoUtils *this, unsigned int x, const char *key) { int v3; // ST20_4@1 int v4; // ST24_4@1 int v5; // ST20_4@1 v3 = AES_PRECOMP_TE3[(x ^ key[3])] ^ AES_PRECOMP_TE2[(BYTE1(x) ^ key[2])] ^ AES_PRECOMP_TE1[((x >> 16) ^ key[1])] ^ AES_PRECOMP_TE0[(BYTE3(x) ^ *key)]; v4 = AES_PRECOMP_TE3[(v3 ^ key[7])] ^ AES_PRECOMP_TE2[(BYTE1(v3) ^ key[6])] ^ AES_PRECOMP_TE1[((v3 >> 16) ^ key[5])] ^ AES_PRECOMP_TE0[(BYTE3(v3) ^ key[4])]; v5 = AES_PRECOMP_TE3[(v4 ^ key[11])] ^ AES_PRECOMP_TE2[(BYTE1(v4) ^ key[10])] ^ AES_PRECOMP_TE1[((v4 >> 16) ^ key[9])] ^ AES_PRECOMP_TE0[(BYTE3(v4) ^ key[8])]; return AES_PRECOMP_TE3[(v5 ^ key[15])] ^ AES_PRECOMP_TE2[(BYTE1(v5) ^ key[14])] ^ AES_PRECOMP_TE1[((v5 >> 16) ^ key[13])] ^ AES_PRECOMP_TE0[(BYTE3(v5) ^ key[12])] ^ ((key[2] << 8) | (key[1] << 16) | (*key << 24) | key[3]); } Interesting, it is related to the AES block cipher. The AES_PRECOMP_TE tables map 8-bit values to 32-bit values. Possibly these tables implement MixColumns, or even together with SBoxes and xors. Let’s compose them with inverse of MixColumns (aes.py): from aes import AES A = AES() for i in xrange(4): for x, t in enumerate(AES_PRECOMP_TE[i]): t = [BYTE3(t), BYTE2(t), BYTE1(t), BYTE0(t)] t2 = A.mixColumn(list(t), isInv=True) print t2 print $ python precomp.py
[99, 0, 0, 0]
[124, 0, 0, 0]
[119, 0, 0, 0]
[123, 0, 0, 0]
[242, 0, 0, 0]
[107, 0, 0, 0]
...

This is the AES SBox applied to one of the bytes! It means that

AES_PRECOMP_TE[i](x) = MixColumns(SBox(x) << 8*i).

Therefore scramble32 is a 4-round iteration of XorKey, SubBytes, MixColumn followed by another XorKey. How do we recover the key?

Recall that we know key[0], key[1], key[2] from a CRC32 check. We can guess one-byte key[3] and bypass the first round easily. Luckily, the last whitening key is the same as the first one: with the same guess we can decrypt the last round aswell! By moving the keys through the linear layers and decrypring the linear layers, we arrive at two rounds:

XK | SB | MC | XK | SB | XK.

Let’s use impossible polytopes. Assume that we have three plaintexts of the form (it is probable that we have such among the

• X1 = (x1, a, b, c)
• X2 = (x2, a, b, c)
• X3 = (x3, a, b, c)

We will study how a difference tuple (X1 X2, X1 X3) propagates through the cipher. Since it is a difference, it propagates through the key addition untouched. After SubBytes a set of possible difference tuples expands up to elements. Since MixColumn is linear, any difference (x, y) propagates to (MixColumn(x), MixColumn(y)). Therefore, before the last SubBytes layer, we have only possible differences, which can easily be precomputed. Thus, we can recover the last key byte-by-byte: guess byte of the key, decrypt through S-Box, check difference tuple (note that the middle XK does not affect it). We are truncating the differences of 32-bit values to differences of 8-bit values, but this is fine since we look at pairs of differences: possible pairs out of give us

filtration for each key byte. By tracking the first guessed key byte, we can ensure that only the correct key survives.

The full attack implementation is available here.

#### Recovering Seed

We have recovered the key, but the flag is the PRNG seed! How to recover it? The code for stream generation is as follows:

// in Oops::CryptoUtils::encrypt(Oops::CryptoUtils *this, unsigned __int8 *dst, unsigned __int8 *nonce, unsigned __int8 *seed)
memset(dst, 0, 0x10uLL);
v5 = *(nonce_ + 1);
*buf = *nonce_;
*buf2 = v5;
for ( i = 0; i <= 15; ++i ) {
seedbyte = seed_[i];
for ( j = 0; j <= 7; ++j ) {
if ( seedbyte & 1 ) {
for ( k = 0; k <= 15; ++k )
dst[k] ^= buf[k];
}
seedbyte = seedbyte >> 1;
for ( l = 0; l <= 15; ++l )
buf[l] = TABLE[buf[l]];
}
}

Here TABLE is an 8-bit nonlinear S-Box. Nonce is constant and hardcoded (note that it is increased by 1 before calling encrypt, as a big-endian number, see Oops::CryptoUtils::inc_ctr):

// in  Oops::CryptoUtils::prng_seed(struct CryptoUtils *cryptoutils, __int64 a2)
noncebuf = cryptoutils->nonce;
*noncebuf = 0xD7C59B4DFFD1E010LL;
*(noncebuf + 1) = 0x20C7C17B250E019ALL;

Though TABLE is nonlinear, the buf array is updated independently and therefore we can see its different versions as constants. Mixing of buf and seed is done linearly, so we can recover the seed from the PRNG output (which is the AES key) by simple linear algebra:

from sage.all import *

from struct import pack, unpack

def tobin(x, n):
return tuple(map(int, bin(x).lstrip("0b").rjust(n, "0")))

def frombin(v):
return int("".join(map(str, v)), 2 )

def tobinvec(v):
return sum( [tobin(c, 8) for c in v], () )

PRNG_OUT = [179, 197, 140, 9, 31, 61, 9, 48, 214, 74, 172, 159, 200, 11, 185, 236]

# note 0x20... -> 0x21
nonce = pack("<QQ", 0xD7C59B4DFFD1E010, 0x21C7C17B250E019A)

cols = [map(ord, nonce)]
for i in xrange(127):
v = [TABLE[c] for c in cols[-1]]
cols.append(v)
cols = [vector(GF(2), tobinvec(v)) for v in cols]
target = vector(GF(2), tobinvec(PRNG_OUT))

m = matrix(GF(2), 128, len(cols))
for x, c in enumerate(cols):
m.set_column(x, c)
try:
sol = m.solve_right(target)
except ValueError:
print "NO SOLUTION"
quit()

print "SOL", sol
res = ""
for i in xrange(0, 128, 8):
res += chr(frombin(sol[i:i+8][::-1]))
print "flag{%s}" % res

Flag:

PS：本站不是实验吧的官方站点，纯粹是个人博客，收取Flag费用仅是维持服务器费用，做站不易，且行窃珍惜！