In this crypto challenge, we are provided with a chosen plaintext attack (notably, one that has already been carried out; we cannot actually run a chosen plaintext attack) on a custom designed block cipher whose implementation is provided. The goal is to recover the key of the block cipher.

The cipher in question is a three-round substitution-permutation network. The cipher maps 36-bit inputs to a 36-bit output. Each substitution layer is implemented via 6 S-boxes which act on 6 bits each. Overall, it looks something like this picture from Wikipedia (but without the initial , and an additional P step at the end):

An interesting feature of the provided log of the chosen-plaintext attack is that there are several plaintexts which map to very similar ciphertexts. Here’s a short excerpt of the log for reference:

plaintext -> ciphertext ============================ 0x02637952D -> 0x674F229F7 0x02CC0F857 -> 0xE9BFB5DFD 0x02CC0F859 -> 0xE2BFF5DFD 0x03E37952D -> 0x777A30975 0x0C167731F -> 0xAE15A3564 0x0C167733F -> 0xCCA0C3D18 0x12E295065 -> 0xF5E17D8F3 0x12FAD81B5 -> 0x62254BE45 0x12FCD81B5 -> 0x71C92BB21 0x136295065 -> 0x658B0947C 0x142AF9FC3 -> 0x9AFAF4CEF 0x142CF9FC3 -> 0xABBEF534A ...

This was one of the first differential cryptanalysis challenges we’ve solved, so it was pretty interesting (generally I’m much better at more algebraic-based crypto). The key to differential cryptanalysis challenges seems to be to exploit linearity in the components of the cipher, especially linearity of sums/differences of ciphertexts (hence ‘differential’).

For example, let be the permutation on 36-bit inputs induced by the S-boxes, let be the permutation on 36-bit inputs induced by the P-box, and let be the permutation on 36-bit inputs induced by XORing with the th key. Then and act linearly on sums of inputs, in the sense that , and . If the function was also linear in this sense (say, satisfying ), then we could also easily compute the sum of inputs from the sum of encryptions for any inputs and . This would allow us to break the cipher and decrypt any value given any plaintext/ciphertext pair (finding the key is a bit harder).

Of course, unlike and , does not act linearly on sums. One could hope that acts almost linearly on sums and leverage that, but it is not too hard to check that is far from any linear function. Instead, for this challenge, we’ll use the fact that , , and all preserve (to some extent) the *sparsity* of the differentials.

To see what we mean by this, let equal the Hamming weight (i.e. the number of 1s) of an input . Then, in addition to being linear, it is also true that and . Unfortunately, it is not quite true that , but we can say some things. For instance if (and thus ), it is also true that and . More generally, since the function is comprised of 6 smaller S-boxes which each act on 6 bits of the input, can expand the sparsity by at most a factor of ; that is, (there are at most blocks of 6 bits where and differ; on all the other blocks, must equal ). And in fact, this is a worst-case bound; on average shouldn’t expand the sparsity by more than a factor of , and in many cases (especially for opportunely chosen inputs) may not expand the sparsity at all.

Since there are only three rounds, and there are many examples in the log with small values of (e.g. the pair 0x02CC0F857 -> 0xE9BFB5DFD, 0x02CC0F859 -> 0xE2BFF5DFD), this gives us hope for an attack. Let and be two chosen plaintexts with a very sparse differential (ideally, ). We’ll show how to recover some information about the key from these plaintexts and their encryptions.

Let and . We can visualize the overall cipher acting on and via the following diagram.

x->S->P->K1->S->P->K2->S->P->k3->x' y->S->P->K1->S->P->K2->S->P->k3->y'

Let , and let . Since we do not know , we cannot compute or , but we can compute : we directly compute and , and then because the xors cancel out. Likewise, let (the output of the third step for ), and define similarly for . Again, we cannot compute or , but we can compute by working backward from : because again the xors cancel out. Our diagram now looks something like this:

a->S->P->K2->S->a' b->S->P->K2->S->b' known: a+b, a'+b'

Unfortunately, this is as far as we can compute the differentials; knowing does not let us compute . However, if is very sparse, then this restricts the number of possibilities of . In particular, if is zero in all but 6-bit blocks, there are at most choices for (this looks like a lot, but is quite manageable if is less than 3).

This lets us implement a sort of meet-in-the-middle attack. Let , and let . Then, if and , we have that by definition, and working backwards we also have , because the xor cancels out and is a linear function. In many cases, this restricts the possibilities for down to one or two cases. Each of these cases restricts what several of the blocks in and can be (and hence what the blocks of can be). Doing this for enough pairs of similar plaintexts/ciphertexts lets us recover entirely.

Once we have , we can do a similar (but even easier) attack to recover , based on the input and output differentials to the third (details omitted). And finally, once we have and , we can easily compute from any plaintext/ciphertext pair.

We realize that we left out some details, so here’s a worked example for the second and third line of the log (bits are divided into groups of 6 for clarity, and the notation is the same as above):

We are given x=0x02CC0F857 -> x'=0xE9BFB5DFD y=0x02CC0F859 -> y'=0xE2BFF5DFD We can directly compute P(S(x))=111001 100111 101111 010011 000001 101101 P(S(y))=111001 100111 101111 010111 000001 101101 Therefore a+b=P(S(x))+P(S(y))=000000 000000 000000 000100 000000 000000 We also have P^-1(x')=110111 001011 111101 110111 011110 011111 P^-1(y')=110111 001011 111101 110111 110011 011111 So a'+b'=P^-1(x')+P^-1(y')=000000 000000 000000 000000 101101 000000 Now let's find the sets F(a+b) and P^-1(B(a'+b')) as described above. First, F(a+b) corresponds to possible values of a''+b''=S(a)+S(b). There are 64 (technically 63) possibilities, namely all strings of the form: 000000 000000 000000 ****** 000000 000000 (* can be either 0 or 1, but not all *s can be 0) To compute P^-1(B(a'+b')), we can first determine the 64 possible values in B(a'+b'), or S^-1(a')+S^-1(b'): 000000 000000 000000 000000 ****** 000000 Running this back through P^-1 gives: 000*00 00000* 00000* 00*000 0*0000 0000*0 We got super lucky in this case, because there is only one number in both sets, namely: 000000 000000 000000 001000 000000 000000 Therefore, we have the following two equations: a +b =000000 000000 000000 000100 000000 000000 S(a)+S(b)=000000 000000 000000 001000 000000 000000 Of course, this doesn't give us any information about the 000000 blocks, but this does restrict possibilities for the 4th block. By bruteforcing all 64^2 pairs, it turns out that there's only two possibilities for this block: a = ****** ****** ****** 101011 ****** ****** b = ****** ****** ****** 101111 ****** ****** (the second possibility is switching a and b) Finally, we know that K_1(P(S(x)))=a, so K_1=P(S(x))+a. Therefore, the fourth block of K_1 has two possibilities K_1=****** ****** ****** 111000 ****** ****** K_1=****** ****** ****** 111100 ****** ****** We've reduced the total number of K_1 possibilities from 2^36 to 2^30*2, which is pretty good for a single plaintext/ciphertext pair!

In the above example, we got really lucky because there was only one possible value in the intersection , but even if there are multiple possible values, this can still reduce the number of key possibilities, and there are enough pairs in the log to uniquely determine the value of the key using this strategy. However, if the cipher had 4 or more rounds, I’m not sure this strategy would work because the possibilities probably blow up too quickly.

For completeness, here’s the Python script we used to implement the above attack, but it’s uncommented and very messy, so I wouldn’t bother reading it. Though if you do read it and have questions on how it works, feel free to ask.

from itertools import product from hashlib import md5 rounds = 3 nsbox = 6 # number of s-boxes bsbox = 6 # input bits per s-box ssize = 1 << bsbox bits = nsbox * bsbox insize = 1 << bits s = [59, 21, 25, 32, 30, 3, 63, 38, 5, 29, 40, 53, 17, 56, 58, 37, 45, 43, 52, 61, 7, 55, 57, 12, 26, 13, 49, 16, 36, 8, 31, 41, 20, 51, 33, 15, 1, 0, 23, 27, 35, 18, 47, 62, 14, 60, 10, 54, 46, 50, 9, 48, 24, 28, 44, 2, 6, 34, 19, 42, 11, 39, 22, 4] p = [14, 7, 25, 2, 22, 32, 29, 0, 28, 31, 6, 18, 12, 27, 33, 9, 17, 21, 8, 26, 23, 35, 4, 5, 11, 20, 30, 24, 15, 1, 19, 16, 10, 13, 34, 3] s_inv = [0] * len(s) for i in range(len(s)): s_inv[s[i]] = i def Sp(r): ans = dict() for x in xrange(64): y = x^r v = s[x]^s[y] if v in ans: ans[v].append((x,y)) else: ans[v] = [(x,y)] return ans def Sinvp(r): ans = dict() for x in xrange(64): y = x^r v = s_inv[x]^s_inv[y] if v in ans: ans[v].append((s_inv[x],s_inv[y])) else: ans[v] = [(s_inv[x], s_inv[y])] return ans def split(x): y = [0] * nsbox for i in range(nsbox): y[i] = (x >> (i * bsbox)) & 0x3f return y def merge(x): y = 0 for i in range(nsbox): y |= x[i] << (i * bsbox) return y def pbox(x): y = 0 for i in range(len(p)): y |= ((x >> i) & 1) << p[i] return y def inv_p(x): y = 0 for i in range(len(p)): y |= ((x >> p[i]) & 1) << i return y Sps = [Sp(r) for r in xrange(64)] Sinvps = [Sinvp(r) for r in xrange(64)] def intersection(sdiff, ediff): sdpts = split(sdiff) sdsps = [Sps[pt].keys() for pt in sdpts] edpts = split(ediff) edsps = [Sinvps[pt].keys() for pt in edpts] sprod = reduce(lambda x, y: x*y, [len(x) for x in sdsps], 1) eprod = reduce(lambda x, y: x*y, [len(x) for x in edsps], 1) if sprod > 1000000 or eprod > 1000000: print 'abort!' return None middiffs = set([pbox(merge(x)) for x in product(*sdsps)]) middiffs2 = set([merge(x) for x in product(*edsps)]) return middiffs&middiffs2 def SBOX(x): return merge([s[y] for y in split(x)]) def start_diff(px, py): vx = pbox(merge([s[pt] for pt in split(px)])) vy = pbox(merge([s[pt] for pt in split(py)])) return vx^vy def end_diff(cx, cy): return inv_p(cx^cy) def filter_k1(px, py, middiffs, k1): vx = pbox(merge([s[pt] for pt in split(px)])) vy = pbox(merge([s[pt] for pt in split(py)])) vxpts = split(vx) vypts = split(vy) possk = [set() for _ in xrange(6)] for middiff in middiffs: minvPdiff = inv_p(middiff) mdiffpts = split(minvPdiff) for i in xrange(6): possi = Sinvps[mdiffpts[i]][vxpts[i]^vypts[i]] # list of possible (vx^k, vy^k) for z, _ in possi: possk[i].add(vxpts[i]^z) for i in xrange(6): k1[i] = [z for z in k1[i] if z in possk[i]] return k1 def obtain_k1(): k1 = [range(64) for _ in xrange(6)] for i in xrange(0, len(data), 2): px = data[i][0] cx = data[i][1] py = data[i+1][0] cy = data[i+1][1] sdiff = start_diff(px, py) ediff = end_diff(cx, cy) middiff = intersection(sdiff, ediff) if middiff: print middiff k1 = filter_k1(px, py, middiff, k1) print 'UPDATE', k1 return k1 def filter_k2(px, py, cx, cy, k1, k2): sx = pbox(SBOX(k1^pbox(SBOX(px)))) sy = pbox(SBOX(k1^pbox(SBOX(py)))) sdiff = sx^sy cdiff = cx^cy ediff = inv_p(cdiff) sxpts = split(sx) sdiffpts = split(sdiff) cdiffpts = split(cdiff) ediffpts = split(ediff) possk = [set() for _ in xrange(6)] for i in xrange(6): possi = Sinvps[ediffpts[i]][sdiffpts[i]] for z, _ in possi: possk[i].add(sxpts[i]^z) for i in xrange(6): k2[i] = [z for z in k2[i] if z in possk[i]] def obtain_k2(k1): k2 = [range(64) for _ in xrange(6)] for i in xrange(0, len(data), 2): px = data[i][0] cx = data[i][1] py = data[i+1][0] cy = data[i+1][1] filter_k2(px,py, cx, cy, k1, k2) print 'UPDATE K2:', k2 return k2 def obtain_k3(k1, k2): px = data[0][0] cx = data[0][1] return pbox(SBOX(k2^pbox(SBOX(k1^pbox(SBOX(px))))))^cx lines = [line for line in open('bloblog.txt','r')] data = [map(lambda x: int(x, 16), line.split('|')) for line in lines] k1 = obtain_k1() assert all([len(x)==1 for x in k1]) k1 = merge([x[0] for x in k1]) k2 = obtain_k2(k1) assert all([len(x)==1 for x in k2]) k2 = merge([x[0] for x in k2]) k3 = obtain_k3(k1, k2) print k1, k2, k3 def make_flag(k): param = md5(b"%x%x%x" % (k[0], k[1], k[2])).hexdigest() return "SharifCTF{%s}" % param print make_flag((k1, k2, k3))