Set 3
17. The CBC padding oracle
The first function we have to create selects one of the 10 given strings at random, pads & encrypts with a random key. Nothing fancy here, except that as usual we need to store the IV and the key.
The 2nd function is our oracle:
Now comes the tricky bit (see References below). We’ll start with the first 2 blocks only - C1||C2
. Our aim is to modify C1
- turning it into C'1
such that when we pass C'1||C2
to padding_oracle
it returns True. That is, we end up with P'2
such that the block ‘unpads’ nicely - which allows us to back out what I2
should be, bypassing the decryption step entirely. We can then do C1^I2
to get P2
.
For simplicity, the above only takes the first 2 blocks into account. Expanding this to cover all (adjacent) blocks should be trivial.
References
skullsecurity and Robert Heaton’s blog. This small article on padbuster is neat too.
18. Implement CTR, the stream cipher mode
CTR mode isn’t particularly complex, but we need to be careful about padding (or lack of). With CTR we essentially end up xor’ing a stream of data with our plaintext to generate ciphertext (or vice versa for decryption).
The following snippet illustrates how to do this as a standalone function. It’s a little counter-intuitive but we don’t use AES on the ciphertext itself - instead we use it on this incrementing counter function and only then do we xor it with the ciphertext.
Which yields b"Yo, VIP Let's kick it Ice, Ice, baby Ice, Ice, baby "
.
19. Break fixed-nonce CTR mode using substitions
For this one, the first thing to understand is that we are not recovering the AES key used to generate the key stream. Instead, we are recovering the key stream itself (which, in a way, is almost as good as the key given we already have the ciphertext - and K (+) C = P).
20. Break fixed-nonce CTR statistically
Challenge 20 is an extension of challenge 19 and 6 (Break repeating-key XOR). We first pick the length of our smallest ciphertext and truncate all the remaining to that length. We now have 40 samples of what should be sourced from an English text - so as before, the most common character is space (character, not letter).
21. Implement the MT19937 Mersenne Twister RNG
For this challenge, we look at the pseudocode made available in Wikipedia. There is a Python implementation but where’s the fun in that. Note that by default Python’s random
module uses that by default according to this PEP. At the very least it should be a way for us to test our implementation!
This is pretty much a straight-up implementation of the pseudocode listed on Wikipedia - I haven’t tried to optimise it.
Now trying this against random.getstate
, we don’t get anything matching at all. What gives?
It turns out that Python’s seed
method uses a different initialisation routine - as we can see from the states not matching at all. To verify our implementation, we use numpy
instead. And indeed after setting numpy.random.seed
, numpy.random.get_state
ends up matching our internal state.
For those interested, there is a bit more colour to be found in the Python source code for the random module - but to get underneath it all we’d probably need to look at the code for the actual _random
module.
22. Crack an MT19937 seed
The instructions given are fairly straightforward:
- time sleep (randint, 2, 30)
- seed with
int(time.time)
- do step one again
- return the first random value returned by the RNG
Here we need to figure out what the seed was. Simply, we need to figure out what the epoch was. It’s a bit of a bruteforce approach - we just need to ‘rewind’ the seconds until such a time that the first random value returned matches the original one.
If we didn’t know the seed had been based on the current time, this would have been considerably more difficult!
23. Clone an MT19937 RNG from its output
The steps in extract_number
are as follows, with the values hardcoded for simplicity:
y ^= (y >> 11) & 0xFFFFFFFF
y ^= (y << 7) & 0x9D2C5680
y ^= (y << 15) & 0xEFC60000
y ^= y >> 18
For us to invert the transform, we will reverse each operation starting from the last one. But first, let’s take a quick look at what this means.
For simplicity, let’s pick k = 4077814955. This is because the left-most bit will be set to 1, which makes it easier to visualise the shift.
Which yields:
k 11110011000011101000010010101011
k>>18 00000000000000000011110011000011
k^k>>18 11110011000011101011100001101000
We see the right shift moves everything down to the right by 18 places - leaving 18 zeros. When xor’ed with k
, those 18 left-most zeros are essentially a no-op and allow us to recover the original 18 bits.
Let’s do this piece-wise. For simplicity let kdash=k^k>>18
. We’ll isolate them by and’ing them with a 32-bit number composed of 18 leading 1’s (which you can easily find via hex(0b<insert 1's and 0's here>
):
yielding:
kdash 11110011000011101011100001101000
top18 11110011000011101000000000000000
We know that k
was bitshifted to the right by 18 zeros before being xor’ed with itself. We also know that k>>18
xor’ed with kdash
will yield the original k
. We can get the bottom 14 by bitshifting top18
by 18 to the right. Let’s break it down:
top18 11110011000011101000000000000000
bot14 00000000000000000011110011000011
kdash 11110011000011101011100001101000
k 11110011000011101000010010101011
We’re almost there! We see that by xoring the bottom 14 bits with kdash
we recover k
. The actual sequence can be summarised as:
As a side note, I just want to emphasise all we’ve done is derive k
back from kdash
(i.e., what k
was before the transform).
Now let’s take a look at the next transform:
y ^= (y << 15) & 0xEFC60000
Just as before, we’ll break it down. Let’s keep k
as defined above and let m
be the mask 0xEFC60000
.
k 11110011000011101000010010101011
(k<<15)&0xffffffff 01000010010101011000000000000000
0xefc60000 11101111110001100000000000000000
(k<<15)&0xefc60000 01000010010001000000000000000000
For the 2nd step we need to mask with 0xFFFFFFFF
because we want to keep the numbers as 32-bit (you don’t need to do this in code because the mask will take care of this for you - only for display purposes). The mask means we can easily recover the last 17 bits - because we just &
them with 0’s.
For the sake of briefty:
TODO: rework this section. It was painful getting the below, and a decent explanation would help - just so it doesn’t look like those ops were plucked out of thin air.
We can then string the above to reverse each transform successively (look for reverse_mersene_transform
in crypto_utils.py
).
With this done, all we need is to source 624 numbers from the generator to recover the initial state:
>>> o=MT()
>>> o_clone = MT()
>>> o.seed_mt(123)
>>> state = [reverse_mersene_transform(n) for n in [o.extract_number() for _ in range(624)]]
>>> o_clone.index = 624 # since we generated 624 values
>>> o.index
624
>>> o_clone.mt = state
>>> o.extract_number()
2439673420
>>> o_clone.extract_number()
2439673420
Job done!