Assignment 2 Solutions

Problem 1. Distributivity over XOR

AND distributes over XOR. We can verify that a AND (b XOR c) = (a AND b) XOR (a AND c) always holds by constructing a truth table with all 8 possibilities for bits a,b,c. Alternatively, what does it mean for the left-hand side to be 1? We must have both arguments to AND be 1, so a=1 and exactly one of b,c is 1 and the other 0. That means exactly one of a AND b, a AND c is 1, and the other 0, so (a AND b) XOR (a AND c) is 1. A very similar argument shows that if the right-hand side is 1 then so is the left-hand side. Another way to look at this is that AND is multiplication mod 2 and XOR is addition, so this is just the ordinary distributive law of arithmetic mod 2.

OR does not distribute over XOR. A single counterexample suffices: 1 OR (1 XOR 1)= 1 OR 0 = 1. But (1 OR 1) XOR (1 OR 1) = 1 XOR 1 = 0.

Problem 2. Monoalphabetic Substitution Cipher for One- and Two-letter Plain Texts

Let's first think about the problem intuitively. What information does the ciphertext give away about the plaintext in the absence of the key? In the first instance, 'q' is just as likely to be the encryption of 'a' as the encryption of 'b', or any other letter. So Pr['q'= E(K,'a')] = 1/26 = Pr['q'=E(k,'b')], where the probability is taken over all keys uniformly. The same equation holds for any pair of plaintext letters and any ciphertext letter. So this cipher has perfect secrecy.

However, in the two-letter case, 'qr' cannot be the encryption of 'aa' or any other repeated letter. The information that the ciphertext leaks is that the two plaintext letters are different. Formally Pr['qr'=E('ab')]=1/(25 x 26) != 0 = Pr['qr'=E('aa')]. This one example is enough to show we DON'T have perfect secrecy.

Problem 3. Predicting a Linear Congruential PRNG

We get the first two bytes of the keystream by XORing the known plaintext and the ciphertext. This will give the original state of the RNG. We use the formula for the RNG to get the next state, and XOR this with the ciphertext to get the next two bytes of plaintext, which are a8 11 in hex. Here is the entire computation in Python: note how we switch between hex representation and ordinary decimal notation. But we never really needed to print the decimal result.

In [2]:
first_ptword=0xf329
first_ksword=first_ptword^0x00ff
next_ksword = (first_ksword*25173+13849)%(2**16)
print next_ksword
next_ptword=next_ksword^0xb036
print next_ptword, hex(next_ptword)
 6183
43025 0xa811

Problem 4. Decrypting Ciphertext from a Stream Cipher

This was the problem that caused the random-number generator headaches. Here is my solution with the newer version of the generator, so I used the second ciphertext. Basically, we convert the ciphertext back to a sequence of bytes, use the key to generate a keystream of the same length, and XOR the two, then convert the result to ASCII.

In [5]:
import bytestuff
import random

ct64='Lk5AWq6k+FQjzPPy2Gw/TwwJKZBp0lICWtnxSQ8RSJgwYjlPg6+kR/MW7yVvT1pNy1QEz8/0ph8Zi9AzEv13JqGHU2aZmwJPDX1l+YSS1P93fRQQMs3QcE99dCyRS2A6J4vNcw9saUls8KyNFEEqchM5hbLAsCT9Q0SOGPA6iD1oKeZhBib02laE7LsZ/fBfeLEFmRLykZOuAWxyS15ddSHQZ17z911vIZW1A/KIZGK22G428in5D8IPA7gFC+9nzJytLU3c+jLjHZRjmghidXP4Pdh38owB3fr6hjalj5MUAq8h7SRy/3/nA1BIGND0GLTVWmbndOCUx90UfUWIWqaaOpDTpgAQsYFyYDknPeDcOKRFtOv84xBW0gFe8FQhAdcFB2rf56ZVBCqj14FTc8h6SKWkKpU9+l6/1FfvxOuNoEbcIEi5+NIsDUMaM4SikB8F/aF4TC8GizJcyBHyJ2wE/mI5eXrdMq6cU3VyoGuW1bL10bgZ+ZYa6zRJOYDymluvCT4EvLU0n7y318yaFmVhkQzg+m0UaVq7omUmXygF/ExrExhGu2OnO2fVh5fXnYF5L6IA+xirB5PyjDhHa2t8Eu7oFku3SDoH0IcNztCJY2Q+48A7CJCqT5E9DsKew4H0EibdnwNqGZn4q2Vx1dhbhgf0gEfGZlIr9jPKUmZRRjMv3gYcMXxBT7RJ1lWVybu+Rz8TF+oV6CGBxztrBYmK97qWWumXyNj1K6yHj/VvKQWqzYnj0g1sEsNnOtYcLUoq8LJnpi1HDqPfkeHfDGgPzgZiQK/gyPwFgasn2SVeeeKzNUqwrSHzsAF48POhyLrs+lYJnCYbgxZVlnwebVVHo72Jv9mBX1rMZaxSQAZKcSwsaZ9F9SZGq5kTsAXj38fqBpuU6e0IOA0nHqxDSVdMaR1tqE3GDfvq6y0fohhiQKWU8puyw2racLgXfkts2VHguZJU5rSH/fxEjLzAevauM3YvIP4YmB6rfTlc1XWOdb8UAXpLdsWoN3B7eu/NzyYriRi7JRsaEleV1rsli0JCOjgxrnK3BWgKOnCHFg/5Gv48/wd/pWawOlQujEcSRia06uG6GI6FSMBWbQfa7VF2g2JrEIZMs+9CREU2uxkCjeMhF6BDgq+dJmwgYmK1yFwCETMHbclxS1Czm+9nUrmQduRnmyUpFI4z3wVxMlXAVWw7R17YevWwq2FqWP22YxS6ekpIaAFewWwV9NM1Iz3CzFdxH022NLSdeWWCJvZRgsKCZZfW5512GyU='
ct=bytestuff.b64_to_bytes(ct64)
random.seed('Rosebud')
keystream=[random.randint(0,255) for j in range(len(ct))]
pt=bytestuff.xor(keystream,ct)
print bytestuff.bytes_to_ascii(pt)
Ours was the marsh country, down by the river, within, as the river wound, twenty miles of the sea. My first most vivid and broad impression of the identity of things, seems to me to have been gained on a memorable raw afternoon towards evening. At such a time I found out for certain, that this bleak place overgrown with nettles was the churchyard; and that Philip Pirrip, late of this parish, and also Georgiana wife of the above, were dead and buried; and that Alexander, Bartholomew, Abraham, Tobias, and Roger, infant children of the aforesaid, were also dead and buried; and that the dark flat wilderness beyond the churchyard, intersected with dykes and mounds and gates, with scattered cattle feeding on it, was the marshes; and that the low leaden line beyond, was the river; and that the distant savage lair from which the wind was rushing was the sea; and that the small bundle of shivers growing afraid of it all and beginning to cry, was Pip.

The nice thing about these problems (I guess) is that you know when it's right, because if you get it wrong, it's complete gibberish. This passage is from the first page of Great Expectations by Charles Dickens.

Problem 5. Identify which two ciphertexts were sent.

Because the same keystream was used for both messages, if we denote by m1 and m2 the two plaintexts, and by c1 and c2 the corresponding ciphertexts, then we have the property

c1 XOR c2 = m1 XOR m2

So we have our work cut out for us: Compute the XOR of the two ciphertexts, then compute the XORs of all pairs of plaintexts (there are six pairs in all), and see if it matches what we got with the ciphertexts. The code below carries this out.

In [3]:
import bytestuff
c1_64='ZCkTVMy6oBNZm9QZTH2zzqU0c+PPVSf2dfoJWO8k391OTCsS5QA='
c2_64='aD5XGsK55UdYjdQmU2C73L8xdqLIEG+oe7MQUL0vyt1EVTUQ91w='
ciphertext_sum=bytestuff.xor(bytestuff.b64_to_bytes(c1_64),bytestuff.b64_to_bytes(c2_64))
p1_as='I met a traveller from an antique land'
p2_as='And on the pedestal these words appear'
p3_as='My name is Ozymandias---king of kings.'
p4_as='Look on my works ye Mighty and despair'
pts=[p1_as,p2_as,p3_as,p4_as]
for j in range(4):
    for k in range(j,4):
        psum=bytestuff.xor(bytestuff.ascii_to_bytes(pts[j]),bytestuff.ascii_to_bytes(pts[k]))
        if psum==ciphertext_sum:
            print 'match texts',j+1,k+1
match texts 2 3

The second and third texts, 'And on these pedestals these words appear My name is Ozymandias, king of kings.', give the only match. The lines are from the poem 'Ozymandias', by Percy Shelley. Note that there is no way to determine which of the two texts sent correspond to which of the two ciphertexts.

6a. Predicting the keystream of Java's Linear Congruential RNG

We first XOR the first seven bytes of ciphertext with the seven known bytes of plaintext to obtain the first seven bytes of keystream. We'll convert the first four bytes of the keystream to an integer. This is the first output of the RNG, and the top 32 bits of its initial state.

In [4]:
import bytestuff
c64='LNiDuYprmvEnf+OE0p9Eiz1WvhJG0OsiJJVC8RkR6gfyqri3W3fLapo9QCarlZRJhUFFGyD1KuhdgTfxkksTiPt9BHJQVMSaU0ECN9TRsqbcUlttrpLwJ5gUqEwXCGcpAF0l6BTubZhfYy5n1VciML4n4I09kMtHjLbYC7mD/Tcw0zP/SDsM60tn1kf3UIe+C59HOmL/Vl51rEFX8lOH3kmGPS2X2OlPIhASyH4eht/BsqQzcGxvNhRkYDdfQOYNEf4Bn2v2yn+SLiWDHFv4CLo46GDM+Y6rf0U40AeuykiEVAButlTnzCkzFfYjVqjusIqzyjtFN4k7XYWUXgLLfGOeihRVMATwU9peibsoIpuW2bfKcDy15CxjXkccfFlYjHbN9x+J/JDu9LAV9+63Hl1osJukx+QtR+04oNfxIuQZJ/9OuQu8fVr1AtRGczJrjwQFwIdSkRoKdLUId9vuYauRirnWX6enfTASbH53/Z0NZGutapsJXSWyE5wzVjsiOhlM19iRJRR7SuuhtqRFEgX4uS+d6xCAYkSRd16XVcTvDJpM44PQxsnEOkZiA+0OVAj1U93z+0EAYbYquf0sXhSMUtZhEvS4Yqfld1I/M6Ugi8HrK9WNJOAZrWdlsOzA4bquH2NSkHGNP1z2NAlxmIXtObpfaM/1Go3l8ZthHJXV8TY/HeEYfygEkvrRoRDbptqgG5n13P09WKhJ1wX4curF3E1ZvDkuvv58TOCX06Ov/qwQUvRnFYjIBPHthkN64740nXFxZhA5o3AOQS0Ke7GZ3pHc1yR1P8VC4yXBlqYurOKPbu9zN40wtrxV01vn99+YPk22uuEPZifSLrXS3Lh8/qr7U4/ysCIOA3j0btvdIAmhhXRzU26Ro+bzWReF0oRub1qvs4m3F9m8jfeaPQlbGFr7RTCAYkVxQG5+tKgJWg0MVhWHwD3VaGl7x9Ts8SMvsW+pWUOVH3ap/PkzFdqs7KxIpR2d49DD8zL3H8c97Z4LtFb0nrNl9lHY2m4QaPr05mY1pSctkJogGGV37NTHcCf2QAJ4ayNpe735AmjXpN7VWLHSBeB+NvGqCW58WWAuIQqtcly1WrlrJLlP5vcbTmMR+PYyjL+EmiHZ'
ciphertext=bytestuff.b64_to_bytes(c64)
plain_start=bytestuff.ascii_to_bytes('If you ')
key_start = bytestuff.xor(ciphertext[:7],plain_start)
print key_start
first_output=bytestuff.bytes_to_long(key_start[:4])
print first_output
[101, 190, 163, 192, 229, 30, 186]
1706992576

We do not know the low 16 bits of the initial state of the RNG, but since there are only 216 possibilities, we can try out each one, update the generator using this guess, and see if the top three bytes of the new state are [229,30,186], the next three bytes of keystream. We hope we get a match, and it would be really nice if we get only one match!

In [9]:
import javarandom
for j in range(2**16):
    javarandom.state=1706992576*(2**16)+j
    next4=javarandom.next_bytes()
    if next4[:3]==[229, 30, 186]:
        print 'match with state', 1706992576*(2**16)+j
print 'done'
match with state 111869465525919
done

Great--we got just one answer for the state. Let's generate the whole keystream. We'll first do a little reality check to make sure that the first seven bytes of the keystream are what we want, then XOR the keystream with the ciphertext to obtain the plaintext.

In [12]:
javarandom.state=111869465525919
keystream=javarandom.get_bytes(len(ciphertext))
print keystream[:7]
[229, 30, 186, 131, 66, 30, 143]

We're 4 bytes out of sync, so we will reattach the first four bytes to the keystream and then decrypt.
In [13]:
keystream=[101, 190, 163, 192]+keystream
keystream=keystream[:len(ciphertext)]
print bytestuff.bytes_to_ascii(bytestuff.xor(keystream,ciphertext))
If you really want to hear about it, the first thing you'll probably want to know is where I was born, and what my lousy childhood was like, and how my parents were occupied and all before they had me, and all that David Copperfield kind of crap, but I don't feel like going into it, if you want to know the truth.  In the first place, that stuff bores me, and in the second place, my parents would have two hemorrhages apiece if I told anything pretty personal about them. They're quite touchy about anything like that, especially my father.  They're nice and all-I'm not saying that-but they're also touchy as hell. Besides, I'm not going to tell you my whole goddam autobiography or anything.  I'll just tell you about this madman stuff that happened to me last Christmas just before I got pretty run-down and had to come out and take it easy.

The passage is the start of the novel The Catcher in the Rye, by J.D. Salinger

6b: Predicting keystream of an LFSR-based stream cipher

Here is the code for the LFSR. The state vector consists of the 17 bits

\[b_{16}\cdots b_1b_0\]

When the state is updated, it becomes

\[bb_{16}\cdots b_2b_1\] The new high-order bit \[b=b_{14}\oplus b_0\] is also the next output bit.

In [3]:
import bytestuff
state = 1

#The key is a string which we will convert to
#a long integer and then grab the low-order 17 bits
#as a regular int to initialize the register.

def init_state(key):
    global state
    bytelist=bytestuff.ascii_to_bytes(key)
    state=int(bytestuff.bytes_to_long(bytelist)&(0x1ffff))



#update the register and return the new high-order bit
def update_state():
    global state
    lowbit=state&1
    tapbit=((1<<14)&state)>>14
    state=(state>>1)
    highbit=(lowbit^tapbit)<<16
    state = state|highbit
    return lowbit^tapbit

#Get the next byte from the register
def next_byte():
    val=0
    for j in range(8):
        val=2*val+update_state()
    return val

#Return a list of the next k bytes from the register
def get_bytes(k):
    blist=[]
    for j in range(k):
        blist.append(next_byte())
    return blist

To decrypt our ciphertext, we first convert the base 64 representation to a sequence of bytes. We are given the first three bytes of plaintext, so we xor these with the start of the ciphertext to obtain the first thre bytes of the keystream.

In [6]:
c64="cXE/EbRiYqWRzN507PGyn5uNioiiUIruw0gbaWvdpqRiRQXVo8bwUh5UV3ZPWsqFgYei7SdnHvQTbIx0e+RI11igJNRzOcAQrvIZHqYak8iypBn5RQyp9WwhdgwF43+1S/Y/IioAVO+BTaeFPoIONQll7lNyviVPdogi42gqUk9lNUXtFQomQ25x8mCqvmSflG2I6u2w0fCS0+QQizfO18YjjHDvD7BT11DoVG8BBoTmoMdIoxVlPijUTK3nDsKF+BNSVjSGNnqKZZa16BXaF/xdbcx+ZiYsP94gwBmno7bZMIjRzZLNwtbAVpgvF03ky+foLVVRwfCiYx73il+cptAlcQNtRm3P0bzOvTkjlJWhcIy6V4DAHTo+3p3pldMUOxVKjc1yejYJZDKolvNZZ4AcXqKv/Lv3o5vhoAwDNCsI2EE/H5j5jRw0c97MHvqxT5cNfdpoyUOA4hDDAYulzkuMeT2dnTxptpe6yiCKP+1gHRx8RejOCeMasXW8oJi5JZqTHsymzWiFShsUUcaT7l7QSOyv8HJYscpszPJZJxjZAeew0Ck83h3hfghI4n0amMZ2GSTYGl88dJ4sNbjkX8Eb5IgMn2NcrX8FCHgwgbwALoPcn958je2ogcTFgJ11C+gz+kfLQQA1ycuCdTRmhyJbSGvPfcjxUdRRp3ZJ93+9t0qer+zeH2/s/1EuoWn1t783hgbpDbGVEDIfMjzXZv4P5a/2jDGMR4TsWvqjCB/oy2F2TPWNCBuj65yVdl2+Vdtd"
ct_bytes=bytestuff.b64_to_bytes(c64)
ptstart_bytes=bytestuff.ascii_to_bytes("He ")
keystream_start=bytestuff.xor(ptstart_bytes,ct_bytes[:3])
print bytestuff.bytes_to_bin(keystream_start)
00111001 00010100 00011111 

We now have the first 24 bits of the keystream. At this point it is possible to mount a brute-force attack, testing every one of the \(2^{17}\) possible initial states and seeing if they generate the start of our keystream. But such an attack scales up badly and becomes impractical for larger registers. The algebraic method we present below is practical for longer registers as well.

As above, we will denote the initial state of the register as \(b_{16}\cdots b_0.\) We will also denote the next 17 bits generated by \(b_{17},\ldots,b_{33}.\) For \(i=0,\ldots 16,\) we have \[b_{i+17}=b_i\oplus b_{i+14}.\]

Now we know the first 17 bits of our keystream, which means we have the values of \(b_{17},\ldots b_{33}.\) This lets us solve the equation above for \(i=3,\ldots,16.\)

In [7]:
b_next=[0,0,1,1,1,0,0,1,0,0,0,1,0,1,0,0,0]
b_before=[b_next[i]^b_next[i+3] for i in range(14)]
print b_before
[1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1]

We can now solve the first three equations: \(b_2=b_{16}\oplus b_{19}=1\oplus 1 = 0.\)

\(b_1=b_{15}\oplus b_{18}=0\oplus 0 = 0.\)

\(b_0=b_{14}\oplus b_{17}=1\oplus 0=1.\)

Now we've reconstructed the original state vector. We'll reverse it so that the low-order bit is at the right, convert it back into an integer, and then just check to make sure that it generates the same 3 keystream bytes we started with.

In [8]:
initial_bits=[1,0,0]+b_before
initial_bits.reverse()
val=0
for j in initial_bits:
    val=2*val+j
state=val
get_bytes(3)==keystream_start
Out[8]:
True

It worked! So now let's reset the initial state, generate the entire keystream, xor it with ciphertext, and read the message.

In [10]:
state=val
keystream=get_bytes(len(ct_bytes))
pt_bytes=bytestuff.xor(keystream,ct_bytes)
print bytestuff.bytes_to_ascii(pt_bytes)
He was worth looking at. He wore a shaggy borsalino hat, a rough gray sports coat with white golf balls on it for buttons, a brown shirt, a yellow tie, pleated gray flannel slacks and alligator shoes with white explosions on the toes.From his outer breast pocket cascaded a show handkerchief of the same brilliant yellow as his tie.There were a couple of colored feathers tucked into the bank of his hat, but he didn't really need them.Even on Central Avenue, not the quietest dressed street in the world, he looked about as inconspicuous as a tarantula on a slice of angel food.

This is an extract from the first page of the detective novel Farewell, My Lovely, by Raymond Chandler.