Title: Cryptography, Attacks and Countermeasures Lecture 3 Stream Ciphers
1Cryptography, Attacks and Countermeasures
Lecture 3 - Stream Ciphers
- John A Clark and Susan Stepney
- Dept. of Computer Science, University of York, UK
- jac,susan_at_cs.york.ac.uk
2Stream Ciphers
- Part I Pseudo-random number generators.
- Lots of Bad Ways
- Part II Divide and conquer attacks.
3Stream Ciphers - Vernam
- Vernam Cipher works by generating a random bit
stream and then XORing that stream on a bit by
bit basis with the plaintext.
[Figure: sender and receiver each use the key K to generate the same random stream Bi; XORing the stream with the plaintext gives the ciphertext Ci.]
Both sender and receiver can generate key stream
Bi. Receiver XORs the ciphertext stream with the
key stream to recover the plaintext stream. We
will use this cipher to illustrate several
concepts.
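The XOR step described above can be sketched in a few lines. This is only an illustration: the bit lists and the helper name `xor_bits` are made up for the example, and the key stream is written out by hand rather than generated from a key K.

```python
def xor_bits(stream_a, stream_b):
    """XOR two equal-length bit sequences position by position."""
    return [a ^ b for a, b in zip(stream_a, stream_b)]


# Illustrative values only; a real key stream Bi would be generated from the key K.
plaintext = [1, 0, 1, 1, 0, 0, 1, 0]
key_stream = [0, 1, 1, 0, 1, 0, 0, 1]

ciphertext = xor_bits(plaintext, key_stream)   # sender: Ci = plaintext bit XOR Bi
recovered = xor_bits(ciphertext, key_stream)   # receiver: XOR with the same stream
```

Because XORing twice with the same bit cancels out, `recovered` equals `plaintext`.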
4Linear Feedback Shift Registers
[Figure: a linear feedback shift register; Lij labels its output bit.]
At each iteration there is a right shift, a bit
falls off the end, and the leftmost bit is set
according to the linear feedback function. Here
0011
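A minimal sketch of one iteration as described above, assuming the register is held as a Python list with the leftmost bit first, the output is the bit that falls off the right-hand end, and the feedback is the XOR of the tapped positions. The tap positions `(0, 3)` and the function names are assumptions for illustration; the slide's own figure is not reproduced here.

```python
def lfsr_step(register, taps):
    """One clock tick: return (output_bit, new_register)."""
    feedback = 0
    for position in taps:          # linear feedback: XOR of the tapped bits
        feedback ^= register[position]
    output_bit = register[-1]      # the bit that falls off the right-hand end
    return output_bit, [feedback] + register[:-1]   # right shift, new leftmost bit


def lfsr_stream(register, taps, count):
    """Collect `count` output bits starting from the given register contents."""
    bits = []
    for _ in range(count):
        bit, register = lfsr_step(register, taps)
        bits.append(bit)
    return bits


# Hypothetical example: register 0011 with taps on the leftmost and rightmost bits.
first_output, next_register = lfsr_step([0, 0, 1, 1], (0, 3))
```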
5Periodicity
- We would like the stream to be random-looking.
- One feature should be that the stream should not repeat itself too quickly.
- Note that this is in effect a finite state machine and so must repeat itself eventually.
- The maximal period for an n-bit register is 2^n - 1.
- Why not 2^n?
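One way to explore the question is to look at the all-zero register. The XOR of zeros is zero, so whatever taps are chosen the all-zero state steps to itself and can never be part of a longer cycle. The sketch below checks this for a 4-bit register; the register layout (leftmost bit first) and the tap positions are assumptions made for the example.

```python
from itertools import product


def next_state(register, taps):
    """Right shift by one; the new leftmost bit is the XOR of the tapped bits."""
    feedback = 0
    for position in taps:
        feedback ^= register[position]
    return (feedback,) + tuple(register[:-1])


n = 4
taps = (0, 3)                                  # hypothetical tap positions
all_states = list(product((0, 1), repeat=n))   # 2^n register contents in total
zero = (0,) * n

stuck = next_state(zero, taps) == zero         # the all-zero state never changes
usable_states = len(all_states) - 1            # at most 2^n - 1 states on a cycle
```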
6Maximal Period m-sequences
- The tap sequence defines the linear feedback function and is often regarded as a finite field polynomial.
- You have to choose the tap sequence very carefully.
- Some choices provide a maximal length period.
- These are primitive polynomials.
7Primitive Polynomials Give m-sequences
[Figure: 4-bit LFSR with register contents 0 0 1 1.]
It is common to denote the above by the polynomial C(D) = 1 + D + D^4. Note we are back to where we started.
8Some Polynomials Don't
[Figure: 4-bit LFSR with register contents 0 0 1 1.]
The polynomial C(D) = 1 + D + D^3 does not give a maximal period sequence for this 4-bit register.
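The two tap choices can be compared by stepping a 4-bit register until a state repeats. This is a sketch under assumptions: the register is a tuple with the leftmost bit first, the feedback goes into the leftmost position, and tap position p corresponds to the term D^(p+1), so `(0, 3)` stands for 1 + D + D^4 and `(0, 2)` for 1 + D + D^3. That mapping is an assumption of this example, not something taken from the figures.

```python
def next_state(register, taps):
    """Right shift by one; the new leftmost bit is the XOR of the tapped bits."""
    feedback = 0
    for position in taps:
        feedback ^= register[position]
    return (feedback,) + tuple(register[:-1])


def cycle_length(start, taps):
    """Length of the cycle the register eventually falls into."""
    seen = {}
    state = tuple(start)
    step = 0
    while state not in seen:       # stop at the first repeated state
        seen[state] = step
        state = next_state(state, taps)
        step += 1
    return step - seen[state]


period_primitive = cycle_length((0, 0, 1, 1), (0, 3))   # C(D) = 1 + D + D^4
period_other = cycle_length((0, 0, 1, 1), (0, 2))       # C(D) = 1 + D + D^3
```

With these conventions the first choice cycles through all 15 non-zero states, while the second settles into a cycle of length 7, because the rightmost register bit never feeds back.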
9Not good for PRNG
- Consider a 64 bit register. Can this be used as a key stream generator?
- No. Once you know a very small amount of plaintext (e.g. 32 consecutive bits) then you can calculate the corresponding key stream and so you know the rightmost 32 bits in the register.
- You can now try in turn all other 2^32 combinations for the rest. When you get the right one, you are able to generate the whole key stream.
- And so the plaintext should make sense.
- This is just too easy to break.
- But LFSRs are very easy to implement and execute quickly.
- Can we fix matters?
- How about a less primitive way of extracting the key stream?
- How about combining several streams to achieve better security?
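The attack above can be sketched at a much smaller scale: a 16-bit register with 8 known plaintext bits, so only 2^8 guesses are needed. Everything specific here is an assumption for illustration: the tap positions, the secret state, and the use of a direct comparison with the plaintext as a stand-in for "the plaintext should make sense".

```python
from itertools import product


def lfsr_stream(register, taps, count):
    """Output the rightmost bit, shift right, feed the XOR of taps in on the left."""
    register = list(register)
    bits = []
    for _ in range(count):
        feedback = 0
        for position in taps:
            feedback ^= register[position]
        bits.append(register[-1])
        register = [feedback] + register[:-1]
    return bits


def candidate_states(known_key_bits, register_size):
    """Yield every initial state consistent with the known start of the key stream."""
    known_right = list(reversed(known_key_bits))   # first outputs = rightmost bits
    unknown = register_size - len(known_right)
    for guess in product((0, 1), repeat=unknown):
        yield list(guess) + known_right


TAPS = (0, 3, 15)                                        # hypothetical taps
secret_state = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0]
plaintext = [0, 1, 0, 0, 0, 0, 0, 1,                     # ASCII "ATT" as bits
             0, 1, 0, 1, 0, 1, 0, 0,
             0, 1, 0, 1, 0, 1, 0, 0]
key_stream = lfsr_stream(secret_state, TAPS, len(plaintext))
ciphertext = [p ^ k for p, k in zip(plaintext, key_stream)]

# The attacker knows the first 8 plaintext bits, hence the first 8 key stream bits.
known_key_bits = [c ^ p for c, p in zip(ciphertext[:8], plaintext[:8])]
candidates = list(candidate_states(known_key_bits, 16))


def decrypt(state):
    stream = lfsr_stream(state, TAPS, len(ciphertext))
    return [c ^ k for c, k in zip(ciphertext, stream)]


# Stand-in for "the plaintext makes sense": compare with the real plaintext.
matches = [state for state in candidates if decrypt(state) == plaintext]
```

Only the 256 candidates are tried instead of all 2^16 states, and the one that survives the check is the secret state.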
10Very Simple Model
Use some function f to operate on some subset of the LFSR register components.
[Figure: selected register bits of LFSR 1 feed f, which outputs the keystream bit Zj.]
11Boolean Functions Algebraic Normal Form (ANF)
- A Boolean function on n inputs can be represented in minimal sum (XOR, ⊕) of products (AND, .) form.
- This is the algebraic normal form of the function.
- The algebraic degree of the function is the size of the largest subset of inputs (i.e. the number of xj in it) associated with a non-zero coefficient.
- 1 is a constant function (as is 0)
- x1 ⊕ x3 ⊕ x5 is a linear function
- x1.x3 ⊕ x5 is a quadratic function
- x1.x3.x5 ⊕ x4 ⊕ x5 ⊕ x2 is a cubic function
f(x1,...,xn) = a0 ⊕ a1.x1 ⊕ ... ⊕ an.xn
  ⊕ a1,2.x1.x2 ⊕ ... ⊕ an-1,n.xn-1.xn
  ⊕ ... ⊕ a1,2,...,n.x1.x2...xn
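The ANF coefficients and the algebraic degree can be computed from a truth table. The sketch below uses the standard binary Möbius transform, which is one common way to do this and is not taken from the slides; the function names are made up for the example.

```python
from itertools import product


def anf_coefficients(f, n):
    """Binary Möbius transform: truth table in, ANF coefficients out."""
    coeffs = [f(*bits) for bits in product((0, 1), repeat=n)]
    for i in range(n):
        step = 1 << i
        for index in range(len(coeffs)):
            if index & step:
                coeffs[index] ^= coeffs[index ^ step]
    return coeffs    # coeffs[m] is the coefficient of the monomial with bit mask m


def algebraic_degree(f, n):
    """Size of the largest monomial with a non-zero coefficient (0 for constants)."""
    coeffs = anf_coefficients(f, n)
    return max((bin(mask).count("1") for mask, c in enumerate(coeffs) if c),
               default=0)


constant = algebraic_degree(lambda x1, x2, x3, x4, x5: 1, 5)
linear = algebraic_degree(lambda x1, x2, x3, x4, x5: x1 ^ x3 ^ x5, 5)
quadratic = algebraic_degree(lambda x1, x2, x3, x4, x5: (x1 & x3) ^ x5, 5)
cubic = algebraic_degree(
    lambda x1, x2, x3, x4, x5: (x1 & x3 & x5) ^ x4 ^ x5 ^ x2, 5)
```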
12Very Simple Model
What about a linear function f?
[Figure: register bits of LFSR 1 feed a linear function f, which outputs the keystream bit Zj.]
13Very Simple Model
- This would be pretty awful. Suppose we know a sequence of keystream bits z0, z1, z2, z3 = 1, 1, 1, 1.
- Essentially every key stream output can be expressed as a linear function of the elements of the initial state. We can derive a number of these equations and then solve them by standard linear algebra techniques.
14Very Simple Model
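[Figure: 4-bit LFSR with state s0 s1 s2 s3; the tapped bits are XORed to give the feedback.]
z0 = s0 ⊕ s2
z1 = s1 ⊕ s3
z2 = s2 ⊕ (s0 ⊕ s3)
z3 = s3 ⊕ (s0 ⊕ s1 ⊕ s3) = s0 ⊕ s1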
15Very Simple Model
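We can apply linear algebra equation solving techniques and solve for the si.
z0 = s0 ⊕ s2
z1 = s1 ⊕ s3
z2 = s2 ⊕ (s0 ⊕ s3)
z3 = s3 ⊕ (s0 ⊕ s1 ⊕ s3) = s0 ⊕ s1
In matrix form (arithmetic mod 2):
| 1 0 1 0 |   | s0 |   | z0 |   | 1 |
| 0 1 0 1 | . | s1 | = | z1 | = | 1 |
| 1 0 1 1 |   | s2 |   | z2 |   | 1 |
| 1 1 0 0 |   | s3 |   | z3 |   | 1 |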
This has solution s0 = 0, s1 = 1, s2 = 1, s3 = 0.
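The system can be solved mechanically by Gaussian elimination with arithmetic mod 2. The sketch below is a plain illustration with a made-up helper name, and it assumes the matrix is invertible. Substituting the four equations as written, with the unknowns listed in the order s0, s1, s2, s3, gives 0, 1, 1, 0.

```python
def solve_gf2(matrix, rhs):
    """Gauss-Jordan elimination mod 2 for a square, invertible system."""
    size = len(matrix)
    rows = [row[:] + [value] for row, value in zip(matrix, rhs)]   # augmented
    for col in range(size):
        # Find a row at or below the diagonal with a 1 in this column.
        pivot = next(r for r in range(col, size) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col and rows[r][col]:
                # Adding rows mod 2 is a bitwise XOR.
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


A = [[1, 0, 1, 0],     # z0 = s0 + s2
     [0, 1, 0, 1],     # z1 = s1 + s3
     [1, 0, 1, 1],     # z2 = s0 + s2 + s3
     [1, 1, 0, 0]]     # z3 = s0 + s1
z = [1, 1, 1, 1]

state = solve_gf2(A, z)    # [s0, s1, s2, s3]
```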
16Harder Model
What about a non-linear function f?
This is better but it is still possible to attack
such systems if f is approximated by a linear
function. We will talk about approximations later.
17Classical Stream Cipher Model
Plaintext stream Pj, keystream Zj, cipherstream Cj.
[Figure: LFSR 1, LFSR 2, ..., LFSR n produce outputs L1j, L2j, ..., Lnj, which are fed into the combining Boolean function f to give the keystream.]
- Choose f very carefully.
- N-bit registers. Initial register values form the key.
- Receiver can generate key stream and recover plaintext.
18Periodicity
- The LFSRs need not all be the same length.
- The LFSRs will give a vector input whose period is the least common multiple of the periods of the individual LFSRs (this equals their product when the periods are pairwise coprime).
- E.g. if the periods are LFSR1 = 3 and LFSR2 = 7 then the overall period is 21.
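The arithmetic can be checked with a short helper. The function name is made up for the example, and the periods other than 3 and 7 are illustrative.

```python
from functools import reduce
from math import gcd


def combined_period(periods):
    """Least common multiple of the individual LFSR periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


coprime_case = combined_period([3, 7])      # coprime periods: lcm equals product
shared_factor = combined_period([15, 63])   # common factor 3: lcm below product
```

When the periods share a factor, as with 15 and 63, the combined period (315) is smaller than the product (945), which is one reason to pick register lengths with coprime periods.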
19Awful Choice for f
This is a truly awful choice. The key is intended to be 2 x 32 = 64 bits.
Zj = f(x1j, x2j) = x1j
You have completely ignored LFSR 2. Key size = 32 bits only.
[Figure: LFSR 1 (output x1j) and LFSR 2 (output x2j) feed f. 32-bit registers; initial register values form the key.]
20Better but Still Awful Choice for f
Congratulations! You have not ignored LFSR 2! Key size = 64 bits?
[Figure: LFSR 1 (output x1j) and LFSR 2 (output x2j) feed f. 32-bit registers; initial register values form the key.]
21Better but Still Awful Choice for f
- Well, not quite such a good choice.
- Suppose you know 32 consecutive bits of plaintext (or can guess them correctly).
- Calculate the 32 bits of key stream.
- But if a stream bit is 0 then there are only 2 possible pairs of register bits. Similarly if the stream value is 1.
- So each guess for one register fixes the other: effective key size = 2^32.
[Figure: register bits x32 x31 x30 ... x2 x1 and y32 y31 y30 ... y2 y1, with example stream bits 1 1 0 1 0.]
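A small sketch of why the keyspace collapses, assuming the combining function on this slide is f(x1,x2) = x1 ⊕ x2. The formula itself did not survive in the text, so this is an assumption, though it is the two-input function consistent with "only 2 possible pairs" for each stream bit. For every guess of the x bits, the y bits are forced.

```python
from itertools import product


def candidate_pairs(key_stream_bits):
    """For each guess of the x bits, derive the only compatible y bits."""
    pairs = []
    for x in product((0, 1), repeat=len(key_stream_bits)):
        y = tuple(z ^ xi for z, xi in zip(key_stream_bits, x))   # y = z XOR x
        pairs.append((x, y))
    return pairs


stream = (1, 1, 0, 1, 0)          # illustrative stream bits
pairs = candidate_pairs(stream)
```

With 5 bits per register there are 2^5 consistent pairs rather than 2^10; scaled up to 32-bit registers this is 2^32 rather than 2^64.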
22First Bit of Bad Linearity
- The combination function here is a linear function of the inputs.
- f(x1,x2) = x1
- f(x1,x2) = x2
- f(x1,x2) = x1 ⊕ x2
- The following are quadratic functions
- f(x1,x2) = x1.x2 ⊕ x1
- f(x1,x2) = x1.x2 ⊕ x2
- f(x1,x2) = x1.x2 ⊕ x1 ⊕ x2
- Extreme examples given but beware linearity: even a hint of it can spell trouble.
- Linear functions are so called because they can cause some cryptosystems to be broken straight away.
23Divide and Conquer Attacks
- Exploiting simple correlations in the combining
function
24Geffe Generator
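[Figure: LFSR 1 (output a1j) drives the select input of a 2x1 multiplexor; LFSR 2 (output b2j) and LFSR 3 (output c3j) are the data inputs; the multiplexor output is Zj.]
Z = (a . b) ⊕ (not(a) . c)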
25Geffe Generator DIVIDE AND CONQUER
Looking at the table it is clear that the output z agrees with b 75% of the time. It also agrees with c 75% of the time.
a b c z
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 0
1 1 0 1
1 1 1 1
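The agreement figures can be checked by enumerating the eight rows of the table. This is a small sketch; the function names are made up for the example.

```python
from itertools import product


def geffe(a, b, c):
    """Multiplexor output: b when a = 1, c when a = 0."""
    return (a & b) ^ ((1 - a) & c)


def agreement(input_index):
    """Fraction of truth-table rows where z equals the chosen input."""
    rows = list(product((0, 1), repeat=3))
    hits = sum(1 for row in rows if geffe(*row) == row[input_index])
    return hits / len(rows)


agree_a = agreement(0)
agree_b = agreement(1)
agree_c = agreement(2)
```

The select input a agrees with z only half the time, so it leaks nothing this way, while b and c each agree in 6 of the 8 rows.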
26Geffe Generator DIVIDE AND CONQUER
- So consider each possible initial state s of register LFSR2.
- Determine the LFSR2 stream that s produces.
- Check the degree of agreement of this stream with the actual key stream.
- Turns out:
- if state s is correct you will get roughly the right amount of agreement (about 75%).
- if state s is incorrect you will get roughly random (50%) agreement.
- Thus we have targeted LFSR2 and can easily break it.
- Now we can target LFSR3 in exactly the same way.
- So we can get LFSR2 and LFSR3. Now we can derive the selection LFSR1 state very straightforwardly: try every possible state. The correct one should allow you to simulate the whole sequence.
- Other ways too.
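The first step of the attack, targeting LFSR2 on its own, can be sketched with toy registers of 5, 6 and 7 bits. The register sizes, tap positions, initial states and function names are all assumptions chosen to keep the search tiny; they are not from the lecture.

```python
from itertools import product


def lfsr_stream(register, taps, count):
    """Output the rightmost bit, shift right, feed the XOR of taps in on the left."""
    register = list(register)
    bits = []
    for _ in range(count):
        feedback = 0
        for position in taps:
            feedback ^= register[position]
        bits.append(register[-1])
        register = [feedback] + register[:-1]
    return bits


# Hypothetical toy parameters.
TAPS_A, TAPS_B, TAPS_C = (1, 4), (0, 5), (0, 6)
STATE_A = [1, 0, 0, 1, 1]
STATE_B = [0, 1, 1, 0, 1, 0]
STATE_C = [1, 1, 0, 0, 1, 0, 1]
N = 200    # number of known key stream bits


def geffe_stream(state_a, state_b, state_c, count):
    a = lfsr_stream(state_a, TAPS_A, count)
    b = lfsr_stream(state_b, TAPS_B, count)
    c = lfsr_stream(state_c, TAPS_C, count)
    return [(x & y) ^ ((1 - x) & w) for x, y, w in zip(a, b, c)]


def rank_states(key_stream, taps, size):
    """Score every non-zero initial state by its agreement with the key stream."""
    scored = []
    for state in product((0, 1), repeat=size):
        if not any(state):
            continue
        guess = lfsr_stream(state, taps, len(key_stream))
        hits = sum(1 for g, z in zip(guess, key_stream) if g == z)
        scored.append((hits / len(key_stream), state))
    return sorted(scored, reverse=True)     # best agreement first


key_stream = geffe_stream(STATE_A, STATE_B, STATE_C, N)
ranking = rank_states(key_stream, TAPS_B, 6)   # only 63 candidates for LFSR2
best_score, best_state = ranking[0]
```

The correct LFSR2 state is expected to score around 75% and wrong states around 50%, so it should normally head the ranking. The same function can then be pointed at LFSR3 with `TAPS_C`, which is what makes the total work a sum rather than a product.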
27Divide and Conquer
- Divide and conquer attacks were suggested by Siegenthaler as a means of exploiting approximate linear relationships between function inputs and its output.
- This led to new criteria being developed as countermeasures to these correlation attacks.
- We will consider an extremely simple example.
28Divide and Conquer
- Consider the following combining function
- f(x1,x2) = x1.x2 ⊕ x1
- Clearly not linear. But
- f(x1,x2) agrees with x1 75% of the time here.
- Consider each possible initial state of LFSR1 and determine the degree of agreement with the actual key stream.
- The correct initial state will give approximately 75% agreement and the rest will give fairly random agreement.
- It's also obvious that if we know f(x1,x2) = 1 then we know both x1 and x2; this is simply due to the incredibly small nature of the example.
x1 x2 f(x1,x2)
0  0  0
0  1  0
1  0  1
1  1  0
29Divide and Conquer
- Consider two functions f(x1,x2) and g(x1,x2)
- We say that f(x1,x2) is approximated by g(x1,x2) if the percentage of pairs (x1,x2) which give the same values for f and g differs from 50%.
- If they agree precisely half the time we say that they are uncorrelated.
- Note if the percentage of agreement is less than 50% we can always find a function that has positive agreement, namely g(x1,x2) ⊕ true.
30Ideas Generalise
- We can consider similar ideas for n-input functions
- f(x1,x2,...,xn) and
- g(x1,x2,...,xn)
- Degree of approximation with linear functions may be slight.
- The smaller the degree of approximation the more data you need to have to break the system.
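For small n the agreement of f with every linear function (and its complement) can simply be tabulated. The sketch below does this by exhaustive search; the function name and the representation of a linear function as a 0/1 mask plus a constant are assumptions of the example.

```python
from itertools import product


def linear_agreements(f, n):
    """Map (mask, constant) to the fraction of inputs where f equals
    constant XOR (XOR of the inputs selected by mask)."""
    inputs = list(product((0, 1), repeat=n))
    rates = {}
    for mask in product((0, 1), repeat=n):
        for constant in (0, 1):
            hits = 0
            for x in inputs:
                g = constant
                for m, xi in zip(mask, x):
                    g ^= m & xi
                hits += (g == f(*x))
            rates[(mask, constant)] = hits / len(inputs)
    return rates


small = linear_agreements(lambda x1, x2: (x1 & x2) ^ x1, 2)
geffe = linear_agreements(lambda a, b, c: (a & b) ^ ((1 - a) & c), 3)
```

For f(x1,x2) = x1.x2 ⊕ x1 the table shows 75% agreement with x1 and only 25% with x2, which becomes 75% once x2 is complemented (constant 1). An agreement below 50% is therefore just as exploitable as one above it.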
31And then what?
- The idea of multiple LFSRs is that the size of the keyspace should be the product of the keyspace sizes for each register.
- Divide and conquer reduces this to a sum of keyspace sizes and you attack each in turn.
- Note what happens when you crack one LFSR. The complexity of the remaining task is reduced.
- f(x1,x2) = x1.x2 ⊕ x1
- Once you know x1 then the task for x2 is simpler: whenever you know x1 = 1 you know what x2 is.
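The last point can be made concrete for f(x1,x2) = x1.x2 ⊕ x1. When x1 = 1 the output is x2 ⊕ 1, so x2 can be read off directly; when x1 = 0 the output is 0 whatever x2 is. The streams and the helper name below are illustrative only.

```python
def known_x2_bits(x1_stream, key_stream):
    """Return x2 where it is determined (x1 = 1), otherwise None."""
    return [(z ^ 1) if x1 == 1 else None
            for x1, z in zip(x1_stream, key_stream)]


# Illustrative streams; x2 is what the attacker is trying to learn.
x1 = [1, 0, 1, 1, 0, 1]
x2 = [0, 1, 1, 0, 0, 1]
z = [(a & b) ^ a for a, b in zip(x1, x2)]

partial = known_x2_bits(x1, z)
```

Roughly half of the x2 stream is revealed for free, which sharply cuts down the candidates for the second register.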
32All Fall Down
- In a similar vein, suppose
- There is a small exploitable correlation with input x1.
- There is a small correlation with x1 ⊕ x2.
- If LFSR 1 can be broken to reveal x1 then we now have a straightforward correlation with x2 to exploit.
33Don't tell them
- But what if you don't publicise the tap sequence, i.e. keep the feedback polynomial secret (as part of the key)?
- Makes things harder but there are in fact some further attacks here too.
34Summary
- Have presented some very simple stream cipher models.
- Divide and conquer attacks.
- Dangers of linearity and hints of it.
- Next lecture:
- What do we do about the dangers?
- Boolean function criteria.
- High non-linearity.
- High algebraic degree.
- Correlation immunity.
- Tradeoffs between them.