Title: Application of Combinatorial Mathematics to Cryptology: A Personal Journey
1Application of Combinatorial Mathematics to
Cryptology: A Personal Journey
- Ed Dawson
- Information Security Institute
- Queensland University of Technology
2Overview
- Introduction
- Combinatorial Structures
- Secret Sharing Schemes
- Latin Squares and Authentication Schemes
- Linear Codes and Boolean Functions
- Discrete Optimisation
- Genetic Algorithm
- Knapsack Cipher and Genetic Algorithm
- Boolean Functions and Discrete Optimisation
- Lessons Learned
3Introduction
4Areas of Application
- Combinatorial structures with special properties
- Provide discrete structures to build
cryptographic systems
- Discrete Optimisation
- Provide methods to search large finite structures
- Tool for designing cryptographic systems and for
cryptanalysis
5Combinatorial Structures
- Examples
- Ordered and unordered block designs used in
secret sharing schemes.
- Linear codes used in stream ciphers, block
ciphers, public key ciphers, authentication
codes, secret sharing schemes.
- Latin squares used in authentication schemes.
- Primitive polynomials used in stream ciphers.
6Discrete Optimisation Techniques
- Genetic Algorithm
- Hill Climbing
- Simulated Annealing
- Tabu Search
7Secret Sharing Schemes
8Shamir's Secret Sharing Scheme (1979)
- Key Generation
- Select a polynomial $f(x) = K + a_1 x + \cdots + a_{t-1} x^{t-1}$ over
$\mathbb{Z}_p$, where $p$ is a large prime
- Distribution: participant $P_i$ receives share $f(i)$ for
$i = 1, \ldots, n$
- Key Recovery
- Any t participants can recover the key K using their
shares by Lagrange interpolation
- This is a perfect t-out-of-n threshold scheme
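The key generation and recovery steps above can be sketched in Python as follows. This is a minimal illustration, not a hardened implementation: the prime `P` (the Mersenne prime 2^127 - 1) and the function names `make_shares` and `recover_key` are assumptions chosen for the example.

```python
import secrets

P = 2**127 - 1  # illustrative prime modulus (assumption, not from the slides)

def make_shares(key, t, n, p=P):
    """Return n shares (i, f(i)); any t of them determine the key."""
    # f(x) = key + a_1 x + ... + a_{t-1} x^{t-1} over Z_p, random a_j
    coeffs = [key] + [secrets.randbelow(p) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % p
        return acc

    return [(i, f(i)) for i in range(1, n + 1)]

def recover_key(shares, p=P):
    """Lagrange interpolation of f at x = 0, which equals the key."""
    key = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % p
                den = (den * (xj - xm)) % p
        # pow(den, -1, p) is the modular inverse (Python 3.8+)
        key = (key + yj * num * pow(den, -1, p)) % p
    return key

shares = make_shares(123456789, t=3, n=5)
print(recover_key(shares[:3]))  # any 3 of the 5 shares give back the key
```

With fewer than t shares, interpolation still returns some value, but every candidate key remains equally consistent with the shares held, which is what makes the scheme perfect.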
9Secret Sharing Schemes
- A t-out-of-n perfect threshold scheme is a method
whereby n pieces of information, called shares, of
a secret key K are distributed such that
- K can be reconstructed from knowledge of any t or
more shares
- Knowledge of fewer than t shares provides no
information about K
10Orthogonal Arrays (Dawson, Mahmoodian, Rahilly,
1993)
- t-out-of-n perfect threshold schemes can be
constructed using orthogonal arrays
- The simplest construction is Shamir's secret
sharing scheme.
11Breadth of Shamir's Secret-Sharing
Scheme (Dawson and Donovan 1994)
- General access control system for secret sharing
using Shamir's scheme, including:
- Democratic schemes
- Multi-level schemes
12Linear Codes and Boolean Functions
13Properties of Boolean Function
- Hamming Weight
- $\mathrm{wt}_H(f)$ is the number of ones in the truth table of f
- f(x) with n inputs is balanced if $\mathrm{wt}_H(f) = 2^{n-1}$
- Hamming Distance
- $\mathrm{dist}_H(f, g)$ is the number of truth table
positions in which f and g differ.
- Nonlinearity, $N_f$, of f(x) is the minimum Hamming
distance between f(x) and any affine function.
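The properties above can be computed directly from a truth table. The sketch below assumes a truth table stored as a Python list of 0/1 values of length 2^n, and uses the standard identity N_f = 2^(n-1) - max|W_f(w)|/2, where W_f is the Walsh-Hadamard transform; the function names are illustrative.

```python
def walsh_hadamard(tt):
    """Fast Walsh-Hadamard transform of a 0/1 truth table of length 2^n."""
    w = [1 - 2 * b for b in tt]  # map 0 -> +1, 1 -> -1
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def hamming_weight(tt):
    return sum(tt)

def hamming_distance(f, g):
    return sum(a != b for a, b in zip(f, g))

def is_balanced(tt):
    return hamming_weight(tt) == len(tt) // 2

def nonlinearity(tt):
    # Minimum distance to any affine function, via the transform.
    return len(tt) // 2 - max(abs(v) for v in walsh_hadamard(tt)) // 2

# f(x1, x2, x3) = x1*x2 XOR x3, with x1 the least significant index bit
f = [0, 0, 0, 1, 1, 1, 1, 0]
print(hamming_weight(f), is_balanced(f), nonlinearity(f))  # 4 True 2
```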
14Properties of Boolean Function
- Correlation
- f(x) has correlation immunity of order m if there is
zero correlation between f(x) and every linear
function $L_w(x) = w \cdot x$ with $1 \le \mathrm{wt}_H(w) \le m$
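Zero correlation with the linear function indexed by w is equivalent to a zero Walsh-Hadamard coefficient at w, so the order of correlation immunity can be read off the transform. A minimal sketch under that standard characterisation, with illustrative function names and truth tables stored as 0/1 lists:

```python
def walsh_hadamard(tt):
    """Fast Walsh-Hadamard transform of a 0/1 truth table of length 2^n."""
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def ci_order(tt):
    """Largest m with W_f(w) = 0 for every w of Hamming weight 1..m."""
    n = len(tt).bit_length() - 1
    w = walsh_hadamard(tt)
    m = 0
    for weight in range(1, n + 1):
        indices = [i for i in range(1, len(tt)) if bin(i).count("1") == weight]
        if any(w[i] != 0 for i in indices):
            break
        m = weight
    return m

parity3 = [0, 1, 1, 0, 1, 0, 0, 1]  # x1 XOR x2 XOR x3
print(ci_order(parity3))            # 2
```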
15Table: Upper bounds on the number of balanced CI(m)
Boolean functions
[Table: Correlation Immune Functions]
16Construction of Correlation Immune Functions
(Dawson, Wu 1997)
- Linear codes can be used to construct Boolean
functions with known order of correlation
immunity and nonlinearity.
- Theorem: Let $f(x) = g(xG^T)$, where g is a
non-degenerate Boolean function of k variables
and G is a generator matrix of an [n, k, d] linear
code. Then
- f(x) is balanced if and only if g(y) is
balanced,
- $\mathrm{ord}(f(x)) = \mathrm{ord}(g(y))$
- $N_f = 2^{n-k} N_g$
- The correlation immunity of f(x) is at least d-1
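The theorem can be checked numerically on a small case. The sketch below is an illustration under assumed inputs, not taken from the slides: G generates the [4, 3, 2] even-weight code and g(y1, y2, y3) = y1*y2 XOR y3. Bit i of an integer index holds variable i.

```python
def walsh_hadamard(tt):
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def nonlinearity(tt):
    return len(tt) // 2 - max(abs(v) for v in walsh_hadamard(tt)) // 2

def ci_order(tt):
    n = len(tt).bit_length() - 1
    w = walsh_hadamard(tt)
    m = 0
    for weight in range(1, n + 1):
        if any(w[i] for i in range(1, len(tt)) if bin(i).count("1") == weight):
            break
        m = weight
    return m

def compose(tt_g, G, n):
    """Truth table of f(x) = g(x G^T) for a k x n generator matrix G."""
    tt_f = []
    for x in range(2 ** n):
        bits = [(x >> i) & 1 for i in range(n)]
        y = 0
        for j, row in enumerate(G):
            y |= (sum(b & r for b, r in zip(bits, row)) % 2) << j
        tt_f.append(tt_g[y])
    return tt_f

G = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]   # n = 4, k = 3, d = 2
tt_g = [((y & 1) & ((y >> 1) & 1)) ^ ((y >> 2) & 1) for y in range(8)]
tt_f = compose(tt_g, G, 4)

print(sum(tt_f) == 8)                                   # f is balanced, as g is
print(nonlinearity(tt_f), 2 ** (4 - 3) * nonlinearity(tt_g))
print(ci_order(tt_f) >= 2 - 1)                          # at least d - 1
```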
17Latin Squares and Authentication Schemes (Denes
and Keedwell 1992)
- Let $(Q, *)$ denote a quasigroup where
- Q is a set of q elements
- $*$ is a binary operation such that $a * x = b$ and $y * a = b$
each have exactly one solution
- Let a message consist of s blocks of length t
18Latin Squares and Authentication Schemes
- Key Generation
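- Sender and receiver select a secret $(Q, *)$
- Authentication
- $M = a_1 a_2 \cdots a_m = B_1 B_2 \cdots B_s$
- $b_i = (\cdots((a_{i1} * a_{i2}) * a_{i3}) \cdots) * a_{it}$ for block $B_i = a_{i1} a_{i2} \cdots a_{it}$
- Transmit $a_1 a_2 \cdots a_m \, b_1 b_2 \cdots b_s$
- Verification
- Receiver uses $(Q, *)$ on $a_1 a_2 \cdots a_m$ to verify
$b_1 b_2 \cdots b_s$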
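A small sketch of the tag computation and verification follows. The order-4 Latin square `LATIN` is a hypothetical secret chosen for illustration, with `LATIN[a][b]` playing the role of the quasigroup product; the function names are likewise assumptions.

```python
from functools import reduce

# Hypothetical secret quasigroup on Q = {0, 1, 2, 3}: row a, column b gives a * b.
LATIN = [
    [0, 2, 1, 3],
    [2, 1, 3, 0],
    [3, 0, 2, 1],
    [1, 3, 0, 2],
]

def is_latin_square(sq):
    """Every row and every column must be a permutation of 0..q-1."""
    symbols = set(range(len(sq)))
    rows_ok = all(set(row) == symbols for row in sq)
    cols_ok = all({row[c] for row in sq} == symbols for c in range(len(sq)))
    return rows_ok and cols_ok

def tags(message, t, sq):
    """One tag per block of length t, folding the block left to right."""
    if len(message) % t:
        raise ValueError("message length must be a multiple of t")
    blocks = [message[i:i + t] for i in range(0, len(message), t)]
    return [reduce(lambda a, b: sq[a][b], block) for block in blocks]

def authenticate(message, t, sq):
    return list(message) + tags(message, t, sq)

def verify(received, m, t, sq):
    message, received_tags = received[:m], received[m:]
    return tags(message, t, sq) == list(received_tags)

sent = authenticate([1, 2, 3, 0, 3, 1], t=3, sq=LATIN)
print(sent)                           # [1, 2, 3, 0, 3, 1, 2, 3]
print(verify(sent, 6, 3, LATIN))      # True
print(verify([0] + sent[1:], 6, 3, LATIN))  # False: first symbol altered
```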
19Attack on Authentication Scheme (Dawson, Donovan,
Offer, 1996)
- Attack 1
- Given sufficient messages and authentication tags
it is possible for an attacker to recover $(Q, *)$
- Attacker can then impersonate sender
- Attack 2
- There exist equivalent quasigroups
20Genetic Algorithm
21Genetic Algorithm
- Holland circa 1975
- modelled on an evolutionary strategy
- reproduction incorporating mutation, and
- survival of the fittest
- a pool of solutions evolves based upon suitable
mating, mutation and selection schemes
- traditionally solutions are represented as a
binary string, however newer techniques allow for
arbitrary solution structures (evolutionary
programming).
22Example of Operators
- Selection: parents are chosen from the current
solution pool either at random, or based upon
their fitness (weighted selection)
- Mating: traditional crossover
- Mutation: random bit complementation; each bit
in the string is complemented with probability
$p_m$, the mutation rate.
23Example of Operators
- 1. Generate an initial pool of solutions
(randomly or otherwise) and calculate the fitness
of each.
- 2. For G iterations, using the current pool
- (a) Select the breeding pool from the current
solution pool and make pairings of parents.
- (b) Using a suitable mating function, use each
pair of parents to generate a new pool of
solutions.
- (c) Apply the mutation to each solution in the
new pool.
- (d) Evaluate the fitness of each of the new
solutions.
- (e) Based on the fitness of the solutions in the
new pool and the current pool, select the
solutions which will become the current pool in
the next iteration.
- 3. Output the best solution found.
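The steps above can be sketched as a generic genetic algorithm over binary strings. The specific choices here are assumptions for illustration: weighted selection, single-point crossover, an even pool size, a non-negative fitness function, and a toy fitness (the number of ones) in the usage line.

```python
import random

def genetic_algorithm(fitness, length, pool_size=20, generations=50,
                      pm=0.02, rng=None):
    """Evolve binary strings; returns the fittest solution found."""
    rng = rng or random.Random()
    # 1. Random initial pool
    pool = [[rng.randint(0, 1) for _ in range(length)]
            for _ in range(pool_size)]
    for _ in range(generations):
        # (a) Weighted selection of parents (small offset avoids all-zero weights)
        weights = [fitness(s) + 1e-9 for s in pool]
        parents = rng.choices(pool, weights=weights, k=pool_size)
        # (b) Single-point crossover on consecutive pairs
        children = []
        for p1, p2 in zip(parents[0::2], parents[1::2]):
            cut = rng.randrange(1, length)
            children.append(p1[:cut] + p2[cut:])
            children.append(p2[:cut] + p1[cut:])
        # (c) Complement each bit with probability pm
        children = [[b ^ (rng.random() < pm) for b in c] for c in children]
        # (d), (e) Evaluate, then keep the fittest of old and new pools
        pool = sorted(pool + children, key=fitness, reverse=True)[:pool_size]
    # 3. Best solution found
    return pool[0]

best = genetic_algorithm(sum, length=16, rng=random.Random(1))
print(best, sum(best))
```

Because step (e) keeps the best of both pools, the best fitness in the pool never decreases from one iteration to the next.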
24Attacks on Knapsack-Type Ciphers
- Merkle-Hellman cryptosystem
- based on an NP-hard adaptation of the subset sum
problem: given a set of integers, A, and an
integer B obtained by summing a subset of A, find
the subset (which is unique).
- a number of exploits exist which attack the
structure of the secret key (trapdoor); these
are very effective. In the Merkle-Hellman
cryptosystem the secret key is a super-increasing
sequence and the public key is obtained by
modular multiplication with a secret constant
- Spillman (1993) proposed a genetic algorithm to
solve the subset sum problem and hence attack the
knapsack cipher!
25Knapsack-Type Ciphers (Clark, Dawson 1994)
- Example (trivial in the extreme!)
- Public key A = (5457, 1663, 216, 6013, 7439)
- Message M = (1, 0, 1, 1, 0)
- Sum = 5457 + 216 + 6013 = 11686
- Spillman proposed a fitness based on how close
the subset sum is to the target. This will not work
since the difference in sums does not correlate with
Hamming distance
- M1 = 11110, Sum1 = 13349 (Hamming distance 1 from M, sum off by 1663).
- M2 = 10001, Sum2 = 12896 (Hamming distance 3 from M, sum off by 1210).
- This is not an exception, it is the general rule
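The arithmetic in this example is easy to reproduce. The short sketch below uses the slide's public key and messages; the helper names are illustrative.

```python
A = [5457, 1663, 216, 6013, 7439]  # public key from the example

def subset_sum(bits, a=A):
    return sum(w for b, w in zip(bits, a) if b)

def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))

M = [1, 0, 1, 1, 0]
M1 = [1, 1, 1, 1, 0]
M2 = [1, 0, 0, 0, 1]
target = subset_sum(M)  # 11686

for name, cand in (("M1", M1), ("M2", M2)):
    print(name, hamming(M, cand), abs(subset_sum(cand) - target))
# M1 1 1663
# M2 3 1210
```

The candidate that is closer to M in Hamming distance (M1) is further from the target sum, so a fitness based on the sum difference would rank M2 above M1.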
26Knapsack-Type Ciphers
- Experiment with knapsack size 30. Fitness
values lie in the range (0,1)
27Knapsack-Type Ciphers
- Therefore
- there is little to no correlation between the
Hamming distance and the fitness
- since the fitness is not accurate, optimisation
heuristics will not be effective
- consider the following results averaged over 100
different sums for each knapsack size
28Knapsack-Type Ciphers
- The results indicate that the genetic algorithm
searches approximately one quarter of the
solution space before finding the correct
solution
- this is only twice as good as exhaustive search,
which would search half the solution space (on
average) before finding the correct solution
- experiments indicate that the exhaustive search
is much more efficient since it doesn't suffer
from the complexities of the GA.
- Conclusion
- optimisation heuristics are ineffective if there
is no suitable solution assessment technique
available.
29Searching for Cryptographic Boolean
Functions (Millan, Clark, Dawson 1998)
- Overview
- nonlinearity (distance to the closest affine
function) is an important cryptographic property
of Boolean functions
- balance is another important property
- a new technique for improving the nonlinearity of
arbitrary Boolean functions, while maintaining
balance, is proposed
- this technique can be used to find
locally-maximum (in nonlinearity) Boolean
functions using a hill-climbing approach
- the hill climbing method can be incorporated in a
genetic algorithm to find Boolean functions with
even higher nonlinearity.
30Improving Nonlinearity
- It is possible to define
- conditions for determining a set of pairs of
truth table positions such that complementing
both truth table positions in the pair will
increase the nonlinearity while maintaining the
balance of the function
- an efficient technique for calculating the new
Walsh-Hadamard transform (WHT) of a function
modified using the above method.
- Locally-maximum functions
- functions for which such a set does not exist are
locally maximum and their nonlinearity cannot be
improved by complementing two of their truth
table values.
31Hill Climbing
- This technique can be used to successively update
a Boolean function's truth table until it is no
longer possible to improve the nonlinearity
- 1. Generate a random truth table and calculate
the Walsh-Hadamard transform.
- 2. Determine a set of pairs of truth table
positions which, upon complementation, will
improve the nonlinearity of the function (using
techniques described above). If the set is empty
go to Step 4.
- 3. Select one of the elements of the set (either
randomly, or using some other heuristic), and
complement the corresponding truth table
positions. Update the Walsh-Hadamard transform.
Return to Step 2.
- 4. The current function is locally maximum in
nonlinearity.
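The loop above can be sketched naively as follows. This is a brute-force stand-in for the procedure, not the efficient method: it recomputes the full transform for every candidate pair instead of applying the conditions and incremental WHT update, and it accepts the first improving pair rather than building the whole set. Each candidate pair holds one position with value 1 and one with value 0, so the Hamming weight (and hence balance) is preserved. All names are illustrative.

```python
import random

def walsh_hadamard(tt):
    w = [1 - 2 * b for b in tt]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w

def nonlinearity(tt):
    return len(tt) // 2 - max(abs(v) for v in walsh_hadamard(tt)) // 2

def random_balanced(n, rng):
    """Step 1: a random balanced truth table on n variables."""
    tt = [0, 1] * 2 ** (n - 1)
    rng.shuffle(tt)
    return tt

def hill_climb(tt):
    """Steps 2-4: complement improving (1, 0) pairs until none remains."""
    tt = list(tt)
    best = nonlinearity(tt)
    improved = True
    while improved:
        improved = False
        ones = [i for i, b in enumerate(tt) if b]
        zeros = [i for i, b in enumerate(tt) if not b]
        for i in ones:
            for j in zeros:
                tt[i], tt[j] = 0, 1          # complement both positions
                nl = nonlinearity(tt)
                if nl > best:
                    best, improved = nl, True
                    break
                tt[i], tt[j] = 1, 0          # undo: no improvement
            if improved:
                break
    return tt, best

start = random_balanced(5, random.Random(0))
result, nl = hill_climb(start)
print(nonlinearity(start), "->", nl)
```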
32Using a GA to find Nonlinear Boolean Functions
33Using a GA to find Nonlinear Boolean Functions
- Notes
- complementing a function does not affect its
nonlinearity
- moving the functions closer to each other (by
complementing one), if necessary, reduces the
amount of randomness in the child and, therefore,
leads to children with similar characteristics
- since this mating operation incorporates
randomness, a mutation operation is not required.
34The Genetic Algorithm
- 1. Generate a pool of P random Boolean functions
and calculate their Walsh-Hadamard transforms.
- 2. For G iterations do
- (a) Perform the mating operation on all P(P-1)/2
pairings of solutions in the current pool
- (b) Hill climb each child function so that they
are all locally maximum with respect to the
technique being used.
- (c) Select the best solutions from the list of
children and the current pool to form the new
pool. To encourage diversity in the search, when
a child has an equal fitness to a solution in the
current pool, replace it with the child.
- 3. Report the best solution(s) from the current
solution pool.
35Boolean Function Results
- Benchmark results based upon random search of
1000000 functions
- R HC: hill climbing of random functions
- GA: genetic algorithm with mating function, no
hill climbing
- GA HC: genetic algorithm with mating function
and hill climbing.
- Number of functions considered by each technique
before finding the benchmark
36Boolean Function Results
- best nonlinearity achieved by each technique
after testing 10000 functions
37Application of GA Construction
- Design of Boolean functions for LILI stream
cipher
- LILI-128 Cipher (Millan, Simpson, Dawson 1999)
- LILI-II Cipher (Millan, Simpson, Dawson 2001)
- Design of S-Boxes for SOBER stream cipher
(Burnett, Dawson, Millan 1999)
- Design of S-Boxes for MARS block cipher (Burnett,
Dawson, Millan 2001)
- Design of S-Boxes for Dragon stream cipher
(Fuller, Millan, Dawson 2003)
38Lessons Learned
- Combinatorial mathematics offers a powerful tool
for designing and analysing cryptographic
systems.
- Simplify! Simplify!
- To apply combinatorial techniques one needs to
understand cryptology.
- For application of discrete optimisation, make
sure the correct fitness function is used.