Making and Breaking Security Protocols with Heuristic Optimisation

1
Making and Breaking Security Protocols with
Heuristic Optimisation
  • John A Clark, Dept. of Computer Science
  • University of York, UK, jac_at_cs.york.ac.uk
  • IBM Hursley 13.02.2001

2
Overview
  • Introduction to heuristic optimisation techniques
  • Part I: making security protocols
  • Part II: breaking protocols based on NP-hardness

3
Heuristic Optimisation
4
Local Optimisation - Hill Climbing
(Figure: z(x) plotted against x, with the global optimum x_opt marked.)
Really want to obtain x_opt.
Neighbourhood of a point x might be N(x) = {x+1, x-1}.
Hill-climb goes x0 → x1 → x2 since f(x0) < f(x1) < f(x2) > f(x3),
and gets stuck at x2 (local optimum).
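The hill-climbing behaviour described on this slide can be sketched in a few lines. This is a minimal illustration only: the `heights` landscape and the function name `hill_climb` are hypothetical, chosen so that one start point gets stuck at a local optimum and another reaches the global one.

```python
def hill_climb(f, x0, lo, hi):
    """Greedy ascent over the integers lo..hi with neighbourhood {x-1, x+1}."""
    x = x0
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        if not neighbours:
            return x
        best = max(neighbours, key=f)
        if f(best) <= f(x):
            return x  # no strictly better neighbour: a local optimum
        x = best


# Hypothetical landscape: a local peak at x = 2 and the global peak at x = 7.
heights = [0, 1, 3, 2, 1, 2, 4, 6, 5]
f = lambda x: heights[x]

print(hill_climb(f, 0, 0, len(heights) - 1))  # starts left of the local peak
print(hill_climb(f, 5, 0, len(heights) - 1))  # starts on the slope of the global peak
```

Starting at 0 the search stops at x = 2, because both neighbours of 2 are lower, even though x = 7 is better.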
5
Simulated Annealing
Allows non-improving moves so that it is possible
to go down in order to rise again to reach the
global optimum.
(Figure: z(x) plotted against x.)
In practice the neighbourhood may be very large and
a trial neighbour is chosen randomly. It is possible
to accept a worsening move when improving ones exist.
6
Simulated Annealing
  • Improving moves always accepted
  • Non-improving moves may be accepted
    probabilistically and in a manner depending on
    the temperature parameter T. Loosely:
      • the worse the move, the less likely it is to be
        accepted
      • a worsening move is less likely to be accepted
        the cooler the temperature
  • The temperature T starts high and is gradually
    cooled as the search progresses.
  • Initially virtually anything is accepted, at the
    end only improving moves are allowed (and the
    search effectively reduces to hill-climbing)

7
Simulated Annealing
  • Current candidate x. Minimisation formulation.
  • Temperature cycle: at each temperature consider
    400 moves.
  • Always accept improving moves.
  • Accept worsening moves probabilistically. Gets
    harder to do this the worse the move. Gets
    harder as the temperature decreases.
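A minimal sketch of the annealing loop described on these slides, in minimisation form. The slides do not give the acceptance formula or the cooling rule, so the Metropolis criterion exp(-delta/T) and geometric cooling (`alpha`) used here are assumptions, as are all function and parameter names. Only the figure of 400 moves per temperature comes from the slide.

```python
import math
import random


def anneal(cost, neighbour, x0, t0=10.0, alpha=0.95, cycles=100,
           moves_per_temp=400, rng=None):
    """Simulated annealing (minimisation). Returns the best candidate seen."""
    rng = rng or random.Random()
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(cycles):                  # one pass per temperature cycle
        for _ in range(moves_per_temp):      # 400 trial moves at each temperature
            y = neighbour(x, rng)
            fy = cost(y)
            delta = fy - fx
            # Improving moves are always accepted; worsening moves are accepted
            # with probability exp(-delta / t) (assumed Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = x, fx
        t *= alpha                           # assumed geometric cooling
    return best, fbest


# Toy usage: minimise (x - 3)^2 over the integers, moving one step at a time.
best, fbest = anneal(cost=lambda x: (x - 3) ** 2,
                     neighbour=lambda x, r: x + r.choice((-1, 1)),
                     x0=50, rng=random.Random(1))
print(best, fbest)
```

Larger `delta` and smaller `t` both shrink the acceptance probability, which matches the two "gets harder" remarks on the slide.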
8
Simulated Annealing
(Figure: the temperature cycle, with "Do 400 trial moves" repeated at each temperature.)
9
Genetic Algorithms
  • Based on evolution: survival of the fittest.
  • Encode solution to optimisation problem as a gene
    string.
  • Carry out the following (simple GA approach):
      • take a group of solutions
      • assess their fitness.
      • choose a new population with fitter individuals
        having more chance of selection.
      • mate pairs to produce offspring.
      • allow individuals to mutate.
      • return to first step with offspring as new group.
  • Eventually the strings will converge to a
    solution.

10
Genetic Algorithms: Simple Example
  • The problem is:
      • maximise the function g(x) = x over the integers
        0..15
  • We shall now show how genetic algorithms might
    find this solution.
  • Let's choose the obvious binary encoding of the
    integer solution space:
      • x = 0 has encoding 0000
      • x = 5 has encoding 0101
      • x = 15 has encoding 1111
  • Choose the obvious fitness function,
    fitness(x) = g(x) = x
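A minimal sketch of a simple GA for this toy problem, using the 4-bit encoding and the fitness function given above. The selection scheme (fitness-proportionate), single-point crossover, mutation rate and population size are all assumptions made for illustration; the slides do not specify them.

```python
import random


def decode(bits):
    """Map a 4-character bit string such as '0101' to its integer value."""
    return int(bits, 2)


def fitness(bits):
    return decode(bits)  # fitness(x) = g(x) = x


def evolve(pop_size=8, generations=30, p_mut=0.05, seed=0):
    """Run a simple GA; pop_size is assumed to be even so parents pair up."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(4)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: fitter strings are more likely to be chosen
        # (+1 so that the all-zero string keeps a non-zero chance).
        weights = [fitness(b) + 1 for b in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # Mating: single-point crossover on consecutive pairs.
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            cut = rng.randint(1, 3)
            children += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # Mutation: flip each bit independently with probability p_mut.
        pop = ["".join(c if rng.random() >= p_mut else "10"[int(c)] for c in child)
               for child in children]
    return pop


final = evolve()
print(sorted(final, key=fitness, reverse=True))
```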

11
Genetic Algorithms: Simple Example
12
General Iteration
  • We now have our new generation, which is subject
    to selection, mating and mutation
    again... until some convergence criterion is
    met.
  • In practice it's a bit more sophisticated but the
    preceding slide gives the gist.
  • Genetic algorithms have been found to be very
    versatile. One of the most important heuristic
    techniques of the past 30 years.

13
Making Protocols with Heuristic Optimisation
14
Security Protocols
  • Examples:
      • Secure session key exchange
      • "I am alive" protocols.
      • Various electronic transaction protocols.
  • Problems:
      • Rather hard to get right
      • We cannot even get three-line programs right
  • Probably the highest profile area of academic
    security research.
  • Major impetus given to the area by Burrows, Abadi
    and Needham's belief logic (BAN logic).

15
BAN Logic
  • Allows the assumptions and goals of a protocol to
    be stated abstractly in a belief logic.
  • Messages contain beliefs actually held by the
    sender.
  • Rules govern how receiver may legitimately update
    his belief state when he receives a message.
  • Protocols are series of messages. At the end of
    the protocol the belief states of the principals
    should contain the goals.

16
BAN Logic
  • Basic elements

P, Q stand for arbitrary protocol principals.
K is a good key for communicating between P and Q.
Np is a well-typed nonce, a number to be used
only once in the current protocol run, e.g. a
randomly generated number used as a challenge.
Np is fresh, meaning that it really is a
valid nonce.
17
BAN Logic
P once said X, i.e. has issued a message
containing X at some point.
P believes X. The general idea is that
principals should only issue statements they
actually believe. Thus, P might have believed
that the number Na was fresh yesterday and said
so, but it would be wrong to conclude that
he believes it now. If the message is recent (see
later) then we might conclude he believes it.
P has jurisdiction over X. This captures the
notion that P is an authority about the
statement X. If you believe P believes X and you
trust him on the matter, then you should believe
X too (see later).
18
BAN Logic - Assumptions and Goals
A and S share common belief in the goodness of
the key Kas and so they can use it to
communicate. S also believes that the key Kab is
a good session key for A and B. A has a number Na
that he also believes is fresh and believes
that S is the authority on statements about the
goodness of key Kab. The goal of the protocol is
to get A to believe the key Kab is good for
communication with B.
19
BAN Logic: Message Meaning Rule
20
BAN Logic: Nonce Verification Rule
This rule promotes "once said"s to actual beliefs.
21
BAN Logic: Jurisdiction Rule
Jurisdiction captures the notion of being an
authority. A typical use would be to give a key
server authority over statements of belief about
keys. If I believe that a key is good and you
reckon I am an authority on such matters then you
should believe the key is good too
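The formulae for the three rules did not survive in this transcript. As a hedged reference only, the rules are usually stated in the BAN literature roughly as below (message meaning shown in its shared-key form). The spelled-out operator names are a notational choice made here and are not taken from the slides.

```latex
% Message meaning (shared-key form)
\[
\frac{P \text{ believes } Q \stackrel{K}{\leftrightarrow} P
      \qquad P \text{ sees } \{X\}_K}
     {P \text{ believes } Q \text{ said } X}
\]
% Nonce verification: a fresh "once said" becomes a current belief
\[
\frac{P \text{ believes } \mathrm{fresh}(X)
      \qquad P \text{ believes } Q \text{ said } X}
     {P \text{ believes } Q \text{ believes } X}
\]
% Jurisdiction: trust an authority on what it believes
\[
\frac{P \text{ believes } Q \text{ controls } X
      \qquad P \text{ believes } Q \text{ believes } X}
     {P \text{ believes } X}
\]
```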
22
Messages as Integer Sequences
Field      Integer   Interpretation
sender     22        22 mod 3, shown as 0 (P) on the slide [22 mod 3 = 1]
Belief_1   8         8 mod 5 = 3
receiver   19        19 mod 3 = 1 (Q)
Belief_2   12        12 mod 5 = 2

Say 3 principals P, Q and S, with P = 0, Q = 1, S = 2. Message
components are beliefs in the sender's current
belief state (and so if P has 5 beliefs, integers
are interpreted modulo 5).
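A minimal sketch of this decoding step. The field order (sender, belief, receiver, belief) follows the labels on the slide. The belief names `b0`..`b4`, the function name and the example integers are hypothetical placeholders.

```python
PRINCIPALS = ["P", "Q", "S"]  # P = 0, Q = 1, S = 2, as on the slide


def decode_message(genes, beliefs):
    """Interpret four integers (sender, belief_1, receiver, belief_2) as a message.

    Principal fields are reduced modulo the number of principals; belief
    fields are reduced modulo the size of the sender's current belief state.
    """
    s_gene, b1_gene, r_gene, b2_gene = genes
    sender = PRINCIPALS[s_gene % len(PRINCIPALS)]
    receiver = PRINCIPALS[r_gene % len(PRINCIPALS)]
    held = beliefs[sender]
    contents = [held[b1_gene % len(held)], held[b2_gene % len(held)]]
    return sender, receiver, contents


# Hypothetical belief states; P holds 5 beliefs, so its fields are taken mod 5.
beliefs = {
    "P": ["b0", "b1", "b2", "b3", "b4"],
    "Q": ["c0", "c1"],
    "S": ["d0", "d1", "d2"],
}
print(decode_message((21, 8, 19, 12), beliefs))
```

Because every integer is reduced modulo a valid range, any integer sequence decodes to some well-formed message, which is what lets a search move change a single integer freely.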
23
Search Strategy
  • We can now interpret sequences of integers as
    valid protocols.
  • Interpret each message in turn updating belief
    states after each message
  • This is the execution of the abstract protocol.
  • Every protocol achieves something! The issue is
    whether it is something we want!
  • We also have a move strategy for the search, e.g.
    just randomly change an integer element.
  • This can change the sender, receiver or specific
    belief of a message (and indeed subsequent ones)

24
Fitness Function
  • We need a fitness function to capture the
    attainment of goals.
  • Could simply count the number of goals attained
    at the end of the protocol
  • In practice this is awful.
  • A protocol that achieves a goal after 6 messages
    would be as good as one that achieved a goal after
    1 message.
  • Much better to reward the early attainment of
    goals in some way
  • Have investigated a variety of strategies.

25
Fitness Functions
(The fitness formula, which weights goal attainment after each message, is not preserved in this transcript.)
One strategy (uniform credit) would be to make
all the weights the same. Note that credit is
cumulative. A goal achieved after the first
message is also achieved after the second
and third and so on.
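The slide's formula is lost, so the following is only a plausible sketch consistent with the surrounding text: fitness as a weighted sum, over messages, of the number of goals satisfied after each message. The function name and the example goal counts are assumptions.

```python
def fitness(goals_met_after, weights):
    """Weighted sum of the number of goals satisfied after each message.

    goals_met_after[i] is the number of goals holding after message i+1.
    Credit is cumulative: a goal met early is counted again after every
    later message.
    """
    return sum(w * g for w, g in zip(weights, goals_met_after))


uniform = [1, 1, 1, 1]                  # "uniform credit": all weights equal
early = fitness([1, 1, 2, 2], uniform)  # first goal met after message 1
late = fitness([0, 0, 0, 2], uniform)   # both goals met only at the end
print(early, late)
```

Both protocols end with two goals met, but the one that reaches a goal early scores 6 against 2, which is the reward for early attainment that the previous slide asks for.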
26
Examples
One of the assumptions made was that B would take
S's word on whether A [...] Na.
27
Examples
28
General Observations
  • Able to generate protocols whose abstract
    execution is a proof of their own correctness
  • Have done so for protocols requiring up to 9
    messages to achieve the required goals.
  • Another method for protocol synthesis is search
    via model checking: exhaustive, but limited to
    short protocols.
  • Can generalise notion of fitness function to
    include aspects other than correctness (e.g.
    amount of encryption).

29
Breaking Protocols with Heuristic Optimisation
30
Identification Problems
  • Notion of zero-knowledge introduced by Goldwasser,
    Micali and Rackoff (1985)
  • Indicate that you have a secret without revealing
    it
  • Early scheme by Shamir
  • Several schemes of late based on NP-complete
    problems:
      • Permuted Kernel Problem (Shamir)
      • Syndrome Decoding (Stern)
      • Constrained Linear Equations (Stern)
      • Permuted Perceptron Problem (Pointcheval)

31
Pointcheval's Perceptron Schemes
  • Interactive identification protocols based on
    NP-complete problem.
  • Perceptron Problem.

32
Pointcheval's Perceptron Schemes
  • Permuted Perceptron Problem (PPP). Make the problem
    harder by imposing an extra constraint.

33
Example: Pointcheval's Scheme
  • PP and PPP-example
  • Every PPP solution is a PP solution.

Has particular histogram H of positive values
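The problem statements on these slides were images and are missing from the transcript. As usually stated, PP asks for a ±1 vector y such that every entry of Ay is non-negative for a given ±1 matrix A. PPP additionally requires the multiset of entries of Ay to equal a given multiset S (the histogram H). The sketch below checks both conditions. The function names and the tiny 3×3 instance are hypothetical.

```python
from collections import Counter


def image(A, y):
    """Return the vector of row/candidate dot products A.y."""
    return [sum(a * v for a, v in zip(row, y)) for row in A]


def is_pp_solution(A, y):
    """PP: every image value must be non-negative."""
    return all(w >= 0 for w in image(A, y))


def is_ppp_solution(A, y, S):
    """PPP: the multiset of image values must equal the target multiset S."""
    return Counter(image(A, y)) == Counter(S)


# Tiny hypothetical instance with +1/-1 entries.
A = [[1, 1, 1],
     [1, -1, 1],
     [-1, 1, 1]]
y = [1, 1, 1]

print(image(A, y))
print(is_pp_solution(A, y), is_ppp_solution(A, y, [1, 1, 3]))
```

Since S contains only non-negative values, any y passing `is_ppp_solution` also passes `is_pp_solution`, which is the "every PPP solution is a PP solution" remark above.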
34
Generating Instances
  • Suggested method of generation

Significant structure in this problem: high
correlation between majority values of matrix
columns and the corresponding secret bits.
35
Instance Properties
  • Each matrix row/secret dot product is the sum of
    n Bernoulli (+1/-1) variables.
  • Initial image histogram has Binomial shape and is
    symmetric about 0
  • After negation simply folds over to be positive

(Figure: histogram of image values over -7, -5, -3, -1, 1, 3, 5, 7.)
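The generation method itself was on a lost figure. The sketch below is a reading consistent with the bullets above: draw a random ±1 secret and random ±1 rows, then negate any row whose dot product with the secret is negative, so the histogram folds over to positive values. Function and variable names are assumptions.

```python
import random
from collections import Counter


def generate_instance(m, n, seed=None):
    """Return (A, y, histogram) for an m x n instance with planted secret y.

    n is assumed odd, so that no dot product of +1/-1 vectors can be zero.
    """
    rng = random.Random(seed)
    y = [rng.choice((-1, 1)) for _ in range(n)]
    A = []
    for _ in range(m):
        row = [rng.choice((-1, 1)) for _ in range(n)]
        if sum(a * v for a, v in zip(row, y)) < 0:
            row = [-a for a in row]  # negate the row: its image value becomes positive
        A.append(row)
    values = [sum(a * v for a, v in zip(row, y)) for row in A]
    return A, y, Counter(values)


A, y, hist = generate_instance(11, 13, seed=1)
print(sorted(hist.items()))
```

Negating a row makes each of its entries more likely to agree in sign with the matching secret bit, which is the source of the column-majority correlation noted on the previous slide.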
36
PP Using Search: Pointcheval
  • Pointcheval couched the Perceptron Problem as a
    search problem.

37
Using Annealing: Pointcheval
  • PPP solution is also PP solution.
  • Based estimates of cracking PPP on ratio of PP
    solutions to PPP solutions.
  • Calculated sizes of matrix for which this should
    be most difficult
  • Gave rise to (m,n) = (m, m+16)
  • Recommended (m,n) = (101,117), (131,147), (151,167)
  • Gave estimates for number of years needed to
    solve PPP using annealing as the means of finding
    PP solutions
  • Instances with matrices of size 200 could
    usually be solved within a day
  • But no PPP problem instance of size greater than 71
    was ever solved this way despite months of
    computation.

38
Perceptron Problem (PP)
  • Knudsen and Meier approach (loosely):
      • Carry out sets of runs
      • Note where results obtained all agree
      • Fix those elements where there is complete
        agreement and carry out a new set of runs, and so
        on.
  • If repeated runs give the same values for particular
    bits, the assumption is that those bits are actually
    set correctly.
  • Used this sort of approach to solve instances of
    PP up to 180 times faster than Pointcheval for
    the (151,167) problem, but no upper bound given
    on sizes achievable.
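The agreement step in the approach above can be sketched as follows. This is only an illustration of "note where results obtained all agree"; the function name and the three example runs are hypothetical.

```python
def agreed_positions(solutions):
    """Return {index: value} for the entries on which every run agrees."""
    first = solutions[0]
    return {i: v for i, v in enumerate(first)
            if all(s[i] == v for s in solutions)}


# Hypothetical final vectors from three annealing runs on the same instance.
runs = [
    [1, -1, 1, 1],
    [1, 1, 1, -1],
    [1, -1, 1, -1],
]
fixed = agreed_positions(runs)
print(fixed)
```

The returned positions would then be held fixed in the next set of runs, with the search restricted to the remaining entries.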

39
Profiling Annealing
  • Approach is not without its problems.
  • Not all bits that have complete agreement are
    correct.

40
Knudsen and Meier
  • Have used this method to attack PPP problem size
    (101,117)
  • Needs hefty enumeration stage (to search for
    wrong bits), allowed up to 2^64 search complexity
  • Used new cost function, with w1 = 30, w2 = 1, with
    histogram punishment:
      • cost(y) = w1 * costNeg(y) + w2 * costHist(y)
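A minimal sketch of a cost function with this two-term shape. The slide gives only the template and the weights, so the component definitions here are assumptions: `cost_neg` sums the magnitudes of negative image values, and `cost_hist` is the absolute difference between the candidate's image histogram and the target histogram.

```python
from collections import Counter


def cost(A, y, target_hist, w1=30, w2=1):
    """cost(y) = w1 * costNeg(y) + w2 * costHist(y), with assumed components."""
    values = [sum(a * v for a, v in zip(row, y)) for row in A]
    # Negativity component: total magnitude of the negative image values.
    cost_neg = sum(-w for w in values if w < 0)
    # Histogram component: bin-by-bin mismatch against the target histogram.
    found = Counter(values)
    bins = set(target_hist) | set(found)
    cost_hist = sum(abs(target_hist.get(k, 0) - found.get(k, 0)) for k in bins)
    return w1 * cost_neg + w2 * cost_hist


# Tiny hypothetical instance whose secret [1, 1, 1] has image values 3, 1, 1.
A = [[1, 1, 1],
     [1, -1, 1],
     [-1, 1, 1]]
target = Counter({1: 2, 3: 1})

print(cost(A, [1, 1, 1], target))
print(cost(A, [1, 1, -1], target))
```

A candidate with zero cost has no negative image values and exactly the target histogram, so it is a PPP solution for this instance.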

41
Analogy Time I: Encryption
(Figure: a black box taking Plaintext P and a Key and producing Ciphertext C.)
The Black Box Assumption: essentially
considering encryption only as a mathematical
function.
In the public arena only really challenged in the
90s when attacks based on physical
implementation arrived:
  • Fault Injection Attacks (Bellcore, and
    others)
  • Paul Kocher's Timing Attacks
  • Simple Power Analysis
  • Differential Power Analysis

The computational dynamics of the implementation
can leak vast amounts of information.
42
Analogy Time II: Annealing
(Figure: a black box taking Problem P and Initialisation data and producing Final Solution C.)
The Black Box Assumption: virtually every
application of annealing simply throws the
technique at the problem and awaits the final
output. Is this really the most efficient use of
information? Let's look inside the box.
43
Analogy Time III: Internal Computational Dynamics
(Figure: a black box taking Problem P, e.g. minimise cost(y,A,Hist), and Initialisation data and producing Final Solution C.)
The algorithm carries out 100,000s of cost
function evaluations which guide the search.
Why did it take the path it did? Bear in mind
the whole search process is public and so we can
monitor it.
44
Analogy Time IV: Fault Injection
(Figure: a black box taking a Warped or Faulty Problem P and Initialisation data and producing Final Solution C.)
Invariably people assume you need to solve the
problem at hand. Reflected in well-motivated or
direct cost functions.
What happens if we inject a fault into the
process?
Mutate the problem into a similar but different
one. Can we make use of the solutions obtained to
help solve the original problem?
45
PP Move Effects
  • What limits the ability of annealing to find a PP
    solution?
  • A move changes a single element of the current
    solution.
  • Want current negative image values to go positive
  • But changing a bit to cause negative values to go
    positive will often cause small positive values
    to go negative.

(Figure: two plots with axes labelled 0 to 7.)
46
Problem Fault Injection
  • Can significantly improve results by punishing at
    positive value K
  • For example punish any value less than K = 4 during
    the search
  • Drags the elements away from the boundary during
    search.
  • Also use square of differences (Wi - K)^2 rather
    than simple deviation
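A minimal sketch of the modified negativity term described above: every image value Wi below the threshold K is punished by (K - Wi) raised to a power, with power 2 giving the squared difference. The function name, the `power` parameter and the example values are assumptions made for illustration.

```python
def cost_neg_k(values, K=4, power=2):
    """Punish every image value below K by (K - w) ** power."""
    return sum((K - w) ** power for w in values if w < K)


values = [7, 3, 1, -1]                       # hypothetical image values Wi
print(cost_neg_k(values, K=0))               # only the negative entry is punished
print(cost_neg_k(values, K=4))               # small positive entries are punished too
print(cost_neg_k(values, K=4, power=1))      # simple deviation, for comparison
```

With K = 0 this reduces to punishing negative values only. Raising K also penalises small positive values, which is what drags elements away from the boundary.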

(Figure: plot with axis labelled 0 to 7.)
47
Problem Fault Injection
  • Comparative results:
      • Generally allows solution within a few runs of
        annealing for sizes (201,217)
      • Number of bits correct is generally worst when
        K = 0.
      • Best value for K varies between sizes (but can do
        profiling to test what it is)
      • Has proved possible to solve for size (401,417)
        and higher.
  • Enormous increase in power for essentially a change
    to one line of the program:
      • Using powers of 2 rather than just modulus
      • Use of K factor
  • Morals:
      • Small changes may make a big difference.
      • The real issue is how the cost function and the
        search technique interact
      • The cost function need not be the most natural
        direct expression of the problem to be solved.
      • Cost functions are a means to an end.
      • This is a form of fault injection on the problem.

48
Profiling Annealing
  • But look again at the cost function templates
  • Different weights w1 and w2 will give different
    results, yet the resulting cost functions seem
    plausibly well-motivated.
  • We can view different choices of weights as
    different viewpoints on the problem.
  • Now carry out runs using the different cost
    functions.
  • Very effective: using about 30 cost functions
    have managed to get agreement on about 25% of the
    key with less than 0.5 bits on average in error
  • Additional cost functions remove incorrect
    agreement (but may also reduce correct agreement).
49
Radical Viewpoint Analysis
(Figure: Problem P mutated into Problems P1, P2, ..., Pn-1, Pn.)
Essentially create mutant problems and attempt to
solve them. If the solutions agree on particular
elements then they generally will do so for a
reason, generally because they are correct. Can
think of mutation as an attempt to blow the
search away from the actual original solution.
50
Profiling Annealing: Timing
  • Simulated annealing can make progress, typically
    getting solutions with around 80% of the vector
    entries correct (but don't know which 80%)
  • But this throws away a lot of information:
    better to monitor the search process as it cools
    down.
  • Based on notion of thermostatistical annealing.
  • Watch the elements of the secret vector as the
    search proceeds.
  • Record the temperature cycle at which the last
    change to an element's value occurs, i.e. 1 to -1
    or vice versa
  • At the end of the search all elements are fixed.
  • Analysis shows that some elements will take some
    values early in the search and then never
    subsequently change.
  • They get stuck early in the search.
  • The ones that get stuck early often do so for
    good reason: they are the correct values.
51
Profiling Annealing: Timing
  • Tested 30 PPP instances (101,117) with 32
    different strategies (different weights wi for
    negativity and histogram component costs and
    different values of K). Ten runs at each
    strategy.
  • Maximum number of initial bits fixed at correct
    values (see table)
  • Some strategies far better than others: value of
    K is very important; K = 13 seems a very good
    candidate.
  • Channel is highly volatile, hence need for
    repeated runs.
  • Note also that some runs had up to 108 of 117
    bits set correctly in final solution.
  • For small K the minimum number of bits correct in
    final solution is radically worse than for larger
    values of K.

Max. initial bits correct   <40   40-49   50-59   60-69   70-79
Instances                     2      10      14       2       2
52
Profiling Annealing: Timing
  • Tested 30 PPP instances (151,167) with 16
    different strategies (different weights wi for
    negativity and histogram component costs and
    different values of K). Ten runs at each
    strategy.
  • Maximum number of initial bits fixed at correct
    values (see table)
  • Similar general results as before.
  • Also tried for (201,217): some runs in excess of
    100 initial stuck bits correct.

Max. initial bits correct   <40   40-49   50-59   60-69   70-79   80+
Instances                     1       5       9      11       2     2
53
Some Questions
  • Can you fix an element of the solution at 1 and
    -1 and determine likelihood of correctness based
    on distribution of results obtained?
  • Effects of different parameters (e.g. power
    parameters)?
  • How well can we profile the distribution of
    results in order to isolate those ones at the
    extremes of correctness?
  • Can we apply similar profiling tricks to other
    NP-complete problems?
      • Permuted Kernel Problem
      • Syndrome Decoding

54
Example: Permuted Kernel Problem
  • Arithmetic carried out mod p

55
Example: Syndrome Decoding
  • Arithmetic carried out mod 2

Small number k of bits in S set to 1
56
Some Questions
  • Why does everyone try to find the secret/key
    directly?
  • e.g. for block ciphers, can we use guided search
    techniques to generate better approximations?
  • Use search to generate better (or more)
    cryptanalytic tools, e.g. multiple
    approximations?
  • Very loose. What would happen if you tried to
    search for a key on a difficult traditional
    encryption algorithm?

Encrypt(K, P) = C
Suppose you tried a guided search based on
Hamming distance: Encrypt(K', P) = C',
Cost(C, C') = hamming(C, C') (or sum of such costs
over Pi). No chance of success at all. But what is the
distribution of the failures? Is there a cost
function that would induce an exploitable
distribution of solutions?
57
Some Questions
  • Work combines fault injection and a timing
    attack?
  • What is the equivalent of differential power
    analysis for heuristic search?