Longer Keys may Facilitate Side Channel Attacks


1
Longer Keys may Facilitate Side Channel Attacks
Colin D. Walter
C·O·M·O·D·O RESEARCH LAB
  • www.comodogroup.com (Bradford, UK)
  • colin.walter_at_comodogroup.com

2
Overview
  • Side Channel Attacks as motivation for looking at
    RSA key lengths.
  • Extracting Data by Power and Timing Attacks.
  • Reconstructing Secret Keys.
  • Comparing different key lengths for a timing attack
    and for a power attack.
  • Conclusion

3
Timing & Power Analysis Attacks
  • Conditional statements in executing code can
    cause minute variations in time for decryption
    and signing. This may leak information about
    the secret key.
  • Changing inputs to H/W gates causes minute
    data-dependent current variations in a smart card.
    This leaks secret data when performing RSA
    decryption or signing.
  • For example, in the standard implementation, the
    average time for a modular multiplication is
    different from that of a modular squaring.
    Power variations make this visible. Use of the
    binary exponentiation algorithm then reveals the secret key.
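
To make the last point concrete, here is a minimal sketch of left-to-right binary exponentiation that records whether each step was a squaring or a multiplication. The function names and the small numbers are illustrative assumptions, not taken from the talk; the point is only that the operation sequence determines the exponent bit by bit.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that also records each operation."""
    result = 1
    ops = []  # 'S' for a modular squaring, 'M' for a modular multiplication
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        ops.append("S")
        if bit == "1":
            result = (result * base) % modulus
            ops.append("M")
    return result, ops


def exponent_from_ops(ops):
    """Rebuild the exponent from an observed sequence of S and M operations."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append("0")   # every squaring starts a new exponent bit
        else:
            bits[-1] = "1"     # a following multiplication marks that bit as 1
    return int("".join(bits), 2)


# Hypothetical toy values: an attacker who sees only `trace` recovers the exponent.
value, trace = square_and_multiply(7, 0b101101, 1009)
recovered = exponent_from_ops(trace)
```

An attacker who can tell squarings from multiplications in a power trace needs nothing else: `recovered` equals the exponent that was used.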

4
History
  • NSA Tempest programme
  • P. Kocher (Crypto 96): Timing attack on
    implementations of Diffie-Hellman, RSA, DSS, and
    other systems
  • Dhem, Quisquater, et al. (CARDIS 98): A
    practical implementation of the Timing Attack
  • P. Kocher, J. Jaffe & B. Jun (Crypto 99):
    Introduction to Differential Power Analysis
  • Messerges, Dabbish & Sloan (CHES 99): Power
    Analysis Attacks of Modular Exponentiation in
    Smartcards

5
Recent Attacks
  • C. D. Walter & S. Thompson (CT-RSA 2001):
    Distinguishing Exponent Digits by Observing
    Modular Subtractions.
    A timing attack which averaged over a number
    of exponentiations with the same exponent.
  • C. D. Walter (CHES 2001): Sliding Windows
    succumbs to Big Mac Attack.
    A DPA attack which averaged using
    the trace from a single exponentiation.

6
Question
  • Counter-measures can be employed,
    but there is no guarantee that better
    monitoring machinery and better statistical
    techniques might not still reveal
    the key. So,
  • How much protection is there in selecting
    a longer key length for RSA?
  • The body of the talk looks at the last two
    attacks to see how much more difficult they are
    for longer keys.
  • Surprisingly, it appears that longer RSA & ECC
    keys are weaker under the power and timing
    attacks.

7
Security Model
  • Smartcard running RSA
  • Unknown secret exponent D
  • Known algorithms & H/W characteristics
  • Single H/W multiplier
  • Non-invasive, passive attack
  • Attacker unable to read or influence I/O
    directly
  • He can observe timing variations in long integer
    multiplications.
  • He can measure multiplier power usage.
  • He can check correctness of D.

8
The Timing Attack on RSA
  • Context
  • Need to compute $A \times B \bmod M$.
  • Output from the main loop of the Montgomery modular
    multiplier: $P < 2M$.
  • Expected output: $P < M$ (or $< 2^n$).
  • So there is a conditional subtraction in S/W.
  • This affects timing, and so we assume it can be
    observed.
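
The conditional subtraction can be sketched as follows. This is a minimal, whole-integer illustration rather than the digit-serial loop a smart card would run; the function name, the choice $R = 2^{n}$ and the returned flag are assumptions made for the example.

```python
def montgomery_multiply(a, b, modulus, n_bits):
    """Return (a*b*R^-1 mod modulus, subtracted) with R = 2**n_bits.

    Assumes modulus is odd, modulus < R, and 0 <= a, b < modulus.
    """
    r = 1 << n_bits
    m_neg_inv = (-pow(modulus, -1, r)) % r   # -M^-1 mod R (needs Python 3.8+)
    q = ((a * b) * m_neg_inv) % r            # makes a*b + q*M divisible by R
    p = (a * b + q * modulus) >> n_bits      # main-loop output, 0 <= p < 2M
    subtracted = p >= modulus
    if subtracted:                           # the data-dependent step that leaks
        p -= modulus
    return p, subtracted


# Hypothetical toy parameters: M = 239, R = 2**8.
product, needed_subtraction = montgomery_multiply(230, 230, 239, 8)
```

The `subtracted` flag stands in for what the attacker is assumed to observe through timing: whether the extra subtraction happened.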

9
Distribution of Products
  • The loop output in Montgomery modular multiplication
    is uniformly distributed over the
    interval $[ABR^{-1}, ABR^{-1} + M)$. So
    the probability of the conditional subtraction
    can be computed from the distributions of A
    and B.
  • This shows the probabilities $p_{mu}$ and $p_{sq}$
    are different for squares and multiplications.
    So they can be distinguished if
    enough samples are available.
  • This makes the usual binary square and multiply
    algorithm vulnerable to attack.
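
As a worked example of that computation, assume (this is an assumption added here, not stated on the slide) that the operands are independent and uniformly distributed on $[0, M)$, and treat them as real numbers. The subtraction happens when the loop output is at least $M$, which is the top $ABR^{-1}$ of an interval of length $M$:

```latex
\begin{align*}
\Pr(\text{subtraction} \mid A, B) &= \frac{ABR^{-1}}{M} = \frac{AB}{RM} \\
p_{mu} &= \frac{1}{M^{2}} \int_{0}^{M}\!\!\int_{0}^{M} \frac{ab}{RM} \, da \, db = \frac{M}{4R} \\
p_{sq} &= \frac{1}{M} \int_{0}^{M} \frac{a^{2}}{RM} \, da = \frac{M}{3R}
\end{align*}
```

Under these assumptions a squaring triggers the subtraction about a third more often than a multiplication, which is the difference the attack exploits.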

10
Separating Multiplications & Squares
  • Let $Q = (q_{ij})$ be the matrix for which $q_{ij}$ =
    1 or 0 according to whether or not
    there is a conditional subtraction in the
    ith modular multiplication of the jth
    exponentiation.
  • It is possible to compute the averages and
    variances etc. of the Hamming distances
    between the rows.
  • Rows for multiplications have separations
    clustered round one average, rows for squares
    cluster round another, and distances
    from multiplications to squares around a third.
    This enables the rows to be partitioned into
    two sets, M and S.
  • The probability of one row being close to the
    wrong set is small, but computable (and decreases
    as the sample size increases).
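
The three clusters can be seen in a toy simulation. The sketch below uses a simplified real-valued model that is an assumption of this example: column j has a fixed input $a_j$ scaled to $[0, 1)$, a multiplication row subtracts with probability `ratio * a_j * x` for a fresh random x, and a squaring row with probability `ratio * x * x`, where `ratio` stands for $M/R$. All names are hypothetical.

```python
import random
from itertools import combinations, product


def simulate_rows(n_mult, n_square, n_samples, ratio=1.0, seed=1):
    """Build toy rows of Q; each column is one exponentiation."""
    rng = random.Random(seed)
    inputs = [rng.random() for _ in range(n_samples)]
    # Multiplication rows share the per-column input a_j as one operand.
    mult_rows = [[int(rng.random() < ratio * a * rng.random()) for a in inputs]
                 for _ in range(n_mult)]
    # Squaring rows depend only on a fresh random operand.
    square_rows = [[int(rng.random() < ratio * rng.random() ** 2) for _ in inputs]
                   for _ in range(n_square)]
    return mult_rows, square_rows


def mean_distance(pairs):
    """Average normalised Hamming distance over an iterable of row pairs."""
    distances = [sum(u != v for u, v in zip(r1, r2)) / len(r1) for r1, r2 in pairs]
    return sum(distances) / len(distances)


mult_rows, square_rows = simulate_rows(n_mult=10, n_square=10, n_samples=2000)
mult_mult = mean_distance(combinations(mult_rows, 2))
square_square = mean_distance(combinations(square_rows, 2))
mult_square = mean_distance(product(mult_rows, square_rows))
```

In this model the three averages settle near 1/3, 4/9 and 5/12 of the row length, so multiplication rows sit closest to each other and the two sets can be separated once enough columns (samples) are available.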

11
Doubling the Key Length
  • Now double the key length n but keep all other
    parameters the same. Will the number of errors
    increase (a stronger key) or decrease (a
    weaker key)?
  • There are twice as many multiplicative
    operations, so the sets
    S and M of squares and multiplies are twice as
    big.
  • The average distances between one row and the
    (provisional) sets S and M are unchanged, but the
    variances are halved. This makes an
    individual classification error less likely.
  • If the probability of one error in two multiplicative
    operations of the 2n-bit key is less than that for one
    multiplicative operation of the n-bit key, longer
    keys are weaker.

12
Doubling the Key Length
  • Let Z be a normal N(0,1) random variable
    representing the (scaled) distance of a row of Q
    to the set S or the set M and let d be the
    distance at which the row is more likely to
    belong to the other set.
  • Then $d_{2n} = \sqrt{2}\, d_n$ because d is inversely
    proportional to the S.D.
  • The probability of classifying an operation correctly for
    key length n is $1 - p(Z > d_n)^2$.
  • The probability of classifying two operations correctly for
    key length 2n is $(1 - p(Z > \sqrt{2}\, d_n)^2)^2$.
  • From tables, the first is smaller if $d_n > 0.616$.
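
The crossover can be checked numerically instead of from tables. This sketch reads the two expressions as $1 - p(Z > d_n)^2$ and $(1 - p(Z > \sqrt{2}\, d_n)^2)^2$ and bisects for the point where they are equal; the function names and the bracketing interval are assumptions of the example.

```python
import math


def upper_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def p_one_correct(d):
    """Probability of classifying one operation correctly at key length n."""
    return 1 - upper_tail(d) ** 2


def p_two_correct(d):
    """Probability of classifying two operations correctly at key length 2n."""
    return (1 - upper_tail(math.sqrt(2) * d) ** 2) ** 2


def crossover(lo=0.1, hi=2.0, steps=60):
    """Bisect for the d_n at which the two probabilities are equal."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if p_two_correct(mid) < p_one_correct(mid):
            lo = mid   # the longer key is still harder here, so move right
        else:
            hi = mid
    return (lo + hi) / 2


threshold = crossover()
```

The value found this way agrees with the tabulated 0.616 to about two decimal places; above it, two operations at length 2n are classified correctly more often than one operation at length n.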

13
Result
  • So longer keys are weaker if $d > 0.616$.
  • But d is proportional to $\sqrt{N}$, where N is the sample
    size. So the condition holds
    and longer keys are weaker if enough
    exponentiations are available with the same key.
  • Several hundred samples are enough under good
    conditions. (The actual number depends on the
    accuracy of data collection, the ratio of the
    modulus to the Montgomery constant, etc.
    and decreases as key length increases.)

14
The DPA Attack on RSA
  • Assume that the exponent is blinded and there is
    no timing variation. So the secret key must be
    recovered from a single use.
  • As a result of gate switching, a k-bit digit
    multiplication ab has a data dependent
    contribution to power consumption
    roughly linear in the Hamming weights of a and
    b.
  • Variation resulting from the previous state can
    be averaged away for long integers $A = \sum_{i=0}^{s-1} a_i r^i$.
    For each $a_i$ the traces for
    $a_i b_j$ are averaged as j varies. These are
    concatenated to give a trace with length s,
    characteristic of A.

15
Distances between Traces
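[Figure: power traces $tr_0$ and $tr_1$ plotted against digit index $i$, from 0 to $s$]
The scaled Euclidean distance between the traces for
$A_0$ and $A_1$ is
$d_{0,1} = \left( s^{-1} \sum_{i=0}^{s-1} (tr_0(i) - tr_1(i))^2 \right)^{1/2}$
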
16
Average Separation
  • Let $Q = (q_{ij})$ be the matrix for which
    $q_{ij} \in \mathbb{R}$ is the averaged trace weight associated
    with the jth multiplicand digit in the ith
    modular multiplication. Use the Euclidean distance
    between rows, divided by the number of digits s.
  • For modular multiplications with different
    multiplicands using a k-bit multiplier, the
    average distance apart is $(k(s+1)/2 + 2\sigma^2)^{1/2}$,
    where $\sigma^2$ is the variance of
    measurement noise.
  • For multiplications with a common multiplicand, this
    distance is only $(k/2 + 2\sigma^2)^{1/2}$.
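
The qualitative effect can be reproduced with a toy model. In the sketch below, which is an assumption of this example and not the simulation used in the talk, entry i of a trace is the Hamming weight of multiplicand digit $a_i$ plus the mean Hamming weight of the other operand's digits plus Gaussian noise. It does not aim to match the constants above, only the gap between the two cases.

```python
import math
import random


def hamming_weight(x):
    return bin(x).count("1")


def averaged_trace(a_digits, b_digits, noise_sd, rng):
    """Toy trace: entry i models the average over j of the a_i * b_j traces."""
    mean_b = sum(hamming_weight(b) for b in b_digits) / len(b_digits)
    return [hamming_weight(a) + mean_b + rng.gauss(0, noise_sd) for a in a_digits]


def scaled_distance(tr0, tr1):
    """Euclidean distance between two traces, scaled by the trace length s."""
    s = len(tr0)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(tr0, tr1)) / s)


def random_digits(s, k, rng):
    return [rng.getrandbits(k) for _ in range(s)]


def mean_distances(k=32, s=16, noise_sd=1.0, trials=200, seed=1):
    """Mean distance for a common multiplicand and for distinct multiplicands."""
    rng = random.Random(seed)
    common, distinct = [], []
    for _ in range(trials):
        a0, a1 = random_digits(s, k, rng), random_digits(s, k, rng)
        b0, b1 = random_digits(s, k, rng), random_digits(s, k, rng)
        t_a0_b0 = averaged_trace(a0, b0, noise_sd, rng)
        t_a0_b1 = averaged_trace(a0, b1, noise_sd, rng)
        t_a1_b1 = averaged_trace(a1, b1, noise_sd, rng)
        common.append(scaled_distance(t_a0_b0, t_a0_b1))
        distinct.append(scaled_distance(t_a0_b0, t_a1_b1))
    return sum(common) / trials, sum(distinct) / trials


common_mean, distinct_mean = mean_distances()
```

Traces that share a multiplicand differ only through noise and the averaged second operand, so `common_mean` comes out well below `distinct_mean`.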

17
Results
  • Multiplications can be identified because they
    are close together: they share a common
    multiplicand (the initial plaintext
    input).
  • Squares can be identified because they are not
    close: they have different
    multiplicands.
  • For m-ary exponentiation, different exponent
    digits can be recognised: the set of
    multiplications for the same digit share a
    common multiplicand and so are close together.
  • So the secret key can be recovered.

18
Longer Keys?
  • Again, consider doubling the key length to see
    what happens.
  • A longer key means more k-bit digits, so
    a better average in traces and longer
    concatenated traces, and so a higher
    probability of classifying multiplications correctly.
  • As before, sets M and S are twice the size,
    and so variances of interest are
    halved.
  • Since successive digit multiplications are not
    independent, simulations give a more accurate
    view than what can be achieved by theory.

19
Simulation
  • Example: distance statistics for gate switching in 8-ary
    exponentiation with a 32-bit multiplier.

    Key length        128   256   512  1024  2048
    Av. to nearest    234   201   177   176   171
    S.D. to nearest   137   129   106   110   100
    Av. to others     324   434   843  1453  2153
    S.D. to others     78   102   140   131   118

  • (Smaller key lengths are included to help
    illustrate trends.)

20
Longer Keys?
  • For equal multiplicands, the average distance decreases
    as key length
    increases, with S.D. about 3/5ths of this.
  • For distinct multiplicands, the average distance
    increases almost in line with key
    length, but S.D. is close to constant.
  • Consequently, it becomes much easier to
    distinguish squares from multiplies
    and which multiplicand is used
    (i.e. what exponent digit occurs) as key length
    increases.
  • Specifically, from tables we can calculate the
    probability of correct exponent digit
    determination: $p_{128} = 0.4836$, $p_{256} = 0.7114$,
    $p_{512} = 0.9932$, $p_{1024} = 0.9999...$

21
Result
  • Very easily, two exponent digits are correctly
    determined for key length 2n with higher
    probability than one digit for length n.
  • Thus, increasing key length is definitely unwise
    if such implementation
    attacks are possible!
  • The full power of the theory was not used:
    distances were between two traces, not
    between one trace and a provisional set
    which represents the same exponent digit.
    So better results hold in practice.
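
The first point can be checked directly from the probabilities quoted for the simulation. The sketch assumes that two digit determinations are independent, so the probability of getting both right is the square of the single-digit figure; the 1024-bit value is truncated in the source, so 0.9999 is used as a lower bound.

```python
# Probabilities of correct exponent digit determination quoted in the talk.
# The 1024-bit figure is given as 0.9999..., so 0.9999 is a lower bound.
p_correct = {128: 0.4836, 256: 0.7114, 512: 0.9932, 1024: 0.9999}


def longer_key_is_weaker(n):
    """True if two digits at length 2n are more likely right than one at n."""
    return p_correct[2 * n] ** 2 > p_correct[n]


comparison = {n: longer_key_is_weaker(n) for n in (128, 256, 512)}
```

Every entry of `comparison` is true: for example $0.7114^2 \approx 0.506$ already exceeds 0.4836.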

22
Final Conclusion
  • Counter-intuitively, it appears that these
    attacks become easier when key length is
    increased.
  • The timing attack may become more difficult
    initially, but is easier eventually; however,
    counter-measures are easy.
  • With the DPA averaging above, it appears possible
    to use a single exponentiation to obtain the
    secret key D, especially if key length is
    increased.
  • Then the counter-measure of blinding D to $D + r\phi(M)$
    with random r is no defence.
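
For readers unfamiliar with exponent blinding, here is a minimal sketch with textbook-sized parameters. The primes, the message and the helper name are illustrative assumptions. The blinded exponent changes on every call, which defeats averaging over many exponentiations, but any single value $D + r\phi(M)$ decrypts just as well as D, so recovering it from one trace is enough.

```python
import secrets

# Textbook-sized RSA parameters, chosen here only for illustration.
p, q = 61, 53
modulus = p * q                     # M = 3233
phi = (p - 1) * (q - 1)             # phi(M) = 3120
public_e = 17
secret_d = pow(public_e, -1, phi)   # D = 2753 (needs Python 3.8+)


def blinded_exponent(d, phi_m, r_bits=16):
    """Return D + r*phi(M) for a fresh random r."""
    r = secrets.randbits(r_bits)
    return d + r * phi_m


ciphertext = pow(42, public_e, modulus)
plain_unblinded = pow(ciphertext, secret_d, modulus)
plain_blinded = pow(ciphertext, blinded_exponent(secret_d, phi), modulus)
```

Both decryptions return the original message 42, whatever r is drawn.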