Vulnerabilities on high-end processors - PowerPoint PPT Presentation

1
Vulnerabilities on high-end processors
André Seznec IRISA/INRIA CAPS project-team
2
A paradox

Microarchitectures are more and more complex
Timing side-channel attacks were presented on
versions of AES (Bernstein) and RSA (Acıiçmez et
al.)

3
Many hardware features exist only to improve performance

Caches
Pipeline
Superscalar execution
Branch prediction
Thread parallelism

4
Execution time of a short instruction sequence is
a complex function!
5
Execution time of a short instruction sequence is
a complex function (2)

Depends on the precise state of every
microarchitecture component
More than 100 speculative instructions in flight
at the same time on a Pentium 4
Instructions are executed out of order.
Strange correlations, almost unpredictable at
compile time
(even in the compiler back end)

6
Understanding the AES cache-timing attack on high-end
microprocessors (following Bernstein 2005)

AES with lookup tables is a 10-round algorithm
with the following vulnerabilities
The number, the types and the order of the
instructions are independent of the key K and the
message M to be encrypted.
The exact locations of the data words read and
written by the first round depend only on K xor
M
The execution time of the first round depends on
K xor M (at least statistically)
CAN BE EXPLOITED
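The dependence of first-round table addresses on K xor M can be sketched as follows. This is a minimal illustration, not taken from the slides: it assumes a single lookup table with 4-byte entries, 64-byte cache lines and a line-aligned table, and the function names are hypothetical.

```python
# Sketch: which cache line does a first-round table lookup touch?
# Assumptions (not from the slides): one lookup table, 4-byte entries,
# 64-byte cache lines, table aligned on a cache-line boundary.
ENTRY_BYTES = 4
LINE_BYTES = 64
ENTRIES_PER_LINE = LINE_BYTES // ENTRY_BYTES  # 16 entries share one line


def first_round_indices(key: bytes, message: bytes) -> list:
    """Table indices used by the first round: one per byte of K xor M."""
    return [k ^ m for k, m in zip(key, message)]


def touched_lines(key: bytes, message: bytes) -> list:
    """Cache line (relative to the table start) hit by each lookup."""
    return [idx // ENTRIES_PER_LINE for idx in first_round_indices(key, message)]
```

Under these assumptions, observing which line is touched reveals the upper four bits of each byte of K xor M, while the instruction stream itself is identical for every key and message.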

7
Bernstein 2005 (empty cache)

Plaintext attack
Unrealistic hypothesis
Access to cycle-accurate encryption timing
Cache is flushed between two encryptions
Not explicit in the paper (but see Lauradoux et
al.)
Byte-by-byte determination of the key, by
statistically finding the maximum encryption
time for each byte of K xor M
Works only on Pentium III, not on Pentium 4?
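The statistical step can be illustrated with a toy simulation. This is a sketch under loud assumptions, not Bernstein's code: the timing model is invented (one value of K xor M is slightly slower, buried in Gaussian noise), the attacker is assumed to profile an identical machine with a known key first, and all names and constants are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical machine quirk: one table index is slower. The attacker does
# not know this value and has to learn it by profiling.
HIDDEN_SLOW_INDEX = 0x3C


def simulated_time(key_byte, msg_byte, rng):
    """Invented timing model: noisy baseline plus a small penalty."""
    penalty = 8.0 if (key_byte ^ msg_byte) == HIDDEN_SLOW_INDEX else 0.0
    return 1000.0 + penalty + rng.gauss(0.0, 10.0)


def slowest_message_byte(key_byte, samples, rng):
    """Average timings per message-byte value; return the slowest value."""
    total = defaultdict(float)
    count = defaultdict(int)
    for _ in range(samples):
        m = rng.randrange(256)
        total[m] += simulated_time(key_byte, m, rng)
        count[m] += 1
    return max(count, key=lambda m: total[m] / count[m])


def recover_key_byte(secret_key_byte, known_key_byte=0x00,
                     samples=100_000, seed=1):
    rng = random.Random(seed)
    # Profiling phase: with a known key, the slowest message byte reveals
    # which value of (k xor m) is slow on this machine.
    slow_index = slowest_message_byte(known_key_byte, samples, rng) ^ known_key_byte
    # Attack phase: the slowest message byte under the secret key satisfies
    # k xor m = slow_index, hence k = m xor slow_index.
    m_star = slowest_message_byte(secret_key_byte, samples, rng)
    return m_star ^ slow_index
```

No single timing is informative here; only the per-value averages over many encryptions separate the slow index from the noise.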

8
A loaded cache attack (proof-of-concept code
available)

Plaintext attack
Timing of a large number of encryptions
An unrealistic hypothesis
Access to cycle-accurate encryption timings
For each byte of K xor M, determine the bit
substrings that statistically lead to the highest
encryption time (with a threshold to gain confidence)
Results depend on the microarchitecture
0 to 80 bits of the key recovered by this method,
depending on the model and stepping of the Pentium 4
Suspected mechanism: exercising banking in the cache

9
First vulnerability

For a given sequence,
Timings are erratic
Unlikely to get exactly the same timing twice
But statistically correlated
Cache banking and operation chaining show up in the
average

10
A possible countermeasure for AES

Periodically and randomly change the mapping of
the lookup tables
9000 cycles for this change (XOR-based
permutation)
See Lauradoux et al.
HAVEGE can provide the random numbers.
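One way to read "XOR-based permutation" is sketched below. This is an assumption-laden illustration rather than the scheme from Lauradoux et al.: the entry for index i is stored at position i xor mask, the mask is redrawn periodically, and Python's `secrets` module stands in for HAVEGE. The class and method names are hypothetical.

```python
import secrets


def remap_table(table, mask):
    """Return a copy in which the entry for index i sits at position i ^ mask."""
    remapped = [0] * len(table)
    for i, value in enumerate(table):
        remapped[i ^ mask] = value
    return remapped


class MaskedTable:
    """Lookup table whose memory layout changes on every rekey()."""

    def __init__(self, table):
        size = len(table)
        if size == 0 or size & (size - 1):
            raise ValueError("table length must be a power of two")
        self._plain = list(table)
        self.rekey()

    def rekey(self):
        # Stand-in randomness source; the slides suggest HAVEGE instead.
        self._mask = secrets.randbelow(len(self._plain))
        self._stored = remap_table(self._plain, self._mask)

    def lookup(self, index):
        # Same logical result as the plain table, different address touched.
        return self._stored[index ^ self._mask]
```

Calling `rekey()` every so many encryptions changes which cache line a given value of K xor M touches, so timings averaged across rekeys no longer point at a fixed index.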

11
Indirect timing measurements?

Hypothesis
The attacker has access to user mode on the
system (legal or illegal)
The attacker has no access to your data
He/she can run his/her own process concurrently with the
encryption
On conventional systems, no access to the microscopic
timing of your application
Time slices last millions of cycles

12
Simultaneous Multithreading (SMT): parallel
processing on a single processor

Functional units are underused on superscalar
processors
SMT
Sharing the functional units of a superscalar
processor between several processes
Advantages
A single process can use all the resources
Dynamic sharing of all structures on
parallel/multiprocess workloads

Second Vulnerability
13
Superscalar
Issue slots
14
Indirect timing measurements on an SMT processor
(principles)

SPY wants to get information on CRYPT
SPY and CRYPT run in parallel
SPY tracks a specific event in CRYPT
For instance, the execution of a branch
SPY saturates the hardware resources that CRYPT needs
to execute this event quickly
SPY records its own execution time (reading the
hardware clock counter)
An irregularity in its own execution time signals
the event
CRYPT has tried to grab the hardware resource

15
Indirect timing measurements on an SMT: proof of
concept (derived from SBPA)
The skeleton of a naive RSA core
For I = 1 to N
  Sequence X          // 1,000s of cycles
  If Key[I] = 1       // spy this branch B
    Sequence Y        // 1,000s of cycles
Endfor
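The skeleton above has the shape of left-to-right square-and-multiply exponentiation. The sketch below assumes that Sequence X is the squaring and Sequence Y the multiplication (the slides do not say so), and it records a trace only to show what an observer of branch B would learn. Function names are hypothetical.

```python
def naive_modexp(base, key_bits, modulus, trace=None):
    """Compute base**key mod modulus; key_bits are most significant bit first."""
    result = 1
    for bit in key_bits:
        result = (result * result) % modulus      # Sequence X (assumed: square)
        if trace is not None:
            trace.append("X")
        if bit == 1:                              # branch B depends on the key
            result = (result * base) % modulus    # Sequence Y (assumed: multiply)
            if trace is not None:
                trace.append("Y")
    return result


def bits_from_trace(trace):
    """Rebuild the key bits from the X / Y sequence seen by an observer."""
    bits = []
    for event in trace:
        if event == "X":
            bits.append(0)      # a new iteration starts; assume bit 0 for now
        else:
            bits[-1] = 1        # a Y follows: that iteration's bit was 1
    return bits
```

Any observer able to tell X-only iterations from X-then-Y iterations therefore reads the exponent directly, which is why the branch B is the target.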
16
Indirect timing measurements on an SMT: proof of
concept (2)

Branch instructions are buffered in a BTB
On the Pentium 4, a branch that misses in the BTB
incurs a penalty of more than 20 cycles
SPY: a nearly infinite loop iterating over a set of
branches that occupy the possible BTB entries for B
Track irregularities in the timing of the loop
When B is executed, a branch of the SPY is
ejected from the BTB, thus creating a timing
irregularity
Each iteration is then identified as X-type or XY-type

Able to reproduce this attack on a toy example
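The mechanism can be mimicked with a purely software toy model. This is not the author's proof of concept: it assumes a single 4-way BTB set with LRU replacement, interleaves SPY and CRYPT sequentially instead of running them in parallel, measures the lengths of X and Y in SPY loop iterations, and assumes SPY knows when CRYPT finishes. All constants and names are hypothetical except the 20-cycle miss penalty quoted above.

```python
from collections import OrderedDict

MISS_PENALTY = 20   # cycles; the slides quote "more than 20" on Pentium 4
BASE_COST = 10      # hypothetical cost of one SPY loop iteration, all hits
X_LEN = 5           # hypothetical length of Sequence X, in SPY iterations
Y_LEN = 5           # hypothetical length of Sequence Y, in SPY iterations


class BtbSet:
    """One BTB set with LRU replacement (associativity is an assumption)."""

    def __init__(self, ways=4):
        self.ways = ways
        self.entries = OrderedDict()

    def access(self, branch):
        """Return True on a hit; on a miss, insert and evict the LRU entry."""
        if branch in self.entries:
            self.entries.move_to_end(branch)
            return True
        if len(self.entries) >= self.ways:
            self.entries.popitem(last=False)
        self.entries[branch] = True
        return False


def simulate(key_bits, ways=4):
    """Return the SPY loop timings observed while CRYPT processes key_bits."""
    btb = BtbSet(ways)
    spy_branches = ["spy%d" % i for i in range(ways)]

    def spy_iteration():
        misses = sum(not btb.access(b) for b in spy_branches)
        return BASE_COST + MISS_PENALTY * misses

    spy_iteration()                      # warm-up: SPY fills the whole set
    timings = []
    for bit in key_bits:
        timings.extend(spy_iteration() for _ in range(X_LEN))   # CRYPT in X
        btb.access("B")                  # CRYPT executes B, evicting a SPY entry
        if bit:
            timings.extend(spy_iteration() for _ in range(Y_LEN))  # CRYPT in Y
    timings.append(spy_iteration())      # one last SPY iteration after CRYPT ends
    return timings


def decode(timings):
    """Recover key bits from the spacing between timing irregularities."""
    events = [i for i, t in enumerate(timings) if t > BASE_COST]
    bits = [1 if nxt - cur > X_LEN else 0 for cur, nxt in zip(events, events[1:])]
    bits.append(1 if len(timings) - 1 - events[-1] >= Y_LEN else 0)
    return bits
```

In this model each execution of B produces one slow SPY iteration, and the distance to the next slow iteration tells whether CRYPT went through X alone or through Y and then X.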
17
Indirect timing measurements on an SMT

Feasible
Through a branch on the Pentium 4 HT, information is
leaking
I recovered all the bits of a 32-bit key in a
single run (on a toy example)
The same kind of attack may apply to cache accesses:
the memory access sequence could be discovered

18
Feasible, but difficult

Technically, very difficult
Lack of documentation on the BTB
Strange indexing, unknown associativity, BTB
hierarchy
Requires relatively infrequent events (every 1,000s
of cycles): measurement resolution is in the
100s of cycles

19
So what?