Private Matching - PowerPoint PPT Presentation

1
On the (in)security of the random number
generators of Linux and Windows
Benny Pinkas, University of Haifa
Zvi Gutterman, Leo Dorrendorf, Tzachy Reinman,
Hebrew University
2
In this talk

What are PRNGs (pseudo-random number generators)?
Why are they important? Why should the generators
of Windows/Linux be investigated?
The generator used by Windows
Its algorithm, and its weaknesses.
A little on the generator used by Linux
Its algorithm, and its weaknesses.
Security issues when using a generator in a
system without a hard disk.

3
Why are random number generators important?
4
Usage of random bits

Many applications need random bits for their
operation
This is particularly true for security
applications
All cryptosystems are only secure if they use
random keys
In theory: "Pick a key K at random"
In practice this looks like:
CryptGenRandom(Key, 16)
Relevant for SSL, SSH, etc.
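As a rough illustration of "pick a key K at random" in application code, the sketch below asks the operating system's generator for 16 bytes. It uses Python's standard `secrets` module rather than the Windows API named on the slide; the helper name `generate_key` is made up for this example.

```python
import secrets


def generate_key(num_bytes: int = 16) -> bytes:
    """Return num_bytes of key material from the operating system's generator."""
    return secrets.token_bytes(num_bytes)


key = generate_key(16)        # a 128-bit key, e.g. for a symmetric cipher
print(len(key), key.hex())    # the hex value differs on every run
```

The security of the key then rests entirely on the quality of the generator behind the call, which is the subject of the rest of the talk.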

5
Usage of random bits: session ids

HTTP is stateless. Session ids can make HTTP
stateful.

[Figure: every client request carries the session id along with its data]

Knowledge of a session id enables an attacker to
impersonate the client.
Session ids must therefore be random/unpredictable.
GM05 showed how to guess session ids in the
Apache Java implementation for Servlet 2.4
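A minimal sketch of why predictability matters for session ids. This is not the GM05 attack; it is a toy model in which a server seeds a non-cryptographic generator with a guessable value (a hypothetical boot timestamp), so an attacker who tries nearby seeds reproduces the id. All names and the seed value are assumptions made for illustration.

```python
import random
import secrets


def weak_session_id(seed: int) -> str:
    """Insecure: id derived from a generator seeded with a guessable value."""
    rng = random.Random(seed)
    return f"{rng.getrandbits(64):016x}"


def strong_session_id() -> str:
    """Id drawn from the operating system's generator (128 bits, hex-encoded)."""
    return secrets.token_hex(16)


server_seed = 1_700_000_000              # hypothetical boot time in seconds
victim_id = weak_session_id(server_seed)

# The attacker tries every seed within one minute of the guessed boot time.
candidates = {weak_session_id(s) for s in range(server_seed - 60, server_seed + 61)}
print(victim_id in candidates)           # the weak id is among the guesses
print(strong_session_id())
```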

6
Usage of random bits: preventing TCP spoofing

TCP sequence numbers should be unpredictable
to prevent packet spoofing
I.e., prevent attackers from pretending to come
from fake IP addresses
"completing" a TCP handshake with a victim server
without ever receiving any responses from the
server.
Predictable TCP sequence numbers enable such
attacks

7
Security of random number generators
8
Security

Applications are designed to be secure when using
truly random bits
Random bits are hard to get
=> instead use pseudo-random bits (from a
pseudo-random number generator, PRNG)
Applications are now only secure if pseudo-random
bits are indistinguishable from random
Otherwise an attacker can, e.g.,
Guess cryptographic keys (SSL in Netscape
GW96)
Guess session ids (Apache session ids GM05)

9
Pseudo-random generator

Pseudo-random number generator: a deterministic
function mapping a short, random, secret seed to
a long output which is indistinguishable from
random.

[Figure: a short random seed s is fed to the deterministic function G (the pseudo-random generator), which produces the long output G(s)]
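The shape of such a function G can be sketched as follows. The construction (SHA-256 applied to the seed and a counter) is chosen only for illustration and is not taken from the talk; it shows the interface of a generator (short seed in, long deterministic output out), not a vetted design.

```python
import hashlib


def G(seed: bytes, out_len: int) -> bytes:
    """Stretch a short seed into out_len bytes, deterministically."""
    out = bytearray()
    counter = 0
    while len(out) < out_len:
        # Each block depends only on the seed and the block index.
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:out_len])


seed = b"short secret seed"              # hypothetical 17-byte seed
long_output = G(seed, 1000)
print(len(long_output), G(seed, 1000) == long_output)
```

Anyone who knows the seed can recompute the whole output, which is why the seed must be random and secret.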
10
Possible Random Number Generators

Pure hardware generator (of true randomness)
Cost / portability / interface issues
Application based PRNGs
Too little noise available for seeding
Implementer can make mistakes
(The generators provided by most programming
languages are insecure for security related
applications)
Operating system based PRNGs
Seeding can use system based noise/entropy
(process scheduling, hard disk timing, etc.)
PRNG can be implemented and hidden in the kernel
Implementer is less likely to make mistakes

11
Why investigate the PRNGs of major operating
systems?

Implementers of applications use the
pseudo-random number generators provided by the
major operating systems
But
The algorithms and code of these generators were
never published!
We don't know how they are initialized!
Yet their output is crucial for almost any
security application!

12
Operating system based PRNGs

The PRNG keeps an internal state, which advances
(in a deterministic way) when output is
generated.
The state is periodically refreshed with entropy
generated by the operating system.
This differs from the theoretical model of a PRNG.
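The state-plus-refresh model can be sketched as a small class. This is a toy built on SHA-256, not the Windows or Linux design; the class and method names are invented for the example. It also shows why refreshing matters: a copy of the state predicts every output until fresh entropy is mixed in.

```python
import hashlib


class RefreshingGenerator:
    """Toy OS-style generator: a state that advances and is refreshed."""

    def __init__(self, seed: bytes) -> None:
        self.state = hashlib.sha256(b"seed" + seed).digest()

    def next_block(self) -> bytes:
        """Emit 32 output bytes and advance the state deterministically."""
        output = hashlib.sha256(b"out" + self.state).digest()
        self.state = hashlib.sha256(b"next" + self.state).digest()
        return output

    def refresh(self, entropy: bytes) -> None:
        """Mix fresh system entropy into the state."""
        self.state = hashlib.sha256(b"refresh" + self.state + entropy).digest()


victim = RefreshingGenerator(b"hypothetical boot entropy")
attacker = RefreshingGenerator(b"")
attacker.state = victim.state            # the attacker learns the state once

same_before = victim.next_block() == attacker.next_block()
victim.refresh(b"fresh entropy unknown to the attacker")
same_after = victim.next_block() == attacker.next_block()
print(same_before, same_after)
```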

13
Operating system based PRNGs

When analyzing PRNG security, we assume that
everything but the initial seed (system entropy)
is known to the attacker
The OS manufacturer might try to hide the
algorithm, but reverse engineering can find it

14
Desired security properties
15
Desired property 1: Pseudo-randomness
Output is indistinguishable from random.
(Therefore, it can be used instead of truly
random bits.)
16
Desired property 2: Backward security (break-in
recovery)

An attacker that learns the internal state cannot
learn future outputs of the generator, assuming
that sufficient entropy is used to refresh the
state.

[Figure: timeline of generator states; one state is compromised, State i+3 lies after the refresh]
17
Desired property 3: Forward security

Given state i+1 it is hard to compute state i
(i.e., an attacker which learns the internal
state cannot learn previous outputs of the
generator).
A mandatory requirement of the German evaluation
guidance for PRNGs.

[Figure: going back from the compromised state i+1 to state i is HARD]
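The difference between a state update that can be run backwards and one that cannot is easy to see in code. Both update rules below are illustrative assumptions: the first is a plain additive step, the second applies SHA-256 as a stand-in for a one-way function.

```python
import hashlib

MODULUS = 2 ** 160
INCREMENT = 0x9E3779B9                   # arbitrary constant for the toy update


def step_invertible(state: int) -> int:
    """No forward security: the update is a bijection that is easy to undo."""
    return (state + INCREMENT) % MODULUS


def unstep_invertible(state: int) -> int:
    """Recover the previous state from the current one."""
    return (state - INCREMENT) % MODULUS


def step_one_way(state: bytes) -> bytes:
    """Going back would require inverting the hash function."""
    return hashlib.sha256(b"next" + state).digest()


state_i = 123456789
state_next = step_invertible(state_i)
print(unstep_invertible(state_next) == state_i)   # previous state recovered
print(step_one_way(b"state i").hex())
```

With the one-way update there is no `unstep` function to write short of inverting SHA-256, which is the property the slide asks for.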
18
Why is forward security important?

Security systems are secure as long as attackers
cannot access secret keys
Determined attackers might be able to access keys
How can we minimize the damage of key exposure?

19
Minimizing the damage of key exposure

Threshold crypto (space dimension)
Use n servers. Critical operations require
participation of t (< n) servers. Attacker must
break into t servers in order to break security.
Examples: secret sharing, threshold signatures.
Proactive crypto (space and time)
At the end of every day the n servers exchange
messages and change their state. Attacker must
break into t servers on the same day to break
security.
Disadvantage: at least t servers are needed for
any operation (e.g., signatures).

20
Key evolution

Use time dimension.
The sensitive information (e.g. key) is
frequently updated.
Adversary which learns the current key cannot
break security of past operations.
Forward security: current users do not have to
worry about attacks which might happen in the
future.

21
Forward secure PRGs K,BY,BH

Also, the proof of the HILL construction of PRGs
from one-way functions can be extended to show
forward security.

22
Forward secure signatures

A single public verification key.
A different private signature key per day.
Signatures with all private keys can be verified
with the same public key.
At the end of the day the signature key is
erased.
Forward security: an attacker who obtains
today's private signature key cannot learn the
keys of previous days.

23
Forward secure signatures

Basic scheme (Anderson)
Private keys: a certification key + 365 day keys
Public key: public verification key of the
certification key
Initialization
Use the certification key to sign 365 certificates:
"Key PK_i is the public verification key of day
i."
Erase the certification key
Day i
Sign using K_i. Add to the signature the
verification key PK_i, and the certificate of day
i. Erase K_i at end of day.
Improvement (K)
O(1) storage. Use a forward-secure PRG to
generate private day keys. Sign certificates
using a hash tree.
Many more improvements (time vs. space).

24
Cryptanalysis of the Windows random number
generator

With Leo Dorrendorf, Zvi Gutterman
Hebrew University
ACM CCS 2007

25
CryptGenRandom

The only API provided by Windows OS for getting
secure random numbers
The world's most common PRNG
Used by Internet Explorer to generate SSL keys
Its exact design and code were unknown (until
now)
Security by obscurity?

26
Our research

Examined the binary code of Windows 2000
Windows 2000 is still the 2nd/3rd most popular OS
PRNGs of all Windows systems are said to be
similar
Identified the algorithm used by the PRNG
Did not have access to the source code.
Used static and dynamic reverse engineering. This
was not easy.
Verified the algorithm by writing a user-mode
simulator which outputs the same values as the
OS.
Showed attacks on forward and backward security

27
The main loop (never before published!)

CryptGenRandom(Buffer, Len)
// output Len bytes to Buffer
while (Len > 0) {
    R = R ⊕ get_next_20_rc4_bytes()
    State = State ⊕ R
    T = SHA-1(State)
    Buffer = Buffer | T
    // | denotes concatenation
    R[0..4] = T[0..4]
    // copy 5 least significant bytes
    State = State + R + 1
    Len = Len - 20
}
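The main loop can be simulated in a few lines. The sketch below makes several assumptions that the slide does not settle: the 20-byte registers are treated as little-endian integers, addition is taken modulo 2^160, a single textbook RC4 instance stands in for get_next_20_rc4_bytes(), standard SHA-1 from `hashlib` is used for the hash step, and the registers start at zero. It will therefore not reproduce real Windows output; it only shows how R, State and T interact.

```python
import hashlib

MASK160 = (1 << 160) - 1     # State and R are 20-byte registers
MASK40 = (1 << 40) - 1       # the 5 least significant bytes


class RC4:
    """Textbook RC4, standing in for get_next_20_rc4_bytes()."""

    def __init__(self, key: bytes) -> None:
        self.S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + self.S[i] + key[i % len(key)]) % 256
            self.S[i], self.S[j] = self.S[j], self.S[i]
        self.i = self.j = 0

    def next_bytes(self, n: int) -> bytes:
        out = bytearray()
        for _ in range(n):
            self.i = (self.i + 1) % 256
            self.j = (self.j + self.S[self.i]) % 256
            self.S[self.i], self.S[self.j] = self.S[self.j], self.S[self.i]
            out.append(self.S[(self.S[self.i] + self.S[self.j]) % 256])
        return bytes(out)


def crypt_gen_random_sketch(rc4: RC4, state: int, r: int, length: int):
    """Return (output, new_state, new_r) after producing `length` bytes."""
    buffer = bytearray()
    remaining = length
    while remaining > 0:
        r ^= int.from_bytes(rc4.next_bytes(20), "little")     # R = R xor RC4 bytes
        state ^= r                                            # State = State xor R
        t = hashlib.sha1(state.to_bytes(20, "little")).digest()
        buffer += t                                           # Buffer = Buffer | T
        r = (r & ~MASK40) | int.from_bytes(t[:5], "little")   # R[0..4] = T[0..4]
        state = (state + r + 1) & MASK160                     # State = State + R + 1
        remaining -= 20
    return bytes(buffer[:length]), state, r


out, state, r = crypt_gen_random_sketch(RC4(b"hypothetical key"), 0, 0, 40)
print(out.hex())
```

Note that every output block T also overwrites the low 5 bytes of R, so an observer of the output learns part of the next register value.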

28
Two 20 byte long registers


SHA-1 is a hash function
29
Uses RC4 and SHA1


RC4 is a stream cipher
30
Odd usage of ⊕ and +


31
CryptGenRandom

Scoping: a different state is kept for every
thread
RC4 states are kept in static DLL space. R and
State are stored on the stack.
For an attacker, it is easier to learn this data
compared to a system where this data is stored in
the kernel.
Initialization: gathers 3584 bytes of system data
(most of this data is predictable).
Internal states, OS and CPU queries, registry
keys, etc.
Applies SHA1 and RC4 to this data to compute the
initial RC4 states.
Initialization is crucial for security.
Reseeding: after a process reads 128 Kbytes of
output from CryptGenRandom, initialization is
repeated.
New system entropy is only collected at time of
rekeying.

32
Attack on backward security (learning future
outputs)

Since we know the algorithm, if we learn the
state we can compute future states and outputs
until the next entropy refresh. (This requires no
cryptanalysis.)
This is not surprising,
but since entropy is refreshed only every 128
Kbytes of output for each thread (e.g., never for
IE), the attack is very severe.
The generator should have been refreshed more
often.

33
We don't know how to attack the pseudo-randomness
of the generator

The main loop
Uses RC4 and SHA1 to advance state
Applies SHA1 to (part of) state to compute output

[Figure: RC4 and SHA1 advance the state; SHA1 computes the output]

We don't know how to distinguish the PRNG's
output from random, or compute state from output.

34
Attack on forward security (learning previous
states)

RC4 is a good stream cipher, but it was not
designed to provide forward security: given its
state at time i+1 it is easy to compute its state
at time i.
This enables us to break the forward security of
the generator
Main result (for CryptGenRandom): given State i+1
it is possible to compute State i with 2^23 work.
Attack is based (among other things) on
exploiting the relation between + and ⊕
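Rewinding RC4 takes only a few lines, because each output step is a swap plus two index updates, all of which can be undone. The sketch below uses the textbook RC4 definition; the function names and the key are made up for the example.

```python
def rc4_key_schedule(key: bytes) -> list:
    """Standard RC4 key scheduling: returns the initial permutation S."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S


def rc4_step(S: list, i: int, j: int):
    """Advance the state by one step; returns (i, j, output_byte)."""
    i = (i + 1) % 256
    j = (j + S[i]) % 256
    S[i], S[j] = S[j], S[i]
    return i, j, S[(S[i] + S[j]) % 256]


def rc4_unstep(S: list, i: int, j: int):
    """Undo one step: given the state at time t+1, restore the state at time t."""
    S[i], S[j] = S[j], S[i]      # undo the swap
    j = (j - S[i]) % 256         # undo j = j + S[i]
    i = (i - 1) % 256            # undo i = i + 1
    return i, j


S = rc4_key_schedule(b"hypothetical key")
initial = (list(S), 0, 0)

i = j = 0
for _ in range(20):              # produce 20 bytes, as one generator round does
    i, j, _byte = rc4_step(S, i, j)
for _ in range(20):              # rewind the same 20 steps
    i, j = rc4_unstep(S, i, j)

print((S, i, j) == initial)      # the earlier state is fully recovered
```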

35
An even simpler attack on forward security

Suppose we know the initial values of State and R
These variables are never initialized, but rather
take whatever value is on the stack location in
which they are stored.
These values are quite predictable.
Given current value of RC4 state(s) we can rewind
them to the initial values
Now, given initial values of all registers we can
simulate the RNG.

36
Implications

MSFT: "this is a local information disclosure
vulnerability and has no possibility of code
execution and cannot be accessed remotely."
But,
New remote execution attacks are found every
week.
Our attack can be used to amplify their effect.

37
Implications: possible attack scenario

Attacker learns state
E.g., by using an attack based on a buffer
overflow, or on physical access.
PRNG is implemented in user space rather than
kernel, so getting the state is easier.
Attacker can compute all previous and future
states and outputs
Combining the two attacks, attacker can compute
all states until state is refreshed with system
entropy.
System entropy based refresh is very rare (occurs
only after 128KB of output per process).

38
Implications: possible attack scenario

Attacker gets access to the machine
Buffer overflow, temporary physical access (at a
café).
Attacker learns a single state.
Does not need to control the machine afterwards.

39
The new attack

Attacker can now compute all states and outputs
from the previous to the next entropy refresh
Does not need any more interaction with the
system
Can now, e.g., decrypt all SSL connections.

(hundreds of SSL sessions)
40
Previously known attacks - key loggers

Attacker can only learn about the machine in the
period of time it owns it
Cannot learn about the past
To learn about the future it needs a long-lived
channel with the attacked machine

41
The Attack on Forward Security
42
The generator
43
The attack on forward security: what is known
when the attack begins
44
The attack on forward security
45
The attack on forward security
46
The attack on forward security
47
Looking at the previous round (40 bits are
missing in every register)
48
Looking at the previous round (one step earlier)
49
Completing the attack
50
False positives

The attack checks 2^40 options for the missing 5
bytes
True value always gives a match. Each other value
gives a (false positive) match with probability
2^-40.
False positives are identified if they have no
preimages.

Not really a problem, since we expect O(k) false
positives at time t-k (analysis using martingales).

[Figure: tree of candidate states at times t, t-1, t-2, t-3]
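The numbers above imply roughly one false positive per rewound step, which can be checked with exact arithmetic. The sketch only evaluates the expectation for a single step under the stated per-candidate probability; it does not reproduce the martingale analysis mentioned on the slide.

```python
from fractions import Fraction

candidates = 2 ** 40                     # guesses for the 5 missing bytes
p_false_match = Fraction(1, 2 ** 40)     # chance that a wrong guess still matches

# All candidates except the true value can only match by accident.
expected_false_positives = (candidates - 1) * p_false_match
print(float(expected_false_positives))   # just below 1
```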
51
Overhead of our attack

A simple attack requires 2^40 invocations of SHA1
A more intricate attack (using the relation
between + and ⊕) requires 2^23 work on average
Our implementation of this attack runs in 19
seconds (no optimization)
Dag Arne Osvik implements SHA1 on the PlayStation
3
Performs 83 million SHA1 invocations per second
=> can implement our attack in 1/10 of a second!
Details of improved attack on whiteboard.
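The running-time estimate follows from simple division. The sketch assumes that one unit of "work" costs about one SHA1 invocation, which the slide implies but does not state.

```python
SHA1_PER_SECOND = 83_000_000             # PlayStation 3 figure quoted above


def seconds_for(work: int) -> float:
    """Time to perform `work` SHA1 invocations at the quoted rate."""
    return work / SHA1_PER_SECOND


simple_attack = seconds_for(2 ** 40)     # about 13,000 s, i.e. roughly 3.7 hours
improved_attack = seconds_for(2 ** 23)   # about 0.1 s

print(f"simple: {simple_attack / 3600:.1f} h, improved: {improved_attack:.2f} s")
```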

52
What about XP?

The external layer of the PRNG in XP is identical

More complex than Windows 2000
No forward security here means no forward
security for the entire PRNG

while (Len > 0) {
    R = R ⊕ SystemFunction036()
    State = State ⊕ R
    T = SHA-1(State)
    Buffer = Buffer | T
    // | denotes concatenation
    R[0..4] = T[0..4]
    // copy 5 least significant bytes
    State = State + R + 1
    Len = Len - 20
}
53
News Flash!

MSFT's first answer: "(later versions of
Windows) contain various changes and enhancements
to the random number generator."
MSFT's later answer:
XP is vulnerable to the attack.
Vista, Windows Server 2008 and Windows Server
2003 SP2 are not affected by the attack.
The XP vulnerability will be fixed in SP3.

54
Future work

Investigate other Windows OSs
XP, Vista, Mobile.
In particular, examine the PRNG-OS interaction.
Recommendations
Switch to a design which supports forward
security (e.g., Barak-Halevi ACM CCS 05, which
provides theoretical guarantees).
Perform entropy rekeys more often (but not too
often).

55
Analysis of the Linux random number generator

With Zvi Gutterman, Tzachy Reinman
Hebrew University
IEEE Symposium on Security and Privacy, 2006
Black Hat 2006

56
Linux PRNG (LRNG)

Development: started by Theodore Tso in 1994
Engineering: implemented in the kernel. Complex
structure, hundreds of patches to date, changes
on a weekly basis.
Used by many applications
TCP, PGP, SSL, S/MIME, ...
Two interfaces
Kernel interface: get_random_bytes
(non-blocking)
User interfaces: /dev/random (blocking,
extremely secure), /dev/urandom
(non-blocking)

57
Entropy estimation

A counter estimates physical entropy in the LRNG
Increased on entropy addition (from OS events)
Decreased on output extraction
Blocking and non-blocking interfaces
Blocking interface does not provide output when
the entropy estimate reaches zero; it is
therefore considered more secure.
Non-blocking interface always provides output
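The counter logic can be sketched as a small class. This is a simplified model and not the kernel's code: the class and method names are invented, entropy is counted in whole bits, and a blocking read that cannot be served is reported as `False` instead of actually blocking.

```python
class EntropyCounter:
    """Toy model of an entropy estimate shared by two read interfaces."""

    def __init__(self) -> None:
        self.bits = 0

    def add_entropy(self, bits: int) -> None:
        """Credit the estimate when an OS event is mixed into the pool."""
        self.bits += bits

    def read_blocking(self, num_bytes: int) -> bool:
        """/dev/random-like: refuse (i.e. block) if the estimate is too low."""
        needed = num_bytes * 8
        if self.bits < needed:
            return False
        self.bits -= needed
        return True

    def read_nonblocking(self, num_bytes: int) -> bool:
        """/dev/urandom-like: always succeed; the estimate stops at zero."""
        self.bits = max(0, self.bits - num_bytes * 8)
        return True


counter = EntropyCounter()
counter.add_entropy(128)                 # e.g. 128 disk events at 1 bit each
first_read = counter.read_blocking(16)   # 128 bits available: served
second_read = counter.read_blocking(1)   # estimate is now zero: would block
urandom_read = counter.read_nonblocking(64)
print(first_read, second_read, urandom_read)
```

In this model a heavy non-blocking reader drains the shared estimate, after which blocking readers stall until new events arrive.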

58
On reverse engineering

The Linux PRNG is part of the Linux kernel and
hence its source is open
The entire code is 2500 lines written in C
The code is unclear, complex and constantly being
patched. (Some major bugs took more than a year
to be identified.)
Our tools
Static analysis
Kernel modification: required many kernel builds,
while ensuring that kernel changes do not affect
generator entropy usage.
We implemented and confirmed our findings with a
user mode simulator.

59
LRNG structure
[Figure: LRNG structure. C: entropy collection, A: entropy addition, E: data extraction]
60
Entropy Collection

Asynchronous: entropy is constantly gathered and
added to the pools (unlike Windows).
Events are represented by two 32-bit words
Event type
E.g., mouse press, keyboard value, HD id.
Event time in milliseconds from up time
Bad news
Actual entropy in every event is very limited
Most entropy comes from HD reads/writes. We
conducted experiments which showed that each op
contributes 1 bit.
Good news
There are many events

61
Entropy Addition

Cyclic pool, generalization of LFSR, 32-bit
words.
Different polynomial for each pool size
A is a known matrix
Polynomial: X^32 + X^26 + X^20 + X^14 + X^7 + X + 1
Addition algorithm (g: input, j: current pool
position)

62
Output extraction
[Figure: output extraction, with two "add" steps feeding back into the pool]

After extraction
Set i ← i-1
Update entropy estimation

63
Comparison to Windows

We showed an attack on forward security in Linux,
but it is less efficient (2^64 vs. 2^23)
The implications of the attack are less severe
(frequent entropy updates in Linux vs. infrequent
updates in Windows)
Also, in Linux
Generator is in kernel space
Same generator used by all processes
Blocking interface (/dev/random):
susceptible to DoS attacks (even by remote
attackers).

64
Implications to disk-less systems
65
Implications to disk-less systems

More and more systems are using solid state
(flash) based storage instead of hard disks.
The timing of HD r/w operations is unpredictable.
Solid state operations always have the same
timing.
HD timing is the major source of entropy for the
Linux PRNG.
Other sources of entropy are quite limited: user
input, system interrupts.
They might be guessed by an attacker.
Possible threat to the security of the Linux PRNG
in many future systems.

66
What can be done to seed the generator with more
entropy?

Known recommendation: Linux PRNG should simulate
continuity between shutdown and reboot
At shutdown 512 bytes are read from /dev/urandom.
At reboot these bytes are written to the PRNG.
=> An attacker that wants to guess the PRNG state
must now record/guess all inputs and interrupts
since the first run of the machine.
This is done by the Linux distribution (not the
kernel).
But, Linux distributions on CD/DVD (Knoppix)
cannot save the state.
Many other systems (e.g., the OpenWRT Linux based
router) do not save the state of their generator.
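The shutdown/reboot procedure amounts to two small file operations. The sketch below assumes the generator accepts input written to /dev/urandom, as the slide describes; the function names and the seed-file path are placeholders, and real distributions do this in their init scripts instead.

```python
SEED_BYTES = 512   # amount of generator output carried across reboots


def save_seed(seed_path: str, source: str = "/dev/urandom") -> None:
    """At shutdown: read generator output and store it on persistent storage."""
    with open(source, "rb") as src:
        seed = src.read(SEED_BYTES)
    with open(seed_path, "wb") as dst:
        dst.write(seed)


def restore_seed(seed_path: str, sink: str = "/dev/urandom") -> None:
    """At boot: write the stored bytes back into the generator."""
    with open(seed_path, "rb") as src:
        seed = src.read(SEED_BYTES)
    with open(sink, "wb") as dst:
        dst.write(seed)


# Intended use (path is hypothetical):
#   save_seed("/var/lib/example/random-seed")      # in the shutdown script
#   restore_seed("/var/lib/example/random-seed")   # early in the boot script
```

Systems that boot from read-only media have nowhere to write the seed file, which is the gap the slide points out.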

67
Analysis of a certain Linux based device

A surprising finding: the device always boots
with one of 6 possible values of the PRNG state.
The device uses solid state storage and no HD.
The PRNG does not save its state at shutdown.
During reboot, the PRNG:
Reads values from a hardware based noise
generator.
Copies the system clock onto its state.
But, at that time:
Hardware noise generator provides non-random
output.
Hardware clock is not loaded to the software
clock, so the value read from the software clock
is always fixed.

68
Preliminary results