1
On the (in)security of the random number
generators of Linux and Windows
Benny Pinkas, University of Haifa
Zvi Gutterman, Leo Dorrendorf, Tzachy Reinman, Hebrew University
2
In this talk
  • What are PRNGs (pseudo-random number generators)?
  • Why are they important? Why should the generators
    of Windows/Linux be investigated?
  • The generator used by Windows
  • Its algorithm, and its weaknesses.
  • A little on the generator used by Linux
  • Its algorithm, and its weaknesses.
  • Security issues when using a generator in a
    system without a hard disk.

3
Why are random number generators important?
4
Usage of random bits
  • Many applications need random bits for their
    operation
  • This is particularly true for security
    applications
  • All cryptosystems are only secure if they use
    random keys
  • Pick a key K at random
  • In practice this looks like:
  • CryptGenRandom(Key, 16)
  • Relevant for SSL, SSH, etc.

5
Usage of random bits: session ids
  • HTTP is stateless. Session ids can make HTTP
    stateful.

[Figure: each message exchanged carries "Session id, blah-blah"]
  • Knowledge of session ids enables an attacker to
    impersonate clients.
  • Session ids must therefore be random/unpredictable.
  • [GM05] showed how to guess session ids in the
    Apache Java implementation for Servlet 2.4

6
Usage of random bits: preventing TCP spoofing
  • TCP sequence numbers should be unpredictable
  • to prevent packet spoofing
  • I.e., prevent attackers from pretending to come
    from fake IP addresses
  • "completing" a TCP handshake with a victim server
    without ever receiving any responses from the
    server.
  • Predictable TCP sequence numbers enable such
    attacks

7
Security of random number generators
8
Security
  • Applications are designed to be secure when using
    truly random bits
  • Random bits are hard to get
  • ⇒ instead use pseudo-random bits (from a
    pseudo-random number generator - PRNG)
  • Applications are now only secure if pseudo-random
    bits are indistinguishable from random
  • Otherwise an attacker can, e.g.,
  • Guess cryptographic keys (SSL in Netscape
    [GW96])
  • Guess session ids (Apache session ids [GM05])

9
Pseudo-random generator
  • Pseudo-random number generator: a deterministic
    function mapping a short, random, secret seed to
    a long output which is indistinguishable from
    random.

[Figure: pseudo-random generator. The seed s (random, short) is fed into G (a deterministic function), which produces G(s) (long output).]
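The definition above can be illustrated with a minimal sketch. The construction used here (SHA-256 in counter mode) and the name `prg` are assumptions made for illustration; they are not taken from the talk, and this is not a vetted generator.

```python
import hashlib


def prg(seed: bytes, out_len: int) -> bytes:
    """Deterministically stretch a short secret seed into out_len bytes.

    Illustrative construction only: hash the seed together with a counter.
    """
    out = bytearray()
    counter = 0
    while len(out) < out_len:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:out_len])
```

The same seed always yields the same output, so all the unpredictability must come from the seed itself.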
10
Possible Random Number Generators
  • Pure hardware generator (of true randomness)
  • Cost / portability / interface issues
  • Application based PRNGs
  • Too little noise available for seeding
  • Implementer can make mistakes
  • (The generators provided by most programming
    languages are insecure for security related
    applications)
  • Operating system based PRNGs
  • Seeding can use system based noise/entropy
    (process scheduling, hard disk timing, etc.)
  • PRNG can be implemented and hidden in the kernel
  • Implementer is less likely to make mistakes

11
Why investigate the PRNGs of major operating
systems?
  • Implementers of applications use the
    pseudo-random number generators provided by the
    major operating systems
  • But
  • The algorithms and code of these generators were
    never published!
  • We don't know how they are initialized!
  • Yet their output is crucial for almost any
    security application!

12
Operating system based PRNGs
  • The PRNG keeps an internal state, which advances
    (in a deterministic way) when output is
    generated.
  • The state is periodically refreshed with entropy
    generated by the operating system.
  • This differs from the theoretical model of a PRNG,
    which has a single fixed seed.
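A toy model of such a generator might look as follows. The class name `ToyOsPrng`, the use of SHA-256, and the domain-separation tags are all assumptions for illustration and do not describe the Windows or Linux generators.

```python
import hashlib


class ToyOsPrng:
    """Toy stateful generator: deterministic advance plus entropy refresh."""

    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(b"init" + seed).digest()

    def next_block(self) -> bytes:
        # The output is derived from the state, then the state advances
        # deterministically.
        out = hashlib.sha256(b"out" + self.state).digest()
        self.state = hashlib.sha256(b"next" + self.state).digest()
        return out

    def refresh(self, entropy: bytes) -> None:
        # Mix fresh operating system entropy into the state.
        self.state = hashlib.sha256(b"refresh" + self.state + entropy).digest()
```

Anyone holding a copy of the state can follow the generator until `refresh` is called with entropy they cannot guess.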

13
Operating system based PRNGs
  • When analyzing PRNG security, we assume that
    everything but the initial seed (system entropy)
    is known to the attacker
  • The OS manufacturer might try to hide the
    algorithm, but reverse engineering can find it

14
Desired security properties
15
Desired property 1: Pseudo-randomness
Output is indistinguishable from random.
(Therefore, it can be used instead of
truly random bits.)
16
Desired property 2: Backward security (break-in
recovery)
  • An attacker that learns the internal state cannot
    learn future outputs of the generator, assuming
    that sufficient entropy is used to refresh the
    state.

[Figure: sequence of states; labels "compromised" and State_{i+3}]
17
Desired property 3: Forward security
  • Given state_{i+1} it is hard to compute state_i
    (i.e., an attacker which learns the internal
    state cannot learn previous outputs of the
    generator).
  • A mandatory requirement of the German evaluation
    guidance for PRNGs.

[Figure: sequence of states; going back from the compromised state is marked "HARD"]
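The difference between an invertible and a one-way state update can be sketched as below. Both update rules and all function names are hypothetical and chosen only to illustrate the property; neither is the update rule of a real operating system generator.

```python
import hashlib


def h(tag: bytes, data: bytes) -> bytes:
    return hashlib.sha256(tag + data).digest()


def output(state: bytes) -> bytes:
    """Output derived from a 32-byte state."""
    return h(b"out", state)


def advance_counter(state: bytes) -> bytes:
    """Invertible update: treat the state as an integer and add 1."""
    n = (int.from_bytes(state, "big") + 1) % (1 << 256)
    return n.to_bytes(32, "big")


def rewind_counter(state: bytes) -> bytes:
    """Undo advance_counter: this is what an attacker would run."""
    n = (int.from_bytes(state, "big") - 1) % (1 << 256)
    return n.to_bytes(32, "big")


def advance_hash(state: bytes) -> bytes:
    """One-way update: rewinding would require inverting the hash."""
    return h(b"next", state)
```

With `advance_counter`, an attacker who learns state_{i+1} recomputes state_i and hence the previous output; with `advance_hash` no comparable shortcut is known.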
18
Why is forward security important?
  • Security systems are secure as long as attackers
    cannot access secret keys
  • Determined attackers might be able to access keys
  • How can we minimize the damage of key exposure?

19
Minimizing the damage of key exposure
  • Threshold crypto (space dimension)
  • Use n servers. Critical operations require
    participation of t (< n) servers. Attacker must
    break into t servers in order to break security.
  • Example: secret sharing, threshold signatures.
  • Proactive crypto (space and time)
  • At the end of every day the n servers exchange
    messages and change their state. Attacker must
    break into t servers on the same day to break
    security.
  • Disadvantage: at least t servers are needed for
    any operation (e.g., signatures).
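As one concrete instance of the secret sharing example, here is a minimal sketch of Shamir's (t, n) scheme. The field modulus and the function names are choices made for this illustration, not something specified in the talk.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime, used here as the field modulus


def split(secret: int, t: int, n: int):
    """Split secret into n shares so that any t of them reconstruct it."""
    # Random polynomial of degree t-1 whose constant term is the secret.
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + c) % P
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the field of size P."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        # Modular inverse via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

Any t shares determine the polynomial and therefore the secret; fewer than t shares leave the constant term undetermined.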

20
Key evolution
  • Use time dimension.
  • The sensitive information (e.g. key) is
    frequently updated.
  • Adversary which learns the current key cannot
    break security of past operations.
  • Forward security: current users do not have to
    worry about attacks which might happen in the
    future.

21
Forward secure PRGs [K, BY, BH]
  • Also, the proof of the HILL construction of PRGs
    from one-way functions can be extended to show
    forward security.

22
Forward secure signatures
  • A single public verification key.
  • A different private signature key per day.
  • Signatures with all private keys can be verified
    with the same public key.
  • At the end of the day the signature key is
    erased.
  • Forward security: an attacker who obtains
    today's private signature key cannot learn the
    keys of previous days.

23
Forward secure signatures
  • Basic scheme [Anderson]
  • Private keys: a certification key + 365 day keys
  • Public key: public verification key of
    certification key
  • Initialization
  • Use certification key to sign 365 certificates:
    "Key PK_i is the public verification key of day
    i."
  • Erase certification key
  • Day i
  • Sign using K_i. Add to the signature the
    verification key PK_i, and the certificate of day
    i. Erase K_i at end of day.
  • Improvement [K]
  • O(1) storage. Use a forward-secure PRG to
    generate private day keys. Sign certificates
    using a hash tree.
  • Many more improvements (time vs. space).

24
Cryptanalysis of the Windows random number
generator
  • With Leo Dorrendorf, Zvi Gutterman
  • Hebrew University
  • ACM CCS 2007

25
CryptGenRandom
  • The only API provided by Windows OS for getting
    secure random numbers
  • The world's most common PRNG
  • Used by Internet Explorer to generate SSL keys
  • Its exact design and code were unknown (until
    now)
  • Security by obscurity?

26
Our research
  • Examined the binary code of Windows 2000
  • Windows 2000 is still the 2nd/3rd most popular OS
  • PRNGs of all Windows systems are said to be
    similar
  • Identified the algorithm used by the PRNG
  • Did not have access to the source code.
  • Used static and dynamic reverse engineering. This
    was not easy.
  • Verified the algorithm by writing a user-mode
    simulator which outputs the same values as the
    OS.
  • Showed attacks on forward and backward security

27
The main loop (never before published!)
  • CryptGenRandom(Buffer, Len)
  • // output Len bytes to Buffer
  • while (Len > 0)
  •     R := R ⊕ get_next_20_rc4_bytes()
  •     State := State ⊕ R
  •     T := SHA-1(State)
  •     Buffer := Buffer | T
  •     // | denotes concatenation
  •     R[0..4] := T[0..4]
  •     // copy 5 least significant bytes
  •     State := State + R + 1
  •     Len := Len - 20
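The main loop can be simulated with a short sketch. Several simplifications here are assumptions and not claims about Windows: a single RC4 instance stands in for get_next_20_rc4_bytes(), the standard SHA-1 from `hashlib` is used, and the 20-byte registers are read as little-endian integers so that bytes 0..4 are the least significant ones. The names `RC4` and `crypt_gen_random_sketch` are hypothetical.

```python
import hashlib

MASK160 = (1 << 160) - 1


class RC4:
    """Plain RC4: key schedule plus keystream generation."""

    def __init__(self, key: bytes):
        self.S = list(range(256))
        j = 0
        for i in range(256):
            j = (j + self.S[i] + key[i % len(key)]) % 256
            self.S[i], self.S[j] = self.S[j], self.S[i]
        self.i = self.j = 0

    def next_bytes(self, n: int) -> bytes:
        out = bytearray()
        for _ in range(n):
            self.i = (self.i + 1) % 256
            self.j = (self.j + self.S[self.i]) % 256
            self.S[self.i], self.S[self.j] = self.S[self.j], self.S[self.i]
            out.append(self.S[(self.S[self.i] + self.S[self.j]) % 256])
        return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crypt_gen_random_sketch(rc4: RC4, state: bytes, r: bytes, length: int):
    """Follow the slide's loop; returns (output, final State, final R)."""
    requested = length
    buffer = bytearray()
    while length > 0:
        r = xor_bytes(r, rc4.next_bytes(20))       # R := R xor RC4 output
        state = xor_bytes(state, r)                # State := State xor R
        t = hashlib.sha1(state).digest()           # T := SHA-1(State)
        buffer += t                                # Buffer := Buffer | T
        r = t[:5] + r[5:]                          # R[0..4] := T[0..4]
        total = (int.from_bytes(state, "little")
                 + int.from_bytes(r, "little") + 1)
        state = (total & MASK160).to_bytes(20, "little")  # State + R + 1
        length -= 20
    return bytes(buffer[:requested]), state, r
```

Note how State is updated with both XOR and integer addition; the attack on forward security described later exploits the relation between these two operations.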

28
Two 20 byte long registers
  • CryptGenRandom(Buffer, Len)
  • // output Len bytes to Buffer
  • while (Len > 0)
  •     R := R ⊕ get_next_20_rc4_bytes()
  •     State := State ⊕ R
  •     T := SHA-1(State)
  •     Buffer := Buffer | T
  •     // | denotes concatenation
  •     R[0..4] := T[0..4]
  •     // copy 5 least significant bytes
  •     State := State + R + 1
  •     Len := Len - 20

SHA-1 is a hash function
29
Uses RC4 and SHA1
  • CryptGenRandom(Buffer, Len)
  • // output Len bytes to Buffer
  • while (Len > 0)
  •     R := R ⊕ get_next_20_rc4_bytes()
  •     State := State ⊕ R
  •     T := SHA-1(State)
  •     Buffer := Buffer | T
  •     // | denotes concatenation
  •     R[0..4] := T[0..4]
  •     // copy 5 least significant bytes
  •     State := State + R + 1
  •     Len := Len - 20

RC4 is a stream cipher
30
Odd usage of ⊕ and +
  • CryptGenRandom(Buffer, Len)
  • // output Len bytes to Buffer
  • while (Len > 0)
  •     R := R ⊕ get_next_20_rc4_bytes()
  •     State := State ⊕ R
  •     T := SHA-1(State)
  •     Buffer := Buffer | T
  •     // | denotes concatenation
  •     R[0..4] := T[0..4]
  •     // copy 5 least significant bytes
  •     State := State + R + 1
  •     Len := Len - 20

31
CryptGenRandom
  • Scoping: a different state is kept for every
    thread
  • RC4 states in static DLL space. R and State
    stored in the stack.
  • For an attacker, it is easier to learn this data
    compared to a system where this data is stored in
    the kernel.
  • Initialization: gathers 3584 bytes of system data
    (most of this data is predictable).
  • Internal states, OS and CPU queries, registry
    keys, etc.
  • Applies SHA1 and RC4 to this data to compute the
    initial RC4 states.
  • Initialization is crucial for security.
  • Reseeding: after a process reads 128 Kbytes of
    output from CryptGenRandom, initialization is
    repeated.
  • New system entropy is only collected at time of
    rekeying.

32
Attack on backward security (learning future
outputs)
  • Since we know the algorithm, if we learn the
    state we can compute future states and outputs
    until the next entropy refresh. (This requires no
    cryptanalysis.)
  • This is not surprising
  • but since entropy is refreshed every 128 Kbytes
    of output for each thread (e.g., never for IE),
    the attack is very severe.
  • The generator should have been refreshed more
    often.

33
Don't know how to attack the pseudo-randomness of
the generator
  • The main loop
  • Uses RC4 and SHA1 to advance state
  • Applies SHA1 to (part of) state to compute output

[Figure: the state is advanced using RC4 and SHA1; the output is computed using SHA1]
  • We don't know how to distinguish the PRNG's
    output from random, or compute state from output.

34
Attack on forward security (learning previous
states)
  • RC4 is a good stream cipher, but it was not
    designed to provide forward security: given its
    state at time i+1 it is easy to compute its state
    at time i.
  • This enables us to break the forward security of
    the generator
  • Main result (for CryptGenRandom): given State_{i+1}
    it is possible to compute State_i with 2^23 work.
  • Attack is based (among other things) on
    exploiting the relation between + and ⊕
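The claim that RC4 can be run backwards is easy to check. The sketch below implements one standard RC4 output step and its inverse; the function names are hypothetical. It covers only the RC4 component, not the full 2^23 attack on State.

```python
def rc4_key_schedule(key: bytes) -> list:
    """Standard RC4 key scheduling; returns the initial permutation S."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S


def rc4_step(S: list, i: int, j: int):
    """Advance RC4 by one byte; mutates S, returns (i, j, output byte)."""
    i = (i + 1) % 256
    j = (j + S[i]) % 256
    S[i], S[j] = S[j], S[i]
    return i, j, S[(S[i] + S[j]) % 256]


def rc4_unstep(S: list, i: int, j: int):
    """Invert rc4_step: restore S, i and j to their values one step earlier."""
    S[i], S[j] = S[j], S[i]   # undo the swap
    j = (j - S[i]) % 256      # S[i] is again the value that was added to j
    i = (i - 1) % 256
    return i, j
```

Every operation in the forward step is a bijection on (S, i, j), so rewinding costs the same as stepping forward and needs no cryptanalysis.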

35
An even simpler attack on forward security
  • Suppose we know the initial values of State and R
  • These variables are never initialized, but rather
    take whatever value is on the stack location in
    which they are stored.
  • These values are quite predictable.
  • Given current value of RC4 state(s) we can rewind
    them to the initial values
  • Now, given initial values of all registers we can
    simulate the RNG.

36
Implications
  • MSFT: "this is a local information disclosure
    vulnerability and has no possibility of code
    execution and cannot be accessed remotely."
  • But,
  • New remote execution attacks are found every
    week.
  • Our attack can be used to amplify their effect.

37
Implications: possible attack scenario
  • Attacker learns state
  • E.g., by using an attack based on a buffer
    overflow, or on physical access.
  • PRNG is implemented in user space rather than
    kernel, so getting the state is easier.
  • Attacker can compute all previous and future
    states and outputs
  • Combining the two attacks, attacker can compute
    all states until state is refreshed with system
    entropy.
  • System entropy based refresh is very rare (occurs
    only after 128KB of output per process).

38
Implications: possible attack scenario
  • Attacker gets access to the machine
  • Buffer overflow, temporary physical access (at a
    café).
  • Attacker learns a single state.
  • Does not need to control the machine afterwards.

39
The new attack
  • Attacker can now compute all states and outputs
    from the previous to the next entropy refresh
  • Does not need any more interaction with the
    system
  • Can now, e.g., decrypt all SSL connections.

(hundreds of SSL sessions)
40
Previously known attacks - key loggers
  • Attacker can only learn about the machine in the
    period of time it owns it
  • Cannot learn about the past
  • To learn about the future it needs a long-lived
    channel with the attacked machine

41
The Attack on Forward Security
42
The generator
43
The attack on forward security: what is known
when the attack begins
44
The attack on forward security
45
The attack on forward security
46
The attack on forward security
47
Looking at the previous round (40 bits are
missing in every register)
48
Looking at the previous round (one step earlier)
49
Completing the attack
50
False positives
  • The attack checks 2^40 options for the missing 5
    bytes
  • True value always gives a match. Each other value
    gives a (false positive) match with probability
    2^-40.
  • False positives are identified if they have no
    preimages.

Not really a problem, since we expect O(k) false
positives at time t-k (analysis using martingales).

[Figure: time axis labeled t, t-1, t-2, t-3]
Overhead of our attack
  • A simple attack requires 2^40 invocations of SHA1
  • A more intricate attack (using the relation
    between + and ⊕) requires 2^23 work on average
  • Our implementation of this attack runs in 19
    seconds (no optimization)
  • Dag Arne Osvik implements SHA1 on the PlayStation
    3
  • Performs 83 Million SHA1 invocations per second
  • ⇒ can implement our attack in 1/10 of a second!
  • Details of improved attack on whiteboard.
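The running-time figures can be checked with a few lines of arithmetic. This sketch assumes that one unit of "work" costs about one SHA1 invocation, which the slide does not state explicitly; the function name is hypothetical.

```python
SHA1_PER_SECOND = 83_000_000  # PlayStation 3 rate quoted on the slide


def seconds_for(work: int, rate: int = SHA1_PER_SECOND) -> float:
    """Time to perform `work` SHA1-sized operations at `rate` per second."""
    return work / rate


simple_attack_hours = seconds_for(2**40) / 3600   # about 3.7 hours
improved_attack_seconds = seconds_for(2**23)      # about 0.1 seconds
```

Under that assumption the 2^23 attack matches the quoted 1/10 of a second, while the simple 2^40 attack would take a few hours at the same rate.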

52
What about XP?
  • The external layer of the PRNG in XP is identical
  • More complex than Windows 2000
  • No forward security here means no forward
    security for the entire PRNG
  • while (Len > 0)
  •     R := R ⊕ SystemFunction036()
  •     State := State ⊕ R
  •     T := SHA-1(State)
  •     Buffer := Buffer | T
  •     // | denotes concatenation
  •     R[0..4] := T[0..4]
  •     // copy 5 least significant bytes
  •     State := State + R + 1
  •     Len := Len - 20
53
News Flash!
  • MSFT's first answer: "(later versions of
    Windows) contain various changes and enhancements
    to the random number generator."
  • MSFT's later answer:
  • XP is vulnerable to the attack.
  • Vista, Windows Server 2008 and Windows Server
    2003 SP2 are not affected by the attack.
  • The XP vulnerability will be fixed in SP3.

54
Future work
  • Investigate other Windows OSs
  • XP, Vista, Mobile.
  • In particular, examine the PRNG-OS interaction.
  • Recommendations
  • Switch to a design which supports forward
    security (e.g., Barak-Halevi ACM CCS 05, which
    provides theoretical guarantees).
  • Perform entropy rekeys more often (but not too
    often).


55
Analysis of the Linux random number generator
  • With Zvi Gutterman, Tzachy Reinman
  • Hebrew University
  • IEEE Symposium on Security and Privacy, 2006
  • Black Hat 2006

56
Linux PRNG (LRNG)
  • Development: started by Theodore Tso in 1994
  • Engineering: implemented in the kernel. Complex
    structure, hundreds of patches to date, changes
    on a weekly basis.
  • Used by many applications
  • TCP, PGP, SSL, S/MIME, ...
  • Two interfaces
  • Kernel interface: get_random_bytes
    (non-blocking)
  • User interfaces: /dev/random (blocking,
    "extremely secure") and /dev/urandom
    (non-blocking)

57
Entropy estimation
  • A counter estimates physical entropy in the LRNG
  • Increased on entropy addition (from OS events)
  • Decreased on output extraction
  • Blocking and non-blocking interfaces
  • Blocking interface does not provide output when
    the entropy estimate reaches zero; it is
    considered more secure.
  • Non-blocking interface always provides output
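The interplay of the counter and the two interfaces can be sketched as a toy model. The class name, the SHA-256 based mixing, and the cap on the estimate are assumptions for illustration; the LRNG uses its own pools and extraction function.

```python
import hashlib

MAX_ENTROPY_BITS = 4096  # illustrative cap on the estimate


class EntropyCountedPool:
    """Toy pool with an entropy estimate and two read interfaces."""

    def __init__(self):
        self.state = bytes(32)
        self.entropy_bits = 0

    def add_event(self, data: bytes, credited_bits: int) -> None:
        # Entropy addition: mix the event in and raise the estimate.
        self.state = hashlib.sha256(self.state + data).digest()
        self.entropy_bits = min(self.entropy_bits + credited_bits,
                                MAX_ENTROPY_BITS)

    def _extract(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            out += hashlib.sha256(b"out" + self.state).digest()
            self.state = hashlib.sha256(b"next" + self.state).digest()
        return bytes(out[:n])

    def read_blocking(self, n: int) -> bytes:
        """Returns only as many bytes as the estimate covers.

        An empty result stands for "the caller would block".
        """
        available = min(n, self.entropy_bits // 8)
        self.entropy_bits -= 8 * available
        return self._extract(available)

    def read_nonblocking(self, n: int) -> bytes:
        """Always returns n bytes, even when the estimate is zero."""
        self.entropy_bits = max(0, self.entropy_bits - 8 * n)
        return self._extract(n)
```

A reader that drains the estimate through `read_blocking` leaves later blocking readers with nothing, which is the denial-of-service concern raised in the comparison with Windows.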

58
On reverse engineering
  • The Linux PRNG is part of the Linux kernel and
    hence its source is open
  • The entire code is 2500 lines written in C
  • The code is unclear, complex and constantly being
    patched. (Some major bugs took more than a year
    to be identified.)
  • Our tools
  • Static analysis
  • Kernel modification: required many kernel builds,
    while ensuring that kernel changes do not affect
    generator entropy usage.
  • We implemented and confirmed our findings with a
    user mode simulator.

59
LRNG structure
[Figure legend: C = entropy collection, A = entropy addition, E = data extraction]
60
Entropy Collection
  • Asynchronous: entropy is constantly gathered and
    added to the pools (unlike Windows).
  • Events are represented by two 32-bit words
  • Event type
  • E.g., mouse press, keyboard value, HD id.
  • Event time: in milliseconds since boot
  • Bad news
  • Actual entropy in every event is very limited
  • Most entropy comes from HD reads/writes. We
    conducted experiments which showed that each
    operation contributes 1 bit.
  • Good news
  • There are many events

61
Entropy Addition
  • Cyclic pool, generalization of LFSR, 32-bit
    words.
  • Different polynomial for each pool size
  • A is a known matrix
  • Polynomial: x^32 + x^26 + x^20 + x^14 + x^7 + x + 1
  • Addition algorithm: g = input, j = current pool
    position
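A simplified sketch of LFSR-style mixing over 32-bit words is given below. It uses the tap positions from the polynomial above, but it is not the kernel's code: the direction in which j moves is an assumption, and the final 1-bit rotation is a stand-in for the kernel's own post-processing of the mixed word.

```python
POOL_WORDS = 32
TAPS = (26, 20, 14, 7, 1)   # exponents of the polynomial, besides 32 and 0
MASK32 = 0xFFFFFFFF


def rol32(w: int, r: int) -> int:
    """Rotate a 32-bit word left by r bits."""
    r %= 32
    return ((w << r) | (w >> (32 - r))) & MASK32


def add_word(pool: list, j: int, g: int) -> int:
    """Mix the input word g into the cyclic pool; returns the new position."""
    j = (j - 1) % POOL_WORDS          # assumed direction of movement
    w = (g & MASK32) ^ pool[j]
    for tap in TAPS:
        w ^= pool[(j + tap) % POOL_WORDS]
    pool[j] = rol32(w, 1)             # stand-in for the kernel's final step
    return j
```

The update is linear over XOR, which is what allows it to be described by a known matrix A.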

62
Output extraction
[Figure: extraction diagram with two "add" steps]
  • After extraction
  • Set i ← i-1
  • Update entropy-estimation

63
Comparison to Windows
  • We showed an attack on forward security in Linux,
    but it is less efficient (2^64 vs. 2^23)
  • The implications of the attack are less severe
    (frequent entropy updates in Linux vs. infrequent
    updates in Windows)
  • Also, in Linux
  • Generator is in kernel space (an advantage)
  • Same generator used by all processes
  • Blocking interface (/dev/random) (a drawback)
  • Susceptible to DoS attacks (even by remote
    attackers).

64
Implications to disk-less systems
65
Implications to disk-less systems
  • More and more systems are using solid state
    (flash) based storage instead of hard disks.
  • The timing of HD r/w operations is unpredictable.
    Solid state operations always have the same
    timing.
  • HD timing is the major source of entropy for the
    Linux PRNG.
  • Other sources of entropy are quite limited: user
    input, system interrupts.
  • They might be guessed by an attacker.
  • Possible threat to the security of the Linux PRNG
    in many future systems.

66
What can be done to seed the generator with more
entropy?
  • Known recommendation: Linux PRNG should simulate
    continuity between shutdown and reboot
  • At shutdown, 512 bytes are read from /dev/urandom.
  • At reboot, these bytes are written to the PRNG.
  • ⇒ An attacker that wants to guess the PRNG state
    must now record/guess all inputs and interrupts
    since the first run of the machine.
  • This is done by the Linux distribution (not the
    kernel).
  • But, Linux distributions on CD/DVD (Knoppix)
    cannot save the state.
  • Many other systems (e.g., the OpenWRT Linux based
    router) do not save the state of their generator.
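The save and restore steps can be sketched as two small functions. The function names and the seed-file path are hypothetical; real distributions do this in their init scripts. Writing to /dev/urandom mixes the bytes into the pool without raising the kernel's entropy estimate.

```python
SEED_BYTES = 512  # amount quoted on the slide


def save_seed(seed_path: str, source: str = "/dev/urandom",
              n: int = SEED_BYTES) -> int:
    """At shutdown: read n bytes from the generator and store them on disk."""
    with open(source, "rb") as src:
        data = src.read(n)
    with open(seed_path, "wb") as f:
        f.write(data)
    return len(data)


def restore_seed(seed_path: str, sink: str = "/dev/urandom") -> int:
    """At boot: write the stored bytes back into the generator."""
    with open(seed_path, "rb") as f:
        data = f.read()
    with open(sink, "wb") as dst:
        dst.write(data)
    return len(data)
```

For example, `save_seed("/var/lib/example/random-seed")` at shutdown and `restore_seed("/var/lib/example/random-seed")` at boot, where the path is illustrative. A system with read-only or no persistent storage has nowhere to keep this file, which is the problem described above.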

67
Analysis of a certain Linux based device
  • A surprising finding: the device always boots
    with one of 6 possible values of the PRNG state.
  • The device uses solid state storage and no HD.
  • The PRNG does not save its state at shutdown.
  • During reboot, the PRNG:
  • Reads values from a hardware based noise
    generator.
  • Copies the system clock onto its state.
  • But, at that time:
  • Hardware noise generator provides non-random
    output.
  • Hardware clock is not loaded to the software
    clock, so the value read from the software clock
    is always fixed.

68
Preliminary results
  • Can predict keys used by this device in SSH
    sessions, and decrypt communication
  • If these sessions are initiated before an input
    from user is received
  • Output of PRNG is a function of the user's input
    only
  • If we observe the output of the PRNG, can we
    deduce the user's input?
  • Yes, if the user enters his input slowly enough!
  • Can this be done by an external attacker?

69
Take-home message
  • Be careful when designing (or using) devices
    without a hard-disk

70
Conclusions
  • The security of the output of pseudo-random
    generators is crucial.
  • It is hard to examine OS based PRNGs
  • The algorithms are not published. One must
    examine the code. This is not easy.
  • Intricate dependencies with the OS: analyzing
    the algorithm alone is not enough.
  • The generators of both Windows and Linux do not
    provide forward security
  • and have additional design issues

71
Conclusions
  • Most severe findings
  • The PRNGs of Windows XP/2000 do not provide
    forward security
  • Affecting > 90% of all PCs
  • Disk-less Linux systems

72
TODO
  • Windows
  • Examine other Windows systems
  • Understand the relation with the OS
    (initialization)
  • Linux
  • Disk-less systems: many new attacks are possible
  • PRNG usage within a virtual machine?
  • Change the design of common PRNGs
  • Use the Barak-Halevi construction