Title: CHAPTER 4: MODERN CRYPTOGRAPHY
1. CHAPTER 4: MODERN CRYPTOGRAPHY
- Modern cryptography differs from classical cryptography in two ways:
  1. Classical methods kept the algorithm secret.
  2. Classical systems were not grounded in science.
- Modern systems assume the algorithm is public knowledge and are science-based.
- Two major technologies in modern cryptography:
  - Symmetric, or secret key, cryptography
  - Asymmetric, or public key, cryptography
2. Secret Key Cryptography Model
[Diagram: source plaintext enters an encryptor; the ciphertext crosses a non-secure channel, exposed to an attacking cryptanalyst, to a decryptor that outputs the recovered plaintext. A single secret key is delivered to both ends over a secure channel.]
3. Public Key Cryptography - Type 1 Model
Type 1 function: authentication of source (no privacy).
[Diagram: a split key generator delivers a private key to the encryptor and a public key to the decryptor, both over un-secured channels. Source plaintext is encrypted, sent over an un-secured channel exposed to an attacking cryptanalyst, and decrypted to the recovered plaintext.]
4. Public Key Cryptography - Type 2 Model
Type 2 function: ensures privacy.
[Diagram: as in the Type 1 model, but with the key roles reversed - the split key generator delivers the public key to the encryptor and the private key to the decryptor.]
5. Cryptography - Ground Rules
Adversaries have access to the ciphertext. Adversaries know the encryption algorithm. These are simply real-world realities.
The secret key is kept secret and is not available to an adversary. If distributed, it is distributed over a secure channel. Difficult, but doable.
6. Cryptography - Ground Rules
The ciphertext is randomized and carries no statistics of the plaintext (it appears as if the plaintext were composed of random characters). Not easy - messages have structure by definition - but some things can be done: compression removes structure, though the main job of eliminating structure falls to the cipher.
The secret key is composed of random characters. Hard - it depends on the quality of the key generation (a random number generator issue).
7. Secret Key Cryptography - Scientific Basis
Classical cryptography has been known for centuries and has worked to one degree or another. However, it was not founded on science: there was no way to prove that a particular system would work or to assess its strength. That changed in 1949, when Claude Shannon at Bell Labs asked and answered two questions.
8. Shannon's Questions
How secure is a system against cryptanalysis if the adversary has unlimited time and manpower available for the analysis of intercepted cryptograms? This establishes the requirements for perfect secrecy.
How secure is the system when a cryptanalyst has a limited amount of time and computational power available for the analysis of intercepted cryptograms? This establishes the requirements for practical secrecy.
9. Shannon's Conditions for Perfect Secrecy
One of two conditions is required for theoretically perfect security. Either:
- The key must be random, used only once, and at least as long as the plaintext, or
- The message M to be encrypted must be composed of random characters, or randomized during encryption, such that there is no statistical relationship between the plaintext and the ciphertext.
10. Shannon's Ground Rules
Only M digits of the plaintext will be encrypted before the key and/or randomizer are changed. The adversary only has access to the ciphertext.
Shannon approached the problem using the theory of uncertainty: if uncertainty could be made infinite or very large, then it would be possible to have perfect or practical secrecy.
11. Shannon Uncertainty
Shannon argued that H(P = p | C = c) = H(P = p) for all possible plaintexts p = p1, p2, ..., pn and specific ciphertexts c. He shortened this to H(P|C) = H(P). This means the uncertainty (H) about P is unchanged given complete knowledge of C - and large uncertainties are desired.
12. Shannon Uncertainty
Some additional observations: H(P, C) = H(P) + H(C|P). Meaning: the uncertainty about P and C together equals the uncertainty of P plus the uncertainty of C given knowledge of P (the chain rule of mathematical uncertainty).
13. More Uncertainty Observations
An encryption system can be described as follows:
C = Ekr(P)  (encryption of P using key k and randomizer r)
P = Dkr(C)  (decryption of C using the same k and r)
The use of k (key) and r (randomizer), where the key is not available to the adversary, is what creates the uncertainty (also called entropy) we are after in encryption.
14. More Uncertainty Observations
Joining encryption and uncertainty theory:
H(C|P,K,R) = 0  (knowing P, K, and R determines C exactly). This is encryption, so the uncertainty about C is 0 (none).
H(P|C,K) = 0  (knowing C and K determines P exactly). This is decryption, so the uncertainty about P is 0 (none).
The point is: the uncertainty of C under encryption is 0, and the uncertainty of P under decryption is 0, given knowledge of (P, K, R) and (C, K) respectively.
15. Getting to the Result
Shannon observed that P and C must be statistically independent for perfect secrecy:
H(P|C) = H(P)  (the uncertainty about P given C equals the uncertainty of P, i.e., knowing C doesn't disclose P)
This leads to certain inequalities:
H(P|C) <= H(P,K|C)  (given C, there is at least as much uncertainty about P and K together as about P alone)
H(P,K|C) = H(K|C) + H(P|C,K) = H(K|C) + 0, since H(P|C,K) = 0 (decryption)
So, H(P|C) <= H(K|C).
16. The Result
It is this result that leads to:
Condition 1: The uncertainty about the key K must be at least as great as the uncertainty about the plaintext P that K is encrypting (i.e., a random key).
Condition 2: The plaintext message must be random. Reality says this isn't the usual case. This relaxes the randomness requirement on the key, but at the cost of imposing it on the message itself (no free lunch).
17. Shannon's Results
The key must still be as long as the message. Either condition 1 or condition 2 will produce perfect secrecy. Both are difficult to implement in practice. Shannon addressed this by asking "How good is good enough?"
18. Shannon's Results - Practical Security
- Defined the requirements necessary against an enemy who has only a limited amount of time and limited computational resources.
- Mathematical basis - beyond our interest, but we need to understand five key ideas:
  - Entropy / uncertainty
  - Language rates
  - Key equivocation function
  - Plaintext redundancy
  - Work functions
19. Entropy / Uncertainty
Classical entropy: a measure of the disorder in a closed (no energy input or output) thermodynamic system; the degradation of the matter and energy in the universe to a final state of inert uniformity. As energy is expended it redistributes, and the energy available for work gets smaller.
20. Entropy / Uncertainty
Information entropy: a measure of the amount of information in a message, based on the logarithm of the number of possible messages for the given size of the message. For example, a one-character message has one of 26 possibilities (for alphabetic English). So it is a measure of possibilities, where the possibilities introduce uncertainty! (1 bit -> 2 possibilities; 2 bits -> 4.)
21. Information Entropy
Information theory: the amount of information in a message is defined as the minimum number of bits required to encode all possible meanings of the message. For example, the gender message could be encoded as
0 = Male
1 = Female
This encoding of the gender message has an entropy H(M) of exactly 1 bit - it takes at least one bit to encode.
22. Information Entropy
Example 2 - Day-of-the-week:
000 = Sunday, 001 = Monday, 010 = Tuesday, 011 = Wednesday, 100 = Thursday, 101 = Friday, 110 = Saturday, 111 = unused.
Requires 3 bits. The entropy, or uncertainty, H(M), is log2(7), or slightly less than 3 bits (only 7 of the 2^3 = 8 codes are used).
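The entropy calculation above can be sketched in Python (a minimal illustration assuming equally likely messages; `entropy_bits` is a hypothetical helper name, not from the lecture):

```python
import math

def entropy_bits(n_messages: int) -> float:
    """H(M) = log2(number of equally likely messages), in bits."""
    return math.log2(n_messages)

print(entropy_bits(2))  # gender message: 1.0 bit
print(entropy_bits(7))  # day-of-the-week: ~2.81 bits, needing 3 bits to encode
```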
23. Language Rate
The theoretical absolute rate of a language is the maximum number of bits that can be coded in each character, assuming each character sequence is equally likely. The rate of a language (e.g., English) can be specified in bits:
R = log2(L), where R is the rate and L is the number of characters in the language.
24. Language Rate
R = log2(L), where R is the rate and L is the number of characters in the language. For English, R = log2(26), or about 4.7 bits per character. This is theoretical. In any real language each character sequence is not equally likely. For example, the sequence "the" is more likely than "qua" (once "th" is specified, the next letter is limited: "the", "thi", "thr", ... but no "thb").
25. Language Rate
The actual rate of English is smaller because the language is highly redundant. The real rate r for a language is given by r = H(M)/N, where N is the length of the message. For long messages (large values of N), r is between 1.0 and 1.5 bits/character. Shannon calculated a rate of 2.3 bits/character for 8-character messages. This is about half of the theoretical rate R.
26. Language Redundancy
The redundancy, D, of a language is given by D = R - r. For R = 4.7 and r = 1.3, D = 4.7 - 1.3 = 3.4 bits/character. For ASCII, 1 character = 8 bits, so 8 - 1.3 = 6.7 redundant bits, or about 6.7/8 = 0.84 redundant bits per bit and 0.16 bits of information per bit. The actual rate is length (N) dependent and ranges from 1.0 to 2.3. Bottom line: English is highly redundant, and so is ASCII, which is just another representation of English.
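The rate and redundancy arithmetic can be checked with a short sketch (using r = 1.3, an assumed value from the slide's 1.0-1.5 range):

```python
import math

R = math.log2(26)        # absolute rate of English: ~4.7 bits/character
r = 1.3                  # actual rate, taken from the slide's 1.0-1.5 range
D = R - r                # language redundancy: ~3.4 bits/character

ascii_redundant = 8 - r  # ~6.7 redundant bits per 8-bit ASCII character,
                         # i.e., about 0.84 redundant bits per bit
print(f"R={R:.1f}  D={D:.1f}  ASCII redundancy={ascii_redundant:.1f} bits/char")
```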
27. Key Equivocation Function
f(n) = H(K | Y1, Y2, ..., Yn), where f(n) is the uncertainty about the key K after examining the first n digits of the ciphertext Y. The unicity distance u is the smallest n where f(n) ≈ 0. Meaning: given u digits of ciphertext, and not before, there will be only one value of the secret key that will correctly decode c1, ..., cn.
28. Key Equivocation Function
The expression for unicity distance is u = H(K)/D, where H(K) is the key uncertainty (entropy) - log2 of the number of possible keys, which for a random k-bit key is simply k bits - and D is the fractional redundancy of the information contained in the N-digit cryptogram, made up of symbols from the alphabet Ly.
29. Key Equivocation / Unicity Distance
For practical systems, D is just the redundancy of the plaintext message. For English in ASCII, D ≈ 0.84 redundant bits per bit (from the language-redundancy slide). So u = H(K)/D. For DES (56-bit key) and ASCII plaintext, u = 56/0.84 ≈ 66 bits, or about 8.3 ASCII characters.
30. Unicity Distance
Bottom line: only about 8 ASCII characters of encrypted information are enough for there to be only one valid key that will uniquely decrypt the message. This is due to redundancy in the underlying language. As we make D small compared to the key length, u = H(K)/D increases rapidly. For randomized or compressed plaintext, even with short keys, D can be very small - that is, very little redundancy in the cryptogram!
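The DES unicity-distance figure can be sketched directly (a minimal illustration; 0.84 is the ASCII-English redundancy per bit from the language-redundancy slide):

```python
def unicity_distance(key_bits: float, redundancy_per_bit: float) -> float:
    """u = H(K)/D: ciphertext needed before only one key decrypts sensibly."""
    return key_bits / redundancy_per_bit

u = unicity_distance(56, 0.84)   # DES 56-bit key, ASCII-English redundancy
print(f"{u:.0f} bits, about {u / 8:.1f} ASCII characters")
```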
31. Unicity Distance - Making D Small
Before encryption, compress or randomize the message.
Original:   This is a sample of data compression.
Compressed: Thisisasampleofdatacompression.
Randomized: xTHISisyyazzzsampleaofbbdataccompressionc
These methods won't make D near 0, but they will reduce it significantly. The discussion so far provides some insights, but little real help. So Shannon added one final observation.
32. The Work Function Observation
Shannon defined a work function W(n): the average amount of work required (e.g., computation in CPU hours, MIPS, etc.) to find the key given n digits of ciphertext. Determine the limit of W(n) as n approaches infinity (large messages). This is difficult - it attempts to predict events that can't be measured, such as processing infinite-length messages. Even large n is hard to determine.
33. The Work Function Observation
Shannon improvised by suggesting the use of the historical work function Wh(n), meaning the amount of work required to break an encryption using the best known method of attack. This puts a practical face on the problem and can readily be solved.
34. Best Known Attacks
Known attacks are developed by cryptography research and published in the literature. Known attacks include searching the key space (brute force) and factoring large numbers into their prime factors (for public key systems). While attack methods are widely discussed, the methods developed by the spooks (i.e., NSA, CIA) are not. The best method may be a secret and not disclosed by an adversary (e.g., a foreign intelligence service, hacker, etc.).
35. Best Known Attacks
New methods are being developed all the time, some in the open, others in secrecy. The result is that Wreal(n) may be << Wh(n). Clearly Shannon didn't solve all the problems, but he did put a scientific underpinning under the field for both theoretical and practical security. We now know what it takes to be secure.
36. Perfect Secrecy
Condition 1: The key is randomly selected, is used only once, is at least as long as the message, and is kept secret. This condition applies independent of the statistics of the message. This is feasible - it is called the one-time pad - but it is cumbersome (due to the pad and key-length requirement). Rarely used.
37. Perfect Secrecy
Condition 2, alternatively: the plaintext and ciphertext must be statistically independent (no plaintext can be inferred from the statistics of the ciphertext). This is feasible: the plaintext must be completely randomized during encryption. We can come very close.
38. Perfect Secrecy - One-Time Pads
The adversary has only the ciphertext.
Algorithm: C = P + K mod 26 (polyalphabetic shift cipher)
Plaintext: ONETIMEPAD
Key:       TBFRGFARFM  (K is the secret)
Cipher:    IPKLPSFHGQ
since O + T mod 26 = I, N + B mod 26 = P, etc.
39. Perfect Secrecy - One-Time Pads
Now assume a different plaintext and key:
Plaintext: SALMONEGGS  (was ONETIMEPAD)
Key:       POYYAEAAZX  (was TBFRGFARFM)
Cipher:    IPKLPSFHGQ  (the same ciphertext)
Point: both produce the same ciphertext. Every 10-character plaintext can produce this same ciphertext under some key, so the adversary doesn't know which plaintext is correct. Unbreakable without some additional information!
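The one-time-pad arithmetic above can be checked with a short sketch (assuming the letters are numbered A=1 ... Z=26, which matches the slide's O + T mod 26 = I example; under that numbering the final ciphertext character works out to Q):

```python
def shift_encrypt(plain: str, key: str) -> str:
    """Polyalphabetic shift: c = (p + k) mod 26, letters numbered A=1..Z=26."""
    return "".join(
        chr((ord(p) - ord("A") + ord(k) - ord("A") + 1) % 26 + ord("A"))
        for p, k in zip(plain, key)
    )

c1 = shift_encrypt("ONETIMEPAD", "TBFRGFARFM")
c2 = shift_encrypt("SALMONEGGS", "POYYAEAAZX")
print(c1, c2)  # both plaintext/key pairs yield the same ciphertext
```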
40. One-Time Pads
What's wrong? SCALABILITY! Encryption turns long secrets (messages) into short secrets (keys), and short keys are easier to exchange. One-time pads do not shorten secrets: the secret message and the pad are of equal length. It is just as hard to courier the pad as it would be to courier the entire message. It is really a matter of practicality: it can work in very special situations, but not in mass-market encryption.
41. Secret Key Systems
Three components for real systems: a plaintext message to be encrypted, an encryption/decryption algorithm, and an encrypting/decrypting key.
Characteristics:
- The key is symmetric (the same key encrypts and decrypts).
- The algorithm is public, with no secrecy requirement.
- A cryptographically strong algorithm resists breakage.
- A strong key, distributed securely and kept secret by the user.
42. Block and Stream Ciphers
Both encrypt messages; how they do it is slightly different.
Block:
- Divide the message into blocks and encrypt one block at a time.
- Blocks are typically 64/128 bits (8/16 characters).
- The same key and algorithm are used on every block.
- Identical blocks produce the same output.
Stream:
- Divide the message into bits or bytes and encrypt one bit or byte at a time.
- Use the same starting key, but the key changes per bit/byte.
- Identical blocks produce different outputs.
43. Block Ciphers
Like a table look-up. For each key, build a table: the left column holds all possible plaintext blocks of length 64 (2^64 of them), and the right column holds the corresponding ciphertext blocks of length 64 (or whatever the actual block length is). NSA calls these cipher codebooks (they build such things); they have some 18 billion entries for the smallest block codes. Easy on a computer, but it takes lots of storage. Typically used for encrypting text messages, like computer messages or files.
44. Stream Ciphers
A feedback state machine that encrypts streaming messages (e.g., telephone, video):
- Current state (key-based)
- Ciphertext transform: f(text input, current state)
- Next state: updates the current state as f(current state, ciphertext output)
Block and stream methods are converging. Block ciphers are adopting feedback like stream ciphers; in fact, block ciphers are used as the transformation engines inside stream ciphers.
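The feedback state machine above can be sketched as a toy byte-level stream cipher (the state-update function here is an arbitrary illustration, not any real cipher; it only shows the current-state / transform / next-state structure):

```python
def toy_stream(data: bytes, key: int, decrypt: bool = False) -> bytes:
    state = key & 0xFF                      # current state (key-based)
    out = bytearray()
    for b in data:
        o = b ^ state                       # ciphertext transform f(input, state)
        out.append(o)
        c = b if decrypt else o             # feed back the *ciphertext* byte
        state = (state * 5 + c + 1) & 0xFF  # next state f(current state, ciphertext)
    return bytes(out)

ct = toy_stream(b"attack at dawn", 0x4B)
print(toy_stream(ct, 0x4B, decrypt=True))   # round-trips to the plaintext
```

Because the state changes per byte, identical plaintext bytes encrypt to different ciphertext bytes, unlike a block cipher applied block-by-block.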
45. Inside the Cipher Algorithm
At the core of every symmetric cipher is an engine that performs a series of substitutions and permutations on the input message.
Substitution replaces one binary string with another: it takes an n-bit input and produces an n-bit output, with the number of substitutions = n x 2^n. For a 64-bit block (a typical block size), possible outputs = 64 x 2^64 = 2^6 x 2^64 = 2^70 ≈ 10^21.
46. Substitution and Permutation
Substitution can be done with hardware or software and is done in a construct called an S-box.
Permutation re-orders the bits in a string: an n-bit input produces an n-bit permuted output (e.g., 1010 permuted to 0111). The number of possible permutations is n factorial (n!). Permutation can be done with hardware or software and is done in a construct called a P-box.
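A minimal sketch of the two constructs (the 3-bit S-box table and the 8-bit P-box wiring are illustrative choices, not from any real cipher; the wiring is picked to reproduce the slide's 10010011 -> 01010110 example):

```python
SBOX = [3, 6, 0, 5, 7, 1, 4, 2]   # one 3-bit substitution (a permutation of 0..7)

def substitute(x: int) -> int:
    """Replace one 3-bit string with another via table look-up."""
    return SBOX[x]

PERM = [1, 0, 2, 3, 4, 6, 7, 5]   # output bit i is wired to input bit PERM[i]

def permute(bits: str) -> str:
    """Re-order the bits of an 8-bit string."""
    return "".join(bits[i] for i in PERM)

print(substitute(0b010))          # one substitution from the table
print(permute("10010011"))        # the P-box example mapping
```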
47. S-Box
[Diagram: a substitution box built from a 3-to-8 decoder feeding an 8-to-3 encoder; the wiring between them defines the substitution. For n input bits there are n x 2^n possible output combinations.]
Note: the specific substitution is selected by the input key. In the example, only one substitution is shown for a 3-bit input, while 2^3 are possible.
48. P-Box
[Diagram: a permutation box that re-wires 8 input bits to 8 output positions; input 10010011 maps to output 01010110. A P-box for n input bits has n! possible keys.]
49. S- and P-Box Functions
- S-boxes perform Shannon's confusion function and P-boxes perform the diffusion function (Stallings pg. 67).
- The purpose is to remove the statistics of the original plaintext by having the following properties:
  1. Changing one input bit results in changes to half, or more, of the output bits - called the avalanche effect.
  2. Each output bit is a complex function of all the input bits - called the completeness effect.
50. S- and P-Box Functions
Diffusion dissipates the statistical structure of the plaintext by having each input bit affect many output bits. Confusion hides or minimizes the relationship between the ciphertext and the key, to avoid key-structure attacks that would allow the key to be recovered.
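The avalanche effect is easy to observe with any modern primitive built from such rounds; a sketch using SHA-256 (a hash rather than a cipher, used here only to illustrate the property - the two inputs differ in a single bit, since '0' is 0x30 and '1' is 0x31):

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

h1 = hashlib.sha256(b"avalanche0").digest()
h2 = hashlib.sha256(b"avalanche1").digest()
print(bit_difference(h1, h2), "of 256 output bits changed")  # roughly half
```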
51. Confusion Example
Consider a simple XOR of a key operating on two blocks (AB, AB) of identical plaintext:
Plain (AB AB):   11000000 11000001   11000000 11000001
Key:             10110100 01101011   10110100 01101011
XOR ciphertext:  01110100 10101010   01110100 10101010
The characters in the two blocks are the same, so they encode to the same result. As "AB" repeats in the cipher blocks, the statistics of the language emerge. If a pair can be guessed, key recovery is just the XOR of the plaintext and the ciphertext.
52. Remedy: Block-to-Block Chaining
Plaintext (AB AB):  11000000 11000001
Key:                10110100 01101011
XOR ciphertext:     01110100 10101010
Now shift the ciphertext and XOR it with the next block of plaintext, then XOR the key:
Shifted ciphertext: 11101001 01010100
Plaintext (AB AB):  11000000 11000001
XOR:                00101001 10010101
Key:                10110100 01101011
XOR 2:              10011101 11111110
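The chaining arithmetic on this slide can be checked directly (a sketch; the one-bit left shift of the 16-bit ciphertext is inferred from the slide's numbers):

```python
p = [0b11000000, 0b11000001]   # plaintext blocks "AB AB"
k = [0b10110100, 0b01101011]   # key
c = [pi ^ ki for pi, ki in zip(p, k)]        # first-pass ciphertext

word = (((c[0] << 8) | c[1]) << 1) & 0xFFFF  # shift 16-bit ciphertext left one bit
shifted = [word >> 8, word & 0xFF]

out = [format(s ^ pi ^ ki, "08b") for s, pi, ki in zip(shifted, p, k)]
print(out)   # the two chained blocks now differ
```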
53. Diffusion
Diffusion makes the statistical relationship between the plaintext and the ciphertext complex by having each input bit affect many output bits.
Example: for a message M = m1, m2, m3, ..., mn characters, encrypt using an averaging operation such that
yn = sum(i = 0 to k) of m(n+i), mod 26.
This adds k+1 successive letters starting with mn to create the ciphertext character yn.
54. Diffusion - Example
Consider the string 1, 4, 5, 3, 7, 9, 1, 4, 6, 8, 3 with
yn = sum(i = 0 to k) of m(n+i), mod 10, for k = 3:
y1 = 1 + 4 + 5 + 3 mod 10 = 3  (encrypted 1)
y2 = 4 + 5 + 3 + 7 mod 10 = 9  (encrypted 4)
...
y7 = 1 + 4 + 6 + 8 mod 10 = 9  (encrypted 1)
y8 = 4 + 6 + 8 + 3 mod 10 = 1  (encrypted 4)
Some of the statistical structure has been dissipated.
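The averaging transform can be sketched and checked against the slide's numbers:

```python
def diffuse(m, k, modulus):
    """y_n = (m[n] + m[n+1] + ... + m[n+k]) mod modulus, the sliding sum above."""
    return [sum(m[n:n + k + 1]) % modulus for n in range(len(m) - k)]

m = [1, 4, 5, 3, 7, 9, 1, 4, 6, 8, 3]
y = diffuse(m, 3, 10)
print(y)   # y1..y8; note the repeated 1s and 4s in m map to varying outputs
```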
55. Reading
Stallings, Chapter 2; Mahan, Chapter 3 (Lecture Notes)