Title: Private Information Retrieval
1Private Information Retrieval
2Contents
- What is Private Information Retrieval (PIR)?
- Reduction from Private Information Retrieval
(PIR) to Smooth Codes
- Constructions (Achieving the Barrier)
- Construction (Breaking the Barrier)
3Private Information Retrieval (PIR)
- Query a public database without revealing the
queried record.
- Example: A broker needs to query the NASDAQ database
about a stock, but doesn't want anyone to know he
is interested.
4PIR
- The Single Server Case
- Chor et al. showed in their 1995 paper that,
with a single server, information-theoretic privacy
requires sending the entire content of the database.
5PIR
- A one-round $k$-server PIR scheme for a database
of length $n$ consists of
6PIR definition
- These functions should satisfy
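One common way to spell out such a definition is sketched below. It assumes the scheme is given by query, answer and reconstruction functions; the names $Q_j$, $A_j$, $R$ and the random string $r$ are assumptions for illustration, not necessarily the notation used on these slides.

```latex
% A one-round k-server PIR scheme for x \in \{0,1\}^n (assumed notation):
%   Q_j(i, r): query sent to server j for index i and user randomness r
%   A_j(x, q): answer of server j to query q
%   R(i, r, a_1, \dots, a_k): reconstruction performed by the user
\begin{align*}
\text{Correctness:}\quad
  & R\bigl(i, r, A_1(x, Q_1(i,r)), \dots, A_k(x, Q_k(i,r))\bigr) = x_i
  && \forall x,\ i,\ r \\
\text{Privacy:}\quad
  & \Pr_r[Q_j(i, r) = q] = \Pr_r[Q_j(i', r) = q]
  && \forall j,\ i,\ i',\ q
\end{align*}
```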
7Simple Construction of PIR
- 2 servers, one round
- Each server holds the bits $x_1, \dots, x_n$.
- To request bit $i$, choose a uniformly random subset $A$ of $[n]$.
- Send $A$ to the first server.
- Send the second server $A \oplus i$ (add $i$ to $A$ if it is
not there, remove it if it is there).
- Each server returns the XOR of the bits at the indices
in its request.
- XOR the answers.
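The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, indices are 0-based, and both "servers" are simulated locally on the same list `x`.

```python
import secrets

def make_queries(n, i):
    """Return the two queries for index i: a random subset A and A with i toggled."""
    a = {j for j in range(n) if secrets.randbits(1)}
    b = a ^ {i}  # symmetric difference: add i if absent, remove it if present
    return a, b

def server_answer(x, query):
    """A server's reply: XOR of the database bits at the queried indices."""
    ans = 0
    for j in query:
        ans ^= x[j]
    return ans

def retrieve(x, i):
    """Simulate the user: query both servers and XOR the two 1-bit answers."""
    a, b = make_queries(len(x), i)
    return server_answer(x, a) ^ server_answer(x, b)

x = [1, 0, 1, 1, 0, 0, 1, 0]
print([retrieve(x, i) for i in range(len(x))] == x)
```

Each query on its own is a uniformly random subset, so a single server learns nothing about $i$; the two queries differ only in index $i$, so every other bit cancels in the XOR. The price is a query of $n$ bits to each server.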
8Smoothly Decodable Code
$C: \{0,1\}^n \to \Sigma^m$ is a $(q, c, \epsilon)$ smoothly decodable code
if there exists a probabilistic algorithm $A$ such that
$\forall x \in \{0,1\}^n$ and $\forall i \in \{1,\dots,n\}$, $\Pr[A(C(x), i) = x_i] \ge 1/2 + \epsilon$
The probability is over the coin tosses of $A$
$A$ has access to a non-corrupted codeword
$A$ reads at most $q$ indices of the codeword $y$ (of its choice)
Queries are not allowed to be adaptive
$\forall i \in \{1,\dots,n\}$ and $\forall j \in \{1,\dots,m\}$, $\Pr[A(\cdot, i) \text{ reads } j] \le c/m$
9LDC is Smooth
- Claim: Every $(q, \delta, \epsilon)$ LDC is a $(q, q/\delta, \epsilon)$ smooth
code.
- Intuition: If the code is resilient against a
linear number of errors, then no bit of the
output can be queried too often (or else the
adversary will choose to corrupt it).
10Smooth Code is LDC
- A bit can be reconstructed using $q$ uniformly
distributed queries, with $\epsilon$ advantage, when there are no
errors.
- With probability at least $1 - q\delta$, all the queries are to
non-corrupted indices.
Remember: the adversary does not know the decoding
procedure's random coins.
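The step from smoothness to error tolerance is a union bound over the corrupted positions. A short sketch, assuming a $(q, c, \epsilon)$ smooth code of length $m$ and an adversary that corrupts at most $\delta m$ positions; when each of the $q$ queries is individually uniform, $c = q$ and the bound becomes the $q\delta$ of the slide.

```latex
\begin{align*}
\Pr[\text{some query hits a corrupted index}]
  &\le \sum_{j \text{ corrupted}} \Pr[A(\cdot, i) \text{ reads } j]
   \le \delta m \cdot \frac{c}{m} = c\,\delta \\
\Pr[A \text{ outputs } x_i \text{ on the corrupted word}]
  &\ge \tfrac12 + \epsilon - c\,\delta
\end{align*}
```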
11Reduction from PIR to SDC [Gol, Ka, Sch, Tr 02]
- A codeword is a concatenation of all possible
answers from the servers.
- A query procedure is made of $k$ queries to the
codeword, corresponding to the answers of the $k$
servers on the requested bit (for queries
generated as in the PIR).
- From the PIR properties it follows that the
distribution of queries to the indices of the
codeword is independent of the requested bit.
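As an illustration of this reduction, the sketch below builds the codeword obtained from the simple 2-server XOR scheme of the earlier slide. The layout of the codeword (server 0's answers followed by server 1's, queries encoded as bitmasks) and all function names are assumptions made for the example.

```python
def encode(x):
    """Codeword: for each of the 2 servers and each possible query
    (a subset of indices given as a bitmask), the server's 1-bit answer."""
    n = len(x)

    def answer(mask):
        bit = 0
        for j in range(n):
            if (mask >> j) & 1:
                bit ^= x[j]
        return bit

    per_server = [answer(mask) for mask in range(1 << n)]
    return per_server + per_server  # both servers answer queries the same way

def decode(codeword, n, i, mask):
    """Read k = 2 positions: (server 0, A) and (server 1, A with i toggled).
    `mask` plays the role of the user's random subset A."""
    pos0 = mask
    pos1 = (1 << n) + (mask ^ (1 << i))
    return codeword[pos0] ^ codeword[pos1]

x = [1, 0, 1, 1]
c = encode(x)
print(all(decode(c, 4, i, mask) == x[i] for i in range(4) for mask in range(16)))
```

For a uniformly random `mask`, each of the two positions read is uniform within its half of the codeword, whatever the value of $i$, which is the smoothness property the reduction relies on.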
12Reduction from PIR to SDC
- Let $a$ be the length of an answer from a server, $k$
the number of servers and $q$ the length of a query.
- Let $l$ be the length of a codeword.
- Let $P_j$ be the probability of querying bit $j$. Note
that $\sum_j P_j \le k$.
- Set $N_j = \lceil l P_j \rceil$ and duplicate bit $j$ $N_j$ times. When
querying for bit $j$, choose at random one of the $N_j$
copies.
13Reduction from PIR to SDC
- The probability of accessing each bit is now at most
$1/l$.
- The new length of the encoding is less than
$(k+1)l$.
- We have a $(ka, k+1, 1/2)$ smooth code, and hence an LDC.
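The two bounds can be checked directly. The derivation below is a sketch under two assumptions that are consistent with the stated bounds: the duplication count is $N_j = \lceil l P_j \rceil$, and $\sum_j P_j \le k$ because the decoder reads at most $k$ positions.

```latex
\begin{align*}
\Pr[\text{a fixed copy of bit } j \text{ is read}]
  &= \frac{P_j}{N_j} \le \frac{P_j}{l\,P_j} = \frac{1}{l}
  \qquad (P_j > 0) \\
\text{new length } m' = \sum_{j=1}^{l} N_j
  &< \sum_{j=1}^{l} \bigl(l\,P_j + 1\bigr)
   = l \sum_{j} P_j + l \le (k+1)\,l \\
\frac{1}{l} &< \frac{k+1}{m'}
  \quad\Longrightarrow\quad c = k+1 \text{ suffices in the smoothness bound } c/m'
\end{align*}
```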
14Achieving the Barrier
- Ingredients
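- $x$: the database string
- $E$: an encoding of each index $i$ as a string of length $m$
- $P_x(Z_1, \dots, Z_m)$: a polynomial in $m = \Theta(n^{1/d})$ variables
of degree $d$ s.t. $P_x(E(i)) = x_i$
- $Y_1, \dots, Y_k$: random vectors s.t. $\sum_j Y_j = E(i)$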
15Achieving the Barrier
- The user generates the $Y_j$ and sends all $Y_q$, $q \ne j$,
to server $j$.
- We can view $P_x$ as a polynomial in the $km$
variables $Y_{j,l}$, where $\sum_j Y_{j,l} = Z_l$.
- Each server knows the values of $(k-1)m$ variables.
- Let $d = k-1$; hence each monomial of $P_x$ has at most
$k-1$ different variables.
16Achieving the Barrier
- Each variable is known to $k-1$ servers, hence
there exists a server who knows the values of all
the variables in the monomial.
- Assign each monomial to one of the servers who
know all its variables.
17Achieving the Barrier
- Each server calculates the XOR of the monomials
assigned to it and sends it to the user.
- The user calculates the XOR of all the answers.
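The degree $d = k-1$ scheme described on the last few slides can be simulated end to end. The sketch below makes several assumptions that are not spelled out on the slides: arithmetic is over GF(2), $E(i)$ is the $i$-th weight-$d$ string of length $m$, $P_x(Z) = \sum_i x_i \prod_{l \in E(i)} Z_l$, and each expanded monomial is assigned to the lowest-numbered server that knows all its variables. All names are hypothetical, and `random` is used only for illustration (it is not a cryptographic source).

```python
import itertools
import random

def encode_indices(n, m, d):
    """Assumed encoding E: index i -> support of the i-th weight-d string of length m."""
    supports = list(itertools.islice(itertools.combinations(range(m), d), n))
    if len(supports) < n:
        raise ValueError("need C(m, d) >= n")
    return supports

def share(support, m, k, rng):
    """Split Z = E(i) into k vectors Y_0..Y_{k-1} whose bitwise XOR is Z."""
    last = [1 if l in support else 0 for l in range(m)]
    shares = []
    for _ in range(k - 1):
        y = [rng.randrange(2) for _ in range(m)]
        last = [a ^ b for a, b in zip(last, y)]
        shares.append(y)
    return shares + [last]

def server_answer(j, x, supports, known, k):
    """Server j XORs the expanded monomials assigned to it.
    `known` maps q != j to Y_q; Y_j itself is never seen by server j."""
    d = len(supports[0])
    acc = 0
    for i, support in enumerate(supports):
        if not x[i]:
            continue
        # Expanding prod_l (sum_q Y_q[l]) gives k^d terms, one per choice of owners.
        for owners in itertools.product(range(k), repeat=d):
            # d = k-1 owners cannot cover all k servers, so some server is unused.
            assigned = min(s for s in range(k) if s not in owners)
            if assigned != j:
                continue
            term = 1
            for l, q in zip(support, owners):
                term &= known[q][l]
            acc ^= term
    return acc

def retrieve(x, i, k, m, rng=None):
    rng = rng or random.Random()
    supports = encode_indices(len(x), m, k - 1)
    shares = share(set(supports[i]), m, k, rng)
    result = 0
    for j in range(k):
        known = {q: shares[q] for q in range(k) if q != j}
        result ^= server_answer(j, x, supports, known, k)
    return result

x = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
print(all(retrieve(x, i, k=3, m=5) == x[i] for i in range(len(x))))
```

Each server returns a single bit, so the cost is dominated by the $(k-1)m$ query bits sent to each server.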
18Achieving the Barrier
- Security: each server receives $k-1$ vectors,
which are independent random strings of length $m$.
- Communication complexity: each server receives
$k-1$ vectors, each of length $m = O(n^{1/d})
= O(n^{1/(k-1)})$ by the choice of $m$ and $d$.
19Achieving the Barrier
- Now take $d = 2k-1$.
- Each monomial has a server who misses at most 1
variable; assign the monomial to that server.
- Each server sends the 1-bit coefficients of the
polynomial which is the sum of all monomials
assigned to it.
- The user evaluates the polynomial on the
variables $Y$.
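The $d = 2k-1$ variant can be simulated in the same style. As before, this is only a sketch under assumptions: arithmetic over GF(2), $E(i)$ taken as the $i$-th weight-$d$ string, $P_x(Z) = \sum_i x_i \prod_{l \in E(i)} Z_l$, and ties between eligible servers broken by lowest index. Here each server returns a polynomial of degree at most 1 in the $m$ variables it does not know, represented by a constant and $m$ one-bit coefficients.

```python
import itertools
import random

def encode_indices(n, m, d):
    """Assumed encoding E: index i -> support of the i-th weight-d string of length m."""
    supports = list(itertools.islice(itertools.combinations(range(m), d), n))
    if len(supports) < n:
        raise ValueError("need C(m, d) >= n")
    return supports

def share(support, m, k, rng):
    """Split Z = E(i) into k vectors whose bitwise XOR is Z."""
    last = [1 if l in support else 0 for l in range(m)]
    shares = []
    for _ in range(k - 1):
        y = [rng.randrange(2) for _ in range(m)]
        last = [a ^ b for a, b in zip(last, y)]
        shares.append(y)
    return shares + [last]

def server_polynomial(j, x, supports, known, k, m):
    """Server j returns (const, coeffs): a degree <= 1 polynomial in Y_j[0..m-1]."""
    const, coeffs = 0, [0] * m
    d = len(supports[0])
    for i, support in enumerate(supports):
        if not x[i]:
            continue
        for owners in itertools.product(range(k), repeat=d):
            # With d = 2k-1, some server owns at most one variable (pigeonhole).
            assigned = min(s for s in range(k) if owners.count(s) <= 1)
            if assigned != j:
                continue
            term, missing = 1, None
            for l, q in zip(support, owners):
                if q == j:
                    missing = l          # the single variable server j cannot see
                else:
                    term &= known[q][l]
            if missing is None:
                const ^= term
            else:
                coeffs[missing] ^= term
    return const, coeffs

def retrieve(x, i, k, m, rng=None):
    rng = rng or random.Random()
    supports = encode_indices(len(x), m, 2 * k - 1)
    shares = share(set(supports[i]), m, k, rng)
    result = 0
    for j in range(k):
        known = {q: shares[q] for q in range(k) if q != j}
        const, coeffs = server_polynomial(j, x, supports, known, k, m)
        result ^= const
        for c, y in zip(coeffs, shares[j]):  # user plugs in the missing Y_j
            result ^= c & y
    return result

x = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
print(all(retrieve(x, i, k=2, m=5) == x[i] for i in range(len(x))))
```

In this sketch each answer is $m + 1$ bits, so queries and answers are both linear in $m$.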
20Achieving the Barrier
- The query complexity is the same: $O(n^{1/d})$.
- The answer complexity is $(k+2)m = O(n^{1/d})$.
- Total complexity: $O(n^{1/d}) = O(n^{1/(2k-1)})$ by the choice of $d$.
21Breaking the Barrier
- The first idea that comes to mind is to try to
increase the degree $d$ even further.
- Unfortunately, this does not work, because the
polynomials the servers return grow in size.
- The novelty of the paper is how to get around this
difficulty.
22Breaking the Barrier
- Assume that each polynomial is known not to one
server but to a group of servers.
- Now we do not need to receive the polynomials
themselves, but can use the PIR scheme (on those
servers) to evaluate them on the required input.
23Breaking the Barrier
- Suppose that we could write $P_x$ as a sum of polynomials $P_V$,
where $V$ ranges over all subsets of the servers.
The problem of evaluating $P_x$ reduces to
evaluating each $P_V$, which (we hope) is of lower
degree.
- On the other hand, the number of servers is also
smaller, which is a disadvantage.
- The paper finds such $P_V$ with good
properties.
24Breaking the Barrier
- Define $k'$ to be a lower bound on the size of the
sets $V$ and $\lambda$ the maximum number of variables a
server misses in $P_V$.
- Altogether, $V$ misses at most $\lambda|V|$ variables in
$P_V$.
25Breaking the Barrier
- We will choose an encoding $E$ such that the
Hamming weight of $E(i)$ (and therefore the number
of monomials) will be bounded by $d$ (the number of
monomials is bounded by 2d).
- If we had $P_V$ as specified, then we could apply the
PIR recursively on all sets of size at least $k'$
with communication complexity
26Breaking the Barrier
- Let $E$ be an encoding into the strings of length $m$
and weight $d$.
- We can encode $\binom{m}{d}$ different values, thus
$m = O(n^{1/d})$ is sufficient to encode $n$ values.
- Define it holds that
- Define $V(M)$ to be all servers who miss at most $\lambda$
variables in $M$.
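The counting behind the encoding can be checked numerically. The helper below is a hypothetical illustration that finds the smallest $m$ for which there are at least $n$ strings of length $m$ and weight $d$; since $\binom{m}{d} \ge (m/d)^d$, any $m \ge d \cdot n^{1/d}$ is enough.

```python
import math

def smallest_m(n, d):
    """Smallest m such that the number of weight-d strings of length m is >= n."""
    m = d
    while math.comb(m, d) < n:
        m += 1
    return m

print(smallest_m(10, 2))    # 5, since C(5, 2) = 10
print(smallest_m(1000, 3))  # 20, since C(19, 3) = 969 and C(20, 3) = 1140
```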
27Breaking the Barrier
- Lemma: for $\lambda, k' < k$ and
$d \le (\lambda+1)k - (\lambda-1)k' + (\lambda-2)$, and $M$ a monomial of
degree $d$ in the $Y_{j,h}$, either there is a server
who misses at most one variable of $M$, or $|V(M)| \ge k'$.
- Proof: Counting argument.
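The counting argument can be written out as follows. This sketch assumes the inequalities in the lemma are read as $d \le (\lambda+1)k - (\lambda-1)k' + (\lambda-2)$ and $|V(M)| \ge k'$, with $\lambda \ge 1$; $c_j$ (an assumed name) denotes the number of variables of $M$ that server $j$ misses, so $\sum_j c_j \le d$.

```latex
% Suppose every server misses at least two variables of M and |V(M)| \le k' - 1.
% Servers in V(M) have c_j \ge 2; servers outside V(M) have c_j \ge \lambda + 1.
\begin{align*}
d \;\ge\; \sum_{j=1}^{k} c_j
  &\ge 2\,|V(M)| + (\lambda+1)\bigl(k - |V(M)|\bigr) \\
  &= (\lambda+1)k - (\lambda-1)\,|V(M)| \\
  &\ge (\lambda+1)k - (\lambda-1)(k'-1) \\
  &= (\lambda+1)k - (\lambda-1)k' + (\lambda-1) \\
  &> (\lambda+1)k - (\lambda-1)k' + (\lambda-2),
\end{align*}
% contradicting the assumed upper bound on d.
```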
28Breaking the Barrier
- Claim: Let $k, \lambda, k'$ be as before. Then there are
polynomials $P_V, P_j$ for every $V \subseteq [k]$ s.t. $|V| \ge k'$
and every $j \in [k]$, s.t.
- $P_V$ is of degree $\lambda|V|$ and can be computed from $P_x$
and $\{Y_j\}_{j \notin V}$
- $P_j$ is of degree 1 and can be computed from $P_x$ and
$\{Y_q\}_{q \ne j}$
-
29Breaking the Barrier
- Proof: It is sufficient to prove the claim for $P$ consisting
of a single monomial; then we can sum over all
monomials.
- Denote
- Define ?(M) to be the number of variables in M
for which
30Breaking the Barrier
- WLOG take
- Define a
polynomial in $mk$ variables.
- $Q$ has $k^d$ monomials, each of the form
31Breaking the Barrier
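1. Start with $Q$ as defined above and set $P_V = 0$ for all $V$.
2. Find $V = V(M)$ for some monomial $M$ in $Q$ s.t. $V$ is
of maximal size; if $|V| < k'$, stop.
3. While there is $M$ in $Q$ s.t. $V(M) = V$:
- Pick $M$ from $Q$ which maximizes ?(M)
- $P_V = P_V + T(M)$, $Q = Q - T(M)$
4. Go to step 2.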
32Breaking the Barrier
- If the algorithm halts, then the $P_V$ are of the
desired degree and their sum is equal to $P - Q$, for
$Q$ at the end of the execution.
- Likewise, for each $M$ in $Q$ there exists a server
$j$ who misses at most one variable; add $M$ to $P_j$.
33Breaking the Barrier
- Define M??M if V(M)V(M) and
-
- for all qltd either or
- If M is a monomial in T(M) then
- V(M)?V(M)
- ?(M)lt?(M)
- Equality in 1,2 implies M?M
- M1?M2 implies either both are in T(M) of both
arent
34Breaking the Barrier
- Each time step 3 is applied, we either add to $Q$
monomials M with smaller V(M) or ?(M), which
will be dealt with later,
- or M??M, so it already exists in $Q$ and is
removed.
35Breaking the Barrier
- Lemma: For all $i > 0$ and $k > (i-1)!$ there exists a
PIR protocol $P_i$ with communication complexity
$O(n^{2/(ik)})$.
- Corollary: there exists a PIR protocol with
communication complexity
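To see what the lemma gives for a fixed number of servers $k$, one can pick the largest admissible $i$. The rough derivation below assumes the exponent in the lemma reads $2/(ik)$ and uses only the standard estimate that $(i-1)! < k \le i!$ forces $i = \Theta(\log k / \log\log k)$.

```latex
\begin{align*}
(i-1)! < k \le i!
  \;&\Longrightarrow\; i = \Theta\!\left(\frac{\log k}{\log\log k}\right) \\
\frac{2}{ik} &= O\!\left(\frac{\log\log k}{k \log k}\right) \\
O\bigl(n^{2/(ik)}\bigr) &= n^{O(\log\log k/(k \log k))}
\end{align*}
```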
36Summary
- For every PIR scheme we have a related smooth
code
- Upper bound for PIR is improved to
- Likewise the upper bound for smooth codes is
improved to
37Related Topics
- T-collusion PIR: the protocol must maintain
security against collusions of T servers. General
results appear in "Information-Theoretic Private
Information Retrieval: A Unified Construction"
[Beimel, Ishai].
- CPIR: Computational PIR, in which the security
definition is relaxed to a computational one.
- There exist polylogarithmic single-server CPIR protocols
[Cachin, Micali, Stadler].