Title: Smart Content Delivery in Large Networks: En-Route Caching
1Smart Content Delivery in Large Networks
En-Route Caching
- Hong Shen
- School of Computer Science
- University of Adelaide, Australia
- Dept. of Computer Sci. and Tech.
- University of Sci. and Tech. of China
2Outline of the Talk
3Content Distribution Network
- Sits between content providers and content
consumers.
- Contains hundreds of servers throughout the Internet.
- Replicates and maintains customers' content in
CDN servers.
4CDN Example: Google platform
- Maintains over 450,000 CDN servers, arranged in
racks located in clusters in cities around the
world
- Allows users to access its content rapidly
by sending them to lightly loaded and geographically
proximate servers.
5Bottleneck of CDNs
- Multiple transmission flows for the same object.
- Solution: caching the object in selected nodes.
Challenges: WHEN and HOW?
6En-Route Object Caching
- Object caching: Store the most commonly accessed
objects close to clients
- En-route object caching: Objects are cached at
selected nodes on the access path from client to
server
7En-Route Object Caching (cont.)
Why en-route?
- Important observation
- Users normally have regular access patterns
- Storing an object at en-route nodes during delivery
does not consume extra bandwidth.
8Caching Performance
- The performance of en-route object caching
depends mainly on two factors:
- The locations of the caches (Cache Location)
- The management of the cache contents (Content
Replacement)
Coordinated Caching: Consider both factors when
making a cache decision.
9Our Work
- Web object en-route caching in tree networks
ACM Transactions on Internet Technology, Vol. 5,
No. 3, 2005, p. 480-507.
- Multimedia object en-route caching in tree
networks
ACM Transactions on Multimedia Computing,
Communications and Applications, Vol. 1, No. 3,
2005, p. 289-314.
- Multimedia object placement for transparent data
replication in linear array
IEEE Transactions on Parallel and Distributed
Systems, Vol. 18, No. 2, 2007, p. 212-224.
- Multiserver en-route web caching
IEEE Transactions on Computers (under review),
2007.
10Definitions and Notations
- G(V,E) is a graph, where V is the set of nodes
and E is the set of links.
- Cost saving s(v): the cost saving of storing a
new object in node (cache) v.
- Cost loss l(v): the cost loss of removing other
objects from node v in order to accommodate the
new object.
- Cost gain g(v): g(v) = s(v) - l(v).
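The definitions above can be illustrated with a minimal sketch. The node names and all numeric values are made up for illustration; only the relation g(v) = s(v) - l(v) comes from the slide.

```python
# Hypothetical per-node cost saving s(v) and cost loss l(v); values are illustrative only.
saving = {"a": 8.0, "b": 3.0, "c": 5.0}
loss = {"a": 2.0, "b": 4.0, "c": 5.0}

def cost_gain(v):
    """Cost gain of caching the new object at node v: g(v) = s(v) - l(v)."""
    return saving[v] - loss[v]

gains = {v: cost_gain(v) for v in saving}
# Nodes where caching the new object pays off, i.e. g(v) > 0.
profitable = sorted(v for v, g in gains.items() if g > 0)
print(gains, profitable)
```

Here only node "a" has a positive gain; node "c" breaks even and node "b" loses more than it saves.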
11Problem Formulation
Find a node set P to store the object such that the
total cost gain G(P) is maximized.
12Problem Formulation for Tree Networks
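[Figure: tree network showing the server, nodes w and v, the label f(v), and a legend distinguishing nodes that hold a copy from nodes that hold no copy.]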
13Constraints
The different cases of C include
- C is null (unconstrained).
- The cost gain for each node is greater than zero,
i.e., g(v) > 0 for all v in P.
- The number of copies is exactly k, i.e., |Aw| = k.
- The number of copies is no more than k, i.e.,
|Aw| ≤ k.
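To make the four cases of C concrete, here is a brute-force baseline over all node subsets. It is only a sketch: the gains are invented, and the additive objective in `total_gain` is a simplifying assumption (in the talk's model the gain of a node may depend on where the other copies are). The talk's algorithms avoid this exponential search.

```python
from itertools import combinations

# Hypothetical per-node gains g(v); values are illustrative only.
g = {"v1": 4.0, "v2": -1.0, "v3": 2.5, "v4": -0.5}

def total_gain(P):
    """Toy additive objective (an assumption): G(P) = sum of g(v) over v in P."""
    return sum(g[v] for v in P)

def best_placement(constraint):
    """Exhaustively search all node subsets that satisfy the constraint C."""
    nodes = sorted(g)
    candidates = (
        frozenset(c)
        for r in range(len(nodes) + 1)
        for c in combinations(nodes, r)
    )
    feasible = [P for P in candidates if constraint(P)]
    return max(feasible, key=total_gain)

k = 3
unconstrained = best_placement(lambda P: True)                       # C is null
positive_gain = best_placement(lambda P: all(g[v] > 0 for v in P))   # g(v) > 0 for all v in P
exactly_k = best_placement(lambda P: len(P) == k)                    # exactly k copies
at_most_k = best_placement(lambda P: len(P) <= k)                    # no more than k copies
```

With these values, forcing exactly k = 3 copies makes the search accept one node with negative gain, while the other three cases agree on the same two nodes.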
14Solution for Unconstrained Case
- Main idea
- Decompose the tree recursively, level by level, into
a set of lines or singletons (nodes) whose
solutions are known. The solution (Aw) to tree Tw is
obtained by combining (taking the union of) the solutions
(Aw,x) to Tw's subtrees.
15Tree Decomposition (1)
C(w): set of all children of node w.
16 Decomposition of Aw
[Figure: decomposition of Aw at node w with nodes w1 and w2.]
17Tree Decomposition (2)
18 Decomposition of Aw,x
[Figure: decomposition of Aw,x at node w with nodes x, x1 and x2.]
19Algorithm 1
20Algorithm 1 Continued
21Time Complexity
The algorithm runs in time
tw = O( Σ_{v∈C(w)} ( |C(v)| + tv ) )
   = O( Σ_{v∈V} |D(v)| )
   = O(n^2),
where n is the total number of nodes in the
network.
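A small numerical check shows why a sum of |D(v)| over all nodes is quadratic in the worst case. This sketch assumes D(v) denotes the set of descendants of v, which the slide does not state; the two example trees are hypothetical.

```python
def descendant_counts(children, root):
    """Return {v: number of descendants of v} for a rooted tree (iterative post-order)."""
    counts = {}
    stack = [(root, False)]
    while stack:
        v, done = stack.pop()
        if done:
            # All children of v have been counted by now.
            counts[v] = sum(1 + counts[c] for c in children.get(v, []))
        else:
            stack.append((v, True))
            stack.extend((c, False) for c in children.get(v, []))
    return counts

n = 8
# A path 0 - 1 - ... - 7 rooted at 0, and a star with centre 0.
path = {i: [i + 1] for i in range(n - 1)}
star = {0: list(range(1, n))}

path_total = sum(descendant_counts(path, 0).values())
star_total = sum(descendant_counts(star, 0).values())
print(path_total, star_total)
```

On the path the sum equals n(n-1)/2, which grows as n^2, while on the star it is only n-1; the quadratic bound reflects deep, narrow trees.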
22Solution for Constrained Case I
Non-negative cost gain per node
(1)
23Transformation
- The optimal solution for Problem (1) is
equivalent to
(2)
24Algorithm 2
25Algorithm 2 (Continued)
Time Complexity
O(n^2)
26Solution for Constrained Case II
Placing exactly k copies
(3)
27Algorithm 3
Time Complexity
O(n^2 log(fn)), where f = max f(v).
28Solution for Constrained Case III
Placing at most k copies
(4)
29Algorithm 4
Time Complexity
O(k n^2 log(fn)), where f = max f(v).
30Extension to ASes
System Model
31Solution
- Divide the whole system into two parts, one of
which is a tree.
- Continue dividing the other part in the same
way until only one tree is left.
- Apply the methods for tree networks.
32More General Setting: m-Server En-route Caching
- A set of servers S = {sj, 1 ≤ j ≤ m} located at
leaves of a tree.
- Cost saving for node w, s(w, dj), under the
condition that the distance from w to the
nearest higher-level node towards server sj that
holds a copy is dj.
- Find a node set P to store the object, s.t. the
total gain is maximized (v ∈ P serves nodes, g(v,S))
33The Challenge
We can't get an optimal solution to the multi-server
problem by simply combining solutions to 1-server
problems.
[Figure: A simple 2-server problem. The result of solving the 1-server problems is compared with the optimal solution; nodes are marked as holding a copy or no copy.]
34A More General Definition
- Condition Dw: Dw = (d1, ..., dj, ..., dm), where dj is the distance
from node w to the nearest node towards sj, for
example u, that holds a copy of object O.
- G(w, Dw) is the objective value of (6) in Tw
under condition Dw.
- A(w, Dw) is the solution corresponding to G(w,
Dw).
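The role of G(w, Dw) and A(w, Dw) can be sketched for the simplest case m = 1, where the condition Dw reduces to a single distance d. The following is a toy dynamic program, not the talk's algorithm or Theorem 3: the tree, the link costs, the frequencies, the losses, and the saving model s(w, d) = d × (requests routed through w) are all assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical tree rooted at the server (node 0), which always holds the object.
children = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
edge = {1: 1, 2: 1, 3: 2, 4: 1}        # assumed cost of the link from a node to its parent
freq = {0: 0, 1: 1, 2: 1, 3: 5, 4: 2}  # assumed access frequency f(v)
loss = {1: 6, 2: 3, 3: 4, 4: 5}        # assumed cost loss l(v)

@lru_cache(maxsize=None)
def subtree_freq(w):
    """Total request frequency originating in the subtree T_w."""
    return freq[w] + sum(subtree_freq(c) for c in children[w])

@lru_cache(maxsize=None)
def G(w, d):
    """Best total gain in T_w when the nearest upstream copy is at distance d from w.
    Returns (gain, frozenset of nodes in T_w chosen to hold a copy)."""
    # Case 1: w holds no copy, so child c sees the same copy at distance d + edge[c].
    skip_gain, skip_set = 0, frozenset()
    for c in children[w]:
        gain, nodes = G(c, d + edge[c])
        skip_gain += gain
        skip_set |= nodes
    # Case 2: w holds a copy; toy saving s(w, d) = d * subtree_freq(w), minus l(w).
    hold_gain, hold_set = d * subtree_freq(w) - loss[w], frozenset([w])
    for c in children[w]:
        gain, nodes = G(c, edge[c])
        hold_gain += gain
        hold_set |= nodes
    return (hold_gain, hold_set) if hold_gain > skip_gain else (skip_gain, skip_set)

# The server is the root, so each child c starts with the condition d = edge[c].
results = [G(c, edge[c]) for c in children[0]]
best_gain = sum(gain for gain, _ in results)
placement = frozenset().union(*(nodes for _, nodes in results))
print(best_gain, sorted(placement))
```

Each subproblem is identified by a node and a distance, and the two cases mirror "holds a copy" versus "holds no copy". With m servers the single distance becomes a vector of m distances, which is where the growth in the number of subproblems comes from.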
35Lemma 1
For tree Tr containing m servers at leaf nodes,
the distances from wi to the nearest node
towards sj that holds a copy are denoted by
e(wi,dj) and k(wi,dj) for the cases that node wi
holds a copy and holds no copy, respectively. Then we
have
wi ∈ path(r, sj) means server sj is in the
subtree Twi, because servers are located at
leaves.
[Figure: An example of a multi-server network with root r and servers s1, s2, s3.]
36Theorem 3
For tree Tr containing m servers at leaf nodes,
the optimal solution of (6) is A(r, Dr) and the
corresponding objective value is G(r, Dr), where
Dr is the vector of distances from the root node to
the servers and
37Theorem 3 (cont.)
38The Algorithm
- Main idea
- The problem is split top-down and the solution
A(r, Dr) is generated bottom-up according to
Theorem 3, with corresponding objective value
G(r, Dr).
- Time complexity
- The algorithm computes all G(w, Dw), where w ∈ V,
Dw = (d1, ..., dj, ..., dm), 0 ≤ dj ≤ hw, hw is the
distance from w to sj, and hw ≤ 2h.
- The time complexity of the algorithm is O(n h^m).
39Conclusion
- New tree decomposition techniques for en-route
web caching.
- First polynomial-time algorithms for
1-server en-route web caching in tree networks.
- m-server en-route web caching in tree networks in
O(n h^m) time.
40Questions?
41Calculating cost loss l(v)
Cost loss l(v): The additional cost caused by
removing some objects from v to make room for the
new object.
Missing penalty m(v): The additional cost of
accessing the object if it is not cached at v.
E.g., m(3) = c(3,0), m(7) = c(7,4).
[Figure: example tree with the server and nodes numbered 0 to 9, each marked as holding a copy or holding no copy; labels include c(3,0), c(9,5) and f(3), f(4), f(5), f(6), f(8), f(9).]
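The missing penalty can be sketched as a walk up the tree to the nearest node that holds a copy. The parent pointers below are hypothetical, chosen only to be consistent with the examples m(3) = c(3,0) and m(7) = c(7,4); unit link costs, node 0 being the server, and c(v, u) being a sum of link costs are likewise assumptions.

```python
# Hypothetical topology; the slide's actual tree is not recoverable from the text.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 3, 7: 4, 8: 4, 9: 5}
link_cost = {v: 1.0 for v in parent}   # assumed unit cost for the link from v to parent[v]
holds_copy = {0, 4, 5}                 # assumed: node 0 is the server

def nearest_upstream_copy(v):
    """First node strictly above v, on the path to the server, that holds a copy."""
    u = parent[v]
    while u not in holds_copy:
        u = parent[u]
    return u

def path_cost(v, u):
    """c(v, u), taken here as the sum of link costs from v up to its ancestor u."""
    total = 0.0
    while v != u:
        total += link_cost[v]
        v = parent[v]
    return total

def missing_penalty(v):
    """m(v) = c(v, u), where u is the nearest upstream node holding a copy."""
    return path_cost(v, nearest_upstream_copy(v))

print(nearest_upstream_copy(3), nearest_upstream_copy(7), missing_penalty(3))
```

The walk always terminates because the root (the server) holds the object.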