Title: Designing, improving, and installing a collaborative cache for disconnected villages
1Designing, improving, and installing a
collaborative cache for disconnected villages
- Sibren Isaacman
- Group Talk Feb 25, 2009
2Closing the Digital Divide
- Connectivity problems in rural regions
- Large infrastructure overhead
- High cost/bit and low bandwidth
- $3000/Mbps/month
- Extremely high latency
- 80% of North American adults have access to the internet, compared to 5% of Africans
- Technology for developing regions has been named a Millennium Development Goal by the UN
- Increased access to the information on the internet and digital classrooms can change lives
3Our solution
- Collaborative caching and predictive prefetching increase usability
- Decreases number of round trips
- Reduces miss rates by up to 89%
- Decreases bits sent (often directly related to cost) by as much as 6x
- Latency for average page access reduced by up to 90%
4Outline
- Our previous work
- C-LINK
- Simulation
- Current efforts at improvement
- The real deployment
5Related work
- DTNs and network connectivity
- DakNet (Pentland, 2004)
- KioskNet (Seth, 2006)
- DTLSR (Demmer, 2007)
- TEK (Thies, 2002)
- Collaborative Caching and Web Proxies
- Summary cache (Fan, 2000)
- Cache Digests (Rousskov, 1998)
- Squid (Wessels, 1998)
- DitTorrent (Saif, 2006)
DTNs are thought to have too much delay for interactive Web access.
Caching strategies thus far have not looked at disconnected networks.
6Design Goals of C-LINK
- Allow web access over any network layer
- DTN, VSAT, cellular
- Miss rates must be brought down
- Data requested by one node may be used by another
- Possibly serving stale data in the short term
- Must adhere to constraints imposed by the
environment
7Environmental constraints
- Severe storage constraints
- 10s of GB per machine
- Heterogeneous devices
- Multiple transport options
- Frequent power interruptions
- At both node and system level
8C-LINK
[Diagram: nodes issue the requests "Page A, please" and "Page B, please"; the reply to one request is "Node 1 has it". Pages A and B are labelled in the figure.]
9Interface
- Handles requests from a generic web browser
- Captures socket dump from browser
- Returns pages or waiting message
- Searches for files locally first
- Notifies Kiosk if file is found
- Contacts Kiosks and Load Managers on behalf of
Web browser
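The lookup order on this slide (local first, then the Kiosk, otherwise a waiting message) can be sketched as follows. This is a minimal illustration, not the C-LINK code: the function name, the callables passed in, and the text of the waiting message are all hypothetical.

```python
# Hypothetical text; the slide only says a "waiting message" is returned.
WAITING_MESSAGE = "Your page has been requested and will arrive later."


def handle_request(url, local_cache, kiosk_lookup, fetch_from_node,
                   notify_kiosk, forward_to_kiosk):
    """Serve one browser request, returning (source, body)."""
    # 1. Search for the file locally first.
    if url in local_cache:
        notify_kiosk(url)  # tell the Kiosk the file was found here
        return "local", local_cache[url]

    # 2. Ask the Kiosk which machine last held the page.
    holder = kiosk_lookup(url)
    if holder is not None:
        body = fetch_from_node(holder, url)
        if body is not None:  # the holder may have evicted the page
            return "collaborative", body

    # 3. Miss: hand the request to the Kiosk and show a waiting message.
    forward_to_kiosk(url)
    return "waiting", WAITING_MESSAGE
```

Passing the Kiosk and Load Manager contacts in as callables keeps the sketch independent of whichever transport is actually in use.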
10Kiosk daemon
- Point of contact for Interfaces
- DHCP server
- Maintain hash map of URLs to machines
- Returns last known IP address when URL is requested
- Determine which network requests should be sent on
- Send files to Load Managers
- Or temporary local storage
- Note cacheability of pages to refetch non-cacheable pages
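A minimal sketch of the URL-to-machine map described on this slide, assuming a plain in-memory dictionary. The class and method names are hypothetical, and the real daemon may track more state than this.

```python
from typing import Dict, Optional


class KioskDirectory:
    """Hypothetical URL-to-machine map kept by the Kiosk daemon."""

    def __init__(self) -> None:
        self._holder: Dict[str, str] = {}      # URL -> last known IP address
        self._cacheable: Dict[str, bool] = {}  # URL -> may the copy be reused?

    def record_arrival(self, url: str, ip: str, cacheable: bool = True) -> None:
        """Remember which machine now holds the page."""
        self._holder[url] = ip
        self._cacheable[url] = cacheable

    def record_eviction(self, url: str) -> None:
        """Forget a page once its holder reports that it was evicted."""
        self._holder.pop(url, None)
        self._cacheable.pop(url, None)

    def lookup(self, url: str) -> Optional[str]:
        """Return the last known IP address for the URL, or None."""
        return self._holder.get(url)

    def needs_refetch(self, url: str) -> bool:
        """Unknown and non-cacheable pages must be requested again."""
        return not self._cacheable.get(url, False)
```

For example, after `record_arrival("http://a/", "10.0.0.2")`, `lookup("http://a/")` returns `"10.0.0.2"` until the page is evicted.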
11City Fetch-Engine
- Makes connections to internet servers
- Puts browser's socket dump into a file
- Dumps response into file to send back
- Selects network connection over which to send data back
- Prefetch pages
- Simple parsing for embedded files/links
- More complex models
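The "simple parsing" step can be illustrated with Python's standard `html.parser`. This is a sketch under assumptions: the choice of tags (`img`, `script`, `link`, `a`) and the names `LinkExtractor` and `prefetch_candidates` are hypothetical, and the talk does not say how the Fetch-Engine parses pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect embedded files and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.embedded = []  # files needed to render the page
        self.links = []     # pages the user might visit next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.embedded.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.embedded.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))


def prefetch_candidates(base_url, html):
    """Return (embedded_urls, link_urls) as absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.embedded, parser.links
```

Keeping embedded files apart from links lets a prefetcher always send what the page needs to render and treat followed links as optional.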
12Notifier
- Woken up by network when pages reach village
- Determines whether prefetched pages should be retained or discarded
- Informs Kiosk daemon of page's arrival and name of original requestor
13Load Manager
- Serves requested pages to Interfaces
- Notifies Interfaces if page has been removed
- Maintains and enforces storage quotas
- May be dynamically tuned or statically apportioned
- Separate cache space for prefetched and explicitly requested pages
- Determines eviction policies
- LRU queue maintained
- Notifies Kiosk daemon on page eviction
- Inserts pages into local cache
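The quota and eviction behaviour listed here can be sketched as two LRU queues, one per kind of page. This assumes statically apportioned quotas and sizes in arbitrary units. The class name, the `on_evict` callback (standing in for the notification to the Kiosk daemon), and the method names are hypothetical.

```python
from collections import OrderedDict


class LoadManagerCache:
    """Two LRU partitions: explicitly requested pages and prefetched pages."""

    def __init__(self, requested_quota, prefetched_quota, on_evict=None):
        self.quota = {"requested": requested_quota, "prefetched": prefetched_quota}
        self.used = {"requested": 0, "prefetched": 0}
        self.pages = {"requested": OrderedDict(), "prefetched": OrderedDict()}
        self.on_evict = on_evict or (lambda url: None)

    def insert(self, url, size, kind="requested"):
        """Store a page, evicting least recently used pages of the same kind."""
        if size > self.quota[kind]:
            return False  # the page can never fit in this partition
        for k in self.pages:  # drop any stale copy first
            if url in self.pages[k]:
                self.used[k] -= self.pages[k].pop(url)
        while self.used[kind] + size > self.quota[kind]:
            victim, victim_size = self.pages[kind].popitem(last=False)
            self.used[kind] -= victim_size
            self.on_evict(victim)  # e.g. notify the Kiosk daemon
        self.pages[kind][url] = size
        self.used[kind] += size
        return True

    def get(self, url):
        """Return True on a hit and mark the page as most recently used."""
        for k in self.pages:
            if url in self.pages[k]:
                self.pages[k].move_to_end(url)
                return True
        return False
```

Because each partition evicts only from its own queue, a burst of prefetched pages cannot push out pages a user explicitly asked for.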
14Prototype system
- 3 Pentium 3 computers running KioskNet
- The proxy runs the City Fetch-Engine
- The kiosk runs the Kiosk daemon and Notifier
- 1 Core Duo laptop configured as KioskNet's ferry
- 3 Pentium 4 computers running standard Hardy Heron Kubuntu
- Run both Interface and Load Manager
- 1 GB caches
- No prefetching
15Observations
- 19628 requests made by students
- 44% were local hits
- 20% were collaborative hits
- 60% of hits were for non-cacheable content
- Average time to display a page was 300 msec
- Local pages could be served in 5 ms
- Pages elsewhere in the village may take up to 1 second to display
16Outline
- Our previous work
- C-LINK
- Simulation
- Current efforts at improvement
- The real deployment
17Simulator
- Trace Driven
- Cambodia, Blackboard, and Prototype traces
- Page accesses in trace are assumed to be requests
- Nodes enter the cache on their first access
- Track hits, misses, and latencies
- Accurate model of the system previously described
- With collaborative caching turned off, nodes maintain individual caches
18Tunable parameter: Cache Size
- Size of cache at user nodes
- Based on previous work, range from 0-100 KB per node
- Sizes must be scaled appropriately for number of users
- Multiplicative factor for cache size at Kiosk
- The kiosk may be slightly better than a user machine
- Use 1.25x, 10x, 100x, and infinite space
19Tunable parameter: Network Delay
- leave time: how frequently requests leave the village
- length of trip: the length of a round trip
- Define three networks
- Instant: leave time = 0, length = 0
- Bus: leave time = 60, length = 60
- Hybrid: leave time = 60, length = 30
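One way to read the two parameters: a miss waits for the next departure from the village, then takes one round trip. The sketch below assumes departures fall on exact multiples of the leave time and that a request issued at a departure instant catches it. The slide states neither assumption, nor the time unit, and the names are hypothetical.

```python
import math

# (leave time, length of trip) for the three networks defined on the slide.
NETWORKS = {
    "instant": (0, 0),
    "bus": (60, 60),
    "hybrid": (60, 30),
}


def miss_latency(request_time, leave_time, trip_length):
    """Time from issuing a request until the response reaches the village."""
    if leave_time == 0:
        wait = 0  # requests leave immediately
    else:
        # Assumed: departures occur at 0, leave_time, 2 * leave_time, ...
        next_departure = math.ceil(request_time / leave_time) * leave_time
        wait = next_departure - request_time
    return wait + trip_length
```

Under these assumptions a request issued at time 10 would wait 50 for the next departure, then a full round trip: 110 on the bus network and 80 on the hybrid one.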
20Tunable parameter: User Connectivity
- Use random connections or traces obtained from CRAWDAD
- Random connections may be balanced, with means of 90 min, or unbalanced, with means matching the CRAWDAD distribution
- Nodes use selected distribution to determine length of time connected or disconnected
[Figures: time in range (CRAWDAD); time out of range (CRAWDAD)]
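The balanced random case can be sketched by alternating connected and disconnected periods drawn from a distribution with a 90-minute mean. The slide gives only the mean, so the exponential distribution used here is an assumption, as are the function name and parameters.

```python
import random


def connectivity_schedule(total_minutes, mean_connected=90.0,
                          mean_disconnected=90.0, seed=None,
                          start_connected=True):
    """Return a list of (start, end, connected) periods covering the run."""
    rng = random.Random(seed)
    schedule = []
    t = 0.0
    connected = start_connected
    while t < total_minutes:
        mean = mean_connected if connected else mean_disconnected
        # Assumed distribution: exponential with the requested mean.
        duration = rng.expovariate(1.0 / mean)
        end = min(t + duration, total_minutes)
        schedule.append((t, end, connected))
        t = end
        connected = not connected
    return schedule
```

An unbalanced run would pass different values for `mean_connected` and `mean_disconnected`, for instance means estimated from the CRAWDAD traces.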
21Exploring the network layer
- 7-14x reduction in miss rates, saving 75% of bytes transmitted
- 2x reduction in miss rates, saving 28% of bytes transmitted
22Exploring Cache Size
- Average latencies are less than 10 minutes
- Greater than 5x improvement at low cache sizes
23Exploring Kiosk Limits
- Average latencies reduced 2x with little extra space at kiosk
- Additional storage provides little benefit
24Exploring node movement
- Previously examined cases are worst cases
- Unbalanced motion shows improvements of 3x
25Prototype system
- 13% improvement in the best case
- Reflects abnormal usage patterns and general web traffic
26Outline
- Our previous work
- C-LINK
- Simulation
- Current efforts at improvement
- The real deployment
27Prefetching
- Need intelligent prefetching
- Coding is nearly completed to deliver all
embedded content on the page
28Replication
- Data accessed more than N times should be replicated in the network
- Protects against failures
- Speeds up page delivery
- Modifications to the simulator complete
- Comparisons of N = 999 (very little replication) and N = 3 (the average number of requests)
- 10% improvements in miss rate and number of reachable pages
- Numbers may be better because of forced replication when node was out of range
- Previously, only one copy was known by the kiosk
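The replication rule (replicate data accessed more than N times) amounts to a per-URL counter with a threshold. The sketch below is illustrative only: `ReplicationTracker` and its methods are hypothetical names, and it leaves out where the replica is placed.

```python
from collections import Counter


class ReplicationTracker:
    """Flag pages for replication once they are accessed more than N times."""

    def __init__(self, threshold):
        self.threshold = threshold  # N on the slide
        self.accesses = Counter()
        self.replicated = set()

    def record_access(self, url):
        """Count one access; return True the first time url exceeds N."""
        self.accesses[url] += 1
        if self.accesses[url] > self.threshold and url not in self.replicated:
            self.replicated.add(url)
            return True
        return False
```

With a threshold of 3, the fourth access to a page triggers replication. With a threshold of 999, almost nothing is ever replicated, which matches the slide's description of that setting.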
29Resiliency
[Diagram: kiosk recovery. Message labels: "Need a Kiosk", "Random back off", "I'm the new Kiosk", "Page list", "I'm back!". Components shown: Web Browser, Interface, Load Manager, Kiosk daemon, Notifier, Node n.]
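The labels on this slide suggest a recovery scheme in which nodes that notice the Kiosk is missing each wait a random back-off, the first node whose timer expires announces itself as the new Kiosk, and the other nodes then send it their page lists. That reading is an interpretation of the diagram, not something the slide spells out. The sketch below follows it with hypothetical function names and a uniform back-off, which is also an assumption.

```python
import random


def elect_kiosk(node_ids, max_backoff=1.0, seed=None):
    """Pick the node whose random back-off timer expires first."""
    rng = random.Random(seed)
    # Assumed: each node draws its back-off uniformly from [0, max_backoff].
    backoffs = {node: rng.uniform(0, max_backoff) for node in node_ids}
    winner = min(backoffs, key=backoffs.get)
    return winner, backoffs


def rebuild_directory(page_lists):
    """Merge each node's page list into a URL -> node map for the new Kiosk."""
    directory = {}
    for node, urls in page_lists.items():
        for url in urls:
            directory.setdefault(url, node)  # keep the first holder seen
    return directory
```

How the original Kiosk reclaims its role when it returns ("I'm back!") is not covered by this sketch.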
30Outline
- Our previous work
- C-LINK
- Simulation
- Current efforts at improvement
- The real deployment
31Cinco Pinos, Nicaragua
32Library
- attract users from general population
- students from school and others
- alternative to pay-to-use, full-service internet will likely draw less wealthy users
- just off bus route
- owned by CODER, Comision de Desarrollo Rural
- Local NGO, enclosed and protected space
- already have 1 computer in library, which they used to give computer lessons to local youth
- already invested in creating development programs for locals
- Other libraries nearby allow for exploration of multi-kiosk effects later
33Equipment
- 6 computers with Kubuntu
- 2 single board computers from Soekris Engineering
- 433 to 600 MHz AMD Geode LX single chip processor with CS5536 companion chip
- 128-1024 MB DDR-SDRAM, soldered on board
- 2 serial ports, DB9 and 10-pin internal header
- Power LED, Disk LED, Error LED, Network LEDs
- Mini-PCI type III socket
- Atheros 802.11 card inserted
- 1 mini-box M300-LCD
- Intel mini-ITX Atom 1.6 GHz
- 1GB RAM
- VGA, Serial, PS/2
- PCI card with mini-PCI adapter
- Atheros 802.11 card inserted
- 3 antennas
- 2 outdoor and 1 vehicle mount
- 2.4GHz omni-directional
34Measurement
- Quantitative
- Similar to measurements taken on prototype
- May pre-load some pages
- Include measurements of number of evictions and cache readjustments
- Qualitative
- User experiences
- Suggested improvements
- Perceived value of system
- Future features
35The End