Title: Aruna Balasubramanian, Yun Zhou, W Bruce Croft,
1Web Search From a Bus
- Aruna Balasubramanian, Yun Zhou, W Bruce Croft,
- Brian N Levine and Arun Venkataramani
- Department of Computer Science,
- University of Massachusetts, Amherst
2Why web search from a bus?
- Open access points commonly available
- Intermittent internet connectivity from vehicles possible
  - no subscription cost
  - useful when no other connectivity is available
- Web search is the 2nd most common web activity (survey by pewinternet.org)
3Connectivity characteristics of testbeds
Goal: Build web search in the presence of frequent disconnections and short connectivity durations
4Web search process
<your favorite search engine>
Retrieving web.
Retrieving images
Retrieving.
5Adapting to vehicular network
6Why challenging?
- Interactive
  - several exchanges between user and search engine needed
- Results imprecise
  - response may not be relevant
  - difficult to measure relevance
Thedu
Proxy architecture: sustain interaction
IR contribution: increase usefulness of returned response
7Thedu proxy
- Between vehicle and search engine
- When proxy receives query request from vehicle
  - retrieves URLs and snippets
  - prefetches URL contents including images
  - stores responses and maintains state
- When vehicle connects to proxy
  - downloads pending responses
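The proxy behaviour listed above can be sketched roughly as follows. This is a minimal in-memory illustration, not the Thedu implementation: the class `ProxyStore`, the `search` and `fetch` callables, and the per-contact `budget` are all hypothetical names introduced here.

```python
from collections import defaultdict, deque


class ProxyStore:
    """Toy in-memory state for a Thedu-style proxy (illustrative only)."""

    def __init__(self, search, fetch):
        # search(query) -> list of (url, snippet); fetch(url) -> page content.
        # Both callables are hypothetical stand-ins supplied by the caller.
        self.search = search
        self.fetch = fetch
        self.pending = defaultdict(deque)  # vehicle id -> queued responses

    def on_query(self, vehicle_id, query):
        """Handle a query: retrieve URLs/snippets, prefetch, and store."""
        for url, snippet in self.search(query):
            response = {
                "query": query,
                "url": url,
                "snippet": snippet,
                "content": self.fetch(url),  # prefetched while vehicle is away
            }
            self.pending[vehicle_id].append(response)

    def on_connect(self, vehicle_id, budget):
        """Hand over at most `budget` pending responses for this contact."""
        queue = self.pending[vehicle_id]
        sent = []
        while queue and len(sent) < budget:
            sent.append(queue.popleft())
        return sent
```

Responses that do not fit into one contact stay queued, which is the "maintains state" part: the next `on_connect` call resumes where the previous one stopped.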
8Client and proxy architecture
Figure: block diagram of a server-side proxy and a client-side vehicle linked by intermittent connectivity. Labels: search engine, queries for vehicle, new queries, web interface, user, queries, fetch URL/images, store query, response bundles, process response, prioritize response, responses.
9How to prioritize?
- Search engines use relevance scores to rank responses
  - scores not comparable across queries
- Even if response is relevant it may not be useful
  - Query "chants 2007" needs only one response
- Thedu
  - Normalize relevance scores: comparable across queries
  - Classify query type: to capture user intent
http://www.netlab.hut.fi/chants-2007/
10Query-Type classification
- Query-type classification
  - Homepage query: cnn, chants 2007
  - Non-homepage query: Harry Potter review
- Thedu classifies using URL, snippet and title field
  - E.g., chants 2007 on Google
  - <url> http://www.netlab.hut.fi/chants-2007 </url>
  - <snippet> Welcome to the home page of the ACM MobiCom workshop on Challenged Networks (CHANTS 2007). </snippet>
  - <title> chants workshop </title>
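To make the idea of classifying from the URL, snippet and title fields concrete, here is a rough heuristic sketch. The slides do not give Thedu's actual classifier, so the rules below (shallow URL containing the query terms, or a snippet/title that announces a home page) are assumptions chosen only for illustration, and `is_homepage_query` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_homepage_query(query, url, snippet, title):
    """Heuristic guess: does the top result look like a homepage for `query`?"""
    terms = query.lower().split()
    parsed = urlparse(url)
    # Number of non-empty path segments, e.g. "/chants-2007" has depth 1.
    path_depth = len([p for p in parsed.path.split("/") if p])
    url_text = (parsed.netloc + parsed.path).lower()
    text = (snippet + " " + title).lower()

    # Assumed signal 1: a shallow URL that mentions every query term.
    url_hits = sum(1 for t in terms if t in url_text)
    shallow = path_depth <= 1
    # Assumed signal 2: the snippet or title announces a home page.
    says_home = "home page" in text or "homepage" in text

    return (shallow and url_hits == len(terms)) or (says_home and url_hits > 0)


# The slide's example: top Google result for "chants 2007".
print(is_homepage_query(
    "chants 2007",
    "http://www.netlab.hut.fi/chants-2007",
    "Welcome to the home page of the ACM MobiCom workshop on "
    "Challenged Networks (CHANTS 2007).",
    "chants workshop",
))
```

A real classifier would presumably be trained and evaluated on labeled queries; fixed rules like these are just the simplest way to show which fields carry the signal.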
11Relevance score normalization
- Modified language model framework
- D: Document, Q: Query, C: Collection
-
- Normalized score
- Kullback-Leibler divergence (distance between Q
and D)
Probability of word occurring in collection
Probability of word occurring in document
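The slide's formulas did not survive extraction, so the sketch below shows only the standard textbook quantities the labels refer to: a document language model smoothed with the collection model, and the Kullback-Leibler divergence between the query model and that document model. It is not the authors' normalized score. The Jelinek-Mercer smoothing, the weight `lam`, and both function names are assumptions.

```python
import math
from collections import Counter


def smoothed_doc_model(doc_tokens, collection_tokens, lam=0.5):
    """P(w|D) mixed with P(w|C) by Jelinek-Mercer smoothing (lam is assumed)."""
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(collection_tokens)
    doc_len, coll_len = len(doc_tokens), len(collection_tokens)

    def prob(word):
        p_doc = doc_counts[word] / doc_len if doc_len else 0.0
        p_coll = coll_counts[word] / coll_len
        return lam * p_doc + (1 - lam) * p_coll

    return prob


def kl_divergence(query_tokens, doc_prob):
    """KL(Q || D) = sum_w P(w|Q) * log(P(w|Q) / P(w|D)) over query words.

    Assumes every query word occurs somewhere in the collection, so the
    smoothed document probability is never zero.
    """
    q_counts = Counter(query_tokens)
    q_len = len(query_tokens)
    total = 0.0
    for word, count in q_counts.items():
        p_q = count / q_len
        total += p_q * math.log(p_q / doc_prob(word))
    return total
```

A smaller divergence means the document model is closer to the query model, so ranking by ascending divergence puts the best-matching documents first.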
12Thedu protocol
- 1. Sort responses in the order of normalized score
- 2. For response r for query q,
  -
  - 2a. Update
  - 2b. If q is homepage query and do not send
  - 2c. Else send response to vehicle
expected relevance of all responses sent for a query q
probability that r is relevant for q
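The update rule in step 2a and the condition in step 2b are missing from the slide text, so the following is only one plausible reading, not the Thedu protocol itself. It assumes the expected relevance for a query is the running sum of the relevance probabilities of the responses already sent, and that a homepage query is cut off once that sum reaches a cutoff. The field names, the `threshold` parameter, and the order of the check and the update are all hypothetical.

```python
def select_responses(responses, homepage_queries, threshold=1.0):
    """Return the responses to send, in normalized-score order.

    responses: list of dicts with keys "query", "score" (normalized score)
               and "p_rel" (estimated probability the response is relevant).
    homepage_queries: set of queries classified as homepage queries.
    threshold: ASSUMED cutoff on the accumulated expected relevance.
    """
    expected = {}  # query -> expected relevance of responses already sent
    to_send = []
    # Step 1: sort by normalized score, best first.
    for r in sorted(responses, key=lambda x: x["score"], reverse=True):
        q = r["query"]
        sent_so_far = expected.get(q, 0.0)
        # Step 2b (assumed condition): a homepage query needs one good
        # answer, so skip once enough expected relevance has been sent.
        if q in homepage_queries and sent_so_far >= threshold:
            continue
        # Step 2c: send; step 2a (assumed rule): update the running total.
        to_send.append(r)
        expected[q] = sent_so_far + r["p_rel"]
    return to_send
```

Under this reading, non-homepage queries are never cut off, so scarce contact time shifts toward queries whose users plausibly want several results.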
13Evaluation goals
- What is the delay in getting search results?
- How many results were relevant to the user?
14Evaluation Tools
- DieselNet
- Indri search engine
- TREC (Text Retrieval Conference)
  - Predefined web data collection (10G)
  - Predefined set of queries (100 homepage, 50 content)
  - Relevance judgments (which documents are relevant for query)
Thedu's query-type classifier accuracy: 88%
15Deployment on DieselNet
16Thedu vs Proxy-less server
- Thedu
  - March 26 to March 30
  - Bundle responses
  - Returns responses in prioritized order
  - Maintains state
- Proxy-less server
  - April 30 to May 5
  - Bundle responses
  - Returns responses as FIFO
  - No state
17Connectivity duration
Mean connection duration: 35 sec
Mean disconnection duration: 8 min
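As a back-of-the-envelope check on what these two means imply, the arithmetic below estimates the fraction of time a bus is connected. It assumes connections and disconnections simply alternate with the stated mean lengths, which is a simplification introduced here rather than something the slide claims.

```python
mean_connected_s = 35          # mean connection duration from the slide
mean_disconnected_s = 8 * 60   # mean disconnection duration from the slide

# One average cycle is a connection followed by a disconnection.
cycle_s = mean_connected_s + mean_disconnected_s
connected_fraction = mean_connected_s / cycle_s

print(f"connected {connected_fraction:.1%} of the time")  # connected 6.8% of the time
```

Roughly one part in fifteen of the time is usable, which is why the order in which pending responses are sent matters so much.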
18Thedu vs Proxy-less architecture
Thedu
Stateless proxy
19Delay until first relevant response
20Extending Thedu
- Can we use connectivity among buses to improve throughput?
- Are we limited to academic search engines?
  - Convince commercial search providers to provide relevance scores
  - Or, assign scores based on ranking
- Are users really happy with search results and delay?
traces.cs.umass.edu
21Simulation Results
22Inter-meeting times