Slide 1: Searching the Web More Effectively With Multiple Simultaneous Queries
- Francisco Corella and Karen P. Lewison
- www.pomcor.com
- Use Noflail Search at noflail.com
- April 2009
Slide 2: Search Engines Are Effective
- The first result is often what we are looking for
- But there are difficult search problems
- Example: finding a conference on search, but not on SEO, is hard
  - There are many conferences on SEO
  - "Search" is used with different meanings in many different contexts
  - "Conference" has several synonyms
  - "Conference on search" vs. "searching for conferences"
  - "Search" may be used to label a site-search box
Slide 3: Semantics May Address Many of These Difficulties
- by allowing for more precision in query specification and content encoding, through
  - Context specification
    - Information Retrieval vs. Marketing vs. Search and Rescue
  - Concept specification
    - Concept of conference vs. the many synonyms of the word "conference"
  - Specification of relationship between two concepts
    - "Conference on search" vs. "searching for conferences"
  - Exclusion of non-semantic words from the index
    - E.g. the word "Search" used as label of a site-search box
Slide 4: Semantic Technologies Are Promising but Difficult
- Semantics require
  - Sophisticated content encoding
  - A different way of indexing the encoded content
  - A different way of formulating queries about the encoded content
Slide 5: While Semantic Technologies Mature, It Is Worth Asking
Is it possible to help users with difficult search problems, today, with the current Web, using an existing index and ordinary queries?
Slide 6: The Answer: YES
- In Noflail Search (available at noflail.com), we provide substantial help with difficult search problems, using a better user interface with an unmodified backend accessed through a Web API
Slide 7: An Unusual Architecture
(Diagram; labels only)
- User's laptop: Browser, Flash Player, Flex Platform, Noflail Search
- noflail.com: 1. Flex code
- api.search.live.net (Web API): 2. Queries / results
Slide 8: How Can a User Interface Help with a Difficult Search Problem?
- What does a user do when confronted with a difficult search problem?
- The user may issue several queries and look at the first page of results of each query, e.g.
  - "search conference": nothing
  - "web search conference": nothing
  - "web search meeting": nothing
  - "search technology conference": finds Semantic Technology Conference, not broad enough
  - "search technology meeting": nothing
Slide 9: What does a user do? (Continued)
- The user may go back to some of the queries and view one or two more pages of results
  - Still nothing found in the example
- The user may issue more queries or dig deeper into the result sets of some of the previous queries
  - Finally, the query "web search meeting" yields the Search Engine Meeting on page 4
Slide 10: What does a user do? (Continued)
- This is a laborious process
  - Going back to earlier queries requires retyping them, or using the back arrow to backtrack past intervening pages, or finding the queries in the browser history
  - We could think of bookmarking queries, but this is cumbersome (more on this below)
- To put it informally, when we face a tough search problem
  - WE FLAIL
Slide 11: Four Ideas for Helping the User
- Let the user save queries
- Let the user browse multiple result sets at once
- Provide cooperative responses to queries that have zero results
- Issue multiple queries simultaneously
Slide 12: 1. Saving Queries
- Queries are saved in Flex local storage
- Saved queries are shown in a left panel, on the search page itself
Slide 13: (No Transcript)
Slide 14: 1. Saving Queries (Continued)
- Note: Ordinary bookmarks also allow users to save queries (actually, pages of results), but this is cumbersome
  - Three clicks to save a bookmark
  - Bookmarks not visible from search page
  - Bookmarks of result pages mixed with bookmarks of ordinary Web pages
Slide 15: Saving Queries (Continued)
- Each new query is placed automatically in the left panel, together with possible follow-up queries
- Follow-up queries may include
  - A respelling of the original query
  - Related queries
  - Subqueries (queries with fewer search terms) if there are no related queries
Slide 16: Saving Queries (Continued)
- Queries are inserted into the left panel with checkmarks that cause them to be deleted automatically when the user issues another query
  - Thus the user can safely ignore the left panel
- The user can save a query by removing its checkmark before issuing the next query
- The user can build a database of useful queries, reordering them as needed by drag-and-drop
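The checkmark behaviour described on the saving-queries slides can be modelled with a very small data structure. The following Python sketch is only an illustration of that behaviour, not the Flex implementation used by Noflail Search; the names `PanelEntry`, `QueryPanel`, `issue`, `save` and `move` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PanelEntry:
    text: str
    checked: bool = True  # checked = deleted automatically on the next query


@dataclass
class QueryPanel:
    entries: list = field(default_factory=list)

    def issue(self, query, follow_ups=()):
        """Record a new query and its follow-ups, dropping checked entries."""
        self.entries = [e for e in self.entries if not e.checked]
        present = {e.text for e in self.entries}
        for text in (query, *follow_ups):
            if text not in present:
                self.entries.append(PanelEntry(text))
                present.add(text)

    def save(self, text):
        """Remove the checkmark so the query survives later searches."""
        for entry in self.entries:
            if entry.text == text:
                entry.checked = False

    def move(self, text, new_index):
        """Stand-in for drag-and-drop reordering of the panel."""
        i = next(i for i, e in enumerate(self.entries) if e.text == text)
        self.entries.insert(new_index, self.entries.pop(i))


panel = QueryPanel()
panel.issue("web search conference")
panel.issue("web search meeting", follow_ups=["search meeting"])
panel.save("web search meeting")          # user removes the checkmark
panel.issue("search technology conference")
print([e.text for e in panel.entries])
# ['web search meeting', 'search technology conference']
```

A user who never touches the checkmarks sees only the latest query and its follow-ups, which is why the panel can be safely ignored.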
Slide 17: 2. Browsing Multiple Result Sets
- The user can click on any query in the left panel to see its result set in a center panel (there is also a right panel for ads)
- The center panel has the usual page menu at the bottom, which lets the user browse the result set
Slide 18: (No Transcript)
Slide 19: 2. Browsing Multiple Result Sets (Continued)
- Noflail Search remembers the last page visited in each result set, and keeps it in memory
  - Thus the user can switch from one result set to another with one click and zero delay, and resume browsing where he or she left off
  - This is what we mean by browsing multiple result sets at once
  - This lets the user do a manual breadth-first search on multiple result sets effortlessly
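One way to picture "remembering the last page visited in each result set" is a per-query record of the page number and its results, consulted before going to the backend. The sketch below is a minimal Python illustration under that assumption; `ResultBrowser` and the `fetch_page(query, page)` callable are hypothetical stand-ins, not part of Noflail Search or of the Web API.

```python
class ResultBrowser:
    def __init__(self, fetch_page):
        self.fetch_page = fetch_page  # hypothetical backend call: (query, page) -> results
        self.last = {}                # query -> (page number, results of that page)

    def show(self, query, page=None):
        """Return (page, results); with page=None, resume where the user left off."""
        remembered = self.last.get(query)
        if page is None:
            if remembered is not None:
                return remembered     # switch back with no backend call
            page = 1
        if remembered is None or remembered[0] != page:
            remembered = (page, self.fetch_page(query, page))
            self.last[query] = remembered
        return remembered


calls = []

def fake_fetch(query, page):
    calls.append((query, page))
    return [f"{query} / result on page {page}"]

browser = ResultBrowser(fake_fetch)
browser.show("web search meeting")         # fetches page 1
browser.show("web search meeting", 4)      # user pages forward
browser.show("search conference")          # switch to another result set
print(browser.show("web search meeting"))  # back on page 4, no new fetch
print(len(calls))                          # 3
```

Switching between result sets this way is what makes the manual breadth-first search cheap: each return visit costs one click and no round trip.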
Slide 20: 3. Cooperative Responses to Queries with Zero Results
- Queries with zero results are infrequent in the Web at large
  - But they are important, e.g., in site search
- What do traditional search engines do when there are no results?
Slide 21: (No Transcript)
Slide 22: 3. Cooperative Responses (Continued)
- If the query and its response were a natural-language question and answer, a reply that only reports that there are no results would be called an uncooperative response, or, informally, stonewalling
Slide 23: 3. Cooperative Responses (Continued)
- It is possible to provide instead a cooperative response that gives
  - The maximal subqueries (subqueries with the most terms, hence most specific) that have results, as possible follow-up queries
  - The minimal subqueries (with the fewest terms, hence most general) that do not have results, as so-to-speak reasons for the failure
- These subqueries are listed in the left panel, and the user can immediately browse the result sets of all the possible follow-up queries
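The two notions above can be stated directly as code. The following Python sketch computes them by brute force over every subquery, which is useful as a definition but is not the algorithm Noflail Search uses. The function names, the `has_results` callable and the two toy documents are assumptions made for illustration only.

```python
from itertools import combinations


def subqueries(terms):
    """All non-empty subsets of the query terms, as frozensets."""
    terms = list(terms)
    return [frozenset(c)
            for r in range(1, len(terms) + 1)
            for c in combinations(terms, r)]


def cooperative_response_bruteforce(terms, has_results):
    subs = subqueries(terms)
    hits = {q for q in subs if has_results(q)}
    misses = set(subs) - hits
    # Maximal: no strictly larger subquery also has results.
    maximal_hits = {q for q in hits if not any(q < other for other in hits)}
    # Minimal: no strictly smaller subquery also has zero results.
    minimal_misses = {q for q in misses if not any(other < q for other in misses)}
    return maximal_hits, minimal_misses


# Toy index (made up): each document is just a set of words.
DOCS = [{"search", "engine", "meeting"}, {"web", "search", "conference"}]

def toy_has_results(query):
    return any(query <= doc for doc in DOCS)

follow_ups, reasons = cooperative_response_bruteforce(
    ["web", "search", "meeting"], toy_has_results)
print(sorted(sorted(q) for q in follow_ups))  # [['meeting', 'search'], ['search', 'web']]
print(sorted(sorted(q) for q in reasons))     # [['meeting', 'web']]
```

In this toy run the failure is explained by a single pair of terms that never co-occur, and the two suggested follow-ups are the largest subqueries that avoid that pair.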
Slide 24: (No title; figure labels only)
- Suggested follow-ups
- Two separate reasons for the failure
Slide 25: Algorithm for Computing the Cooperative Response
- It operates on the graph of subqueries of the original query
- Notations
  - P = Paella, M = Mussels, S = Squid, E = Escargots
  - site:myrecipes.com has been factored out for simplicity
Slide 26: (No Transcript)
Slide 27: Algorithm (Continued)
- It explores the subgraph of zero-results subqueries
- It submits subqueries in parallel to the Web API
- It submits a subquery as soon as it knows that all its parents have zero results; hence when a subquery with results is found, it must be maximal
- It collects subqueries with zero results, but when a new one is found, it throws away its parents, so that, at the end, the collection contains the minimal subqueries with zero results
Slide 28: Example
- The animation in the next slide illustrates a possible run of the algorithm
  - Orange = issued, no response yet; Green = has results; Red = zero results; Red-but-crossed = zero results but not minimal
  - When multiple subqueries are outstanding, their results may come back in any order; the order in the animation is just an example
Slide 29: Click to animate
Slide 30: Example (Continued)
- Results
  - Maximal subqueries with results (green)
    - PM = Paella Mussels
    - PS = Paella Squid
    - MS = Mussels Squid
  - Minimal subqueries with zero results (red, not crossed)
    - PMS = Paella Mussels Squid
    - E = Escargots
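The algorithm slides can be turned into a short Python sketch that reproduces the results listed above. It is a reading of the slides, not the authors' Flex code: a thread pool stands in for parallel calls to the Web API, `has_results` is a hypothetical stand-in for one backend query, and the three toy "recipes" are invented so that the outcome matches the example.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def cooperative_response(terms, has_results, max_workers=8):
    top = frozenset(terms)   # the original query, already known to have zero results
    zero = {top}             # subqueries known to have zero results
    minimal_zero = {top}     # candidates for "minimal with zero results"
    maximal = set()          # subqueries with results (maximal by construction)
    submitted = {top}
    pending = {}             # future -> subquery being evaluated

    def children(q):         # subqueries with one term fewer
        return [q - {t} for t in q] if len(q) > 1 else []

    def parents(q):          # subqueries with one term more
        return [q | {t} for t in top - q]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:

        def submit_ready_children(q):
            # A child is submitted once all of its parents are known to be zero.
            for c in children(q):
                if c not in submitted and all(p in zero for p in parents(c)):
                    submitted.add(c)
                    pending[pool.submit(has_results, c)] = c

        submit_ready_children(top)
        while pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                q = pending.pop(fut)
                if fut.result():
                    maximal.add(q)
                else:
                    zero.add(q)
                    minimal_zero.difference_update(parents(q))  # parents are not minimal
                    minimal_zero.add(q)
                    submit_ready_children(q)
    return maximal, minimal_zero


# Toy site content (made up): three recipes, none mentioning Escargots.
RECIPES = [{"Paella", "Mussels"}, {"Paella", "Squid"}, {"Mussels", "Squid"}]

def toy_has_results(query):
    return any(query <= recipe for recipe in RECIPES)

maximal, minimal_zero = cooperative_response(
    ["Paella", "Mussels", "Squid", "Escargots"], toy_has_results)
print(sorted(sorted(q) for q in maximal))
# [['Mussels', 'Paella'], ['Mussels', 'Squid'], ['Paella', 'Squid']]
print(sorted(sorted(q) for q in minimal_zero))
# [['Escargots'], ['Mussels', 'Paella', 'Squid']]
```

With this toy data the sketch issues 11 backend calls (4 three-term, 6 two-term and 1 one-term subquery); single terms such as Paella are never submitted because one of their parents already has results.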
Slide 31: 4. Issuing multiple queries simultaneously
- We've just used that idea for the previous algorithm!
- Cooperative responses would not be practical without the ability to issue queries in parallel against the Web API
  - Parallel queries: time linear in number of terms; just a few seconds even when forty or fifty subqueries have to be submitted
  - Sequential queries: time exponential in number of terms; could take 15 or 20 seconds for a 6-term query that requires submitting 30 or 40 subqueries
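The linear-versus-exponential claim follows from the shape of the subquery graph: an n-term query has at most 2^n - 2 proper non-empty subqueries, but the longest chain of dependent subqueries has only n - 1 steps (sizes n-1 down to 1). The Python sketch below works out these worst-case bounds. The 0.5 s latency is an assumption inferred from the slide's own figures (15 to 20 seconds for 30 to 40 subqueries), and the parallel bound assumes the backend accepts all outstanding requests at once.

```python
def worst_case_cost(n_terms, latency_s):
    """Worst-case bounds for the zero-results analysis of an n-term query."""
    subqueries = 2 ** n_terms - 2   # proper, non-empty subsets of the terms
    rounds = n_terms - 1            # longest chain of dependent subqueries
    return {
        "subqueries": subqueries,
        "sequential_s": subqueries * latency_s,  # one call after another
        "parallel_s": rounds * latency_s,        # calls in a round overlap
    }


print(worst_case_cost(6, 0.5))
# {'subqueries': 62, 'sequential_s': 31.0, 'parallel_s': 2.5}
```

Actual runs submit fewer subqueries than the worst case, because exploration stops below any subquery that has results.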
Slide 32: 4. Issuing multiple queries simultaneously (Continued)
- Multiple simultaneous queries may also be used to prefetch follow-up queries, even when the original query has results
- Recall that follow-up queries may include
  - A respelling of the original query
  - Related queries
  - Subqueries if there are no related queries
- Prefetching would mean zero delay even the first time the user clicks on a follow-up query
Slide 33: 4. Issuing multiple queries simultaneously (Continued)
- But there is a downside: resource consumption at the backend!
- Queries with results
  - If a query has N follow-up queries on average, prefetching would take (N+1) times the resources
  - Limited benefit
- Queries with zero results
  - Big benefit: cooperative responses
  - Cooperative responses are expensive, but queries with zero results are very rare
Slide 34: 4. Issuing multiple queries simultaneously (Continued)
- After discussion with Microsoft (the backend provider)
  - No prefetching of follow-up queries when the original query has results
  - Simultaneous queries OK to compute cooperative responses when the query has zero results
  - We only do the zero-results analysis for queries with no more than 6 terms
Slide 35: General Boolean Queries (Queries with AND, OR, NOT)
- Not implemented yet in Noflail Search
- The white paper at http://www.pomcor.com/whitepapers/multisearch.pdf proposes a method for suggesting follow-up queries to Boolean queries, and an algorithm that provides cooperative responses to Boolean queries that produce zero results.
- Here we just give an example of a possible run of that algorithm for a particular query
Slide 36: Illustration of the General Algorithm
- Query
  - PME OR PSE, i.e.
    - (Paella AND Mussels AND Escargots) OR
    - (Paella AND Squid AND Escargots)
  - with site constraint factored out as before
- Key idea
  - The user is only interested in subqueries of PME and PSE; not interested in subqueries with both M and S.
  - The subquery graph now has an area of interest
    - Black = subquery of interest
    - Grey = subquery not of interest
Slide 37: (No Transcript)
Slide 38: Illustration of the General Algorithm (Continued)
- Since PME OR PSE has zero results, both PME and PSE must have zero results
Slide 39: (No Transcript)
Slide 40: Illustration of the General Algorithm (Continued)
- The algorithm produces
  - The maximal subqueries among those of interest that have results
  - The minimal subqueries among those of interest that have zero results
- The animation in the next slide illustrates a possible run of the algorithm
Slide 41: (No Transcript)
Slide 42: Illustration of the General Algorithm (Continued)
- Results
  - Maximal subqueries of interest with results (green)
    - PM = Paella AND Mussels
    - PS = Paella AND Squid
  - Minimal subqueries of interest with zero results (red, not crossed)
    - E = Escargots
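The effect of the "area of interest" can be checked with a brute-force Python sketch. It handles only the OR-of-ANDs form used in this illustration (no NOT), it is not the algorithm from the white paper, and all names in it (`subqueries_of_interest`, `boolean_cooperative_response`, the toy recipes) are hypothetical.

```python
from itertools import combinations


def subqueries_of_interest(disjuncts):
    """Non-empty subsets of each AND-group; mixes across groups are excluded."""
    interest = set()
    for group in disjuncts:
        terms = sorted(group)
        for r in range(1, len(terms) + 1):
            interest.update(frozenset(c) for c in combinations(terms, r))
    return interest


def boolean_cooperative_response(disjuncts, has_results):
    interest = subqueries_of_interest(disjuncts)
    hits = {q for q in interest if has_results(q)}
    misses = interest - hits
    maximal_hits = {q for q in hits if not any(q < other for other in hits)}
    minimal_misses = {q for q in misses if not any(other < q for other in misses)}
    return maximal_hits, minimal_misses


# Toy site content (made up), using the slide notation P, M, S, E.
RECIPES = [{"P", "M"}, {"P", "S"}, {"M", "S"}]

def toy_has_results(query):
    return any(query <= recipe for recipe in RECIPES)

maximal, minimal_zero = boolean_cooperative_response(
    [{"P", "M", "E"}, {"P", "S", "E"}], toy_has_results)
print(sorted("".join(sorted(q)) for q in maximal))       # ['MP', 'PS']
print(sorted("".join(sorted(q)) for q in minimal_zero))  # ['E']
```

The subquery MS has results in the toy data but is not reported, because no single AND-group contains both M and S.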
Slide 43: For More Information
- White paper: http://www.pomcor.com/whitepapers/multisearch.pdf
- Presentation in PDF format (without animation) available at the Search Engine Meeting Web site
  - http://www.infonortics.eu/searchengines/index.html
- For comments, questions or to get the animated PowerPoint file, send email to
  - Francisco Corella <fcorella_at_pomcor.com>