Search and Data Management - PowerPoint PPT Presentation

About This Presentation
Title:

Search and Data Management

Description:

Understand the virtuous cycle between search and data and ways to ... Internet J. Cardiology. 2001. Collaborative Knowledge Creation (Educational Material) ... – PowerPoint PPT presentation

Slides: 13
Provided by: alanhal
Learn more at: https://dsf.berkeley.edu

Transcript and Presenter's Notes

Title: Search and Data Management


1
Search and Data Management
  • Rakesh Agrawal
  • MSR Search Lab

2
Current Focus Direction
  • Understand the virtuous cycle between search and
    data and ways to accelerate it
  • New search-centric applications
  • Personal data mining (Health)
  • Distributed Knowledge creation (Education)

3
Search Data Virtuous Cycle
  • Intents, Behaviors, Connections, Popularity, Trends
  • Web Pages, Feeds
  • Better Search Results → More Data → Greater Insights →
    Better Search Results

4
Related Searches (aka Query Suggestions)
  • Most popular queries containing the current query
  • Analysis of how users reformulated their queries
  • Query click graph to find related queries
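
A minimal sketch of the first and third ideas above, not the presenter's implementation. It assumes a hypothetical query log (a list of query strings) and a hypothetical click log (a list of (query, url) pairs). "Containing the current query" is interpreted here as containing all of its terms, and click-graph relatedness is approximated by the number of shared clicked URLs.

```python
from collections import Counter, defaultdict


def popular_superqueries(query_log, current, k=3):
    """Most frequent logged queries that contain every term of `current`."""
    current = current.lower()
    terms = set(current.split())
    counts = Counter(q.lower() for q in query_log)
    matches = [(q, n) for q, n in counts.items()
               if q != current and terms <= set(q.split())]
    # Sort by descending frequency, then alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:k]]


def co_clicked_queries(click_log, current, k=3):
    """Queries that share clicked URLs with `current`, ranked by overlap size."""
    current = current.lower()
    urls_by_query = defaultdict(set)
    for query, url in click_log:
        urls_by_query[query.lower()].add(url)
    target = urls_by_query.get(current, set())
    scored = [(q, len(urls & target)) for q, urls in urls_by_query.items()
              if q != current and urls & target]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in scored[:k]]
```

The two functions can be combined, for example by taking popular containing queries first and filling remaining slots with co-clicked queries; that combination is a design choice left open here.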

5
Result Diversification
  • Ideas from portfolio theory to allocate space to
    different result types
  • Marginal utility of adding a document decreases
    if the result set already contains high quality
    documents of the same type
  • Query and document classification using merged
    click logs
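
The diminishing marginal utility idea above can be sketched with a greedy selection. This is an illustrative model under stated assumptions, not the method from the talk: each candidate is a hypothetical (doc_id, doc_type, quality) tuple with quality in [0, 1], and the marginal utility of a document is its quality multiplied by the probability that no already selected document of the same type has satisfied the user.

```python
def diversify(candidates, k):
    """Greedily pick up to k documents with type-aware diminishing returns.

    candidates: list of (doc_id, doc_type, quality), quality in [0, 1].
    """
    remaining = list(candidates)
    unmet = {}  # doc_type -> probability that this type is still unsatisfied
    selected = []
    while remaining and len(selected) < k:
        # Marginal utility shrinks as high-quality docs of the same type are added.
        best = max(remaining, key=lambda d: d[2] * unmet.get(d[1], 1.0))
        selected.append(best[0])
        unmet[best[1]] = unmet.get(best[1], 1.0) * (1.0 - best[2])
        remaining.remove(best)
    return selected
```

With this rule, a second high-quality document of an already covered type can lose its slot to a lower-quality document of a type not yet shown.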

6
Classification Using Click Graph
  • ANIMALS documents
  • ANIMALS queries
  • Seed documents
  • Algorithm: Random walk with absorbing states
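
A minimal sketch of a random walk with absorbing states on a query-document click graph, under assumptions not stated on the slide: clicks are given as a hypothetical dict mapping (query, doc) to a click count, labelled seed documents are the absorbing states, and transition probabilities are proportional to click counts. The function estimates, for every node, the probability that a walk starting there is absorbed at a seed carrying the target label (for example ANIMALS).

```python
from collections import defaultdict


def absorption_probabilities(clicks, seeds, target, iterations=200):
    """clicks: {(query, doc): click_count}; seeds: {doc: label}.

    Returns {node: probability of absorption at a `target` seed}, where
    nodes are ("q", query) or ("d", doc).
    """
    neighbors = defaultdict(dict)
    for (query, doc), count in clicks.items():
        neighbors[("q", query)][("d", doc)] = count
        neighbors[("d", doc)][("q", query)] = count

    prob = {node: 0.0 for node in neighbors}
    absorbing = set()
    for doc, label in seeds.items():
        prob[("d", doc)] = 1.0 if label == target else 0.0
        absorbing.add(("d", doc))

    for _ in range(iterations):
        updated = dict(prob)
        for node, edges in neighbors.items():
            if node in absorbing:
                continue  # seed documents keep their fixed value
            total = sum(edges.values())
            # Click-weighted average of the neighbours' current estimates.
            updated[node] = sum(prob[n] * w for n, w in edges.items()) / total
        prob = updated
    return prob
```

Queries and unlabelled documents whose probability exceeds a chosen threshold could then be assigned to the target class; the threshold and the fixed iteration count are illustrative choices.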
7
Changing Nature of Disease
Infectious Diseases
  • New challenge: chronic conditions, meaning illnesses and
    impairments that are expected to last a year or more,
    limit what one can do, and may require ongoing care.
  • In 2005, 133 million Americans lived with a
    chronic condition (up from 118 million in 1995).

8
Technology Trends
  • Tremendous simplification in the technologies for
    capturing useful personal information
  • Dramatic reduction in the cost and form factor
    for personal storage
  • Cloud Computing

9
Personal Health Analytics
10
Personal Data Mining
  • Charts for appropriate demographics?
  • Optimum level for Asian Indians: 150 mg/dL (much lower than
    200 mg/dL for Westerners), due to elevated levels of
    lipoprotein(a)
  • Computation and selection across millions of data sources
  • Privacy and security
Enas et al. Coronary Artery Disease in Asian Indians.
Internet J. Cardiology. 2001.

11
Collaborative Knowledge Creation (Educational Material)
  • Inspired by Wikipedia
  • But multiple viewpoints rather than one consensus
    version!
  • How to personalize search to find the material
    suitable for one's own style of teaching?
  • Management of trust and authoritativeness?
  • Wikipedia: more than 3.5 million articles in 75 languages
  • Fashioned by more than 25,000 writers
  • 1 million articles in English (80,000 in
    Encyclopedia Britannica)

12
Summary
  • Web search is a problem of data management and of
    creating value from data
  • New search-centric applications can provide rich
    fodder for future database research.