Title: PowerPoint Presentation
1 Search Computing Stefano Ceri
2 Who helped me in doing the proposal
3 Candidate ideas
- Web Engineering
- Software Engineering for the Semantic Web
- Stream reasoning
- Search computing
4 How I picked the SeCo idea.
5 Time to delivery of the proposal
- Paraboschi: Thursday evening
- Rinaldi: Friday morning (in the shower)
- Week-end: Main body written
- Monday-Wednesday: State of the art, bureaucracy
- Thursday: First submission
- Friday at 5 PM: Deadline
6 Motivating Examples
- Who are the strongest candidates in Europe for competing on software ideas?
- Who is the best doctor who can cure insomnia in a close-by hospital?
- Where can I attend an interesting scientific conference in my field and at the same time relax on a beautiful beach nearby?
This information is available on the Internet, but no software system is capable of computing the answer. Queries span multiple semantic domains and require composing the rankings of their results.
7 Their Common Aspect
- Multi-domain queries
- The answers are on the Web
- A knowledgeable user would do the query step-by-step
- Search database conferences, get their city
- Check that the city average temperature is warm enough
- Search low-cost flights via a broker for that city
- Search luxury hotels via another broker
- After hours of painful search the user might actually succeed!
- Can this be done better?
8 Genesis of Search Computing
- 2003 LOWELL WORKSHOP CHALLENGE: Find an ethnic restaurant in a nice place close to Milano.
- Logically a composition of domains
- Restaurants (ethnic)
- Geo-locations (nice place close to Milano)
- Composing maps with geo-located information is now almost solved by many services, e.g. on top of Yahoo Local and Google Local
9 Camorra-Controlled Locations near Naples
10 Genesis of Search Computing
- 2003 LOWELL WORKSHOP CHALLENGE: Find an ethnic restaurant in a nice place close to Santa Clara.
- Logically a composition of domains
- Restaurants (ethnic)
- Geo-locations (nice place close to Santa Clara)
- Composing maps with geo-located information is now almost solved by many services, e.g. on top of Yahoo Local and Google Local
- but in general no system is capable of
composing arbitrary semantic domains
11 Search Computing
- Search computing is a new multi-disciplinary science which will provide the abstractions, foundations, methods, and tools required to answer multi-domain queries on the Web
- Emphasis on
- Search services. These are software services producing ranked information. Ranking is essential for search service composition.
- Search integration. The objective is not to build new search systems but instead to integrate a world-wide network of search systems.
- Web site: http://home.dei.polimi.it/ceri/seco/
12 Research Chapters of Search Computing, 1
- Foundational theories, rooted in formal disciplines such as mathematics and optimization theory.
- Statistical models for estimating the number and quality of the results produced by a search service.
- Optimization methods for determining efficient plans for service integration.
- Software paradigms for designing and constructing search computing systems.
- Interaction paradigms to help user-friendly expression of queries and ranking.
- Framework for search-oriented software architectures and their instrumentation.
13 Research Chapters of Search Computing, 2
- Semantic domain knowledge for dealing with terminological aspects in composing search engine results.
- Higher-order rankings for prioritizing search objects and services.
- Personal and social aspects for setting ranking in relationship to individuals and context.
- Business models for pushing and developing search computing economy.
- Legal and privacy issues concerned with search sources integration.
- Advanced computational architectures for search computing.
14 Some Challenging Issues
- Definition of "best": a multi-facet, multi-disciplinary problem cutting across the various areas
- understanding and translation of encoding systems for ranking,
- contextualization relative to the situation in which the query is presented,
- knowledge about the individual who is asking the question,
- societal behaviours such as the presence in preferred lists or in authoritative sources, such as blogs.
- Dynamic ranking problem: dynamically ranking and selecting search objects and services at query execution time
- Includes object-based ranking,
- Driven by user interaction.
15 How does it relate to
- Semantic Web?
- At least in the first part of SeCo we define and solve a simpler problem, where sources are pre-defined and their semantics is known, and therefore join patterns are given.
- Dynamic selection will be slowly and carefully injected.
- Social Web?
- We use societal information to improve the search, selection, and ranking of objects and sources. But we don't plan to do specific research, rather to reuse results.
16 How does it relate to other projects?
- Pharos (large IP with little DEI involvement)
- Deals with extraction of metadata from new media (video, audio files) and their Internet search (find all videos where Obama speaks about Italy)
- It is orthogonal, but we may use early results.
- LarKC (large IP with medium DEI involvement)
- Deals with developing new languages, methods and tools for massive reasoning; we are involved in stream reasoning.
- Streams are yet another form of ranked information (by their timestamps) where recent data are more valuable, so we may reuse some of the research approaches and methods.
- Bio-informatics
- An important scientific domain with lots of open issues
17 Preliminary Results in Search Computing
- FUNDING (2007-08): PRIN NGS (New Generation Search)
- Politecnico di Milano (National Coordination)
- University of Roma 3
- Free University of Bolzano
- The brick: join of two search services
- Information Systems, March 2008
- The framework: multi-domain query optimization
- International Very Large Data Bases Conference, Auckland (NZ), August 2008
- The interface: mash-up based interaction
- IEEE Internet Computing, November 2008
18 JOIN of Web Services
- Input: items resulting from TWO web service calls, possibly ranked
- Output: composed items resulting from the concatenation of matching items, presented in a global ranking order
- Matching condition using
- value equality,
- partial set matching
- term matching within a vocabulary
- ..
- Services are known, their matching function is predefined: this is not service discovery!
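The join described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Item` class, the `ranked_join` and `same_key` names, the use of a product to combine the two scores, and all sample data are hypothetical and are not taken from the SeCo prototype. The sketch pairs items from two ranked result lists using a predefined matching condition (value equality here) and presents the pairs in a global ranking order.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Item:
    """One result returned by a (hypothetical) search service."""
    key: str      # attribute used by the matching condition
    label: str    # human-readable description
    score: float  # service-local ranking score, assumed to lie in [0, 1]


def ranked_join(
    left: Iterable[Item],
    right: Iterable[Item],
    matches: Callable[[Item, Item], bool],
    combine: Callable[[float, float], float] = lambda x, y: x * y,
) -> List[Tuple[Item, Item, float]]:
    """Pair matching items and order the pairs by a combined score.

    The product is an assumed combination function; any monotone
    function of the two scores could be passed in instead.
    """
    right_items = list(right)  # materialize so it can be scanned repeatedly
    pairs = [
        (a, b, combine(a.score, b.score))
        for a in left
        for b in right_items
        if matches(a, b)
    ]
    # Global ranking: highest combined score first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


def same_key(a: Item, b: Item) -> bool:
    """Value-equality matching, made case-insensitive as an assumption."""
    return a.key.lower() == b.key.lower()


# Invented sample data standing in for two ranked service results.
store_a = [
    Item("Album X", "store A listing", 0.9),
    Item("Album Y", "store A listing", 0.6),
]
store_b = [
    Item("album y", "store B listing", 0.8),
    Item("album x", "store B listing", 0.5),
]

result = ranked_join(store_a, store_b, same_key)
for a, b, combined in result:
    print(f"{a.key}: {combined:.2f}")
```

The nested loop compares every pair of items, which is acceptable for short result lists; it makes no attempt to exploit the input rankings to stop early.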
19 Popular rock CDs
- Sources: Amazon and iTunes
- Score measures ranking
- Match measures similarity
20 Relevant news in two newspapers
- Sources: Corriere.it and Repubblica.it
21 Search Service Integration Framework
- Objective: a Web Service Management System
- The system accepts queries, optimizes them transparently to the user, and produces the result
- This is the follow-up of research done at Stanford (VLDB06) but with significant changes
- Focus on search services
- Ranking as first-class citizen
- Physical optimization
22 Example
Find database conferences in the next six months
in warm locations offering inexpensive flights
from Milano and luxury hotels
23 Query Plan
Composition of four services: Conferences, Weather forecasts, Flights, Hotels
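One possible shape for such a plan is sketched below in Python. Everything in it is an assumption made for illustration: the four stub functions, their invented data, the thresholds, the score combination, and the execution order (weather check first, so that later calls are skipped for cold cities) are hypothetical and do not describe the plan chosen by the actual optimizer.

```python
from typing import Dict, List


# Stub services: hypothetical stand-ins returning invented data.
def search_conferences(topic: str) -> List[Dict]:
    return [
        {"name": "Conf A", "city": "City 1", "score": 0.9},
        {"name": "Conf B", "city": "City 2", "score": 0.7},
    ]


def average_temperature(city: str) -> float:
    return {"City 1": 12.0, "City 2": 26.0}[city]


def search_flights(origin: str, city: str) -> List[Dict]:
    fares = {"City 1": [80.0], "City 2": [150.0, 95.0]}
    return [{"to": city, "price": price} for price in fares[city]]


def search_hotels(city: str) -> List[Dict]:
    return [{"name": f"Grand Hotel {city}", "stars": 5, "score": 0.8}]


def run_plan(topic: str, origin: str, min_temp: float, max_fare: float) -> List[Dict]:
    """Compose the four services and return combinations, best first."""
    results = []
    for conf in search_conferences(topic):
        city = conf["city"]
        # Selective check first: cold cities never reach the flight
        # and hotel services.
        if average_temperature(city) < min_temp:
            continue
        cheap = [f for f in search_flights(origin, city) if f["price"] <= max_fare]
        if not cheap:
            continue
        flight = min(cheap, key=lambda f: f["price"])
        for hotel in search_hotels(city):
            if hotel["stars"] < 5:  # "luxury" is assumed to mean five stars
                continue
            results.append({
                "conference": conf["name"],
                "city": city,
                "fare": flight["price"],
                "hotel": hotel["name"],
                # Assumed combination: product of the two ranking scores.
                "score": conf["score"] * hotel["score"],
            })
    return sorted(results, key=lambda r: r["score"], reverse=True)


plan_result = run_plan("databases", "Milano", min_temp=20.0, max_fare=100.0)
for row in plan_result:
    print(row)
```

With the invented data, only the conference in the warm city survives the weather check, and the cheaper of its two flights is kept.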
24 Method
Approach
25 Developer-Oriented interface
- Mashing up software services is becoming very popular among developers
- We propose a declarative mashup language for search services as a simple interface of the Web Service Management System, hiding all the optimization
26 Mashup interface for Search Service Integration
27 Some of the wrapped Web sites in the current prototype
- Booking.com (www.booking.com) for hotels
- Expedia (www.expedia.it) for flights
- AccuWeather (http://www.dapper.net/) for weather conditions
- TicketOne (www.ticketone.it) for events
- GoogleMaps (maps.google.com): Distance Calculator, Find Businesses
- Bed-and-breakfasts (www.bedandbreakfast.it)
- 35mm.it (http://programmazione.35mm.it/) for movie locations
- IMDB (www.imdb.com) for movie descriptions
28 Workplan Details
29 Research Agenda for 2009
- Building a theory of search computing: focus on services producing ranked lists
- Operations, parameters, statistical models, optimization models
- Improving language abstractions and description formalisms for supporting multi-domain queries
- Covering various levels of abstraction
- Requiring various levels of user expertise
- Build a first prototype: so far we have some of its ingredients
- Invest in applications: more experiments, in diverse fields (including bio-medical applications)
- Start thinking about many other topics and organize the team.
30 Participants
- DB Group
- Daniele Maria Braga
- Marco Brambilla
- Alessandro Campi
- Stefano Ceri
- Sara Comai
- Emanuele Della Valle
- Piero Fraternali
- Pier Luca Lanzi
- Marco Masseroli
- Maristella Matera
- Giuseppe Pozzi
- Marco Tagliasacchi
- DEI
- Edoardo Amaldi
- DIG
- Roberto Verganti
- Tommaso Buganza
- PhD Students
- Adnan Adib
- Mamoun Abu Helou
- Davide Barbieri
- Alessandro Bozzon
- Davide Mazza
- Stefania Ronchi
- Massimo Tisi
- Post-Docs
- Michael Grossniklaus
- Davide Martinenghi
- Anthony Ventresque (?)
31 Advisory Board
- Fabio Casati, Università di Trento, expert on "Service Computing"
- Georg Gottlob, Oxford University, expert on "Search Computing Theory"
- Ioana Manolescu, INRIA - Paris, expert on "Systems and Performance"
- Roberto Verganti, School of Management, Politecnico di Milano, expert on "Business Models"
- Gerhard Weikum, Max-Planck-Institut für Informatik, Saarbruecken, expert on "Information Retrieval for the Web"
- Jennifer Widom, Stanford University, expert on "Languages and Paradigms"
32 Continuing.
- Davide Martinenghi: Theory of Search Computing
- Alex Campi: Join of Search Services
- Daniele Braga: Framework for Search Computing
- Piero Fraternali: Search for Video-Audio Contents
- Then: A round table of first impressions among SeCo participants (especially newcomers)
- Finally (for the ones who will deserve it): Big Dinner at Ceri's