Title: Web Service Search
1Web Service Search
- Xin Dong, Alon Halevy, Jayant Madhavan, Ema Nemes, Jun Zhang
- CSE Database Group
- University of Washington
2Outline
- What are web services?
- How do we search web services now?
- How can we improve web service search?
- Why is the problem hard?
- How do we solve the problem?
- Woogle web service search engine
3What is a Web Service?
- A web service is a software component, published and invoked across the Web using standard XML-based protocols
- Web service functionality and interface are formally described in WSDL files
- Web services are published by registering the WSDL URL and a brief description in UDDI business registries
- Web services offer large-scale information sharing and ubiquitous computing
7How to Search Web Services?
8Experience I
9Experience II
14Experience III
19How to Improve Web Service Search?
- The operation is the unit of remote invocation
- Provide similar operations for a given operation
20(1) Provide Similar WS Operations
- Op1: GetTemperature
- Input: Zip, Authorization
- Output: Return
- Op2: WeatherFetcher
- Input: PostCode
- Output: TemperatureF, WindChill, Humidity
Similar Operations → Selection
21(2) Provide Operations with Similar Inputs/Outputs
- Op1: GetTemperature
- Input: Zip, Authorization
- Output: Return
- Op2: WeatherFetcher
- Input: PostCode
- Output: TemperatureF, WindChill, Humidity
- Op3: LocalTimeByZipcode
- Input: Zipcode
- Output: LocalTimeByZipCodeResult
- Op4: ZipCodeToCityState
- Input: ZipCode
- Output: City, State
Similar Inputs → Aggregation
22(3) Provide Composable WS Operations
- Op1: GetTemperature
- Input: Zip, Authorization
- Output: Return
- Op2: WeatherFetcher
- Input: PostCode
- Output: TemperatureF, WindChill, Humidity
- Op3: LocalTimeByZipcode
- Input: Zipcode
- Output: LocalTimeByZipCodeResult
- Op4: ZipCodeToCityState
- Input: ZipCode
- Output: City, State
- Op5: CityStateToZipCode
- Input: City, State
- Output: ZipCode
Input of Op2 is similar to Output of Op5 → Composition
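The composition idea above can be sketched in a few lines. This is a minimal illustration, not the method used by Woogle: the `CONCEPTS` synonym table is a hand-written assumption standing in for real parameter matching, and "composable" is taken here to mean that at least one output of the first operation matches an input of the second.

```python
# Operations copied from the slide: name -> (inputs, outputs).
OPERATIONS = {
    "GetTemperature": (["Zip", "Authorization"], ["Return"]),
    "WeatherFetcher": (["PostCode"], ["TemperatureF", "WindChill", "Humidity"]),
    "LocalTimeByZipcode": (["Zipcode"], ["LocalTimeByZipCodeResult"]),
    "ZipCodeToCityState": (["ZipCode"], ["City", "State"]),
    "CityStateToZipCode": (["City", "State"], ["ZipCode"]),
}

# Hypothetical synonym table (an assumption for this sketch only).
CONCEPTS = {"zip": "zipcode", "zipcode": "zipcode", "postcode": "zipcode"}


def concepts(params):
    """Map parameter names to lowercase concept labels."""
    return {CONCEPTS.get(p.lower(), p.lower()) for p in params}


def composable_pairs(operations):
    """Return (producer, consumer) pairs where some output concept of the
    producer is also an input concept of the consumer."""
    pairs = []
    for producer, (_, outputs) in operations.items():
        for consumer, (inputs, _) in operations.items():
            if producer != consumer and concepts(outputs) & concepts(inputs):
                pairs.append((producer, consumer))
    return pairs


for producer, consumer in composable_pairs(OPERATIONS):
    print(producer, "->", consumer)
```

With the synonym table in place, CityStateToZipCode is paired with WeatherFetcher, matching the slide's observation about Op5 and Op2. Without the table, PostCode and ZipCode would not match at all.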
23Our Goal
- Given an operation, provide similar operations
- Operations with similar functionality (input and output)
- Operations with similar inputs/outputs
- Operations that can compose with the given operation
- Goal
- High recall: return potentially similar operations
- Good ranking: rank closer operations higher
- Operation matching problem
- Input/output matching problem
24Why is the Problem Hard?
- Thousands of web services vs. billions of web pages → Return all potentially similar operations
25Why is the Problem Hard?
- Thousands of web services vs. billions of web pages → Return all potentially similar operations
- Very brief descriptions (a couple of sentences or paragraphs) vs. much longer web pages → Lack of information
27Why is the Problem Hard?
- Thousands of web services vs. billions of web pages → Return all potentially similar operations
- Very brief descriptions (a couple of sentences or paragraphs) vs. much longer web pages → Lack of information
- Operation and parameter names are highly varied → Finding word usage patterns is harder
28Various Parameter Names
- Op1: GetTemperature
- Input: Zip, Authorization
- Output: Return
- Op2: WeatherFetcher
- Input: PostCode
- Output: TemperatureF, WindChill, Humidity
- Op3: LocalTimeByZipcode
- Input: Zipcode
- Output: LocalTimeByZipCodeResult
- Op4: ZipCodeToCityState
- Input: ZipCode
- Output: City, State
- Op5: CityStateToZipCode
- Input: City, State
- Output: ZipCode
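The variation in these names becomes concrete once each parameter is split into lowercase terms. The following is a minimal sketch that is not part of the original slides: the `split_terms` helper and its regular expression are assumptions chosen for illustration, and a real system would likely need a richer tokenizer.

```python
import re

# Assumed tokenizer: splits on camelCase boundaries and keeps runs of
# capitals (acronyms) and digits together.
_TERM = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_terms(name):
    """Split a camelCase identifier into lowercase terms."""
    return [term.lower() for term in _TERM.findall(name)]


# Input parameters of Op1 to Op4 on the slide; all four denote a zip code.
ZIP_INPUTS = ["Zip", "PostCode", "Zipcode", "ZipCode"]

for name in ZIP_INPUTS:
    print(name, split_terms(name))
```

The four inputs produce four different term lists (zip; post, code; zipcode; zip, code), so exact term matching alone would miss that they describe the same thing.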
29Why is the Problem Hard?
- Thousands of web services vs. billions of web pages → Return all potentially similar operations
- Very brief descriptions (a couple of sentences or paragraphs) vs. much longer web pages → Lack of information
- Operation and parameter names are highly varied → Finding word usage patterns is harder
- Operations have a more complex structure vs. web pages, which are mainly plain text → Finding term frequency is not enough
30Different Structure
- Op1: GetTemperature
- Input: Zip, Authorization
- Output: Return
- Op2: WeatherFetcher
- Input: PostCode
- Output: TemperatureF, WindChill, Humidity
- Op3: LocalTimeByZipcode
- Input: Zipcode
- Output: LocalTimeByZipCodeResult
- Op4: ZipCodeToCityState
- Input: ZipCode
- Output: City, State
- Op5: CityStateToZipCode
- Input: City, State
- Output: ZipCode
Similar use of words, but opposite functionality
31Naïve Methods
- Match terms in operation functionality descriptions
- TF/IDF weights
- Considering information from other sources together as a bag of words
- Input/output parameters
- Web service descriptions
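A bag-of-words baseline of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the baseline evaluated in the talk: the tokenizer, the tf × log(N/df) weighting, and the cosine measure are common choices picked here for concreteness, and the bags are built only from operation and parameter names because the slides do not include the textual descriptions.

```python
import math
import re
from collections import Counter

# Operations from the slides: name -> all input and output parameter names.
OPERATIONS = {
    "GetTemperature": ["Zip", "Authorization", "Return"],
    "WeatherFetcher": ["PostCode", "TemperatureF", "WindChill", "Humidity"],
    "LocalTimeByZipcode": ["Zipcode", "LocalTimeByZipCodeResult"],
    "ZipCodeToCityState": ["ZipCode", "City", "State"],
    "CityStateToZipCode": ["City", "State", "ZipCode"],
}

_TERM = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")  # assumed tokenizer


def bag_of_words(op_name, params):
    """Throw the operation name and all parameters into one term bag."""
    bag = Counter()
    for name in [op_name, *params]:
        bag.update(term.lower() for term in _TERM.findall(name))
    return bag


def tfidf_vectors(bags):
    """Weight each term by tf * log(N / df) over the given bags."""
    n = len(bags)
    df = Counter()
    for bag in bags.values():
        df.update(bag.keys())
    return {
        name: {t: tf * math.log(n / df[t]) for t, tf in bag.items()}
        for name, bag in bags.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


VECTORS = tfidf_vectors({n: bag_of_words(n, p) for n, p in OPERATIONS.items()})
print(cosine(VECTORS["ZipCodeToCityState"], VECTORS["CityStateToZipCode"]))
```

ZipCodeToCityState and CityStateToZipCode end up with identical bags, so this baseline scores them as near-perfect matches even though they do opposite things.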
32Our Strategy (1): Exploit Structure Information
- Compare each component separately
- Host web services
- Operation functionalities
- Inputs
- Outputs
- Combine the results by setting different weights for different components
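The component-wise comparison can be sketched as a weighted sum of per-component similarities. This is a minimal sketch under assumptions: Jaccard overlap stands in for whatever per-component measure is actually used, the weights are made-up values, the operation name terms stand in for the functionality description, and the host web service terms are left empty because the slides do not show them.

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical weights, one per component named on the slide.
WEIGHTS = {"service": 1, "functionality": 3, "inputs": 3, "outputs": 3}


def combined_similarity(op_a, op_b, weights=WEIGHTS):
    """Compare each component separately, then take the weighted average."""
    total = sum(weights.values())
    score = sum(w * jaccard(op_a[c], op_b[c]) for c, w in weights.items())
    return score / total


# Op4 and Op5 from the slides, with terms split by hand.
ZIP_TO_CITY = {
    "service": set(),  # not shown on the slides
    "functionality": {"zip", "code", "to", "city", "state"},
    "inputs": {"zip", "code"},
    "outputs": {"city", "state"},
}
CITY_TO_ZIP = {
    "service": set(),  # not shown on the slides
    "functionality": {"city", "state", "to", "zip", "code"},
    "inputs": {"city", "state"},
    "outputs": {"zip", "code"},
}

print(combined_similarity(ZIP_TO_CITY, CITY_TO_ZIP))
```

Because inputs and outputs are compared separately, the two operations no longer look identical: only the name terms overlap, while the input and output components contribute nothing.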
33Our Strategy vs. Naïve Methods
35Our Strategy (2): Clustering Parameters into Concepts
- Heuristic: Parameters occurring together tend to express the same concepts
- Strategy: Cluster parameters into concepts based on their co-occurrences
- Compare parameters and concepts separately, and combine the results
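The co-occurrence heuristic can be illustrated with a deliberately simple grouping procedure. This is a sketch under assumptions and should not be read as the clustering algorithm used in Woogle: terms are merged (single-link, via union-find) whenever they appear together in at least `min_confidence` of each other's occurrences, and both that rule and the threshold value are choices made for this example. The parameter lists are the inputs and outputs of Op1 to Op5 from the slides, split into terms by hand.

```python
from collections import Counter
from itertools import combinations

# One entry per input list or output list on the slides, split by hand.
PARAMETER_LISTS = [
    ["zip", "authorization"],
    ["return"],
    ["post", "code"],
    ["temperature", "f", "wind", "chill", "humidity"],
    ["zipcode"],
    ["local", "time", "by", "zip", "code", "result"],
    ["zip", "code"],
    ["city", "state"],
    ["city", "state"],
    ["zip", "code"],
]


def cluster_terms(parameter_lists, min_confidence=0.5):
    """Group terms that co-occur often relative to how often each appears."""
    occurrences = Counter()
    together = Counter()
    for terms in parameter_lists:
        unique = sorted(set(terms))
        occurrences.update(unique)
        together.update(combinations(unique, 2))

    parent = {term: term for term in occurrences}

    def find(term):
        # Union-find lookup with path halving.
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for (a, b), count in together.items():
        # Require the association to hold in both directions.
        if (count / occurrences[a] >= min_confidence
                and count / occurrences[b] >= min_confidence):
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    clusters = {}
    for term in occurrences:
        clusters.setdefault(find(term), set()).add(term)
    return sorted(sorted(cluster) for cluster in clusters.values())


for cluster in cluster_terms(PARAMETER_LISTS):
    print(cluster)
```

On this tiny sample, city and state fall into one group and zip and code into another, while post stays on its own because it appears with code only once. With so few operations, terms that occur a single time are grouped by accident, which is why a larger corpus of web services matters for this strategy.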
36Our Strategy vs. Naïve Methods
37Woogle
- A collection of 790 web services: 431 active web services, 1262 operations
- Version 1.0
- Web service category browse
- Keyword search on web service descriptions
- Keyword search on inputs/outputs
- Web service on-site try
- Web service status report
43Woogle
- Version 2.0
- Provide similar operations
- Web service template search
- Input: zip
- Output: city
- Functionality: address transform
- Return: ZipCodeToCityState, GetCity
44Woogle
- Version 3.0
- Automatic web service composition
- Input: city
- Output: temperature
- Functionality: get weather
- Return: CityStateToZipCode, GetTemperature; CityStateToZipCode, GetWeather
45Conclusions
- Propose a new web service search interface
- Provide lists of similar operations
- Strategies
- Exploit structure information
- Cluster parameters into concepts
- Woogle
- http://barb.cs.washington.edu:8080/won/wonServlet
46Conclusions
- Propose a new search method
- Search by parameters
- Define two fundamental problems
- Input-to-input matching
- Keyword-to-input matching
- Matching algorithm: corpus-based matching
- Experimental results show that it has the potential to return a list of appropriate web services
- Woogle
- http://barb.cs.washington.edu:8080/won/wonServlet