Implementation of One Stop Search by XSLT


1
Implementation of One Stop Search by XSLT
  • By Dave Low
  • University of Hong Kong
  • 9-Dec-2003

2
Agenda
  • Flow of One Stop Search
  • Reasons to use Extensible Stylesheet Language
    Transformations (XSLT)
  • Difficulties in implementing One Stop Search
    with XSLT
  • Our solution
  • Our implementation
  • Summary

3
Flow of One Stop Search
  1. Capture the search keyword
  2. Issue the search to different search engines
  3. Collect the results, clicking the next button
    until all the records have been retrieved
  4. Compile the search results from different search
    engines
  5. Present the result to the user
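
The five steps above can be sketched in Java as a loop over search engines, each of which is paged through until no next page remains. This is only an illustration of the flow: the `SearchEngine` interface, the `Page` class, and all method names are hypothetical and do not come from the presentation.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the one-stop search flow; every name here is illustrative. */
class OneStopSearchFlow {

    /** One page of results plus the URL behind its next button (null on the last page). */
    static final class Page {
        final List<String> records;
        final String nextUrl;

        Page(List<String> records, String nextUrl) {
            this.records = records;
            this.nextUrl = nextUrl;
        }
    }

    /** Hypothetical adapter for a single target search engine. */
    interface SearchEngine {
        Page search(String keyword);   // step 2: issue the search
        Page follow(String nextUrl);   // step 3: "click" the next button
    }

    /** Steps 2 and 3 for one engine: keep following the next button until it runs out. */
    static List<String> collectAll(SearchEngine engine, String keyword) {
        List<String> records = new ArrayList<>();
        Page page = engine.search(keyword);
        while (page != null) {
            records.addAll(page.records);
            page = (page.nextUrl == null) ? null : engine.follow(page.nextUrl);
        }
        return records;
    }

    /** Step 4: compile the results of every engine into one list for presentation. */
    static List<String> search(String keyword, List<SearchEngine> engines) {
        List<String> compiled = new ArrayList<>();
        for (SearchEngine engine : engines) {
            compiled.addAll(collectAll(engine, keyword));
        }
        return compiled;
    }
}
```

Steps 1 and 5 (capturing the keyword and presenting the result) belong to the user interface and are left out of the sketch.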

4
Flow of One Stop Search
[Diagram: One Stop Search captures the keyword, runs "search and next" against ProQuest, Science Direct and Kluwer Online, then compiles and presents the result]

5
Reasons to use XSL
  • Simple
    • XSL is plain text
  • Multiplatform
    • Can run on any machine with an XSLT engine
  • Easy to maintain
    • When the output layout of a target search engine
      changes, just change the content of the XSL file
    • No recompilation is needed

6
Two main problems when using XSL
  • An XSLT engine requires well-formed XML files as
    input
  • Web-based search engines output HTML only
  • HTML is not well-formed XML
    • HTML allows some tags to be opened without a
      closing tag
    • E.g. <br>
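
The problem can be reproduced with the XML parser that ships with the JDK. The following is a minimal sketch, not code from the presentation; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative helper: reports whether a piece of markup is well-formed XML. */
class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, e.g. because of an unclosed <br>.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or I/O failed", e);
        }
    }
}
```

Under these assumptions, `isWellFormed("<p>one<br>two</p>")` is rejected because `<br>` is never closed, while the XHTML form `<p>one<br/>two</p>` is accepted.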

7
Solution
  • Use HTML Tidy (http://tidy.sourceforge.net/) to
    convert HTML to well-formed XML
  • An HTML syntax checker and pretty printer. It can
    be used as a tool for cleaning up malformed and
    faulty HTML. In addition, it provides a DOM
    interface to the document that is being
    processed, which effectively lets you use it
    as a DOM parser for real-world HTML
  • It is open source
  • It has implementations in many languages, such
    as Java, Perl and Python

8
Solution
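  • Sample code in Java
      import java.io.StringReader;
      import org.w3c.tidy.Tidy;

      // Parse the raw HTML and return it as a DOM tree
      StringReader strReader = new StringReader(html);
      Tidy tidy = new Tidy();
      return tidy.parseDOM(strReader, null);
  • HTML => XML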
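
Once the page is available as a DOM tree, it can be handed to an XSLT engine through the standard JAXP API. The sketch below assumes the JDK's built-in XSLT processor and uses a plain XML parse as a stand-in for Tidy, since JTidy is not part of the JDK; the class name, the `TITLES_XSL` stylesheet and the sample markup are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Illustrative only: applies a stylesheet to a DOM tree via JAXP. */
class TransformDom {

    /** Hypothetical stylesheet: prints the text of every <li>, each followed by ';'. */
    static final String TITLES_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"//li\"><xsl:value-of select=\".\"/><xsl:text>;</xsl:text></xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Stand-in for the Tidy step: parses markup that is already well-formed. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse input", e);
        }
    }

    /** Runs the stylesheet over the DOM and returns the output as a string. */
    static String transform(Document dom, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

In a real deployment the `Document` would come from `tidy.parseDOM(...)` and the stylesheet would be the per-site XSL file, so a layout change on the target site only requires editing that file.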

9
Two main problems when using XSL
  • There is no browse function in XSL
  • In one-stop search, we need to click the next
    button several times to collect all the results
  • We need to tell the program to find the next
    button and then issue a browse request based on
    the URL of the next button
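
Locating the next button and turning it into a browse request might look like the sketch below. It assumes the target site marks its paging link with the visible text "Next", which will differ from site to site; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Illustrative only: finds the URL behind a result page's "Next" link. */
class NextLinkFinder {

    /** Parses already-tidied markup into a DOM tree. */
    static Document parse(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse page", e);
        }
    }

    /** Returns the absolute URL of the next page, or null when there is no next link. */
    static String findNext(Document page, String pageUrl) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Assumption: the paging link is an <a> whose text is exactly "Next".
            String href = xpath.evaluate("//a[normalize-space(.)='Next']/@href", page);
            if (href.isEmpty()) {
                return null;
            }
            // Resolve relative links against the URL of the current page.
            return URI.create(pageUrl).resolve(href).toString();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot evaluate XPath", e);
        }
    }
}
```

A caller would keep requesting the URL returned by `findNext` until it returns null, which is the "click next until all records are collected" loop.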

10
Solution
  • Add a browse function to XSL through an XSL
    extension
  • XSLT allows two kinds of extension: extension
    elements and extension functions
  • How extensions are written depends on the XSLT
    implementation
  • Details can be found at
    http://www.w3.org/TR/xslt#extension

11
Solution
  • Our implementation
    • Select a Java-based XSLT engine
    • Write the function in Java
    • Compile it into classes and package them in a
      jar
    • Include the jar file in the classpath of the
      XSLT engine
    • Run it
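
A Java class of the kind that gets compiled and placed on the engine's classpath might look like the following. This is a hypothetical stand-in and not the authors' `hkul.apps.web.Browser`: the constructor and the `browse` method are assumed shapes, and a plain XML parse replaces the Tidy step because JTidy is not part of the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/**
 * Hypothetical extension class. When packaged for an XSLT engine, the class
 * and its methods would normally be declared public so the engine can call them.
 */
class Browser {
    private final String startUrl;

    Browser(String startUrl) {
        this.startUrl = startUrl;
    }

    String getStartUrl() {
        return startUrl;
    }

    /** Fetches the page at url and returns it as a DOM tree for the stylesheet. */
    Document browse(String url) {
        try {
            String markup = fetch(url);
            // Assumption: the markup is already well-formed; real pages need Tidy here.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot browse " + url, e);
        }
    }

    /** Plain HTTP GET; overridable so the class can be exercised without a network. */
    protected String fetch(String url) throws Exception {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toString("UTF-8");
        }
    }
}
```

How the engine maps a namespace prefix to such a class is engine-specific, so the engine's own extension documentation should be checked before relying on this shape.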

12
Sample code for an XSL extension
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- Define the class to be used (the HKUL namespace) -->
    <xsl:stylesheet version="1.1"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:HKUL="http://www.lib.hku.hk/java/hkul.apps.web.Browser"
        exclude-result-prefixes="HKUL">
      <xsl:template match="/">
        <xsl:variable name="url">http://www.lib.hku.hk/</xsl:variable>
        <!-- Create it -->
        <xsl:variable name="browser" select="HKUL:new($url)" />
        <!-- Call the browse function -->
        <xsl:variable name="content" select="HKUL:browse($browser,$url)" />
        <xsl:apply-templates select="$content/html/*" />
      </xsl:template>
    </xsl:stylesheet>

13
Our Implementation
[Diagram: processing stages labeled Browse, Next, Tidy, Parse and Result]

14
Our Implementation
  • Both client and server programs are written in
    Java
  • Client and server programs communicate over HTTP
  • Communication runs over a wireless network

15
Our Implementation (Client side)
  • Palm OS
  • Sun's Java 2 Platform, Micro Edition (J2ME)
    http://java.sun.com/j2me/
  • Mobile Information Device Profile (MIDP)
    http://java.sun.com/products/midp

16
Our Implementation (Server side)
  • Application server (running on Sun Solaris with
    JDK 1.4)
  • Jakarta Tomcat (http://jakarta.apache.org/tomcat)
  • Jakarta Struts Framework
    (http://jakarta.apache.org/struts)
  • Xerces XSLT Engine (http://xml.apache.org/xerces)
  • MySQL database (http://www.mysql.com)

17
Summary
  • Implement the one stop search by XSLT
    • Simple
    • Multiplatform
    • Easy to maintain
  • Two problems
    • HTML is not well-formed XML
    • No browse function in XSL

18
Summary
  • Solutions
    • HTML Tidy
    • XSL extension
  • Implementation
    • J2ME
    • Jakarta Tomcat and Struts
    • Xerces
    • MySQL

19
Questions?
  • Thank you