Technical Developments Related to Quality Issues - PowerPoint PPT Presentation

1
Technical Developments Related to Quality Issues
  • Brian Kelly
  • UK Web Focus
  • UKOLN University of Bath
  • Bath, BA2 7AY
  • B.Kelly@ukoln.ac.uk
  • http://www.ukoln.ac.uk/
  • Contents
  • Application-based Developments
  • Protocol Developments
  • Conclusions

UKOLN is funded by the British Library Research
and Innovation Centre, the Joint Information
Systems Committee of the Higher Education Funding
Councils, as well as by project funding from the
JISC's Electronic Libraries Programme and the
European Union. UKOLN also receives support from
the University of Bath where it is based.
2
Application-Based Solutions
  • Sophisticated search engines are being developed
  • Google
  • Large-scale search engine for the research
    community (now commercial)
  • Clever
  • IBM research project
  • Direct Hit!
  • Records how users make use of search engines
  • Alexa
  • Allows end users to vote on resources

3
Google
  • Google uses a "PageRank" technique - important
    resources are pointed to from many sites and
    important sites (e.g. Yahoo).
  • See <URL: http://www.google.com/>
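
The idea that a resource is important when many sites, and important sites in particular, point to it can be illustrated with a small iterative calculation. The following is a minimal sketch only, assuming the commonly published form of PageRank with a damping factor of 0.85. The link graph and page names are hypothetical, and this is not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by repeatedly passing each page's score along its links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical link graph: each key links to the pages in its list.
links = {
    "yahoo": ["library", "news"],
    "news": ["library"],
    "blog": ["library", "yahoo"],
    "library": ["yahoo"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph "blog" has no incoming links and so keeps only the baseline score, while "library" benefits both from the number of links it receives and from the score of the pages that link to it.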

[Screenshots: a Google search for "Digital Libraries", and the page reached by following the link to the first hit]
4
Clever
See <URL: http://www.almaden.ibm.com/cs/k53/clever.html>
  • Aims to find a small set of documents holding the
    most authoritative information on the requested
    subject.
  • Uses a standard search engine to gather a "root
    set" of pages matching the query. Next, adds all
    pages pointing to or pointed to by the root set.
    Thereafter, it uses only the links between these
    pages to distill the best authorities and hubs.
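
The two steps described above (growing the root set, then scoring pages from links alone) can be sketched as follows. This is an illustrative sketch of the hubs-and-authorities iteration usually known as HITS, not IBM's actual code. The function names, the link graph and the page names are all hypothetical.

```python
import math


def expand_root_set(root, links):
    """Add every page that points to, or is pointed to by, a root-set page."""
    base = set(root)
    for src, targets in links.items():
        for t in targets:
            if src in root or t in root:
                base.update((src, t))
    return base


def hits(links, iterations=20):
    """Score pages as hubs and authorities using only the links between them."""
    pages = set(links) | {t for ts in links.values() for t in ts}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page is a good authority if good hubs point to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # A page is a good hub if it points to good authorities.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalise so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical web graph; "mlb" stands for the page a search engine returned.
links = {
    "fan_list": ["mlb", "stats"],
    "sports_dir": ["mlb", "stats", "shop"],
    "mlb": ["stats"],
    "unrelated": ["elsewhere"],
}
root = {"mlb"}
base = expand_root_set(root, links)
# Keep only the links that run between pages in the expanded set.
sub = {s: [t for t in ts if t in base] for s, ts in links.items() if s in base}
hub, auth = hits(sub)
print(sorted(auth, key=auth.get, reverse=True))
```

Note that the text of the pages is not consulted after the root set has been gathered: the scores depend only on who links to whom.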

AltaVista results include sites selling medical
services.
Distinct pages found using Clever
Clever finds the key Baseball sites.
5
Direct Hit
  • Direct Hit
  • Integrated with search engines such as Yahoo
  • Ranks results based on clicking profile from
    other users of the search service

http://www.directhit.com/
Users searching for Dublin Core typically click
on links related to metadata. Therefore put
these at the top of the search results.
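
Ranking by clicking profile can be illustrated with a few lines of code. This is a minimal sketch under simple assumptions (a log of query and clicked-URL pairs, and a plain click count as the score). Direct Hit's real algorithm is not described in these slides, and the URLs and function name below are hypothetical.

```python
from collections import Counter


def rerank_by_clicks(results, click_log, query):
    """Order results for `query` by how often earlier users clicked them."""
    clicks = Counter(url for q, url in click_log if q == query)
    # sorted() is stable, so results with equal click counts keep the
    # order in which the underlying search engine returned them.
    return sorted(results, key=lambda url: -clicks[url])


# Hypothetical results from the underlying search engine.
results = [
    "http://tourism.example/dublin",
    "http://fitness.example/core",
    "http://metadata.example/dublin-core",
]
# Hypothetical log of (query, clicked URL) pairs from earlier users.
click_log = [
    ("dublin core", "http://metadata.example/dublin-core"),
    ("dublin core", "http://metadata.example/dublin-core"),
    ("dublin", "http://tourism.example/dublin"),
]
ranked = rerank_by_clicks(results, click_log, "dublin core")
print(ranked)
```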
6
Alexa
  • Alexa
  • Enables end users to "rate" sites when surfing
  • Includes access to related links
  • Based on central archive of the web (see
    <URL: http://www.archive.org/>)
  • See also Netscape's What's Related facility

http://www.alexa.com/
  • Possibilities
  • Signed votes
  • Use Alexa model with UK database of resources

7
Summary
  • Good News
  • A new generation of experimental search engines
    is being developed
  • Algorithms include
  • Making use of link information
  • Making use of end users' input
  • Collaborative bookmarks (cf FireFly - You like
    "Sex" and "Drugs". So does he, and he also likes
    "Rock'n'Roll")
  • But such techniques make use of a "brute strength"
    approach
  • Is there a more elegant solution?
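
The collaborative bookmarks idea mentioned above ("you like A and B, so does he, and he also likes C") can be sketched as a simple overlap count. This is an illustration only, not FireFly's actual algorithm. The user names, bookmark labels and function name are hypothetical.

```python
def recommend(likes, user):
    """Suggest items liked by users whose tastes overlap with `user`."""
    mine = likes[user]
    scores = {}
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            # Users with nothing in common contribute no suggestions.
            continue
        for item in theirs - mine:
            # Weight each suggestion by how similar the other user is.
            scores[item] = scores.get(item, 0) + overlap
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical bookmark collections.
likes = {
    "you": {"dublin-core", "rdf"},
    "colleague": {"dublin-core", "rdf", "pics"},
    "stranger": {"baseball"},
}
suggestions = recommend(likes, "you")
print(suggestions)
```

Every user's collection is compared with every other, which is one sense in which such techniques rely on "brute strength" rather than on descriptions of the resources themselves.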

8
We Need Metadata!
  • Web originally based on 3 architectural
    components (addressing with URLs, transport with
    HTTP and data format with HTML).
  • Metadata is the missing component.

The W3C is developing a machine-understandable
metadata framework which can automate a variety
of tasks (resource discovery, content filtering,
etc.)
9
RDF
  • RDF (Resource Description Framework)
  • Provides a metadata framework ("machine
    understandable metadata for the web")
  • Based on ideas from content rating (PICS),
    resource discovery (Dublin Core), etc.
  • Based on a formal data model (directed labelled
    graphs)
  • Applications include
  • cataloging resources
  • resource discovery
  • intellectual property rights
  • content rating
  • digital signatures
  • privacy
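
A directed labelled graph can be written down as a list of (subject, predicate, object) statements, where the predicate is the label on the arc from subject to object. The sketch below illustrates that data model in plain Python, without an RDF library. The resource URL and the property values are hypothetical, the helper names are invented for illustration, and the Dublin Core element namespace is used for the property names.

```python
from collections import namedtuple

# One arc of the graph: subject --predicate--> object.
Triple = namedtuple("Triple", "subject predicate object")

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

# Hypothetical descriptions of two hypothetical resources.
graph = [
    Triple("http://example.org/report", DC + "title", "Quality Issues"),
    Triple("http://example.org/report", DC + "creator", "A. Author"),
    Triple("http://example.org/report", DC + "subject", "metadata"),
    Triple("http://example.org/notes", DC + "subject", "metadata"),
]


def describe(graph, subject):
    """Collect everything the graph says about one resource."""
    return {t.predicate: t.object for t in graph if t.subject == subject}


def find(graph, predicate, obj):
    """Resource discovery: which resources have this property value?"""
    return [t.subject for t in graph if t.predicate == predicate and t.object == obj]


about_report = describe(graph, "http://example.org/report")
on_metadata = find(graph, DC + "subject", "metadata")
print(about_report[DC + "title"], on_metadata)
```

Because every statement has the same three-part shape, the same structure could in principle carry content ratings, rights information or signatures as well as cataloguing data.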

RDF Data Model
10
Certificates
  • Certificates can be provided for
  • Services
  • Users
  • Code (Java, ActiveX)
  • Certificate Authorities (CAs) can distribute
    certificates
  • Global CAs (Verisign, Thawte)
  • National CAs (Post Office, central University
    body, British Library, etc)
  • Government legislation this session related to
    digital signatures

11
Certificates Within An Organisation
  • Digital signatures will enable publishers (e.g.
    Universities) to give an authoritative stamp to
    digital resources

Staff and students can be given a certificate
which is used for authentication
Admissions
The CVCP could give certificates to Universities,
who would then be authorised to distribute
certificates within the university
Within the University, the Research Office and PR
Office can allocate legally-binding signatures to
authorised publications
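
The delegation described above (a central body certifies universities, which in turn certify their own offices) amounts to following a chain of issuers back to a trusted root. The sketch below models only that chain. It performs no cryptography, whereas a real system would also verify the digital signature on every certificate. The class, the function and the "Unknown Press" entry are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str  # who the certificate was given to
    issuer: str   # who gave it


def chain_to_root(subject, certs, trusted_root):
    """Return the certificates linking `subject` to the root, or None."""
    by_subject = {c.subject: c for c in certs}
    chain = []
    seen = set()
    current = subject
    while current != trusted_root:
        # Stop if the chain is broken or loops back on itself.
        if current in seen or current not in by_subject:
            return None
        seen.add(current)
        cert = by_subject[current]
        chain.append(cert)
        current = cert.issuer
    return chain


certs = [
    Certificate("University of Bath", "CVCP"),
    Certificate("Research Office", "University of Bath"),
    Certificate("Unknown Press", "Nobody"),
]
trusted = chain_to_root("Research Office", certs, "CVCP")
untrusted = chain_to_root("Unknown Press", certs, "CVCP")
print([c.issuer for c in trusted], untrusted)
```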
12
Developments for Gateways
  • Quality information gateways
  • Can make use of signed resources to help
    cataloguing
  • Can provide input to sophisticated search engines
    (similar to Google)

Signed gateway: this gateway follows xx quality
conventions
A central organisation could give certificates to
approved information gateways
13
Conclusions
  • Automated Indexing
  • AltaVista approach
  • Comprehensive
  • Junk indexed
  • Too many hits
  • Manual Indexing
  • Subject Gateway approach
  • Quality
  • Value-added services
  • Incomplete
  • Expensive
  • A Third Way
  • Combination of automated and manual approaches
  • Involvement from SBIG (subject-based information
    gateway), author and end user
  • Exciting possibilities
  • Uncertainty of timescales and success
  • Coordination required - political issues
    (ownership of metadata, selling ads, etc.)
