SLA Annual Meeting - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: SLA Annual Meeting


1
SLA Annual Meeting, June 8th, 2004
JR Gardner, Ph.D., Sun Microsystems Inc.
2
Agenda
  • Semantic Web Primer: a big pair of shoes to
    fill...
  • A bug in my bonnet
  • Search for the solution
  • Sharing the solution
  • Securing the solution
  • Q & A

3
Semantic Web
  • Ontologies (to describe the data with varying
    degrees of formality)
  • Classifiers/reasoners (to infer new relationships
    from existing assertions)
  • 'annotation' -- this is a process whereby the
    content of a document is annotated with a concept
    (or relationship) from an ontology. Also known as
    'tagging'. (The annotation doesn't have to be in
    the original document.)
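
A minimal sketch of how these three pieces fit together, using the insect sense of "beetle" from the later slides. The tiny ontology (including the "animal" parent), the `infer_is_a` helper, and the document ID are illustrative assumptions, not part of any Semantic Web toolkit.

```python
# Hypothetical mini-ontology: explicit "is a" assertions (child -> parent).
IS_A = {
    "Coleoptera": "insect",
    "insect": "animal",
}


def infer_is_a(term, assertions):
    """Return every ancestor of `term` by following "is a" links transitively."""
    ancestors = []
    seen = {term}
    while term in assertions and assertions[term] not in seen:
        term = assertions[term]
        ancestors.append(term)
        seen.add(term)
    return ancestors


# Standoff annotation: the tag is stored outside the original document.
annotations = [
    {"document": "doc-001", "text": "beetle", "concept": "Coleoptera"},
]

for note in annotations:
    inferred = infer_is_a(note["concept"], IS_A)
    print(note["text"], "->", note["concept"], "->", inferred)
```

The reasoner here only follows one relation transitively; real classifiers handle far richer assertions, but the shape (explicit facts in, inferred facts out) is the same.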

4
Semantic Web, cont'd.
The goal of the Semantic Web is to develop
enabling standards and technologies designed to
help machines understand more information on the
Web so that they can support richer discovery,
data integration, navigation, and automation of
tasks. With Semantic Web we not only receive more
exact results when searching for information, but
also know when we can integrate information from
different sources, know what information to
compare, and can provide all kinds of automated
services in different domains, from future home
and digital libraries to electronic business and
health services (Berners-Lee 2001).
5
The Vision of how it works
6
An information sciences take
Policies, Professionalism
Z39.50, Dublin Core, OPACs
Software Engineering
7
What bugs me
  • Promises . . . but ...
  • No deployable solutions
  • Context-specific solutions
  • Black box solutions
  • Ontology Holy Grails
  • Semantic schemantics
  • Too many flavors and brands of everything

8
Epistemology for Etymology with Entomology-the
Business of Beetle Bugs fear and loathing
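[Diagram: the terms "Beetle" and "Bug" are connected
by "might mean", "Also called", and "Is a" links
across four layers labeled Meanings, Alias/AKA,
Names, and Concepts. Node labels: surveillance,
error/fault, creepy-crawler, spy, feature, VW,
Coleoptera, SunRay, security, insect, car, product.
One link is marked "bad ambiguity".]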
9
Ontology is not enough (or can be too much)
AIAG: Automotive Industry Action Group
STAR: Standards for Technology in Automotive Retail
swoRDFish: Sun RDF product/service ontology
[Diagram: four layers labeled Meanings, Alias/AKA,
Names, and Concepts. Node labels: surveillance,
error/fault, creepy-crawler, spy, feature,
Coleoptera, VW, SunRay, security, insect, car,
product.]
10
Ummmmm......
Semantic Web
Babel?

11
What bugs the system-
  • Promises . . . but ...
  • No deployable solutions (cost, support)
  • Context-specific solutions (my sandbox or the
    world?)
  • Black box solutions (can I change it, grow with
    it, enhance, integrate, expand?)
  • Ontology Holy Grails (what of plethora?)
  • Semantic schemantics (format freakout)
  • Too many flavors and brands of everything

12
Ontology solutions can limit, not solve ...
AIAG: Automotive Industry Action Group
STAR: Standards for Technology in Automotive Retail
swoRDFish: Sun RDF product/service ontology
[Diagram: four layers labeled Meanings, Alias/AKA,
Names, and Concepts. Node labels: surveillance,
error/fault, creepy-crawler, spy, feature,
Coleoptera, VW, SunRay, product, security, insect,
car.]
13
What if we start with a search engine?
  • It should possess ability to ingest ontologies
  • Take advantage of ontological rigor yet employ
    soft reasoning/brain-type associations
  • Needs contextual search capabilities with passage
    retrieval so I know what I'm getting
  • Integrated ID management: context is everything
  • Ease of integration with knowledge tools
  • Ease of deployment (no clients required)
  • Mature and well-supported platform

14
Well, search is fine, but libraries have always
had this ... what of MARC? Z39.50?
  • Information Sciences, especially the special
    libraries, are on the brink of a MARC/RLIN-like
    revolution
  • Z39.50 is no longer confined to ASN.1: XML and
    Web services are brought to bear
  • The best of MARC (in MarcXml), Dublin Core, and
    ISO 23950
  • Simplified, smaller scale, smaller footprint:
    less hardware, less software
  • New possibilities as global information hub
  • OPACs on steroids

15
SRW/SRU: new Info-sci Discovery-speak
  • SRW: Search/Retrieve Web Service
  • A note on Web Services
  • Lighter, nimbler, more open integration
  • SRU: Search/Retrieve Using URLs
  • Even lighter, palm-sized, subscribable
  • Z39.50 concepts intact
  • Result Sets
  • Abstract Access points
  • Abstract Record schemas
  • Explain (simplified!!)
  • Diagnostics
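
To illustrate how light an SRU request is, the sketch below builds one as a plain URL. The parameter names (`operation`, `version`, `query`, `maximumRecords`, `recordSchema`) follow SRU 1.1; the endpoint, the CQL query, and the `build_sru_url` helper are assumptions for illustration, and the schema short name accepted varies by server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sru_url(base_url, cql_query, max_records=10, record_schema="marcxml"):
    """Build an SRU searchRetrieve request as a plain URL."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,                  # CQL query string
        "maximumRecords": str(max_records),
        "recordSchema": record_schema,       # assumed short name; server-specific
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint; substitute a real SRU server's base URL.
url = build_sru_url("http://sru.example.org/catalog", 'dc.title = "beetle"')
print(url)

# Round-trip check: the query survives URL encoding intact.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Because the whole request is a URL, it can be bookmarked, shared, or issued from any HTTP client without a SOAP toolkit.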

16
Look familiar?

01173cam 2200289 a 4500
1668735
19981110163141.5
tag"008"980227s1998 njua b 001 0
eng " ind2" " (DLC)
98004504 tag"906" ind1" " ind2" " code"a"7 code"b"cbu code"c"orignew code"d"1 code"e"ocip code"f"19 code"g"y-gencatlg
... and so on, remainder snipped. This is SRW.
SRU is even slimmer, removing the SOAP (Web Service)
wrapper and presenting a single record.
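
A short sketch of reading such a record once it arrives as MarcXml. The namespace is the standard MARC 21 "slim" schema namespace; the sample wraps the leader and the first two control values from the slide in tags, and that mapping of values to fields 001 and 005 is an assumption, since the transcript lost the original markup. The `control_fields` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Assumed reconstruction of the start of the record shown on the slide.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>01173cam 2200289 a 4500</leader>
  <controlfield tag="001">1668735</controlfield>
  <controlfield tag="005">19981110163141.5</controlfield>
</record>"""


def control_fields(marcxml):
    """Return (leader, {tag: value}) for the control fields of one record."""
    record = ET.fromstring(marcxml)
    leader = record.findtext("marc:leader", namespaces=MARC_NS)
    fields = {
        cf.get("tag"): cf.text
        for cf in record.findall("marc:controlfield", MARC_NS)
    }
    return leader, fields


leader, fields = control_fields(SAMPLE)
print(leader)
print(fields["001"], fields["005"])
```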
17
The next phase of the revolution......
Then: Z39.50, MARC & EDI changed information and
economic commerce
Now: SRW/U, WS, inference search with the Semantic
Web

18
Back to search, and about NOVA
NOVA is a language-independent, semantic search
engine that can be trained on any ontology to
yield custom classification of any data form.
Based on 9 years of research in SunLabs, it
employs a patented, unique passage retrieval and
semantic analysis algorithm for superior
inference with automatic stemming and Natural
Language Processing procedures to infer matches
on context and proximity. Cf. the April 2004 ACM
Queue cover article on NOVA, "Searching versus
Finding: Why Systems Need Knowledge to Find What
You Really Want".
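
The slides do not describe NOVA's patented algorithm, so the following is only a generic sketch of the idea of passage retrieval with stemming: split a document into sentence-sized passages and rank them by how many distinct query stems each contains. The `stem` and `rank_passages` helpers and the sample text are assumptions for illustration, and the suffix stripping is a crude stand-in for real stemming.

```python
import re


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def rank_passages(text, query):
    """Return (score, passage) pairs, best first; score counts matched query stems."""
    wanted = {stem(q) for q in query.split()}
    scored = []
    for passage in re.split(r"(?<=[.!?])\s+", text.strip()):
        stems = {stem(w) for w in re.findall(r"[A-Za-z']+", passage)}
        score = len(wanted & stems)
        if score:
            scored.append((score, passage))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


doc = ("The beetle is an insect of the order Coleoptera. "
       "Engineers spent the week fixing bugs in the search engine.")
for score, passage in rank_passages(doc, "bug fixes"):
    print(score, passage)
```

Returning the passage rather than the whole document is what lets a searcher see why a hit matched before opening it.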
19
NOVA's Semantic Inferencing/Clustering
NOVA enables phrasing and predicate matching for
language structures. These variants can, in
turn, be browsed as a component of a given search
result set, with clustering.
20
NOVA's Semantic Inferencing
Results can then be browsed in context, with each
semantic node revealing relative number of hits.
Upon selection of the occurrences, a "jump to
passage" feature is dynamically inserted in the
target enabling quick review and assessment of a
given result set.
21
RDF and NOVA Integration
RDF Parse
S1 Portal Search Robot
Robot uses in crawl
1. Updates taxonomy dynamically
2. Robot metadata auto-classification Rules
Sun Portal Server NOVA
NOVA Rules stored as RDFS
22
Taxonomy Introspection & Mapping
XSLT conversion to Search Database Records (RDs),
include UIDs
23
NCI Oncology
24
(No Transcript)
25
Integrated, Searchable Weighted Annotation
26
Browse Annotation weighting and Subscription,
Access-controlled
After indexing, searches use the GO/NCI links
27
1. Aggregative loading of taxonomies (
creation, RDF for mgmt)
2. Iterative inferencing builds core associative
knowledge base, export to RDF compilable
3. Apply across applications, taxonomy systems
for a VIB ontology, with access management and
interest subscription
28
Identity-Enabled Secure Knowledge Network
Information in various representations
CAD Applications
DSAtlas Data & Info. Access
Ontologies, Dictionaries
OPACs
Identity Attributes Functions
29
Applying SemWeb Tools
30
Annotation
  • Definition: the ability to attach a comment with
    weighting to a specific item and to share it with
    other users
  • Annotation is done through search result sets;
    other search tools' data, federated with NOVA,
    can be annotated, subscribed to, weighted, and
    access-managed
  • Tools are supported out-of-the-box with Java
    Enterprise System and NOVA search
  • APIs in C and Java enable easy federation of
    diverse data sources, including web-serviced
    JDBC calls
  • Annotation, combined with browsing, can allow
    subscription to different and merged taxonomies
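
One way to picture the definition above is as a small record type: a comment, a weight, and a list of people it is shared with. This is a hypothetical data model in Python, not the NOVA or Java Enterprise System API; every name in it is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    item_id: str      # the search result or document being annotated
    author: str
    comment: str
    weight: float     # relative importance assigned by the author
    shared_with: set = field(default_factory=set)  # users allowed to see it


def visible_annotations(annotations, user):
    """Annotations the user wrote or was given access to, heaviest first."""
    allowed = [a for a in annotations if user == a.author or user in a.shared_with]
    return sorted(allowed, key=lambda a: a.weight, reverse=True)


notes = [
    Annotation("doc-17", "alice", "Covers the insect sense of 'bug'", 0.4, {"bob"}),
    Annotation("doc-17", "alice", "Draft, not ready to share", 0.9),
    Annotation("doc-42", "bob", "Best overview of SRW", 0.8, {"alice"}),
]
for note in visible_annotations(notes, "bob"):
    print(note.item_id, note.weight, note.comment)
```

Keeping the access list on the annotation itself is the simplest choice for a sketch; a deployed system would more likely delegate that check to an identity service.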

31
(No Transcript)
32
Dynamic Meta-markup in NOVA using KAIN or COHSE
Based on COHSE, KAIN uses a proxy and an ontology
to auto-mark up pages with non-intrusive link-lets
(little orange icons inline with text), flagging
terms in a given ontology (DAML+OIL, OWL, RDF,
etc.). Rollover highlights the matched term, and
the original site layout is untouched.
33
What is KAIN?
KAIN is Sun's tool for Knowledge Assisted
Intranet Navigation, based on the COHSE (Conceptual
Open Hypermedia) project from the University of
Manchester. The tool provides users with an
enhanced web navigation and browsing experience.
Web pages are automatically marked up (virtually
only) to link users to all material relevant to
the context of what was queried. KAIN has not
been productized yet.
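
The slides do not show KAIN's implementation, so this is only a rough sketch of the general idea of ontology-driven markup: scan the page text for terms that appear in an ontology and append a small marker after each, leaving the rest of the text alone. The term table, the marker format, and the `mark_up` helper are all hypothetical.

```python
import re

# Hypothetical ontology fragment: surface term -> concept it denotes.
ONTOLOGY_TERMS = {"beetle": "insect", "SunRay": "product"}


def mark_up(text, terms):
    """Append a bracketed link marker after each ontology term found in the text."""
    # Longest terms first so a longer term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {t.lower(): c for t, c in terms.items()}
    return pattern.sub(
        lambda m: f"{m.group(0)}[->{lookup[m.group(0).lower()]}]", text
    )


print(mark_up("The Beetle ran on a SunRay.", ONTOLOGY_TERMS))
# prints: The Beetle[->insect] ran on a SunRay[->product].
```

A proxy-based tool would apply a step like this to the page on its way to the browser, which is why the original site needs no changes.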
34
Integrated Metadata with any Query or URI
accept
edit
35
Additional info on tools
  • NOVA search, together with your distributed and
    managed content, leverages taxonomies and browse
    for annotation, group/individual description,
    aggregation (e.g., NOVA search and management of
    annotations themselves)
  • Working with RDF via IFC-Host Java Libraries
    enables native RDF integration with NOVA which
    federates services and other data/search sources
  • Extend beyond the immediate collaborative
    consortia via Z39.50-based SRW/SRU (LOC ifx now
    available)
  • Harvest taxonomies and infer HourGlass view of
    new collaboration areas
  • StarOffice and OpenOffice render all content in
    XML, presentations in SVG, export XHTML and
    inline XSLT conversions with WebDAV support

36
Re-Cap
  • Immediate Results on Semantic Web Scale
  • NOVA's out-of-the-box ability to ingest
    ontologies
  • Taxonomy introspection and mapping ability to
    establish relationships amongst taxonomies
  • Immediate access available to authorized
    personnel (internal & external)
  • NOVA provides semantic inferencing
  • Ease of integration with knowledge tools
  • All components are interoperable, addressable by
    multiple APIs, with WSDL, RSS, XML/XSLT, all of
    which in turn are identity/auth managed
  • Ability to manage who sees what, and to target
    things to whoever most wants to see them

37
Contact for Resources
John.Robert.Gardner_at_sun.com
38
ebXML
  • ebXML Registry/Repository (sourceforge.net) from
    Sun manages services, visual taxonomy browsing
  • Used internally at Sun for managing the corporate
    Web Services Repository
  • Enables services (e.g., database reports, etc.)
    to be managed according to a variety of
    taxonomies
  • Provides graphical browsable display of
    taxonomies
  • Part of forthcoming Sun product line

39
Taxonomy/WS/KM Management with ebXML R/R