Title: Using a Semantic Wiki as a Knowledge Source for Rich Modeling and Question Answering
1Using a Semantic Wiki as a Knowledge Source for
Rich Modeling and Question Answering
- Vinay K. Chaudhri1, Mark Greaves2, Daniel Hansch3, Anthony Jameson4, Frederik Pfisterer3, Aaron Spaulding1, and Moritz Weiten3
- 1. SRI International, Menlo Park, CA, USA
- 2. Vulcan Inc, Seattle, WA
- 3. ontoprise GmbH
- 4. DFKI
2Symbiosis between KE and SW
- Knowledge Engineering
- Expressive knowledge representation
- Sophisticated testing and debugging
- High training requirement
- Semantic Web
- Simpler knowledge representation
- Works by creating links and references
- Almost walk-up-and-use
3Symbiosis between KE and SW
- AURA
- Acquires knowledge for deductive Q/A that can be used for answering AP questions in the sciences
- Uses a DL-style class taxonomy and logic-programming-style rules with many extensions
- Requires 40 hours of training for knowledge formulation
- Semantic Media Wiki
- Tool for online authoring of Semantic Web content
- Captures knowledge at the level of RDFS
- Almost a walk-up-and-use system
4Symbiosis between KE and SW
- Can we use the Semantic Media Wiki to capture knowledge that could be used for Q/A in AURA?
- Factual knowledge
- The atomic number for hydrogen is 1
- The solubility constants
- Taxonomic knowledge
- Eukaryotic and Prokaryotic are two types of cells
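The factual and taxonomic examples above can be pictured as subject-property-value triples, which is roughly the RDFS-level expressiveness the slides attribute to the wiki. The following is a minimal sketch, not the wiki's actual storage format: the property names `atomic number` and `subclass of`, and the helpers `values` and `subclasses_of`, are hypothetical and chosen only for illustration.

```python
# Illustrative triples; property and class names are assumptions, not taken from the wiki.
triples = [
    ("Hydrogen", "atomic number", 1),            # factual knowledge
    ("Eukaryotic cell", "subclass of", "Cell"),  # taxonomic knowledge
    ("Prokaryotic cell", "subclass of", "Cell"),
]


def values(subject, prop):
    """Return all values stored for a subject/property pair."""
    return [v for s, p, v in triples if s == subject and p == prop]


def subclasses_of(cls):
    """Return the direct subclasses recorded for a class, sorted by name."""
    return sorted(s for s, p, v in triples if p == "subclass of" and v == cls)


print(values("Hydrogen", "atomic number"))
print(subclasses_of("Cell"))
```

Keeping both kinds of knowledge in one flat triple list reflects how simple the wiki-side representation is compared with the rule-based representation used in AURA.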
5Symbiosis between SW and KE: Payoffs
- We can make use of contributions from users with much less training than the 40 hours needed for AURA
- Knowledge creation is faster, distributed, and cheaper
- Manage the evolution of concepts and link types in the Semantic Wiki
- Collaborative, consensus-building ontology tool
6Outline
- The AURA System
- Media Wiki
- AURA/Media Wiki Symbiosis
8Introduction
- Long-term goal: build a Digital Aristotle, an application that can answer questions on a variety of topics and provide user- and domain-appropriate explanations
- Mid-term goal: pass Advanced Placement tests in Chemistry, Biology, and Physics
- Short-term goal: enable scientists to author knowledge and high school students to ask questions
- Limited to 50 pages from each domain
9AURA
- Automated User-Centered Reasoning and Acquisition System
- AURA is a tool to help users formalize knowledge
- AURA can then reason with that knowledge
- So users can ask questions and understand the answers
10AURA Design
- Extensive domain analysis identified four classes of frequently occurring textbook knowledge
- Conceptual knowledge
- Equations
- Tables
- Diagrams
- User surveys revealed three classes of user requirements
- Blank slate problem
- Support for full life cycle
- Training and usability
11Textbook Knowledge Types
$2\,\mathrm{Ca}(s) + \mathrm{O_2}(g) \rightarrow 2\,\mathrm{CaO}(s)$
$h = \tfrac{1}{2} g t^2$
Conceptual
Equations
Diagrams
Tables
12AURA Desktop
13Document Rooted Interface
14Knowledge Capture: Conceptual Knowledge
15Knowledge Capture: Conceptual Knowledge
- Based on prior work with Shaken and CLIB
- Clark et al., KCAP 2001
- Chaudhri et al., EKAW 2003
- Barker et al., KCAP 2001
(forall ?c
  (=> (instance-of ?c Eucaryotic-Cell)
      (exists ?x ?y ?z
        (and
          (instance-of ?x Nucleus)
          (instance-of ?y Chromosome)
          (instance-of ?z Plasma-Membrane)
          (has-part ?c ?x) (has-part ?c ?y)
          (has-part ?c ?z) (is-inside ?y ?x)))))
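Read as a rule, the axiom says that every Eucaryotic-Cell has a Nucleus, a Chromosome, and a Plasma-Membrane as parts, with the chromosome inside the nucleus. The sketch below checks that condition against a small set of ground facts under a closed-world reading. It is only an illustration and is not how AURA itself reasons: the individuals (`cell1`, `n1`, `ch1`, `pm1`) and the functions `parts_of_type` and `satisfies_axiom` are hypothetical.

```python
from itertools import product

# Hypothetical ground facts about one cell; names are illustrative only.
instance_of = {
    "cell1": "Eucaryotic-Cell",
    "n1": "Nucleus",
    "ch1": "Chromosome",
    "pm1": "Plasma-Membrane",
}
has_part = {("cell1", "n1"), ("cell1", "ch1"), ("cell1", "pm1")}
is_inside = {("ch1", "n1")}


def parts_of_type(cell, cls):
    """Parts of `cell` that are recorded as instances of `cls`."""
    return [p for (c, p) in has_part if c == cell and instance_of.get(p) == cls]


def satisfies_axiom(cell):
    """True if `cell` is not a Eucaryotic-Cell, or if it has the required parts."""
    if instance_of.get(cell) != "Eucaryotic-Cell":
        return True  # the implication holds vacuously
    nuclei = parts_of_type(cell, "Nucleus")
    chromosomes = parts_of_type(cell, "Chromosome")
    membranes = parts_of_type(cell, "Plasma-Membrane")
    # Need some nucleus x and chromosome y with y inside x, plus a membrane z.
    return bool(membranes) and any(
        (y, x) in is_inside for x, y in product(nuclei, chromosomes)
    )


print(satisfies_axiom("cell1"))
```

A deductive system would instead use the axiom to infer that such parts exist for any eukaryotic cell; the check here only illustrates what the quantifiers and conjuncts require.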
16Knowledge Capture: Equations
17Knowledge Capture: Equations
18Knowledge Capture: Tables
19Question Formulation: Controlled English
An alien measures the height of a cliff by dropping a boulder from rest and measuring the time it takes to hit the ground below. The boulder fell for 23 seconds on a planet with an acceleration of gravity of 7.9 m/s^2. Assuming constant acceleration and ignoring air resistance, how high was the cliff?
↓
A boulder is dropped. The initial speed of the boulder is 0 m/s. The duration of the drop is 23 seconds. The acceleration of the drop is 7.9 m/s^2. What is the distance of the drop?
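The controlled-English version supplies exactly the quantities needed for the standard constant-acceleration relation, distance = v0*t + a*t^2/2. As a worked check of the expected answer (a sketch only, not AURA's equation solver, and with a hypothetical function name `drop_distance`):

```python
def drop_distance(initial_speed, acceleration, duration):
    """Distance covered under constant acceleration: v0*t + a*t^2/2."""
    return initial_speed * duration + 0.5 * acceleration * duration ** 2


# Values taken from the question: 0 m/s, 7.9 m/s^2, 23 s.
distance = drop_distance(initial_speed=0.0, acceleration=7.9, duration=23.0)
print(f"{distance:.2f} m")
```

With these inputs the cliff height comes out to about 2090 m (0.5 × 7.9 × 23²).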
20Question Formulation Interface
21Explanation Interface
22Efficacy of knowledge in answering questions
Domain     Number of Questions  Correct (2006)  Correct (2007)
Biology    146                  38              38
Chemistry  86                   38              70
Physics    131                  19              71
Knowledge Formulation: 6 domain experts, 2 per domain, 40 hours of AURA training, 80 hours of knowledge formulation
Question Formulation: 6 new domain experts, 2 per domain, 6 hours of AURA training, 16 hours of question asking
23Outline
- The AURA System
- Media Wiki
- AURA/Media Wiki Symbiosis
24Wikipedia Article on Organelles
25Source Text of That Article
26Fact Box Summarizing the Annotations
27The Query Interface
28Auto-completion in the Query Interface
29Table Showing the Result of a Query
30The Source Text of the Result Table
31Adding Annotations
32Ontology Browser
33Open Source Availability
http://sourceforge.net/projects/halo-extension/
34Outline
- The AURA System
- Media Wiki
- AURA/Media Wiki Symbiosis
35Technical Design Issues
- The knowledge may not be clean
- Use a Wiki Gardener who can serve as an editor of the online contributions
- Set up negative feedback loops that encourage and help users to correct problems as they stumble upon them
- Empty row for Centrosome in table → Visit Centrosome page and add or correct annotations
- (Other negative feedback loops not shown in the slides)
- Vocabulary mismatches between AURA and Wiki
- Use a mapping tool to relate the two vocabularies
- Prime Wiki with the AURA vocabulary
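The Centrosome example and the vocabulary-mismatch issue can both be framed as simple checks over a query result. The sketch below is an assumption-laden illustration, not part of the Halo extension: the `rows` data, the property names, the `wiki_to_aura` mapping, and both helper functions are hypothetical.

```python
# Hypothetical query result: one row per organelle page, None where no annotation exists.
rows = {
    "Nucleus": {"function": "stores genetic material", "part of": "Eukaryotic cell"},
    "Centrosome": {"function": None, "part of": None},
}


def pages_needing_annotation(result):
    """Pages whose row is entirely empty, i.e. candidates for a visit-and-fix prompt."""
    return sorted(
        page
        for page, cells in result.items()
        if all(value is None for value in cells.values())
    )


# Hypothetical mapping from wiki property names to AURA-style relation names.
wiki_to_aura = {"part of": "is-part-of"}


def unmapped_properties(result, mapping):
    """Wiki properties that have no counterpart in the mapping yet."""
    used = {prop for cells in result.values() for prop in cells}
    return sorted(used - set(mapping))


print(pages_needing_annotation(rows))
print(unmapped_properties(rows, wiki_to_aura))
```

Surfacing the empty row next to a link to the page is what turns a visible gap into a prompt for the reader to fix it; the unmapped-property list plays the same role for whoever maintains the vocabulary mapping.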
36Example Use Case
- AURA knowledge formulation engineer (KFE) searches for knowledge during knowledge formulation
- The KFE notices useful information in the Wiki
- The KFE maps the knowledge into AURA
- The knowledge is translated into AURA and is available for querying
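The map-then-translate steps of this use case can be sketched as a small function that rewrites wiki triples into relation expressions once the KFE has supplied mappings. Everything here is hypothetical (the annotations, `property_map`, `concept_map`, and `translate`), and the output format only imitates the style of the logic shown earlier in the slides; the real translation into AURA is not described in enough detail to reproduce.

```python
# Hypothetical wiki annotations as (page, property, value) triples.
annotations = [
    ("Nucleus", "part of", "Eukaryotic cell"),
    ("Nucleus", "see also", "Nucleolus"),
]

# Hypothetical mappings chosen by the knowledge formulation engineer (KFE).
property_map = {"part of": "is-part-of"}
concept_map = {"Nucleus": "Nucleus", "Eukaryotic cell": "Eucaryotic-Cell"}


def translate(triples):
    """Turn fully mapped triples into relation strings; return the rest as skipped."""
    translated, skipped = [], []
    for subject, prop, value in triples:
        if prop in property_map and subject in concept_map and value in concept_map:
            translated.append(
                f"({property_map[prop]} {concept_map[subject]} {concept_map[value]})"
            )
        else:
            skipped.append((subject, prop, value))
    return translated, skipped


assertions, leftovers = translate(annotations)
print(assertions)
print(leftovers)
```

Returning the skipped triples separately keeps unmapped annotations visible to the KFE rather than silently dropping them. Note that a relation stated between two concepts is simpler than a quantified rule over their instances, so a real translation would need to bridge that gap.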
37AURA User Searches for Information
38AURA User Notices Useful Information
39AURA User Maps Knowledge into AURA
40Knowledge Is Available for Q/A
41Status and Challenges
- Giving guidance on what to annotate
- Identify missing information by queries
- Conceptual KE tasks: partonomy
- Suggest annotations
- Representation mismatches
- Concept vs. Individual
- A large-scale evaluation is planned for Fall 2008