Title: Knowledge Systems and Project Halo
1Knowledge Systems and Project Halo
In collaboration with SRI (Vinay Chaudhri) and
Boeing (Peter Clark)
2Knowledge Systems
- Knowledge Systems are formal representations of knowledge capable of answering unanticipated questions with coherent explanations
- Knowledge System: KB, Q/A, Explanation Generator, Knowledge Acq. tools
3Project Halo
- Funded and administered by Vulcan, Inc., a Paul Allen company
- Objective: to assess the state of the art of knowledge systems, computer programs that know a lot and answer tough questions with coherent explanations
- Method: administer an AP Chemistry exam to knowledge systems built by 4 teams of researchers
4A Significant Advance over Expert Systems
- Coverage
- Reasoning
- Explanation
- Rapid construction
5KM: A Logic Programming Language
- able to represent
- classes, instances, prototypes
- defaults, fluents, constraints
- (hypothetical) situations
- actions (pre-, post-, and during- conditions)
- and reason about
- inheritance with exceptions
- deductive and abductive inference (with constraints)
- automatic classification (given a partial description of an instance, determine the classes to which it belongs)
- temporal projection (my car is where I left it)
- effects of actions
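As an illustration of automatic classification, here is a minimal Python sketch. It is not KM syntax and not the project's implementation. The class names, slot names, and the `classify` function are hypothetical, and the sketch assumes each class can be defined by a set of slot values that are jointly sufficient for membership.

```python
# Hypothetical class definitions: each class lists the slot values that are
# jointly sufficient for membership. These names are illustrative only.
DEFINITIONS = {
    "Solution": {"has_solvent": True},
    "Strong-Electrolyte-Solution": {
        "has_solvent": True,
        "solute_is_strong_electrolyte": True,
    },
}


def classify(instance):
    """Return every class whose defining conditions the instance satisfies."""
    return sorted(
        name
        for name, conditions in DEFINITIONS.items()
        if all(instance.get(slot) == value for slot, value in conditions.items())
    )


# A partial description: only two slots are known about this instance.
partial_description = {"has_solvent": True, "solute_is_strong_electrolyte": True}
print(classify(partial_description))
```

A real classifier would also have to handle class hierarchies and constraints. This sketch only shows the core idea of matching a partial description against class definitions.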
6A Simple Example
- When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3 the resulting concentration of Na+ is
- 2.0 M
- 2.4 M
- 4.0 M
- 4.5 M
- 7.0 M
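The arithmetic behind this question can be checked with a short Python sketch. It assumes that both salts dissociate completely (two Na+ per Na2CO3, one per NaHCO3) and that volumes are additive. The function names and the `SODIUM_PER_FORMULA_UNIT` table are illustrative, not part of the systems described here.

```python
from fractions import Fraction

# Na+ ions released per formula unit, assuming complete dissociation.
SODIUM_PER_FORMULA_UNIT = {"Na2CO3": 2, "NaHCO3": 1}


def moles(molarity, volume_ml):
    """Quantity of solute in moles: concentration times volume in liters."""
    return Fraction(molarity) * Fraction(volume_ml) / 1000


def sodium_concentration(solutions):
    """Molar concentration of Na+ after mixing (name, molarity, volume_ml) parts."""
    total_moles = sum(
        SODIUM_PER_FORMULA_UNIT[name] * moles(molarity, volume_ml)
        for name, molarity, volume_ml in solutions
    )
    total_liters = Fraction(sum(v for _, _, v in solutions), 1000)
    return total_moles / total_liters


mixture = [("Na2CO3", 3, 70), ("NaHCO3", 1, 30)]
print(float(sodium_concentration(mixture)))  # 4.5
```

Exact fractions avoid floating-point rounding: 0.42 mole plus 0.03 mole of Na+ in 0.1 liter gives 4.5 M.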
7Question Representation
8Background Knowledge
- Chemistry laws
- Concentration of a solute
- Composition of strong electrolyte solutions
- Conservation of mass
- Conservation of volume
- etc.
9Law 1 Concentration of a Solute
Note: when this law is applied, using Novak's code, the quantities are automatically converted to the units of measurement specified here
10Law 1 Quantity of a Solute
- Law 1 (on the previous slide) computed
- Concentration = quantity / volume
- Of course, a slight variant computes
- Quantity = concentration × volume
- Currently, we code this variant as a separate law (call it 1') because it has a slightly different explanation template
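The two directions of this law, together with the unit conversion mentioned on the previous slide, can be sketched in Python as follows. This is a simple stand-in rather than Novak's unit-conversion code: the `LITERS_PER_UNIT` table and both function names are assumptions made for illustration.

```python
# Hypothetical conversion table; the laws themselves work in moles and liters.
LITERS_PER_UNIT = {"liter": 1.0, "milliliter": 0.001}


def to_liters(value, unit):
    """Convert a volume to liters before a law is applied."""
    return value * LITERS_PER_UNIT[unit]


def concentration(quantity_mol, volume, volume_unit):
    """Law 1: concentration (molar) = quantity (mole) / volume (liter)."""
    return quantity_mol / to_liters(volume, volume_unit)


def quantity(concentration_molar, volume, volume_unit):
    """Variant of Law 1: quantity (mole) = concentration (molar) * volume (liter)."""
    return concentration_molar * to_liters(volume, volume_unit)


print(round(quantity(3.0, 70, "milliliter"), 6))        # 0.21
print(round(concentration(0.45, 100, "milliliter"), 6))  # 4.5
```

Keeping the two directions as separate functions mirrors the slide's point that each direction carries its own explanation template.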
11Law 2 Composition of Strong Electrolytes
12Law 3 Conservation of Mass
13Law 4 Conservation of Volume
14Step 1 Reclassify Terms
15Step 2 Use Law 1 to Compute Concentration
16The Search is non-deterministic
- Multiple laws might be used to compute a value for any property. For example, here's another way to compute concentration:
- pH = -log[H+], where [H+] is the concentration of H+
- Since this applies only to H+, this search path ends quickly
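A minimal Python sketch of this kind of search is shown below, under the assumption that each law is a function that returns a value when it applies and `None` when it does not. The law functions, the `facts` layout, and `compute_concentration` are hypothetical and do not reflect how the actual system encodes its laws.

```python
def law_from_ph(species, facts):
    """pH = -log10[H+], so [H+] = 10 ** (-pH). Applies only to H+."""
    if species != "H+" or "pH" not in facts:
        return None  # this search path ends immediately
    return 10 ** (-facts["pH"])


def law_quantity_over_volume(species, facts):
    """Concentration = quantity / volume, when both are known."""
    quantity_mol = facts.get("quantity_mol", {}).get(species)
    volume_l = facts.get("volume_l")
    if quantity_mol is None or volume_l is None:
        return None
    return quantity_mol / volume_l


# Candidate laws for the property "concentration", tried in order.
LAWS = [law_from_ph, law_quantity_over_volume]


def compute_concentration(species, facts):
    """Try each candidate law; return (law name, value) for the first that applies."""
    for law in LAWS:
        value = law(species, facts)
        if value is not None:
            return law.__name__, value
    return None


facts = {"quantity_mol": {"Na+": 0.45}, "volume_l": 0.1}
print(compute_concentration("Na+", facts))
```

For Na+ the pH law fails at once and the search falls through to the quantity-over-volume law. A full system would recurse to compute missing inputs instead of requiring them in `facts`.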
17Step 3 Use Law 4 to Compute Volume
18Step 4 Use Law 3 to Compute Quantity
19Step 5 Use Law 2 to Compute Quantity of Ionic
Parts
20Step 6 Use Law 1 to Compute Quantity
21Step 7 Wind out of Law 2 from step 5
22Step 8-10 Similar to steps 5-7
23Step 11 Wind out of Law 3 from Step 4
24Step 12 Wind out of Law 1 from Step 2
25Question 26 Answer
- When 70 ml of 3.0-Molar Na2CO3 is added to 30 ml of 1.0-Molar NaHCO3, what is the resulting concentration of Na+?
- The concentration of a chemical in a mixture is the quantity of the chemical divided by the volume of the mixture.
- By the Law of Conservation of Mass, the quantity of a chemical in a mixture is the sum of the quantities of that chemical in the parts of the mix.
- In the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution
- In the na-plus
- Multiply the concentration and the volume
- 3 molar × 70 milliliter = 0.21 mole.
- The quantity of na-plus in the na-plus is 0.42 mole.
- In the co3-2
- The quantity of na-plus in the co3-2 is 0 mole.
- Multiply the concentration and the volume
- 1 molar × 30 milliliter = 0.03 mole.
- In the na-plus
- The quantity of na-plus in the na-plus is 0.03 mole.
- In the hco3-
- The quantity of na-plus in the hco3- is 0 mole.
- The quantity of na-plus in the na2co3 strong-electrolyte-solution and the nahco3 strong-electrolyte-solution is 0.45 mole.
- Therefore, the quantity of na-plus is 0.45 mole.
26Results of Project Halo
- After a 4-month development effort, the knowledge systems were sequestered and given a test
- 165 novel questions: 50 multiple choice, 115 free-form response
- Questions translated from English to formal language by each team, then assessed for fidelity by an independent committee
- High likelihood of long-term follow-on
27Correctness
- The SRI team's correctness score corresponds to an AP score of 3, high enough for credit at UCSD, UIUC, and many other universities.
- We've predicted scoring 85 after a 3-month follow-on project.
28Explanation Quality
29Our Long Term Goal
- to enable distributed communities of domain experts to build knowledge systems in their area of expertise
- without direct help from knowledge engineers
- working with familiar concepts and without writing axioms
- with little more effort than writing technical papers
30Our Current Focus
- Insight: even domain-specific representations contain common abstractions
- Approach: we build a library consisting of
- a small hierarchy of reusable, composable, domain-independent knowledge units (components)
- a small vocabulary of relations to connect them
- then domain experts build representations by instantiating and composing these components
31Building a Representation Compositionally
[Figure: concept graph of Bioremediation. Labeled nodes: Soil, Oil, Fertilizer, Microbes, Bio-technologist, Rate, Amount, and a Script with the steps Get, Apply, Break Down, and Absorb connected by "then" links. Labeled relations: environment, contains, pollutant, remediator, product, agent, patient, absorbed, rate, amount, script.]
32An underlying abstraction...
[Figure: the Bioremediation concept graph, shown with an underlying Conversion abstraction: a Conversion has raw-materials and a product (each a Substance with an Amount) and a Rate.]
33Another abstraction...
[Figure: the Bioremediation concept graph, shown with a Digest abstraction: a Digest has food (a Substance), an eater (an Agent), and a Script in which Break Down is followed by Absorb.]
34Another abstraction...
[Figure: the Bioremediation concept graph, shown with a Treatment abstraction: a Treatment has a substance, a patient, and a Script in which Get is followed by Apply.]
35Examples of Concepts Described Compositionally
- a Fuel-Cell is a Producer of Electricity
- a Bulb is an Electrical Resistor that Produces Light
- a Camera is an Image Recording Device
- a Wire is a Conduit of Electricity
36A Library of Components
small
- easy to learn
- easy to use
- broad semantic distinctions (easy to choose)
- allows detailed pre-engineering
37Library Contents
- actions: things that happen, change states
- Enter, Copy, Replace, Transfer, etc.
- states: relatively temporally stable events
- Be-Closed, Be-Attached-To, Be-Confined, etc.
- entities: things that are
- Substance, Place, Object, etc.
- roles: things that are, but only in the context of things that happen
- Container, Catalyst, Barrier, Vehicle, etc.
38Library Contents
- relations: between events, entities, roles
- agent, donor, object, recipient, result, etc.
- content, part, material, possession, etc.
- causes, defeats, enables, prevents, etc.
- purpose, plays, etc.
- properties: between events/entities and values
- rate, frequency, intensity, direction, etc.
- size, color, integrity, shape, etc.
39Computational Semantics
- Knowledge about Enter
- instances of Enter inherit axioms from Move, such as: the action changes the location of the object of the Move
- before the Enter, the object is outside some enclosure
- after the Enter, the object is inside that enclosure and contained by it
- during the Enter, the object passes through a portal of the enclosure
- if the portal has a covering, it must be open, and unless it is known to be closed, assume that it's open
- etc.
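The before, during, and after conditions of Enter, including the default about the portal covering, can be sketched in Python. This is an illustrative model only, not the library's axioms: the `Portal` and `Enclosure` classes and the `enter` function are hypothetical names, and "not known" is modeled as `None`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Portal:
    has_covering: bool = False
    covering_open: Optional[bool] = None  # None means "not known"

    def passable(self):
        # Default rule: a covering is assumed open unless known to be closed.
        return (not self.has_covering) or (self.covering_open is not False)


@dataclass
class Enclosure:
    portal: Portal
    contents: set = field(default_factory=set)


def enter(obj, enclosure):
    """Apply Enter if its conditions hold; return True when the action succeeds."""
    if obj in enclosure.contents:        # before: the object must be outside
        return False
    if not enclosure.portal.passable():  # during: the object passes through the portal
        return False
    enclosure.contents.add(obj)          # after: the object is inside and contained
    return True


room = Enclosure(Portal(has_covering=True))  # covering state unknown
print(enter("robot", room))
```

With the covering state unknown, the default applies and the Enter succeeds. Setting `covering_open=False` blocks it.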
40Searching the Library
- browsing the hierarchy top-down
- WordNet-based search
- all components have hooks to WordNet
- climb the WordNet hypernym tree with search terms
- assemble → Attach, Come-Together
- mend → Repair
- infiltrate → Enter, Traverse, Penetrate, Move-Into
- gum-up → Block, Obstruct
- busted → Be-Broken, Be-Ruined
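The hypernym climb can be sketched in Python with a toy taxonomy. Both tables below are invented for illustration and are not real WordNet data or the library's actual hooks. The sketch also assumes that every hooked term met on the way up contributes its components, which the slides do not specify.

```python
# Toy hypernym links (term -> more general term). Invented, not WordNet data.
HYPERNYMS = {
    "infiltrate": "penetrate",
    "penetrate": "enter",
    "mend": "repair",
    "gum-up": "block",
}

# Hypothetical hooks from general terms to library components.
COMPONENT_HOOKS = {
    "penetrate": ["Penetrate"],
    "enter": ["Enter", "Move-Into"],
    "repair": ["Repair"],
    "block": ["Block", "Obstruct"],
}


def find_components(term):
    """Climb the hypernym chain from term, collecting hooked components."""
    found = []
    seen = set()
    while term is not None and term not in seen:
        seen.add(term)  # guards against cycles in the toy data
        found.extend(COMPONENT_HOOKS.get(term, []))
        term = HYPERNYMS.get(term)
    return found


print(find_components("infiltrate"))
```

A search term with no hooked ancestor returns an empty list, which would send the user back to browsing the hierarchy top-down.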
41First Challenge Problem
- To enable biologists to encode college-level textbook knowledge about cells
- A small example: mRNA-Transport
- mRNA is transported out of the cell nucleus into the cytoplasm
- Transport Move-Out-Of
47unify
49Evaluation
- Can domain experts learn to use the library to encode domain knowledge?
- Can sophisticated knowledge be captured through composition of components?
50Methodology
- train biologists (4 graduate students) for six days
- have them encode knowledge from a college textbook, Essential Cell Biology by Bruce Alberts
- supply end-of-the-chapter-style Biology questions
- have the biologists pose the questions to their knowledge bases and record the answers
- have another biologist evaluate the answers on a scale of 0-3
- qualitatively evaluate their KBs
51Some Example Questions
- What nucleotide base pairs with adenine in RNA?
- How is uracil in RNA like thymine in DNA?
- What is the relationship between thymine and uracil?
- For a given bacterial gene, how are bacterial RNA and DNA molecules different?
- Describe RNA as a kind of polymer.
- What are the four bases/nucleotides of RNA?
- What is the relationship between a DNA gene and
its RNA transcription product?
52Evaluation Productivity
53Evaluation Question Answering
54Summary
- Knowledge Systems offer significant benefits compared with expert systems
- Multi-functional knowledge bases can be built
- by domain experts, almost
- and they will be, with or without sound principles of ontological engineering
- and ontologists can significantly improve the results
55Discussion
- Will the idiosyncrasies of specific domains overshadow the commonalities coded in the component library?
- How can NLP be used to pull information from text to build knowledge systems?
- How can knowledge acquisition systems use machine learning?