Title: Interpreting Loosely Encoded Questions
1 Interpreting Loosely Encoded Questions
- James Fan and Bruce Porter
- University of Texas at Austin
Full support for this research was provided by
Vulcan Inc. as part of Project Halo
2 Problem
[Diagram: an end user poses an English question, which is turned into a question encoding posed against the KB.]
3 Task
- Context: end users pose questions to knowledge-based question-answering systems without intimate knowledge of the structure of the knowledge base.
- Task: translate end users' encodings so that they align with the KB.
4 Input and output
- Naïve encoding: a question encoded without regard for the structure of the knowledge base. Naïve encodings are often literal translations of the original English expressions, i.e. the form of questions we should expect from end users.
- Correct encoding: a question encoding that aligns with the structure of the knowledge base.
5 Loose speak
- Loose speak: the part of an encoding that fails to align with the knowledge base.
- Not meant to be pejorative: "loose" refers to the imprecise way that people form English expressions.
6 Project Halo phase I
- Three systems for Advanced Placement chemistry (Barker et al. 2004; Angele et al. 2003).
- A chemistry KB is built.
- The best KB answers enough questions to score a 3.
- Knowledge engineers encode 160 English test questions (10 man-weeks).
7 Project Halo phase II
- Develop a knowledge-acquisition tool that will enable domain experts in the sciences to independently formulate and debug high-quality, reusable knowledge modules.
- Develop a knowledge-based question-answering system that allows an untrained end user to pose questions and problems to those underlying knowledge modules.
8 Examples (continued)
- When dilute nitric acid was added to a solution of one of the following chemicals, a gas was evolved. This gas turned a drop of limewater, Ca(OH)2, cloudy, due to the formation of a white precipitate. The chemical was
- (a) household ammonia, NH3
- (b) baking soda, NaHCO3
- (c) table salt, NaCl
- (d) epsom salt, MgSO4·7H2O
- (e) bleach, 5% NaOCl
9 Examples (continued)
- Which of the following aqueous solutions has the lowest conductivity?
- (a) 0.1 M CuSO4
- (b) 0.1 M KOH
- (c) 0.1 M BaCl2
- (d) 0.1 M HF
- (e) 0.1 M HNO3
10 Burden of interpreting loose speak
- Without interpreting loose speak, question encodings will not yield the right answers.
- Without intimate knowledge of the knowledge-base structure, no loose speak can be interpreted.
- Obtaining such intimate knowledge of the knowledge base and interpreting loose speak is a heavy burden.
11 Previous approaches
- Restrict expressiveness
- keywords
- question templates, such as "what happens to ___ during ___?"
- Unsuitable for questions such as the previous examples.
- Educate users
- But different KBs require different education.
- Unsuitable for untrained end users.
12 Project goal
- To improve knowledge-based question-answering systems by automating the interpretation of loose speak to produce correct encodings of questions.
- Input: a naïve encoding of a question.
- Output: an encoding of the input question that conveys the intended semantics of the input and does not contain loose speak.
13 Project Goal
[Diagram: the end user's English question is encoded as a question encoding; a loose-speak interpreter sits between the encoding and the KB.]
14 Study 1: types of loose speak
- Purpose: since a naïve encoding may differ from a correct encoding in many ways, we need to discover the types of loose speak.
- Methodology: compare naïvely encoded questions with the correct encodings.
- Data: two sets of questions from Project Halo (150 questions in total).
15 Types of loose speak
16 Types and frequencies
17 Algorithm
- Overview: reuse the knowledge in the KB being queried.
- Made of a test and a repair function.
- Test: check whether an input contains loose speak, based on constraint violations and the knowledge in the KB.
- Repair: find a list of interpretations via spread activation on the KB, using the input as anchor points.
18 Example
- Question: Hydrolysis of NaCH3COO yields?
- a strong acid and a strong base
- a weak acid and a weak base
- a strong acid and a weak base
- a weak acid and a strong base
- none of the above
19 Example (continued): test
- There is no constraint violation, because the domain of raw-material is Event, the range of raw-material is Tangible-Entity, Hydrolysis is an Event, and NaCH3COO is a Tangible-Entity.
- However, the test detects loose speak, because no super- or subclass of Hydrolysis has as raw-material a super- or subclass of NaCH3COO in the KB.
[Diagram: naïve encoding: Hydrolysis -raw-material-> NaCH3COO; Hydrolysis -result-> ? -intensity-> ?]
20 Example (continued): repair
- Breadth-first search starting from Hydrolysis.
- Spread activation terminates when it finds a super- or subclass of NaCH3COO.
[Halo KB diagram: from Hydrolysis, the edges time -> Time-Interval, site -> Place, and raw-material -> Chemical are explored; Chemical -has-basic-structural-unit-> Chemical-Entity reaches a superclass of NaCH3COO.]
21 Example (continued)
[Diagram: repaired encoding: Hydrolysis -raw-material-> Chemical -has-basic-structural-unit-> NaCH3COO; Hydrolysis -result-> ? -intensity-> ?]
22 Study 2: interpreter performance
- Data
- 50 multiple-choice questions from AP chemistry practice tests.
- Distinct from the data used in the frequency study.
- Users
- 3 users with different backgrounds in knowledge engineering and chemistry.
- Given a brief 3-page tutorial on encoding questions, not a complete tutorial on using the KB.
- Measurements
- precision and recall.
23 Experimental results
24 Discussion and analysis
- Loose speak is very common: on average, 91.3% of the encodings by the users contain loose speak.
- None of the encodings that contain loose speak would be correctly answered by our knowledge base.
- The loose speak interpreter works well in our test: precision 95%, recall near 90%.
25 Related work
- Metonymy
- Based on a set of rules (Weischedel & Sondheimer 1983; Grosz et al. 198; Lytinen, Burridge & Kirtner 1992; Fass 1997).
- Based on KB search (Browse 1978; Markert & Hahn 1997; Harabagiu 1998).
- KB search in knowledge acquisition (Davis 1979; Kim & Gil 1999; Blythe 2001).
26 Summary
- Defined loose speak as the part of a question encoding that misaligns with existing knowledge-base structures.
- Preliminary evaluation shows that loose speak is common.
- The interpreter can detect and interpret most occurrences of loose speak correctly in our test.
27 Future work
- Expand the investigation of loose speak into
other aspects of knowledge base interaction, such
as knowledge acquisition.
28 Why doesn't traversal order matter (most of the time)?
- Interpretation of an edge does not affect other edges, because the interpreter does not alter the original head and tail and does not depend on the interpretation of other edges.
- Exceptions:
- the overly-generic-concept type of loose speak: use backtracking.
- queries: process them last, and process in the direction of the edges in the queries.
29 Example (continued)
30 Why didn't you use a more sophisticated search?
- Deeper search is not better: a very deep search will return encodings that are not closely related to the input, and therefore less likely to convey its intended meaning.
- If only shallow search is needed, then brute force is sufficient.
31 Isn't everything related to something in the taxonomy? So most search results must be useless.
- The semantic relations in the searches do include the subclasses relation, but they do not include the superclasses relation.
- If both superclasses and subclasses were included, then any concept could be reached from any other by climbing up and down the taxonomy, and a large number of spurious interpretations might be returned.
32 Precision and recall definition
- Measurements (Jurafsky & Martin 2000)
- Precision = # of correct answers given by the system / # of answers given by the system.
- Recall = # of correct answers given by the system / total # of possible correct answers.
- The # of correct answers given by the system is the # of question encodings interpreted correctly.
- The # of answers given by the system is the # of question encodings for which the interpreter detects loose speak and finds an interpretation.
- The total # of possible correct answers is the number of all question encodings that contain loose speak.
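As a concrete sketch of these definitions (with hypothetical outcome labels, not the actual Halo data), precision and recall can be computed from per-question outcomes:

```python
# One outcome label per question encoding (hypothetical data):
#   "tp" - contained loose speak, interpreted correctly
#   "fp" - contained no loose speak, but was interpreted anyway
#   "fn" - contained loose speak, but not interpreted correctly
#   "tn" - contained no loose speak, and was left alone
outcomes = ["tp", "tp", "tp", "fn", "fp", "tn"]

tp = outcomes.count("tp")
fp = outcomes.count("fp")
fn = outcomes.count("fn")

# Precision: correct interpretations / all interpretations given.
precision = tp / (tp + fp)
# Recall: correct interpretations / all encodings containing loose speak.
recall = tp / (tp + fn)

print(precision, recall)  # 0.75 0.75
```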
33 Experiment details
- tp: inputs contain LS, and they are interpreted correctly.
- fp: inputs don't contain LS, but they are interpreted.
- tn: inputs don't contain LS, and they are not interpreted.
- fn: inputs contain LS, but they are not interpreted.
- Special cases
- If an input has syntax mistakes (such as a missing paren, or using a set filler instead of a single instance), fixed versions are used.
- If an input causes the interpreter to crash, it counts as no interpretation found (hence fn), no matter what the cause of the crash is (could be the KB or a really bad encoding).
- If the interpretation solves the LS in an input correctly, it counts as a true positive even if the result isn't the perfect encoding for the question.
- If the input has LS and the interpretation is incorrect or only partially correct, then it counts as fn.
34 Test and repair: test
- Constraint violation
- If the edge violates structural constraints, then it must contain loose speak (because correct encodings are consistent with the structure of the knowledge base).
- Returns many true positives and false negatives.
- Resemblance test
- If the input does not resemble any existing knowledge, then it may contain loose speak, because studies have shown that one frequently repeats similar versions of general theories (Clark et al. 2000).
- Returns many false positives and true negatives.
35 Test and repair: test (continued)
- Constraint violation: implemented as a test for domain and range violations of the relation in an edge.
- Resemblance test
- If the edge represents a query, then it passes the test only if the KB can compute one or more fillers for the tail.
- Otherwise, it passes if the KB contains an edge (Head_kb, relation, Tail_kb) such that Head_kb subsumes or is subsumed by Head_q, and Tail_kb subsumes or is subsumed by Tail_q.
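The two tests can be sketched roughly as follows. The taxonomy, relation signatures, and KB edges below are invented toy stand-ins for illustration, not the Halo KB.

```python
# Toy taxonomy: concept -> direct superclass (None at the root).
PARENT = {
    "Hydrolysis": "Event",
    "NaCH3COO": "Chemical-Entity",
    "Chemical-Entity": "Tangible-Entity",
    "Chemical": "Tangible-Entity",
    "Event": None,
    "Tangible-Entity": None,
}
# Relation signatures: relation -> (domain, range).
SIGNATURE = {"raw-material": ("Event", "Tangible-Entity")}
# Known KB edges as (head, relation, tail) triples.
KB_EDGES = [("Hydrolysis", "raw-material", "Chemical")]

def ancestors(c):
    """c plus all of its superclasses, walking up the taxonomy."""
    out = []
    while c is not None:
        out.append(c)
        c = PARENT.get(c)
    return out

def subsumes_either_way(a, b):
    return a in ancestors(b) or b in ancestors(a)

def violates_constraint(head, relation, tail):
    """Domain/range violation test: the head must fall under the
    relation's domain and the tail under its range."""
    domain, range_ = SIGNATURE[relation]
    return domain not in ancestors(head) or range_ not in ancestors(tail)

def resembles_kb(head, relation, tail):
    """Resemblance test: does some KB edge with this relation connect
    concepts that subsume or are subsumed by head and tail?"""
    return any(r == relation
               and subsumes_either_way(h, head)
               and subsumes_either_way(t, tail)
               for h, r, t in KB_EDGES)

edge = ("Hydrolysis", "raw-material", "NaCH3COO")
print(violates_constraint(*edge))  # False: the types are fine
print(resembles_kb(*edge))         # False: no matching KB edge, so loose speak
```

This reproduces the behavior described on slide 19: the naïve edge passes the constraint test but fails the resemblance test.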
36 Test and repair: repair
- Given an edge (C1, relation, C2), repair is implemented as two breadth-first procedures.
- search_head: start at C1, traverse all semantic relations, and stop when a suitable instance C3 is found. C3 is suitable if (C3, relation, C2) does not contain loose speak. The successful search path is returned.
- search_tail: a similar search starting from C2.
37 Example (continued): interpreting loose speak
- The domain of intensity is Thing, and the range is Intensity-Value. Because the result of Hydrolysis is a Chemical, which is a Thing, it passes the constraint-violation test.
- However, because the query about intensity does not return any value, it fails the resemblance test.
[Diagram: Hydrolysis -raw-material-> Chemical -has-basic-structural-unit-> NaCH3COO; Hydrolysis -result-> ? -intensity-> ?]
38 Example (continued): interpreting loose speak
- search_head finds that the Base-Role played by the resulting Chemical has an intensity value.
- It's a role type of loose speak.
[Halo KB diagram: Chemical -has-basic-structural-unit-> Chemical-Entity; Chemical -plays-> Base-Role -intensity-> Intensity-Value]
39 Example (continued): interpreting loose speak
[Diagram: final encoding: Hydrolysis -raw-material-> Chemical -has-basic-structural-unit-> NaCH3COO; Hydrolysis -result-> ? -plays-> ? -intensity-> ?]