Title: A Proposal for an Automatic Loose-Speak Interpreter
1. A Proposal for an Automatic Loose-Speak Interpreter
2. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
3. Knowledge-Base (KB) Interaction
- Knowledge acquisition (KA): knowledge engineers adding assertions to a KB
- Question answering (QA): users posing queries to a KB
4. The Difficulties in KB Interaction
- One cause of KB interaction difficulties: KB misalignments
5. KB Misalignment Example
(Diagram: assertions/queries are encoded and sent to the KB; the input "HCl is greenish-yellow" must align with a KB in which HCl has-part H and Cl-.)
6. KB Misalignments Are Common
- Misalignments are not limited to our KB.
- The arbitrary nature of knowledge representation (KR) makes misalignment unavoidable.
- The misalignment problem gets worse when SMEs encode knowledge.
7. KB Misalignments Are Difficult to Fix
- Aligning encodings with a KB requires intimate knowledge of the KB.
- Example: in the Halo Project, a team of KEs spent two weeks encoding 150 questions, and aligning the encodings was a significant part of that effort.
8. Naïve Encodings
- Encodings made without regard for the KB being interacted with.
- Pros: straightforward, faithful, and literal.
- Cons: often misaligned with KBs.
9. Correct Encodings
- Encodings that convey the meaning of the input and are compatible with the KB.
- Pros: suitable for KB reasoning.
- Cons: unintuitive; require extensive knowledge of the KB's idiosyncrasies.
10. Loose Speak
- The KB interaction that results in a discrepancy between a naïve encoding and a correct encoding of the same input.
11. Loose Speak Example 1
- Input assertion:
  - "The civil defense office has reported a clash between policemen and demonstrators."
12. LS Example 1 (continued)
- Naïve encoding (diagram)
- Discrepancies:
  - metonymy
  - role
  - aggregate
- Correct encoding (diagram)
13. Loose Speak Example 2
- Input query:
  - "What is the equilibrium constant of the following reaction, given that H2C2O4 is a diprotic acid with K1 = 5.36×10^-2 and K2 = 4.3×10^-5? H2C2O4 + 2H2O ⇌ 2H3O+ + C2O4^2-"
14. LS Example 2 (continued)
- Naïve encoding (diagram)
- Discrepancies:
  - too-generic concepts
  - aggregate
- Correct encoding (diagram)
15. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
16. Thesis
- It is possible to have the best of both naïve encodings and correct encodings by allowing users to speak loosely, then automatically interpreting the naïve encodings into correct encodings.
17. Challenges in Interpreting LS
- Countless occurrences of KB misalignments
- Each occurrence could constitute a different type
- Challenge: how to interpret so many different types of KB misalignments?
18. Four Hypotheses
- Frequency hypothesis: LS concentrates on a few types.
- Bootstrapping hypothesis: LS can usually be interpreted using ONLY the knowledge from the KBs interacted with; no new LS knowledge is needed.
- Prior knowledge hypothesis: LS occurs when an input is unrelated to any prior knowledge.
- Related knowledge hypothesis: LS can be interpreted by searching the space around the input.
19. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
20. Evaluation of Frequency Hypothesis
- Goal: find common LS types and estimate their frequencies.
- Data:
  - Corpus study: 100 randomly chosen sentences from each of 3 corpora (Brown, MUC, and Alberts).
  - Halo questions: two sets of AP test questions (200 questions) in the form of English sentences.
21. Methodology
- Encode the sentences literally, without regard for the idiosyncrasies of our KB.
- Encode the sentences according to the idiosyncrasies of our KB.
- Compare the two encodings and record any misalignments.
22. LS Types
- Metonymy: an attribute used for the thing itself
- Causal factor: causes used for results
- Too-generic concepts: generic concepts used for specific ones
- Roles: things that are in the context of events
- Aggregate: an individual used for a set
- Spatial relations: spatial relations used between objects instead of locations
- Temporal relations: temporal relations used between events instead of times
- Noun compounds: sequences of nouns whose semantic relations are not explicitly specified
- Metaphors: one thing figuratively used to refer to another based on similarity
23. Metonymy
- An attribute (metonym) is used in place of the thing itself (referent).
- Example: "Pearl Harbor caused the US to declare war against Japan."
- Commonly used metonymic relations (Lakoff and Johnson 1980):
  - PART-FOR-WHOLE
  - PRODUCER-FOR-PRODUCT
  - OBJECT-FOR-USER
  - CONTROLLER-FOR-CONTROLLED
  - INSTITUTION-FOR-PEOPLE-RESPONSIBLE
  - PLACE-FOR-INSTITUTION
  - PLACE-FOR-EVENT
24. Causal Factor
- Results in a causal chain are described by their causes.
- Example: "What is the result of mixing NaOH and HCl?"
- Not viewed as metonymy because the causal relation is excluded from Lakoff and Johnson's list.
25. Frequency Study Results
26. Analysis
- The frequency hypothesis is validated: LS does concentrate on a few types.
- The frequencies of different types of LS vary across domains.
27. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
28. Loose-Speak Interpreter Requirements
- Automaticity: intrude upon the KB interaction as little as possible.
- Coverage: cover all the important LS types, and be easily extended to handle additional types should they occur.
29. Loose-Speak InterPretation System (LIPS) Overview
- (Diagram) The traditional KB interaction model.
- (Diagram) The LIPS KB interaction model.
30. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
31. Noun Compound Interpretation
- Noun compound: a sequence of nouns composed of a head noun and one (or more) modifiers, such as "concrete floor".
- Noun compound interpretation: find a sequence of semantic relations that links the nouns in a compound.
- Example: "animal virus" → (Diagram: a Virus is the agent of an Invade event whose object is a Cell, the basic structural unit of an Animal.)
- Noun compounds in KA: a new concept is given as a noun compound, and the knowledge about its constituent nouns is often skeletal.
32. Bootstrapping Hypothesis in Noun Compound Interpretation
- Bootstrapping hypothesis:
  - supported if noun compounds can be interpreted without much knowledge about the constituent nouns
  - Example: "concrete floor" interpreted without knowing that concrete is made of sand, gravel, and cement
33. Ablations
- The impact of each level of the ontology is measured through a series of ablations.
- When a level is ablated, the concepts on that level and all their axioms are deleted from the KB.
- (Diagram) Ablation of level 1 on a sample taxonomy; a minimal code sketch follows below.
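To make the ablation procedure concrete, here is a minimal sketch under two stated assumptions: the taxonomy is a simple parent-pointer tree, and the children of a deleted concept are re-attached to their nearest surviving ancestor (the slides do not specify this detail). All concept names and axiom strings are illustrative.

```python
# Toy ontology: concept -> (parent, local axioms). Illustrative names only.
ONTOLOGY = {
    "Thing":    (None,     ["axiom-about-Thing"]),
    "Entity":   ("Thing",  ["axiom-about-Entity"]),
    "Event":    ("Thing",  ["axiom-about-Event"]),
    "Chemical": ("Entity", ["axiom-about-Chemical"]),
}

def depth(concept):
    """Level of a concept = its depth below the root (root = level 0)."""
    d, parent = 0, ONTOLOGY[concept][0]
    while parent is not None:
        d, parent = d + 1, ONTOLOGY[parent][0]
    return d

def ablate(level):
    """Delete every concept at `level`, together with its local axioms,
    re-parenting orphaned children to the nearest surviving ancestor
    (an assumption; the slides only say the concepts are deleted)."""
    doomed = {c for c in ONTOLOGY if depth(c) == level}
    survivors = {}
    for concept, (parent, axioms) in ONTOLOGY.items():
        if concept in doomed:
            continue
        while parent in doomed:          # skip over deleted ancestors
            parent = ONTOLOGY[parent][0]
        survivors[concept] = (parent, axioms)
    return survivors

print(sorted(ablate(1)))  # ['Chemical', 'Thing'] -- level-1 concepts are gone
```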
34. Related Knowledge Hypothesis in Noun Compound Interpretation
- Related knowledge hypothesis:
  - validated if a good percentage of noun compounds are interpreted correctly by searching the space around the constituent nouns
35. Noun Compound Interpreter Algorithm
- Given a noun compound (C1, C2):
  - a breadth-first search starting from C1 stops when C2 or any super/subclass of C2 is found
  - a breadth-first search starting from C2 stops when C1 or any super/subclass of C1 is found
- Example: "animal virus" (a minimal code sketch follows below)
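A minimal sketch of this bidirectional search, assuming a toy KB stored as labeled edges; the relation names, the taxonomy fragment, and the helper functions are illustrative stand-ins, not the actual KB API. It reproduces the "animal virus" chain from slide 31.

```python
from collections import deque

# Toy KB: concept -> list of (relation, concept) edges, loosely following
# the "animal virus" example (Virus -agent-of-> Invade -object-> Cell ...).
KB_EDGES = {
    "Virus":  [("agent-of", "Invade")],
    "Invade": [("object", "Cell")],
    "Cell":   [("is-basic-structural-unit-of", "Animal")],
}
SUBCLASSES = {"Animal": {"Vertebrate"}}    # hypothetical taxonomy fragment
SUPERCLASSES = {"Vertebrate": {"Animal"}}

def related(concept, target):
    """True if concept is the target or one of its super/subclasses."""
    return (concept == target
            or concept in SUBCLASSES.get(target, ())
            or concept in SUPERCLASSES.get(target, ()))

def bfs(start, target):
    """Breadth-first search from `start`; returns the relation path that
    reaches `target` (or a super/subclass of it), or None."""
    queue, visited = deque([(start, [])]), {start}
    while queue:
        concept, path = queue.popleft()
        if path and related(concept, target):
            return path
        for relation, nxt in KB_EDGES.get(concept, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(concept, relation, nxt)]))
    return None

def interpret(c1, c2):
    """Search from C1 toward C2, then from C2 toward C1, as on the slide."""
    return bfs(c1, c2) or bfs(c2, c1)

print(interpret("Animal", "Virus"))
# [('Virus', 'agent-of', 'Invade'), ('Invade', 'object', 'Cell'),
#  ('Cell', 'is-basic-structural-unit-of', 'Animal')]
```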
36. KBs
- Three domains:
  - biology textbook
  - small engine repair manual
  - Sparcstation manual
- The KBs share a top-level ontology, but have few other concepts in common.
37. Measurements
- P: precision; R: recall
- Csystem: number of correct answers by the system, i.e. the number of correctly interpreted noun compounds
- Asystem: number of answers by the system, i.e. the number of interpreted noun compounds
- Cpossible: total number of possible correct answers, i.e. the number of noun compounds tested
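In symbols, these counts give the standard precision and recall definitions:

```latex
P = \frac{C_{\mathrm{system}}}{A_{\mathrm{system}}},
\qquad
R = \frac{C_{\mathrm{system}}}{C_{\mathrm{possible}}}
```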
38. Measurements (continued)
- If R ↑ and P ↑, then:
  - Csystem ↑, because Cpossible remains the same
  - Asystem may move either way (↑ or ↓)
  - Cpossible − Asystem, the number of uninterpreted inputs, moves opposite to Asystem
39. Results
40. Results (continued)
- Without ablations, precisions are 93.8%, 85.2%, and 84.5%; recalls are 93.8%, 74.5%, and 73.2%.
- Ablating level 1 causes a big drop in both precision and recall.
- Ablating level 2 causes a gap between precision and recall, which indicates that no interpretations are found for many noun compounds.
- As lower levels are ablated, the impact diminishes.
41. Analysis
- The related knowledge hypothesis is validated because a good percentage of noun compounds are interpreted correctly.
- Why are the top levels of the ontology the most important? Two possible reasons:
  - because they contain more knowledge?
  - because their knowledge is more important?
42. Axiom-per-Level Count
- Use axiom counts as a measurement of the amount of knowledge.
- Count only the local axioms.
43. Axiom-per-Level Count Results
44. Analysis (2)
- Therefore the top levels' knowledge is more important than the lower levels' knowledge.
- The bootstrapping hypothesis is validated because only the top levels are needed for the noun compound interpretation task.
45. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
46. Future Work
- Extend LIPS
- Evaluate LIPS
47. Extending LIPS
- Implement interpreters for other types of LS.
- Based on our hypotheses, other types of LS can be detected and interpreted using similar KB-search methods.
48. Extending LIPS (continued)
- Example: given the assertion "HCl is greenish-yellow"
- Naïve encoding (diagram)
- LS detection (a minimal sketch follows below):
  - HCl doesn't have any knowledge of color, and
  - the domain of the color relation does not apply to HCl
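A minimal sketch of these two detection checks, assuming the KB exposes per-concept assertions, declared relation domains, and isa links; every identifier below is hypothetical, not the actual LIPS code.

```python
# Hypothetical KB fragments for the "HCl is greenish-yellow" example.
ASSERTIONS = {"HCl": {"has-part": ["H", "Cl-"]}}  # HCl has no color knowledge
RELATION_DOMAIN = {"color": "Chemical"}           # color is declared on Chemical
ISA = {"HCl": {"Molecule"}}                       # HCl is not itself a Chemical here

def has_prior_knowledge(concept, relation):
    """Check 1: does the KB already say anything about this relation?"""
    return relation in ASSERTIONS.get(concept, {})

def domain_applies(relation, concept):
    """Check 2: does the relation's declared domain cover the concept?"""
    return RELATION_DOMAIN.get(relation) in ISA.get(concept, set())

def is_loose_speak(concept, relation):
    """Flag LS when both checks fail, as on the slide."""
    return (not has_prior_knowledge(concept, relation)
            and not domain_applies(relation, concept))

print(is_loose_speak("HCl", "color"))  # True -> hand off to interpretation
```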
49. Extending LIPS (continued)
(Diagram: in the KB, HCl has-part H and Cl-; HCl is-basic-structural-unit-of Chemical; and Chemical has a color relation to Color.)
50. LIPS Evaluation Environments
- Incorporate LIPS into SHAKEN:
  - a knowledge acquisition system
  - used in the RKF project
  - to be used in the Halo II and CALO projects
- Incorporate LIPS into a controlled-language question answering system (Clark and Robinson 2003):
  - a question answering system
  - proposed for Halo II
51. New Applications
- Halo II:
  - follow-on to the Halo project
  - three science domains
  - aims to enable domain experts to encode knowledge modules and untrained end users to pose questions and problems to those knowledge modules
- EPCA:
  - office procedure domain
  - naïve users interacting with a KB
52. Outline
- What is loose-speak?
- Thesis
- Frequency study
- LIPS
- Case study: noun compound interpreter
- Future work
- Related work
53. Ontology Merging
- Combine several ontologies into one standard ontology (Niles and Pease 2001; McGuinness et al. 2000; Noy and Musen 1999; Chalupsky 2000).
- Similar to LS:
  - concerned with resolving representational differences
- Different from LS because:
  - the objects being replaced are different
  - the need for automation is different
54. Semantic Interpretation
- Map the representation of utterances in natural language into a formal representation of meaning (Ratnaparkhi 1997; Yarowsky 1992; Gardent and Webber 1998).
- Similar to LS:
  - resolves the ambiguities in natural language
- Different from LS because:
  - the emphasis is on structural, semantic, and scope ambiguities
55. Noun Compounds
- Noun compounds in NLP: classify noun compounds into different categories.
- Similar to LS:
  - tries to disambiguate the underlying semantic relation in noun compounds
- Different in interpretation:
  - a semantic category vs. a sequence of semantic relations, e.g. "animal virus"
56. Noun Compounds (continued)
- Rule-based approaches (Rosemary 1984):
  - Example: if the modifier is a material, then it is a made-of category, as in "marble statue".
- Machine-learning-based classification approaches (Lauer 1994; Barker 1998)
- KB-search-based approaches (Vanderwende 1994)
57. LIPS and Noun Compounds
- Rule-based approaches are unsuitable because:
  - they are not as flexible
  - they require additional knowledge, which may erase the KA gains from automatically interpreting LS
- Machine-learning-based classification approaches are unsuitable for LIPS because of:
  - the lack of training examples
  - the abundance of knowledge
  - the need for a sequence of semantic relations, rather than a semantic category, as the end goal
58. Metonymy Interpretation
- Rule-based approaches (Weischedel and Sondheimer 1983; Grosz et al. 1987; Lytinen et al. 1992; Fass 1997):
  - meta-rules on how to interpret metonymy
- Pros:
  - easy to implement
- Cons:
  - can only handle a fixed number of metonymy types
  - may need different rules for different domains
59. Metonymy Interpretation (continued)
- KB-search-based approaches (Browse 1978; Markert and Hahn 1997; Harabagiu 1998):
  - rely on searches in general-purpose knowledge bases
- Pros:
  - no restriction on the types of metonymy that can be interpreted
  - no changes needed for different domains
- Cons:
  - requires a large KB
60. LIPS and Previous Metonymy Approaches
- LIPS is similar to the KB-search-based approaches.
- Difference: LIPS uses the prior knowledge hypothesis instead of type violation to detect metonymy.
- Example (a minimal sketch follows below):
  - KB: Car has-part Engine; Engine has-part Carburetor
  - Query: the carburetor part of a car?
  - Other metonymy systems: no metonymy found, because there is no type violation
  - LIPS: metonymy found, because no prior knowledge of a carburetor part of a car exists
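A minimal sketch of the contrast, using the Car/Engine/Carburetor KB from the slide; the detection and search functions are illustrative stand-ins, not the actual LIPS or metonymy-system implementations.

```python
from collections import deque

# KB from the slide: Car has-part Engine, Engine has-part Carburetor.
EDGES = {"Car": [("has-part", "Engine")],
         "Engine": [("has-part", "Carburetor")]}

def directly_asserted(concept, relation, filler):
    """Prior-knowledge check: is the triple asserted as-is?"""
    return (relation, filler) in EDGES.get(concept, ())

def search_path(start, goal):
    """Breadth-first search for any relation path from start to goal."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        concept, path = queue.popleft()
        if concept == goal:
            return path
        for relation, nxt in EDGES.get(concept, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [relation]))
    return None

# Query: "the carburetor part of a car?"
# A type-violation detector sees Carburetor as a legal has-part filler,
# raises no violation, and so never notices the metonymy. LIPS instead
# notes that no prior knowledge matches the query and searches around it.
if not directly_asserted("Car", "has-part", "Carburetor"):
    print(search_path("Car", "Carburetor"))  # ['has-part', 'has-part']
```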
61. Conclusion
- Defined loose-speak
- Studied the frequencies of different types of LS
- Proposed an automatic loose-speak interpreter
- Analyzed how the noun compound interpreter works
- Proposed to further develop and evaluate the automatic loose-speak interpreter