A Proposal for an Automatic Loose-Speak Interpreter - PowerPoint PPT Presentation

1
A Proposal for an Automatic Loose-Speak
Interpreter
  • James Fan

2
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

3
Knowledge-Base (KB) Interaction
  • Knowledge acquisition (KA): knowledge engineers
    adding assertions to a KB
  • Question answering (QA): users posing queries to
    a KB

4
The Difficulties in KB Interaction
  • One cause of KB interaction difficulties: KB
    misalignments

5
KB Misalignment Example
(Diagram: assertions/queries are encoded into the KB.
Example input: HCl is yellow-greenish; the KB
represents HCl with has-part links to H and Cl-.)
6
KB Misalignments Are Common
  • Misalignments are not limited to our KB.
  • The arbitrary nature of KR makes misalignment
    unavoidable.
  • The misalignment problem gets worse when subject
    matter experts (SMEs) encode knowledge.

7
KB Misalignments Are Difficult to Fix
  • Aligning encodings with a KB requires a great deal
    of intimate knowledge about the KB.
  • Example: In the Halo Project, a team of KEs spent
    two weeks encoding 150 questions, and aligning the
    encodings was a significant part of that effort.

8
Naïve Encodings
  • Encodings made without regard for the KB being
    interacted with.
  • Pros: straightforward, faithful, and literal.
  • Cons: often misaligned with KBs.

9
Correct Encodings
  • Encodings that convey the meaning of the input and
    are compatible with the KB.
  • Pros: suitable for KB reasoning.
  • Cons: unintuitive; require extensive knowledge
    about the idiosyncrasies of the KB.

10
Loose Speak
  • The KB interaction that results in a discrepancy
    between a naïve encoding and a correct encoding
    of the same input.

11
Loose Speak Example 1
  • Input assertion:
  • The civil defense office has reported a clash
    between policemen and demonstrators.

12
LS Example 1 (continued)
Naïve encoding
  • Discrepancies
  • metonymy
  • role
  • aggregate

Correct encoding
13
Loose Speak Example 2
  • Input query:
  • What is the equilibrium constant of the following
    reaction, given that H2C2O4 is a diprotic acid
    with K1 = 5.36×10⁻² and K2 = 4.3×10⁻⁵:
    H2C2O4 + 2H2O ⇌ 2H3O⁺ + C2O4²⁻ ?

14
LS Example 2 (continued)
Naïve Encoding
  • Discrepancies
  • Too generic concepts
  • Aggregate

Correct Encoding
15
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

16
Thesis
  • It is possible to have the best of both naïve
    encodings and correct encodings by allowing users
    to speak loosely, then automatically interpreting
    the naïve encodings into correct encodings.

17
Challenges in Interpreting LS
  • Countless occurrences of KB misalignments
  • Each occurrence could constitute a different type
  • Challenge how to interpret so many different
    types of KB misalignments?

18
Four Hypotheses
  • Frequency hypothesis: LS concentrates on a few
    types.
  • Bootstrapping hypothesis: LS usually can be
    interpreted using ONLY the knowledge from the KBs
    interacted with. No new LS knowledge is needed.
  • Prior knowledge hypothesis: LS occurs when an
    input is unrelated to any prior knowledge.
  • Related knowledge hypothesis: LS can be
    interpreted by searching the space around the
    input.

19
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

20
Evaluation of Frequency Hypothesis
  • Goal: find common LS types and estimate their
    frequencies
  • Data:
  • Corpus study: 100 randomly chosen sentences from
    each of 3 corpora: Brown, MUC, and Alberts.
  • Halo questions: two sets of AP test questions
    (200 questions) in the form of English sentences.

21
Methodology
  • Encode the sentences literally without regard for
    the idiosyncrasies in our KB.
  • Encode the sentences according to the
    idiosyncrasies of our KB.
  • Compare the two encodings and record any
    misalignments.

22
LS Types
  • Metonymy: an attribute used for the thing itself
  • Causal factor: causes used for results
  • Too generic concepts: generic concepts used for
    specific ones
  • Roles: things described in the context of events
  • Aggregate: an individual used for a set
  • Spatial relations: spatial relations used between
    objects instead of locations
  • Temporal relations: temporal relations used
    between events instead of times
  • Noun compounds: sequences of nouns without
    semantic relations explicitly specified
  • Metaphors: one thing figuratively used to refer
    to another based on similarity

23
Metonymy
  • An attribute (metonym) is used in place of the
    thing itself (referent)
  • Example: Pearl Harbor caused the US to declare
    war against Japan.
  • Commonly used metonymic relations (Lakoff and
    Johnson 1980):
  • PART-FOR-WHOLE
  • PRODUCER-FOR-PRODUCT
  • OBJECT-FOR-USER
  • CONTROLLER-FOR-CONTROLLED
  • INSTITUTION-FOR-PEOPLE-RESPONSIBLE
  • PLACE-FOR-INSTITUTION
  • PLACE-FOR-EVENT

24
Causal Factor
  • Results in a causal chain are described by their
    causes.
  • Example: What is the result of mixing NaOH and
    HCl?
  • Not viewed as metonymy because the causal relation
    is excluded from Lakoff and Johnson's list.

25
Frequency Study Results
26
Analysis
  • Frequency hypothesis validated: LS does
    concentrate on a few types.
  • Frequencies of the different types of LS vary
    across domains.

27
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

28
Loose-speak Interpreter Requirements
  • Automaticity: intrude upon the KB interaction as
    little as possible.
  • Coverage: cover all the important LS types, and
    be easily extended to handle additional types
    should they occur.

29
Loose-speak InterPretation System (LIPS) Overview
The traditional KB interaction model.
The LIPS KB interaction model.
30
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpretation
  • Future work
  • Related work

31
Noun Compound Interpretation
  • Noun compound: a sequence of nouns composed of a
    head noun and one (or more) modifiers, such as
    concrete floor.
  • Noun compound interpretation: find a sequence of
    semantic relations that links the nouns in a
    compound.
  • Example: animal virus
  • Noun compounds in KA: a new concept is often named
    by a noun compound, and the knowledge about its
    constituent nouns is often skeletal.

(Diagram: Virus -agent-> Invade -object-> Cell
-basic-structural-unit-> Animal)
32
Bootstrapping Hypothesis in Noun Compound
Interpretation
  • Bootstrapping hypothesis:
  • Supported if noun compounds can be interpreted
    without much knowledge about the constituent
    nouns.
  • Example: concrete floor interpreted without
    knowing that concrete is made of sand, gravel and
    cement.

33
Ablations
  • The impact of each level of the ontology is
    measured through a series of ablations.
  • When a level is ablated, the concepts on that
    level and all their axioms are deleted from the
    KB.

Ablation of level 1 on a sample taxonomy
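An ablation of this kind can be sketched in a few lines. The toy taxonomy, its concept names, and the per-concept axiom counts below are illustrative assumptions, not the thesis KB; the sketch only shows the mechanics of deleting one level and its local axioms:

```python
# Toy taxonomy: concept -> parent, with local axiom counts per concept.
# All names and counts here are illustrative stand-ins for the real KB.
PARENT = {"Entity": None, "Object": "Entity", "Event": "Entity",
          "Cell": "Object", "Invade": "Event"}
AXIOMS = {"Entity": 2, "Object": 5, "Event": 4, "Cell": 3, "Invade": 1}

def level(concept):
    """Depth of a concept in the taxonomy: the root sits on level 0."""
    depth = 0
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        depth += 1
    return depth

def ablate(n):
    """Delete every concept on level n together with its local axioms,
    reattaching its children to its grandparent so the tree stays connected."""
    doomed = [c for c in PARENT if level(c) == n]
    for c in list(PARENT):
        if PARENT[c] in doomed:
            PARENT[c] = PARENT[PARENT[c]]
    for c in doomed:
        del PARENT[c]
        del AXIOMS[c]

ablate(1)  # remove the level-1 concepts, as in the sample-taxonomy figure
```

After `ablate(1)`, `Object` and `Event` are gone along with their axioms, and `Cell` and `Invade` hang directly off `Entity`.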
34
Related Knowledge Hypothesis in Noun Compound
Interpretation
  • Related knowledge hypothesis:
  • Validated if a good percentage of noun compounds
    are interpreted correctly by searching the space
    around the constituent nouns.

35
Noun Compound Interpreter Algorithm
  • Given a noun compound with constituent concepts
    C1 and C2:
  • A breadth-first search starting from C1 stops
    when C2 or any super/subclass of C2 is found
  • A breadth-first search starting from C2 stops
    when C1 or any super/subclass of C1 is found
  • Example

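The search above can be sketched as a breadth-first walk over a toy KB graph. The KB triples and relation names below are illustrative assumptions (they mimic the animal-virus diagram), and the sketch simplifies the algorithm by searching from one side only and ignoring super/subclasses:

```python
from collections import deque

# Toy KB: directed (concept, relation, concept) triples. Contents are
# illustrative only -- the real KBs cover biology, engine repair, etc.
KB = [
    ("Virus", "agent-of", "Invade"),
    ("Invade", "object", "Cell"),
    ("Cell", "is-basic-structural-unit-of", "Animal"),
]

def neighbors(concept):
    """Yield (relation, next_concept) pairs one KB step away,
    traversing edges in both directions."""
    for a, rel, b in KB:
        if a == concept:
            yield rel, b
        if b == concept:
            yield "inverse-of-" + rel, a

def interpret(c1, c2):
    """BFS from c1; stop when c2 is reached and return the relation path
    linking the two nouns, or None if no interpretation is found."""
    frontier = deque([(c1, [])])
    seen = {c1}
    while frontier:
        concept, path = frontier.popleft()
        if concept == c2:
            return path
        for rel, nxt in neighbors(concept):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [rel]))
    return None

# "animal virus": a relation path linking Virus to Animal
print(interpret("Virus", "Animal"))
```

On this toy KB the search recovers the sequence agent-of, object, is-basic-structural-unit-of, i.e. a virus that invades the cells that animals are built from.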
36
KBs
  • Three domains:
  • Biology textbook
  • Small engine repair manual
  • Sparcstation manual
  • They share a top-level ontology, but have few
    other concepts in common

37
Measurements
  • P -- precision; R -- recall
  • Csystem -- number of correct answers by the
    system, i.e. the number of correctly interpreted
    noun compounds
  • Asystem -- number of answers by the system, i.e.
    the number of interpreted noun compounds
  • Cpossible -- total number of possible correct
    answers, i.e. the number of noun compounds tested

38
Measurements (continued)
  • If R ↓ and P ↑ then
  • Csystem ↓ because Cpossible remains the same
  • Asystem ↓ ↓
  • Cpossible - Asystem = number of uninterpreted
    inputs ↑

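The measurements above can be sketched directly from their definitions; the counts in the example run are hypothetical, not the thesis results:

```python
def precision_recall(c_system, a_system, c_possible):
    """P = Csystem / Asystem, R = Csystem / Cpossible.
    When P > R, Asystem < Cpossible: some inputs were left uninterpreted."""
    precision = c_system / a_system
    recall = c_system / c_possible
    uninterpreted = c_possible - a_system
    return precision, recall, uninterpreted

# Hypothetical run: 97 compounds tested, 90 interpreted, 82 correct
p, r, u = precision_recall(82, 90, 97)
```

With these numbers precision exceeds recall, and the gap corresponds exactly to the 7 compounds the system declined to interpret.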
39
Results
40
Results (continued)
  • Precisions are 93.8%, 85.2%, and 84.5%; recalls
    are 93.8%, 74.5%, and 73.2% without ablations
  • Ablating level 1 causes a big drop in both
    precision and recall.
  • Ablating level 2 causes a gap between precision
    and recall, which indicates no interpretations
    are found for many noun compounds.
  • As lower levels are ablated, the impact
    diminishes.

41
Analysis
  • Related knowledge hypothesis validated because a
    good percentage of noun compounds are interpreted
    correctly.
  • Why are the top levels of the ontology the most
    important? Two possible reasons:
  • Because they contain more knowledge?
  • Because their knowledge is more important?

42
Axiom-per-Level Count
  • Use axiom counts as a measure of the amount of
    knowledge
  • Count only the local axioms

43
Axiom-per-level Count Results
44
Analysis (2)
  • Therefore the top levels' knowledge is more
    important than the lower levels'.
  • Bootstrapping hypothesis validated because only
    the top levels are needed for the noun compound
    interpretation task.

45
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

46
Future Work
  • Extend LIPS
  • Evaluate LIPS

47
Extending LIPS
  • Implement interpreters for other types of LS.
  • Based on our hypotheses, other types of LS can be
    detected and interpreted using similar KB-search
    methods.

48
Extending LIPS (continued)
  • Example: given the assertion HCl is
    green-yellowish,
  • Naïve encoding
  • LS detection:
  • HCl doesn't have any knowledge of color, and
  • The domain of the color relation does not apply
    to HCl

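The two detection checks above can be sketched over a toy KB fragment. The class hierarchy, the relation-domain table, and the idea that HCl falls outside the color relation's domain in this KB are all assumptions made for illustration:

```python
# Toy KB fragment; names and hierarchy are illustrative assumptions.
DOMAIN = {"color": "Chemical"}     # the color relation applies to Chemical
ISA = {"HCl": "Chemical-Compound", "Cl2": "Chemical",
       "Chemical-Compound": "Thing", "Chemical": "Thing", "Thing": None}
FACTS = {("HCl", "has-part"): ["H", "Cl-"],
         ("Cl2", "color"): ["Green-Yellow"]}   # no color fact on HCl

def is_a(concept, cls):
    """Walk the isa chain upward, checking class membership."""
    while concept is not None:
        if concept == cls:
            return True
        concept = ISA.get(concept)
    return False

def detect_loose_speak(subject, relation):
    """Flag loose speak when the subject has no prior knowledge of the
    relation AND the relation's declared domain does not cover it."""
    no_prior = (subject, relation) not in FACTS
    domain_violation = not is_a(subject, DOMAIN[relation])
    return no_prior and domain_violation

print(detect_loose_speak("HCl", "color"))   # the assertion is flagged
```

`Cl2`, which does carry a color fact in this toy KB, is not flagged, while the naïve encoding of "HCl is green-yellowish" is.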
49
Extending LIPS (continued)
  • LS interpretation


(Diagram: HCl -has-part-> H, Cl-;
HCl -is-basic-structural-unit-of-> Chemical;
Chemical -color-> Color)
50
LIPS Evaluation Environments
  • Incorporate LIPS into SHAKEN
  • A knowledge acquisition system
  • Used in the RKF project
  • To be used in the Halo II and CALO projects
  • Incorporate LIPS into a controlled language
    question answering system (Clark and Robinson
    2003)
  • A question answering system
  • Proposed for Halo II

51
New Applications
  • Halo II
  • Follow-on to the Halo project
  • Three science domains
  • Aims to enable domain experts to encode
    knowledge modules and untrained end users to pose
    questions and problems to those knowledge
    modules.
  • EPCA
  • Office procedure domain
  • Naïve users interacting with a KB

52
Outline
  • What is loose-speak
  • Thesis
  • Frequency study
  • LIPS
  • Case study: noun compound interpreter
  • Future work
  • Related work

53
Ontology Merging
  • Combine several ontologies into one standard
    ontology (Niles and Pease 2001; McGuinness et
    al. 2000; Noy and Musen 1999; Chalupsky 2000)
  • Similar to LS:
  • Concerned with resolving representational
    differences.
  • Different from LS because:
  • The objects being replaced are different
  • The need for automation is different

54
Semantic Interpretation
  • Map the representation of utterances in natural
    language into a formal representation of meaning
    (Ratnaparkhi 1997; Yarowsky 1992; Gardent and
    Webber 1998)
  • Similar to LS:
  • Resolves the ambiguities in natural language.
  • Different from LS because:
  • The emphasis is on structural, semantic, and
    scope ambiguities.

55
Noun Compounds
  • Noun compounds in NLP: classify noun compounds
    into different categories.
  • Similar to LS:
  • Tries to disambiguate the underlying semantic
    relation in noun compounds.
  • Different in interpretation:
  • Semantic category vs. sequence of semantic
    relations, e.g. animal virus

56
Noun Compounds (continued)
  • Rule-based approaches (Rosemary 1984)
  • Example: if the modifier is a material, then it's
    a made-of category, as in marble statue.
  • Machine-learning-based classification approaches
    (Lauer 1994, Barker 1998)
  • KB-search-based approaches (Vanderwende 1994)

57
LIPS and Noun Compounds
  • Rule-based approaches are unsuitable because:
  • They are not as flexible
  • They require additional knowledge, which may
    erase the KA gains from automatically
    interpreting LS.
  • Machine-learning-based classification approaches
    are unsuitable for LIPS because:
  • Training examples are lacking
  • Knowledge is abundant
  • Semantic relations, not a semantic category, are
    needed as the end goal

58
Metonymy Interpretation
  • Rule-based approaches (Weischedel and Sondheimer
    1983, Grosz et al. 1987, Lytinen et al. 1992,
    Fass 1997)
  • Meta-rules on how to interpret metonymy
  • Pros:
  • Easy to implement
  • Cons:
  • Can only handle a fixed number of metonymy types
  • May need different rules for different domains

59
Metonymy Interpretation (continued)
  • KB-search-based approaches (Browse 1978, Markert
    and Hahn 1997, Harabagiu 1998)
  • Rely on searches in general purpose knowledge
    bases.
  • Pros:
  • No restriction on the types of metonymy they can
    interpret.
  • No changes needed for different domains
  • Cons:
  • Require a large KB

60
LIPS and Previous Metonymy Approaches
  • LIPS is similar to the KB-based approaches
  • Difference:
  • Uses the prior knowledge hypothesis instead of
    type violation to detect metonymy
  • Example:
  • KB: Car has-part Engine, Engine has-part
    Carburetor
  • Query: the carburetor part of a car?
  • Other metonymy systems: no metonymy found because
    there are no type violations
  • LIPS: metonymy found because no prior knowledge
    of a carburetor part of a car is found.

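The carburetor example can be sketched concretely: the detection step checks for a directly asserted fact rather than a type violation, and the interpretation step searches the KB around the query. Only the two has-part facts come from the slide; the function names and dictionary encoding are assumptions for illustration:

```python
from collections import deque

# KB from the slide: Car has-part Engine, Engine has-part Carburetor.
KB = {("Car", "has-part"): ["Engine"],
      ("Engine", "has-part"): ["Carburetor"]}

def prior_knowledge(subj, rel, obj):
    """Type-violation detectors would accept this query outright; LIPS
    instead asks whether the fact is directly asserted in the KB."""
    return obj in KB.get((subj, rel), [])

def search_path(subj, rel, obj):
    """Breadth-first chase along `rel` edges to interpret the loose query."""
    frontier = deque([[subj]])
    while frontier:
        path = frontier.popleft()
        if path[-1] == obj:
            return path
        for nxt in KB.get((path[-1], rel), []):
            frontier.append(path + [nxt])
    return None

# "the carburetor part of a car": no prior knowledge, so metonymy is
# flagged and the space around the input is searched instead
if not prior_knowledge("Car", "has-part", "Carburetor"):
    print(search_path("Car", "has-part", "Carburetor"))
```

The search recovers the chain Car, Engine, Carburetor, i.e. the carburetor is part of the engine, which is part of the car.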
61
Conclusion
  • Defined loose-speak
  • Studied the frequencies of different types of LS
  • Proposed an automatic loose-speak interpreter
  • Analyzed how the noun compound interpreter works
  • Proposed to further develop and evaluate the
    automatic loose-speak interpreter