Title: Using Non-Taxonomic Knowledge to Improve Semantic Matching
1Using Non-Taxonomic Knowledge to Improve Semantic
Matching
2Talk Outline
- Introduction
- Analysis of Existing Techniques
- Our Approach
- Initial Evaluation
- Proposed Work
3Introduction
- Many AI tasks require determining whether two
knowledge representations encode the same
knowledge.
4Information Retrieval
- Match queries with documents.
Q: A car with a bumper made of gold.
A: Acme makes a car made of gold.
5Knowledge Acquisition
- Match new knowledge with existing knowledge.
KB
KB Are you trying to encode a conversion?
6Rule-based Classification
- Match rule antecedents with working memory. For
example, Course of Action (COA) critiquing.
Pattern
COA
This COA has a rating of good for enemy maneuver
engagement.
7The Core Problem
- Solving this matching problem is hard because
multiple encodings of the same knowledge rarely
match exactly.
- Representations don't match exactly because:
- Expressive Ontology.
- Knowledge is encoded by different sources.
- Knowledge being encoded is complex.
8Types of Mismatches
- Informal examination of a knowledge base
containing:
- Patterns.
- COAs.
- The knowledge base was built by two Subject Matter
Experts (SMEs) participating in DARPA's RKF
project.
- Looked for cases of mismatch.
9Types of Mismatches (cont.)
an armored brigade engaging an armored
battalion.
10Types of Mismatches (cont.)
One military unit attacking another unit.
- Taxonomic Differences
- Equivalent Alternatives
11Types of Mismatches (cont.)
Mechanized infantry brigade engaging mechanized
infantry battalion.
- Taxonomic Differences
- Equivalent Alternatives
- Omissions
12Types of Mismatches (cont.)
Support attack occurs before main attack.
- Taxonomic Differences
- Equivalent Alternatives
- Omissions
- Granularity
13Analysis of Existing Techniques
- Analogy
- Inexact Matching
- Semantic Matching
- Conceptual Indexing
- Ontology Merging
14Analogy
- Analogy: mapping of knowledge from a base domain
to a target domain.
- Structure Mapping Engine (Forbus et al. 89):
- Maps relational knowledge (mappable systems).
- Systematicity Principle used to select best
analogy.
- Analogy based on common generalizations (Leishman
92):
- Maps both relational knowledge and object
attributes.
- Prefers minimal common generalization.
15Analogy: Structure Mapping Engine
16Inexact Matching
- Inexact Matching: tries to address mismatches
between representations.
- Graph Editing (Tsai et al. 83, Shapiro and
Haralick 81, Messmer et al. 93, Wolverton et al.
2003):
- Uses edit distance parameters.
- Similarity based on shortest sequence of edits.
- Partial Matching:
- Does not require representations to be
isomorphic.
- Similarity based on amount of structural overlap.
- Minimal Common Supergraph (Bunke et al. 2000)
and Maximal Common Subgraph (Bunke and Shearer
98).
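A minimal sketch of similarity by structural overlap, not the cited algorithms themselves. It assumes each graph is a set of (head, relation, tail) triples with unique node labels, so the common subgraph reduces to a set intersection. A real MCS search must also find the node correspondence. Dividing by the size of the larger graph is one possible normalization, and the triples are hypothetical.

```python
# Each graph is a set of (head, relation, tail) triples. Node labels are
# assumed unique, so the common subgraph is simply the shared triples.
def overlap_similarity(g1, g2):
    """Fraction of the larger graph's triples that both graphs share."""
    if not g1 and not g2:
        return 1.0
    return len(g1 & g2) / max(len(g1), len(g2))


# Hypothetical pattern and COA fragments (not taken from the RKF KBs).
pattern = {
    ("Attack", "agent", "Armored-Brigade"),
    ("Attack", "object", "Armored-Battalion"),
}
coa = pattern | {("Attack", "site", "Hill-5")}

print(overlap_similarity(pattern, coa))
```

The score depends only on structure: two encodings of the same knowledge that use different relations or concepts share no triples, so they score zero.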
17Inexact Matching: MCS
18Semantic Matching
- Semantic Matching: uses knowledge to match
representations.
- Projection:
- Uses taxonomic knowledge.
- Ontoseek (Guarino et al. 99) and ELEN (Huibers
et al. 96).
- Projection alone is too restrictive:
- ??-projection (Genest and Chein 97).
- Common generalization, graph splitting, regular
expressions (Fargues 92, Buche et al. 2000,
Martin et al. 2001).
- Semantic Overlap:
- Maximal Joins and Generalizations (Myaeng 92,
Poole et al. 95).
- Shared Semantic Structures (Zhong et al. 2002).
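A rough sketch of projection over a taxonomy, under stated assumptions: a graph is a dict of node types plus a set of (node, relation, node) edges, and a query projects into a target when every query node maps to a target node of the same or a more specific type with all edges preserved. The taxonomy fragment, the relation names `agent` and `object`, and the brute-force search are illustrative only and do not reproduce Ontoseek or ELEN.

```python
from itertools import product

# Hypothetical taxonomy fragment: child concept -> parent concept.
PARENT = {
    "Armored-Brigade": "Military-Unit",
    "Armored-Battalion": "Military-Unit",
    "Military-Unit": "Thing",
    "Attack": "Event",
    "Event": "Thing",
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or lies below it in the taxonomy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def project(q_types, q_edges, t_types, t_edges):
    """Return a node mapping that projects the query into the target, or None."""
    q_nodes = list(q_types)
    # Brute force: try every assignment of target nodes to query nodes.
    for image in product(list(t_types), repeat=len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        types_ok = all(is_a(t_types[mapping[n]], q_types[n]) for n in q_nodes)
        edges_ok = all((mapping[a], r, mapping[b]) in t_edges for a, r, b in q_edges)
        if types_ok and edges_ok:
            return mapping
    return None


# "One military unit attacking another unit."
general_types = {"e": "Attack", "x": "Military-Unit", "y": "Military-Unit"}
general_edges = {("e", "agent", "x"), ("e", "object", "y")}
# "An armored brigade engaging an armored battalion."
specific_types = {"a1": "Attack", "u1": "Armored-Brigade", "u2": "Armored-Battalion"}
specific_edges = {("a1", "agent", "u1"), ("a1", "object", "u2")}

print(project(general_types, general_edges, specific_types, specific_edges))
print(project(specific_types, specific_edges, general_types, general_edges))
```

The general graph projects into the specific one, but not the reverse. Only taxonomic differences are bridged, which is one way to read the remark that projection alone is too restrictive.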
19Semantic Matching: Semantic Overlap
20Conceptual Indexing
- Conceptual indexing: how to organize and index
knowledge.
- Requires some form of matching.
- Generalization hierarchy (Bournard et al. 95,
Ellis 92, Levinson 82, Woods 97):
- Knowledge indexed by common generalizations.
- Generalizations organized hierarchically by
subsumption relationships.
- Retrieve Most Specific Subsumer (MSS) of a query.
- Match procedure is similar to Projection and
suffers the same problems.
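A simplified sketch of MSS retrieval, assuming each indexed generalization is a frozenset of triples and that "subsumes" means "is a subset of the query". A real generalization hierarchy would test subsumption with projection over a taxonomy and would walk the hierarchy rather than scan a flat list. The function name and the indexed graphs are hypothetical.

```python
def most_specific_subsumers(index, query):
    """Return indexed graphs that subsume the query and are not
    strictly more general than another subsuming graph."""
    subsumers = [g for g in index if g <= query]
    return [g for g in subsumers if not any(g < other for other in subsumers)]


# Hypothetical index: the empty graph is the root generalization.
attack = frozenset({("Attack", "agent", "Unit")})
attack_on_unit = attack | {("Attack", "object", "Unit")}
move = frozenset({("Move", "agent", "Unit")})
index = [frozenset(), attack, attack_on_unit, move]

query = attack_on_unit | {("Attack", "site", "Hill")}
print(most_specific_subsumers(index, query))
```

The flat scan tests every indexed graph against the query, whereas a hierarchy lets retrieval prune whole subtrees once a generalization fails to subsume the query.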
21Ontology Merging and Translation
- Ontology Merging: merge multiple ontologies built
by different sources.
- Chimaera (McGuinness et al. 2000).
- SMART (Noy and Musen 99).
- Ontology Translation: translates a representation
from one language to another.
- Ontomorph (Chalupsky 2000).
- Goals are different but share some of the same
problems.
22Our Approach
- The goal of this research is to solve the
matching problem.
- We believe existing semantic approaches can be
extended with additional knowledge to
significantly improve matching.
- What kinds of additional knowledge?
- Transformations
- Handle mismatches.
- Improve matching.
- Not taxonomic knowledge.
23Our Approach (cont.)
- Generality and domain-independence.
- Want additional knowledge (e.g. Transformations)
to be useful across domains.
- We believe domain-independence is possible given
a reusable domain-neutral upper ontology.
- Contains a small set of general concepts.
- SMEs use this upper ontology to build KBs on
specialized topics (e.g. chemistry, biology,
battle space planning).
- No training in logic or knowledge representation.
24Illustration of Our Framework
Ontology
Domain-independent KB for the task of matching.
KB can be viewed as a domain-specific matcher
(e.g. match symptoms to diseases).
25Our Prototype
- Extend semantic matchers with transformations.
- Apply transformations in a forward-chaining
manner.
- Use existing techniques for reasoning with
Conceptual Graphs (Corbett et al. 99, Salvat et
al. 96, Willems 95):
- Projection.
- Unification.
- Graph rules.
- Two caveats because existing techniques lead to
promiscuous matches.
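A minimal sketch of applying transformations in a forward-chaining manner, assuming graphs are sets of (head, relation, tail) triples and a transformation is a function from a triple set to the triples it can derive. The helper names `transitivity` and `forward_chain` and the example events are hypothetical, and the final subset test stands in for a real semantic matcher.

```python
def transitivity(relation):
    """Build a rule: r(a, b) and r(b, c) imply r(a, c)."""
    def rule(triples):
        links = [(a, b) for (a, r, b) in triples if r == relation]
        return {(a, relation, c) for (a, b) in links for (b2, c) in links if b == b2}
    return rule


def forward_chain(triples, rules):
    """Apply every rule repeatedly until no new triple is derived."""
    closed = set(triples)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(closed)
        if derived <= closed:
            return closed
        closed |= derived


# Hypothetical COA fragment and pattern.
coa = {("Airstrike", "causes", "Breach"), ("Breach", "causes", "Retreat")}
pattern = {("Airstrike", "causes", "Retreat")}

print(pattern <= coa)                                          # before transformations
print(pattern <= forward_chain(coa, [transitivity("causes")]))  # after transformations
```

The loop terminates because rules only add triples over a fixed set of nodes and relations. Unrestricted rule application can also add triples that create unwanted matches, which is the kind of promiscuity the caveats are meant to guard against.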
26Transformations that Retain Semantics
Projection
27Transformations that Retain Semantics
28Rule Applicability
29Rule Applicability
30Enumerating Transformations
- Transformations derived from our domain-neutral
upper ontology.
- Enumerated all ways that a relation can be
legally used to encode information in a
conceptual graph.
- Considered whether the same information can be
expressed differently.
- Enumeration was possible because:
- Small upper ontology.
- Each concept had well-defined semantics.
31Transformations Enumerated
- We were able to enumerate about 300
transformations.
- Resulting transformations fall into three general
categories:
- Transitivity
- Part Ascension
- Transfers Through
32Transformations Enumerated (cont.)
relation         Transitive  Part Ascension  Transfers Through
causes           X           -               subevent, resulting-state
caused-by        X           subevent-of     resulting-from
defeats          -           -               -
defeated-by      -           subevent-of     caused-by
enables          X           -               causes, resulting-state, subevent
enabled-by       X           subevent-of     caused-by, resulting-from
inhibits         -           subevent-of     resulting-state
inhibited-by     -           subevent-of     caused-by, resulting-from
by-means-of      X           -               -
means-by-which   X           -               -
prevents         -           subevent-of     -
prevented-by     -           subevent-of     caused-by, resulting-from
resulting-state  -           -               causes
resulting-from   -           -               -
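A sketch of how a few rows of the table might be encoded as data. The slides do not define the three categories formally, so this assumes one reading: a relation r "transfers through" r' when r(a, b) and r'(b, c) imply r(a, c). Transitivity is then the case r' = r, and part ascension is treated the same way with a part-like r' such as subevent-of. The names `TABLE`, `carriers`, and `apply_once` and the example events are hypothetical.

```python
# Excerpt of the table:
# relation -> (transitive?, part-ascension relations, transfers-through relations)
TABLE = {
    "causes": (True, [], ["subevent", "resulting-state"]),
    "caused-by": (True, ["subevent-of"], ["resulting-from"]),
    "defeated-by": (False, ["subevent-of"], ["caused-by"]),
    "prevents": (False, ["subevent-of"], []),
}


def carriers(relation):
    """Relations r' for which r(a, b) and r'(b, c) are assumed to give r(a, c)."""
    transitive, ascends, transfers = TABLE[relation]
    return ([relation] if transitive else []) + ascends + transfers


def apply_once(triples):
    """One pass of every applicable rule; returns only the newly derived triples."""
    derived = set()
    for (a, r, b) in triples:
        if r not in TABLE:
            continue
        for (b2, r2, c) in triples:
            if b2 == b and r2 in carriers(r):
                derived.add((a, r, c))
    return derived - set(triples)


# Hypothetical facts.
facts = {
    ("Ambush", "causes", "Battle"),
    ("Battle", "subevent", "Skirmish"),
    ("Battle", "agent", "Brigade"),
}
print(apply_once(facts))
```

Keeping the rules as a table rather than as code means a new relation in the upper ontology only needs a new row.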
33Example: Our Approach
34Example: Our Approach
35Example: Our Approach
36Example: Our Approach
37Initial Evaluation
- Used our matcher in an application in the domain
of battle space planning (DARPA's RKF Project).
- The task is to analyze COAs.
- Battle space ontology built by extending our
upper ontology.
- Two military analysts used this ontology to build
KBs containing:
- Patterns.
- COAs.
- Our matcher matched the patterns to COAs.
38Example Output
39Experiment 1
- Evaluates our first hypothesis.
- How significant is the improvement?
- Compared our matcher to:
- Maximal Common Subgraph (MCS).
- Semantic Search Lite (SSL).
- Methodology:
- 300 domain-neutral transformations plus 80
domain-specific transformations.
- Matched the patterns to the COAs.
- A pattern matches a COA if the match score meets
or exceeds a pre-specified threshold.
- Used metrics of precision and recall.
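A small sketch of the scoring step in the methodology, assuming match scores are kept per (pattern, COA) pair and a gold set lists the pairs that should match. The scores, the gold set, and the threshold are made-up illustrations, not data from the experiment.

```python
def evaluate(scores, relevant, threshold):
    """Precision and recall of the pairs whose score meets the threshold."""
    predicted = {pair for pair, score in scores.items() if score >= threshold}
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical match scores and gold-standard matches.
scores = {
    ("p1", "coa1"): 0.9,
    ("p1", "coa2"): 0.4,
    ("p2", "coa1"): 0.7,
    ("p2", "coa2"): 0.8,
}
relevant = {("p1", "coa1"), ("p1", "coa2"), ("p2", "coa2")}

precision, recall = evaluate(scores, relevant, threshold=0.7)
print(precision, recall)
```

Raising the threshold usually trades recall for precision, so results are easier to compare across matchers when the same threshold or a sweep of thresholds is used.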
40Experiment 1: Precision
41Experiment 1: Recall
42Experiment 2
- Initial evaluation of our second hypothesis.
- Assesses the domain independence of using
transformations.
- Limited: conducted in only one domain, but can
still offer some insight.
- Methodology:
- Divided transformations into 2 groups
(domain-neutral vs. domain-specific).
- Used domain-neutral transformations to construct
DN.
- Used domain-specific transformations to construct
DS.
- Everything else is the same as Experiment 1.
43Experiment 2: Precision
44Experiment 2: Recall
45Proposed Work
- More Comprehensive Evaluation.
- Use background knowledge.
- Incorporate indexing to make matching more
efficient.
46Comprehensive Evaluation
- Evaluate our approach in several applications in
four domains.
- Four data sets:
- Chemistry (Halo).
- Biology (RKF).
- Battle Space Planning (RKF).
- Office Procedures (EPCA).
- Three applications:
- Elaboration: Chemistry and Office Procedures.
- Question Answering: Biology and Battle Space.
- Plan Evaluation: Battle Space and Office
Procedures.
47Background Knowledge
- Background Knowledge.
- Can be used to normalize new knowledge at
acquisition time via a join (Mineau et al. 93).
- Idea can be applied to matching.
- Increase similarity.
- Two problems:
- When should a join be performed?
- How to better control the join?
49Indexing
- Need indexing to make matching more efficient.
- A common technique is a generalization hierarchy:
- Overhead for storage can be expensive.
- Finding the MSS can also be expensive.
- We intend to study:
- How to index knowledge by content?
- Other index structures that are more parsimonious.