Title: Answering complex questions and performing deep reasoning in advanced QA systems: ARDA AQUAINT Program Phase 2
1Answering complex questions and performing deep
reasoning in advanced QA systems: ARDA AQUAINT
Program Phase 2
- Chitta Baral
- Arizona State University
2Participants and other students
- Arizona State University
- PI: Chitta Baral
- Chitta's student participants: Luis Tari, Jicheng
Zhao, Hiro Takahashi, Saadat Anwar, Ryan Weddle
(during summer), Nam Tran, Xin Zhang, Piyun Chang
(during summer)
- Chitta's other students: Le-Chi Tuan
- Other students: Deepthi Chidambaram, Toufeeq
Ahmed
- Texas Tech University
- PI: Michael Gelfond
- Student participants: Marcello Balduccini, Greg
Gelfond
- Monmouth University
- PI: Richard Scherl
3AQUAINT Program goals (from BAA 03-06-FH)
- Seeks proposals for innovative, creative and
high-risk research, which will continue to
advance the state-of-the-art in technologies and
methods for advanced, automated question
answering.
- Phase 1: Research goals and accomplishments
focused on the following functional components
and enabling technologies:
- 1. Question understanding and interpretation
- 2. Determining the answer
- 3. Formulating and presenting the answer
- 4. Cross-cutting/Enabling/Enhancing technologies
that directly and materially support the goals of
the AQUAINT program and one or more of the areas
1-3 listed above.
4AQUAINT Program goals (from BAA 03-06-FH) (cont.)
- Phase 2 Research Goals: In addition to pursuing
the goals identified in Phase 1 of the program,
Phase 2 will encourage efforts which focus on
the following challenges:
- 1. Question answering as part of a larger
information-gathering process (2 implications):
- Increasing complexity of questions
- Synthesis of information found in multiple data
sources
- 2. Accessing, retrieving and integrating diverse
data sources
- 3. Exploring boundaries/combinations of
knowledge-based, statistical and linguistic
approaches to question answering
- 4. Evaluating, validating and presenting an answer
5Some focal points of our project: excerpts from
BAA 03-06-FH
- Increasing Complexity of Questions: In addition
to the more factually based who, what, when,
where type of questions that today's state of the
art Q&A systems tackle, the ultimate, advanced
Q&A systems must be able to successfully
respond to the far more complex why and how types
of questions. These complex questions will likely
involve judgment terms involving intent, motive,
meaning, reason, purpose, aim, objective,
implications, etc., or the questions might require
the advanced Q&A system to compare, contrast,
examine, inspect, match, size up, weigh, etc. two
or more different yet related entities, objects
or positions. And finally, the questions asked of
this ultimate system will at times tend to be
somewhat vague, open-ended and abstract.
6Some focal points of our project as excerpted
from BAA 03-06-FH (cont.)
- The advanced Q&A system needs to recognize when
it cannot find or does not know the answer to
the original question.
- Clearly, systems perform deep reasoning and
complex chains of inference.
- Although the focus of ARDA's AQUAINT program is
to tackle and research unsolved technical
problems, it is important to remember that the
ultimate goal of the program is to develop Q&A
systems which can be made available as automated
tools to the intelligence analyst.
7Focus of our proposal (from our statement of
work)
- Develop various component elements focused on the
following issues:
- Answering increasingly complex questions
- Figuring out whether a particular question can be
answered with the given information and if not,
either giving qualified or tentative answers or
- Developing ways to meaningfully present answers
8Further elaboration of our goals: take QA to
another level, beyond simple querying
- Answer hypothetical queries, narrative queries,
counterfactual queries, planning queries, etc.
- Reasoning with incomplete information, defaults,
normative statements, etc.
- Formulating deep reasoning notions such as when
is a behavior or event abnormal or suspicious,
when a statement is a lie, what is an
explanation, what is a diagnosis, what is a
cause, etc.
9QUERIES
- Prediction, explanation, planning, cause,
counterfactual, etc.
10Queries and Answers
- Answering queries with respect to databases:
various query languages
- Relational databases: SQL3
- Object-oriented databases: OQL
- Web databases, XML databases: XML-QL
- Prolog queries
- Natural language queries
- Often translated to one of the above
- Complex queries!
- Need knowledge beyond what is present in the
given data (or text) to answer.
- Need reasoning mechanisms that cannot be
expressed in standard database query languages or
classical logics.
11Complex Query example: predictive query
- Text/Data: John is at home in Boston and has not
bought a ticket to Paris yet.
- Query
- What happens if John tries to fly to Paris?
- What happens if John buys a ticket to Paris and
then tries to fly to Paris?
- Missing knowledge
- When can one fly?
- What is the result of flying?
12Complex Query example: explanation query
- Text/Data: On Dec 10th John is at home in Boston
and does not have a ticket to Paris yet. On Dec
11th he is in Paris.
- Query
- Explain what might have happened in between.
- He bought a ticket, went to the Boston airport,
and took a flight to Paris.
13Complex Query Example: planning query
- Text/Data: On Dec 10th John is at home in Boston
and does not have a ticket to Paris yet.
- Query: What does John need to do to be in Paris
on Dec 11th?
- He needs to buy the ticket, get to the airport,
and fly to Paris.
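The predictive and planning queries about John can be illustrated with a small state-space sketch. This is a procedural Python analogue, not the AnsProlog encoding used in the project. The action names, their preconditions and their effects are all hypothetical stand-ins for the "missing knowledge" the slides ask about (when one can fly, and what flying results in).

```python
from collections import deque

# Hypothetical action table: name -> (preconditions, facts added, facts removed).
# These conditions are assumptions made for illustration only.
ACTIONS = {
    "buy_ticket": (set(), {"has_ticket"}, set()),
    "go_to_airport": ({"at_home"}, {"at_airport"}, {"at_home"}),
    "fly_to_paris": ({"at_airport", "has_ticket"}, {"in_paris"},
                     {"at_airport", "has_ticket"}),
}


def apply(state, action):
    """Return the successor state, or None if the action is not executable."""
    pre, add, delete = ACTIONS[action]
    if not pre <= state:
        return None
    return frozenset((state - delete) | add)


def predict(state, actions):
    """Predictive query: run the actions in order.

    Returns (state reached, first action that could not be executed or None).
    """
    state = frozenset(state)
    for action in actions:
        successor = apply(state, action)
        if successor is None:
            return state, action
        state = successor
    return state, None


def plan(state, goal):
    """Planning query: breadth-first search for a shortest action sequence."""
    start = frozenset(state)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, path = queue.popleft()
        if goal in current:
            return path
        for action in ACTIONS:
            successor = apply(current, action)
            if successor is not None and successor not in seen:
                seen.add(successor)
                queue.append((successor, path + [action]))
    return None


if __name__ == "__main__":
    print(predict({"at_home"}, ["fly_to_paris"]))
    print(plan({"at_home"}, "in_paris"))
```

Under these assumed conditions, trying to fly straight from home fails at the flying step, and the planner returns the same three steps listed in the planning slide.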
14Complex Query Example: counterfactual query
- Text/Data: On Dec 10th John is at home in Boston.
He made a plan to get to Paris by Dec 11th. He
then bought a ticket. But on his way to the
airport he got stuck in the traffic. He did not
make it to the flight.
- Query: What if John had not gotten stuck in the
traffic?
15Complex Query Example: query about narratives
- Text/Data: John, who always carries his laptop
with him, took a flight from Boston to Paris on
the morning of Dec 11th.
- Queries
- Where is John on the evening of Dec 11th?
- In which city is John's laptop that evening?
16Complex Query Example: causal queries
- Text/Data: On Dec 10th John is at home in Boston.
He made a plan to get to Paris by Dec 11th. He
then bought a ticket. But on his way to the
airport he got stuck in the traffic. He reached
the airport late and his flight had left.
- Queries
- What are the causes of John missing the flight?
17Complex Query Example: unusual behavior
- John flew from Boston to Paris. He did not check
in any luggage in Boston. When he got out of the
plane in CDG he did not have anything in his
hand.
- Was there anything unusual about John's behavior
when he checked in?
- Need information on normal behavior of people who
check in for an international flight
- Normal inertia with respect to hand luggage (from
checking in to getting out of the plane)
18Our approach and progress
19Basic thesis
- The documents on which Q&A is to be based often
do not contain the general knowledge necessary
to answer deep questions.
- This knowledge has to be written for a system to
be able to do deep reasoning.
- Basic questions
- How to write this knowledge (in which language)?
- How to do various kinds of deep reasoning with
this knowledge together with information embedded
in the given documents?
20Starting Point
- Past research in knowledge representation and
reasoning.
- The book on the left.
- Initial article was by Gelfond and Lifschitz.
(Ranked 17th in the most cited list,
http://citeseer.ist.psu.edu/source.html)
- http://citeseer.ist.psu.edu/allcited.html
- Gelfond (268)
- Baral (2757)
- Scherl (4948)
21Post-contract plan of action
- 1. Use the existing knowledge representation
theory and systems to do deep reasoning
- 2. Enhance theory
- 3. Enhance systems
- 4. Prepare for bridging with other projects to
lead to an end-to-end system
22Work in progress and today's agenda
- Morning session
- (1) Progress on using existing theory and systems
to encode common-sense knowledge and use it to
answer difficult queries. (Richard Scherl)
- (3) Adding a GUI to the Smodels reasoning system
(Hiro Takahashi)
- (2,3) CR-Prolog (Marcello Balduccini)
- (2) Enhancing AnsProlog to reason with
probabilities (Chitta Baral)
- Afternoon session
- (4) A simple QA system (Piyun Chang)
- (4) A text extraction system used with respect to
bio-medical texts (Deepthi Chidambaram, Toufeeq
Ahmed; students of my colleague Hasan Davulcu)
- (4) NLP module to translate English questions to
our representation (Richard Scherl)
- (2) Goal Language (Jicheng Zhao), if there is
time
- (2) Modules (Luis Tari), if there is time
- Overview of other work in Chitta's lab.
23NEXT
24Our approach to answer such queries
- Develop various knowledge modules in an
appropriate knowledge representation and
reasoning language.
- Travel module (includes flying, driving)
- Geography Module
- Time module
- Reasoning about actions module
- Planning module
- Explanation module
- Counterfactual module
- Cause finding module
- Most of the above modules with defaults and
exceptions.
25Knowledge Representation and Reasoning
26What properties should an appropriate KR&R
language have?
- Should be non-monotonic, so that the system can
revise its earlier conclusions in light of new
information.
- Should have the ability to represent normative
statements, exceptions, and default statements,
and should be able to reason with them.
- Should be expressive enough to express and answer
problem solving queries such as planning queries,
counterfactual queries, explanation queries and
diagnostic queries.
- Should have a simple and intuitive syntax so that
domain experts (who may be non-computer
scientists) can express knowledge using it.
- Should have enough existing research (or building
block results) about this language so that one
does not have to start from scratch.
- Should have interpreters or implementations of the
language so that one can indeed represent and
reason in this language. (I.e., it should not be
just a theoretical language.)
- Should have existing applications that have been
built on this language, to demonstrate that
applications can indeed be built using this
language.
27AnsProlog: a suitable knowledge representation
language
- AnsProlog: Programming in logic with answer sets
- Language (and semantics) was first introduced in
the paper The Stable Model Semantics For Logic
Programming, Gelfond and Lifschitz (1988), among
the most cited source documents in the CiteSeer
database. http://citeseer.ist.psu.edu/source.html
- Syntax: Set of statements of the form
- A0 or ... or Ak ← B1, ..., Bm, not C1, ...,
not Cn.
- Intuitive meaning of the above statement:
- If B1, ..., Bm are known to be true and C1, ..., Cn
can be assumed to be false then at least one of
A0, ..., Ak must be true.
- It satisfies all the properties mentioned in the
previous slide (and much more)!
- Details in my book Knowledge Representation,
Reasoning and Declarative Problem Solving.
Cambridge University Press, 2003.
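The intuitive reading of a rule ("body known to be true, negated atoms assumed false") is made precise by the stable model semantics of the cited Gelfond and Lifschitz paper. The sketch below checks whether a candidate set of atoms is an answer set of a small propositional program. It is a simplified illustration, not one of the project's systems: it handles only rules with a single head atom (no disjunction), and the atom names `planned`, `occurs` and `blocked` are hypothetical.

```python
def least_model(definite_rules):
    """Least model of a negation-free program, computed by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, positive_body in definite_rules:
            if head not in model and positive_body <= model:
                model.add(head)
                changed = True
    return model


def is_answer_set(rules, candidate):
    """Gelfond-Lifschitz test for single-head rules.

    Each rule is (head, positive_body, negated_body) with set-valued bodies.
    The reduct drops every rule whose negated body intersects the candidate
    and strips the negated atoms from the remaining rules.
    """
    candidate = set(candidate)
    reduct = [(head, pos) for head, pos, neg in rules if not (neg & candidate)]
    return least_model(reduct) == candidate


# Hypothetical default: a planned event occurs unless it is known to be blocked.
RULES = [
    ("planned", set(), set()),
    ("occurs", {"planned"}, {"blocked"}),
]

if __name__ == "__main__":
    print(is_answer_set(RULES, {"planned", "occurs"}))
    # Adding the fact "blocked" withdraws the earlier conclusion "occurs".
    extended = RULES + [("blocked", set(), set())]
    print(is_answer_set(extended, {"planned", "blocked"}))
```

The second call shows the non-monotonic behavior asked for earlier: adding a fact removes a conclusion that held before.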
28AnsProlog vs Prolog
- Differences
- Prolog is sensitive to ordering of rules and
ordering of literals in the body of rules.
- Inappropriate ordering leads to infinite loops.
(Thus Prolog is not declarative, hence not a
knowledge representation language.)
- Prolog stumbles with recursion through negation
- No disjunction in the head (less power)
- Similarities: For certain subclasses of AnsProlog,
Prolog can be thought of as a top-down engine.
29AnsProlog vs other KR&R languages
- AnsProlog has a simple syntax and semantics
- Syntax has structure that allows defining
sub-classes
- More expressive than propositional and
first-order logic; propositional AnsProlog is as
expressive as default logic, yet much simpler.
- Among the various knowledge representation
languages, it has a very large body of support
structure (theoretical results as well as
implementations)
- Description logic comes close. But its focus is
somewhat narrow, mostly to represent and reason
about ontologies.
30Illustration of Complex Query Answering
- John flying to Baghdad to meet Bob example.
31The extracted text and the queries
- Extracted Text
- John spent Dec 10 in Paris and took a plane to
Baghdad the next morning. He was planning to meet
Bob who was waiting for him there.
- Queries
- Q1: Was John in the Middle East in mid-December?
- Q2: If so, did he meet Bob in the Middle East in
mid-December?
32Required background and common-sense knowledge
- Knowledge about geographical objects and their
hierarchy. (M1)
- Baghdad is a city in Iraq. Iraq is a country in
the Middle East region.
- A city in a country in a region is a city in that
region.
- Knowledge about travel events. (M2)
- If someone is in a city then she is in the
country where the city is, and so on.
- Executability conditions and effects of travel
events
- Inertia
- Duration of flying
- Knowledge about time units. (M3)
- Relation between various time granularities
- Knowledge about planned events, meeting events.
(M4)
- People normally follow through their plans
- Executability condition of meeting events
33M1: The geography module
- List of places
- is(baghdad,city).
- is(iraq,country).
- ...
- Relation between places
- in(baghdad, iraq).
- in(iraq,middle_east).
- in(paris,france).
- in(france,western_europe).
- in(western_europe,europe).
- ...
- Transitive closure
- in(P1,P3) ← in(P1,P2), in(P2,P3).
- Completeness assumption about in
- -in(P1,P2) ← not in(P1,P2).
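The two M1 rules (transitive closure of `in`, and the completeness assumption that anything not derivable is false) can be mimicked procedurally. The following is a minimal Python sketch using only the facts listed on this slide. It is an illustration of what the rules compute, not the AnsProlog module itself, and the helper names are hypothetical.

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# The "in" facts from the slide.
IN_FACTS = {
    ("baghdad", "iraq"),
    ("iraq", "middle_east"),
    ("paris", "france"),
    ("france", "western_europe"),
    ("western_europe", "europe"),
}

IN = transitive_closure(IN_FACTS)


def is_in(p1, p2):
    """in(P1,P2): derivable from the facts by transitivity."""
    return (p1, p2) in IN


def is_not_in(p1, p2):
    """-in(P1,P2): completeness assumption, false unless derivable."""
    return (p1, p2) not in IN


if __name__ == "__main__":
    print(is_in("baghdad", "middle_east"))
    print(is_not_in("paris", "middle_east"))
```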
34M2: The traveling module
- Based on theory of dynamic systems
- Views world as a transition diagram
- States are labeled by fluents
- Arcs labeled by actions
- Various types of traveling events
- instance_of(fly,travel).
- instance_of(drive,travel).
- Generic description of John flying to Baghdad
- event(a1).
- type(a1,fly).
- actor(a1,john).
- destination(a1,baghdad).
- Actual event is recorded as
- occurs(a1,i)
35M2: The traveling module (cont.)
- Representation of transition diagram
- State constraints
- loc(P2,X,T) ← loc(P1,X,T), in(P1,P2).
- disjoint(P1,P2) ← -in(P1,P2), -in(P2,P1),
neq(P1,P2).
- -loc(P2,X,T) ← loc(P1,X,T), disjoint(P1,P2).
- Causal laws
- loc(P,X,T+1) ← occurs(E,T),
type(E,travel), actor(E,X),
destination(E,P), -interference(E,T).
- -interference(E,T) ← not interference(E,T).
- Executability conditions
- -occurs(E,T) ← cond(T).
- Inertia rules (frame axioms)
- loc(P,X,T+1) ← loc(P,X,T), not -loc(P,X,T+1).
- -loc(P,X,T+1) ← -loc(P,X,T), not loc(P,X,T+1).
36Reasoning with M1 and M2
- Given
- loc(paris,john,0).
- loc(baghdad,bob,0).
- occurs(a1,0).
- And with M1 and M2 AnsProlog can conclude
- loc(baghdad,john,1), loc(baghdad,bob,1),
- loc(middle_east,john,1), -loc(paris,john,1)
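The conclusions listed on this slide can be reproduced with a small procedural simulation of M1 and M2. The sketch is a simplification made for illustration and is not how an answer set solver works. It tracks only each person's most specific location, carries it forward by inertia unless a travel event changes it, and then re-derives the enclosing and excluded places from the state constraints. All function names are hypothetical.

```python
# Place hierarchy from M1, already closed under transitivity.
IN = {
    ("baghdad", "iraq"), ("iraq", "middle_east"), ("baghdad", "middle_east"),
    ("paris", "france"), ("france", "western_europe"),
    ("western_europe", "europe"), ("paris", "western_europe"),
    ("france", "europe"), ("paris", "europe"),
}
PLACES = {place for pair in IN for place in pair}


def disjoint(p1, p2):
    """Two distinct places, neither of which contains the other."""
    return p1 != p2 and (p1, p2) not in IN and (p2, p1) not in IN


def step(direct, travel_events):
    """One transition: travelers reach their destination, others stay (inertia).

    direct maps each person to a most specific place;
    travel_events is a list of (actor, destination) pairs without interference.
    """
    successor = dict(direct)
    for actor, destination in travel_events:
        successor[actor] = destination
    return successor


def close_state(direct):
    """Apply the state constraints; return (loc, not_loc) sets of (place, person)."""
    loc, not_loc = set(), set()
    for person, place in direct.items():
        loc.add((place, person))
        loc.update((outer, person) for inner, outer in IN if inner == place)
    for place, person in loc:
        not_loc.update((other, person) for other in PLACES if disjoint(place, other))
    return loc, not_loc


if __name__ == "__main__":
    state0 = {"john": "paris", "bob": "baghdad"}
    state1 = step(state0, [("john", "baghdad")])  # occurs(a1,0)
    loc1, not_loc1 = close_state(state1)
    print(("middle_east", "john") in loc1, ("paris", "john") in not_loc1)
```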
37M3: Time and durations
- Duration of actions (additional ones needed for
month etc.)
- time(T+1,day,D) ← occurs(E,T), type(E,fly),
time(T,day,D), not -time(T+1,day,D).
- Basic measuring units
- day(1..31). month(1..12). part(start). part(end).
part(middle).
- Rules translating from one granularity to
another
- time(T,part,middle) ← time(T,day,D), 10 < D < 20.
- time(T,season,summer) ← time(T,month,M), 5 < M < 9.
- Missing elements from the module
- next(date(10,12,03),date(11,12,03)).
- next(date(31,12,03),date(1,1,04)).
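The granularity rules and the `next` relation can be sketched in Python as follows. The bounds (a day strictly between 10 and 20 counts as the middle of the month, a month strictly between 5 and 9 counts as summer) come from the slide. The function names are hypothetical, and reading the two-digit years as 2003 and 2004 is an assumption. The slide does not say where "start" and "end" begin, so only "middle" is modeled.

```python
from datetime import date, timedelta


def is_middle_of_month(day):
    """time(T,part,middle): the day number lies strictly between 10 and 20."""
    return 10 < day < 20


def is_summer(month):
    """time(T,season,summer): the month number lies strictly between 5 and 9."""
    return 5 < month < 9


def next_date(day, month, year):
    """The next(date(D,M,Y), ...) relation, computed with the standard calendar."""
    following = date(year, month, day) + timedelta(days=1)
    return (following.day, following.month, following.year)


if __name__ == "__main__":
    print(is_middle_of_month(11))
    print(next_date(31, 12, 2003))
```

Computing `next` from a calendar library is one way to supply the facts the slide lists as missing, instead of enumerating them by hand.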
38Reasoning with M1, M2 and M3
- Given information about John's flight
- loc(paris,john,0).
- loc(baghdad,bob,0).
- occurs(a1,0).
- time(0,day,11).
- time(0,month,12).
- The query Q1
- ? loc(middle_east,john,T), time(T,month,12),
time(T,part,middle).
- AnsProlog gives the correct answer, yes, with
T = 1.
39M4: Planning to meet and meeting
- Describing the event meet
- event(a2). type(a2,meet).
- actor(a2,john). actor(a2,bob).
- place(a2,baghdad).
- Executability conditions of the meeting event
- -occurs(E,T) ← type(E,meet), actor(E,X),
place(E,P), -loc(P,X,T).
- Planned meeting: planned(a2,1).
- Planned actions and their occurrence: people
normally follow their plans
- occurs(E,T) ← planned(E,T), not -occurs(E,T).
- People persist with their plans until they are
carried out
- planned(E,T+1) ← planned(E,T), -occurs(E,T).
- Second query
- ? occurs(E,T), type(E,meet), actor(E,john),
actor(E,bob), loc(middle_east,john,T),
time(T,month,12), time(T,part,middle).
- Answer: Yes.
40Conclusion
- Answering complex queries needs a lot of knowledge
and reasoning rules that are not present in the
text or data.
- These reasoning rules and knowledge need to be
encoded as modules in an appropriate knowledge
representation and reasoning language.
41Ongoing and Future Work
- Further development of modules (examples)
- Travel duration
- Time period representation issues (e.g. time
zones)
- Dealing with the case when a planned event fails
- Further development of the AnsProlog language
- Not good when dealing with time or similar
features that result in large instantiations.
- Taking advantage of Prolog execution engine when
necessary
- Necessity of set notations, aggregates, etc.
- Steve Maiorano, Jean-Michel Pomarede
- Ryan Weddle, Jicheng Zhao, Saadat Anwar, Luis
Tari (all from ASU), Greg Gelfond (from Texas
Tech University)