Title: Data Integration using Argumentation
1Data Integration using Argumentation
- Shekhar Pradhan
- Marist College
- shekhar.pradhan_at_marist.edu
2Two Integration Perspectives
- Integrating data from different sources to answer
a query which cannot be answered by a single
source.
- E.g., Tsimmis, Hermes, Information Manifold, etc.
- Integrating data, possibly from different
sources, to determine how much confidence should
be placed in an answer to a query.
- E.g., Argumentation databases (ICLP, 03).
- We will call this credibility integration.
3Universe of Information
[Figure: statements p1 through p7, each labeled with its confidence value c1 through c7.]
- Confidence value may depend on:
- the certainty associated with the information by
a source
- perceived reliability of the source on a given
topic
- the probability of the information being true.
4Universe of Information with support and attack
relationships
[Figure: statements p1 through p7 with confidence values c1 through c7, linked by support and attack relationships.]
5Credibility Integration
Credibility integration of information is the
process by which the confidence value associated
with information is adjusted taking into account
all the information that is relevant to this
information through the relationships of support
and attack.
Example: How much confidence should be placed in
the claim that Cat Stevens is a terror threat to
the US, taking into account all the evidence for
and against the claim, all the evidence for
and against the claims that are used to provide
that evidence, and so on?
6Decisions
- What should confidence values (c-values)
represent?
- What sort of relation among information should
count as
- a supporting relation?
- an attacking relation?
- What policies should be adopted for adjusting the
confidence value of p, taking into account
- the information supporting p?
- the information attacking p?
- How should supporting and attacking relations be
represented?
7C-values as probabilities
- Advantage: Probability calculus is
well understood.
- Problems:
- A person may not have the relevant information to
determine probabilities.
- A person may not have a way of determining conditional
probabilities given some relevant information.
- Confidence of x in p should not imply a
confidence of 1-x in not p.
- But probability calculus forces this.
8C-values
- Relative to a user or a community of users of the
information.
- A c-value expresses the degree of confidence that
a person has in the information being true.
- C-values in this sense can be distinguished from
probability values (ask any gambler).
- But rational agents do not have a higher
confidence than warranted by probability.
9Representing confidence values
- A lattice of values.
- E.g., values ranging from No-Confidence to High-Confidence.
- E.g., the set of real numbers between 0.0 and 1.0 with
the usual ordering.
- Must not assume that these numbers are anything
more than an ordered set of labels.
- Question: What constraints should be satisfied by
c-values for them to represent (subjective)
probabilities?
- How should such subjective probabilities be
combined with objective probabilities (i.e.,
reliability of an information source)?
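A minimal sketch of the "ordered set of labels" reading, assuming a hypothetical four-label chain. Only No-Confidence and High-Confidence come from the slide; the middle labels and the function names are illustrative. On a totally ordered chain, the lattice join and meet reduce to picking the higher or lower label, and no arithmetic on the labels is needed.

```python
# Hypothetical chain of confidence labels, listed from lowest to highest.
LABELS = ["no-confidence", "low", "moderate", "high-confidence"]
RANK = {label: i for i, label in enumerate(LABELS)}

def leq(a, b):
    """True if label a is at most as confident as label b."""
    return RANK[a] <= RANK[b]

def join(a, b):
    """Least upper bound: the more confident of the two labels."""
    return a if RANK[a] >= RANK[b] else b

def meet(a, b):
    """Greatest lower bound: the less confident of the two labels."""
    return a if RANK[a] <= RANK[b] else b
```

The same three operations work unchanged if the labels are replaced by reals in [0.0, 1.0] with the usual ordering.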
10Supporting Relations
- What sort of relations should count as supporting
relations?
- Corroboration: Two or more sources making the
same claim independently.
- Evidence: One or more claims providing evidence
for another claim.
- These claims can be made by the same source or
different sources.
- Evidence relations can be cast as arguments.
11Attacking relations
- A claim p can be considered an attack on a
claim q if it diminishes the confidence one has
in q.
- These claims can be made by the same source or
different sources.
- There can be degrees of attacks.
- There can be attack thresholds.
- How confident does one have to be in p for it to
be permitted as an attack on q?
12Representing supporting relations
- Supporting relations can be represented in terms
of rules using the language of annotated logic
programming (V.S. Subrahmanian).
- Corroboration: p:f1(v1, v2) ← p:(v1, S1), p:(v2, S2).
- Evidence: p:f2(v1, v2) ← q:(v1, X), r:(v2, Y).
- Inferences can be made in terms of these rules
and appropriate facts.
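The corroboration and evidence rules can be illustrated with a small sketch. The slides leave the combining functions f1 and f2 open, so the choices below (max for corroboration, min for evidence) are placeholders only, and the fact layout keyed by (statement, source) is an assumption made for this example.

```python
from functools import reduce

def f1(v1, v2):
    """Hypothetical corroboration combiner: keep the stronger report."""
    return max(v1, v2)

def f2(v1, v2):
    """Hypothetical evidence combiner: no stronger than the weakest premise."""
    return min(v1, v2)

def values_for(facts, stmt):
    """All c-values reported for a statement, one per source."""
    return [v for (s, _source), v in facts.items() if s == stmt]

def corroborate(facts, p):
    """Combine the reports of p from two or more sources using f1."""
    values = values_for(facts, p)
    if len(values) < 2:
        return None  # corroboration needs at least two sources
    return reduce(f1, values)

def evidence(facts, q, r):
    """Derive a c-value for a conclusion from premises q and r using f2."""
    vq, vr = values_for(facts, q), values_for(facts, r)
    if not vq or not vr:
        return None  # a premise is missing, so the rule does not fire
    # Assumption: each premise contributes its highest reported value.
    return f2(max(vq), max(vr))

# Annotated facts: (statement, source) -> c-value.
facts = {("p", "S1"): 0.6, ("p", "S2"): 0.8, ("q", "X"): 0.9, ("r", "Y"): 0.5}
```

With these placeholder combiners, `corroborate(facts, "p")` gives 0.8 and `evidence(facts, "q", "r")` gives 0.5. A different f1, for instance one that rewards agreement between sources, can be swapped in without changing the rest.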
13Representing Attack Relations
- A tuple consisting of the attacking statement,
the attacked statement, the threshold of attack
and the degree of attack.
- One way of understanding the degree of attack is
the extent to which it should diminish confidence
in the attacked statement, if the attack is
successful.
- And one way of understanding that is that a
successful attack of degree v puts a cap of v on
the attacked statement.
- Should two independent attacks on the same
statement, considered jointly, be regarded as of a
stronger degree than each taken separately?
- If yes, then we need policies for merging
degrees of attacks.
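The attack tuple and the "cap" reading of the degree of attack can be sketched as follows. The class and function names are hypothetical, and treating the threshold as inclusive (an attack succeeds when the attacker's c-value is at least the threshold) is an assumption, since the slides do not say which comparison is intended.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attack:
    attacker: str     # the attacking statement
    attacked: str     # the attacked statement
    threshold: float  # minimum c-value the attacker needs to succeed
    degree: float     # cap placed on the attacked statement on success

def apply_attack(attack, c_values):
    """Return the attacked statement's c-value after this single attack."""
    current = c_values[attack.attacked]
    # Assumption: the threshold is inclusive.
    if c_values[attack.attacker] >= attack.threshold:
        return min(current, attack.degree)  # successful attack caps the value
    return current  # attack not permitted; value unchanged

attack = Attack(attacker="q", attacked="p", threshold=0.8, degree=0.2)
```

Under this reading, applying two successful attacks one after the other leaves the lower of the two caps. That is only one possible merging policy, and it does not make jointly considered attacks stronger than the stronger of the two.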
14Attacking Arguments
- An attacking statement can be supported by an
argument, A.
- A can be viewed as an argument against the
statement being attacked.
- But A can itself be attacked by another argument,
and so on.
- Thus, there can be an interacting set of
arguments, each of which can potentially change
the c-value of some statement.
- We need a way of computing the effect of all
these interacting arguments on the c-value of
some statement.
15Example
- Claim p: Cat Stevens is a terror threat to the
USA.
- Argument for the claim based on the claims that
he is a convert to Islam and he is a supporter of
Hamas. Confers a c-value of 0.7 on p.
- Attack on p of degree 0.2 by q with threshold
0.8, where q: C.S. is a man of peace.
- Supporting argument (for q) based on the claim
that he is a peace activist and a critic of
terrorism. Confers a c-value of 0.9 on q.
- Attack on q of degree 0.4 by r with threshold
0.7, where r: C.S. supported the fatwa against
Rushdie.
- Assume that the arguments for p, q, and r taken
by themselves establish them with c-values 0.7,
0.9, 0.7, resp. What c-value should p have,
taking into account all the interacting
arguments?
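One way to work through this example is sketched below. It assumes the cap reading of attack degrees, an inclusive threshold, an acyclic attack graph, and that an attacker is judged by its own c-value after the attacks on it have been applied. These are assumptions made for illustration, so the result is not necessarily the value the CALP semantics assigns.

```python
# c-values established by the supporting arguments taken by themselves.
base = {"p": 0.7, "q": 0.9, "r": 0.7}

# Each attack: (attacker, attacked, threshold, degree).
attacks = [("q", "p", 0.8, 0.2), ("r", "q", 0.7, 0.4)]

def final_value(stmt, path=()):
    """c-value of stmt after all attacks on it, resolved recursively."""
    if stmt in path:
        raise ValueError("cyclic attacks are not handled by this sketch")
    value = base[stmt]
    for attacker, attacked, threshold, degree in attacks:
        if attacked != stmt:
            continue
        # The attacker must itself survive the attacks made on it.
        if final_value(attacker, path + (stmt,)) >= threshold:
            value = min(value, degree)  # successful attack caps the value
    return value

results = {s: final_value(s) for s in base}
```

Under these assumptions r keeps 0.7 and meets the threshold 0.7, so q is capped at 0.4. With q at 0.4, its attack on p falls below the threshold 0.8, and p keeps its c-value of 0.7. If r's attack were ignored, q would stay at 0.9 and p would be capped at 0.2.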
16Contested Annotated Logic Program
- Argumentation Databases (ICLP, 03) describes an
operational semantics for theories (CALP)
consisting of facts and rules annotated with
c-values and attack relations (called
contestations).
- Attack relations are compiled into rules.
- A complete lattice of interpretations for a
compiled theory is described.
- A monotonic operator that computes all the
immediate consequences of the compiled theory
relative to an interpretation is specified.
- Any annotated sentence in the least fixed point (lfp) of this
operator is a consequence of the theory, and its
c-value annotation represents the effect of all
the interacting arguments.
- Based on this semantics we give a bottom-up
procedure for computing the effect of interacting
arguments.
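The general shape of such a bottom-up computation can be sketched as a naive least-fixed-point iteration. This is not the procedure from the paper: it covers annotated facts and rules only, with no compiled contestations, and the rule format, the default value 0.0 for underived statements, and the function names are assumptions made for this example.

```python
def immediate_consequences(interp, facts, rules):
    """One application of the operator: derive every head from interp."""
    new = dict(facts)
    for head, combine, body in rules:
        # Statements not yet derived count as 0.0 (assumed bottom value).
        derived = combine([interp.get(b, 0.0) for b in body])
        new[head] = max(new.get(head, 0.0), derived)
    return new

def least_fixpoint(facts, rules, max_rounds=1000):
    """Iterate from the empty interpretation until nothing changes."""
    interp = {}
    for _ in range(max_rounds):
        nxt = immediate_consequences(interp, facts, rules)
        if nxt == interp:
            return interp
        interp = nxt
    raise RuntimeError("no fixed point reached within max_rounds")

# Annotated facts and rules; each rule is (head, combiner, body).
facts = {"a": 0.9, "b": 0.6}
rules = [("c", min, ["a", "b"]), ("d", min, ["c", "a"])]
model = least_fixpoint(facts, rules)
```

Here `model` assigns 0.6 to both c and d. The iteration is guaranteed to stop only when the combiners are monotone and can produce finitely many values, as min and max do; `max_rounds` guards against other cases.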
17 Architecture Argumentation Databases
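[Figure: architecture diagram. Queries go to the Argumentation Manager, which contains the corroboration and evidence rules and the attack relations and returns the answers to queries. The Argumentation Manager sends queries to the Mediator, which sends sub-queries to three Wrappers, one each for DB 1, DB 2, and DB 3.]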
18Argumentation Manager
- Contains the annotated rules and the
contestations (attack relations).
- Provided with the capacity for managing a mini
database, which is initially empty.
- Equipped with backward (top-down) and forward
(bottom-up) inferencing mechanisms.
- Contains knowledge about the schema in the
mediator.
- Communicates with the mediator by sending queries
and receiving answers to queries.
19The Query Answering Process
- The initial query is translated by the AM into
logic programming notation.
- The rules and contestations relevant to
answering the query are determined.
- The contestations are compiled into the rules.
- Using top-down inferencing, the original query
is transformed into a set of queries that can be
sent to the mediator.
- The tuples needed to answer the query from the
underlying DBs are retrieved by sending these
queries to the mediator.
- The initial c-values to be assigned to the
tuples are determined.
- Using the bottom-up inferencing mechanism on the
annotated rules, with the annotated tuples as
facts, the annotated answers to the initial query
are determined.
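The steps above can be sketched end to end on a toy setup. Everything here is hypothetical: the in-memory databases, the per-source reliability table used for initial c-values, the single rule p(X) ← q(X), r(X), and the use of min to combine body values. Contestations are left out to keep the sketch short.

```python
# Hypothetical source databases, keyed by predicate name.
SOURCES = {
    "DB1": {"q": [("a",), ("b",)]},
    "DB2": {"r": [("a",)]},
}
# Assumed initial c-value for tuples from each source.
RELIABILITY = {"DB1": 0.9, "DB2": 0.6}
# One annotated rule: p(X) <- q(X), r(X).
RULES = {"p": ["q", "r"]}

def mediator(predicate):
    """Send the sub-query to every source; return (tuple, source) pairs."""
    return [(row, name)
            for name, db in SOURCES.items()
            for row in db.get(predicate, [])]

def answer(query_predicate):
    """Answer a query with annotated tuples."""
    body = RULES[query_predicate]  # top-down: find the needed sub-queries
    facts = {}
    for pred in body:              # retrieve tuples and assign c-values
        for row, source in mediator(pred):
            key = (pred, row)
            facts[key] = max(facts.get(key, 0.0), RELIABILITY[source])
    answers = {}
    for pred, row in facts:        # bottom-up: apply the rule to the facts
        values = [facts.get((b, row)) for b in body]
        if None not in values:     # every body atom must hold for this row
            answers[row] = min(values)
    return answers

result = answer("p")
```

Here `result` contains only the tuple `("a",)` with c-value 0.6, because q(a) comes from DB1 at 0.9 and r(a) from DB2 at 0.6, while q(b) has no matching r(b).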
20Conclusions
- Proposed a type of integration of information
that consists in modifying the degree of
confidence in some information, taking into
account all the information relevant to that
information.
- Relevance has been understood in terms of
relations of support and attack.
- Briefly described our work on contested annotated
logic programs and argumentation databases.