Title: Horn clauses
1Horn clauses
2Intro
- A Horn clause is a disjunction of literals of
which at most one is positive. E.g.
- ¬lawyer(x), rich(x)
- Every Horn clause can be written as an
implication whose premise is a conjunction of
positive literals and whose conclusion is at most
one positive literal. E.g.
- lawyer(x) ⇒ rich(x)
- Horn clauses with exactly one positive literal
are called definite clauses. E.g. the above.
- A definite clause with no negative literals
simply asserts a given proposition and is sometimes
called a fact. E.g.
- cat(tuna)
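The definitions above can be checked mechanically. The following is a minimal sketch in Python, assuming a clause is given as a list of literal strings in which a leading "~" marks a negative literal; the function name classify and this encoding are illustrative choices, not part of the slides.

```python
def classify(clause):
    """Classify a clause given as a list of literal strings.

    Assumption: a literal starting with "~" is negative, anything else is positive.
    """
    positives = [lit for lit in clause if not lit.startswith("~")]
    negatives = [lit for lit in clause if lit.startswith("~")]
    if len(positives) > 1:
        return "not Horn"
    if len(positives) == 1:
        # Exactly one positive literal: definite clause; a fact if nothing is negated.
        return "fact (definite clause)" if not negatives else "definite clause"
    return "Horn clause with no positive literal"


print(classify(["~lawyer(x)", "rich(x)"]))      # definite clause
print(classify(["cat(tuna)"]))                  # fact (definite clause)
print(classify(["~house(john,y)", "~work(y)"])) # Horn clause with no positive literal
print(classify(["rich(x)", "poor(x)"]))         # not Horn
```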
3Lawyers Axiomatization
- lawyer(john)
- ∀x lawyer(x) ⇒ rich(x)
- ∀x rich(x) ⇒ ∃y house(x,y)
- ∀x,y rich(x) ∧ house(x,y) ⇒ big(y)
- ∀x,y ( house(x,y) ∧ big(y) ⇒ work(y) )
- Conclusion we want to show: John has at least
one house that needs a lot of work. I.e.
- ∃y house(john,y) ∧ work(y)
- We will negate it and add it to the database.
- ¬(∃y house(john,y) ∧ work(y))
4Example (concluded)
- Here are all the clauses we got through INSEADO
from the premises and the negated conclusion.
- All of them are Horn clauses (and all but the last,
the negated conclusion, are definite clauses).
- lawyer(john)
- ¬lawyer(x1), rich(x1)
- ¬rich(x2), house(x2,houseof(x2))
- ¬rich(x3), ¬house(x3,y1), big(y1)
- ¬house(x4,y2), ¬big(y2), work(y2)
- ¬house(john,y3), ¬work(y3)
- Note: We rename the variables so that variable
names do not clash between clauses.
5Resolution by two finger method
- 1. lawyer(john)
- 2. ¬lawyer(x1), rich(x1)
- 3. ¬rich(x2), house(x2,houseof(x2))
- 4. ¬rich(x3), ¬house(x3,y1), big(y1)
- 5. ¬house(x4,y2), ¬big(y2), work(y2)
- 6. ¬house(john,y3), ¬work(y3)
- 7. rich(john) 1,2 mgu x1/john
- 8. ¬lawyer(x2), house(x2,houseof(x2)) 2,3 mgu x1/x2
- 9. ¬lawyer(x3), ¬house(x3,y1), big(y1) 2,4
- 10. ¬rich(x3), big(houseof(x3)) 3,4
- 11. ¬rich(x2), ¬big(houseof(x2)), work(houseof(x2)) 3,5
- 12. (resolvent not shown) 4,5
- 13. (resolvent not shown) 3,6
- 14. (resolvent not shown) 5,6
- You can continue with the two-finger method, but
it is too long.
6Ordered Resolution Strategy
Ordered resolution strategy: each clause is
treated as an ordered set. Resolution is
permitted only on the first literal of each
clause. Intuition: to derive the empty clause, every
literal must be eliminated. So, work on the first
literal until it is gone before starting to work
on the other literals. The literals in the conclusion
preserve the order from the parent clauses, with
literals from the positive parent followed by the
literals from the negative parent (i.e., the one
with the negated atom).
- 1. lawyer(john)
- 2. ¬lawyer(x1), rich(x1)
- 3. ¬rich(x2), house(x2,houseof(x2))
- 4. ¬rich(x3), ¬house(x3,y1), big(y1)
- 5. ¬house(x4,y2), ¬big(y2), work(y2)
- 6. ¬house(john,y3), ¬work(y3)
- 7. rich(john) 1,2
- 8. house(john,houseof(john)) 3,7
- 9. ¬house(john,y1), big(y1) 4,7
- 10. ¬big(houseof(john)), work(houseof(john)) 5,8
- 11. ¬work(houseof(john)) 6,8
- 12. big(houseof(john)) 8,9
- 13. work(houseof(john)) 10,12
- 14. {} 11,13
Ordered resolution is refutation complete for Horn
clauses, but not in general.
7Directed Resolution
- A directed clause is a Horn clause in which the
positive literal occurs either at the beginning
or the end of the clause.
- When we order the clause to have the positive
literal at the end, the clause is called a forward
clause.
- When we order the clause to have the positive
literal at the beginning, the clause is called a
backward clause.
- We can use a bit of syntactic sugar.
- Write forward clauses using ⇒
- Write backward clauses using ⇐
- E.g.
- ¬φ1,…, ¬φn, ψ can be written as φ1,…, φn ⇒ ψ
(forward)
- ψ, ¬φ1,…, ¬φn can be written as ψ ⇐ φ1,…, φn
(backward)
- ¬φ1,…, ¬φn can be written as φ1,…, φn ⇒
(forward)
- ¬φ1,…, ¬φn can be written as ⇐ φ1,…, φn
(backward)
8Positioning the positive literal
- The possibility of controlling the direction of
resolution by positioning the positive literal at
one or the other end of a clause raises the
question of which direction is more efficient.
- Example.
- insect(x) ⇒ animal(x)
- mammal(x) ⇒ animal(x)
- ant(x) ⇒ insect(x)
- bee(x) ⇒ insect(x)
- spider(x) ⇒ insect(x)
- lion(x) ⇒ mammal(x)
- tiger(x) ⇒ mammal(x)
- zebra(x) ⇒ mammal(x)
- Assuming that Zeke is a zebra, is Zeke an animal?
I.e. is the following entailed?
- zebra(zeke) ⇒ animal(zeke)
- Negated goal: zebra(zeke) ∧ ¬animal(zeke), which
gives the clauses zebra(zeke) and
¬animal(zeke).
9Let's try considering them forward
- 1. ¬insect(x), animal(x)
- 2. ¬mammal(x), animal(x)
- 3. ¬ant(x), insect(x)
- 4. ¬bee(x), insect(x)
- 5. ¬spider(x), insect(x)
- 6. ¬lion(x), mammal(x)
- 7. ¬tiger(x), mammal(x)
- 8. ¬zebra(x), mammal(x)
- 9. zebra(zeke)
- 10. ¬animal(zeke)
- 11. mammal(zeke) 8,9
- 12. animal(zeke) 2,11
- 13. {} 10,12
We use ordered resolution on the clauses above. Only
three steps are needed!!
10Let's try considering them backward
- 1. animal(x), ¬insect(x)
- 2. animal(x), ¬mammal(x)
- 3. insect(x), ¬ant(x)
- 4. insect(x), ¬bee(x)
- 5. insect(x), ¬spider(x)
- 6. mammal(x), ¬lion(x)
- 7. mammal(x), ¬tiger(x)
- 8. mammal(x), ¬zebra(x)
- 9. zebra(zeke)
- 10. ¬animal(zeke)
- 11. ¬insect(zeke) 1,10
- 12. ¬mammal(zeke) 2,10
- 13. ¬ant(zeke) 3,11
- 14. ¬bee(zeke) 4,11
- 15. ¬spider(zeke) 5,11
- 16. ¬lion(zeke) 6,12
- 17. ¬tiger(zeke) 7,12
- 18. ¬zebra(zeke) 8,12
- 19. {} 9,18
- Now we did 9 steps!!
- So, should we conclude that if we order clauses
forward, then resolution will be more
efficient?
- No! Look at the next example.
11Another example
- Consider the following database of information
about zebras.
- Zebras are mammals, striped, and medium in size.
Mammals are animals and warm-blooded. Striped
things are nonsolid and nonspotted. Things of
medium size are neither small nor large.
- zebra(x) ⇒ mammal(x)
- zebra(x) ⇒ striped(x)
- zebra(x) ⇒ medium(x)
- mammal(x) ⇒ animal(x)
- mammal(x) ⇒ warm(x)
- striped(x) ⇒ nonsolid(x)
- striped(x) ⇒ nonspotted(x)
- medium(x) ⇒ nonsmall(x)
- medium(x) ⇒ nonlarge(x)
- Assuming that Zeke is a zebra, is Zeke nonlarge?
I.e. is the following entailed?
- zebra(zeke) ⇒ nonlarge(zeke)
- Negated goal: the clauses zebra(zeke) and
¬nonlarge(zeke).
12Let's try backward resolution
- 1. mammal(x), ¬zebra(x)
- 2. striped(x), ¬zebra(x)
- 3. medium(x), ¬zebra(x)
- 4. animal(x), ¬mammal(x)
- 5. warm(x), ¬mammal(x)
- 6. nonsolid(x), ¬striped(x)
- 7. nonspotted(x), ¬striped(x)
- 8. nonsmall(x), ¬medium(x)
- 9. nonlarge(x), ¬medium(x)
- 10. zebra(zeke)
- 11. ¬nonlarge(zeke)
- 12. ¬medium(zeke) 9,11
- 13. ¬zebra(zeke) 3,12
- 14. {} 10,13
So, backward resolution needs only three
steps!! What about forward resolution?
13Let's try forward resolution
- 1. ¬zebra(x), mammal(x)
- 2. ¬zebra(x), striped(x)
- 3. ¬zebra(x), medium(x)
- 4. ¬mammal(x), animal(x)
- 5. ¬mammal(x), warm(x)
- 6. ¬striped(x), nonsolid(x)
- 7. ¬striped(x), nonspotted(x)
- 8. ¬medium(x), nonsmall(x)
- 9. ¬medium(x), nonlarge(x)
- 10. zebra(zeke)
- 11. ¬nonlarge(zeke)
- 12. mammal(zeke) 1,10
- 13. striped(zeke) 2,10
- 14. medium(zeke) 3,10
- 15. animal(zeke) 4,12
- 16. warm(zeke) 5,12
- 17. nonsolid(zeke) 6,13
- 18. nonspotted(zeke) 7,13
- 19. nonsmall(zeke) 8,14
- 20. nonlarge(zeke) 9,14
- 21. {} 11,20
- Forward resolution needs 10 steps!!
14Forward vs. backward resolution
- The fact is that forward resolution is best for
some clause sets, and backward resolution is best
for others.
- To determine which is best for which, we need
to look at the branching factor of the clauses.
- E.g. the search space branches backward in the
animal example and forward in the zebra example.
- Consequently, we should use forward resolution
in the animal example and backward resolution in
the zebra example, as the step counts above show.
- Of course, things aren't always this simple.
Sometimes it's best to use some clauses in the
forward direction and others in the backward
direction.
- Deciding which clauses to use in which direction
is NP-complete.
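The step counts for the two examples can be reproduced with a small sketch of ordered resolution. To avoid unification it works on ground instances only, assuming every x has already been replaced by zeke, so zebra stands for zebra(zeke). The names ordered_refutation, directed and steps, and the "~" encoding of negation, are illustrative assumptions rather than anything from the slides.

```python
def ordered_refutation(clauses):
    """Ordered resolution on ground clauses (tuples of literal strings).

    Only first literals are resolved upon; the resolvent keeps the rest of the
    positive parent followed by the rest of the negative parent. Returns the
    resolvents generated up to and including the empty clause.
    """
    known = [tuple(c) for c in clauses]
    derived = []
    changed = True
    while changed:
        changed = False
        for pos in list(known):
            for neg in list(known):
                if not pos or not neg or pos[0].startswith("~"):
                    continue
                if neg[0] == "~" + pos[0]:
                    resolvent = pos[1:] + neg[1:]
                    if resolvent not in known:
                        known.append(resolvent)
                        derived.append(resolvent)
                        changed = True
                        if resolvent == ():
                            return derived
    return derived


def directed(rules, forward):
    """Turn (premise, conclusion) pairs into forward or backward clauses."""
    if forward:
        return [("~" + p, c) for p, c in rules]   # positive literal last
    return [(c, "~" + p) for p, c in rules]       # positive literal first


ANIMAL = [("insect", "animal"), ("mammal", "animal"), ("ant", "insect"),
          ("bee", "insect"), ("spider", "insect"), ("lion", "mammal"),
          ("tiger", "mammal"), ("zebra", "mammal")]
ZEBRA = [("zebra", "mammal"), ("zebra", "striped"), ("zebra", "medium"),
         ("mammal", "animal"), ("mammal", "warm"), ("striped", "nonsolid"),
         ("striped", "nonspotted"), ("medium", "nonsmall"), ("medium", "nonlarge")]


def steps(rules, forward, goal):
    # Add the fact zebra(zeke) and the negated goal, then count resolvents.
    clauses = directed(rules, forward) + [("zebra",), ("~" + goal,)]
    return len(ordered_refutation(clauses))


print(steps(ANIMAL, True, "animal"), steps(ANIMAL, False, "animal"))      # 3 9
print(steps(ZEBRA, True, "nonlarge"), steps(ZEBRA, False, "nonlarge"))    # 10 3
```

Forward is cheaper on the animal clauses and backward is cheaper on the zebra clauses, because in each case the cheaper direction is the one in which the clause set does not fan out.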
15Ordered resolution for query optimization
16Databases and queries
- We will use ordered resolution for
fill-in-the-blank queries.
- The query is posed as a conjunction of positive
literals, containing some number of variables.
- The database consists entirely of positive ground
literals.
- The task is to find bindings for the variables.
- Consider the following DB
- parent(art,john)   carpenter(ann)   senator(john)
- parent(ann,john)   carpenter(cap)   senator(kim)
- parent(bob,kim)
- parent(bea,kim)
- parent(cap,lem)
- parent(coe,lem)
Query: ∃x,y. parent(x,y) ∧ carpenter(x) ∧ senator(y)
SQL: SELECT parent.x, parent.y
FROM parent, carpenter, senator
WHERE parent.x = carpenter.x AND
parent.y = senator.y
17Let's use ordered resolution
- First let's negate the query and write it in clausal
form
- ¬(parent(x,y) ∧ carpenter(x) ∧ senator(y))
- ¬parent(x,y), ¬carpenter(x), ¬senator(y)
- In order to record the binding of variables we
will add an answer literal ans(x,y), which should
always stay at the end (don't confuse this with
forward vs. backward clauses: ans(x,y) is an
artificial literal that will never be resolved upon).
- ¬parent(x,y), ¬carpenter(x), ¬senator(y),
ans(x,y)
- Now let's use ordered resolution to derive an
answer, using the DB shown previously.
18Let's use ordered resolution
Database: parent(art,john) parent(ann,john) parent(bob,kim)
parent(bea,kim) parent(cap,lem) parent(coe,lem)
carpenter(ann) carpenter(cap) senator(john) senator(kim)
- ¬parent(x,y), ¬carpenter(x), ¬senator(y), ans(x,y)
- ¬carpenter(art), ¬senator(john), ans(art,john)
- ¬carpenter(ann), ¬senator(john), ans(ann,john)
- ¬carpenter(bob), ¬senator(kim), ans(bob,kim)
- ¬carpenter(bea), ¬senator(kim), ans(bea,kim)
- ¬carpenter(cap), ¬senator(lem), ans(cap,lem)
- ¬carpenter(coe), ¬senator(lem), ans(coe,lem)
- ¬senator(john), ans(ann,john)
- ¬senator(lem), ans(cap,lem)
- ans(ann,john)
Well, from the standpoint of correctness it doesn't
matter if we change the order of literals and
(re)write the query as
- ¬senator(y), ¬carpenter(x), ¬parent(x,y), ans(x,y)
19Efficiency
- From the standpoint of efficiency, one of the key
questions is the order of the literals in the
query. Suppose we do
Database: parent(art,john) parent(ann,john) parent(bob,kim)
parent(bea,kim) parent(cap,lem) parent(coe,lem)
carpenter(ann) carpenter(cap) senator(john) senator(kim)
- ¬senator(y), ¬carpenter(x), ¬parent(x,y), ans(x,y)
- ¬carpenter(x), ¬parent(x,john), ans(x,john)
- ¬carpenter(x), ¬parent(x,kim), ans(x,kim)
- ¬parent(ann,john), ans(ann,john)
- ¬parent(cap,john), ans(cap,john)
- ans(ann,john)
So, 5 steps
instead of 9 previously.
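The effect of literal order can be tried out with a small sketch. It resolves the query literals strictly left to right against the ground facts and carries the bindings along in place of the ans literal. The function ordered_query and the tuple encoding of facts are illustrative assumptions, and every query argument is assumed to be a variable. The sketch looks for all answers, so it counts every resolvent it generates.

```python
def ordered_query(literals, facts, answer_vars):
    """Resolve the query literals left to right against ground facts.

    literals: list of (predicate, variable names); facts: list of
    (predicate, constants). Returns (answers, number of resolvents generated).
    """
    frontier = [{}]          # one binding per clause currently being worked on
    steps = 0
    for pred, args in literals:
        next_frontier = []
        for binding in frontier:
            for fact_pred, fact_args in facts:
                if fact_pred != pred or len(fact_args) != len(args):
                    continue
                extended = dict(binding)
                # Bind each variable, or check it against its earlier binding.
                if all(extended.setdefault(var, val) == val
                       for var, val in zip(args, fact_args)):
                    next_frontier.append(extended)
                    steps += 1
        frontier = next_frontier
    answers = [tuple(b[v] for v in answer_vars) for b in frontier]
    return answers, steps


DB = [("parent", ("art", "john")), ("parent", ("ann", "john")),
      ("parent", ("bob", "kim")), ("parent", ("bea", "kim")),
      ("parent", ("cap", "lem")), ("parent", ("coe", "lem")),
      ("carpenter", ("ann",)), ("carpenter", ("cap",)),
      ("senator", ("john",)), ("senator", ("kim",))]

parent_first = [("parent", ("x", "y")), ("carpenter", ("x",)), ("senator", ("y",))]
senator_first = [("senator", ("y",)), ("carpenter", ("x",)), ("parent", ("x", "y"))]

print(ordered_query(parent_first, DB, ("x", "y")))    # ([('ann', 'john')], 9)
print(ordered_query(senator_first, DB, ("x", "y")))   # ([('ann', 'john')], 7)
```

Both orders give the same answer. The count for the reordered query is 7 rather than 5 because the sketch also expands the senator(kim) branch instead of stopping at the first answer.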
20Efficiency (continued)
- Previously, we gained 4 steps. Well, 4 is not so
impressive!
- But let's consider a real census database with
the following properties.
- There are 100 senators.
- There are 100,000 carpenters.
- There are 10,000,000 parent-child pairs.
- If we use
- ¬parent(x,y), ¬carpenter(x), ¬senator(y),
ans(x,y)
- we will have a search space of more than 10,000,000
possibilities.
- However, if we use
- ¬senator(y), ¬parent(x,y), ¬carpenter(x),
ans(x,y)
- the search space will be at most 100 × 2 = 200, because
there are only 100 senators and only 2 parents
for each senator.
21Heuristic?!
- Heuristic: Cheapest literal first!
- Unfortunately, the rule doesn't always produce
the optimal ordering.
- E.g. consider that instead of parent(x,y) we
have represents(y,x).
- And, for x a variable and a and b constants, suppose
that
- represents(a,x) has 10,000 solutions and
represents(x,b) has 1 solution.
- Now, the heuristic suggests that we should order
as
- ¬senator(y), ¬represents(y,x), ¬carpenter(x),
ans(x,y)
- After y is bound there are 10,000
possibilities for represents(). In total, we
have 100 × 10,000 = 1,000,000 possibilities to
check.
- The problem is that there is a better ordering
- ¬carpenter(x), ¬represents(y,x), ¬senator(y),
ans(x,y)
- because after x is bound there is only 1
possibility for represents(). In total, we have
100,000 × 1 = 100,000 possibilities to check.
- One way of guaranteeing the optimal ordering is
to exhaustively search through all possible
orderings (very expensive). Better ways exist.
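The two estimates are just products of the number of matches for the first literal and the number of matches for represents() once its bound argument is fixed. A minimal sketch of that arithmetic, using the figures from the slide (the constant and function names are illustrative):

```python
SENATORS = 100
CARPENTERS = 100_000
PEOPLE_PER_SENATOR = 10_000   # solutions of represents(a, x) for a fixed senator a
SENATORS_PER_PERSON = 1       # solutions of represents(x, b) for a fixed person b


def possibilities(first_literal_matches, matches_per_binding):
    """Bindings to check after resolving the first two literals of the query."""
    return first_literal_matches * matches_per_binding


senator_first = possibilities(SENATORS, PEOPLE_PER_SENATOR)
carpenter_first = possibilities(CARPENTERS, SENATORS_PER_PERSON)

print(senator_first)                      # 1000000
print(carpenter_first)                    # 100000
print(senator_first // carpenter_first)   # 10
```

Starting with the cheapest literal (senator) is ten times worse here, because what matters is the product along the whole ordering and not the size of the first relation alone.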
22Prolog
23Backward chaining for definite clauses
- zebra(zeke)
- lion(bob)
- tiger(quincy)
- spider(sp)
- animal(x), ¬mammal(x)
- animal(y), ¬insect(y)
- insect(z), ¬spider(z)
- mammal(w), ¬lion(w)
- mammal(v), ¬tiger(v)
- mammal(o), ¬zebra(o)
- ¬animal(zeke)
- ¬mammal(zeke)
- ¬lion(zeke)
- ¬tiger(zeke)
- ¬zebra(zeke)
- {}
- Prolog explores the goals recursively, in a depth-first fashion.
- This is called backward chaining.
24Binding variables
- zebra(zeke)
- lion(bob)
- tiger(quincy)
- spider(sp)
- animal(x), ¬mammal(x)
- animal(y), ¬insect(y)
- insect(z), ¬spider(z)
- mammal(w), ¬lion(w)
- mammal(v), ¬tiger(v)
- mammal(o), ¬zebra(o)
- ¬animal(t)
- ¬mammal(x) subst t/x
- ¬lion(w) subst x/w
- {} subst w/bob, i.e. t/bob
- ¬tiger(v) subst x/v
- {} subst v/quincy, i.e. t/quincy
25Prolog
- Prolog uses backward chaining as its inference
procedure.
- Very fast!!
- Uses the implication (with symbol :-) for the
clauses. Commas mean "and".
- Constants start with a lower-case letter while
variables start with an upper-case letter.
- Variables are implicitly universally quantified.
- All sentences end with a period.
- zebra(zeke).
- lion(bob).
- tiger(quincy).
- spider(sp).
- animal(X) :- mammal(X).
- animal(Y) :- insect(Y).
- insect(Z) :- spider(Z).
- mammal(W) :- lion(W).
- mammal(V) :- tiger(V).
- mammal(O) :- zebra(O).
26SWI-Prolog
- Create a file zebra.pl
- Double click on it to open SWI-Prolog, which also
loads the file.
- Then, you can ask queries.
- E.g. (type ; to see more bindings)
- ?- animal(X).
- X = bob
- X = quincy
- X = zeke
- X = sp
- Yes
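The order of these answers follows from depth-first backward chaining over the clauses in the order they are written. The sketch below imitates that search in Python for this database only: it assumes every predicate is unary and every rule has a single body goal, so no general unification is needed. The names FACTS, RULES and solve are illustrative, and this is not how Prolog itself is implemented.

```python
FACTS = [("zebra", "zeke"), ("lion", "bob"), ("tiger", "quincy"), ("spider", "sp")]

# (head, body) stands for head(X) :- body(X), listed in database order.
RULES = [("animal", "mammal"), ("animal", "insect"), ("insect", "spider"),
         ("mammal", "lion"), ("mammal", "tiger"), ("mammal", "zebra")]


def solve(pred, arg=None):
    """Yield every constant c for which pred(c) can be proved.

    arg=None plays the role of an unbound variable; otherwise only arg is tried.
    Facts and rules are tried top to bottom, depth first.
    """
    for fact_pred, const in FACTS:
        if fact_pred == pred and arg in (None, const):
            yield const
    for head, body in RULES:
        if head == pred:
            # Replace the goal by the rule body and keep searching.
            yield from solve(body, arg)


print(list(solve("animal")))           # ['bob', 'quincy', 'zeke', 'sp']
print(list(solve("animal", "zeke")))   # ['zeke']
print(list(solve("insect", "zeke")))   # []
```

Because solve is a generator, asking for the next answer resumes the search where it stopped, which is roughly what typing a semicolon does at the Prolog prompt.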
27Prolog Search Trees
- The simplest cases are ground queries, those that
do not contain variables, e.g.
- (1) book(principia)?
- (2) wrote(gottlob,begriffsschrift)?
- Prolog goes through the data base, starting at
the top, trying to match the query with a fact
listed there (in order to apply resolution).
- If a matching fact is found, Prolog replies Yes,
otherwise No.
- If the ground query is complex (or conjunctive),
like (3), Prolog tries to match one goal after
the other, going from left to right.
- (3) book(principia),wrote(gottlob,begriffsschrift)?
- When a query contains variables, Prolog also records
the variable binding under which a given goal matches a
certain fact listed in the data base.
- wrote(terry,shrdlu).
- wrote(bill,lunar).
- wrote(roger,sam).
- wrote(gottlob,begriffsschrift).
- wrote(bertrand,principia).
- wrote(alfred,principia).
- book(begriffsschrift).
- book(principia).
- program(lunar).
- program(sam).
- program(shrdlu).
28This tree is to be read from top to bottom and
from left to right as follows. When Prolog is
presented with the conjunctive query
book(What),wrote(bertrand, What), it first
finds a clause in the data base matching
book(What), namely under the assignment
What = begriffsschrift. The first goal, which is
now book(begriffsschrift), is therefore taken
care of, as indicated by crossing out
book(begriffsschrift). Also, the assignment
What = begriffsschrift has turned the second goal
into the ground goal wrote(bertrand,
begriffsschrift). But since this goal does not
appear in the data base, it cannot be crossed
out, and so Prolog does not report the variable
assignment What = begriffsschrift, as this answer
is not correct according to the data base.
29So Prolog backtracks to the initial query in the
topmost box, trying to find an alternative match
for book(What) in the data base. Indeed, Prolog
finds another clause that matches, this time
under the assignment What = principia. The first
goal is again crossed out. Also, the assignment
What = principia has turned the second goal into
the ground goal wrote(bertrand,principia). This
goal does appear in the data base, so it can be
crossed out, too. Therefore, Prolog reports the
variable assignment What = principia.
30Prolog may backtrack to find additional answers
to a given query provided the user types a
semicolon. If every goal in a leaf box is
crossed out, then Prolog has found an answer.
The answer consists of all the assignments that
appear on branches going from the leaf to the top.
31Goal order
- The order of goals can have an effect on the size
of the search tree.
- Suppose, for example, we reverse the order of
goals in the previous example.
- In that case, we get a somewhat smaller search
tree.
32Recursion
- parent(katherine,bertrand).
- parent(bertrand,kate).
- parent(kate,sue).
- ancestor(Old,Young) :- parent(Old,Young).
- ancestor(Old,Young) :- parent(Old,Middle),
ancestor(Middle,Young).
- The second ancestor rule exhibits recursion.
34Variable renaming (standardizing variables
apart)
- The previous search tree reveals another feature
of Prolog's processing of queries, namely
renaming of variables.
- The second time Prolog uses the recursive
ancestor rule, it must change the name of the
variable Middle occurring in that rule. Without
doing so, Prolog would end up recording that
Middle = bertrand and Middle = kate, which is a plain
contradiction!
- Identical variables within a clause stand for the
same objects.
- But variables in different clauses, even if they
look the same, stand for possibly different
objects.
- Moreover, identical variables in different
applications of the same recursive rule may stand
for different objects.
- So the Middle variable in the recursive ancestor
rule stands for bertrand in the first application
and for kate in the second. Variable renaming
ensures that this is possible.
35Rule ordering (again)
- ancestor(Old,Young) :- parent(Old,Middle),
ancestor(Middle,Young).
- ancestor(Old,Young) :- parent(Old,Young).
- In this example, reversing rule order has no effect on the
answers Prolog gives (except possibly for the
order in which multiple answers are reported).
- However, reversing rule order has an effect on
search trees.
- First, reversing rule order reverses the
left-to-right ordering of the boxes in which the
goals appear.
- We get the mirror search tree.
36Clause ordering
- The order of the goals in the body of a rule is of
greater importance than the order of the rules
themselves.
- Let's reverse the order of the goals in the body of
the recursive rule.
- ancestor(Old,Young) :- ancestor(Middle,Young),
parent(Old,Middle).
- ancestor(Old,Young) :- parent(Old,Young).
- Prolog won't terminate!!
37Prolog as a theorem prover
- Prolog can prove some theorems by luck.
- E.g. suppose we load this database
- link(a,b).
- link(b,c).
- path(X,Y) :- link(X,Y).
- path(X,Y) :- path(X,Z), path(Z,Y).
- And the query is
- path(a,c).
- We will have
- link(a,b)
- link(b,c)
- path(X,Y), ¬link(X,Y)
- path(X,Y), ¬path(X,Z), ¬path(Z,Y)
- ¬path(a,c)
- ¬link(a,c) Stuck! Try next clause.
- ¬path(a,Z), ¬path(Z,c)
- ¬link(a,Y) subst X/a, Z/b
- ¬link(Z,c) subst X/a, Z/b, Y/c
- Ok, we proved it!
38Prolog as a theorem prover
- Prolog is not complete as a theorem prover.
- Suppose we write the previous database by
changing the order of clauses.
- link(a,b).
- link(b,c).
- path(X,Y) :- path(X,Z), path(Z,Y).
- path(X,Y) :- link(X,Y).
- And the query is the same
- path(a,c).
- Now, we will have
- link(a,b)
- link(b,c)
- path(X,Y), ¬path(X,Z), ¬path(Z,Y)
- path(X,Y), ¬link(X,Y)
- ¬path(a,c)
- ¬path(a,Z), ¬path(Z,c)
- ¬path(a,Z1), ¬path(Z1,Z), ¬path(Z,c)
- ¬path(a,Z2), ¬path(Z2,Z1), ¬path(Z1,Z), ¬path(Z,c)
- ... and so on forever, because the recursive clause is always tried first.
- So, by just a simple rearrangement of clauses
we can't prove it anymore.
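The difference between the two clause orders can be reproduced with a small depth-first prover. The following Python sketch tries clauses top to bottom and goals left to right, renaming variables at every clause use. The depth bound is an artificial assumption added so that the demonstration terminates: Prolog has no such bound and would simply keep running. All names here (solve, proves, DepthExceeded, the tuple encoding of terms) are illustrative.

```python
import itertools


class DepthExceeded(Exception):
    """Raised when the artificial depth bound is reached."""


def is_var(term):
    # Prolog convention: variables start with an upper-case letter.
    return isinstance(term, str) and term[:1].isupper()


def walk(term, subst):
    # Follow bindings until an unbound variable or a non-variable is reached.
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst):
    """Return a substitution extending subst that unifies a and b, or None."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def rename(term, n):
    # Standardize variables apart by tagging them with a fresh number.
    if is_var(term):
        return f"{term}_{n}"
    if isinstance(term, tuple):
        return tuple(rename(t, n) for t in term)
    return term


fresh = itertools.count()


def solve(goals, program, subst, depth):
    """Depth-first search: clauses in program order, goals left to right."""
    if not goals:
        yield subst
        return
    if depth == 0:
        raise DepthExceeded
    for head, body in program:
        n = next(fresh)
        new_subst = unify(goals[0], rename(head, n), subst)
        if new_subst is not None:
            new_goals = [rename(g, n) for g in body] + goals[1:]
            yield from solve(new_goals, program, new_subst, depth - 1)


def proves(goal, program, depth=25):
    """True if a proof is found, False if the search fails, None if it gives up."""
    try:
        next(solve([goal], program, {}, depth))
        return True
    except StopIteration:
        return False
    except DepthExceeded:
        return None


LINKS = [(("link", "a", "b"), []), (("link", "b", "c"), [])]
BASE = (("path", "X", "Y"), [("link", "X", "Y")])
RECURSIVE = (("path", "X", "Y"), [("path", "X", "Z"), ("path", "Z", "Y")])

print(proves(("path", "a", "c"), LINKS + [BASE, RECURSIVE]))   # True
print(proves(("path", "a", "c"), LINKS + [RECURSIVE, BASE]))   # None
```

With the base clause first the proof is found after a few steps. With the recursive clause first the search only descends, and the sketch reports None once the bound is hit.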