Title: Logic, Language and Learning
1Logic, Language and Learning
- Chapter 1 Introduction
- Luc De Raedt
2Ingredients
- Logic
- Computational logic
- Language
- Natural language processing
- Learning
- Machine learning
[Diagram: Logic, Language, Learning]
3Practical
- Theory
- Tuesday 14-15
- Thursday 9-10
- Exercises
- Thursday 10-11
- Responsible
- Luc De Raedt and Sunna Torge
4Exercises and Exams
- Normal Exercise Sheet
- Each exercise sheet is worth one bonus point
- One bonus point requires half of the marks on the sheet
- To be able to use your bonus points, you need to have earned at least half of them.
- Some mini-projects (1-3)
- Larger exercises, more work, more time and count for more bonus points
- Fair play demanded!
- Teams of two are allowed
5Books and Materials
- Logic and Prolog
- Peter Flach, Simply Logical, John Wiley, 1994.
- Other recommended books
- Bratko, Prolog programming for AI
- Sterling and Shapiro, The art of Prolog
- Learning
- Draft by Luc De Raedt, From ILP to Relational Data Mining, Springer, in preparation
- Language
- Covington, NLP for Prolog Programmers, 1994
- Matthews, An Introduction to NLP through Prolog, 1998
- Possibly research papers.
6Logic
[Diagram: Logic, Language, Learning]
- Computational logic
- Programming language Prolog as a tool for studying computational logic
7Why Logic ?
- Logic is (and has always been) popular in AI and CS
- as a tool to represent knowledge
- as a tool for reasoning
- as a tool to formalize
- See the course on Artificial Intelligence
- Field of Computational Logic
8Computational logic
- Using Logic to Compute
- J.A. Robinson's Resolution principle
- A machine-oriented logic based on the resolution principle, JACM, 1965.
- Single inference rule for clausal logic
- R. Kowalski: Algorithm = Logic + Control
- (Kowalski, Logic for problem solving, 1979, North Holland)
- Procedural and Declarative Interpretations of Logic
9Computational logic (2)
- The programming language Prolog
- Kowalski, Colmerauer, Bruynooghe, ...
- Use of Logic Programming to study a variety of problems in artificial intelligence and computer science
- natural language processing
- e.g. Definite Clause Grammars, Unification Based Grammars, ...
- Planning and knowledge representation
- situation calculus and event calculus, non-monotonic reasoning, ...
10Computational logic (3)
- Deductive databases
- datalog, constraints, database updating, recursion, ...
- Theory of databases
- Constraint logic programming
- reasoning about constraints
- Abductive logic programming
- applications in diagnostic reasoning, planning and databases
- Inductive logic programming
- data mining and machine learning in computational logic
11Computational logic (4)
- Meta-programming
- Theory of logic programming
- computation, relation to first order logic, negation, non-monotonic reasoning, ...
- Studies in computer science
- e.g. termination, program transformation, program synthesis, program analysis, ...
- ...
12Deductive Reasoning
- from KB derive h such that KB ⊨ h
- (KB logically entails h)
- from ∀X: man(X) → mortal(X)
- and man(socrates)
- infer mortal(socrates)
- normal use of logic
- Truth preserving: if KB is true, then h is true as well
- applications: theorem proving, computations
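The Socrates example can be run directly as a logic program. A minimal sketch, assuming the fact and the rule from the slide make up the whole knowledge base:

```prolog
% Knowledge base KB: one fact and one rule, taken from the slide.
man(socrates).
mortal(X) :- man(X).

% Deduction: ask whether KB entails mortal(socrates).
% ?- mortal(socrates).
% The query succeeds because man(socrates) is in KB and the rule applies.
```

Nothing is derived for individuals that KB does not mention: the query `?- mortal(plato).` fails with this knowledge base.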
13Abductive Reasoning
- from KB and o find h such that KB ∪ h ⊨ o
- from ∀X: man(X) → mortal(X)
- and mortal(socrates)
- infer man(socrates)
- diagnostic/causal reasoning
- (usually) about single observation
- usually only facts are inferred
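Abduction can be sketched as a small meta-interpreter that collects the facts it has to assume in order to explain an observation. The predicate names `rule/2`, `abducible/1` and `abduce/2` are assumptions made for this illustration, not part of the course material.

```prolog
% KB as rule(Head, BodyList): forall X: man(X) -> mortal(X).
rule(mortal(X), [man(X)]).

% Only facts of this form may be assumed (hypothetical declaration).
abducible(man(_)).

% abduce(+Observation, -Hypotheses): find facts that, added to KB,
% entail the observation.
abduce(Obs, Hyps) :-
    abduce([Obs], [], Hyps).

abduce([], Hyps, Hyps).
abduce([G|Gs], Hyps0, Hyps) :-          % resolve G against a KB rule
    rule(G, Body),
    append(Body, Gs, Goals),
    abduce(Goals, Hyps0, Hyps).
abduce([G|Gs], Hyps0, Hyps) :-          % or assume G as a hypothesis
    abducible(G),
    (   memberchk(G, Hyps0)
    ->  Hyps1 = Hyps0
    ;   Hyps1 = [G|Hyps0]
    ),
    abduce(Gs, Hyps1, Hyps).

% ?- abduce(mortal(socrates), H).
% explains the observation by assuming man(socrates).
```

As on the slide, the explanation consists of facts only, and it concerns a single observation.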
14Inductive Reasoning
- from KB and o1, o2 find h such that
- KB ∪ h ⊨ o1 and o2
- from man(socrates), mortal(socrates)
- and man(plato), mortal(plato)
- infer ∀X: man(X) → mortal(X)
- learning
- (usually) about multiple observations
- usually rules / clauses
- Inductive reasoning is falsity-preserving
- If o1 or o2 is false (and KB is assumed to be
true), then h must be false as well.
15Logic
[Diagram: Logic, Language, Learning]
- Many examples of Prolog programming will be seen
- Theorem provers
- NLP systems
- Eliza
- Abductive reasoning ...
16Language
[Diagram: Logic, Language, Learning]
- Natural language processing
- Focus on Computer Science perspective
- Building systems that understand natural language
- Linguistic aspects important too!
17Eliza
18Eliza
19Eliza
- Eliza does not understand
- Relies on your intelligence
- No world/domain knowledge
- Easy to mislead Eliza
20Representation and understanding
- Need to represent meaning
- Disambiguation
- Multiple senses: "Rice flies like sand"
- Representation
- Precise and unambiguous
- Intuitive structure of language
- Syntax
- Semantics
- Meaning
21Ambiguity
22An Architecture for NLP
23Levels of language analysis
- Phonetic and phonological knowledge
- Morphological
- Syntactic
- Semantic
- Pragmatic
- Discourse
- World
24Applications of NLP
- Text based
- Document retrieval
- Information extraction
- Translation
- Summarization
- Story understanding
- Various levels of understanding
25Applications of NLP
- Dialogue based
- Question answering
- Automated customer service
- Tutoring systems
- Spoken language control of machines
- Cooperative problem solving
- Active participation is different from speech recognition
- Understanding ≠ speech recognition
26Learning
[Diagram: Logic, Language, Learning]
- Machine learning
- Aims at improving one's performance on a specific task with experience
- Often related to use of data, data analysis and data mining
27Inductive learning and data mining
- Generalizing specific observations into general laws
- Synthesizing new knowledge from sets of examples
28Predictive Induction
29Machine Discovery using Induction
30 Quinlan's Example
31Decision Tree
Outlook
- sunny: Humidity
  - high: neg
  - normal: plus
- overcast: plus
- rain: Windy
  - yes: neg
  - no: plus
32Algorithm
- Recursively split imperfect nodes
- contain examples of multiple classes
- Select best attribute to split upon
- using statistical criteria
- information gain
- Until perfect nodes or further splitting uninteresting (not significant)
- Number handling
- Noise handling
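The attribute-selection step can be sketched by computing information gain from class counts. The counts in the example query (9 plus and 5 neg overall, split by Outlook into 2/3, 4/0 and 3/2) are an assumption based on the usual form of Quinlan's weather data; the predicate names are illustrative only.

```prolog
% entropy(+Counts, -H): entropy in bits of a list of class counts.
entropy(Counts, H) :-
    sum_list(Counts, Total),
    Total > 0,
    foldl(add_term(Total), Counts, 0.0, H).

add_term(_, 0, H, H) :- !.               % empty classes contribute nothing
add_term(Total, C, H0, H) :-
    P is C / Total,
    H is H0 - P * log2(P).

% gain(+ParentCounts, +ChildCountsList, -Gain): entropy of the node minus
% the size-weighted entropy of the nodes produced by the split.
gain(Parent, Children, Gain) :-
    entropy(Parent, HP),
    sum_list(Parent, N),
    foldl(weighted(N), Children, 0.0, HC),
    Gain is HP - HC.

weighted(N, Counts, Acc0, Acc) :-
    entropy(Counts, H),
    sum_list(Counts, M),
    Acc is Acc0 + M / N * H.

% ?- gain([9,5], [[2,3],[4,0],[3,2]], G).
% G is roughly 0.25 bits for a split on Outlook under the assumed counts.
```

The algorithm would evaluate `gain/3` for every candidate attribute at an imperfect node and split on the one with the highest value.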
33Descriptive Induction
34 Basket-Analysis Example
35Descriptive Data Mining
Table in Relational Database
Selected, Preprocessed, and Transformed Data
Association Rules
- IF mustard and sausage THEN beer (support 7%, confidence 65%)
- IF bread and butter THEN cheese (support 20%, confidence 66%)
36Mining Association Rules ...
2-Phased Algorithm (Agrawal et al., 93)
Phase 1: Queries with high frequency (support)
- 30% of customers buy bread and butter
- 40% of customers buy cheese
- 20% of customers buy butter and bread and cheese
Phase 2: Association Rules with high confidence
- IF bread and butter THEN cheese (support 20%, confidence 66%)
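Support and confidence can be sketched over a toy basket table. The ten `basket/2` facts are invented so that the frequencies match the slide (30%, 40%, 20%); the predicate names are assumptions for this illustration.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(lists)).

% Hypothetical data: basket(CustomerId, Items).
basket(1,  [bread, butter, cheese]).
basket(2,  [bread, butter, cheese]).
basket(3,  [bread, butter]).
basket(4,  [cheese]).
basket(5,  [cheese, beer]).
basket(6,  [beer]).
basket(7,  [mustard, sausage, beer]).
basket(8,  [bread]).
basket(9,  [butter]).
basket(10, [sausage]).

% count_with(+Items, -K): number of baskets containing all Items.
count_with(Items, K) :-
    aggregate_all(count, (basket(_, B), subset(Items, B)), K).

% support(+Items, -S): fraction of baskets containing all Items (phase 1).
support(Items, S) :-
    aggregate_all(count, basket(_, _), N),
    count_with(Items, K),
    S is K / N.

% confidence(+If, +Then, -C): among baskets containing If, the fraction
% that also contain Then (phase 2).
confidence(If, Then, C) :-
    append(If, Then, All),
    count_with(If, KIf),
    KIf > 0,
    count_with(All, KAll),
    C is KAll / KIf.

% ?- support([bread, butter, cheese], S).        % 2 of 10 baskets
% ?- confidence([bread, butter], [cheese], C).   % 2 of 3 baskets
```

The confidence of the rule is the support of the full item set divided by the support of its IF part, which is how 20% and 30% give about 66%.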
37Association rules
- Available in ALL commercial data mining tools
- Typical applications in marketing, basket analysis, sequence analysis, networking, ...
- E.g. alarms in a network
38[Diagram: Logic, Language, Learning]
- Logic and Language: Unification Based Grammars
- Logic and Learning: Inductive Logic Programming
- Language and Learning: Empirical/Statistical Natural Language Processing
39Inductive logic programming
- The study of (inductive) machine learning and data mining using computational logic
- Why ?
- The need for an expressive representation language
- Usual learning/mining techniques employ flat table
- Logical aspects of learning
- Induction/generalization and Deduction/Specialization
40Examples labeled neg
Examples labeled pos
41QSAR - a real-life example
Helma et al. JCICS, 2004 fragment -
ccccccccc
42Molecular Feature Mining
- Fragment is substructure
- Here linear fragment - sequence of atoms and bonds
- Works with SMILES / SMARTS
- Computational chemistry format
- Kramer et al KDD 01
43Notice
- General purpose systems
- Declarative knowledge
- structural alerts
- readily understandable to human experts
- Published in application domain
- Real discoveries / understanding
- Towards a Discovery Science
44Why in logic ?
- Limitations of classical machine learning and data mining techniques
- essentially a propositional logic
- Imagine a neural net / statistical package inferring the structural alert !
- PL1 is more expressive and declarative
- PL1 is well-understood
- Inference rules
45Linked Bibliographic Data
Examples
Logistic Regression, SVMs,
Naive Bayes,
Real World
46Linked Bibliographic Data
[Diagram: papers P1-P4, author A1 and institution I1, linked by citation, co-citation, author-of and author-affiliation; nodes carry attributes/classes]
- Multi-relational, heterogeneous
- and semi-structured
Real World
47Linked Bibliographic Data
Authors
Text/attributes
Citations
Real World
48Representing bibliographic information
- author(ottmann, algods)
- author(wiedemayer, algods)
- author(wirth, algodsprog)
- institute(ottmann, iif)
- institute(wiedemayer, eth)
- cites(algods,algodsprog)
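Once the bibliographic data is stored as facts, relations such as co-authorship follow from short rules. A minimal sketch using the facts from the slide; the derived predicates `coauthor/2` and `cites_author/2` are assumptions added for illustration.

```prolog
% Facts from the slide.
author(ottmann, algods).
author(wiedemayer, algods).
author(wirth, algodsprog).
institute(ottmann, iif).
institute(wiedemayer, eth).
cites(algods, algodsprog).

% Two different people who wrote the same publication.
coauthor(A, B) :-
    author(A, P),
    author(B, P),
    A \== B.

% A wrote something that cites something written by B.
cites_author(A, B) :-
    author(A, P1),
    cites(P1, P2),
    author(B, P2).

% ?- coauthor(ottmann, Who).
% ?- cites_author(Who, wirth).
```

Queries of this kind join several relations at once, which a single flat table cannot express directly.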
49A first order representation
triangle(e1,o1,up). triangle(e1,o2,up). ...
in(e1,o1,o2). in(e1,o3,o4). ...
square(e1,o3). square(e1,o4). ...
50A first order representation
atom(nitro,a1,carbon,-4). atom(nitro,a2,hydrogen,-2). atom(nitro,a3,carbon,-2). ...
bond(nitro,a1,a2,double). bond(nitro,a2,a3,single). ...
benzene_ring(nitro,a1,a2,a3,a4,a5,a6). could be defined as a view predicate/relation
51Clause Rule
- pos(E) :- triangle(E,X),in(E,X,Y),triangle(E,Y)
- pos(E) :- circle(E,X),in(E,Y,X),config(E,Y,up)
- pos(E) :- circle(E,X),in(E,Y,X),square(E,Y),triangle(E,Z)
52covers(H,pos(e)) iff KB ∪ H ⊨ pos(e)
Pos if there are two distinct triangles with identical configuration, and a circle
pos(E) :- triangle(E,X,C),triangle(E,Y,C),not(X=Y),circle(E,Z).
E
square(e1,o1). triangle(e1,o2,up). in(e1,o2,o1). circle(e1,o3). triangle(e1,o4,up). in(e1,o4,o3).
?- pos(e1). Yes
triangle(e5,o51,up). square(e5,o52). in(e5,o52,o51).
?- pos(e5). No
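The coverage test can be reproduced by loading both examples and the hypothesis into one program. A minimal sketch: the facts are those of the slide, grouped by predicate, and `X \== Y` stands in for the "two distinct triangles" condition.

```prolog
% Examples e1 and e5 described by ground facts.
triangle(e1, o2, up).
triangle(e1, o4, up).
triangle(e5, o51, up).
square(e1, o1).
square(e5, o52).
circle(e1, o3).
in(e1, o2, o1).
in(e1, o4, o3).
in(e5, o52, o51).

% Hypothesis H: two distinct triangles with the same configuration,
% and a circle, all within the same example E.
pos(E) :-
    triangle(E, X, C),
    triangle(E, Y, C),
    X \== Y,
    circle(E, _).

% ?- pos(e1).   succeeds: o2 and o4 both point up, and o3 is a circle
% ?- pos(e5).   fails: e5 has a single triangle and no circle
```

Here H covers e1 and does not cover e5, so coverage reduces to running a query against the example's facts.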
53Inductive logic programming
- Representational issues
- how to represent examples, hypotheses and background knowledge ?
- Problem settings and algorithms
- predictive
- descriptive
- distance based
- k-nearest neighbor clustering
54Inductive logic programming
- Structuring the search space
- learning as search (Mitchell)
- generality relation provides structure
- Generality and logical entailment coincide
- inference rules for induction by inverting deductive inference (resolution)
- theta-subsumption and resolution
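The generality relation given by theta-subsumption can be checked mechanically. The sketch below assumes a clause is encoded as a list of literals, with body literals wrapped in `neg/1`; both the encoding and the predicate names are choices made for this illustration.

```prolog
% theta_subsumes(+C1, +C2): some substitution maps every literal of C1
% onto a literal of C2, so C1 is at least as general as C2.
theta_subsumes(C1, C2) :-
    copy_term(C1, G1),            % work on copies, leave inputs unbound
    copy_term(C2, G2),
    numbervars(G2, 0, _),         % freeze the variables of C2 into constants
    maps_into(G1, G2),
    !.

maps_into([], _).
maps_into([L|Ls], C) :-
    member(L, C),                 % unification supplies the substitution
    maps_into(Ls, C).

% pos(E) :- triangle(E,X), in(E,X,Y)  is more general than
% pos(E) :- triangle(E,X), in(E,X,Y), triangle(E,Y):
% ?- theta_subsumes([pos(E), neg(triangle(E,X)), neg(in(E,X,Y))],
%                   [pos(F), neg(triangle(F,A)), neg(in(F,A,B)),
%                    neg(triangle(F,B))]).
```

The check succeeds in the direction shown and fails with the two clauses swapped, which is the ordering a learner can use to structure its search space.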
55Language and Logic
- Many implementations of NLP systems employ computational logic
- Prolog
- Unification based grammars
- Example: Definite clause grammars
56Context-free grammar
p.132
sentence --> noun_phrase,verb_phrase.
noun_phrase --> proper_noun.
noun_phrase --> article,adjective,noun.
noun_phrase --> article,noun.
verb_phrase --> intransitive_verb.
verb_phrase --> transitive_verb,noun_phrase.
article --> [the].          % cons(the,Nil)
adjective --> [lazy].
adjective --> [rapid].
proper_noun --> [achilles].
noun --> [turtle].
intransitive_verb --> [sleeps].
transitive_verb --> [beats].
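The grammar above can be loaded as a definite clause grammar and queried with `phrase/2`. A minimal sketch, assuming a Prolog system with standard DCG support (SWI-Prolog was used as the reference here):

```prolog
sentence --> noun_phrase, verb_phrase.
noun_phrase --> proper_noun.
noun_phrase --> article, adjective, noun.
noun_phrase --> article, noun.
verb_phrase --> intransitive_verb.
verb_phrase --> transitive_verb, noun_phrase.
article --> [the].
adjective --> [lazy].
adjective --> [rapid].
proper_noun --> [achilles].
noun --> [turtle].
intransitive_verb --> [sleeps].
transitive_verb --> [beats].

% Recognition: is the word list a sentence of the grammar?
% ?- phrase(sentence, [the, rapid, turtle, beats, achilles]).

% Generation: the same grammar enumerates its sentences on backtracking.
% ?- phrase(sentence, S).
```

Because the grammar has no recursion, the generation query terminates after listing every sentence.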
57Parse tree
p.133
58Analogy Proof/Derivation
p.133
Each sentential form is followed by the rule applied to it.
sentence
  sentence --> noun_phrase,verb_phrase
noun_phrase,verb_phrase
  noun_phrase --> article,adjective,noun
article,adjective,noun,verb_phrase
  article --> [the]
the,adjective,noun,verb_phrase
  adjective --> [rapid]
the,rapid,noun,verb_phrase
  noun --> [turtle]
the,rapid,turtle,verb_phrase
  verb_phrase --> transitive_verb,noun_phrase
the,rapid,turtle,transitive_verb,noun_phrase
  transitive_verb --> [beats]
the,rapid,turtle,beats,noun_phrase
  noun_phrase --> proper_noun
the,rapid,turtle,beats,proper_noun
  proper_noun --> [achilles]
the,rapid,turtle,beats,achilles
59Translating CFG in Prolog
60Difference lists in grammar rules
p.135
[Diagram: the noun phrase spans NP1 to VP1, the verb phrase spans VP1 to VP2]
sentence(Np1,Vp2) :- noun_phrase(Np1,Vp1), verb_phrase(Vp1,Vp2).
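The difference-list idea can be written out by hand for a tiny fragment. A minimal sketch with a two-word lexicon chosen for illustration: each predicate takes the list of words it starts with and returns what is left after it has consumed its part.

```prolog
% sentence(Np1, Vp2): the words between Np1 and Vp2 form a sentence.
sentence(Np1, Vp2) :-
    noun_phrase(Np1, Vp1),
    verb_phrase(Vp1, Vp2).

noun_phrase(S0, S) :- proper_noun(S0, S).
verb_phrase(S0, S) :- intransitive_verb(S0, S).

% A terminal consumes exactly one word from the front of the list.
proper_noun([achilles|S], S).
intransitive_verb([sleeps|S], S).

% ?- sentence([achilles, sleeps], []).
% ?- sentence([achilles, sleeps, soundly], Rest).   % Rest = [soundly]
```

Passing the remainder along avoids splitting the input with `append/3`, so each word is inspected once per attempted rule.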
61Translating CFG in Prolog
62Non-terminals with arguments: Unification Based
p.137
sentence --> noun_phrase(N),verb_phrase(N).
noun_phrase(N) --> article(N),noun(N).
verb_phrase(N) --> intransitive_verb(N).
article(singular) --> [a].
article(singular) --> [the].
article(plural) --> [the].
noun(singular) --> [turtle].
noun(plural) --> [turtles].
intransitive_verb(singular) --> [sleeps].
intransitive_verb(plural) --> [sleep].
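Number agreement can be observed by running the grammar. A minimal sketch of the same rules as a loadable DCG, assuming standard DCG support: the shared variable `N` forces article, noun and verb to agree.

```prolog
sentence --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> article(N), noun(N).
verb_phrase(N) --> intransitive_verb(N).
article(singular) --> [a].
article(singular) --> [the].
article(plural) --> [the].
noun(singular) --> [turtle].
noun(plural) --> [turtles].
intransitive_verb(singular) --> [sleeps].
intransitive_verb(plural) --> [sleep].

% ?- phrase(sentence, [the, turtles, sleep]).   % accepted
% ?- phrase(sentence, [a, turtles, sleep]).     % rejected: a is singular
% ?- phrase(sentence, [the, turtle, sleep]).    % rejected: verb is plural
```

Unification of `N` does the agreement check, so no separate rules are needed for the singular and plural cases.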
63Translate to Prolog
- Translation is automatic !
- Most Prolog implementations can directly cope with DCG notation
64Constructing parse trees
p.137-8
sentence(s(NP,VP)) --> noun_phrase(NP),verb_phrase(VP).
noun_phrase(np(N)) --> proper_noun(N).
noun_phrase(np(Art,Adj,N)) --> article(Art),adjective(Adj),noun(N).
noun_phrase(np(Art,N)) --> article(Art),noun(N).
verb_phrase(vp(IV)) --> intransitive_verb(IV).
verb_phrase(vp(TV,NP)) --> transitive_verb(TV),noun_phrase(NP).
article(art(the)) --> [the].
adjective(adj(lazy)) --> [lazy].
adjective(adj(rapid)) --> [rapid].
proper_noun(pn(achilles)) --> [achilles].
noun(n(turtle)) --> [turtle].
intransitive_verb(iv(sleeps)) --> [sleeps].
transitive_verb(tv(beats)) --> [beats].
?- sentence(T,[achilles,beats,the,lazy,turtle],[]).
T = s(np(pn(achilles)), vp(tv(beats), np(art(the), adj(lazy), n(turtle))))
65Parse tree
p.133
66Natural language processing
- Using grammar formalisms
- Close correspondence between certain types of grammars and logic programs
- Also, many popular grammar formalisms are unification based
67Learning and Language
- Classical Approach
- Writing grammars and lexicon by hand
- Very labour intensive
- Not so easy to reuse results from one domain to the next
- Deep parsing
- Empirical/Statistical NLP
- Start from so-called corpora (data)
- Derive or optimize (possibly parts of) grammar / lexicon using ML/Stats Methods
- Shallow parsing
68Illustration
- Wall Street Journal Corpus
- 3 000 000 words
- Correct parse tree for sentences known
- Constructed by hand
- Can be used to derive stochastic context free grammars
- SCFGs assign probabilities to parse trees
- Compute the most probable parse tree
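A stochastic grammar can be sketched as a DCG whose non-terminals carry a parse tree and a probability. Everything in the example is hypothetical: the grammar, the rule probabilities (0.6/0.4 for noun phrases, 0.7/0.3 for verb phrases) and the predicate names are invented for illustration, and lexical probabilities are ignored.

```prolog
:- use_module(library(aggregate)).

% Each rule multiplies its own probability by those of its parts.
s(s(NP,VP), P)     --> np(NP, P1), vp(VP, P2), { P is P1 * P2 }.
np(np(D,N), 0.6)   --> det(D), noun(N).
np(np(D,N,PP), P)  --> det(D), noun(N), pp(PP, P1), { P is 0.4 * P1 }.
vp(vp(V,NP), P)    --> verb(V), np(NP, P1), { P is 0.7 * P1 }.
vp(vp(V,NP,PP), P) --> verb(V), np(NP, P1), pp(PP, P2),
                       { P is 0.3 * P1 * P2 }.
pp(pp(Pr,NP), P)   --> prep(Pr), np(NP, P).

det(the)   --> [the].
noun(N)    --> [N], { memberchk(N, [turtle, man, telescope]) }.
verb(sees) --> [sees].
prep(with) --> [with].

% most_probable(+Words, -Tree, -P): the parse with the highest probability.
most_probable(Words, Tree, P) :-
    aggregate_all(max(P0, T0), phrase(s(T0, P0), Words), max(P, Tree)).

% ?- most_probable([the,turtle,sees,the,man,with,the,telescope], T, P).
```

The example sentence has two parses, one attaching "with the telescope" to the noun phrase and one to the verb phrase; with the assumed numbers the noun-phrase attachment scores higher. Enumerating all parses is only feasible for toy inputs.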
69Ambiguity
71Language and Learning
- Other popular techniques
- Transformation based learning by Eric Brill
- Part of speech tagging
- Given a sentence w1, ..., wn
- Assign lexical categories c1, ..., cn to words
- TBL assigns the most likely category to each word
- (estimated on corpus)
- Then learn rules such as
- If c1 c2 c3 occurs in consecutive positions
- Then change c2 into c4
- Hidden Markov Models
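The two stages of transformation based tagging can be sketched as follows. The lexicon, the tag names and the single rule are hypothetical, and a rule is encoded as `rule(C1,C2,C3,C4)`, read as "if C1 C2 C3 occur consecutively, change C2 into C4".

```prolog
% Stage 1: most likely category per word (would be estimated on a corpus).
most_likely(the, det).
most_likely(can, verb).
most_likely(rusted, verb).

% Stage 2: apply one transformation rule from left to right.
apply_rule(rule(C1,C2,C3,C4), [C1,C2,C3|T0], [C1,C4|T]) :-
    !,
    apply_rule(rule(C1,C2,C3,C4), [C3|T0], T).
apply_rule(Rule, [C|T0], [C|T]) :-
    apply_rule(Rule, T0, T).
apply_rule(_, [], []).

% tag(+Words, +Rules, -Tags): initial tagging, then each rule in order.
tag(Words, Rules, Tags) :-
    maplist(most_likely, Words, Tags0),
    foldl(apply_rule, Rules, Tags0, Tags).

% ?- tag([the, can, rusted], [rule(det, verb, verb, noun)], Tags).
% The initial tags det, verb, verb are corrected to det, noun, verb.
```

Learning would consist of searching for the rules that most reduce tagging errors on the corpus; the sketch only covers applying rules that are already given.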
72Logic, Language and Learning
- Emphasis on
- Logic and Prolog
- Peter Flach's Simply Logical (first few chapters)
- Language
- NLP using Prolog/Computational Logic
- Learning
- Introduction to Inductive Logic Programming
- Not so much on learning of language
- Cf. Advanced AI Techniques Course