Logic, Language and Learning


1
Logic, Language and Learning
  • Chapter 1 Introduction
  • Luc De Raedt

2
Ingredients
  • Logic
  • Computational logic
  • Language
  • Natural language processing
  • Learning
  • Machine learning

3
Practical
  • Theory
  • Tuesday 14-15
  • Thursday 9-10
  • Exercises
  • Thursday 10-11
  • Responsible
  • Luc De Raedt and Sunna Torge

4
Exercises and Exams
  • Normal Exercise Sheet
  • Each exercise sheet is worth one bonus point
  • One bonus point requires half of the marks on
    the sheet
  • To be able to use your bonus points, you need to
    have earned at least half of them.
  • Some mini-projects (1-3)
  • Larger exercises: more work, more time, and
    they count for more bonus points
  • Fair play demanded!
  • Teams of two are allowed

5
Books and Materials
  • Logic and Prolog
  • Peter Flach, Simply Logical, John Wiley, 1994.
  • Other recommended books
  • Bratko, Prolog programming for AI
  • Sterling and Shapiro, The art of Prolog
  • Learning
  • Draft by Luc De Raedt, From ILP to Relational
    Data Mining, Springer, in preparation
  • Language
  • Covington, NLP for Prolog Programmers, 1994
  • Matthews, An Introduction to NLP through Prolog,
    1998
  • Possibly research papers.

6
Logic
  • Computational logic
  • Programming language Prolog as a tool for
    studying computational logic

7
Why Logic?
  • Logic is (and has always been) popular in AI and
    CS
  • as a tool to represent knowledge
  • as a tool for reasoning
  • as a tool to formalize
  • See the course on Artificial Intelligence
  • Field of Computational Logic

8
Computational logic
  • Using Logic to Compute
  • J.A. Robinson's resolution principle
  • A Machine-Oriented Logic Based on the Resolution
    Principle, JACM, 1965.
  • Single inference rule for clausal logic
  • R. Kowalski: Algorithm = Logic + Control
  • (Kowalski, Logic for Problem Solving, 1979, North
    Holland)
  • Procedural and Declarative Interpretations of
    Logic

9
Computational logic (2)
  • The programming language Prolog
  • Kowalski, Colmerauer, Bruynooghe, ...
  • Use of Logic Programming to study a variety of
    problems in artificial intelligence and computer
    science
  • natural language processing
  • e.g. Definite Clause Grammars, Unification Based
    Grammars, ...
  • Planning and knowledge representation
  • situation calculus and event calculus,
    non-monotonic reasoning, ...

10
Computational logic (3)
  • Deductive databases
  • datalog, constraints, database updating,
    recursion, ...
  • Theory of databases
  • Constraint logic programming
  • reasoning about constraints
  • Abductive logic programming
  • applications in diagnostic reasoning, planning
    and databases
  • Inductive logic programming
  • data mining and machine learning in computational
    logic

11
Computational logic (4)
  • Meta-programming
  • Theory of logic programming
  • computation, relation to first order logic,
    negation, non-monotonic reasoning, ...
  • Studies in computer science
  • e.g. termination, program transformation, program
    synthesis, program analysis, ...
  • ...

12
Deductive Reasoning
  • from KB derive h such that KB ⊨ h
  • (KB logically entails h)
  • from forall X: man(X) -> mortal(X)
  • and man(socrates)
  • infer mortal(socrates)
  • normal use of logic
  • Truth-preserving: if KB is true, then h is true
    as well
  • applications: theorem proving, computation

13
Abductive Reasoning
  • from KB and o find h such that KB ∪ h ⊨ o
  • from forall X: man(X) -> mortal(X)
  • and mortal(socrates)
  • infer man(socrates)
  • diagnostic/causal reasoning
  • (usually) about single observation
  • usually only facts are inferred

14
Inductive Reasoning
  • from KB and o1, o2 find h such that
  • KB ∪ h ⊨ o1 and o2
  • from man(socrates), mortal(socrates)
  • and man(plato), mortal(plato)
  • infer forall X: man(X) -> mortal(X)
  • learning
  • (usually) about multiple observations
  • usually rules / clauses
  • Inductive reasoning is falsity-preserving
  • If o1 or o2 is false (and KB is assumed to be
    true), then h must be false as well.
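
The three reasoning modes on slides 12 to 14 can be contrasted in a few lines. This is only a rough sketch in Python, not the course's Prolog code: facts are (predicate, individual) pairs, a rule is a (body, head) pair standing for forall X: body(X) -> head(X), and the function names deduce, abduce and induce are made up for illustration.

```python
# A rule (body, head) stands for: forall X: body(X) -> head(X).
RULE = ("man", "mortal")

def deduce(facts, rule):
    """Deduction: add head(x) for every x with body(x) among the facts."""
    body, head = rule
    return facts | {(head, x) for (pred, x) in facts if pred == body}

def abduce(observation, rule):
    """Abduction: propose the fact body(x) that would explain head(x)."""
    body, head = rule
    pred, x = observation
    return {(body, x)} if pred == head else set()

def induce(facts):
    """Induction: propose every rule body -> head that holds for all
    individuals satisfying body in the observed facts."""
    preds = {pred for pred, _ in facts}
    rules = set()
    for body in preds:
        individuals = {x for pred, x in facts if pred == body}
        for head in preds - {body}:
            if all((head, x) in facts for x in individuals):
                rules.add((body, head))
    return rules

observations = {("man", "socrates"), ("mortal", "socrates"),
                ("man", "plato"), ("mortal", "plato")}

print(deduce({("man", "socrates")}, RULE))
print(abduce(("mortal", "socrates"), RULE))
print(induce(observations))
```

On these four observations induce also proposes mortal -> man, which shows that an induced hypothesis is only as good as the observations: a single mortal non-man would falsify it.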

15
Logic
  • Many examples of Prolog programming will be seen
  • Theorem provers
  • NLP systems
  • Eliza
  • Abductive reasoning ...

16
Language
  • Natural language processing
  • Focus on Computer Science perspective
  • Building systems that understand natural language
  • Linguistic aspects are important too!

17
Eliza
18
Eliza
19
Eliza
  • Eliza does not understand
  • Relies on your intelligence
  • No world/domain knowledge
  • Easy to mislead Eliza

20
Representation and understanding
  • Need to represent meaning
  • Disambiguation
  • Multiple senses: "Rice flies like sand"
  • Representation
  • Precise and unambiguous
  • Intuitive structure of language
  • Syntax
  • Semantics
  • Meaning

21
Ambiguity
22
An Architecture for NLP
23
Levels of language analysis
  • Phonetic and phonological knowledge
  • Morphological
  • Syntactic
  • Semantic
  • Pragmatic
  • Discourse
  • World

24
Applications of NLP
  • Text based
  • Document retrieval
  • Information extraction
  • Translation
  • Summarization
  • Story understanding
  • Various levels of understanding

25
Applications of NLP
  • Dialogue based
  • Question answering
  • Automated customer service
  • Tutoring systems
  • Spoken language control of machines
  • Cooperative problem solving
  • Active participation is different from speech
    recognition
  • Understanding ≠ speech recognition

26
Learning
  • Machine learning
  • Aims at improving one's performance on a specific
    task with experience
  • Often related to use of data, data analysis and
    data mining

27
Inductive learning and data mining
  • Generalizing specific observations into general
    laws
  • Synthesizing new knowledge from sets of examples

28
Predictive Induction
29
Machine Discovery using Induction
30
Quinlan's Example
31
Decision Tree
Outlook
  sunny: Humidity
    high: neg
    normal: plus
  overcast: plus
  rain: Windy
    yes: neg
    no: plus
32
Algorithm
  • Recursively split imperfect nodes
  • contain examples of multiple classes
  • Select best attribute to split upon
  • using statistical criteria
  • information gain
  • Until perfect nodes or further splitting
    uninteresting (not significant)
  • Number handling
  • Noise handling
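
The recursive splitting procedure above might be sketched as follows. This is a minimal Python illustration, not the original algorithm's code: it uses plain information gain, omits number and noise handling, and the ROWS table is an assumed version of the weather data with only the three attributes that appear in the tree on slide 31.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, attr):
    """Entropy reduction obtained by splitting rows on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r["class"] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r["class"] for r in rows]) - remainder

def build_tree(rows, attrs):
    """Recursively split imperfect nodes on the best attribute."""
    labels = [r["class"] for r in rows]
    if len(set(labels)) == 1 or not attrs:      # perfect node, or nothing left
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a))
    rest = [a for a in attrs if a != best]
    return {best: {value: build_tree([r for r in rows if r[best] == value], rest)
                   for value in sorted({r[best] for r in rows})}}

# Assumed example table: (outlook, humidity, windy, class).
RAW = [
    ("sunny", "high", "no", "neg"), ("sunny", "high", "yes", "neg"),
    ("overcast", "high", "no", "plus"), ("rain", "high", "no", "plus"),
    ("rain", "normal", "no", "plus"), ("rain", "normal", "yes", "neg"),
    ("overcast", "normal", "yes", "plus"), ("sunny", "high", "no", "neg"),
    ("sunny", "normal", "no", "plus"), ("rain", "normal", "no", "plus"),
    ("sunny", "normal", "yes", "plus"), ("overcast", "high", "yes", "plus"),
    ("overcast", "normal", "no", "plus"), ("rain", "high", "yes", "neg"),
]
ROWS = [dict(zip(("outlook", "humidity", "windy", "class"), r)) for r in RAW]

tree = build_tree(ROWS, ["outlook", "humidity", "windy"])
print(tree)
```

On this table the sketch splits on outlook first, then on humidity under sunny and on windy under rain, which matches the tree shown on slide 31.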

33
Descriptive Induction

34
Basket-Analysis Example
35
Descriptive Data Mining
Table in Relational Database
Association Rules
IF mustard and sausage THEN beer (support 7%, confidence 65%)
IF bread and butter THEN cheese (support 20%, confidence 66%)
Selected, Preprocessed, and Transformed Data
36
Mining Association Rules ...
2-Phased Algorithm (Agrawal et al., 1993)
Phase 1: find queries with high frequency among the customers
  • 30% of customers buy bread and butter
  • 40% of customers buy cheese
  • 20% of customers buy butter and bread and cheese
Phase 2: derive association rules with high confidence
  • IF bread and butter THEN cheese (support 20%, confidence 66%)
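
The support and confidence figures on this slide can be reproduced with a short calculation. The sketch below is in Python and uses ten made-up baskets, chosen only so that the frequencies match the slide (30%, 40%, 20%); the function names are illustrative and this is not the algorithm of Agrawal et al. itself.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(transactions, body, head):
    """support(body and head) / support(body) for the rule IF body THEN head."""
    both = set(body) | set(head)
    return support(transactions, both) / support(transactions, body)

# Ten hypothetical baskets (an assumption, not data from the slides).
BASKETS = [
    {"bread", "butter", "cheese"}, {"bread", "butter", "cheese"},
    {"bread", "butter"}, {"cheese"}, {"cheese", "milk"},
    {"bread"}, {"butter"}, {"milk"}, {"beer"}, {"bread", "milk"},
]

body, head = {"bread", "butter"}, {"cheese"}
print(support(BASKETS, body))            # phase 1: frequency of the body
print(support(BASKETS, body | head))     # phase 1: support of the rule
print(confidence(BASKETS, body, head))   # phase 2: 0.2 / 0.3
```

The confidence is 0.2 / 0.3, about 0.67, which the slide reports as 66.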
37
Association rules
  • Available in ALL commercial data mining tools
  • Typical applications in marketing, basket
    analysis, sequence analysis, networking, ...
  • E.g. alarms in a network

38
  • Language and Logic: Unification Based Grammars
  • Logic and Learning: Inductive Logic Programming
  • Language and Learning: Empirical/Statistical Natural Language Processing
39
Inductive logic programming
  • The study of (inductive) machine learning and
    data mining using computational logic
  • Why?
  • The need for an expressive representation
    language
  • Usual learning/mining techniques employ a flat
    table
  • Logical aspects of learning
  • Induction/generalization and
    Deduction/Specialization

40
Examples labeled neg
Examples labeled pos
41
QSAR - a real-life example
Helma et al. JCICS, 2004 fragment -
ccccccccc
42
Molecular Feature Mining
  • Fragment is substructure
  • Here linear fragment - sequence of atoms and
    bonds
  • Works with SMILES / SMARTS
  • Computational chemistry format
  • Kramer et al., KDD 2001

43
Notice
  • General purpose systems
  • Declarative knowledge
  • structural alerts
  • readily understandable to human experts
  • Published in application domain
  • Real discoveries / understanding
  • Towards a Discovery Science

44
Why in logic?
  • Limitations of classical machine learning and
    data mining techniques
  • essentially a propositional logic
  • Imagine a neural net or statistical package
    inferring the structural alert!
  • PL1 (first-order predicate logic) is more
    expressive and declarative
  • PL1 is well-understood
  • Inference rules

45
Linked Bibliographic Data
[Figure labels: Examples; Logistic Regression, SVMs, Naive Bayes, ...; Real World]
46
Linked Bibliographic Data
[Figure: graph linking papers (P1 to P4), an author (A1) and an institution (I1) through citation, co-citation, author-of and author-affiliation edges, with attributes/classes attached]
  • Multi-relational, heterogeneous
    and semi-structured

47
Linked Bibliographic Data
Authors
Text/attributes
Citations
Real World
48
Representing bibliographic information
  • author(ottmann, algods)
  • author(wiedemayer, algods)
  • author(wirth, algodsprog)
  • institute(ottmann, iif)
  • institute(wiedemayer, eth)
  • cites(algods,algodsprog)

49
A first order representation
triangle(e1,o1,up). triangle(e1,o2,up). ...
in(e1,o1,o2). in(e1,o3,o4). ...
square(e1,o3). square(e1,o4). ...
50
A first order representation
atom(nitro,a1,carbon,-4). atom(nitro,a2,hydrogen,-2). atom(nitro,a3,carbon,-2). ...
bond(nitro,a1,a2,double). bond(nitro,a2,a3,single). ...
benzene-ring(nitro,a1,a2,a3,a4,a5,a6). (could be defined as a view predicate/relation)
51
Clause Rule
  • pos(E) :- triangle(E,X),in(E,X,Y),triangle(E,Y)
  • pos(E) :- circle(E,X),in(E,Y,X),config(E,Y,up)
  • pos(E) :- circle(E,X),in(E,Y,X),square(E,Y),triangle(E,Z)

52
covers(H,pos(e)) iff KB ∪ H ⊨ pos(e)
Pos if there are two distinct triangles with
identical configuration, and a circle
pos(E) :- triangle(E,X,C),triangle(E,Y,C),not(X=Y),circle(E,Z).
E
square(e1,o1). triangle(e1,o2,up). in(e1,o2,o1). circle(e1,o3). triangle(e1,o4,up). in(e1,o4,o3).
?- pos(e1). Yes
triangle(e5,o51,up). square(e5,o52). in(e5,o52,o51).
?- pos(e5). No
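
The coverage test on this slide can be mimicked outside Prolog. The Python sketch below hard-codes the single clause from the slide (two distinct triangles with the same configuration, plus a circle) as a function; the tuple encoding of facts and the name covers_pos are assumptions made for illustration, and a real ILP system would run a general theorem prover instead.

```python
from itertools import permutations

# Facts of the two examples from the slide, encoded as tuples.
EXAMPLES = {
    "e1": {("square", "o1"), ("triangle", "o2", "up"), ("in", "o2", "o1"),
           ("circle", "o3"), ("triangle", "o4", "up"), ("in", "o4", "o3")},
    "e5": {("triangle", "o51", "up"), ("square", "o52"), ("in", "o52", "o51")},
}

def covers_pos(facts):
    """True if the clause body can be satisfied by the example's facts:
    triangle(E,X,C), triangle(E,Y,C), X and Y distinct, circle(E,Z)."""
    triangles = [f for f in facts if f[0] == "triangle"]
    has_circle = any(f[0] == "circle" for f in facts)
    two_alike = any(x[1] != y[1] and x[2] == y[2]
                    for x, y in permutations(triangles, 2))
    return two_alike and has_circle

for name, facts in EXAMPLES.items():
    print(name, covers_pos(facts))
```

Example e1 is covered because o2 and o4 are both triangles pointing up and o3 is a circle; e5 has a single triangle and no circle, so the clause body fails.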
53
Inductive logic programming
  • Representational issues
  • how to represent examples, hypotheses and
    background knowledge ?
  • Problem settings and algorithms
  • predictive
  • descriptive
  • distance based
  • k-nearest neighbor, clustering

54
Inductive logic programming
  • Structuring the search space
  • learning as search (Mitchell)
  • generality relation provides structure
  • Generality and logical entailment coincide
  • inference rules for induction by inverting
    deductive inference, e.g. resolution
  • theta-subsumption and resolution

55
Language and Logic
  • Many implementations of NLP systems employ
    computational logic
  • Prolog
  • Unification based grammars
  • Example: definite clause grammars

56
Context-free grammar
p.132
sentence --> noun_phrase,verb_phrase.
noun_phrase --> proper_noun.
noun_phrase --> article,adjective,noun.
noun_phrase --> article,noun.
verb_phrase --> intransitive_verb.
verb_phrase --> transitive_verb,noun_phrase.
article --> [the].        % [the] is the list cons(the,Nil)
adjective --> [lazy].
adjective --> [rapid].
proper_noun --> [achilles].
noun --> [turtle].
intransitive_verb --> [sleeps].
transitive_verb --> [beats].
57
Parse tree
p.133
58
Analogy Proof/Derivation
p.133
Each derivation step is followed by the rule applied to it:
sentence
    (sentence --> noun_phrase,verb_phrase)
noun_phrase,verb_phrase
    (noun_phrase --> article,adjective,noun)
article,adjective,noun,verb_phrase
    (article --> [the])
the,adjective,noun,verb_phrase
    (adjective --> [rapid])
the,rapid,noun,verb_phrase
    (noun --> [turtle])
the,rapid,turtle,verb_phrase
    (verb_phrase --> transitive_verb,noun_phrase)
the,rapid,turtle,transitive_verb,noun_phrase
    (transitive_verb --> [beats])
the,rapid,turtle,beats,noun_phrase
    (noun_phrase --> proper_noun)
the,rapid,turtle,beats,proper_noun
    (proper_noun --> [achilles])
the,rapid,turtle,beats,achilles
59
Translating CFG in Prolog
60
Difference lists in grammar rules
p.135
[Figure: the sentence spans NP1 to VP2; the noun phrase spans NP1 to VP1 and the verb phrase spans VP1 to VP2]
sentence(Np1,Vp2) :- noun_phrase(Np1,Vp1),verb_phrase(Vp1,Vp2).
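
The threading of remainders in the clause above can be imitated with generators. This Python sketch is only an analogy to the Prolog translation: each function takes the input word list and yields every remainder left after consuming one phrase, so sentence passes the remainder of the noun phrase (Vp1) on to the verb phrase. The tiny grammar and the helper name word are assumptions for illustration.

```python
def word(w, tokens):
    """Consume the terminal w, yielding the remainder if it matches."""
    if tokens[:1] == [w]:
        yield tokens[1:]

def noun_phrase(np1):
    # noun_phrase --> [achilles].   noun_phrase --> [the],[turtle].
    yield from word("achilles", np1)
    for rest in word("the", np1):
        yield from word("turtle", rest)

def verb_phrase(vp1):
    # verb_phrase --> [sleeps].   verb_phrase --> [beats],noun_phrase.
    yield from word("sleeps", vp1)
    for rest in word("beats", vp1):
        yield from noun_phrase(rest)

def sentence(np1):
    # sentence(Np1,Vp2) :- noun_phrase(Np1,Vp1), verb_phrase(Vp1,Vp2).
    for vp1 in noun_phrase(np1):
        yield from verb_phrase(vp1)     # each yielded value plays the role of Vp2

print(list(sentence(["achilles", "beats", "the", "turtle"])))
print(list(sentence(["the", "turtle", "sleeps", "soundly"])))
```

A word list is a complete sentence exactly when one of the yielded remainders is the empty list; the second call leaves the word soundly unconsumed.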
61
Translating CFG in Prolog
62
Non-terminals with arguments: Unification Based
p.137
sentence --> noun_phrase(N),verb_phrase(N).
noun_phrase(N) --> article(N),noun(N).
verb_phrase(N) --> intransitive_verb(N).
article(singular) --> [a].
article(singular) --> [the].
article(plural) --> [the].
noun(singular) --> [turtle].
noun(plural) --> [turtles].
intransitive_verb(singular) --> [sleeps].
intransitive_verb(plural) --> [sleep].
63
Translate to Prolog
  • Translation is automatic!
  • Most Prolog implementations can cope directly with
    DCG notation

64
Constructing parse trees
p.137-8
sentence(s(NP,VP)) --> noun_phrase(NP),verb_phrase(VP).
noun_phrase(np(N)) --> proper_noun(N).
noun_phrase(np(Art,Adj,N)) --> article(Art),adjective(Adj),noun(N).
noun_phrase(np(Art,N)) --> article(Art),noun(N).
verb_phrase(vp(IV)) --> intransitive_verb(IV).
verb_phrase(vp(TV,NP)) --> transitive_verb(TV),noun_phrase(NP).
article(art(the)) --> [the].
adjective(adj(lazy)) --> [lazy].
adjective(adj(rapid)) --> [rapid].
proper_noun(pn(achilles)) --> [achilles].
noun(n(turtle)) --> [turtle].
intransitive_verb(iv(sleeps)) --> [sleeps].
transitive_verb(tv(beats)) --> [beats].
?- sentence(T,[achilles,beats,the,lazy,turtle],[]).
T = s(np(pn(achilles)), vp(tv(beats), np(art(the), adj(lazy), n(turtle))))
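
To see how the extra argument collects a parse tree, the grammar above can be run through a small table-driven parser. The Python sketch below is an illustration, not the DCG mechanism itself: GRAMMAR and LABEL restate the slide's rules and tree labels, trees are nested tuples, and the function names are hypothetical.

```python
GRAMMAR = {
    "sentence": [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["proper_noun"], ["article", "adjective", "noun"],
                    ["article", "noun"]],
    "verb_phrase": [["intransitive_verb"], ["transitive_verb", "noun_phrase"]],
    "article": [["the"]],
    "adjective": [["lazy"], ["rapid"]],
    "proper_noun": [["achilles"]],
    "noun": [["turtle"]],
    "intransitive_verb": [["sleeps"]],
    "transitive_verb": [["beats"]],
}
LABEL = {"sentence": "s", "noun_phrase": "np", "verb_phrase": "vp",
         "article": "art", "adjective": "adj", "proper_noun": "pn",
         "noun": "n", "intransitive_verb": "iv", "transitive_verb": "tv"}

def parse(symbol, tokens):
    """Yield (tree, remainder) for every way symbol can start tokens."""
    if symbol not in GRAMMAR:                    # terminal word
        if tokens[:1] == [symbol]:
            yield symbol, tokens[1:]
        return
    for body in GRAMMAR[symbol]:
        for children, rest in parse_sequence(body, tokens):
            yield (LABEL[symbol], *children), rest

def parse_sequence(symbols, tokens):
    """Yield (list of trees, remainder) for a sequence of symbols."""
    if not symbols:
        yield [], tokens
        return
    for first, rest in parse(symbols[0], tokens):
        for others, remainder in parse_sequence(symbols[1:], rest):
            yield [first] + others, remainder

def parse_tree(tokens):
    """Return the first tree that consumes all tokens, or None."""
    return next((tree for tree, rest in parse("sentence", tokens) if not rest),
                None)

print(parse_tree(["achilles", "beats", "the", "lazy", "turtle"]))
```

The printed tuple has the same shape as the term T in the query on the slide.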
65
Parse tree
p.133
66
Natural language processing
  • Using grammar formalisms
  • Close correspondence between certain types of
    grammars and logic programs
  • Also, many popular grammar formalisms are
    unification based

67
Learning and Language
  • Classical Approach
  • Writing grammars and lexicon by hand
  • Very labour intensive
  • Not so easy to reuse results from one domain to
    the next
  • Deep parsing
  • Empirical/Statistical NLP
  • Start from so-called corpora (data)
  • Derive or optimize (possibly parts of) the grammar
    and lexicon using ML/statistical methods
  • Shallow parsing

68
Illustration
  • Wall Street Journal Corpus
  • 3 000 000 words
  • Correct parse tree for sentences known
  • Constructed by hand
  • Can be used to derive stochastic context free
    grammars
  • SCFGs assign probabilities to parse trees
  • Compute the most probable parse tree

69
Ambiguity
71
Language and Learning
  • Other popular techniques
  • Transformation-based learning (TBL) by Eric Brill
  • Part-of-speech tagging
  • Given a sentence w1, ..., wn
  • Assign lexical categories c1, ..., cn to the words
  • TBL first assigns the most likely category to each word
  • (estimated on a corpus)
  • Then learns rules such as
  • If c1 c2 c3 occur in consecutive positions
  • Then change c2 into c4
  • Hidden Markov Models
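
The two stages of transformation-based tagging described above might look as follows. This Python sketch covers only the initial most-likely tagging and the application of one transformation rule, not the learning of rules; the three-sentence corpus, the tag names and the rule are invented for illustration.

```python
from collections import Counter, defaultdict

def most_likely_tags(tagged_corpus):
    """Map each word to the category it carries most often in the corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def apply_rule(tags, rule):
    """Rule ((c1, c2, c3), c4): wherever c1 c2 c3 occur in consecutive
    positions, change c2 into c4. Contexts are read from the input tags."""
    context, c4 = rule
    out = list(tags)
    for i in range(1, len(tags) - 1):
        if (tags[i - 1], tags[i], tags[i + 1]) == context:
            out[i] = c4
    return out

# Hypothetical hand-tagged corpus (an assumption, not course data).
CORPUS = [
    [("i", "PRON"), ("can", "MODAL"), ("swim", "VERB")],
    [("you", "PRON"), ("can", "MODAL"), ("run", "VERB")],
    [("the", "DET"), ("can", "NOUN"), ("rusts", "VERB")],
]

lexicon = most_likely_tags(CORPUS)
initial = [lexicon[w] for w in ["the", "can", "rusts"]]
rule = (("DET", "MODAL", "VERB"), "NOUN")
print(initial)
print(apply_rule(initial, rule))
```

The initial tagging mislabels can as a modal because that is its most frequent category in the toy corpus; the transformation rule then repairs it using the neighbouring categories.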

72
Logic, Language and Learning
  • Emphasis on
  • Logic and Prolog
  • Peter Flach's Simply Logical (first few chapters)
  • Language
  • NLP using Prolog/Computational Logic
  • Learning
  • Introduction to Inductive Logic Programming
  • Not so much on learning of language
  • Cf. Advanced AI Techniques Course