Title: Graph-Based Concept Learning
1 Graph-Based Concept Learning
Jesus A. Gonzalez, Lawrence B. Holder, and Diane J. Cook
Department of Computer Science and Engineering
University of Texas at Arlington
Box 19015, Arlington, TX 76019-0015
{gonzalez,holder,cook}@cse.uta.edu
http://cygnus.uta.edu/subdue/
2 MOTIVATION AND GOAL
- Need for non-logic-based relational concept learner
- Empirical and theoretical comparisons of relational learners
- Logic-based relational learners (ILP)
  - FOIL (Quinlan et al.)
  - Progol (Muggleton et al.)
- Graph-based relational learner
  - SUBDUE
3 SUBDUE KNOWLEDGE DISCOVERY SYSTEM
- SUBDUE discovers patterns (substructures) in structural data sets
- SUBDUE represents data as a labeled graph
  - Vertices represent objects or attributes
  - Edges represent relationships between objects
- Input: labeled graph
- Output: discovered patterns and instances
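The labeled-graph input described above can be sketched in Python as follows. This is only an illustration, not SUBDUE's actual input format: the class name `LabeledGraph`, its methods, and the triangle-on-square example data are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledGraph:
    """Hypothetical container for a labeled, directed graph."""
    vertices: dict = field(default_factory=dict)  # vertex id -> label
    edges: list = field(default_factory=list)     # (source id, target id, label)

    def add_vertex(self, vid, label):
        self.vertices[vid] = label

    def add_edge(self, src, dst, label):
        # Both endpoints must already exist so every edge relates known vertices.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, dst, label))


# Assumed example: an object with shape "triangle" is on an object with shape "square".
g = LabeledGraph()
g.add_vertex(1, "object")
g.add_vertex(2, "triangle")
g.add_vertex(3, "object")
g.add_vertex(4, "square")
g.add_edge(1, 2, "shape")
g.add_edge(3, 4, "shape")
g.add_edge(1, 3, "on")
```

Here objects and attribute values are both vertices, and relationships such as "shape" and "on" are edge labels, matching the bullets above.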
4 SUBDUE EXAMPLE
[Figure: input graph and discovered output substructure. Labels shown: object, shape, triangle, square, on. Output: 4 instances of the substructure.]
5 SUBDUE'S SEARCH
- Starts with a single vertex and repeatedly expands by one edge
- Computationally-constrained beam search
- Polynomially-constrained inexact graph matching
- Search space is all sub-graphs of input graph
- Guided by compression heuristic
  - Minimum description length
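The search strategy above can be illustrated with a much-simplified beam search over connected subgraphs. This sketch is an assumption-laden toy, not SUBDUE's implementation: it omits instance grouping and inexact graph matching, and the function name `beam_search`, its parameters, and the placeholder scoring function are all hypothetical.

```python
def beam_search(vertices, edges, score, beam_width=3, max_steps=4):
    """Simplified beam search over connected subgraphs.

    vertices: dict of vertex id -> label
    edges: list of (src, dst, label)
    score: hypothetical heuristic, called as score(vertex_ids, edge_indices)
    """
    # Start from every single-vertex subgraph.
    beam = [(frozenset([v]), frozenset()) for v in vertices]
    best = max(beam, key=lambda c: score(*c))
    for _ in range(max_steps):  # computational limit on the search depth
        candidates = set()
        for vs, es in beam:
            for i, (src, dst, _label) in enumerate(edges):
                # Expand by one edge that touches the current subgraph.
                if i not in es and (src in vs or dst in vs):
                    candidates.add((vs | {src, dst}, es | {i}))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring expansions.
        beam = sorted(candidates, key=lambda c: score(*c), reverse=True)[:beam_width]
        if score(*beam[0]) > score(*best):
            best = beam[0]
    return best


vertices = {1: "object", 2: "triangle", 3: "object", 4: "square"}
edges = [(1, 2, "shape"), (3, 4, "shape"), (1, 3, "on")]
# Placeholder heuristic: prefer subgraphs with more edges.
# SUBDUE would use a compression-based score here instead.
best_vertices, best_edges = beam_search(vertices, edges, lambda vs, es: len(es))
```

The `beam_width` and `max_steps` arguments stand in for the computational constraints named on the slide; swapping the placeholder score for a compression measure changes which subgraphs survive each step.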
6 EVALUATION CRITERION: MINIMUM DESCRIPTION LENGTH
- Minimum Description Length (MDL) principle
  - The best theory to describe a set of data is the one that minimizes the DL of the entire data set.
  - DL of the graph: the number of bits necessary to completely describe the graph.
- Search for the substructure that results in the maximum compression.
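One way to write the criterion above as a formula is sketched below. The notation is an assumption, not taken from the slides: G is the input graph, S a candidate substructure, DL(S) the description length of S, and DL(G | S) the description length of G after its instances of S have been compressed away.

```latex
S^{*} = \arg\min_{S} \bigl[ DL(S) + DL(G \mid S) \bigr]
      = \arg\max_{S} \frac{DL(G)}{DL(S) + DL(G \mid S)}
```

Because DL(G) is fixed for a given input, minimizing the total description length and maximizing the compression ratio select the same substructure.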
7 CONCEPT LEARNING SUBDUE
- Modify Subdue for concept learning (SubdueCL)
- Accept positive and negative graphs as input examples
- Find substructure describing positive examples, but not negative examples
- Learn multiple rules (DNF)
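Learning multiple rules as a disjunction can be sketched with a generic sequential-covering loop. Whether SubdueCL works exactly this way is an assumption; the names `learn_dnf`, `find_rule`, and `covers` are hypothetical, and the toy example uses sets of labels as a stand-in for graphs and subgraph matching.

```python
def learn_dnf(positives, negatives, find_rule, covers, max_rules=10):
    """Sequential covering: collect rules until every positive example is covered."""
    rules = []
    remaining = list(positives)
    while remaining and len(rules) < max_rules:
        rule = find_rule(remaining, negatives)
        if rule is None:
            break  # no acceptable rule exists for the remaining positives
        still_uncovered = [ex for ex in remaining if not covers(rule, ex)]
        if len(still_uncovered) == len(remaining):
            break  # no progress; stop rather than loop forever
        rules.append(rule)
        remaining = still_uncovered
    return rules  # read as rule_1 OR rule_2 OR ...


def covers(rule, example):
    # Toy stand-in for subgraph matching: a rule is a set of labels.
    return rule <= example


def find_rule(positives, negatives):
    # Toy rule finder: the single label covering the most positives and no negatives.
    labels = sorted(set().union(*positives))
    candidates = [frozenset([label]) for label in labels
                  if not any(covers(frozenset([label]), n) for n in negatives)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: sum(covers(r, p) for p in positives))


positives = [frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"d"})]
negatives = [frozenset({"b", "c"})]
rules = learn_dnf(positives, negatives, find_rule, covers)
```

Each pass removes the positive examples already explained, so later rules focus on the examples the earlier rules missed.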
8 CONCEPT LEARNING SUBDUE
- Evaluation criteria based on number of positive examples covered without covering negative examples
- Substructure value = 1 - Error
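A minimal sketch of the value computation follows. The slide gives only "1 - Error"; the definition of Error used here (positive examples missed plus negative examples covered, divided by the total number of examples) is an assumption, and the function name and parameters are hypothetical.

```python
def substructure_value(pos_covered, neg_covered, num_pos, num_neg):
    """Return 1 - Error, with Error assumed to be the fraction of misclassified examples."""
    if num_pos + num_neg == 0:
        raise ValueError("need at least one example")
    pos_not_covered = num_pos - pos_covered
    # Assumed definition: missed positives and covered negatives both count as errors.
    error = (pos_not_covered + neg_covered) / (num_pos + num_neg)
    return 1 - error


# 3 of 4 positives covered and 0 of 4 negatives covered: Error = 1/8.
value = substructure_value(pos_covered=3, neg_covered=0, num_pos=4, num_neg=4)
```

Under this assumed definition, a substructure that covers every positive example and no negative example scores 1, and one that does the reverse scores 0.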
9 CONCEPT LEARNING SUBDUE EXAMPLE
- Examples in graph format (chess domain)
[Figure: a) Board configuration; b) Graph representation]
10 PRELIMINARY RESULTS
- Comparison with FOIL and Progol
- Significance test p for the Vote domain
- Significance test p for the Chess domain
11 RELATED THEORY
- Galois lattice [reference?]
  - Subdue's search space is similar to the Galois lattice
  - Polynomial convergence results for the Galois lattice apply to Subdue
- PAC analysis of conceptual graphs [reference?]
  - Subdue's representation is a superset of conceptual graphs
  - PAC sample complexity results for conceptual graphs apply to Subdue
12 CONCLUSIONS
- Empirical results indicate Subdue is competitive with ILP systems
  - More empirical comparisons are necessary
- Theoretical results on Galois lattice and conceptual graphs apply to Subdue
  - Need to identify specific components of the theory directly applicable to Subdue
  - Expand theories where needed