1
Bayesian Network Learning using Transfer
Liang Yi Wang
  • December 2007
  • University of Amsterdam

2
Overview
  • Bayesian Network (Learning)
  • K2
  • DAG (directed acyclic graph) search
  • Using Transfer
  • K2-Transfer
  • DAG-Transfer
  • Experiment
  • Conclusion

3
Bayesian Networks
4
Bayesian Network Learning
  • Learning consists of two parts
  • Structure
  • Conditional probability tables

5
Conditional probability tables
  • Conditional probability tables can be estimated
    when the structure is known
  • Use the relative frequencies of the associated
    combinations of attribute values in the dataset
  • Avoid zero frequencies (e.g. by smoothing)
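The estimation described above can be sketched in a few lines. This is a minimal illustration, not the presentation's implementation: rows are assumed to be dicts keyed by attribute name, and Laplace smoothing (`alpha`) is one common way to avoid the zero frequencies the slide warns about.

```python
def estimate_cpt(node, parents, data, arity, alpha=1):
    """P(node | parents) from relative frequencies, with Laplace
    smoothing (alpha) so no combination gets a zero frequency."""
    r = arity[node]                          # number of values of `node`
    counts = {}
    for row in data:
        j = tuple(row[p] for p in parents)   # parent value configuration
        counts.setdefault(j, [0] * r)[row[node]] += 1
    cpt = {}
    for j, njk in counts.items():
        total = sum(njk) + alpha * r
        cpt[j] = [(c + alpha) / total for c in njk]
    return cpt
```

For example, `estimate_cpt("B", ["A"], data, {"A": 2, "B": 2})` returns, per observed value of A, a smoothed distribution over the values of B.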

6
Structure
  • General idea: rank all possible structures and
    select the best one
  • Problem: too many possible structures
  • n(2) = 3
  • n(3) = 25
  • n(10) ≈ 4.2 × 10^18
  • Solution: evaluation function + search strategy
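The counts above can be reproduced with Robinson's recurrence for the number of labelled DAGs on n nodes; a short sketch (not from the presentation):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    """Number of labelled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes with no incoming edges
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))
```

`num_dags(2)` is 3, `num_dags(3)` is 25, and `num_dags(10)` is about 4.2 × 10^18, matching the slide.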

7
Evaluation function
8
K2
  • Cooper &amp; Herskovits, 1992
  • General idea
  • An ordered list of attributes restricts the
    possible conditional dependencies
  • Attributes before the current attribute in the
    list are considered as possible parents
  • A max-parents parameter reduces overfitting

9
K2
  • Ordering 1, 2, 3, 4, 5, 6 (current attribute: 4)
  • Possible parents: 1, 2, 3
  • Best parent: 3, add edge 3 → 4
  • Possible parents: 1, 2
  • Best parent: 1, add edge 1 → 4
  • Possible parents: 2
  • Best parent: none, stop
  • Ordering 1, 2, 3, 4, 5, 6 (current attribute: 5)
  • Possible parents: 1, 2, 3, 4

10
K2 algorithm
  • Ordered attribute list
  • For every attribute in the list
  • Possible parents = attributes before the current
    attribute
  • Do
  • Parent of attribute = best attribute from
    possible parents
  • Remove best attribute from possible parents
  • While better structure found and max parents not
    reached
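The loop above can be sketched in Python. This is a minimal sketch, not the presentation's actual code: rows are assumed to be dicts keyed by attribute name, and the evaluation function is the log form of the Cooper-Herskovits (K2) metric.

```python
from math import lgamma

def k2_score(node, parents, data, arity):
    """Log of the Cooper-Herskovits (K2) metric for one node."""
    r = arity[node]
    counts = {}
    for row in data:
        j = tuple(row[p] for p in parents)   # parent value configuration
        counts.setdefault(j, [0] * r)[row[node]] += 1
    return sum(lgamma(r) - lgamma(sum(njk) + r)
               + sum(lgamma(c + 1) for c in njk)
               for njk in counts.values())

def k2(order, data, arity, max_parents=3):
    parents = {x: [] for x in order}
    for i, node in enumerate(order):
        candidates = set(order[:i])          # only attributes before `node`
        best = k2_score(node, [], data, arity)
        while candidates and len(parents[node]) < max_parents:
            cand = max(candidates,
                       key=lambda c: k2_score(node, parents[node] + [c],
                                              data, arity))
            s = k2_score(node, parents[node] + [cand], data, arity)
            if s <= best:
                break                        # no better structure found
            parents[node].append(cand)
            candidates.remove(cand)
            best = s
    return parents
```

On a toy dataset where attribute B always copies attribute A, `k2(["A", "B"], data, arity)` selects A as the only parent of B.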

11
K2
  • Pros
  • Fast algorithm
  • Cons
  • Ordering of attributes must be known
  • If the ordering is unknown, repeat K2 with
    different attribute orderings

12
DAG search
  • General idea
  • Greedy search through neighbour structures
  • A neighbour of a structure is a structure with
  • One extra link
  • One removed link
  • One link with changed direction
  • Parameter to reduce overfitting

13
DAG search algorithm
  • current = empty structure
  • Do
  • If best neighbour structure better than current
  • current = best neighbour structure
  • While better structure found
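A sketch of this greedy search, under the same assumptions as before (any scoring function over edge sets can be plugged in; the score used below in the usage note is a made-up toy, not the presentation's evaluation function):

```python
import itertools

def is_acyclic(edges, nodes):
    """Kahn-style check: repeatedly strip nodes without incoming edges."""
    remaining = set(nodes)
    while remaining:
        free = {v for v in remaining
                if not any(b == v and a in remaining for (a, b) in edges)}
        if not free:
            return False     # every remaining node has an incoming edge: cycle
        remaining -= free
    return True

def neighbours(edges, nodes):
    """All structures one edit away from `edges`."""
    for e in edges:
        yield edges - {e}                        # one removed link
        yield (edges - {e}) | {(e[1], e[0])}     # one link, direction changed
    for a, b in itertools.permutations(nodes, 2):
        if (a, b) not in edges and (b, a) not in edges:
            yield edges | {(a, b)}               # one extra link

def dag_search(nodes, score):
    """Move to the best acyclic neighbour while the score improves."""
    current, best = frozenset(), score(frozenset())
    while True:
        cands = [frozenset(c) for c in neighbours(current, nodes)
                 if is_acyclic(c, nodes)]
        if not cands:
            break
        cand = max(cands, key=score)
        if score(cand) <= best:
            break            # local maximum reached
        current, best = cand, score(cand)
    return current
```

With a toy score that rewards the edge (0, 1) and penalises extra edges, the search converges to exactly that edge.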

14
DAG search
  • Pros
  • No initial ordering of attributes
  • Cons
  • Slower than K2
  • Local maxima

15
Using Transfer
  • General idea
  • Give the learner extra information by using a
    transfer structure
  • The transfer structure is related to the new task
  • Same attributes
  • Shares some conditional dependencies
  • The resulting structure is influenced by both the
    transfer structure and the new dataset

16
K2-Transfer
  • Use the transfer structure to extract a possible
    attribute ordering
  • Give preference to edges in the transfer
    structure if they are good according to the dataset

17
K2-Transfer
  • EDUCATION
  • AGE
  • FNLWGT
  • RELATIONSHIP
  • CAPITAL_LOSS
  • HOURS_PER_WEEK
  • RACE
  • SALARY_GROUP

18
K2-Transfer algorithm
  • Order attribute list created by transferred edges
  • For every attribute in list
  • Possible parents attributes before current
    attribute
  • Do
  • Parent of attribute best attribute from
    possible parents that are also in the
    transferred edges
  • remove best attribute from possible parents
  • While better structure found and max parents not
    reached
  • Do
  • Parent of attribute best attribute from
    possible parents
  • remove best attribute from possible parents
  • While better structure found and max parents not
    reached
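The two-phase loop above can be sketched as follows. As before this is an illustrative sketch, not the author's code: rows are dicts keyed by attribute name, and the score is the log Cooper-Herskovits metric.

```python
from math import lgamma

def k2_score(node, parents, data, arity):
    """Log Cooper-Herskovits metric for one node given its parents."""
    r = arity[node]
    counts = {}
    for row in data:
        j = tuple(row[p] for p in parents)
        counts.setdefault(j, [0] * r)[row[node]] += 1
    return sum(lgamma(r) - lgamma(sum(njk) + r)
               + sum(lgamma(c + 1) for c in njk)
               for njk in counts.values())

def greedy_parents(node, candidates, parents, data, arity, max_parents):
    """Append best-scoring candidates to `parents` while the score improves."""
    best = k2_score(node, parents, data, arity)
    candidates = set(candidates) - set(parents)
    while candidates and len(parents) < max_parents:
        cand = max(candidates,
                   key=lambda c: k2_score(node, parents + [c], data, arity))
        s = k2_score(node, parents + [cand], data, arity)
        if s <= best:
            break
        parents.append(cand)
        candidates.remove(cand)
        best = s

def k2_transfer(order, transfer_edges, data, arity, max_parents=3):
    result = {}
    for i, node in enumerate(order):
        before = set(order[:i])
        parents = []
        # Phase 1: candidates restricted to parents in the transferred edges
        transferred = {a for (a, b) in transfer_edges if b == node} & before
        greedy_parents(node, transferred, parents, data, arity, max_parents)
        # Phase 2: continue with all remaining possible parents
        greedy_parents(node, before, parents, data, arity, max_parents)
        result[node] = parents
    return result
```

Transferred edges are only kept when the dataset supports them: an irrelevant transferred edge does not improve the score and is simply not added.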

19
K2-Transfer
  • Fast algorithm
  • Structures will be more similar to the transfer
    structure while still being a model of the dataset

20
DAG-Transfer
  • Start the DAG search algorithm from the transfer
    structure
  • Give simple neighbour structures preference over
    complex ones

21
DAG-Transfer algorithm
  • current transferred structure
  • Do
  • If best missing_edge_neighbour_structure better
    than current
  • current best neighbour_structure
  • else If best reverse_edge_neighbour_structure
    better than current
  • current best neighbour_structure
  • else If best add_edge_neighbour_structure
    better than current
  • current best neighbour_structure
  • While better structure found
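The tiered preference (removals first, then reversals, then additions) can be sketched like this; the score function is again a pluggable assumption, not the presentation's evaluation function:

```python
import itertools

def is_acyclic(edges, nodes):
    """Kahn-style check: repeatedly strip nodes without incoming edges."""
    remaining = set(nodes)
    while remaining:
        free = {v for v in remaining
                if not any(b == v and a in remaining for (a, b) in edges)}
        if not free:
            return False
        remaining -= free
    return True

def neighbour_tiers(edges, nodes):
    """Neighbours grouped by preference: removals, reversals, additions."""
    removes = [edges - {e} for e in edges]
    reverses = [(edges - {e}) | {(e[1], e[0])} for e in edges]
    adds = [edges | {(a, b)} for a, b in itertools.permutations(nodes, 2)
            if (a, b) not in edges and (b, a) not in edges]
    return [removes, reverses, adds]

def dag_transfer(nodes, transfer_edges, score):
    current = frozenset(transfer_edges)
    best = score(current)
    improved = True
    while improved:
        improved = False
        for tier in neighbour_tiers(current, nodes):
            cands = [frozenset(c) for c in tier if is_acyclic(c, nodes)]
            if not cands:
                continue
            cand = max(cands, key=score)
            if score(cand) > best:
                current, best, improved = cand, score(cand), True
                break   # prefer the simpler edit; restart from removals
    return current
```

Starting from a transfer structure with one irrelevant edge, the search drops that edge before considering any additions.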

22
DAG-Transfer
  • The learner will first remove edges that are not
    relevant to the dataset
  • Then it adds edges that are relevant to the
    dataset
  • This reduces the chance of a bad choice at the
    beginning

23
Experiment
  • Dataset of people and their salaries
  • Contains 30,000 instances
  • 15 attributes
  • Test the transfer learners by splitting the
    dataset by occupation

24
Dataset
25
Dataset
26
Evaluation
  • Classification test: percentage of correctly
    classified instances when an attribute is unknown
  • Sub-domain classification
  • General domain classification
  • Structure overlap in sub-domains

27
Hypothesis
  • The transfer learners should find a model that
    is similar to the transfer model, while still
    being an accurate model of the new dataset

28
Result K2 (Transfer)
29
Result DAG (Transfer)
30
Result K2 (Transfer)
31
Result DAG (Transfer)
32
Result structure overlap
33
Summary
34
Hypothesis
  • The transfer learners should find better network
    models for small training sets when using a good
    transfer structure

35
Result K2 (Transfer)
36
Result DAG (Transfer)
37
Result structure overlap
38
Hypothesis
  • The transfer algorithms should be faster or at
    least not slower than their original versions

39
Result K2 (Transfer)
40
Result DAG (Transfer)
41
Conclusion
  • Classification performance of the default
    learners and the transfer learners is similar
  • If the purpose is to classify attribute values,
    then Bayesian network learning using transfer has
    no added value
  • The influence of transfer decreases gradually as
    the dataset becomes larger
  • Transfer learners try to find a structure that
    is similar to the transfer structure while still
    being a model of the new dataset
  • Useful for comparing similar domains

42
Conclusion
(Figure: causal model and diagnostic model)