1
CS 782 Machine Learning
5. Evaluation of Empirical Inductive Learners
Prof. Gheorghe Tecuci
Learning Agents Laboratory, Computer Science
Department, George Mason University
2
Overview
Introduction
Computational learning theory
Empirical evaluation: Single partitioning
Empirical evaluation: Resampling
Recommended reading
3
Introduction
  • Suppose we have collected a body of training
    examples, adopted a learning bias, implemented
    the learning algorithm, executed the algorithm,
    and learned the concept c represented by the
    examples.
  • There are several questions we may ask about
    this process:
  • Can we believe that we have learned the right
    concept?
  • What is the likelihood that c will correctly
    classify previously unseen examples?
  • How can we have confidence that the concept c
    is approximately correct?
  • There are two possible answers: a theoretical
    one and an experimental one.

4
The Computational Learning Theory
The Computational Learning Theory, pioneered by
Valiant, is concerned with finding theoretical
answers to the previous questions. In this
theory, learning is viewed as function
reconstruction:

Given a set of input-output pairs <x, f(x)> for a
boolean function f: {0,1}^n -> {0,1}, determine
an expression f1 that provides a good
approximation of f.

The Valiant framework provides bounds on the
number of training examples required for a given
bias, in order to have high confidence that the
learned hypothesis f1 is approximately correct.
That is, how many training examples would one
need so that the probability that the error rate
of f1 is less than ε is greater than 1 - δ:

Probability(error rate of f1 < ε) > 1 - δ
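As a concrete illustration (not on the original
slide), a standard PAC bound for a consistent
learner over a finite hypothesis space H states
that m >= (1/ε)(ln|H| + ln(1/δ)) training
examples suffice. A minimal Python sketch,
assuming this bound:

import math

def pac_sample_bound(hypothesis_space_size, epsilon, delta):
    # Sufficient number of examples for a consistent learner over a
    # finite hypothesis space H to be probably approximately correct:
    # m >= (1/epsilon) * (ln|H| + ln(1/delta))
    return math.ceil((1.0 / epsilon) *
                     (math.log(hypothesis_space_size) + math.log(1.0 / delta)))

# Example: conjunctions over n = 10 boolean attributes; each attribute
# appears positive, negated, or not at all, so |H| = 3^10.
print(pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05))  # 140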
5
The Computational Learning Theory (cont.)
This style of analysis is called probably
approximately correct (PAC) learning. The basic
idea is to analyze the expressiveness of the
hypothesis space:

If a restricted hypothesis space H is very small,
then it is unlikely that a learning algorithm
could by chance succeed in finding a hypothesis
f1 ∈ H consistent with the training examples.
Therefore, it is more likely that f1, if it is
found, is a good approximation of the correct
hypothesis.
6
The Computational Learning Theory (cont.)
The theoretical analysis has provided insight
into the relationship between:
- the number of training examples,
- the bias of the learning algorithm, and
- the confidence that we can have in the
  hypothesis f1 produced by the algorithm.

This analysis has been successful only for simple
learning algorithms. Most applied work in machine
learning employs experimental techniques for
determining the correctness of f1.
7
Overview
Introduction
Computational learning theory
Empirical evaluation: Single partitioning
Empirical evaluation: Resampling
Recommended reading
8
Simple partitioning: the holdout method

1. The available examples are randomly broken
   into two disjoint groups: the training set and
   the testing set.
2. The concept is learned by using only the
   examples from the training set.
3. The learned concept is then used to classify
   the examples from the testing set.
4. The obtained results are compared with the
   correct classifications to produce an error
   rate.
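A minimal sketch of the holdout procedure,
assuming scikit-learn, a decision-tree learner,
and the iris data set (none of which the slide
specifies):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Steps 1-2: randomly partition the examples into disjoint training
# and testing sets, then learn from the training set only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
learner = DecisionTreeClassifier().fit(X_train, y_train)

# Steps 3-4: classify the test examples and compare with the correct
# classifications to produce an error rate.
error_rate = 1.0 - learner.score(X_test, y_test)
print(error_rate)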
9
Discussion
How does the number of examples affect the result
of the evaluation?
How does the distribution of examples affect the
result of the evaluation?
How to evaluate if there are very few examples?
How to reuse examples?
10
Overview
Introduction
Computational learning theory
Empirical evaluation: Single partitioning
Empirical evaluation: Resampling
Recommended reading
11
Resampling: the leave-one-out method

Let us consider that the number of available
examples is n. A concept is learned from n-1
examples and is tested on the remaining example.
This is repeated n times, each time leaving out a
different example. The error rate is the total
number of errors over the n single-example tests,
divided by n.
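A sketch of leave-one-out, again assuming
scikit-learn and the same illustrative learner
and data set:

from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
n = len(y)

errors = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    # Learn from n-1 examples, test on the single remaining one.
    model = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
    errors += int(model.predict(X[test_idx])[0] != y[test_idx][0])

print(errors / n)  # leave-one-out error rate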
12
Discussion
How is the error estimate likely to compare with
single partitioning?
What about the repeatability of the experimental
results? Why is this important?
What is a likely problem with the leave-one-out
method, and how could it be avoided?
13
Resampling: the cross-validation method

In k-fold cross-validation, the cases are
randomly divided into k (usually 10) mutually
disjoint sets of approximately equal size (of at
least 30 examples each). The concept is learned
from the examples in k-1 sets and is tested on
the examples from the remaining set. This is
repeated k times, once for each set (i.e., each
set is used exactly once as the test set). The
average of the error rates over all k sets is the
cross-validated error rate.
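A sketch of 10-fold cross-validation under the
same scikit-learn assumptions; KFold with
shuffling matches the random division described
above:

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

fold_error_rates = []
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in kfold.split(X):
    # Learn from the examples in k-1 folds, test on the remaining fold.
    model = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
    fold_error_rates.append(1.0 - model.score(X[test_idx], y[test_idx]))

# The average of the k fold error rates is the cross-validated error rate.
print(sum(fold_error_rates) / len(fold_error_rates))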
14
Resampling vs. single partitioning
Resampling is a powerful idea. With a single
train-and-test partition, too few cases in the
training group can lead to the learning of a poor
concept, while too few test cases can lead to
erroneous error estimates. Resampling allows for
more accurate estimates of the error rates while
training on most cases. Resampling also allows
the analysis conditions to be duplicated in
future experiments on the same data.
15
Discussion
How could we compare two learning
algorithms? What can be said about the result of
the comparison?
16
Other types of experiments
  • Determine other characteristics of the learning
    methods:
  • the speed of learning;
  • the asymptotic behavior, and the number of
    examples needed to approximate this behavior
    (see the learning-curve sketch after this
    list);
  • predictive accuracy versus concept complexity;
  • the influence of different types of noise on
    the predictive accuracy;
  • the influence of different biases on the
    predictive accuracy;
  • etc.
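
A minimal learning-curve sketch for the second
item, measuring predictive accuracy on a fixed
test set as the training set grows (scikit-learn,
the learner, and the data set are again
illustrative assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Accuracy on a fixed test set as the number of training examples
# grows; where the curve flattens approximates the asymptotic behavior.
for m in (10, 20, 40, 80, len(y_train)):
    model = DecisionTreeClassifier().fit(X_train[:m], y_train[:m])
    print(m, model.score(X_test, y_test))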

17
Recommended reading
Mitchell T.M., Machine Learning, Chapter 5:
Evaluating Hypotheses, pp. 128-153, McGraw-Hill,
1997.
Weiss S.M., Kapouleas I., An Experimental
Comparison of Pattern Recognition, Neural Nets,
and Machine Learning Classification Methods, in
Readings in Machine Learning.
Kibler D., Langley P., Machine Learning as an
Experimental Science, in Readings in Machine
Learning.