Machine Learning with WEKA Introduction and Preprocessing - PowerPoint PPT Presentation

Slides: 22
Provided by: kingso
Transcript and Presenter's Notes

1
Machine Learning with WEKA: Introduction and Preprocessing
2
Outline
  • How to preprocess data.
  • How to add your own algorithm to Weka.
  • How to use the test environment in Weka.

3
Resources
  • UCI Machine Learning Repository:
    ftp://ftp.ics.uci.edu/pub/machine-learning-databases/
  • Weka: http://www.cs.waikato.ac.nz/ml/weka/
  • http://www.cs.unb.ca/profs/hzhang/CS6735
  • Tutorial: http://prdownloads.sourceforge.net/weka/weka.ppt

4
WEKA: the software
  • Machine learning/data mining software written in
    Java
  • Main features:
  • Comprehensive set of data pre-processing tools,
    learning algorithms and evaluation methods
  • Graphical user interfaces (incl. data
    visualization)
  • Environment for comparing learning algorithms

5
Explorer: pre-processing the data
  • Data can be imported from a file in various
    formats: ARFF, CSV, C4.5, binary
  • Data can also be read from a URL or from an SQL
    database (using JDBC)
  • Pre-processing tools in WEKA are called "filters"
  • WEKA contains filters for discretization,
    normalization, resampling, attribute selection,
    and transforming and combining attributes
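To make two of the filter types above concrete, here is a minimal sketch of what normalization and discretization compute. It is plain Python, not WEKA's own filter code; the function names `normalize` and `discretize`, the equal-width binning strategy, and the choice of three bins are assumptions made for illustration. The sample values are the cholesterol readings from the heart-disease example in this deck, and missing values are not handled.

```python
def normalize(values):
    """Rescale numeric values linearly to the range [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins=3):
    """Equal-width binning: map each value to a bin index 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum would otherwise fall into bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


cholesterol = [233, 286, 229]  # hypothetical sample column
print(normalize(cholesterol))
print(discretize(cholesterol, bins=3))
```

Both functions turn a numeric column into a new column of the same length, which is the general shape of an attribute filter: data in, transformed data out, with the learning algorithm left untouched.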

6
WEKA only deals with flat files
  • The data must be converted to ARFF format before
    applying any algorithm.
  • The dataset's name: @relation
  • The attribute information: @attribute
  • The data section begins with @data
  • Data: a list of instances, with the attribute
    values separated by commas.
  • By default, the class is the last attribute in
    the ARFF file.
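The conversion step described above can be sketched in a few lines. This is a simplified illustration, not WEKA's own converter: the function name `csv_to_arff` is hypothetical, it assumes the CSV has a header row, that `?` marks a missing value, and that names and values contain no spaces or commas that would need quoting. A column is declared numeric when every known value parses as a number, and nominal otherwise.

```python
import csv
import io


def is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for col, name in enumerate(header):
        # Ignore missing values ("?") when deciding the attribute type.
        known = [row[col] for row in data if row[col] != "?"]
        if known and all(is_number(v) for v in known):
            lines.append("@attribute %s numeric" % name)
        else:
            # Nominal attribute: distinct values in order of first appearance.
            labels = list(dict.fromkeys(known))
            lines.append("@attribute %s {%s}" % (name, ", ".join(labels)))
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


sample = "age,sex,class\n63,?,not_present\n67,male,present\n38,female,not_present\n"
print(csv_to_arff(sample, "heart-disease-simplified"))
```

Because the class column is written last, the resulting file follows the default convention that the last attribute is the class.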

7
Numeric attribute and Missing Value
    @relation heart-disease-simplified
    @attribute age numeric
    @attribute sex {female, male}
    @attribute chest_pain_type {typ_angina, asympt, non_anginal, atyp_angina}
    @attribute cholesterol numeric
    @attribute exercise_induced_angina {no, yes}
    @attribute class {present, not_present}
    @data
    63,?,typ_angina,233,no,not_present
    67,male,asympt,286,yes,present
    67,male,asympt,229,yes,present
    38,female,non_anginal,?,no,not_present
    ...

Callouts on the slide: the dataset's name (@relation), the attribute information (@attribute), the data (@data)
8
Numeric attribute and Missing Value
  • Same ARFF listing as on the previous slide, with three callouts:
  • Numeric attribute: age and cholesterol are declared with the
    keyword numeric.
  • Nominal attribute: sex, chest_pain_type, exercise_induced_angina
    and class each list their allowed values.
  • Missing value: written as ? in the data section, as in
    63,?,typ_angina,233,no,not_present (sex unknown) and
    38,female,non_anginal,?,no,not_present (cholesterol unknown).
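The three callouts can be seen in action with a small reader for this file. The sketch below is a simplified illustration and not WEKA's loader: the function name `parse_arff` is hypothetical, and it assumes one declaration per line, unquoted names and values, and only numeric and nominal attributes. Numeric values become floats, nominal values stay strings, and `?` becomes `None`.

```python
def parse_arff(text):
    """Parse a small ARFF document into (relation, attributes, instances)."""
    relation, attributes, instances = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = []
            for (name, kind), value in zip(attributes, values):
                if value == "?":
                    row.append(None)          # missing value
                elif kind == "numeric":
                    row.append(float(value))  # numeric attribute
                else:
                    row.append(value)         # nominal attribute
            instances.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # Nominal: keep the list of allowed values.
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, instances


ARFF = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex {female, male}
@attribute chest_pain_type {typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina {no, yes}
@attribute class {present, not_present}
@data
63,?,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""

relation, attributes, instances = parse_arff(ARFF)
class_index = len(attributes) - 1  # by default the class is the last attribute
print(relation, attributes[class_index][0], instances[0])
```

Keeping missing values as `None` rather than dropping the row leaves the decision about how to treat them to whatever step comes next.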
15
Programming in Weka
Weka Test Environment
16
Add classifier into Weka
  • Add your classifier class to
    \weka\gui\GenericObjectEditor.props
  • For example:
  • weka.classifiers.trees.J48,\
  • weka.classifiers.YourOwnClassifier,\
  • weka.classifiers.bayes.NaiveBayes,\

17
Performing experiments
  • Experimenter makes it easy to compare the
    performance of different learning schemes
  • For classification and regression problems
  • Results can be written to a file or a database
  • Evaluation options: cross-validation, learning
    curve, hold-out
  • Can also iterate over different parameter
    settings
  • Significance testing built in!
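As a rough picture of what the cross-validation option does, the sketch below splits a dataset into k folds and scores a trivial majority-class baseline on each held-out fold. It is plain Python and not the Experimenter's implementation: the function names, the fixed seed, the unstratified shuffle, and the synthetic 30/70 label list are all assumptions for illustration, and it expects at least k instances.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=1):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def majority_class(labels):
    """A trivial learning scheme: always predict the most common label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate(labels, k=10, seed=1):
    """Return the per-fold accuracy of the majority-class baseline."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for test in folds:
        held_out = set(test)
        # Train on everything except the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        prediction = majority_class(train)
        correct = sum(1 for i in test if labels[i] == prediction)
        accuracies.append(correct / len(test))
    return accuracies


# Hypothetical class labels: 30 "present" and 70 "not_present".
labels = ["present"] * 30 + ["not_present"] * 70
scores = cross_validate(labels, k=10)
print(sum(scores) / len(scores))
```

Comparing two learning schemes then amounts to running both on the same folds and testing whether the difference between their per-fold scores is significant.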

21
END OF PART ONE