Title: Properties of Machine Learning Applications for Use in Metamorphic Testing


1
Properties of Machine Learning Applications for
Use in Metamorphic Testing
  • Chris Murphy, Gail Kaiser, Lifeng Hu, Leon Wu
  • Columbia University

2
Introduction
  • We are investigating the quality assurance of
    Machine Learning (ML) applications
  • Machine Learning applications fall into a class
    of software for which there is no reliable test
    oracle

3
Introduction
  • Previously we have investigated approaches to
    testing such applications by considering
    properties of their data sets and by using random
    testing
  • In this work, we seek to adapt Metamorphic
    Testing [Chen 98] to these applications and
    consider their Metamorphic Properties

4
Contribution
  • Our contribution is a set of Metamorphic
    Properties that define the expected relationships
    between inputs and outputs, so that Metamorphic
    Testing can be used as a general approach to
    testing machine learning applications

5
Overview
  • Background
  • Testing Approach
  • Findings and Results
  • Future Work and Conclusion

6
Metamorphic Testing
  • General technique for creating follow-up test
    cases based on existing ones, particularly those
    that have not revealed any failure
  • [Chen 98, Gotlieb COMPSAC'03, Chen STEP'04, Zhou
    ISFST'04]
  • Use a function's Metamorphic Properties to
    predict the output for a particular input, given
    the known output for another input
  • For example, if we know sin(x) = y, then we know
    sin(x + 2π) = y and sin(-x) = -y
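
The sine example can be sketched as an executable check. This is a minimal illustration, not taken from the presentation: the function name check_sin_properties and the tolerance are assumptions, and floating-point results are compared approximately rather than exactly.

```python
import math

def check_sin_properties(x, tol=1e-9):
    """Check two metamorphic properties of sin without knowing the true value.

    The output for x is treated as the "known" output; the follow-up inputs
    x + 2*pi and -x must then produce y and -y respectively.
    """
    y = math.sin(x)
    periodic_ok = math.isclose(math.sin(x + 2 * math.pi), y, abs_tol=tol)
    odd_ok = math.isclose(math.sin(-x), -y, abs_tol=tol)
    return periodic_ok and odd_ok

# Arbitrary source test cases; no oracle for sin(x) itself is needed.
inputs = [0.0, 0.5, 1.0, 2.5, -3.0, 10.0]
results = {x: check_sin_properties(x) for x in inputs}
print(results)
```

A failing check would indicate a defect even though the correct value of sin(x) was never supplied.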

7
Related Work
  • Applying metamorphic testing to situations in
    which there is no test oracle [Chen IST'02]
  • There has been much research into applying
    Machine Learning techniques to software testing,
    but not much the other way around
  • Testing of intrusion detection systems has
    typically addressed quantitative measurements but
    does not seek to ensure that the implementation
    is free of defects

8
Machine Learning Fundamentals
  • Data sets consist of a number of examples, each
    of which has attributes and a label
  • In the first phase (training), a model is
    generated that attempts to generalize how
    attributes relate to the labels (if labels exist)
  • In the second phase, the model is applied to a
    previously-unseen data set with unknown labels to
    produce a classification (or, in some cases, a
    ranking)

9
Sample Data Set
  • For supervised machine learning

Each row is one example; the first seven values are
its attributes and the last value is its label.

27,81,88,59,42,16,88, 0
82, 6,51,47, 5, 4, 1, 0
22,72,11,84,96,24,44, 1
 4,77,91,86,89,77,61, 1
76,11, 4,51,43, 2,79, 0
 6,33,44,18,52,63,94, 0
77,36,91,81,47, 3,85, 1
39,17,15, 2,90,70,13, 0
 8,58,42,41,74,87,68, 1

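
As a rough sketch, the sample values can be loaded into separate attribute and label lists. The layout is an assumption read off the slide (nine examples, seven attributes each, followed by a 0/1 label); the names RAW and parse are hypothetical.

```python
# Sample values from the slide, assumed to be one example per row.
RAW = """\
27,81,88,59,42,16,88, 0
82, 6,51,47, 5, 4, 1, 0
22,72,11,84,96,24,44, 1
 4,77,91,86,89,77,61, 1
76,11, 4,51,43, 2,79, 0
 6,33,44,18,52,63,94, 0
77,36,91,81,47, 3,85, 1
39,17,15, 2,90,70,13, 0
 8,58,42,41,74,87,68, 1
"""

def parse(raw):
    """Split each row into its attribute values and its trailing label."""
    attributes, labels = [], []
    for line in raw.strip().splitlines():
        values = [int(v) for v in line.split(",")]  # int() ignores padding spaces
        attributes.append(values[:-1])
        labels.append(values[-1])
    return attributes, labels

attributes, labels = parse(RAW)
print(len(attributes), "examples,", len(attributes[0]), "attributes each")
print("labels:", labels)
```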
10
Applications Investigated
  • MartiRank
      • Specifically designed for potential future
        experimental use in predicting impending
        electrical device failures by ranking them
        according to likelihood of failure
      • Seeks to find the combination of segmenting and
        sorting the data that produces the best result
  • Support Vector Machines (SVM)
      • Seeks to find a hyperplane that separates
        examples from different classes
      • SVM-Light has a ranking mode based on the
        distance from the hyperplane
  • PAYL
      • Anomaly-based intrusion detection system (IDS)
      • Builds a model of normal network traffic based
        on byte distribution, and reports any anomalies
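
Ranking by distance from a hyperplane can be illustrated as follows. This is only a sketch of the general idea, not SVM-Light's code: the weight vector w, the offset b, and the example points are made-up values, and a real SVM would learn w and b from training data.

```python
import math

def signed_distance(x, w, b):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def rank_by_distance(examples, w, b):
    """Return example indices ordered from farthest on the positive side
    to farthest on the negative side."""
    return sorted(range(len(examples)),
                  key=lambda i: signed_distance(examples[i], w, b),
                  reverse=True)

# Hypothetical hyperplane and examples, for illustration only.
w, b = [1.0, 2.0], -1.0
examples = [[1, 1], [3, 0.5], [0, 4], [-2, -1]]
ranking = rank_by_distance(examples, w, b)
print(ranking)
```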

11
Approach
  • We previously tested such applications by analysis
    of the data sets and algorithms, and by using
    equivalence partitions to guide random testing
  • In this work, we use our knowledge of MartiRank
    to devise a set of Metamorphic Properties, and
    then see if they also apply to SVM and PAYL
  • We then use these properties to guide testing of
    these applications

12
MartiRank Metamorphic Properties
  • Additive
      • If each value in the data set is increased by a
        constant, the final ranking should be unchanged
  • Multiplicative
      • If each value in the data set is multiplied by a
        positive constant, the final ranking should be
        unchanged
  • Permutative
      • If the order of the data is permuted, the final
        ranking should be unchanged (assuming distinct
        values in the data set)
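
The three properties above can be checked mechanically. The sketch below uses a stand-in ranker that orders examples by the sum of their attribute values; it is an assumption chosen for brevity and is not the MartiRank algorithm. The example ids and values are also made up.

```python
import random

def rank(dataset):
    """Toy ranker (not MartiRank): order example ids by the sum of their
    attribute values, highest first."""
    ordered = sorted(dataset, key=lambda ex: sum(ex[1]), reverse=True)
    return [ex_id for ex_id, _ in ordered]

def transform(dataset, fn):
    """Apply fn to every attribute value, keeping ids and order."""
    return [(ex_id, [fn(v) for v in attrs]) for ex_id, attrs in dataset]

# Hypothetical data set: (id, attributes); all sums are distinct.
data = [("a", [27, 81, 88]), ("b", [82, 6, 51]),
        ("c", [22, 72, 11]), ("d", [4, 77, 91])]

baseline = rank(data)

# Additive: add a constant to every value.
additive_ok = rank(transform(data, lambda v: v + 10)) == baseline

# Multiplicative: multiply every value by a positive constant.
multiplicative_ok = rank(transform(data, lambda v: v * 3)) == baseline

# Permutative: present the same examples in a different order.
shuffled = data[:]
random.Random(0).shuffle(shuffled)
permutative_ok = rank(shuffled) == baseline

print(baseline, additive_ok, multiplicative_ok, permutative_ok)
```

Because the sort is stable, tied sums would be ordered by input position, which is why the permutative property is stated only for distinct values.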

13
MartiRank Metamorphic Properties
  • Invertive
      • If each value in the data set is multiplied by a
        negative constant, the final ranking should be in
        the reverse order
  • Inclusive
      • In the testing phase, if the model is already
        known, it should be possible to create an example
        in the testing data such that it is guaranteed to
        be at the top of the ranking
  • Exclusive
      • If an example is removed from the testing data,
        the final ranking of the remaining examples
        should be unchanged
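
These three properties can be sketched the same way. The ranker below is again a stand-in (ordering by the sum of attribute values), not MartiRank, and the data values are made up. For this toy "model", an example whose every attribute exceeds the current maximum is guaranteed to rank first.

```python
def rank(dataset):
    """Toy ranker (not MartiRank): order example ids by the sum of their
    attribute values, highest first."""
    ordered = sorted(dataset, key=lambda ex: sum(ex[1]), reverse=True)
    return [ex_id for ex_id, _ in ordered]

def transform(dataset, fn):
    """Apply fn to every attribute value, keeping ids and order."""
    return [(ex_id, [fn(v) for v in attrs]) for ex_id, attrs in dataset]

# Hypothetical testing data: (id, attributes); all sums are distinct.
data = [("a", [27, 81, 88]), ("b", [82, 6, 51]),
        ("c", [22, 72, 11]), ("d", [4, 77, 91])]
baseline = rank(data)

# Invertive: a negative multiplier should reverse the ranking.
invertive_ok = rank(transform(data, lambda v: -2 * v)) == baseline[::-1]

# Inclusive: knowing how the toy ranker scores, build an example that
# must come out on top.
n_attrs = len(data[0][1])
top_attrs = [max(attrs[j] for _, attrs in data) + 1 for j in range(n_attrs)]
inclusive_ok = rank(data + [("new", top_attrs)])[0] == "new"

# Exclusive: removing one example must not reorder the others.
removed = "d"
remaining = [ex for ex in data if ex[0] != removed]
exclusive_ok = rank(remaining) == [i for i in baseline if i != removed]

print(invertive_ok, inclusive_ok, exclusive_ok)
```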

14
Testing MartiRank
  • Its invertive property should hold for the labels
    in the training data, too
  • Multiplying the labels by -1 should yield a model
    that, when applied to the same testing data, will
    result in the reverse ordering
  • Negative labels were not considered by the
    developer and a defect was revealed through
    Metamorphic Testing
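
The label-negation test can be illustrated with a toy trainable ranker. Everything below is hypothetical: the learner is a simple label-weighted linear scorer, not MartiRank, and the clamp_negative flag simulates one way a program might mishandle negative labels. It is not a description of the actual defect.

```python
def train(examples, clamp_negative=False):
    """Toy learner: weight each attribute by how its (centered) values
    co-vary with the labels. Integer arithmetic keeps results exact."""
    n = len(examples)
    dim = len(examples[0][0])
    totals = [sum(attrs[j] for attrs, _ in examples) for j in range(dim)]
    weights = [0] * dim
    for attrs, label in examples:
        if clamp_negative:
            label = max(label, 0)  # hypothetical defect: negatives ignored
        for j in range(dim):
            weights[j] += label * (n * attrs[j] - totals[j])
    return weights

def rank(weights, testing):
    """Order testing example ids by weighted score, highest first."""
    def score(ex):
        return sum(w * v for w, v in zip(weights, ex[1]))
    return [ex_id for ex_id, _ in sorted(testing, key=score, reverse=True)]

def invertive_on_labels(training, testing, **kwargs):
    """Negating all training labels should reverse the final ranking."""
    negated = [(attrs, -label) for attrs, label in training]
    original = rank(train(training, **kwargs), testing)
    follow_up = rank(train(negated, **kwargs), testing)
    return follow_up == original[::-1]

# Made-up training and testing data.
training = [([1, 5], 1), ([2, 3], 0), ([4, 1], 0), ([3, 6], 1)]
testing = [("p", [1, 4]), ("q", [3, 1]), ("r", [2, 6])]

correct_ok = invertive_on_labels(training, testing)
defective_ok = invertive_on_labels(training, testing, clamp_negative=True)
print(correct_ok, defective_ok)
```

The check passes for the plain learner and fails for the variant that ignores negative labels, which is the kind of discrepancy the property is meant to expose.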

15
Applying Approach to SVM
  • SVM exhibits all six Metamorphic Properties
  • A defect was found in SVM-Light by using its
    permutative property
      • Permuting the input data led to different models
        (and then different rankings)
      • Caused by chunking data for use by an
        approximating variant of the optimization
        algorithm
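
How chunking can break the permutative property is sketched below with a deliberately simple learner. This is not SVM-Light's optimization: the model is just the difference between class centroids, and the chunk parameter is a hypothetical approximation that looks only at the first few training examples.

```python
from fractions import Fraction

def centroid(rows, dim):
    """Exact per-attribute mean of rows (zeros if rows is empty)."""
    if not rows:
        return [Fraction(0)] * dim
    return [Fraction(sum(r[j] for r in rows), len(rows)) for j in range(dim)]

def train(examples, chunk=None):
    """Toy model: positive-class centroid minus negative-class centroid.

    With chunk set, only the first `chunk` examples are used, standing in
    for an approximation that depends on input order."""
    used = examples if chunk is None else examples[:chunk]
    dim = len(examples[0][0])
    pos = centroid([attrs for attrs, y in used if y == 1], dim)
    neg = centroid([attrs for attrs, y in used if y == 0], dim)
    return tuple(p - n for p, n in zip(pos, neg))

def permutation_invariant(examples, permuted, **kwargs):
    """Permutative property: the model must not depend on example order."""
    return train(examples, **kwargs) == train(permuted, **kwargs)

# Made-up training data and one permutation of it.
data = [([1, 5], 1), ([2, 3], 0), ([4, 1], 0), ([3, 6], 1)]
permuted = list(reversed(data))

full_ok = permutation_invariant(data, permuted)
chunked_ok = permutation_invariant(data, permuted, chunk=2)
print(full_ok, chunked_ok)
```

The full computation is order-independent, while the chunked variant yields a different model after permutation, so the metamorphic check flags it.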

16
Applying Approach to PAYL
  • PAYL exhibits all six Metamorphic Properties,
    even though it is unsupervised ML
  • Two defects were found by using its exclusive
    property
      • Removing a value from the training data did not
        cause it to be considered anomalous later on
      • It also caused other values to be considered
        anomalous
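
The expected behavior behind the exclusive property can be sketched with a toy detector. This is not PAYL's algorithm: the model here is a plain count of byte values seen in training, a payload is flagged if it contains any unseen byte, and the payloads are made up.

```python
from collections import Counter

def train(payloads):
    """Toy model of normal traffic: how often each byte value was seen."""
    model = Counter()
    for payload in payloads:
        model.update(payload)  # iterating bytes yields integer byte values
    return model

def is_anomalous(model, payload):
    """Flag a payload that contains any byte never seen in training."""
    return any(model[byte] == 0 for byte in payload)

# Made-up training payloads.
training = [b"GET /index", b"GET /home", b"POST /login"]
removed = training[2]
reduced = training[:2]

full_model = train(training)
reduced_model = train(reduced)

# Source test case: everything in the training data is normal.
before = [is_anomalous(full_model, p) for p in training]

# Follow-up: the removed payload should now be anomalous, and the
# payloads that stayed in the training data should be unaffected.
removed_flagged = is_anomalous(reduced_model, removed)
others_flagged = [is_anomalous(reduced_model, p) for p in reduced]

print(before, removed_flagged, others_flagged)
```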

17
Future Work and Conclusion
  • We have identified six Metamorphic Properties
    that we believe exist in many machine learning
    applications
      • Additive, multiplicative, permutative,
        invertive, inclusive, and exclusive
  • These properties were used to find new defects in
    the ML applications of interest
  • Further investigation could involve applying
    these properties to other, larger ML
    applications, and looking to classify other
    properties

18
Properties of Machine Learning Applications for
Use in Metamorphic Testing
  • Leon Wu
  • leon_at_cs.columbia.edu
  • Columbia University