Anti-Learning Signature in Biological Classification

1
Anti-Learning Signature in Biological Classification
  • Adam Kowalczyk
  • Statistical Machine Learning
  • NICTA, Canberra
  • (Adam.Kowalczyk_at_nicta.com.au)

National ICT Australia Limited is funded and supported by [sponsor logos not captured in the transcript]
2
Definition of anti-learning
Systematically: off-training accuracy < random guessing accuracy < training accuracy
3
Learning and anti-learning modes of supervised classification
4
Anti-learning in Low Dimensions
[Figure: four points labelled -1, 1, 1, -1]
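A minimal sketch of how anti-learning can arise in two dimensions. It assumes the four labelled points form an XOR-like layout (the transcript keeps only the labels, not the coordinates) and uses a nearest-centroid rule as a stand-in classifier; neither choice is confirmed by the slides. Under these assumptions, every leave-one-out fold fits its three training points perfectly and misclassifies the held-out point.

```python
# Assumption: an XOR-like layout; the transcript only preserves the labels.
POINTS = [((-1.0, -1.0), 1), ((1.0, 1.0), 1), ((-1.0, 1.0), -1), ((1.0, -1.0), -1)]


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def predict(train, x):
    """Nearest-centroid rule: pick the label whose class mean is closest to x."""
    pos = centroid([p for p, label in train if label == 1])
    neg = centroid([p for p, label in train if label == -1])
    return 1 if sq_dist(x, pos) < sq_dist(x, neg) else -1


def accuracy(train, test):
    """Fraction of test points classified correctly by a model fit on train."""
    return sum(predict(train, x) == label for x, label in test) / len(test)


def leave_one_out(data):
    """Return (mean training accuracy, off-training accuracy) over all folds."""
    train_acc, hits = 0.0, 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        train_acc += accuracy(train, train)
        hits += predict(train, x) == label
    return train_acc / len(data), hits / len(data)


xor_train_acc, xor_off_acc = leave_one_out(POINTS)
print(xor_train_acc, xor_off_acc)
```

Training accuracy above chance combined with off-training accuracy below chance is the pattern the definition slide describes.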
5
Anti-learning in Cancer Genomics
6
Predicting response to chemoradiotherapy (CRT) in oesophageal carcinoma
7
Predicting CRT response
cDNA microarray data (from biopsies)
(LOO test)
8
Prediction of CRT response (using data as of Nov. 2004)
9
Anti-learning in Classification of Genes in Yeast

10
KDD02 task: identification of Aryl Hydrocarbon Receptor genes (AHR data)
11
Anti-learning in AHR-data set from KDD Cup 2002
Average over 100 trials; random training : test splits of 66 : 34
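The evaluation protocol on this slide (repeated random training/test splits, averaged over trials) can be sketched as follows. The function name `repeated_split_accuracy`, the toy one-dimensional data, and the centroid classifier are all illustrative assumptions; the talk's actual classifier and the AHR data are not reproduced here, and accuracy is used only as a simple stand-in metric.

```python
import random


def centroid_rule_1d(train, x):
    """Toy 1-D nearest-centroid classifier (an assumption, not from the talk)."""
    pos = [v for v, label in train if label == 1]
    neg = [v for v, label in train if label == -1]
    if not pos or not neg:  # degenerate split: only one class was seen
        return 1 if pos else -1
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    return 1 if abs(x - mean_pos) < abs(x - mean_neg) else -1


def repeated_split_accuracy(data, predict, trials=100, train_frac=0.66, seed=0):
    """Mean test accuracy over `trials` random train/test splits."""
    rng = random.Random(seed)
    n_train = round(train_frac * len(data))
    scores = []
    for _ in range(trials):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        hits = sum(predict(train, x) == label for x, label in test)
        scores.append(hits / len(test))
    return sum(scores) / trials


# Hypothetical, well-separated toy data: 25 positives near +10, 25 negatives near -10.
toy = [(10 + i / 25, 1) for i in range(25)] + [(-10 - i / 25, -1) for i in range(25)]
mean_acc = repeated_split_accuracy(toy, centroid_rule_1d)
print(mean_acc)
```

On an anti-learning data set the same loop would report a mean test score below the random-guessing level rather than above it.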
12
KDD Cup 2002: Yeast Gene Regulation Prediction Task
http://www.biostat.wisc.edu/craven/kddcup/task2.ppt
13
Anti-learning in High Dimensional Approximation (Mimicry)
14
Mimicry in High Dimensional Spaces
15
Quality of mimicry
d = 1000
nE / nX
Average of independent tests over 50 repeats
16
Formal result
17
Proof idea 1: Geometry of the mimicry data
18
Proof idea 3: Kernel matrix
19
Hadamard Matrix
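For readers unfamiliar with Hadamard matrices, here is a short sketch of the standard Sylvester construction, which yields a matrix with entries +1/-1 whose rows are mutually orthogonal. Whether the talk used this particular construction is an assumption; the function names are illustrative.

```python
def sylvester_hadamard(k):
    """Return the 2**k x 2**k Sylvester Hadamard matrix as a list of rows."""
    h = [[1]]
    for _ in range(k):
        # Doubling step: H_2n = [[H_n, H_n], [H_n, -H_n]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def is_hadamard(h):
    """Check that H * H^T equals n * I, i.e. rows are orthogonal with norm sqrt(n)."""
    n = len(h)
    return all(
        sum(a * b for a, b in zip(h[i], h[j])) == (n if i == j else 0)
        for i in range(n)
        for j in range(n)
    )


H8 = sylvester_hadamard(3)
print(is_hadamard(H8))
```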

20
CS-kernels
Kowalczyk, Chapelle ALT05
21
Anti-learning in classification of Hadamard dataset
Kowalczyk, Smola
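One way to see anti-learning on Hadamard-type data is sketched below. The construction is an assumption, since the transcript does not show the dataset: each row of a Sylvester Hadamard matrix is an example, its last entry is the label, and the remaining entries are the features. Row orthogonality then forces the dot product of two distinct examples to equal minus the product of their labels, so same-class points are less similar than opposite-class points. The nearest-centroid classifier is again only a stand-in.

```python
def sylvester_hadamard(k):
    """2**k x 2**k Sylvester Hadamard matrix as a list of rows."""
    h = [[1]]
    for _ in range(k):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def hadamard_dataset(k):
    """Assumed construction: last column gives labels, other columns give features."""
    return [(tuple(row[:-1]), row[-1]) for row in sylvester_hadamard(k)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def centroid(rows):
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def predict(train, x):
    """Nearest-centroid rule on Euclidean distance."""
    pos = centroid([p for p, y in train if y == 1])
    neg = centroid([p for p, y in train if y == -1])
    return 1 if sq_dist(x, pos) < sq_dist(x, neg) else -1


def leave_one_out(data):
    """Return (mean training accuracy, off-training accuracy) over all folds."""
    train_acc, hits = 0.0, 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        train_acc += sum(predict(train, p) == q for p, q in train) / len(train)
        hits += predict(train, x) == y
    return train_acc / len(data), hits / len(data)


data = hadamard_dataset(3)
# Gram-matrix structure: <x_i, x_j> = -y_i * y_j for every pair i != j.
gram_ok = all(
    dot(xi, xj) == -yi * yj
    for a, (xi, yi) in enumerate(data)
    for b, (xj, yj) in enumerate(data)
    if a != b
)
had_train_acc, had_off_acc = leave_one_out(data)
print(gram_ok, had_train_acc, had_off_acc)
```

Under these assumptions the classifier separates each training fold perfectly while getting every held-out example wrong.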
22
Perfect anti-learning theorem
Kowalczyk, Smola, submitted
23
Perfect anti-learning: i.i.d. learning curve
n = 100, nRand = 1000
[Figure: AROC (mean, std) plotted against nsamples, the number of i.i.d. samples from the perfect anti-learning set S, with a "random" reference]
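The learning curve reports AROC, presumably the area under the ROC curve. A minimal sketch of that metric, using its rank interpretation (the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half); the function name `aroc` and the sample scores are illustrative only.

```python
def aroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    # Each (positive, negative) pair contributes 1 for a win and 0.5 for a tie.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


perfect = aroc([0.9, 0.8, 0.2, 0.1], [1, 1, -1, -1])
inverted = aroc([0.9, 0.8, 0.2, 0.1], [-1, -1, 1, 1])
chance = aroc([0.5, 0.5, 0.5, 0.5], [1, -1, 1, -1])
print(perfect, inverted, chance)
```

A value of 0.5 corresponds to random ranking, so test AROC that stays systematically below 0.5 is the anti-learning signature in this metric.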
24
Conclusions
  • Statistics and machine learning are indispensable
    components of the forthcoming revolution in medical
    diagnostics based on genomic profiling
  • The high dimensionality of the data poses new
    challenges, pushing statistical techniques into
    uncharted waters
  • Challenges of biological data can stimulate novel
    directions of machine learning research

25
Acknowledgements
  • Telstra
      • Bhavani Raskutti
  • Peter MacCallum Cancer Centre
      • David Bowtell
      • Coung Duong
      • Wayne Phillips
  • MPI
      • Cheng Soon Ong
      • Olivier Chapelle
  • NICTA
      • Alex Smola