Microarrays: algorithms for knowledge discovery in oncology and molecular biology

1
Microarrays: algorithms for knowledge discovery
in oncology and molecular biology
  • Frank De Smet
  • Katholieke Universiteit Leuven
  • Faculteit Toegepaste Wetenschappen
  • Departement Elektrotechniek (ESAT)
  • Supervisor (promotor): Prof. dr. ir. B. De Moor

2
Overview
  • Introduction: basic concepts of microarray data
  • Feature extraction
    • Univariate analysis
    • Multivariate analysis: PCA
  • Classification
  • Clustering
  • Conclusions and future research

Introduction Feature extraction Classification
Clustering Conclusions
3
Transcription - Translation
4
Microarrays
5
Importance
  • Clinical (oncology)
    • Clinical management of cancer is in many cases
      empirical, and not all clinically relevant
      information can be extracted from the data
      that physicians have access to
    • Fundamental mechanisms behind carcinogenesis
      are not always taken into account
    • But: expression patterns measured with
      microarrays in malignant cells reflect the
      phenotype of the tumour
  • Molecular biology
    • Studying the expression behaviour of genes can
      help to determine their biological role or
      function

6
Data-mining framework
7
Expression matrix
[Figure: expression matrix; one axis is labelled
"Microarray experiments"]

8
Univariate analysis in microarray data
9
Multiple testing
  • The p-values of genes with and without actual
    differential expression overlap, which leads to
    Type I and Type II errors
  • In the literature: control of the Type I error,
    which is too conservative for microarray data
  • Here: a balance between Type I and Type II errors
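
The trade-off in the bullets above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the procedure from the presentation: the p-values are drawn directly (uniform for genes without differential expression, `u**5` for genes with it), the gene counts are arbitrary, and Bonferroni is used as one familiar example of strict Type I error control. The function names are hypothetical.

```python
import random


def simulate_p_values(n_null, n_alt, seed=0):
    """Return (p_values, is_alt) for a toy data set.

    Assumption: genes without differential expression get uniform
    p-values; differentially expressed genes get u**5, which
    concentrates their p-values near zero.
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_null)]
    p_values += [rng.random() ** 5 for _ in range(n_alt)]
    is_alt = [False] * n_null + [True] * n_alt
    return p_values, is_alt


def count_errors(p_values, is_alt, threshold):
    """Count Type I (false positive) and Type II (false negative) errors."""
    false_pos = sum(1 for p, alt in zip(p_values, is_alt)
                    if p <= threshold and not alt)
    false_neg = sum(1 for p, alt in zip(p_values, is_alt)
                    if p > threshold and alt)
    return false_pos, false_neg


p_values, is_alt = simulate_p_values(n_null=900, n_alt=100)
alpha = 0.05
bonferroni = alpha / len(p_values)  # strict control of the Type I error
for name, thr in [("unadjusted", alpha), ("Bonferroni", bonferroni)]:
    fp, fn = count_errors(p_values, is_alt, thr)
    print(f"{name:>10}: threshold={thr:.2e}  Type I={fp}  Type II={fn}")
```

Lowering the threshold can only reduce the number of Type I errors and can only increase the number of Type II errors, which is why a very strict threshold tends to miss genes that are differentially expressed.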

10
Estimation of Type I and II error
11
Calculations
12
ROC curve
  • Optimal balance between Type I and Type II errors
  • Area under the curve
    • Quantifies how well genes whose expression is
      affected by the difference between conditions
      can be discriminated, using their p-values,
      from genes whose expression is not affected
    • Quality measure for microarray data
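
As a rough illustration of the area under the curve, the sketch below builds a ROC curve from p-values and integrates it with the trapezoidal rule. It assumes the true status of every gene is known, which holds only for toy or simulated data; on real microarray data these quantities have to be estimated. The data and function names are made up for the example.

```python
def roc_points(p_values, is_alt):
    """Return ROC points as (false positive rate, true positive rate).

    Each distinct p-value is used once as the rejection threshold.
    """
    n_alt = sum(is_alt)
    n_null = len(is_alt) - n_alt
    points = [(0.0, 0.0)]
    for thr in sorted(set(p_values)):
        tp = sum(1 for p, alt in zip(p_values, is_alt) if p <= thr and alt)
        fp = sum(1 for p, alt in zip(p_values, is_alt) if p <= thr and not alt)
        points.append((fp / n_null, tp / n_alt))
    return points


def area_under_curve(points):
    """Trapezoidal area under (x, y) points that are sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: True marks a gene that is really differentially expressed.
p_values = [0.001, 0.004, 0.02, 0.03, 0.2, 0.4, 0.6, 0.9]
is_alt = [True, True, True, False, True, False, False, False]

auc = area_under_curve(roc_points(p_values, is_alt))
print(f"AUC = {auc:.4f}")  # AUC = 0.9375 for this toy data
```

An area close to 1 means the two groups of genes are well separated by their p-values; an area close to 0.5 means the p-values carry almost no information about differential expression.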

13
Example: acute leukemia
14
Multivariate analysis in microarray data:
Principal Component Analysis
Unsupervised
15
Classification
Unsupervised
16
Clustering gene expression profiles
  • Importance
    • Identification of groups of coexpressed genes
    • Coexpressed genes have a higher probability of
      having similar biological functions; e.g.,
      they might interact with the same
      transcription factors (coregulation)
  • First-generation algorithms: disadvantages
    • Parameter fine-tuning
    • Every profile is assigned to a cluster
    • Computational complexity

17
Quality-based clustering (Heyer et al.)
  • The algorithm produces clusters with
    • a quality guarantee (fixed, user-defined
      threshold for the diameter D)
    • a maximal number of profiles

Still some disadvantages!
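
The idea of a diameter-bounded cluster of maximal size can be sketched as a greedy procedure. This is a minimal illustration, not the published algorithm: Euclidean distance is an assumption (the original work may use a different similarity measure), `min_size` is a hypothetical stopping rule, and the toy profiles are invented. It needs Python 3.8+ for `math.dist`.

```python
import math


def qt_candidate(seed, remaining, profiles, max_diameter):
    """Grow a cluster from `seed` while its diameter stays <= max_diameter."""
    cluster = [seed]
    # farthest[i]: largest distance from profile i to any cluster member,
    # i.e. the cluster diameter contribution if profile i were added.
    farthest = {i: math.dist(profiles[i], profiles[seed])
                for i in remaining if i != seed}
    while farthest:
        best = min(farthest, key=farthest.get)
        if farthest[best] > max_diameter:
            break  # any further addition would exceed the threshold
        cluster.append(best)
        del farthest[best]
        for i in farthest:
            farthest[i] = max(farthest[i],
                              math.dist(profiles[i], profiles[best]))
    return cluster


def qt_clustering(profiles, max_diameter, min_size=2):
    """Return (clusters, unassigned) as lists of profile indices.

    Assumption: clustering stops once the largest candidate has fewer
    than `min_size` profiles; the rest stays unassigned.
    """
    remaining = set(range(len(profiles)))
    clusters = []
    while remaining:
        candidates = [qt_candidate(s, remaining, profiles, max_diameter)
                      for s in sorted(remaining)]
        best = max(candidates, key=len)  # keep the largest candidate
        if len(best) < min_size:
            break
        clusters.append(sorted(best))
        remaining -= set(best)
    return clusters, sorted(remaining)


profiles = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (10, 0)]
clusters, unassigned = qt_clustering(profiles, max_diameter=0.5)
print(clusters, unassigned)  # [[0, 1, 2], [3, 4]] [5]
```

A candidate cluster is grown from every remaining profile in each round, so the cost increases quickly with the number of profiles, and the result depends directly on the fixed diameter threshold chosen by the user.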
18
Adaptive quality-based clustering (AQBC)
  • A heuristic, iterative two-step approach
  • Step 1: quality-based approach
    • Find a cluster center in an area of the data
      set where the density of expression profiles,
      within a sphere with a preliminary radius, is
      locally maximal
  • Step 2: adaptive approach
    • Re-estimation of the radius
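
One way to picture Step 1 is to move a sphere repeatedly to the mean of the profiles it contains until it stops moving. The sketch below does only that and is an assumption about how such a search could look, not the AQBC implementation; Step 2 is not covered, and the function names, starting point and toy profiles are hypothetical. It needs Python 3.8+ for `math.dist`.

```python
import math


def mean_profile(profiles):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return tuple(sum(column) / n for column in zip(*profiles))


def locate_center(profiles, start, radius, max_iter=100, tol=1e-9):
    """Move a sphere of fixed `radius` towards a local density maximum.

    Each iteration recentres the sphere on the mean of the profiles
    currently inside it; the search stops when the center is stable.
    Returns (center, indices of the profiles inside the final sphere).
    """
    center = tuple(start)
    for _ in range(max_iter):
        inside = [p for p in profiles if math.dist(p, center) <= radius]
        if not inside:
            break  # empty sphere: nothing to move towards
        new_center = mean_profile(inside)
        if math.dist(new_center, center) < tol:
            break  # converged
        center = new_center
    members = [i for i, p in enumerate(profiles)
               if math.dist(p, center) <= radius]
    return center, members


profiles = [(0, 0), (0.2, 0), (0, 0.2), (0.2, 0.2), (3, 3)]
center, members = locate_center(profiles, start=(0.5, 0.5), radius=1.0)
print(center, members)
```

With the toy data the sphere settles on the four nearby profiles and leaves the distant one outside; a radius that is too large or too small would change that result, which is what motivates re-estimating it in a second step.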

19
Step 1: Localization of a cluster center
20
Step 2: Re-calculation of the radius
21
Comparison
22
Validation
23
Availability
24
Conclusions
  • Data-mining framework for microarray data
  • Feature extraction
    • Univariate analysis
      • Estimation of n1 and n0
      • ROC curves: optimal balance between Type I
        and Type II errors; quality measure
    • Multivariate analysis: PCA
  • Classification: FDA and LS-SVM
  • Clustering
    • Microarray experiments
    • Gene expression profiles: AQBC
    • Clinical data

25
Selected publications
  • De Smet, F., Marchal, K., Timmerman, D., Vergote,
    I., De Moor, B. and Moreau, Y. (2001) Gebruik van
    microroosters in de klinische oncologie, Tijdschr
    voor Geneeskunde, 57, 1225-1236.
  • De Smet, F., Mathys, J., Marchal, K., Thijs, G.,
    De Moor, B. and Moreau, Y. (2002) Adaptive
    quality-based clustering of gene expression
    profiles. Bioinformatics, 18, 735-746.
  • Moreau, Y., De Smet, F., Thijs, G., Marchal, K.
    and De Moor, B. (2002) Functional bioinformatics
    of microarray data: from expression to
    regulation. Proceedings of the IEEE, 90,
    1722-1743.
  • De Smet, F., Moreau, Y., Timmerman, D., Vergote,
    I. and De Moor, B. (2004) Balancing false
    positives and false negatives for the detection
    of differential expression in malignancies. Br J
    Cancer, submitted.
  • Epstein, E., Skoog, L., Isberg, P.E., De Smet,
    F., De Moor, B., Olofsson, P.A., Gudmundsson, S.
    and Valentin, L. (2002) An algorithm including
    results of gray-scale and power Doppler
    ultrasound examination to predict endometrial
    malignancy in women with postmenopausal bleeding.
    Ultrasound Obstet Gynecol, 20, 370-376.

26
Future research
  • Specific
    • Ovarian cancer: transcriptomics
      • Prediction of chemosensitivity in stage III
      • Prediction of recurrence in stage I
    • Endometriosis: proteomics and transcriptomics
      • Detection of endometriosis
      • Prediction of relapse after surgery
  • General
    • Microarrays: number of patients, validation,
      standardization
    • Proteomics
    • Combination and comparison of microarray,
      proteomic and clinical data
