Data analysis of gene expression data

Helsinki University of Technology, Laboratory of Computer and Information Science



1
Data analysis of gene expression data
  • Jaakko Hollmén

2
Personnel
  • Jaakko Hollmén, Heikki Mannila
  • Graduate students (3): Jouni Seppänen, Salla
    Ruosaari, Anne Patrikainen
  • Undergraduate students (2): Mikko Katajamaa,
    Antti Rasinen

3
Gene expression data
  • State of protein production
  • Tissue to RNA to hybridized arrays
  • High-dimensional, noisy measurement data matrices
  • 500-10000 simultaneous measurements from an
    organism

4
Research scope
  • Goal: advances in data analysis, with a specific
    focus on analyzing gene expression data
  • High-dimensional, noisy measurement data matrices
  • Signal decomposition and projection methods (PCA,
    ICA, NMF, ...), MCMC, and pattern discovery
    methods
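
As an illustration of the projection methods listed above, here is a minimal sketch of PCA that estimates only the leading principal direction of a samples-by-genes matrix. It uses power iteration, which is a stand-in chosen to keep the example dependency-free and is not taken from the slides; the function names and the toy data are assumptions.

```python
import math
import random


def center_columns(X):
    """Subtract each column's mean so that every column has zero mean."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]


def first_principal_component(X, iterations=200, seed=0):
    """Estimate the leading principal direction by power iteration.

    X is a list of rows (samples) of equal length (e.g. one column per gene).
    Returns (unit direction, sample variance of the data along it).
    """
    Xc = center_columns(X)
    n, d = len(Xc), len(Xc[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Apply the covariance matrix without forming it:
        # C v = Xc^T (Xc v) / (n - 1)
        proj = [sum(r[j] * v[j] for j in range(d)) for r in Xc]
        w = [sum(Xc[i][j] * proj[i] for i in range(n)) / (n - 1)
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the data has no variance along the current direction
        v = [x / norm for x in w]
    proj = [sum(r[j] * v[j] for j in range(d)) for r in Xc]
    variance = sum(p * p for p in proj) / (n - 1)
    return v, variance


# Toy data lying exactly on the line y = 2x (hypothetical values).
toy = [[float(t), 2.0 * t] for t in range(5)]
direction, variance = first_principal_component(toy)
print(direction, variance)
```

The sign of the returned direction is arbitrary, as with any principal component. For real expression matrices a library SVD routine would normally be preferred over this hand-written loop.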

5
Understanding measurements
  • Simulation model for gene expression data
  • To understand measurements and their analysis

6
Closer look at the real world

7
Expression data as numbers
  • 0.8214 0.5298 0.4586 0.7505 0.0147 0.2440 0.7258 0.1302 0.8995 0.2233 0.7430 0.9636 0.7067 0.7333 0.4974 0.0264
  • 0.4447 0.6405 0.8699 0.7400 0.6641 0.8220 0.3987 0.2544 0.6928 0.3965 0.6508 0.1205 0.1684 0.6223 0.0750 0.3554
  • 0.6154 0.2091 0.9342 0.4319 0.7241 0.2632 0.3584 0.8030 0.4397 0.1351 0.9398 0.0483 0.8137 0.9898 0.7666 0.7439
  • 0.7919 0.3798 0.2644 0.6343 0.2816 0.7536 0.2853 0.6678 0.7010 0.2411 0.8328 0.3802 0.4662 0.1524 0.0454 0.2987
  • 0.9218 0.7833 0.1603 0.8030 0.2618 0.6596 0.8686 0.0136 0.6097 0.9275 0.4700 0.4128 0.7223 0.2033 0.1651 0.1812
  • 0.7382 0.6808 0.8729 0.0839 0.7085 0.2141 0.6264 0.5616 0.2999 0.3911 0.6299 0.4014 0.9949 0.8193 0.7772 0.4152
  • 0.1763 0.4611 0.2379 0.9455 0.7839 0.6021 0.2412 0.4546 0.8560 0.5113 0.0582 0.4210 0.3625 0.0584 0.2083 0.8673
  • 0.4057 0.5678 0.6458 0.9159 0.9862 0.6049 0.9781 0.9049 0.1121 0.0929 0.5422 0.3770 0.7308 0.5385 0.2518 0.6249
  • 0.9355 0.7942 0.9669 0.6020 0.4733 0.6595 0.6405 0.2822 0.2916 0.0217 0.4557 0.9073 0.6497 0.1902 0.3965 0.0552
  • 0.9169 0.0592 0.6649 0.2536 0.9028 0.1834 0.2298 0.0650 0.0974 0.1595 0.8631 0.6702 0.6813 0.5995 0.4807 0.4041
  • 0.4103 0.6029 0.8704 0.8735 0.4511 0.6365 0.6813 0.4766 0.3974 0.8445 0.8552 0.9618 0.0076 0.2923 0.5093 0.3020
  • 0.8936 0.0503 0.0099 0.5134 0.8045 0.1703 0.6658 0.9837 0.3333 0.8792 0.4723 0.1630 0.6541 0.0913 0.6248 0.1523
  • 0.0579 0.4154 0.1370 0.7327 0.8289 0.5396 0.1347 0.9223 0.9442 0.1870 0.7869 0.7486 0.9452 0.5068 0.6255 0.3092
  • 0.3529 0.3050 0.8188 0.4222 0.1663 0.6234 0.0225 0.5612 0.8386 0.9913 0.6560 0.3741 0.6133 0.8841 0.9912 0.0033
  • 0.8132 0.8744 0.4302 0.9614 0.3939 0.6859 0.2622 0.6523 0.2584 0.7120 0.0000 0.4542 0.7829 0.6156 0.3592 0.4374
  • 0.0099 0.0150 0.8903 0.0721 0.5208 0.6773 0.1165 0.7727 0.0429 0.8714 0.1312 0.0386 0.0032 0.0464 0.2760 0.6764
  • 0.1389 0.7680 0.7349 0.5534 0.7181 0.8768 0.0693 0.1062 0.0059 0.4796 0.4949 0.5624 0.7970 0.9519 0.6781 0.8229
  • 0.2028 0.9708 0.6873 0.2920 0.5692 0.0129 0.8529 0.0011 0.5744 0.4960 0.0383 0.3723 0.6418 0.1690 0.5088 0.7558
  • 0.1987 0.9901 0.3461 0.8580 0.4608 0.3104 0.1803 0.5418 0.7439 0.2875 0.2274 0.7928 0.1785 0.8267 0.2769 0.1626

8
Quality control at spot level
  • Choose good quality spots for subsequent analysis
  • Image analysis, detection and cost-sensitive
    classification
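
The idea of cost-sensitive classification can be sketched as a minimum-expected-cost decision rule. The slides do not describe the classifier or the costs actually used, so everything here is an assumption: the function names, the cost values, and the premise that some upstream image-analysis step supplies a probability that each spot is defective.

```python
def keep_spot(p_bad, cost_keep_bad=5.0, cost_discard_good=1.0):
    """Decide whether to keep a spot for subsequent analysis.

    p_bad: estimated probability that the spot is of bad quality
           (assumed to come from an earlier image-analysis step).
    cost_keep_bad: hypothetical cost of letting a bad spot through.
    cost_discard_good: hypothetical cost of throwing away a good spot.
    Returns True when keeping has the lower expected cost.
    """
    expected_cost_keep = p_bad * cost_keep_bad
    expected_cost_discard = (1.0 - p_bad) * cost_discard_good
    return expected_cost_keep < expected_cost_discard


def select_spots(p_bad_by_spot, **costs):
    """Return the identifiers of the spots that pass the cost-sensitive rule."""
    return [spot for spot, p in p_bad_by_spot.items() if keep_spot(p, **costs)]


# Hypothetical spot identifiers and probabilities.
spots = {"a": 0.05, "b": 0.30, "c": 0.15}
print(select_spots(spots))
```

With these example costs a spot is kept only if its probability of being bad is below 1/6, so the rule is stricter than a plain 0.5 threshold. Making the two kinds of error unequally expensive is what distinguishes cost-sensitive from ordinary classification.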

9
Collaboration with biologists
  • Department of Medical Genetics, Laboratory of
    Cytomolecular Genetics, University of Helsinki
  • Institute of Occupational Health
  • Turku Centre for Biotechnology
  • Karolinska Institutet
  • Journal articles during 2002

Wikman et al., Identification of differentially
expressed genes in pulmonary adenocarcinoma by
using a cDNA array. Oncogene 21(37), 2002,
Nature Publishing Group.

Niini et al., Expression of myeloid-specific genes
in childhood acute lymphoblastic leukemia - a cDNA
array study. Leukemia 16(11), 2002, Nature
Publishing Group.

Mannila et al., Long-range control of gene
expression in yeast. Bioinformatics 18(3), 2002.

10
Current topics and further work
  • Correlation between gene expression and gene
    location in the genome
  • Combinations with sequence information
  • Time-series analysis, decompositions
  • Sparse decompositions of data matrices
  • MCMC techniques
  • Pattern discovery methods
  • Etc.
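
One simple way to look at the first topic, the relation between expression and genomic location, is to correlate the expression profiles of genes that are adjacent along a chromosome and compare the result with their distance. This is only a sketch under assumed inputs: the data layout, the function names, and the toy values are hypothetical, and the analysis is not claimed to be the one used by the group.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighbor_correlations(genes):
    """Correlate each gene with its nearest downstream neighbour.

    genes: list of (position, expression_profile) pairs for one chromosome,
           where position is a coordinate and the profile is a list of
           expression values over the same experiments.
    Returns a list of (distance, correlation) pairs in position order.
    """
    ordered = sorted(genes, key=lambda g: g[0])
    return [
        (b[0] - a[0], pearson(a[1], b[1]))
        for a, b in zip(ordered, ordered[1:])
    ]


# Hypothetical positions and expression profiles.
toy_genes = [
    (300, [3.0, 2.0, 1.0]),
    (100, [1.0, 2.0, 3.0]),
    (150, [2.0, 4.0, 6.0]),
]
print(neighbor_correlations(toy_genes))
```

Plotting correlation against distance over many gene pairs would show whether nearby genes tend to be co-expressed. Profiles with zero variance would need to be filtered out first, since the correlation is undefined for them.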