Prof. Yike Guo - PowerPoint PPT Presentation

About This Presentation
Title:

Prof. Yike Guo

Description:

Understand the basic bioarray technology including microarray technology for ... ORF. ORF. PM. MM. Averaged PM-MM 'presence' feature statistics. 25-mers. Affymetrix2 ... – PowerPoint PPT presentation

Slides: 22
Provided by: Asatisfied8

Transcript and Presenter's Notes

Title: Prof. Yike Guo


1
Data Management and Mining in BioArray
Informatics
  • Prof. Yike Guo
  • Dept. of Computing, Imperial College, London

2
Goal
  • Understand the basic bioarray technologies,
    including microarray technology for gene
    expression, protein chips, NMR spectroscopy and
    other high-throughput devices
  • Learn the basic analytical techniques and their
    applications to bioarray information
  • Learn the processes for handling and
    analysing bioarray data (e.g. gene expression
    analysis)

3
Lecture Overview
  • Lecture One: BioArray Informatics Introduction
  • Lecture Two: BioArray Technology
  • Lecture Three: Analysis Technology (1): Data
    Normalisation and Transformation
  • Lecture Four: Analysis Technology
    (2): Clustering and Classification
  • Lecture Five: Analysis Technology (3):
    Multivariate Statistics
  • Lecture Six: Analysis Applications (1): Gene
    Expression Analysis
  • Lecture Seven: Analysis Applications
    (2): Integrative Analysis of BioArray Data

4
BioArray Informatics: Integrative Analysis of
BioArray Data within the Biological Context
secondary structure tertiary structure
polymorphism patient records epidemiology
expression patterns physiology
sequences alignments
receptors signals pathways
ATGCAAGTCCCT AAGATTGCATAA GCTCGCTCAGTT
linkage maps cytogenetic maps physical maps
5
Functional -Omics Analysis
REAL WORLD
INPUTS NOXIOUS AGENT/STRESSOR
OUTPUTS BIOLOGICAL END-POINTS PATHOLOGY
ALTERED PHYSIOLOGY AND METABOLISM
6
Dynamics in BioArray Informatics
Interactions
Environment
DNA
Protein
Growth rate
Expression
7
A mathematical model
8
BioArray Provides the Means for Revealing the
Interaction
Relations
  1. gene homologs
  2. gene encodes a protein
  3. protein can regulate the expression of a gene
  4. protein phosphorylates another protein
  5. protein binds to another protein
  6. protein lyses another protein
  7. proteins can sometimes be receptors
  8. receptors bind a ligand
  9. receptors (if bound) activate other proteins
9
BioArray Quantitative Measurement of Biological
Concepts
experiment
ORF
  • R/G ratios
  • R, G values
  • quality indicators

control
  • Microarrays
  • 1000 bp hybridization
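
The R and G values listed on this slide are presumably the red and green channel intensities of a two-colour microarray spot, with the R/G ratio comparing experiment to control. Such ratios are commonly analysed on a log2 scale so that up- and down-regulation are symmetric around zero. A minimal sketch, assuming hypothetical ORF labels and intensities (none of these numbers come from the lecture):

```python
import math

def log2_ratio(red, green):
    """Return log2(R/G) for one spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)

# Hypothetical spots: (ORF label, R intensity, G intensity).
spots = [
    ("ORF_A", 800.0, 200.0),  # R is 4x G
    ("ORF_B", 150.0, 600.0),  # R is a quarter of G
    ("ORF_C", 500.0, 500.0),  # no change
]

log_ratios = {orf: log2_ratio(r, g) for orf, r, g in spots}

for orf, value in log_ratios.items():
    print(f"{orf}: log2(R/G) = {value:+.2f}")
```

On this scale a fourfold increase and a fourfold decrease have the same magnitude with opposite signs, which is one reason the log transform is a common first step before any further analysis.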

10
Quantitative Analysis
Reproducibility: confidence intervals to find
significant deviations
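
One way to read this slide: repeat the measurement, build a confidence interval around the mean log ratio of each ORF, and flag the ORFs whose interval excludes zero. The sketch below assumes hypothetical replicate values and uses a normal approximation from the Python standard library; with only a handful of replicates a t-based interval would be wider, so treat this as an illustration rather than the lecture's method.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / n ** 0.5
    centre = mean(values)
    return centre - half_width, centre + half_width

def deviates_from_zero(values, level=0.95):
    """True when the interval for the mean log ratio excludes zero."""
    low, high = confidence_interval(values, level)
    return low > 0 or high < 0

# Hypothetical replicate log2(R/G) values per ORF.
replicates = {
    "ORF_A": [1.9, 2.1, 2.0, 2.2],    # consistently induced
    "ORF_B": [-0.3, 0.4, 0.1, -0.2],  # scatters around zero
}

significant = [orf for orf, vals in replicates.items() if deviates_from_zero(vals)]
print(significant)
```
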
11
BioArray Informatics: BioArray is the data,
everything else is Informatics
  • Data Engineering
  • Data Warehousing
  • Data Integration
  • Data Analysis
  • Knowledge Discovery
  • Discovery Integration
  • Discovery Validation
  • Knowledge Integration
  • Knowledge Warehousing

12
Data Warehousing
Data Sources
External Data Sources
Operational Data Sources
Data Warehousing
13
Example - ArrayExpress
14
Data Warehousing and Data Integration
15
Data Schema in Warehousing: A Gene Expression
Example
Gene Expression Warehouse
OMIM
Enzyme
Protein
Disease
Affy Fragment
Known Gene
Sequence
Pathway
SNP
Metabolite
Sequence Cluster
KEGG
Genbank
NMR
16
A Workflow of Gene Expression Database
[Workflow diagram; labels recovered from the slide]
Data Reduction Queries: comparisons between 2 samples; comparisons between multiple samples; user defined dataset
Query settings: set fold change (e.g., > 2X); set higher avg difference value (e.g., > 200); A->P / P->A stringency (e.g., 80)
Warehousing Output: profile report; data in analysis; visualisation
Advanced Gene Expression Analysis
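
The data reduction settings named on this slide (fold change, average difference, and absent/present stringency) can be sketched as simple filters. The code below is an illustration under stated assumptions, not the workflow's actual implementation: the function names are hypothetical, the thresholds are the slide's example values, non-positive intensities are simply rejected, and "stringency" is read as the fraction of samples in a group that must share the same Absent ("A") or Present ("P") call.

```python
def fold_change(baseline, experiment):
    """Symmetric fold change (always >= 1) between two positive intensities."""
    return max(baseline, experiment) / min(baseline, experiment)

def passes_filters(baseline, experiment, min_fold=2.0, min_avg_diff=200.0):
    """Keep a gene when the higher avg difference value exceeds `min_avg_diff`
    and the fold change between the two samples exceeds `min_fold`."""
    if baseline <= 0 or experiment <= 0:
        return False  # simplification: non-positive values are rejected
    higher = max(baseline, experiment)
    return higher > min_avg_diff and fold_change(baseline, experiment) > min_fold

def call_transition(calls_a, calls_b, stringency=0.8):
    """Return "A->P", "P->A" or None for two groups of detection calls."""
    def fraction(calls, label):
        return calls.count(label) / len(calls)

    if fraction(calls_a, "A") >= stringency and fraction(calls_b, "P") >= stringency:
        return "A->P"
    if fraction(calls_a, "P") >= stringency and fraction(calls_b, "A") >= stringency:
        return "P->A"
    return None

# Hypothetical usage with made-up values.
print(passes_filters(100.0, 450.0))
print(call_transition(["A"] * 5, ["P", "P", "P", "P", "A"]))
```
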
17
Queries, Queries...
  • Queries to the data
  • Which genes are linked?
  • Which genes are expressed similarly to my gene
    XYZ?
  • Which genes are co-expressed in differing
    conditions?
  • Classification (of tumors, diseased tissues,
    etc.): which patterns are characteristic of a
    certain class of samples, and which genes are
    involved?
  • Functional classification of genes: are changes
    clustered in particular classes?
  • Metabolic pathway information: is a certain
    pathway, or route in a pathway, affected?
  • Disease information: correlation of clinical
    follow-up with expression patterns.
  • Phenotype information for mutants: are there
    correlations between particular phenotypes and
    expression patterns?
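
The question "which genes are expressed similarly to my gene XYZ?" can be approached by ranking genes by the correlation of their expression profiles with the query gene. The sketch below uses Pearson correlation, which is one common choice and not necessarily the measure used in the lecture; the gene names other than XYZ and all profile values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)

def most_similar(target, profiles, top=3):
    """Rank all other genes by correlation with `target`, highest first."""
    scored = [
        (pearson(profiles[target], profile), gene)
        for gene, profile in profiles.items()
        if gene != target
    ]
    scored.sort(reverse=True)
    return [(gene, r) for r, gene in scored[:top]]

# Hypothetical expression profiles across four conditions.
profiles = {
    "XYZ":    [0.1, 1.2, 2.3, 1.1],
    "GENE_1": [0.2, 2.4, 4.6, 2.2],  # same shape as XYZ, scaled by 2
    "GENE_2": [2.0, 1.0, 0.1, 1.2],  # roughly the opposite pattern
    "GENE_3": [1.0, 1.1, 0.9, 1.0],  # nearly flat
}

ranked = most_similar("XYZ", profiles)
for gene, r in ranked:
    print(f"{gene}: r = {r:+.3f}")
```

Because Pearson correlation ignores scale, a gene with the same pattern at twice the amplitude ranks first, which is usually what "expressed similarly" is meant to capture.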

18
Gene Expression Data Analysis Work Flow
[Workflow diagram; labels recovered from the slide]
Data in analysis
Interactive Analysis Procedures: cluster by genes; study outliers; correlate clinical measurements; literature analysis; time course analysis
Knowledge Deliverables: defined subsets of genes; classic drug targets; known disease association; cross species indices
(Examples, not exhaustive)
19
(Un)fortunately, Scientists never think linearly
  • Why are those genes co-expressed?
  • What do their protein products do?
  • What are the common regulatory motifs of a
    co-expressed gene set?
  • Can we patent them?
  • Do we know which metabolic pathway they are in?
    If not, can I synthesise one?
  • Are there HTS results for any proteins in the
    pathway?
  • Are there any compounds in the HTS library that
    hit selectively and consistently against those
    proteins?
  • Which ones have good activity, availability and
    toxicity?

20
Advanced Analysis
  • Discovery Annotation and Validation
    • E.g. annotating a set of co-expressed genes
      with some conserved regulatory motifs
    • E.g. scoring a co-expression pattern with
      pathways
    • E.g. literature analysis to annotate biological
      semantics
  • Integrative Analysis
    • E.g. multi-modality analysis
    • E.g. cross annotation of discovered patterns
  • Modelling and Simulation
    • E.g. pathway synthesis
    • E.g. virtual cell modelling

21
Pathway Scoring
22
Analysis of Gene Expression Data with Pathway
Scores
Our Approach