Bioinformatics for Microarray Studies at IBS

1
Bioinformatics for Microarray Studies at IBS
  • Pei-Ing Hwang, Ph.D.
  • Mar. 24, 2005

2
Different aspects for life science research
genomics
transcriptomics
proteomics
3
Building blocks for DNA or RNA
  • DNA: A, T, G, C
  • RNA: A, U, G, C

4
DNA: deoxyribonucleic acid
5
Why microarray?
  • Gene Expression
  • To simultaneously study multiple genes
  • To obtain an overview of gene expression at
    transcriptional level under specific experimental
    conditions
  • To study gene interaction network from the
    transcriptional aspect
  • Genome
  • SNP detection
  • To find recombination sites in the
    chromosome/genome
  • Hopefully to discover the gene responsible for a
    genetic disease

6
Outline
  • Introduction to Microarray experiments
  • Experiences at IBS for the cDNA arrays
  • Data generated with microarray
  • DNA annotation
  • Data Analysis
  • Data Management

7
About Microarray Technology-1
  • Up to hundreds of thousands of spots in a fixed
    area on a glass slide or a membrane
  • One species of DNA molecule per spot
  • A spot is also called a feature
  • The DNA fixed on the chip or membrane is also called the
    probe
  • The sequence and/or function of the DNA species in each
    spot is known.

8
About Microarray Technology-2
  • Based on hybridization (complementary base pairing)
  • A pairs with T (DNA) or U (RNA)
  • G pairs with C
  • Image processing
  • Data analysis
  • Result interpretation from biology aspect
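
The pairing rules on this slide can be illustrated with a minimal Python sketch. It assumes perfect-match hybridization only, which real arrays do not require; the function names are hypothetical and chosen for illustration.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def is_perfect_match(probe: str, target: str) -> bool:
    """True if target is exactly the reverse complement of probe.

    Simplifying assumption: no mismatches are tolerated.
    """
    return reverse_complement(probe) == target.upper()


print(reverse_complement("ATGC"))        # GCAT
print(is_perfect_match("AAGC", "GCTT"))  # True
```

A labeled target binds a spot when its sequence is complementary to the probe fixed there, which is why each spot can report on one DNA species.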

9
Types of Microarray
  • Types of DNA immobilized on the solid support
  • cDNA vs. oligonucleotides
  • Manufacturing methods
  • Printing vs. photolithography
  • Solid support
  • Glass slides
  • Membrane
  • Nucleotide labeling (slide scanning condition)
  • One color vs. two colors

10
GeneChip Array Manufacturing
Figure 1. Affymetrix uses a unique combination of
photolithography and combinatorial chemistry to
manufacture GeneChip Arrays.
11
Microarray printing machine
http://arrayit.com/Products/MicroarrayI/NanoPrint/Nano-Print-new-600.jpg
12
Procedure for one-channel array
13
Experimental Procedure for 2-channel Microarray
14
Data Analyses
  • Feature intensity acquisition
  • Image analyses
  • To identify differentially expressed genes
  • Normalization (global, local, print-tip, between-array,
    etc.)
  • Clustering or Classification
  • Analyses from biology aspect
  • Significant genes
  • Transcriptional regulation study
  • Cellular pathway or network finding
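
As a rough illustration of the normalization and differential-expression steps for a two-color array, the sketch below applies global median normalization to log2 ratios. The intensity values and the cutoff of 1.0 (a two-fold change) are made-up assumptions, not values from the talk, and the function names are hypothetical.

```python
import math
from statistics import median


def log2_ratios(red, green):
    """Per-feature log2(red / green) for a two-color array."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def global_median_normalize(ratios):
    """Shift all log ratios so that their median becomes zero."""
    centre = median(ratios)
    return [x - centre for x in ratios]


def flag_features(normalized, min_abs_log2=1.0):
    """Indices of features whose normalized |log2 ratio| meets the cutoff."""
    return [i for i, x in enumerate(normalized) if abs(x) >= min_abs_log2]


# Made-up feature intensities for five spots.
red = [200.0, 400.0, 800.0, 1600.0, 400.0]
green = [100.0, 200.0, 400.0, 100.0, 800.0]

normalized = global_median_normalize(log2_ratios(red, green))
print(flag_features(normalized))  # [3, 4]
```

Global normalization assumes most genes are unchanged, so the bulk of the log ratios should centre on zero; local or print-tip methods relax that assumption per region or per pin.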

15
Experiences at IBS for the cDNA arrays
16
About IBS tomato arrays
  • 13000 spots/features per chip
  • 1 clone per spot
  • cDNA clones from a dozen different cDNA
    libraries
  • At least two different protocols were followed
    and six different vectors were used
  • More than ten technicians involved

17
Bioinformatics for Microarray at IBS (contd)
  • IBS tomato EST database construction
  • Installation, management and maintenance of data
    analyses software
  • Reference information searching
  • Batch Submission of EST sequences

18
Bioinformatics Needs for Microarray Studies at IBS
  • Pre-arraying data management
  • cDNA info collection, vector trimming, sequence
    annotation, EST submission, etc.
  • Array information management
  • Gene set characterization, data storage, data
    retrieval
  • Post-hybridization data analysis and management
  • array data analyses, storage of the scanning
    result, biology-oriented bioinformatics analyses

19
Bioinformatics Service Work for Microarray
studies at IBS
  • Data pre-processing for the cDNAs
  • Clone id assignment
  • Sequence trimming
  • Gene annotation
  • Function classification
  • Data sheet preparation for commercial software to
    analyze microarray data
  • GAL file preparation for GenePix Pro
  • Master Gene List preparation for GeneSpring

20
Workflow diagram (labels recovered from the figure)
  • cDNA clones; sequencing; PCR
  • Vector trimming, assembly, function annotation; database
  • GenePix: feature intensities, normalization
  • Spotfire, GeneSpring: data analysis (normalization,
    variance, clustering)
  • Biological meaning: pathway analysis, transcription
    network, gene-gene interaction
21
Pre-array Bioinformatics
  1. Clone id generation
  2. Vector Trimming
  3. Sequence assembly
  4. Seq annotation (BLAST)
  5. EST submission to NCBI
  6. Database construction

Flow: clones from labs → sequencing → raw EST sequences → data processing and management
22
Clone id generation
  • Data centralization following sequencing
  • Rules for re-arraying
  • 96 well plate to/from 384 well
  • PCR from 96 well and spotting from 384 well
  • Order of A1, A2, B1, B2
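
The re-arraying rule can be sketched as a well-coordinate mapping. The sketch assumes the common quadrant layout, in which four 96-well plates are interleaved into one 384-well plate and well A1 of plates 1 to 4 lands on 384-well positions A1, A2, B1 and B2 respectively. The slides give only the order "A1, A2, B1, B2", so the exact layout used at IBS is an assumption, and the function names are hypothetical.

```python
import string

ROW_LETTERS = string.ascii_uppercase  # 384-well plates use rows A..P


def to_384(plate_index: int, well_96: str) -> str:
    """Map a well on one of four 96-well plates to its 384-well position.

    plate_index 0..3 corresponds to quadrants A1, A2, B1, B2 (assumed layout).
    """
    row = ord(well_96[0].upper()) - ord("A")  # 0..7
    col = int(well_96[1:]) - 1                # 0..11
    if not (0 <= plate_index < 4 and 0 <= row < 8 and 0 <= col < 12):
        raise ValueError(f"invalid plate/well: {plate_index}, {well_96}")
    row_offset, col_offset = divmod(plate_index, 2)
    return f"{ROW_LETTERS[2 * row + row_offset]}{2 * col + col_offset + 1}"


def from_384(well_384: str):
    """Inverse mapping: return (plate_index, 96-well name) for a 384-well position."""
    row = ord(well_384[0].upper()) - ord("A")  # 0..15
    col = int(well_384[1:]) - 1                # 0..23
    if not (0 <= row < 16 and 0 <= col < 24):
        raise ValueError(f"invalid 384-well position: {well_384}")
    plate_index = (row % 2) * 2 + (col % 2)
    return plate_index, f"{ROW_LETTERS[row // 2]}{col // 2 + 1}"


print([to_384(p, "A1") for p in range(4)])  # ['A1', 'A2', 'B1', 'B2']
print(from_384("P24"))                      # (3, 'H12')
```

Keeping both directions of the mapping in code makes it possible to trace a spot printed from a 384-well plate back to the 96-well plate used for PCR, which is what a stable clone id depends on.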

23
96 or 384 well
96 well
96 well
384 well
24
96-well to 384 well plates
B2
B1
A2
A1
25
Data collection
  • Raw sequencing data obtained from the sequencing
    company
  • ABI and text files organized and stored by
    lab and by date
  • Clone info confirmed with each sequence
    contributor
  • Clone id matched with raw sequences

26
Processing the sequencing data
  • cDNA library procedures confirmed with each
    individual lab
  • Vector/linker/primer trimming (Seqclean)
  • Function annotation
  • BLAST against different databases
  • Gene Ontology annotation
  • Sequence Assembly (Phrap)
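
To make the trimming step concrete, here is a deliberately naive Python sketch. It is not Seqclean's algorithm: it only removes an exact vector prefix from the 5' end and a long poly-A tail from the 3' end. The vector sequence and the minimum tail length are hypothetical parameters.

```python
def trim_est(seq: str, vector_prefix: str, min_polya: int = 10) -> str:
    """Naively trim a raw EST read.

    1. Remove vector_prefix if the read starts with it (exact match only).
    2. Remove a trailing run of A's if it is at least min_polya long.
    """
    seq = seq.upper()
    vector_prefix = vector_prefix.upper()
    if vector_prefix and seq.startswith(vector_prefix):
        seq = seq[len(vector_prefix):]
    without_tail = seq.rstrip("A")
    if len(seq) - len(without_tail) >= min_polya:
        seq = without_tail
    return seq


raw = "GAATTC" + "ATGGCC" + "A" * 12
print(trim_est(raw, "GAATTC"))  # ATGGCC
```

Real trimming tools align reads against a vector database and tolerate sequencing errors, so a sketch like this is only useful for seeing what "trimmed" means before sequences go on to BLAST and assembly.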

27
Procedure to generate cDNA clones
28
IBS tomato EST Database
  • Cloning information
  • Sequencing data
  • Vector/adaptor Trimming information
  • EST assembly
  • Function annotation
  • Cross Reference

29
The Tomato Database Entity-Relationship model
  • Trimmed Sequence: 1. Seq id 2. Trimmed Sequence 3. Method 4. Trim set
  • Untrimmed Sequence: 1. Seq id 2. Trimmed Sequence
  • Assembly Information: 1. Contig_id 2. Contig Sequence 3. BLAST Result 4. Position 5. Component seq id
  • NCBI BLAST Result: 1. Seq id 2. NCBI_id 3. E-Value 4. Description 5. Identity 6. Other result
  • TAIR Result: 1. Seq id 2. At number 3. E-Value 4. Description 5. Identity 6. Other result
  • TIGR Result: 1. Seq id 2. TC number 3. E-Value 4. Description 5. Identity 6. Other result
  • Lab info: 1. Seq id 2. Comment 3. Primer 4. Biotech 5. Sender 6. Collect From
  • ID MAP: 1. Seq id 2. Clone_id 3. Contig id 4. Lab_id1 5. Lab_id2 6. NCBI_sbmt_id93 7. NCBI_sbmt_id94 8. dbEST_accn_no 9. note
  • Gene Ontology: 1. TC number 2. EC number 3. Process (GO_id, Description) 4. Function (GO_id, Description) 5. Component (GO_id, Description)
  • cDNA Library Information: 1. Clone_id(3)(4) 2. Name 3. Date made 4. Developmental stage 5. Cloning sites 6. Description 7. Library 8. Host 9. Species 10. Vector 11. Antibiotic 12. Authors 13. Tissue 14. Primer
  • TOM 3, TOM 4
  • Relationship keys: Seq_id, Clone_id, TC number (cardinality 1 to n)
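
A small part of this entity-relationship model can be sketched as relational tables. The example uses Python's built-in sqlite3 module and covers only three of the entities on the slide. The column types, the sample rows and the in-memory SQLite database are assumptions for illustration; the actual database engine and schema are not given on this slide.

```python
import sqlite3

# Three entities from the slide, linked by Seq id (types are assumed).
SCHEMA = """
CREATE TABLE id_map (
    seq_id        TEXT PRIMARY KEY,
    clone_id      TEXT,
    contig_id     TEXT,
    dbest_accn_no TEXT
);
CREATE TABLE trimmed_sequence (
    seq_id           TEXT PRIMARY KEY REFERENCES id_map(seq_id),
    trimmed_sequence TEXT,
    method           TEXT,
    trim_set         TEXT
);
CREATE TABLE ncbi_blast_result (
    seq_id      TEXT REFERENCES id_map(seq_id),
    ncbi_id     TEXT,
    e_value     REAL,
    description TEXT,
    identity    REAL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO id_map VALUES (?, ?, ?, ?)",
                 ("S001", "C001", "CT01", None))
    conn.executemany(
        "INSERT INTO ncbi_blast_result VALUES (?, ?, ?, ?, ?)",
        [("S001", "hit_a", 1e-50, "made-up description A", 98.0),
         ("S001", "hit_b", 1e-20, "made-up description B", 90.0)],
    )
    conn.commit()
    return conn


def best_ncbi_hit(conn: sqlite3.Connection, clone_id: str):
    """Return (ncbi_id, e_value) of the lowest-E-value hit for a clone, or None."""
    return conn.execute(
        """SELECT b.ncbi_id, b.e_value
           FROM id_map AS m
           JOIN ncbi_blast_result AS b ON b.seq_id = m.seq_id
           WHERE m.clone_id = ?
           ORDER BY b.e_value ASC
           LIMIT 1""",
        (clone_id,),
    ).fetchone()


conn = build_demo_db()
print(best_ncbi_hit(conn, "C001"))
```

Routing every lookup through the ID MAP table is what lets a clone id on the array be resolved to its sequence, contig and annotation without duplicating identifiers across tables.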
30
Information to be further analyzed
  • Gene set characterization
  • Number of unique genes on the array
  • Number of known/unknown genes
  • Coordinates of each spotted sequence
  • Statistics about spotted cDNA
  • grouped by function/pathway
  • grouped by sequence similarity
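
One way to compute such gene set statistics is sketched below. It assumes that spots whose sequences assembled into the same contig represent one gene, that unassembled sequences each count as their own gene, and that a gene is "known" if any of its sequences has an annotation. These definitions and the record layout are assumptions for illustration.

```python
def characterize_gene_set(spots):
    """Summarize an array's gene content.

    spots: list of dicts with keys 'seq_id', 'contig_id' (or None)
    and 'description' (or None).
    """
    annotated = {}
    for spot in spots:
        # Group by contig when assembled, otherwise by the sequence itself.
        if spot["contig_id"]:
            key = ("contig", spot["contig_id"])
        else:
            key = ("singleton", spot["seq_id"])
        annotated[key] = annotated.get(key, False) or bool(spot["description"])
    known = sum(annotated.values())
    return {
        "spots": len(spots),
        "unique_genes": len(annotated),
        "known": known,
        "unknown": len(annotated) - known,
    }


example_spots = [
    {"seq_id": "S1", "contig_id": "CT1", "description": "made-up annotation"},
    {"seq_id": "S2", "contig_id": "CT1", "description": None},
    {"seq_id": "S3", "contig_id": None, "description": None},
    {"seq_id": "S4", "contig_id": None, "description": "made-up annotation"},
]
print(characterize_gene_set(example_spots))
```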

31
Post-hybridization data analysis and management
32
Post-hybridization data analysis
  • Software for Microarray Analysis At IBS
  • GenePix Pro 5.0: image processing
  • GeneSpring: microarray data analysis
  • Spotfire: microarray data analysis and data
    storage
  • TransPath: pathway searching

33
Image Processing
  • GenePix Pro 5.0
  • GAL (GenePix Array List) file

34
From multi-well plate to microarray
35
GAL online
36
GeneSpring at IBS
  • for microarray data analyses
  • standalone software
  • providing statistical methods for data analysis
  • Some bioinformatics
  • providing visualization
  • licensed annually
  • rigid format requirement for input data
  • requiring installation of a master gene list
    (master table) prior to data analysis

37
Master table for GeneSpring
  • Master table contains information of
  • Id
  • Source of DNA
  • Gene name
  • Gene function annotation (from Blast results)
  • GO annotation
  • Each array needs its own master table
  • The format of the master table may vary between
    versions of the software.

38
To generate master table for GeneSpring
  • Batch BLAST against three sequence databases
  • Parsing BLAST results
  • Incorporating EC numbers, GO numbers and other
    related data from the best BLAST matches
  • Integrating all required data from various files
    and generating the master table
  • checking
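
The parsing and integration steps might look like the following sketch. It assumes the BLAST results are in the standard 12-column tabular format and that "best match" means the hit with the lowest E-value. The output columns and the annotation lookup are hypothetical and do not reproduce GeneSpring's actual master table format.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular output.
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text: str) -> dict:
    """Return the lowest-E-value hit per query sequence."""
    best = {}
    for fields in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, fields))
        hit["evalue"] = float(hit["evalue"])
        current = best.get(hit["qseqid"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["qseqid"]] = hit
    return best


def master_table_rows(best: dict, go_by_subject: dict) -> list:
    """Join best hits with a (hypothetical) subject-id -> GO annotation lookup."""
    return [
        [query_id, hit["sseqid"], hit["evalue"], go_by_subject.get(hit["sseqid"], "")]
        for query_id, hit in sorted(best.items())
    ]


# Made-up BLAST output for two query sequences.
demo = (
    "S1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "S1\tsubjB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-20\t100\n"
    "S2\tsubjC\t85.0\t80\t12\t0\t1\t80\t1\t80\t1e-5\t50\n"
)
rows = master_table_rows(best_hits(demo), {"subjA": "GO:made-up"})
for row in rows:
    print("\t".join(str(value) for value in row))
```

The final "checking" step matters because any query without a hit, or any hit without an annotation, silently produces an empty field that the analysis software may reject or misread.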

39
Spotfire
  • for microarray data analyses
  • server-client software
  • linked to Oracle database for data storage
  • providing various statistical methods for data
    analysis
  • can establish links to more
    bioinformatics tools
  • can record analysis procedure
  • more flexible format requirement for input data

40
One color array for Arabidopsis
  • Affymetrix ATH1 chip
  • Annotation information provided by company and
    available on internet

41
Bioinformatics support at Affymetrix
42
Projects for now and the near future
  • Infrastructure build-up
  • Microarray data management system
  • Platform for Bioinformatics analyses
  • Plant Signaling Pathway Database

43
Team
44
Thank you!