Introduction and Applications of Microarray Databases


1
Introduction and Applications of Microarray
Databases
  • Chen-hsiung Chan
  • Department of Computer Science and Information
    Engineering
  • National Taiwan University

2
MIAME (Minimum Information About a Microarray
Experiment)
  • MIAME describes the Minimum Information About a
    Microarray Experiment that is needed to interpret
    the results of the experiment unambiguously and,
    potentially, to reproduce the experiment (Brazma
    et al., Nature Genetics).

3
MIAME
  • raw data (CEL or GPR files)
  • final processed (normalized) data
  • essential sample annotation including
    experimental factors and their values
  • experimental design including sample data
    relationships
  • sufficient annotation of the array
  • essential laboratory and data processing
    protocols
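
As a rough illustration, the six components listed above can be treated as a checklist. The sketch below is only an assumption about how such a check might look: the short key names and the `missing_miame_components` helper are hypothetical, not part of MIAME or of any database's submission tool.

```python
# The six MIAME components from the list above.
# The short key names are hypothetical labels chosen for this sketch.
MIAME_COMPONENTS = {
    "raw_data": "raw data (CEL or GPR files)",
    "processed_data": "final processed (normalized) data",
    "sample_annotation": "sample annotation with experimental factors and values",
    "experimental_design": "experimental design and sample-data relationships",
    "array_annotation": "annotation of the array",
    "protocols": "laboratory and data processing protocols",
}


def missing_miame_components(submission):
    """Return descriptions of components that are absent or empty in `submission`."""
    return [
        description
        for key, description in MIAME_COMPONENTS.items()
        if not submission.get(key)
    ]


# A made-up, incomplete submission used only to exercise the helper.
submission = {
    "raw_data": ["sample1.CEL", "sample2.CEL"],
    "processed_data": "normalized.txt",
    "protocols": "RNA extraction and hybridization protocol",
}
for item in missing_miame_components(submission):
    print("missing:", item)
```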

4
Databases using MIAME
  • ArrayExpress at EBI
  • GEO at NCBI
  • CIBEX at DDBJ

5
ArrayExpress
  • http://www.ebi.ac.uk/microarray-as/aer/
  • Stores transcriptomics and related data
  • Data warehouse stores gene-indexed expression
    profiles
  • In accordance with the MGED recommendations (MIAME)

7
ArrayExpress statistics
  • Experiment repository: 2,914 experiments (each
    with at least 6 microarrays) and growing
  • Expression profiles: 267 experiments,
    121,891 genes
  • Data warehouse updated every day

8
Searching ArrayExpress
  • Keywords: breast cancer, cell cycle, etc.
  • Accession numbers: E-XXXX-d, e.g. E-AFFY-1281,
    E-TIGR-372, etc.
  • Secondary accession numbers: GEO accessions, e.g.
    GSE5389.
  • Species names: mainly Latin names (e.g. Homo
    sapiens); common names may be used as well (e.g.
    human).
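
The kinds of search term above can be told apart by their shape. The following is a minimal sketch, assuming that "E-XXXX-d" means "E-", four capital letters, "-", and digits, as in the examples given. The `classify_query` helper is hypothetical and is not how ArrayExpress itself interprets queries.

```python
import re

# Assumed shape of an ArrayExpress accession, read off "E-XXXX-d".
ARRAYEXPRESS_ACCESSION = re.compile(r"E-[A-Z]{4}-\d+")
# GEO Series accessions, used by ArrayExpress as secondary accessions.
GEO_SERIES_ACCESSION = re.compile(r"GSE\d+")


def classify_query(term):
    """Guess which kind of search term `term` is, based only on its shape."""
    term = term.strip()
    if ARRAYEXPRESS_ACCESSION.fullmatch(term):
        return "accession number"
    if GEO_SERIES_ACCESSION.fullmatch(term):
        return "secondary accession number"
    # Keywords and species names cannot be distinguished by shape alone.
    return "keyword or species name"


for term in ["E-AFFY-1281", "GSE5389", "Homo sapiens", "cell cycle"]:
    print(term, "->", classify_query(term))
```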

10
ArrayExpress interface
12
ArrayExpress Search/Browse Result: Keyword "lung
cancer"
13
ArrayExpress Search/Browse Result: Detailed view
19
Expression Profile results
  • Thumbnail view
  • BigPlot view
  • Gene ranking (experiments in which the gene is
    most differentially expressed are ranked first)
  • Similarity search: search for genes with similar
    expression levels
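
One common way to find genes with similar expression is to correlate their profiles. The slides do not say which similarity measure ArrayExpress uses, so the sketch below substitutes plain Pearson correlation as an assumption. The gene names and values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return cov / sqrt(var_x * var_y)


def most_similar(profiles, query, top=3):
    """Return up to `top` (gene, correlation) pairs most similar to `query`."""
    scores = [
        (pearson(profiles[query], values), gene)
        for gene, values in profiles.items()
        if gene != query
    ]
    scores.sort(reverse=True)  # highest correlation first
    return [(gene, score) for score, gene in scores[:top]]


# Made-up profiles: one value per sample, same sample order for every gene.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
print(most_similar(profiles, "geneA", top=2))
```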

22
Take a break
23
Gene Expression Omnibus (GEO)
  • http://www.ncbi.nlm.nih.gov/geo/
  • Gene expression/molecular abundance repository
  • MIAME compliant
  • Supports browsing, query and retrieval

25
GEO record types
  • Platform
  • Sample
  • Series
  • DataSet
  • Profile

26
GEO Platform
  • A Platform record defines the list of elements that
    may be detected and quantified in an experiment
    (e.g., cDNAs, oligonucleotide probesets)
  • Each Platform record is assigned a unique and
    stable GEO accession number (GPLxxx)
  • A Platform may reference many Samples that have
    been submitted by multiple submitters

27
GEO Sample
  • A Sample record describes the conditions under
    which an individual Sample was handled, the
    manipulations it underwent, and the abundance
    measurement of each element derived from it
  • Each Sample record is assigned a unique and
    stable GEO accession number (GSMxxx)
  • A Sample must reference exactly one Platform
    and may be included in multiple Series
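
The relationships described so far (a Platform lists elements, a Sample references one Platform and may belong to several Series) can be sketched as a small data model. This is only an illustration: the class layout, field names, and accession values below are hypothetical and do not reflect GEO's internal schema.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    accession: str                                   # GPLxxx
    elements: list = field(default_factory=list)     # e.g. probe identifiers


@dataclass
class Sample:
    accession: str                                   # GSMxxx
    platform: Platform                               # exactly one Platform
    measurements: dict = field(default_factory=dict)  # element -> abundance
    series: list = field(default_factory=list)       # Series accessions, possibly several


def samples_on_platform(samples, platform_accession):
    """Return the accessions of all Samples that reference the given Platform."""
    return [s.accession for s in samples if s.platform.accession == platform_accession]


# Placeholder accessions, not real GEO records.
platform = Platform("GPL0001", ["probe_a", "probe_b"])
samples = [
    Sample("GSM0001", platform, {"probe_a": 7.2, "probe_b": 5.1}, ["GSE0001", "GSE0002"]),
    Sample("GSM0002", platform, {"probe_a": 6.9, "probe_b": 5.4}, ["GSE0001"]),
]
print(samples_on_platform(samples, "GPL0001"))
```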

29
GEO Series
  • A Series record links together a group of related
    Samples and provides a focal point and
    description of the whole study
  • Series records may also contain tables describing
    extracted data, summary conclusions, or analyses
  • Each Series record is assigned a unique and
    stable GEO accession number (GSExxx)
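
Because each record type has its own accession prefix (GPL, GSM, GSE), the type of a record can be read off its accession. A minimal sketch, assuming an accession is the prefix followed only by digits; the `geo_record_type` helper is hypothetical.

```python
import re

# Prefixes for the three record types described so far.
GEO_PREFIXES = {"GPL": "Platform", "GSM": "Sample", "GSE": "Series"}


def geo_record_type(accession):
    """Return the record type for a GEO accession, or None if it is not recognized."""
    match = re.fullmatch(r"(GPL|GSM|GSE)(\d+)", accession.strip().upper())
    if match is None:
        return None
    return GEO_PREFIXES[match.group(1)]


for accession in ["GSE5389", "gsm12345", "E-AFFY-1281"]:
    print(accession, "->", geo_record_type(accession))
```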

31
GEO DataSet
  • Assembled at NCBI
  • Samples are all equivalently measured and
    normalized
  • Can be viewed and analyzed with NCBI's advanced
    data display and analysis tool

33
GEO Profile
  • A Profile consists of the expression measurements
    for an individual gene across all Samples in a
    DataSet
  • Profiles can be searched using Entrez GEO
    Profiles
  • Similar to Expression Profile in ArrayExpress
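
Put another way, if a DataSet is viewed as a table of Samples by genes, a Profile is the slice for one gene. The sketch below uses a made-up, in-memory DataSet; the `gene_profile` helper and all identifiers in it are hypothetical.

```python
def gene_profile(dataset, gene):
    """Collect one gene's measurement from every Sample that reports it."""
    return {
        sample: values[gene]
        for sample, values in dataset.items()
        if gene in values
    }


# Made-up DataSet: Sample accession -> {gene -> normalized expression value}.
dataset = {
    "GSM0001": {"geneA": 8.1, "geneB": 3.4},
    "GSM0002": {"geneA": 7.9, "geneB": 3.6},
    "GSM0003": {"geneA": 2.2, "geneB": 3.5},
}
print(gene_profile(dataset, "geneA"))
```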

36
SOFT (Simple Omnibus Format in Text)
  • Text based
  • Line based
  • Easily parsed with text processing languages,
    including Perl, Python, Ruby, PHP, etc.
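
To show how little code a line-based format needs, here is a minimal Python sketch of a SOFT reader. It relies on the line prefixes documented by GEO rather than on anything stated above: "^" starts a new entity, "!" carries an attribute, "#" describes a table column, and other lines are tab-delimited table rows. The sample text and accessions are made up, and real files have more cases than this handles.

```python
def parse_soft(text):
    """Parse SOFT text into a list of records (dicts with type, id, attributes, rows)."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("^"):
            # Entity line, e.g. "^SAMPLE = GSM0001"
            entity_type, _, entity_id = line[1:].partition("=")
            current = {
                "type": entity_type.strip(),
                "id": entity_id.strip(),
                "attributes": {},
                "rows": [],
            }
            records.append(current)
        elif current is None:
            continue  # ignore anything before the first entity line
        elif line.startswith("!"):
            key, sep, value = line[1:].partition("=")
            if sep:  # table begin/end markers have no "=" and are skipped
                current["attributes"].setdefault(key.strip(), []).append(value.strip())
        elif line.startswith("#"):
            continue  # column descriptions are not kept in this sketch
        else:
            # Table line; the first one collected is the column header row.
            current["rows"].append(line.split("\t"))
    return records


SOFT_EXAMPLE = "\n".join([
    "^SAMPLE = GSM0001",
    "!Sample_title = control replicate 1",
    "!Sample_platform_id = GPL0001",
    "#ID_REF = probe identifier",
    "#VALUE = normalized signal",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "probe_a\t7.2",
    "probe_b\t5.1",
    "!sample_table_end",
])

for record in parse_soft(SOFT_EXAMPLE):
    print(record["type"], record["id"], record["attributes"], record["rows"])
```

Attribute values are stored as lists because the same attribute key may appear on several lines of one record.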

38
Take a break
39
Network Biology Visualization and Analysis
40
Cytoscape
  • Open source network visualization and analysis
    software
  • Core features include network layout and querying,
    and integration of visualizations with state data
  • Can be extended by plugins

41
Cytoscape developers
  • University of California at San Diego (Trey
    Ideker)
  • Institute for Systems Biology (Leroy Hood)
  • Memorial Sloan-Kettering Cancer Center (Chris
    Sander)
  • Institut Pasteur (Benno Schwikowski)
  • Agilent Technologies (Annette Adler)
  • University of California at San Francisco (Bruce
    Conklin)

42
Cytoscape
  • A Java application
  • Requires Java 5 or 6 (JDK 5/6 or JRE 5/6)

44
Simple Interaction Format (SIF)
  • Each line denotes one interaction: InteractorA xx
    InteractorB
  • xx is the interaction type
  • pp: protein-protein interaction
  • pd: protein-DNA interaction (transcription
    factor/regulation)
  • pr (protein-reaction), rc (reaction-compound), cr
    (compound-reaction), gl (genetic-lethal), pm
    (protein-metabolite), mp (metabolite-protein)
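
A SIF file can be read with a few lines of code. The sketch below is a minimal reader, not Cytoscape's own parser. It assumes two conventions from the Cytoscape documentation that are not stated above: fields are split on tabs if the line contains any tab and on whitespace otherwise, and one line may list several targets after the interaction type. The node names are made up.

```python
def parse_sif(text):
    """Parse SIF text into a list of (source, interaction_type, target) edges."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Tab-delimited if a tab is present, otherwise whitespace-delimited.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue  # a lone node with no interaction contributes no edge
        source, interaction = fields[0], fields[1]
        for target in fields[2:]:
            edges.append((source, interaction, target))
    return edges


SIF_EXAMPLE = "\n".join([
    "proteinA pp proteinB",
    "proteinA pd geneC geneD",
    "proteinE",
])

for edge in parse_sif(SIF_EXAMPLE):
    print(edge)
```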

45
Other interaction formats supported
  • GML
  • XGMML
  • SBML
  • BioPAX
  • PSI-MI
  • Tab-delimited text tables and Excel

46
Cytoscape Demonstration
47
Applications of Gene Expression
  • Gene selection (differentially expressed genes)
  • State annotation in networks (expression level)
  • Gene regulatory network identification
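
As a simple illustration of gene selection, differentially expressed genes can be picked by fold change between two conditions. The slides do not prescribe a method, so the sketch below uses a plain log2 fold-change threshold as an assumption; real analyses usually add replicates and a statistical test. The gene names, values, and the `select_genes` helper are made up.

```python
from math import log2


def select_genes(control, treated, min_abs_log2_fc=1.0):
    """Return {gene: log2 fold change} for genes at or above the threshold."""
    selected = {}
    for gene, control_value in control.items():
        treated_value = treated.get(gene)
        if treated_value is None:
            continue  # gene not measured in both conditions
        if control_value <= 0 or treated_value <= 0:
            continue  # log ratio is undefined for non-positive intensities
        log2_fc = log2(treated_value / control_value)
        if abs(log2_fc) >= min_abs_log2_fc:
            selected[gene] = log2_fc
    return selected


# Made-up intensities on a linear (not log) scale.
control = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0}
treated = {"geneA": 400.0, "geneB": 110.0, "geneC": 25.0}
print(select_genes(control, treated))
```

The resulting values could also serve as the state data attached to network nodes, for example as a node attribute keyed by gene name.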