Title: GENAdb


1
GENAdb Genomics Array Database
GENAdb
CSIRO Plant Industry Genomics Array Database
Presented by Gavin Kennedy, CSIRO Mathematical and Information Sciences
2
Overview of the Microarray Process
3
Identification of expressed genes
Analyzed data
4
Data points in the microarray process
5
Drivers for a microarray database
  • Centralised storage and management for the
    details of experimental conditions
  • Centralised storage and management for large
    volumes of result sets
      • 1 million individual data points for a single
        slide
  • Consistent access to stored experiment details
    and result sets.
  • Ability to perform structured queries across all
    the data.
  • Investigation of genes of interest across
    result sets.
  • Allow annotation at various stages of the
    microarray process.
  • Allow analytical tools to be used against the
    data and subsequent storage of analysis results.

6
What GENA does
  • Stores data related to a microarray experiment
      • Data to describe the experiment
      • Data to reproduce the experiment
  • Stores results from a microarray experiment
      • Results to support further investigation
      • Results to support a publication
      • Results for multiple analysis techniques
  • Stores data and results to support enquiry and
    analysis from multiple perspectives
  • Provides the interfaces to store, access and
    analyse the data and results

7
What GENA does not do
  • Store BLAST results
  • Analyse the result sets for you
  • Limitations
      • Time and complexity
      • Focus on purpose

8
High level structure of the database
  • Generic at the organism level
      • Plants and fungi for Plant Industry
      • Animals and bacteria for Livestock Industry
  • Specific at the microarray technology level (so far)
  • Three subsets of data tables
      • Arrays
      • Samples
      • Results

9
High level structure of the database
  • Structure consistent with other microarray
    databases
      • RAD: RNA Abundance Database
        (U. Pennsylvania, www.cbil.upenn.edu/RAD2/)
      • SMD: Stanford Microarray Database
        (genome-www.stanford.edu/MicroArray/SMD/)
      • GeneX
        (NCGR, genex.ncgr.org)
  • Compliance with developing standards
      • MIAME: Minimum Information About a Microarray
        Experiment
        (MGED, www.mged.org)

10
Data points in the microarray process
11
Entity Relationship Diagram
[Figure: entity relationship diagram; only the labels survive in this transcript.]
  • Entities: Source, Sample, Slide, Scan, Experiment, Array, Plate,
    Amplification, Spot, Result Set, Sequence
  • Relationship labels: Hybridised Onto, Produces, Used In, Groups,
    Scans As, Mapped by, Contains, Spotted From, Amplified From,
    Consists of, Recorded In, Identifies
  • Cardinality legend: Mandatory 1, 1 to Many, 0 or 1, Zero to Many
12
Gena Schema: Source to Slide (Hybridisation)
13
Gena Schema: Plate to Slide (Spotting)
Slide
Slide_ID
Experiment_ID
Array_ID
Date_Spotted
Date_Hybridised
Bio_Replicate_No
Tech_Replicate_No
Sample_X_ID
X_Labelling_Info
Sample_Y_ID
Y_Labelling_Info
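
As an illustration only, the Slide table listed above might be declared as follows. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the Oracle database. The column names come from the slide; the column types, the choice of primary key, and the sample row are assumptions.

```python
import sqlite3

# Column names are taken from the slide; the types and the key are assumed.
SLIDE_DDL = """
CREATE TABLE Slide (
    Slide_ID          INTEGER PRIMARY KEY,
    Experiment_ID     INTEGER,
    Array_ID          INTEGER,
    Date_Spotted      TEXT,
    Date_Hybridised   TEXT,
    Bio_Replicate_No  INTEGER,
    Tech_Replicate_No INTEGER,
    Sample_X_ID       INTEGER,
    X_Labelling_Info  TEXT,
    Sample_Y_ID       INTEGER,
    Y_Labelling_Info  TEXT
)
"""


def slide_columns(conn):
    """Return the Slide column names in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    return [row[1] for row in conn.execute("PRAGMA table_info(Slide)")]


conn = sqlite3.connect(":memory:")
conn.execute(SLIDE_DDL)

# Hypothetical row: two samples (X and Y) hybridised onto one slide.
conn.execute(
    "INSERT INTO Slide (Slide_ID, Experiment_ID, Array_ID, Sample_X_ID, Sample_Y_ID) "
    "VALUES (?, ?, ?, ?, ?)",
    (1, 10, 100, 7, 8),
)
row = conn.execute(
    "SELECT Sample_X_ID, Sample_Y_ID FROM Slide WHERE Slide_ID = 1"
).fetchone()
```

Keeping both sample identifiers on the slide row mirrors the two-sample layout suggested by the Sample_X_ID and Sample_Y_ID columns; the foreign keys to the experiment, array and sample tables are left out because the transcript does not show them.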
14
Gena Schema: Slide to Results (Scanning and Quantification)
Slide
Slide_ID
Experiment_ID
Array_ID
Date_Spotted
Date_Hybridised
Bio_Replicate_No
Tech_Replicate_No
Sample_X_ID
X_Labelling_Info
Sample_Y_ID
Y_Labelling_Info
Norm_Results_1
Norm_Results_2
Norm_Results_3
Primary_Results_ID
Scan_ID
Spot_ID
Ch1_Median
Ch1_Mean
Ch2_Median
Ch2_Mean
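
The result columns above suggest how one spot could be followed across scans. The sketch below assumes that Primary_Results_ID, Scan_ID, Spot_ID and the four channel columns belong to a single table, here given the hypothetical name Primary_Results. The table name, the column types, the sample values, and the choice of channel 2 over channel 1 for the ratio are all assumptions; sqlite3 again stands in for Oracle.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the column names come from the slide, the rest is assumed.
conn.execute("""
CREATE TABLE Primary_Results (
    Primary_Results_ID INTEGER PRIMARY KEY,
    Scan_ID    INTEGER,
    Spot_ID    INTEGER,
    Ch1_Median REAL,
    Ch1_Mean   REAL,
    Ch2_Median REAL,
    Ch2_Mean   REAL
)""")

# Made-up intensities for illustration only.
rows = [
    (1, 1, 42, 200.0, 210.0, 800.0, 790.0),
    (2, 2, 42, 400.0, 395.0, 400.0, 410.0),
    (3, 1, 43, 100.0, 105.0, 50.0, 55.0),
]
conn.executemany("INSERT INTO Primary_Results VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def log_ratios_for_spot(conn, spot_id):
    """Return {Scan_ID: log2(Ch2_Median / Ch1_Median)} for one spot."""
    cur = conn.execute(
        "SELECT Scan_ID, Ch1_Median, Ch2_Median FROM Primary_Results "
        "WHERE Spot_ID = ? ORDER BY Scan_ID",
        (spot_id,),
    )
    return {scan: math.log2(ch2 / ch1) for scan, ch1, ch2 in cur}


ratios = log_ratios_for_spot(conn, 42)
```

With the sample data, spot 42 has a log ratio of 2.0 in scan 1 and 0.0 in scan 2, which is the kind of cross-result-set comparison the database is meant to support.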
15
Gena Database Schema
16
Normalisation and Analysis
  • Three sets of normalised data reflecting
    three normalisation methods
      • 1st: Most popular
      • 2nd: Second most popular
      • 3rd: Flavour of the month
  • Normalisation performed internal to the database
  • Each new normalisation technique requires all
    result sets to be re-normalised
  • Analysis can be carried out on any of the three
    normalised result sets
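
The slides do not name the three normalisation methods, so the sketch below uses global median centring of log ratios purely as a stand-in. It illustrates why adding a new technique means re-normalising every stored result set: the method is applied to each set in turn. The function names and sample values are hypothetical, and the code runs in Python rather than inside the database as the slide describes.

```python
from statistics import median


def median_centre(log_ratios):
    """Shift the log ratios so that their median becomes zero."""
    m = median(log_ratios)
    return [r - m for r in log_ratios]


def renormalise_all(result_sets, method):
    """Apply one normalisation method to every stored result set."""
    return {name: method(values) for name, values in result_sets.items()}


# Hypothetical raw log2 ratios for two result sets.
result_sets = {
    "scan_1": [2.0, 1.0, 3.0],
    "scan_2": [0.5, -0.5, 0.0],
}

# One of the three normalised sets; the other two would be produced the
# same way with different method functions.
norm_results_1 = renormalise_all(result_sets, median_centre)
```

Passing the method in as a function keeps the three normalised sets independent of each other, so analysis can run against whichever one is preferred.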

17
Communications Model
18
Implementation of Gena
  • GENAdb runs as an Oracle database
  • Oracle database hosted on a dedicated NT/2000
    server managed by IT Support Group
  • A separate (but connected) web server runs
    Apache JServer
  • Java servlets used to
      • Generate web pages
      • Format, process and store data
  • Oracle/Java combination makes Gena portable to
    Unix/Linux platforms
  • R will run within the structure for statistical
    analysis tasks

19
Timeline
  • 3 months
      • Ready to load GPR files
  • 6 months
      • 90% of expected functionality implemented
  • 9-12 months
      • Refine existing processes
      • Implement remaining processes

20
  • Tell me more
      • Gena mailing list
        gena_at_pi.csiro.au
      • Gena development homepage
        www.pi.csiro.au/gena/
      • Feedback is welcome
  • Acknowledgements
      • Chris Helliwell
      • Iain Wilson
      • IT Support Group
      • Robert Power (CMIS)