Title: ArrayExpress: A public database for microarray-based gene expression data (http://www.ebi.ac.uk/microarray/)
1ArrayExpress: A public database for microarray-based gene expression data
http://www.ebi.ac.uk/microarray/
- European Bioinformatics Institute
- EMBL-EBI
- Alvis Brazma, Helen Parkinson, Ugis Sarkans,
Mohammadreza Shojatalab, Jaak Vilo team
MGED IV, Boston, February 2002
2ArrayExpress
Tuesday, February 12th, 2002: Opened to public
- Standards: MIAME-compliant
- Data model: MAGE-OM
- Data input: MAGE-ML, web
- Data output: HTML, MAGE-ML, TAB-delimited, link to Expression Profiler
- Data curation: team of curators
- Data sets: yeast, human
3General overview
[Figure labels: ArrayExpress, MAGE-ML, MIAMExpress, Expression Profiler, Internet, www]
4ArrayExpress component architecture
www
Application server: Java servlets, MAGE-OM
ArrayExpress
Main database: SQL, derived from MAGE-OM
Data warehouse: gene-centred queries
Submission/curation
Images: file server
MAGE-ML
5ArrayExpress - features
- MIAME-compliant, MAGE-ML, MAGE-OM
- Can deal with:
  - raw quantitation data
  - processed data
  - data transformations
- Independent of:
  - experimental platforms
  - image analysis methods
  - data normalization methods
6ArrayExpress details
- Database schema derived from MAGE-OM
- Standard SQL; we use Oracle
- Data loader for MAGE-ML (generated)
- Web interface (first release 12.2.2002)
- Queries by experiment, array, sample
- Browsing
- Object model-based query mechanism, automatic mapping to SQL
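A rough illustration of what "object model-based queries mapped automatically to SQL" can look like. This is not the ArrayExpress code (the slides mention Java servlets and Oracle); it is a minimal Python sketch, and every class, table, and column name in it is hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from object-model classes and attributes to SQL
# tables and columns. None of these names come from the real schema.
MAPPING = {
    "Experiment": ("experiment", {"accession": "accession", "species": "species_name"}),
    "Array": ("array_design", {"accession": "accession", "provider": "provider_name"}),
}


@dataclass
class ObjectQuery:
    """A query phrased against the object model: a class plus attribute filters."""
    class_name: str
    filters: dict = field(default_factory=dict)


def to_sql(query: ObjectQuery) -> tuple[str, list]:
    """Translate an ObjectQuery into a parameterised SQL string and bind values."""
    table, columns = MAPPING[query.class_name]
    clauses, params = [], []
    for attr, value in query.filters.items():
        # An attribute missing from the mapping raises KeyError.
        clauses.append(f"{columns[attr]} = ?")
        params.append(value)
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

The caller only names object-model classes and attributes; the mapping table is the single place that knows about the relational layout, which is the general idea behind deriving both the schema and the queries from one model.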
7Simplified ArrayExpress model
8MIAMExpress
- Data annotation and submission tool
- MIAME based web interface
- Experiment, Array, Protocol submissions
- Uses CV/ontology wherever possible
- Creates MAGE-ML files for loading into ArrayExpress
- Based on MySQL, Perl, CGI, Apache
10MIAMExpress submission procedure
11MIAMExpress design and future
- Species- and domain-specific pages and ontologies, ontology development
- Life-span of data submissions is long
- Curation control, submissions tracking
- Interaction with ArrayExpress
- Full MAGE-OM, data updating
- Usability, flexibility, scalability, platform independence
- User needs, free in-house installation
12ArrayExpress curation effort
- User support and help documentation
- Submission support for MIAMExpress
- Support on ontologies and CVs
- Minimize free text, removal of synonyms
- MIAME encouragement
- Help on MAGE-ML
- Goal: to provide high-quality, well-annotated data to allow automated data analysis
13Accession numbers
- E-MEXP-234: Experiment 234 via MIAMExpress
- E-SANG-25: Experiment 25 from Sanger Institute
- A-AFFY-1034: Array description 1034 from Affymetrix
- P-LABL-5: Protocol 5 for labeling
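The accession examples above follow a visible pattern: a type letter, a source code, and a number. A minimal Python sketch of a parser for that pattern follows. The regular expression (exactly four capital letters for the source, only the letters E, A, and P for the type) is inferred from these four examples and is an assumption, not a published specification.

```python
import re
from typing import NamedTuple

# Object-type prefixes as they appear in the examples on the slide.
OBJECT_TYPES = {"E": "experiment", "A": "array description", "P": "protocol"}

# Assumed shape: one type letter, a four-letter source code, a number.
ACCESSION_RE = re.compile(r"^([EAP])-([A-Z]{4})-(\d+)$")


class Accession(NamedTuple):
    object_type: str
    source: str
    number: int


def parse_accession(text: str) -> Accession:
    """Split an accession such as 'E-MEXP-234' into its three parts."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised accession: {text!r}")
    kind, source, number = match.groups()
    return Accession(OBJECT_TYPES[kind], source, int(number))
```

For example, `parse_accession("A-AFFY-1034")` yields an `Accession` with object type `"array description"`, source `"AFFY"`, and number `1034`.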
14Data in ArrayExpress
Now
Work underway
- Human data (ironchip) from EMBL
- Yeast data from EMBL
- S. pombe data from Sanger Institute
- TIGR array descriptions
- Affymetrix chip designs
- Direct pipeline from Sanger (Rob Andrews)
- HGMP mouse
- EMBL mosquito
- (Add your name here!)
15Data browsing and queries
17Experiment info
18Sample info
19General overview
[Figure labels: ArrayExpress, MAGE-ML, MIAMExpress, Expression Profiler, Internet, www]
20Expression Profiler EPCLUST
[Figure labels: FOLDER, DATA, SELECT, ANALYZE, A CLUSTER, URLMAP]
21101 Sequences relative to ORF start
YGR128C 100
>YAL036C chromo1 coord(76154-75048(C)) start-600 end2 seq(76152-76754)
TGTTCTTTCTTCTT
CTGCTTCTCCTTTTCCTTTTTTTCCTTCTCCTTTTCCTTCTTGGACTTTA
GTATAGGCTTACCATCCTTCTTCTCTTCAATAACCTTCTTTTCTTGCTTC
TTCTTCGATTGCTTCAAAGTAGACATGAAGTCGCCTTCAATGGCCTCAGC
ACCTTCAGCACTTGCACTTGCTTCTCTGGAAGTGTCATCTGCACCTGCGC
TGCTTTCTGGATTTGGAGTTGGCGTGGCACTGATTTCTTCGTTCTGGGCG
GCGTCTTCTTCGAATTCCTCATCCCAGTAGTTCTGTTGGTTCTTTTTACT
CTTTTTCGCCATCTTTCACTTATCTGATGTTCCTGATTGCCCTTCTTATC
CCCTCAAAGTTCACCTTTGCCACTTATTCTAGTGCAAGATCTCTTGCTTT
CAATGGGCTTAAAGCTTGAAAAATTTTTTCACATCACAAGCGACGAGGGC
CCGTTTTTTTCATCGATGAGCTATAAGAGTTTTCCACTTTTAAGATGGGA
TATTACGGTGTGATGAGGGCGCAATGATAGGAAGTGTTTGAAGCTAGATG
CAGTAGGTGCAAGCGTAGAGTTGTTGATTGAGCAAA_ATG_
>YAL025C chromo1 coord(101147-100230(C)) start-600 end2 seq(101145-101747)
CTTAGAAGATAAAGTAGTGAATT
ACAATAAATTCGATACGAACGTTCAAATAGTCAAGAATTTCATTCAAAGG
GTTCAATGGTCCAAGTTTTACACTTTCAAAGTTAACCACGAATTGCTGAG
TAAGTGTGTTTATATTAGCACATTAACACAAGAAGAGATTAATGAACTAT
CCACATGAGGTATTGTGCCACTTTCCTCCAGTTCCCAAATTCCTCTTGTA
AAAAACTTTGCATATAAAATATACAGATGGAGCATATATAGATGGAGCAT
ACATACATGTTTTTTTTTTTTTAAAAACATGGACTCGAACAGAATAAAAG
AATTTATAATGATAGATAATGCATACTTCAATAAGAGAGAATACTTGTTT
TTAAATGAGAATTGCTTTCATTAGCTCATTATGTTCAGATTATCAAAATG
CAGTAGGGTAATAAACCTTTTTTTTTTTTTTTTTTTTTTTTGAAAAATTT
TCCGATGAGCTTTTGAAAAAAAATGAAAAAGTGATTGGTATAGAGGCAGA
TATTGCATTGCTTAGTTCTTTCTTTTGACAGTGTTCTCTTCAGTACATAA
CTACAACGGTTAGAATACAACGAGGAT_ATG_
...
>YBR084W chromo2 coord(411012-413936) start-600 end2 seq(410412-411014)
CCATGTATCCAAGACCTGCTGAAGATGCTT
ACAATGCCAATTATATTCAAGGTCTGCCCCAGTACCAAACATCTTATTTT
TCGCAGCTGTTATTATCATCACCCCAGCATTACGAACATTCTCCACATCA
AAGGAACTTTACGCCATCCAACCAATCGCATGGGAACTTTTATTAAATGT
CTACATACATACATACATCTCGTACATAAATACGCATACGTATCTTCGTA
GTAAGAACCGTCACAGATATGATTGAGCACGGTACAATTATGTATTAGTC
AAACATTACCAGTTCTCGAACAAAACCAAAGCTACTCCTGCAACACTCTT
CTATCGCACATGTATGGTTCTTATTGTTTCCCGAGTTCTTTTTTACTGAC
GCGCCAGAACGAGTAAGAAAGTTCTCTAGCGCCATGCTGAAATTTTTTTC
ACTTCAACGGACAGCGATTTTTTTTCTTTTTCCTCCGAAATAATGTTGCA
GCGGTTCTCGATGCCTCAAGAATTGCAGAAGTAAACCAGCCAATACACAT
CAAAAAACAACTTTCATTACTGTGATTCTCTCAGTCTGTTCATTTGTCAG
ATATTTAAGGCTAAAAGGAA_ATG_
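The `coord` and `seq` ranges in the headers above are consistent with a window running from 600 bp upstream to 2 bp past the ORF start, mirrored for genes on the complementary strand. A minimal Python sketch of that arithmetic follows. The function names and the `"W"`/`"C"` strand flags are assumptions made for illustration, and the step of reverse-complementing complement-strand windows so they read towards the ATG is inferred from the sequences ending in `_ATG_`.

```python
def upstream_window(orf_start: int, strand: str,
                    start: int = -600, end: int = 2) -> tuple[int, int]:
    """Return (low, high) chromosome coordinates of the window around an ORF start.

    strand is 'W' for the forward strand or 'C' for the complementary strand
    (hypothetical flags chosen for this sketch).
    """
    if strand == "W":
        return orf_start + start, orf_start + end
    if strand == "C":
        # Upstream of a complement-strand gene lies at higher coordinates.
        return orf_start - end, orf_start - start
    raise ValueError(f"unknown strand: {strand!r}")


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string so a 'C' strand window reads 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

With the header values above, `upstream_window(76154, "C")` gives `(76152, 76754)` and `upstream_window(411012, "W")` gives `(410412, 411014)`, matching the `seq(...)` ranges shown for YAL036C and YBR084W.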
GATGAG.T 152/70 2453/508 R7.52345 BP1.02391e-33
G.GATGAG.T 139/49 2193/222 R13.244 BP2.49026e-33
AAAATTTT 163/77 2833/911 R4.95687 BP5.02807e-32
TGAAAA.TTT 145/53 2333/350 R8.85687 BP1.69905e-31
TG.AAA.TTT 153/61 2538/570 R6.45662 BP3.24836e-31
TG.AAA.TTTT 140/43 2254/260 R10.3214 BP3.84624e-30
TGAAA..TTT 154/65 2608/645 R5.82106 BP1.0887e-29
...
GATGAG.T TGAAA..TTT
221 mismatch
GATGAG.T TGAAA..TTT
GATGAG.T W/30 TGAAA..TTT
Upstream sequence (600bp)
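Patterns such as GATGAG.T use "." as a single-position wildcard. A minimal Python sketch of counting such a pattern across a set of upstream sequences follows. It is not the SPEXS algorithm, and the slide does not define its count columns, so returning a pair (total occurrences, number of sequences hit) is only one plausible reading and should be treated as an assumption.

```python
import re


def count_pattern(pattern: str, sequences: list[str]) -> tuple[int, int]:
    """Return (total occurrences, number of sequences with at least one hit).

    The pattern is assumed to contain only nucleotide letters and '.',
    so it can be used directly as a regular expression.
    """
    # A zero-width lookahead lets overlapping occurrences be counted too.
    regex = re.compile(f"(?={pattern})")
    total = 0
    sequences_hit = 0
    for seq in sequences:
        n = len(regex.findall(seq))
        total += n
        if n > 0:
            sequences_hit += 1
    return total, sequences_hit
```

Running the same function on the cluster's upstream sequences and on a background set gives two pairs of counts that can then be compared; how the slide derives its R and BP values from such counts is not stated here.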
24Components of Expression Profiler
http://ep.ebi.ac.uk/
External data, tools: pathways, function, etc.
EPPPI: protein-protein interactions
EPGO: Gene Ontology
Expression data
EPCLUST: expression data
GENOMES: sequence, function, annotation
URLMAP: provide links
SEQLOGO
SPEXS: discover patterns
PATMATCH: visualise patterns
25Acknowledgments: the team (3)
1999 November
MGED 1 in Hinxton, EBI
Alvis Brazma, Alan Robinson, Jaak Vilo
26Acknowledgments: the team (5)
2000 August
Alvis Brazma, Alan Robinson
Database
Ugis Sarkans
Expression Profiler
Research, students
Jaak Vilo
Thomas Schlitt
27Acknowledgments: the team (9)
2001 June
Alvis Brazma
Database
Curation
MIAMExpress
Ugis Sarkans
Helen Parkinson
Mohammadreza Shojatalab
Expression Profiler
Research, students
Jaak Vilo
Thomas Schlitt
Patrick Kemmeren
Katja Kivinen
Johan Rung
28Acknowledgments: the team (19)
2002 February
Alvis Brazma
Database
Curation
MIAMExpress
Ugis Sarkans
Helen Parkinson
Mohammadreza Shojatalab
Susanna Sansone
Ahmet Oezcimen
Gonzalo Garcia
Philippe Rocca-Serra
Niran Abeygunawardena
Ele Holloway
Expression Profiler
Research, students
Jaak Vilo
Thomas Schlitt
Lev Soinov
Patrick Kemmeren
Katja Kivinen
Anastasia Samsonova
Misha Kapushesky
Johan Rung
Koichi Tazaki