Title: Microarray and Gene Expression
2. Microarray and Gene Expression Markup Language (MAGE-ML): Evolution of a Standard
Michael Miller, Senior Application Developer, Rosetta Biosoftware
I3C Technical Meeting, November 8, 2002
3. Overview
- Acknowledgments
- A World without Standards
- Benefits of Standardization
- The Road to Standardization
- Lessons Learned
- Links
4. Acknowledgments
- Bill Andreopoulos, University of Toronto
- Cathy Ball, Stanford University
- Doug Bassett, Rosetta Biosoftware
- Derek Bernhart, Affymetrix
- Alvis Brazma, EBI
- Tina Boussard, Stanford University
- Steve Chervitz, Affymetrix
- Francisco De La Vega, Applied Biosystems
- Eric Deutsch, ISB
- Michael Dickson, LION
- David Frankel, IONA
- Jason Goncalves, Iobion
- Ken Griffiths, LION
- Robert Hubley, ISB
- Brandon Hunt, Rosetta Biosoftware
- Daniel Iordan, Iobion
- Hilmar Lapp, Novartis GNF
- Marc Lepage, Molecular Mining
- Scott Markel, LION
- W.L. Marks, Iobion
- Douglas McArthur, UC Santa Cruz
- Michael Miller, Rosetta Biosoftware
- Helen Parkinson, EBI
- Kjell Petersen, University of Bergen
- Todd Peterson, NCGR
- Angel Pizarro, University of Pennsylvania
- Alan Robinson, EBI
- Ugis Sarkans, EBI
- Martin Senger, EBI
- Paul Spellman, UC Berkeley
- Jason Stewart, Open Informatics
- Marcin Swiatek, Imaging Research
- Charles Troup, Agilent
- Joe White, TIGR
- John Yost, National Cancer Institute
- (and many more)
5. A World without Standards
- Proprietary Lock-in
  - Customers must use tools specific to the technology purchased.
  - Allows vendor control over customers' data.
- Continual Development
  - Everyone keeps adding or subtracting from a moving target.
  - What worked one day suddenly stops working the next.
6. Benefits of Standardization
- Empowers gene expression data and annotation interchange
  - Accommodate information from all leading industry data formats and gene expression platforms.
  - Convert from existing formats into MAGE-ML.
  - Exchange data with colleagues.
  - Facilitate development of interoperable freeware and commercial tools for Conversion, Visualization, and Analysis.
- Empowers online supplements to published manuscripts
  - Provide a consistent format for peer review.
  - Verify that the minimum information required for a microarray experiment is provided.
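As a rough illustration of the "verify minimum information" point, the check can be thought of as a completeness test over a submission. The sketch below is not part of MAGE-ML or any MGED tool: the section names follow the six areas commonly listed for MIAME, while the dictionary layout, key names, and the `missing_sections` helper are hypothetical and chosen only for this example.

```python
# Hypothetical completeness check for a microarray submission.
# The keys are illustrative labels for the six MIAME areas; they are
# NOT MAGE-ML element names and do not come from any real tool.
REQUIRED_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "normalization_controls",
)


def missing_sections(submission):
    """Return the required sections that are absent or empty, in checklist order."""
    return [name for name in REQUIRED_SECTIONS if not submission.get(name)]


if __name__ == "__main__":
    # Made-up submission: two sections are missing and one is empty.
    example = {
        "experimental_design": "time course, 4 points",
        "array_design": "vendor catalog array",
        "samples": ["sample A", "sample B"],
        "hybridizations": [],
    }
    for name in missing_sections(example):
        print("missing:", name)
```

A reviewer or repository could run a check of this kind before accepting a supplement; a real implementation would validate against the actual MAGE-ML DTD rather than a flat dictionary.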
7. The Road to Standardization: Beginnings
- OMG Life Sciences Research Technical Committee (LSR) issues a Request for Information about Gene Expression, January 1999
  - Requests organizations to submit requirements for interchange of Gene Expression Data.
  - EBI, Rosetta Inpharmatics, and NetGenics respond.
  - EBI replies with what will become the basis of Minimum Information About a Microarray Experiment (MIAME).
  - Rosetta Inpharmatics replies with what will be the basis of the XML format, Gene Expression Markup Language (GEML format) v1.0.
- Microarray Gene Expression Data Group (MGED) is established at the International Meeting on Microarray Gene Expression Databases, sponsored by EBI, November 1999
  - Four working groups established: MIAME, XML Format, Ontologies, and Normalization.
8. The Road to Standardization: Proposals
- LSR issues Gene Expression Request for Proposals (RFP), March 2000
  - Requirements based on the replies to the Request for Information.
- GEML v1.0 XML format introduced into production environment by Rosetta Inpharmatics and Agilent Technologies, June 2000
  - Launch of Rosetta Resolver system.
- MGED holds Second Meeting in Heidelberg, May 2000
  - Further develops MIAME.
  - Initial Microarray Markup Language format (MAML.dtd) is created.
- LSR receives the Initial Proposals against the Gene Expression RFP, November 2000
  - NetGenics submittal based on CORBA interfaces.
  - Rosetta Inpharmatics submittal based on GEML v2.0 format.
  - EBI submittal based on MAML format.
9. The Road to Standardization: Collaboration
- Submitters decide to collaborate
  - Submitters meet face to face at MGED3 conference, March 2001.
  - Affymetrix and other organizations provide valuable feedback.
- Submitters adopt OMG Model Driven Architecture (MDA) approach
  - Start with a blank UML model, eventually named Microarray and Gene Expression Object Model (MAGE-OM).
  - Draw experience from GEML and MAML, but do not base the model on either.
  - MDA approach allows the submitters to concentrate on the domain problem and not worry about implementation details.
- Follow up
  - Many teleconferences and face to face meetings held. Mailing list provides feedback from broad community.
  - Programming effort begins after Toronto OMG meeting, September 2001, to generate DTD and platform code for Java, Perl, and C.
10. The Road to Standardization: Finalization
- Revised Submission is adopted by the OMG
  - MAGE-OM v1.0 considered complete, October 2001. MAGE-ML.dtd is generated from the model.
  - Submission is adopted by the OMG, January 2002, and begins its finalization process to work out any problems encountered by implementers.
- OMG Finalization Task Force (FTF) and MGED Programming Effort
  - Finalization Task Force (FTF) formed at OMG to handle issues.
  - MGED mailing list generates discussion, and Programming Jamborees implement and test the code generated from the model, finding issues and potential solutions to those issues.
- Adopted Specification passes OMG vote and becomes an Available Specification
  - The FTF report is submitted, with an updated specification and MAGE-ML.dtd, and accepted by the OMG, October 2002.
  - MAGE-ML.dtd v1.0 available at http://schema.omg.org/lsr/gene_expression/1.0.
11. Lessons Learned
- Process
  - Going through the OMG adoption process sets deadlines, ensuring progress is made.
  - Requirements and milestones are clearly called out by the adoption process.
  - Peer review ensures fair treatment.
  - Specifications, once adopted, are open to members and non-members for comment and use.
- Collaboration
  - Working together with fellow submitters lets "best of breed" emerge.
  - GEML format was strong in those areas that benefited high throughput.
  - MAML format was strong in overview of gene expression experiments.
  - MAGE-OM combines both these ideas.
  - All interested parties can contribute, making everyone a stakeholder.
  - Discourages proprietary approaches by offering an open alternative.
  - Freeware and commercially available tools will catalyze conversion, publication, and exchange of data.
12. Links
- OMG and Submitters
  - Object Management Group: www.omg.org
  - Life Science Research Domain Task Force: www.omg.org/lsr
  - Rosetta Biosoftware: www.rosettabio.com
  - Microarray Gene Expression Data Group: www.mged.org
- Open Source
  - Download for reference implementations
    - http://mged.sf.net/downloads.shtml
  - Discussion group
    - mged-mage: http://lists.sourceforge.net/lists/listinfo/mged-mage
  - Generating Code
    - http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/mged