The Tissue Microarray Data Exchange Specification

1
The Tissue Microarray Data Exchange Specification
Presented for Cambridge Healthtech Institute, Microarrays in Medicine, Boston, MA, April 26, 2004
Jules J. Berman, Ph.D., M.D.
Program Director for Pathology Informatics, Cancer Diagnosis Program
National Cancer Institute, National Institutes of Health, Rockville, MD
This presentation is a U.S. government-sponsored work in the public domain.
2
In brief
The TMA Specification is an open access document that can be used without any restriction. Its development was sponsored by the NCI and by the Association for Pathology Informatics.
All the documents and software that you might need to obtain, understand and implement the specification are available in two recently published open access manuscripts.
3
Basics of the specification:
Jules J Berman, Mary Edgerton and Bruce Friedman. The tissue microarray data exchange specification: a community-based, open source tool for sharing tissue microarray data. BMC Med Inform Decis Mak. 2003 May 23;3:5.
Real-world implementation example:
Jules J Berman, Milton Datta, Andre Kajdacsy-Balla, Jonathan Melamed, Jan Orenstein, Kevin Dobbin, Ashok Patel, Rajiv Dhir, Michael J Becich. The tissue microarray data exchange specification: implementation by the Cooperative Prostate Cancer Tissue Resource. BMC Bioinformatics. 2004 Feb 27;5:19.
4
Why is it important to have a data exchange specification for TMAs?
The greatest value of TMAs is the ability to link TMA data with data from other TMAs and from other databases that inform on the data contained in the TMA database.
That value is essentially untapped because there has been no way to publish, exchange, merge and link TMA datasets in a manner that everyone can use and understand.
The data exchange specification provides a common intermediate structure for TMA data that can be used to exchange data between different TMA databases.
5
Analogous situation
WordPerfect (different versions)
Word (different versions)
AbiWord
PostScript
PDF
One vendor's software often cannot open files prepared in another vendor's software. But any good word processor should be able to export a file as an RTF file (simple ASCII with markup for formatting), and should be able to import an RTF file and convert it to its preferred proprietary format.
6
  • We wanted to make a flexible specification for
    TMAs that would permit researchers with
    proprietary systems to port their TMA data into a
    file that could be easily disassembled and
    re-assembled into other formats.
  • The basic properties of the file:
  • Self-describing
  • Made from commonly understood data structures
  • Extremely simple (most of our stakeholders are
    not sophisticated bioinformaticians, computer
    scientists, or metadata experts)
  • Infinitely scalable (can be endlessly combined
    with other data sources)

7
The first draft of the specification was developed through open workshops held at meetings sponsored by the Association for Pathology Informatics and the National Cancer Institute.
8
May 30, 2001. Ann Arbor, Michigan. Chair of speaker session: Mark A. Rubin. Speakers: David Rimm, Steve Bova, Matt Van de Rijn, Jules Berman.
Oct. 6, 2001. Pittsburgh, PA, co-sponsored by the National Cancer Institute. Chair: Mary Edgerton. Speakers: Olli Kallioniemi, Chris Chute, Richard Lieberman, Paul Spellman. Chair of Data Exchange Workshop: Mary Edgerton.
May 22, 2002. Ann Arbor, Michigan, co-sponsored by the National Cancer Institute. Chair of speaker session: Mark A. Rubin. Speakers: James Bacus, Angelo de Marzo, Peggy Porter, David Rimm and Guido Sauter. Chair of Data Exchange Workshop: Dr. Mary Edgerton.
October 4, 2002. Held in conjunction with Advancing Pathology Informatics, Imaging and the Internet, Pittsburgh, PA. Chair of speaker session: Mary Edgerton. Speakers: Steve Hewitt, Ulysses Balis. Chair of Data Exchange Workshop: Mary Edgerton.
9
The specification is XML.
XML allows heterogeneous systems to communicate and exchange their data. It achieves this through metadata (data about data).
XML can produce an ideal document that completely describes itself, including all data and all metadata.
10
Four required sections:
1) Header, containing the specification Dublin Core identifiers,
2) Block, describing the paraffin-embedded array of tissues,
3) Slide, describing the glass slides produced from the Block, and
4) Core, containing all data related to the individual tissue samples contained in the array.
11
Eighty Common Data Elements (CDEs), conforming to the ISO-11179 specification for data elements, constitute the XML tags used in the TMA data exchange specification. Only a handful of these are required in TMA files.
A set of six simple semantic rules describes the complete data exchange specification.
Anyone using the data exchange specification can validate their TMA files using a software implementation written in Perl and distributed as a supplemental file with this publication.
12
<histo>
   <tma>
      <header>
      </header>
      <block>
         <slide>
         </slide>
         <core>
         </core>
      </block>
   </tma>
</histo>
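As an illustration only, the skeleton on this slide can be checked mechanically. The sketch below is not the Perl validator distributed with the publication and does not implement the six semantic rules, which this presentation does not list. It assumes the nesting shown in the skeleton (slide and core inside block, block and header inside tma) and ignores XML namespaces for brevity; the function name `missing_sections` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Skeleton copied from the slide; namespaces are omitted for brevity.
SKELETON = """
<histo>
  <tma>
    <header></header>
    <block>
      <slide></slide>
      <core></core>
    </block>
  </tma>
</histo>
"""

# Assumed nesting, read off the skeleton: paths are relative to each <tma>.
REQUIRED_PATHS = ["header", "block", "block/slide", "block/core"]


def missing_sections(xml_text):
    """Return (tma_index, path) pairs for every required section that is absent."""
    root = ET.fromstring(xml_text)
    problems = []
    tmas = root.findall("tma")
    if not tmas:
        # No <tma> element at all directly under the root.
        problems.append((None, "tma"))
    for index, tma in enumerate(tmas):
        for path in REQUIRED_PATHS:
            if tma.find(path) is None:
                problems.append((index, path))
    return problems


print(missing_sections(SKELETON))
```

An empty list means all four sections were found; a file whose block lacks a core element would be reported as `(0, "block/core")`.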
13
<?xml version="1.0" ?>
<histo xmlns="http://65.222.228.150/jjb/tma_cde.htm"
   xmlns:cpctr="http://www.pathology.pitt.edu/pdf/cpctr/cpctr-cde-v22.pdf"
   xmlns:dc="http://dublincore.org">
<tma>
<header>
<dc:title>Cooperative Prostate Cancer Tissue Resource (CPCTR) Prostate Cancer Microarray 1-2</dc:title>
<dc:creator>CPCTR</dc:creator>
<dc:subject>Prostate tissue microarray</dc:subject>
<dc:description>CPCTR TMA XML datafile for Microarray 1-2</dc:description>
<dc:publisher>CPCTR</dc:publisher>
<dc:contributor>CPCTR</dc:contributor>
<dc:date>2003-10-05</dc:date>
<dc:type>Prostate Cancer Tissue Microarray</dc:type>
14
<record>
<cpctr:IMS_Case_Identifier>1053371588</cpctr:IMS_Case_Identifier>
<cpctr:Location_Code>G61</cpctr:Location_Code>
<cpctr:Race>Caucasian</cpctr:Race>
<cpctr:Year_of_Birth>1926</cpctr:Year_of_Birth>
<cpctr:Year_of_Diagnosis>1991</cpctr:Year_of_Diagnosis>
<cpctr:Year_of_Prostatectomy>1991</cpctr:Year_of_Prostatectomy>
<cpctr:Is_Residual_Carcinoma_Present>Yes</cpctr:Is_Residual_Carcinoma_Present>
<cpctr:Most_Prominent_Histologic_Type>adenocarcinoma NOS aka acinar</cpctr:Most_Prominent_Histologic_Type>
<cpctr:Gleason_Primary_Grade>4</cpctr:Gleason_Primary_Grade>
<cpctr:Gleason_Secondary_Grade>3</cpctr:Gleason_Secondary_Grade>
<cpctr:Gleason_Sum_Score>7</cpctr:Gleason_Sum_Score>
<cpctr:Number_of_Nodes_Examined>5</cpctr:Number_of_Nodes_Examined>
<cpctr:Number_of_Nodes_Positive>0</cpctr:Number_of_Nodes_Positive>
<cpctr:Distant_Mets__1_at_Time_of_Diagn>Bladder</cpctr:Distant_Mets__1_at_Time_of_Diagn>
<cpctr:pT_Stage>pT3b</cpctr:pT_Stage>
<cpctr:pN_Stage>pN0</cpctr:pN_Stage>
<cpctr:pM_Stage>pMX</cpctr:pM_Stage>
<cpctr:Vital_Status>Alive</cpctr:Vital_Status>
<cpctr:Year_of_PSA_Recurrence></cpctr:Year_of_PSA_Recurrence>
<cpctr:PSA_Recurrence_Status>Unknown</cpctr:PSA_Recurrence_Status>
<cpctr:Recurrence_Free_Year></cpctr:Recurrence_Free_Year>
<array_locations>
row 9, column 18
row 10, column 4
</array_locations>
</record>
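To show how a receiving system might read a record like the one above, here is a minimal Python sketch. It assumes the cpctr prefix is bound to the namespace URI given in the header slide, and it uses a shortened sample record declared directly on the record element, whereas in a full file the declarations sit on the histo element and a default namespace also applies. The helper name `record_to_dict` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the header slide.
CPCTR = "http://www.pathology.pitt.edu/pdf/cpctr/cpctr-cde-v22.pdf"

# Shortened sample record; the values are taken from the slide.
SAMPLE = f"""<record xmlns:cpctr="{CPCTR}">
  <cpctr:Gleason_Primary_Grade>4</cpctr:Gleason_Primary_Grade>
  <cpctr:Gleason_Secondary_Grade>3</cpctr:Gleason_Secondary_Grade>
  <cpctr:Gleason_Sum_Score>7</cpctr:Gleason_Sum_Score>
  <cpctr:Year_of_PSA_Recurrence></cpctr:Year_of_PSA_Recurrence>
</record>"""


def record_to_dict(record):
    """Map each cpctr element name to its text; empty elements become None."""
    prefix = "{" + CPCTR + "}"
    fields = {}
    for child in record:
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            fields[name] = (child.text or "").strip() or None
    return fields


fields = record_to_dict(ET.fromstring(SAMPLE))
primary = int(fields["Gleason_Primary_Grade"])
secondary = int(fields["Gleason_Secondary_Grade"])
total = int(fields["Gleason_Sum_Score"])
# Simple consistency check: the sum score should equal primary plus secondary.
print(fields, primary + secondary == total)
```

Because every value travels with its tag, the reader needs no knowledge of column order; unfilled elements such as Year_of_PSA_Recurrence simply come back as None.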
15
  • Implementing the specification
  • We provide:
    • The specification (XML data structure and 80
      common data elements)
    • A Perl-script validator
    • A paper that describes a real-world
      implementation (porting TMA data from an Excel
      spreadsheet)
  • You provide:
    • Whatever database you like for storing your TMA
      data
    • A script (Java, Perl, Python, whatever) that can
      port your data into the TMA specification
    • A script that can port TMA files in the data
      exchange specification into whatever database you
      prefer
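A porting script of the kind listed above might look like the following Python sketch. It is not the published CPCTR implementation: the CSV text stands in for a spreadsheet export, the column names are assumed to match CDE names already, the record elements are assumed to live inside the core section, and the default namespace from the header slide is left out for brevity. The function name `csv_to_tma` is hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the header slide.
DC = "http://dublincore.org"
CPCTR = "http://www.pathology.pitt.edu/pdf/cpctr/cpctr-cde-v22.pdf"
ET.register_namespace("dc", DC)
ET.register_namespace("cpctr", CPCTR)

# Stand-in for a spreadsheet export; columns are assumed to be CDE names.
CSV_TEXT = """IMS_Case_Identifier,Race,Year_of_Birth
1053371588,Caucasian,1926
"""


def csv_to_tma(csv_text, title):
    """Build a histo/tma tree with one record per CSV row."""
    histo = ET.Element("histo")
    tma = ET.SubElement(histo, "tma")
    header = ET.SubElement(tma, "header")
    ET.SubElement(header, f"{{{DC}}}title").text = title
    block = ET.SubElement(tma, "block")
    ET.SubElement(block, "slide")
    core = ET.SubElement(block, "core")
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: each case becomes a <record> inside <core>.
        record = ET.SubElement(core, "record")
        for column, value in row.items():
            ET.SubElement(record, f"{{{CPCTR}}}{column}").text = value
    return histo


histo = csv_to_tma(CSV_TEXT, "Example microarray")
print(ET.tostring(histo, encoding="unicode"))
```

The reverse direction, from a TMA file into a local database, would walk the same tree and write each record's fields into whatever storage the receiving site prefers.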

16
Future?