The ACGT Data Access Infrastructure

Provided by: luism8

1
The ACGT Data Access Infrastructure
  • Luis Martín (lmartin_at_infomed.dia.fi.upm.es)
  • HealthInf 08
  • 30/01/2008

2
The Data Access Infrastructure Aims At
  • Providing homogeneous, seamless access to
    heterogeneous (in terms of syntax and semantics)
    sources of information.
  • Providing querying services to both end users and
    data analysis tools.

3
Main Resources
  • Within the framework of the Data Access
    Infrastructure in ACGT, several tools are being
    developed, namely
  • The ACGT Semantic Mediator
  • The ACGT Master Ontology on Cancer
  • The ACGT Data Access Services

4
Data Access Infrastructure within ACGT
5
Data Access Architecture
6
The ACGT Master Ontology
  • The ACGT Master Ontology on Cancer aims at
  • Enhancing cancer management in Europe by enabling
    semantic interoperability.
  • Meeting all necessary preconditions of the
    project infrastructure.
  • Creating an ontology that is both philosophically
    and technically valid and sound.

7
Development Procedure
  • A continuous, iterative development process that
    includes domain experts via face-to-face
    meetings, teleconferences and e-mail discussions
  • Feedback is encouraged at all times and
    integrated into the development

[Figure labels: ontology developers, clinicians, researchers]
8
Introduction
  • The following examples are taken from the clinical
    trial forms of the TOP trial on breast cancer
  • Another source is the forms from clinical trials
    on nephroblastoma done by ... and ...

10
Ontology as Black Box
  • The ontology has a highly complex internal
    structure that should not be exposed to the
    actual end user
  • End users access the ontology only via
    specialized tools
  • Ontology Viewer
  • Mapping Tool
  • Querying Tool

11
The ACGT Semantic Mediator
  • The ACGT Semantic Mediator aims at
  • Providing access to integrated repositories of
    semantically heterogeneous databases.
  • Offering users a friendly interface to query
    these data.

12
Scientific Foundations of the Semantic Data
Integration Approach (I)
  • Query Translation vs. Data
    Warehouses
  • Given the nature of the data in ACGT, a query
    translation based approach was selected

13
Scientific Foundations of the Semantic Data
Integration Approach (II)
  • Global as View vs. Local
    as View
  • An LaV-based approach has been selected. The Master
    Ontology will act as the Global Schema.
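
To illustrate the Local-as-View idea, here is a minimal sketch in which each local source is described as a view over terms of the global schema, and a query is planned by finding the sources whose view covers each requested term. The source names and term names are hypothetical placeholders, and the set-based "view" is a deliberate simplification; it is not the ACGT mapping format.

```python
# Local-as-View (LaV) sketch: every local source is described as a view
# over terms of the global schema (here standing in for the Master Ontology).
# Source names and global terms are hypothetical placeholders.
LAV_VIEWS = {
    "siop_db": {"Patient.id", "Patient.firstName", "Patient.hospital"},
    "dicom_server": {"Patient.id", "Study.uid", "Series.uid"},
}


def sources_for(query_terms):
    """Map each requested global term to the sources whose view covers it."""
    plan = {}
    for term in sorted(query_terms):
        covering = sorted(s for s, view in LAV_VIEWS.items() if term in view)
        if not covering:
            raise ValueError(f"no source exposes global term {term!r}")
        plan[term] = covering
    return plan


plan = sources_for({"Patient.id", "Study.uid"})
print(plan)
```

Adding a new source in this style only means adding one more view description; the global schema and the existing views stay untouched, which is the usual argument for LaV.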

14
The ACGT Semantic Mediation Process
  • Data Integration using the mediator
  • A query is performed using the interface (query
    based on the ACGT Master Ontology).
  • The query is split, and different queries for the
    underlying databases are generated (via the
    mapping filter).
  • Queries are performed in the databases (through
    corresponding Data Access Services).
  • Results are returned and integrated (using the
    selected format).
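
The four steps above can be sketched roughly as follows. This is an assumption-laden toy, not the mediator's implementation: the `MAPPING` table, the source names `siop` and `dicom`, and the stub services with their canned rows are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

# Hypothetical mapping: global (ontology) field -> (source, local field).
MAPPING = {
    "PatientIdentifier.pnr": ("siop", "pnr"),
    "PatientIdentifier.PatientID": ("dicom", "PatientID"),
}


def split_query(fields: List[str]) -> Dict[str, List[str]]:
    """Step 2: split a global query into one local field list per source."""
    per_source: Dict[str, List[str]] = {}
    for field in fields:
        source, local = MAPPING[field]
        per_source.setdefault(source, []).append(local)
    return per_source


def mediate(
    fields: List[str],
    services: Dict[str, Callable[[List[str]], List[Row]]],
) -> List[Row]:
    """Steps 2 to 4: split, run each sub-query, merge source-tagged rows."""
    merged: List[Row] = []
    for source, local_fields in split_query(fields).items():
        for row in services[source](local_fields):  # step 3: query the source
            merged.append({"source": source, **row})  # step 4: integrate
    return merged


# Stub data access services returning canned rows (illustrative data only).
def fake_siop(local_fields: List[str]) -> List[Row]:
    return [{f: "siop-value" for f in local_fields}]


def fake_dicom(local_fields: List[str]) -> List[Row]:
    return [{f: "dicom-value" for f in local_fields}]


results = mediate(
    ["PatientIdentifier.pnr", "PatientIdentifier.PatientID"],
    {"siop": fake_siop, "dicom": fake_dicom},
)
print(results)
```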

15
The ACGT Semantic Mediator
  • Different components addressing different aspects
    of the same problem
  • Query Formulation Interface: Helping end-users
    in formulating queries
  • Master Ontology: Acting as Global Schema
  • Mediation Layer: Resolving the query translation
    problem
  • OntoQueryClean: Dealing with query identifier
    heterogeneities
  • OntoDataClean: Addressing instance level
    heterogeneities
  • Mapping API and GUI: Aiding in the virtual views
    creation process.

16
Mediator SIOP Dicom query
SELECT ?PatientIdentifier.ClinicalTrialPatientNumber, ?PatientIdentifier.pnr, ...
WHERE ( ?a, rdf:type, h:PatientIdentifier ), ...
      ( ?a, h:PatientIdentifier.hasStudy.Study, ?b )
USING ...

PREFIX h: <http://gridnode.ehv.campus.philips.com/dicom/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?PatientID ?PatientsName
WHERE {
  OPTIONAL { ?a h:PatientID ?PatientID . }
  OPTIONAL { ?a h:PatientsName ?PatientsName . }
}

SELECT DISTINCT patient.siopnr, patient.pnr, ...
FROM patient
17
Results
<rdf:RDF
    xmlns:j.0="http://infomed.dia.fi.upm.es/SIOPDicom#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.1="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl ...
  <owl:Class rdf:about="http://infomed.dia.fi.upm.es/SIOPDicom#PatientIdentifier"/>
  <owl:DatatypeProperty rdf:about="http://infomed.dia.fi.upm.es/SIOPDicom#PatientIdentifier.ClinicalTrialPatientNumber">
  ...
  <j.0:PatientIdentifier rdf:about="http://infomed.dia.fi.upm.es/SIOPDicom#PatientIdentifier13">
    <j.0:PatientIdentifier.HospitalIdentifier>
      <j.1:string>
        <rdf:value>Without Information</rdf:value>
      </j.1:string>
    </j.0:PatientIdentifier.HospitalIdentifier>
    <j.0:PatientIdentifier.FirstName>
    ...
18
The ACGT Data Access Services
  • The ACGT Data Access Services aim at
  • Providing a uniform interface
  • uniform transport protocol
  • uniform message syntax
  • uniform query syntax
  • uniform data format
  • Hiding query peculiarities of the data source
  • Hiding query limitations of the data source
  • Exporting the data model of the data source
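
One way to picture these goals is a single abstract interface that every wrapper implements. The class and method names below (`DataAccessService`, `query`, `data_model`) are hypothetical and chosen only for illustration; they are not the actual ACGT or OGSA-DAI API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DataAccessService(ABC):
    """Hypothetical uniform interface wrapping one data source."""

    @abstractmethod
    def query(self, sparql: str) -> List[Dict[str, str]]:
        """Accept the uniform query syntax, return rows in a uniform format."""

    @abstractmethod
    def data_model(self) -> List[str]:
        """Export the source's data model as a list of property names."""


class InMemoryService(DataAccessService):
    """Toy wrapper over a list of dicts, standing in for a real source."""

    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self._rows = rows

    def query(self, sparql: str) -> List[Dict[str, str]]:
        # A real wrapper would translate the query for its source;
        # this stub ignores it and returns a copy of every row.
        return [dict(row) for row in self._rows]

    def data_model(self) -> List[str]:
        return sorted({key for row in self._rows for key in row})


service = InMemoryService([{"PatientID": "p1", "PatientsName": "Doe, J"}])
print(service.data_model())
```

Because callers depend only on the abstract class, a relational wrapper and a DICOM wrapper could be swapped without changing client code.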

19
Main types of data sources
  • Relational databases
  • CRF data, microarray data
  • DICOM servers
  • Medical image data
  • Public web databases
  • Gene and protein sequence databases
  • Files in various formats
  • Excel, XML, comma-separated

20
Technology choices
  • OGSA-DAI
  • The standard web services framework for Data
    Access Interfaces
  • Supports an activity framework for efficient and
    flexible service invocation
  • SPARQL
  • Modern RDF query language
  • Fits needs of mediator
  • Intermediate level of expressiveness
  • E.g. more expressive than DICOM query
    capabilities, less expressive than SQL
  • Suitable as an initial query language for wrappers

21
SPARQL for querying DICOM
  • Uniform query syntax
  • Any DICOM query can be expressed as SPARQL
  • SPARQL does not impose any limitations
  • Hide query limitations of data source
  • SPARQL filters can be used to create queries that
    cannot be expressed as DICOM queries
  • However, not all SPARQL queries can be
    efficiently converted to DICOM queries
  • Therefore, the data access service does not
    accept all queries
  • This is unavoidable, for performance reasons
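
A rough sketch of how a wrapper might refuse queries it cannot translate is shown below. The acceptance rule used here (a whitelist of predicates) and the `SUPPORTED` set are assumptions made for illustration; the slides do not say which criteria the actual service applies.

```python
import re

# Hypothetical set of predicates the wrapper can map onto DICOM query keys.
SUPPORTED = {"dicom:PatientID", "dicom:PatientsName", "dicom:StudyInstanceUID"}


class UnsupportedQuery(Exception):
    """Raised when a query cannot be converted to a DICOM query."""


def check_query(sparql: str) -> None:
    """Reject queries that use dicom: predicates outside the supported set."""
    used = set(re.findall(r"\bdicom:\w+", sparql))
    unsupported = sorted(used - SUPPORTED)
    if unsupported:
        raise UnsupportedQuery("cannot translate: " + ", ".join(unsupported))


check_query("SELECT ?id WHERE { ?p dicom:PatientID ?id . }")  # accepted
```

Rejecting early, before any DICOM request is issued, keeps the failure cheap and gives the caller a clear reason.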

22
Image retrieval
  • Hide query peculiarities of data source
  • Using DICOM Q/R you can only retrieve images by
    hosting a DICOM Application Entity
  • With the data access service, images can be
    delivered to a URL
  • No need for the client to host a DICOM server
  • The use of various DICOM querying information
    models is hidden from user

23
Using the DICOM levels
  • DICOM LevelQuery.xml

SELECT ?patientId ?studyId ?seriesId
WHERE {
  ?patient dicom:PatientID ?patientId ;
           dicom:PatientsName "Huge, Lurch" .
  ?study dicom:Patient ?patient ;
         dicom:StudyInstanceUID ?studyId .
  ?series dicom:Study ?study ;
          dicom:SeriesNumber "3" ;
          dicom:SeriesInstanceUID ?seriesId .
}
sparqlQuery Statement
sparqlResults ToXML
<result>
  <binding name="patientId">200650</binding>
  <binding name="studyId">1.3.46.670589.5.2.12.2158432007.1002671691.401594</binding>
  <binding name="seriesId">1.3.46.670589.5.2.12.2158432007.1002671552.91561</binding>
</result>
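
A client could turn such a result into a plain dictionary with the standard library, assuming the service returns well-formed XML with one `binding` element per variable, as on this slide. The `bindings` helper is a hypothetical name, and the embedded XML is the slide's result fragment with its markup restored.

```python
import xml.etree.ElementTree as ET

# Result fragment from the slide, with angle brackets and quotes restored.
RESULT_XML = """
<result>
  <binding name="patientId">200650</binding>
  <binding name="studyId">1.3.46.670589.5.2.12.2158432007.1002671691.401594</binding>
  <binding name="seriesId">1.3.46.670589.5.2.12.2158432007.1002671552.91561</binding>
</result>
"""


def bindings(result_xml: str) -> dict:
    """Return {variable name: value} for one <result> element."""
    root = ET.fromstring(result_xml.strip())
    return {b.get("name"): (b.text or "").strip() for b in root.iter("binding")}


row = bindings(RESULT_XML)
print(row["patientId"])
```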
24
Conclusions
  • We have successfully resolved the issues related
    to
  • DICOM and relational database integration
  • General mapping format
  • Implementation of a range of supporting tools
  • Open issues
  • Public databases integration
  • Development of a friendly query interface
    (exploring NL)
  • Ontology Maintenance and Extension (submission
    system)

25
Thank you