Title: caBIO at BOSC 2002 NCICB
1Use of Mapping for caBIG Standard Vocabularies:
Issues to Consider
Margaret Haber, Office of Communications, National
Cancer Institute
2Overview
- Mapping: Why
- Mapping: How
- NCI Metathesaurus: example methods
- caBIG Mapping: Issues and Questions
3One Answer is Mapping: Relating Terminologies for
Effective Data Exchange
- A holistic view of information exchange requires
broader interoperability, but where do we place
the fences?
- Clinical data, regulatory submissions, discovery
research?
- Industry agreements, nationally accredited,
global standardization?
4The Pillars of Interoperability: Necessary but not
sufficient
- Common information models across all domains of
interest
- A foundation of rigorously defined data types
(metadata)
- A methodology for interfacing with controlled
vocabularies
5Interoperability Keys for Terminology
- Use of Industry Standards, where feasible
- Must allow for extensions to core standards
- Specialty terminology remains common
- Mapping is therefore essential
- Conformance with Data Models
- For process (logical models)
- For data flow (messages)
- For data at rest (database design)
6Mapping has been named as an essential part of
enabling effective interchange between core,
standard terminologies. Complete mapping includes:
- An initial high-level map performed using
algorithmic insertion formulas, such as lexical
matching, followed by human review for accuracy
- Rule-based mapping of non-overlap areas using
consistent, explicit coding principles
- Actual testing and implementation in systems that
demonstrate accuracy and effectiveness for data
capture, transfer, collection and interpretation
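The first step above (lexical matching followed by human review) might be sketched as follows. This is a minimal illustration, not the method used by any NCI tool: the normalization rule (lowercase, drop punctuation, ignore word order) and the function names `normalize` and `lexical_map` are assumptions made for the example.

```python
import re

def normalize(term: str) -> str:
    """Lowercase, drop punctuation, and sort words so word order is ignored."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(words))

def lexical_map(source_terms, target_terms):
    """Return candidate (source, target, status) triples and unmatched source terms."""
    index = {}
    for t in target_terms:
        index.setdefault(normalize(t), []).append(t)
    candidates, unmatched = [], []
    for s in source_terms:
        hits = index.get(normalize(s), [])
        if hits:
            # Every algorithmic match is only a candidate until a person reviews it.
            candidates.extend((s, h, "needs_review") for h in hits)
        else:
            unmatched.append(s)
    return candidates, unmatched

# Term strings taken from the Toxic Epidermal Necrolysis slides in this deck.
source = ["Necrolysis Epidermal Toxic (Lyell Type)", "Toxic Epidermal Necrolysis"]
target = ["Toxic epidermal necrolysis", "Lyell's syndrome"]
candidates, unmatched = lexical_map(source, target)
```

Terms left in `unmatched` are the ones that would fall to the rule-based step; nothing in `candidates` should be treated as a synonym before review.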
7High Level Map
- Identifies content/concept overlap and gaps
- Reveals issues of semantic equivalence or
synonymy, differences in definition of terms
- Allows comparative analysis of classification
systems and structural models
- Provides comparison of content coverage in
required domains
- Allows system query to quantify comparisons for
further analysis
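Quantifying overlap and gaps from a high-level map can be as simple as set arithmetic. The sketch below assumes each terminology is a set of concept codes and the map is a set of (A code, B code) pairs; the codes and the function name `compare_coverage` are hypothetical.

```python
def compare_coverage(a_concepts, b_concepts, mapped_pairs):
    """Summarize overlap and gaps between terminologies A and B."""
    a_mapped = {a for a, _ in mapped_pairs}
    b_mapped = {b for _, b in mapped_pairs}
    return {
        "overlap_pairs": len(mapped_pairs),
        "a_only": sorted(a_concepts - a_mapped),  # A concepts with no counterpart in B
        "b_only": sorted(b_concepts - b_mapped),  # B concepts with no counterpart in A
        "a_coverage": len(a_mapped) / len(a_concepts),
    }

# Hypothetical concept codes, for illustration only.
a = {"A1", "A2", "A3", "A4"}
b = {"B1", "B2", "B3"}
pairs = {("A1", "B1"), ("A2", "B2")}
report = compare_coverage(a, b, pairs)
```

Here `report["a_only"]` and `report["b_only"]` are the gap lists that would feed further analysis.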
8COSTART vs MedDRA vs SNOMED
9Rule Based Map
- Often application and system dependent
- Requires domain knowledge and consensus to judge
appropriate methodology for concept-walking
between terminologies
- Must grapple with issues of pre- vs.
post-coordination (full specification vs.
composition) of terms and concepts
- Terminology assessment and gap analysis
identifies new concept needs, driving both
structural requirements for the terminologies and
the mapping rules between them
10As an example of some issues that can be involved
. . . .
11SNOMED CT
- Merger of the College of American Pathologists'
Systematized Nomenclature of Medicine (SNOMED)
Reference Terminology (RT) and UK Clinical Terms
Version 3 (CTV3 or Read Codes)
- Broad-based clinical terms (CT) with 350,000
concepts, >1.37 million semantic relationships
- Terms of License: Five-year free distribution in
US through NLM's UMLS for English and Spanish
versions including concepts, descriptions,
relationships, and history. If terminated, the
last version of SNOMED remains distributable with
no further updates provided
12MedDRA (v7.1)
- Description
- International dictionary of medical terms
including some 65,000 diagnoses, signs and
symptoms, adverse drug reactions, therapeutic
indications, names/results of laboratory,
radiological, and other investigations,
surgical/medical procedures, social circumstances
- Developer
- Under the auspices of the International
Conference on Harmonization of Technical
Requirements for Registration of Pharmaceuticals
for Human Use (ICH) as agreed-upon terminology
for regulatory reporting
- Terms of Licensing
- Maintenance and Support Services Organization
(MSSO) supplies MedDRA in various formats, with
multiple levels of subscription
13MedDRA vs SNOMED Content
- MedDRA: content coverage, granularity (level of
detail) and fully specified (pre-coordinated)
terms are suited to the capture of data relevant
to adverse event reporting for drugs
- SNOMED: broader, more granular content coverage
for the representation of information in many
domains of patient records; potential for
enabling post-coordination (composition) of
concepts
14MedDRA vs SNOMED Structure
- MedDRA hierarchy is not is_a but rather groups of
related terms in a structure designed to maximize
capture and retrieval of relevant related terms;
these terms may fall into more than one higher
level grouping
- SNOMED is concept based, with terms and true
synonyms classified in an is_a
poly-hierarchical structure; it is more reliant
on composition to fully express granular concepts
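The structural difference can be made concrete with a small sketch. It assumes an is_a poly-hierarchy stored as a child-to-parents dictionary and a grouping structure stored as group-to-members sets. All labels are placeholders, and neither structure reproduces actual SNOMED or MedDRA content.

```python
def ancestors(concept, is_a):
    """All concepts reachable by following is_a links upward (transitive closure)."""
    seen = set()
    stack = list(is_a.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen

def groups_of(term, groupings):
    """Groups that list the term as a member; membership implies no subsumption."""
    return sorted(g for g, members in groupings.items() if term in members)

# Placeholder poly-hierarchy: C has two parents that share one root.
is_a = {"C": ["P1", "P2"], "P1": ["ROOT"], "P2": ["ROOT"]}

# Placeholder groupings: T1 appears in more than one higher-level group.
groupings = {"Group 1": {"T1", "T2"}, "Group 2": {"T1"}}

c_ancestors = ancestors("C", is_a)
t1_groups = groups_of("T1", groupings)
```

In the is_a case, everything true of an ancestor can be inferred for `C`; in the grouping case, `T1` is simply retrievable through either group, with no such inference implied.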
15Example of MedDRA Terminology Links for Toxic
Epidermal Necrolysis
SOC: Skin and Subcutaneous Tissue Disorders;
Infections and Infestations; Injury, Poisoning and
Procedural Complications; Immune System Disorders
HLGT: Epidermal and Dermal Conditions; Ancillary
Infectious Topics; Chemical Injury, Overdose, and
Poisoning; Allergic Conditions
HLT: Bullous Conditions; Inflammatory Disorders
Following Infection; Poisoning and Toxicity;
Allergies to Foods, Food Additives, Drugs and
Other Chemicals
PT: Toxic Epidermal Necrolysis
LLT: Drug Eruption Lyell Syndrome Type; Necrolysis
Epidermal Toxic (Lyell Type)
16Example of SNOMED Terminology Links for Toxic
Epidermal Necrolysis
- Concept: Toxic epidermal necrolysis. Synonyms:
Lyell's toxic epidermal necrolysis, subepidermal
type; TEN; Toxic epidermal necrolysis; Lyell's
syndrome
- Has finding site: Dermis
- Has causative: chemical nodes shown in the
diagram (Chemical compound, Barbitone chemical,
Chemical product, Chemical categorized
structurally, Chemical suspension, Antipyrine,
Pyrazole derivative)
- Is a: Chemical-induced dermatological disorder;
Non-infectious, vesicular and/or bullous disease;
Disease of skin and/or subcutaneous tissue
17Issues to consider when looking at potential
standard terminologies before undertaking maps
- Terminologies in use for health communications
reporting must have clear policies and procedures
for both national and international use
(scope/extent of license, fees, maintenance)
- Reliable mechanisms for user feedback must be in
place to ensure requirements for updates can be
incorporated into the terminology and published
on a regular basis
- The second requirement above becomes more
challenging with terminologies that have very
broad coverage or a wide user base
18Issues to Consider . . .
- Concept capture is only one part of the story;
structure such as hierarchy is critical for
aggregation and retrieval, and very important
particularly for reporting applications
- Hierarchies are also important in mapping, to
indicate the intended meaning of a concept if
ambiguous, and to provide appropriate terms for
mapping (up-coding) from more granular
terms/terminologies. Cross-maps thus require
comparative structural analysis
- Size of the terminology and depth of hierarchies
can be a significant issue for applications
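Up-coding, as described above, can be sketched as a walk up the source hierarchy until a concept with a direct map is found. This is an illustrative approach under simple assumptions (a child-to-parents dictionary and a dictionary of direct maps), not a documented EVS procedure; the concept names and the target code `T-100` are hypothetical.

```python
from collections import deque

def up_code(concept, parents, direct_map):
    """Breadth-first walk upward; return (target, steps) for the nearest mapped ancestor."""
    queue = deque([(concept, 0)])
    seen = {concept}
    while queue:
        node, steps = queue.popleft()
        if node in direct_map:
            return direct_map[node], steps
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                queue.append((p, steps + 1))
    return None, None  # no mapped ancestor: a gap in the target terminology

# Hypothetical source hierarchy and direct map.
parents = {"fine": ["mid"], "mid": ["broad"]}
direct_map = {"broad": "T-100"}

target, steps = up_code("fine", parents, direct_map)
```

The returned `steps` value is a rough signal of how much granularity was lost, and deep hierarchies make such walks longer and the loss larger.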
19NCI EVS Goal: Integration by Meaning
- Clinical, translational, and basic research
terminologies have overlapping but specialized
needs; EVS therefore helps to:
- Integrate different conceptual frameworks
- Create terminological and taxonomic conventions
across systems
- Provide terminology for:
- Tagging: store/transfer/archive data for future
analysis
- Reasoning: limited inferencing about data
20NCI Metathesaurus
- Filtered UMLS Metathesaurus extended with
additional required vocabularies
- 930,000 concepts, 2,200,000 terms and phrases
with definitions
- Mappings among over 50 vocabularies
- Extensive synonymy: over 40,000 terms for
neoplasms mapped to 7,000 concepts
- Used as online dictionary and thesaurus, for
mapping and document indexing
21NCI Metathesaurus (2)
- Minor releases monthly, major releases twice a
year
- Provides a mapped overlap and partial
inter-relation of current versions of NCI
required vocabularies, e.g., the ICDs, MedDRA,
SNOMED, MeSH (NLM Medical Subject Headings),
HCPCS (procedures), LOINC (lab values), drug
terminologies (VA NDF, AOD, RxNORM, Multum, NCI
Thesaurus drugs, etc.)
22NCI Metathesaurus Data Profile
- Multiple sources per concept or unit of
semantic meaning
- Hierarchically organized to reference NCI
Thesaurus structures when possible; alternate
source hierarchies may also be viewed
- 109 Term Types within concepts, such as
preferred term (PT), synonyms (SY), abbreviations
(AB), obsolete (OB), brand name (BR) drugs, etc.
23NCI Metathesaurus Profile (2)
- 135 Semantic Types divide groups of concepts
into general domains of meaning such as
Neoplasms, Genes, Pharmacologic Substance,
etc.
- >5,600,000 asserted relationships between
concepts, e.g.:
- Carcinoma Clinically_associated_with Lytic
Bone Lesion
- TP53 Gene_associated_with_Disease Breast
Carcinoma
25 - For Mapping caBIG Standard Vocabularies
- Is it advisable to offer a caBIG standard
terminology set based on concept mappings to an
existing standard terminology?
26Maybe
- The answer is a qualified yes, but it's not
easy, and risks are present.
- Maps are difficult to construct.
- Even carefully constructed maps can convey some
degree of false synonymy.
- Like terminologies, maps need maintenance.
27Imprecise or False Synonymy
- It may not be possible to find an exact match
between concepts in two terminologies.
- Often the best one can do is to come close.
- How is the degree of closeness indicated to a
user?
- For certain applications, e.g., clinical,
"close" may not be acceptable.
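One way to make the degree of closeness visible is to record a match type on every map entry and let each application decide which types it accepts. The labels below (`exact`, `broader`, `narrower`, `related`) and the identifiers are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass

# Hypothetical closeness labels; not taken from any published mapping standard.
MATCH_TYPES = ("exact", "broader", "narrower", "related")

@dataclass(frozen=True)
class MapEntry:
    source: str
    target: str
    match: str

    def __post_init__(self):
        if self.match not in MATCH_TYPES:
            raise ValueError(f"unknown match type: {self.match}")

def acceptable(entries, allowed=("exact",)):
    """Keep only entries whose closeness is acceptable for the application."""
    return [e for e in entries if e.match in allowed]

entries = [
    MapEntry("SRC-1", "TGT-1", "exact"),
    MapEntry("SRC-2", "TGT-9", "broader"),
]
clinical = acceptable(entries)                                 # exact matches only
aggregate = acceptable(entries, allowed=("exact", "broader"))  # looser use case
```

A clinical application would take the strict default, while an aggregation or retrieval use case might tolerate broader targets.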
28Like Terminologies, Maps Require Maintenance
- A map exists between a pair of terminologies,
both of which are at specified revision levels or
versions
- A change to either terminology requires that the
map be re-verified and possibly redone
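Because a map is tied to specific versions of both terminologies, a system can check whether a map is still valid before using it. The sketch below assumes version strings are compared for equality; the class name and the SNOMED CT version label are hypothetical (MedDRA v7.1 is the version named earlier in this deck).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TerminologyMap:
    source_name: str
    source_version: str
    target_name: str
    target_version: str

    def is_current(self, installed):
        """True only if both installed terminologies match the versions the map was built on."""
        return (installed.get(self.source_name) == self.source_version
                and installed.get(self.target_name) == self.target_version)

# "2004-07" is a made-up version label for illustration.
tmap = TerminologyMap("MedDRA", "7.1", "SNOMED CT", "2004-07")

ok = tmap.is_current({"MedDRA": "7.1", "SNOMED CT": "2004-07"})
stale = not tmap.is_current({"MedDRA": "8.0", "SNOMED CT": "2004-07"})
```

When `is_current` returns False, the map needs re-verification before it is trusted again.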
29This Raises Questions
- Who will consistently verify and update maps?
- How committed is the mapping organization to
this activity?
- How much time is needed to update the map after
either vocabulary changes?
- What is the status of a mapped terminology
extension in the interim?
30One Size Doesn't Fit All
- There's no single definitive map between a pair
of terminologies
- Maps must be based on well-defined use cases
- Different use cases may require different maps
- Example: NLM makes multiple maps between CPT
and ICD to meet varying use cases from providers
and payers.
31Vocabularies for Reporting are Especially
Demanding
- It's extremely challenging to substitute an
alternative terminology for one designed for
specific purposes.
- FDA requires MedDRA for reporting adverse events
- It would be difficult to code adverse events
using another terminology and map precisely to
MedDRA for reporting.
- There is significant potential to undermine the
integrity of the reports submitted.
32The Bottom Line
- Mapping is expensive, inherently limited, highly
particularized, and represents an ongoing burden
- The payoff for a mapping effort must be justified
by clear operational benefits
- If a mapped terminology becomes a caBIG standard,
its ongoing maintenance and integrity become a
significant concern for all users of that
terminology
- The analysis that supports the decision to create
such a map must consider all the costs and risks
of doing so.
33EVS Team Acknowledgments
- NCI Office of Communications
- Margaret Haber
- Larry Wright
- NCI Center for Bioinformatics
- Frank Hartel
- Sherri de Coronado
- Gilberto Fragoso
- Contractors
- Apelon, Inc.; Northrop Grumman, Inc.
- Aspen, Inc.; Kevric / IMC
- MSD; J. Oberthaler Consulting
- SAIC; Protégé/SMI
- Collaborations and External Reviews
- NCI: caBIG, CTEP, DTP, DCP, DCCPS, MMHCC, etc.
- NIH: NLM, NHLBI
- Govt: FDA, VHA, CDC, DoD, NASA
- Other: CAP/SNOMED, AFIP, HL7, CDISC, MGED, W3C