EDRN CDE Information Model - PowerPoint PPT Presentation

About This Presentation
Title:

EDRN CDE Information Model

Description:

Family Medical History. Colorectal. Participant ID. Data Collection Date ... data modeling, classification, and data dictionary development and maintenance. ... – PowerPoint PPT presentation

Slides: 10
Provided by: COMP127

Transcript and Presenter's Notes

Title: EDRN CDE Information Model


1
EDRN CDE Information Model
  • (Where it's been, where it's going)

2
CDE ER Diagram
3
Modifying the model
  • Want to create a modified model using the
    Protégé tool so we can:
  • Fix parts that don't really work
  • Represent more information in the model
  • Represent richer information in the model (i.e.,
    object relationships)
  • Enable scalability and expandability

4
High-level model changes needed
  • Object classes
  • Baseline
  • Follow-up
  • Study Identifier
  • Properties
  • Family Member
  • Family Cancer History
  • Family Medical History
  • Colorectal
  • Participant ID
  • Data Collection Date
  • Procedure_surgical history

5
(No Transcript)
6
Process Approach for Evolving the Model
  • Review the requirements
  • Review the model
  • Determine if the model is meeting the
    requirements
  • Determine what is wrong and what is missing
  • Rehost the model (in Protégé)
  • Collect model information including formal
    documents, design notes, memory
  • Use ontology tool to capture the model
  • Validate the rehosted model
  • Ingest examples, perform regression tests
  • Document the model using desired notation (UML,
    E-R, HTML Text)
  • Modify the model
  • Gain consensus on proposed changes
  • Determine impact of proposed changes
  • Validate, export, and document the model
  • Schedule proposed changes
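
The "validate the rehosted model" step can be illustrated with a small regression check. This is only a sketch under assumptions: each model is represented as a plain dictionary mapping element names to definitions, and `diff_models` is a hypothetical helper, not part of Protégé or of any EDRN tool.

```python
def diff_models(original, rehosted):
    """Compare two models given as {element name: definition} dictionaries.

    Returns the elements that were lost, added, or changed by rehosting.
    The dictionary representation is an assumption made for illustration.
    """
    shared = set(original) & set(rehosted)
    return {
        "missing": sorted(set(original) - set(rehosted)),
        "added": sorted(set(rehosted) - set(original)),
        "changed": sorted(k for k in shared if original[k] != rehosted[k]),
    }


def is_faithful(original, rehosted):
    """A rehosted model passes the regression test when nothing differs."""
    report = diff_models(original, rehosted)
    return not any(report.values())
```

Running such a check after every export would flag elements that were dropped or silently redefined while the model was being captured in the ontology tool.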

7
Process and Approach for Maintaining the Model
  • Identify Standards Committee
  • Standards Coordinator
  • Standards representative from each stakeholder
    organization
  • Review the requirements for the model
    frequently
  • Submit proposed change to a standards change
    queue
  • Title, Submit_Date, Due_Date, Priority, Tracking
    Number, Status
  • Useful Background Information
  • Description of change (benefits, justification,
    ...)
  • Impact study
  • Gain consensus on proposed change
  • Email straw vote
  • Email dialogs
  • Telecons with clear agenda only after convergence
    using email dialogs
  • Periodic face-to-face meetings
  • Implement the change
  • Schedule proposed change
  • Notify stakeholders of proposed change
  • Implement change
  • Notify stakeholders of change implementation
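
The change queue entry described above could be captured in a small record type. The sketch below is an assumption-laden illustration: the field names follow the slide (Title, Submit_Date, Due_Date, Priority, Tracking Number, Status), while the status values, their order, and the priority scale are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Status(Enum):
    # Hypothetical status values; the slide only names a "Status" field.
    SUBMITTED = "submitted"
    CONSENSUS = "consensus reached"
    SCHEDULED = "scheduled"
    IMPLEMENTED = "implemented"


# Assumed forward-only workflow mirroring the steps listed on the slide.
NEXT_STATUS = {
    Status.SUBMITTED: Status.CONSENSUS,
    Status.CONSENSUS: Status.SCHEDULED,
    Status.SCHEDULED: Status.IMPLEMENTED,
}


@dataclass
class ChangeRequest:
    tracking_number: int
    title: str
    submit_date: date
    due_date: date
    priority: int              # hypothetical scale: 1 = highest
    description: str           # benefits, justification
    background: str = ""       # useful background information
    impact_study: str = ""
    status: Status = Status.SUBMITTED
    notifications: list = field(default_factory=list)

    def advance(self):
        """Move to the next status and record a stakeholder notification."""
        if self.status not in NEXT_STATUS:
            raise ValueError("change is already implemented")
        self.status = NEXT_STATUS[self.status]
        self.notifications.append(f"#{self.tracking_number}: {self.status.value}")
        return self.status
```

Recording a notification on every transition keeps the "notify stakeholders" steps from being skipped when a change is scheduled or implemented.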

8
Lessons Learned
  • Need community commitment to develop and maintain
    the data model
  • Data model development and maintenance are hard
    work
  • Time and funding resources for model developers
    are significant
  • PDS maintenance of the data model and the
    repository: 3 FTEs per year
  • Initial development of a comprehensive data model
    for a community typically takes 2-3 years and
    serious commitment from the domain experts
  • The capture of information for loading into the
    PDS catalog has always been difficult
  • To load the catalog, the metadata is collected
    using a catalog template (ASCII file and standard
    format), validated against a data dictionary, and
    then loaded into the catalog where referential
    integrity is checked.
  • Data providers find it onerous collecting the
    requested information and completing the forms
    for ingestion
  • The data is often collected over a long period of
    time by different individuals, often resulting in
    multiple versions if left in separate files
  • Often data providers don't have the breadth of
    knowledge required and will have to perform
    time-consuming research
  • Intelligent data collection tools that use the
    data model to guide a data provider are highly
    recommended
  • Data providers often find that it is difficult to
    fit the collected data about a specific instance
    to the data model
  • The requirements for the data model must be clear
    even to the data provider since compromises are
    frequently made
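
The catalog loading steps described above (collect metadata in an ASCII template, validate it against a data dictionary, load it, then check referential integrity) can be sketched as follows. This is a minimal illustration, not the PDS implementation: the `KEYWORD = value` template syntax, the element names in `DICTIONARY`, and all function names are assumptions.

```python
# Hypothetical data dictionary: element name -> allowed values (None = any value).
DICTIONARY = {
    "PARTICIPANT_ID": None,
    "STUDY_ID": None,
    "ORGAN_SITE": {"Colorectal"},
}


def parse_template(text):
    """Parse 'KEYWORD = value' lines from an ASCII template into a dict."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        record[key.strip()] = value.strip()
    return record


def validate(record, dictionary):
    """Return a list of errors found by checking a record against the dictionary."""
    errors = []
    for key, value in record.items():
        if key not in dictionary:
            errors.append(f"unknown element: {key}")
        elif dictionary[key] is not None and value not in dictionary[key]:
            errors.append(f"invalid value for {key}: {value}")
    return errors


def load(catalog, table, id_key, record, dictionary):
    """Store a record in the catalog only if it passes dictionary validation."""
    errors = validate(record, dictionary)
    if errors:
        raise ValueError("; ".join(errors))
    catalog.setdefault(table, {})[record[id_key]] = record


def check_references(catalog, table, foreign_key, target_table):
    """Return ids of rows whose foreign key matches no row in target_table."""
    return [
        row_id
        for row_id, row in catalog[table].items()
        if row[foreign_key] not in catalog[target_table]
    ]
```

Keeping dictionary validation separate from the referential integrity check lets a data provider fix per-record errors early, before the cross-record check is run on the loaded catalog.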

9
Lessons Learned, cont.
  • Data providers are typically under tight time and
    funding constraints resulting in a short window
    of opportunity for guidance, validation, and
    error correction
  • Sufficient staff must be available for these
    opportunities
  • The data ingestion tool suite needs to be very
    user-friendly and have the capability for:
  • Import from a standard flat file format
  • Online repository update
  • Export to a standard flat file format
  • Other suggestions
  • Clearly document the ingestion lifecycle with all
    its requirements. If appropriate, it should be in
    sync with the project lifecycle that provides the
    information.
  • Start validation as early as possible and perform
    it often. Referential integrity seems to be the
    hardest integrity constraint to maintain.
  • The data model must evolve to stay viable. Data
    model changes must be consensus based, but should
    be limited to a very few knowledgeable
    individuals including highly regarded community
    representatives and data engineering staff.
  • Data engineering staff members who are critical
    to success are those who understand data
    modeling, classification, and data dictionary
    development and maintenance. This task is not
    software engineering or even database engineering
    intensive. The use of international information
    standards such as ISO/IEC 11179 will help
    immensely. The ontology community is coming of
    age and directly addresses this type of work.
  • The collected metadata is best managed in a
    single master repository with version control and
    role-based authentication for user update and
    retrieval
  • Timely update is critical to maintain user
    confidence
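
The recommendation of a single master repository with version control and role-based access could look roughly like the sketch below. It is an in-memory illustration under assumptions: the class name, the role names "editor" and "reader", and the 1-based version numbering are all hypothetical.

```python
class MetadataRepository:
    """In-memory master repository keeping every version of each entry."""

    def __init__(self, roles):
        self._roles = dict(roles)  # user -> "editor" or "reader" (assumed roles)
        self._history = {}         # key -> list of (user, value), oldest first

    def update(self, user, key, value):
        """Append a new version; only editors may update."""
        if self._roles.get(user) != "editor":
            raise PermissionError(f"{user} may not update the repository")
        versions = self._history.setdefault(key, [])
        versions.append((user, value))
        return len(versions)       # 1-based version number of the new entry

    def get(self, user, key, version=None):
        """Return the latest value, or a specific 1-based version."""
        if user not in self._roles:
            raise PermissionError(f"{user} may not read the repository")
        versions = self._history[key]
        index = len(versions) if version is None else version
        if not 1 <= index <= len(versions):
            raise KeyError(f"{key} has no version {version}")
        return versions[index - 1][1]
```

Because updates append rather than overwrite, earlier versions stay retrievable, which avoids the multiple diverging copies that arise when metadata is left in separate files.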