Metadata - PowerPoint PPT Presentation

About This Presentation
Title:

Metadata

Description:

Take 5 minutes, right now, to think about YOUR data, and do some brainstorming. ... native data and analysis for easier access control and separation (Brian's email) ... – PowerPoint PPT presentation

Provided by: Rom566

Transcript and Presenter's Notes


1
Metadata: What it is, and why we need it
By Roman Olschanowsky roman2u_at_sdsc.edu
2
Create some metadata
  • Take 5 minutes, right now, to think about YOUR
    data, and do some brainstorming.
  • Write down a definition of metadata, and any
    ideas you have about metadata regarding your
    files and/or datasets.

3
Metadata - data about data?
  • System metadata (most file systems)
      • Developed for the OS, not very helpful to you
      • Size, owner, permissions, timestamps, ...
  • Standardized metadata
      • File headers: JPEG, MP3, DICOM, ...
      • Dublin Core: Title, Creator, Subject, Date, ...
  • User-defined metadata
      • XML (Whatever I want!)
      • Database (Whatever I want!)
      • SRB (Whatever I want!)

4
System Metadata
  • Q: If all I have is a plain file system, how do I
    do metadata?
  • A: Organization. Build a meaningful hierarchy.

[Diagram: example directory tree rooted at Patient (Roman).
Directory nodes: label, mri, surf, brain, wm, filled, aseg, norm,
transforms, flash, Parameter_maps.
File nodes: Log File, Surface File, Label File, Transform File,
Slice Files, Flash File.]
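
A minimal sketch of "the hierarchy is the metadata", assuming a fixed four-level layout. The level names, the function `metadata_from_path`, and the example path are illustrative assumptions, not taken from the slides:

```python
from pathlib import PurePosixPath

# Hypothetical layout: <patient>/<category>/<volume>/<filename>
LEVELS = ("patient", "category", "volume", "filename")


def metadata_from_path(path):
    """Map each path component to the hierarchy level it occupies."""
    parts = PurePosixPath(path).parts
    # zip stops at the shorter sequence, so partial paths still work.
    return dict(zip(LEVELS, parts))


record = metadata_from_path("roman/mri/brain/slice_001.img")
print(record["patient"], record["category"], record["volume"])
```

The path can only answer questions the hierarchy was designed for, which is the limitation the next slide runs into.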
5
A good hierarchy - Is this enough?
  • I now have 1000s of patients.
  • Dr. Suchandsuch asks me: "How many of your
    patients have a cranial thickness greater than
    0.5 inches?"
  • We can dig through all the images and measure the
    thicknesses, but where do we store the results?
  • "50% are greater than 0.5 inches."
  • "Great! Now how many of those are male, and were
    scanned with a GE system?"
  • "Sir, 75% are male and GE; the other 25% are male
    too, but were scanned with different systems."
    (fictional numbers)

6
Standardized Metadata
  • Dublin Core: What is the bare minimum metadata
    that needs to be present?
      • Everybody's idea of the bare minimum is different.
      • What's left isn't very useful (Format: PowerPoint
        file).
  • File headers
      • Very useful
      • (Think of them as system metadata for that file
        type)
      • Width: 10 px; bit rate: 128 kbps; scanner: GE
      • But the more files you have, the slower it gets!
      • Who decides what that header is? Does everybody
        actually follow that standard?
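
As a rough illustration of header metadata, the sketch below reads width and height from the first bytes of a PNG file. PNG is swapped in here only because its header layout is short and fixed; the slides name JPEG, MP3, and DICOM, whose headers are more involved. The function name and the hand-built sample bytes are assumptions for the example:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from the IHDR chunk at the start of a PNG."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian unsigned 32-bit integers.
    return struct.unpack(">II", data[16:24])


# Hand-built header bytes for a 10 x 4 image (illustrative, not a full PNG).
header = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)  # IHDR data length
    + b"IHDR"
    + struct.pack(">II", 10, 4)
)
width, height = png_dimensions(header)
print(width, height)
```

Answering a question across a whole collection this way means opening every file, which is why header-only metadata gets slower as the number of files grows.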

7
User defined metadata
  • Finally, a place to store my cranial thickness
    attribute.
  • XML
      • Great! It's not platform or application specific.
      • But it's usually slow, with lots of overhead.
  • Database
      • Great! It's fast, it gives me my answers, and it's
        more flexible (primary / foreign keys).
      • But it's expensive (labor, licenses). Worst of
        all, it's separate from the data, so things can
        get out of sync.
  • SRB
      • Great! It's fast and it's a part of the same
        system as the data.
      • But what if I take the data out of the system?
        How does the metadata leave too?
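
A minimal sketch of the database option, using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and the four sample rows are invented for illustration; they show how the questions from Dr. Suchandsuch become one-line queries once the attribute is stored:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient ("
    " id INTEGER PRIMARY KEY, sex TEXT, scanner TEXT, cranial_thickness_in REAL)"
)
# Invented sample rows: (id, sex, scanner, cranial thickness in inches).
rows = [
    (1, "M", "GE", 0.62),
    (2, "F", "GE", 0.55),
    (3, "M", "Siemens", 0.58),
    (4, "M", "GE", 0.41),
]
conn.executemany("INSERT INTO patient VALUES (?, ?, ?, ?)", rows)

# How many patients have a cranial thickness greater than 0.5 inches?
thick = conn.execute(
    "SELECT COUNT(*) FROM patient WHERE cranial_thickness_in > 0.5"
).fetchone()[0]

# How many of those are male and were scanned with a GE system?
thick_male_ge = conn.execute(
    "SELECT COUNT(*) FROM patient"
    " WHERE cranial_thickness_in > 0.5 AND sex = 'M' AND scanner = 'GE'"
).fetchone()[0]

print(thick, thick_male_ge)
```

Nothing here ties a row to the image files themselves, so the out-of-sync risk noted above still applies.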

8
BIRN Human Collection and Metadata hierarchy
[Diagram: collection hierarchy with XML files attached at each level.
Labels, top to bottom:]
  • Analyses on many subjects across institutions
  • BIRN_ID, Timestamp
  • Analyses on a subject across institutions and
    studies
  • VisitID?
  • Analyses on many series of a subject within an
    institution
  • StudyID?
  • Analyses on multiple Series done at 1 institution
  • Image/Scanner Parameters?
  • Analyses on images from this Series
  • Freesurfer, LDMM
  • Original is a pointer to the corresponding
    original scanner format
  • XML file?
9
Columns: Directory Hierarchy | SRB Metadata | XML elements
(non-structural) | HID Database | Notes
  • BIRN
      • Notes: Should analyses that cross multiple data
        levels be split out to a separate hierarchy?
  • Human
      • Notes: All Analysis collections are writeable so
        that users can create their own analysis
        collections (snapshots).
  • Research Project (Name__ID), with Analysis
      • SRB metadata: Project ID
      • XML elements: <project>
      • HID database: nc_experiment
  • Subject (BIRN ID), with Analysis
      • SRB metadata: BIRN ID, Timestamp
      • XML elements: <subjectConst>
      • HID database: nc_humanSubject
  • Institution Visit (Visit__Site ID_Visit ), with Analysis
      • SRB metadata: Visit ID, Institution ID
      • XML elements: <subjectVar>
      • HID database: nc_expComponent
  • Study (Study__ID ), with Analysis
      • SRB metadata: Study ID
      • XML elements: <scanner>
  • Series (Series__localID), with Analysis
      • SRB metadata: Series Number, Scanner Parameters?
      • HID database: nc_expSegment and protocol section
  • Native / Analysis
      • SRB metadata: Image Parameters?
      • XML elements: <acqProtocol>, <expProtocol>, <datarec>
      • HID database: research and derived data sections
      • Notes: Separate the native data and analysis for
        easier access control and separation (Brian's
        email). Native Data represents an upload of the
        original data; Analysis represents a different
        analysis (either partial or full).
  • Snapshot 1 ... Snapshot N (Ver__SER): DICOM, AFNI,
    Analyze; Analysis Sub Tree
      • Notes: Derived versions of an individual series
        should remain with that series?
10
All problems solved?
  • Why are you calling it skull thickness?
  • It's supposed to be cranial thickness!
  • You have to query on brain, not Purkinje cell.
  • But a Purkinje cell is part of the brain.
    Shouldn't the system know that?

11
Ontologies
  • For AI systems, what "exists" is that which can
    be represented. When the knowledge about a domain
    is represented in a declarative language, the set
    of objects that can be represented is called the
    universe of discourse.
  • We can describe the ontology of a program by
    defining a set of representational terms.
    Definitions associate the names of entities in
    the universe of discourse (e.g. classes,
    relations, functions or other objects) with
    human-readable text describing what the names
    mean, and formal axioms that constrain the
    interpretation and well-formed use of these
    terms.
  • Formally, an ontology is the statement of a
    logical theory.

12
Distribution of Ryanodine receptor in cerebellum?
  • Navigates down domain map
  • Situates result in context of domain map

Brain (has a) Cerebellum (has a) Purkinje Cell Layer (has a)
Purkinje cell (is a) neuron
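
The navigation idea can be sketched with a toy domain map. This is only an illustration, assuming the slide's "has a" links are stored in the reverse direction as `part_of` edges; the edge list, relation names, and both functions are made up for the example and are unrelated to the F-logic encoding mentioned on the next slide:

```python
# Toy domain map: (child, relation, parent) edges based on the slide's chain.
EDGES = [
    ("Cerebellum", "part_of", "Brain"),
    ("Purkinje Cell Layer", "part_of", "Cerebellum"),
    ("Purkinje cell", "part_of", "Purkinje Cell Layer"),
    ("Purkinje cell", "is_a", "neuron"),
]


def ancestors(term):
    """Return every concept reachable from term via part_of or is_a edges."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child, _relation, parent in EDGES:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def matches(annotation, query):
    """True if data annotated with `annotation` should answer `query`."""
    return annotation == query or query in ancestors(annotation)


print(matches("Purkinje cell", "Brain"))
print(matches("Brain", "Purkinje cell"))
```

With such a map, a query on "Brain" can return data annotated only with "Purkinje cell", without the user needing to know which term the annotator chose.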
13
ANATOM Domain Map
  • Rule-based ontology map
  • Encodes conceptual and semantic relationships
    using F-logic

14
Integrated Knowledge Map
15
Scared?
  • Do
      • Design a file hierarchy.
      • Agree on a standard vocabulary.
      • Add metadata in the right places, and in several
        places.
      • You can always add or change things later; it
        doesn't have to be perfect the first time.
      • If it's there, you will use it!
      • What metadata do other people want?
      • Automate the process! (scripts and/or workflows)
  • Do not
      • Wait. It's harder to add metadata after the fact.
      • Do things manually (see 7 above).
      • Attempt an ontology; professionals are working on
        them already! (Unless it's already in your
        approved grant.)
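
One possible shape for the "automate the process" advice is a script that writes a small XML sidecar next to each data file at ingest time. This is a sketch under assumptions: the `.meta.xml` suffix, the element names, and `write_sidecar` are hypothetical, and a temporary directory stands in for a real data collection:

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_sidecar(data_path, attributes):
    """Write <data_path>.meta.xml holding the file size plus user attributes."""
    root = ET.Element("metadata")
    ET.SubElement(root, "size_bytes").text = str(os.path.getsize(data_path))
    for name, value in attributes.items():
        ET.SubElement(root, name).text = str(value)
    sidecar_path = data_path + ".meta.xml"
    ET.ElementTree(root).write(sidecar_path, encoding="utf-8", xml_declaration=True)
    return sidecar_path


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in data file; a real script would loop over an existing tree.
    data_file = os.path.join(workdir, "slice_001.img")
    with open(data_file, "wb") as handle:
        handle.write(b"\x00" * 16)

    sidecar = write_sidecar(
        data_file, {"scanner": "GE", "cranial_thickness_in": 0.62}
    )
    parsed = ET.parse(sidecar).getroot()
    size_bytes = int(parsed.findtext("size_bytes"))
    thickness = float(parsed.findtext("cranial_thickness_in"))

print(size_bytes, thickness)
```

Because the sidecar travels with the file, it stays readable outside any one system, at the cost of the XML overhead noted earlier.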

16
Review your notes
  • Take another 5 minutes to go over your notes
    about metadata
  • Any big changes you would make?
  • Write down any changes, additions, and
    revelations.
  • Share with us some of your discoveries.

17
Thanks!
  • Questions?
  • www.sdsc.edu/srb
  • srb_at_sdsc.edu