The medGIFT project on medical image retrieval presentation

1
The medGIFT project on medical image retrieval

Medical Imaging and Telemedicine (MIT 2005)

Henning Müller, Medical Informatics Service
2
Outline

Geneva hospitals and medical informatics
Medical image retrieval
Why, how, what?
The medGIFT retrieval framework
MRML, system integration, ...
Image pre-processing
Needs analysis of medical image users
Retrieval system evaluation
ImageCLEF benchmarking event
Conclusions

3
Hospitals and medical informatics
4
Geneva University Hospitals

2,200 beds, 6 hospitals
900 beds in the main clinic
780,000 hospital days
10,000 employees
1,300 MDs
22,000 operations per year
30,000 images per day
6,000 computers
Budget > 1 billion/year
Research and teaching have high importance
Geneva is strong in bioinformatics, genetics,
neurosciences
Service for medical informatics - management
informatics

5
Medical Informatics Service

60 employees, part of radiology
vs. administrative informatics
10 persons in research
Research areas
Multimedia electronic patient record
Decision support systems
Telemedicine, especially with African countries
Knowledge representation, natural language
processing, data mining
Image processing, PACS, operation planning
Teaching
Postgraduate course in medical informatics
Virtual campus for medical students in medical
informatics

6
Image Retrieval
7
Image retrieval
8
Content-based image retrieval

Based on visual features and visual queries
Query by image example, query by sketch, query by
region
Visual features include color histograms, texture
descriptors, shape descriptors, etc.
But query formulation is difficult
Page zero problem for query by example
Now match visual features and semantics, try
object recognition of simple objects
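
Query by image example with a color histogram can be sketched as follows. This is a minimal illustration, not the system presented in these slides: the RGB quantization into 4 bins per channel, the histogram intersection measure, and all function names are assumptions chosen for brevity.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram of quantized RGB pixels (channel values 0..255)."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the sum of the bin-wise minima."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())


def query_by_example(query_pixels, database):
    """Rank database images (name -> pixel list) by similarity to the query."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    return sorted(scored, reverse=True)


database = {
    "dark": [(10, 10, 10)] * 4,
    "bright": [(250, 250, 250)] * 4,
}
ranking = query_by_example([(20, 20, 20)] * 4, database)
```

Here the dark query image shares its single histogram bin with the "dark" database image, so that image is ranked first. The example also shows why query formulation is hard: the user must already have a suitable example image to start from.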

9
A medical example
10
Global structure of retrieval systems
11
Medical image retrieval: Why?

Increasing variety and amount of imaging in
medicine (diagnostics, treatment planning,
follow-up, ...)
Hard to know everything extremely well
Currently, images are mainly accessed by patient
ID, used in a single context
Much information stored in images and connected
text
Little of this knowledge is exploited
Case-based reasoning and evidence-based medicine
need tools to integrate visual data as well
Standardized methods, less dependent on MDs'
personal experience

12
Medical image retrieval: How?

Create annotated datasets for real tasks such as
diagnostic aid (administrative burdens)
To model expert knowledge
Infrastructures and database techniques exist
Web-based, ...
Visual features and classification/retrieval
techniques need to be optimized for the
problem
Integrate all knowledge available for a case
Visual (several varied images), textual (release
letter, etc.), numerical (lab results)
Include real users (feedback loops)

13
Medical image retrieval: What?

Application for teaching
Help lecturers to find images
Help students to browse catalogs (continuing
education)
Replace books? Same environment as in the
hospital
Application in research
Optimize case selection for studies
Include visual features into studies
Visual data mining, visual knowledge management
Application as diagnostic aid
In specialized domains
Automation of processes
DICOM header correction, automatic annotation

14
medGIFT
15
The GIFT framework

GIFT: GNU Image Finding Tool
Open source, free of charge, Linux
Techniques from text retrieval
Framework of components to avoid the
redevelopment of large parts for every project
Web-based interfaces
MRML: Multimedia Retrieval Markup Language
Features can be plugged in, parameterized
Feedback schemes
Pruning methods, to allow interactive search
medGIFT: added utilities and integration into
medical applications

16
Framework overview
17
medGIFT

http://www.sim.hcuge.ch/medgift/ (open source)
Project for content-based search in medical image
databases
Goals of the project
Better management of visual medical data
(retrieval)
Visual Knowledge Management
Textual and visual data
Diagnostic aid
Specialized retrieval (lung CTs, fractures,
dermatologic images)
Access to PACS data
In the short term
Research, Teaching

18
Interface
Query image
Diagnosis
Link to casimage
Similarity score
19
Visual features

Global color histogram (HSV: 18 hues, 3 saturations,
3 values, plus 4 grey levels)
Color blocks at different scales and locations
Histogram of Gabor filter responses
4 directions, 3 scales, quantized into 10 strengths
Gabor blocks at different scales and locations
85,000 possible features, 1,000-3,000 features
per image, distribution similar to words in text
collections
Roughly Zipf distribution
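
The histogram sizes follow from the quantization: 18 x 3 x 3 chromatic bins plus 4 grey levels give 166 color bins, and 4 directions x 3 scales x 10 strengths give 120 Gabor bins. A minimal sketch of the color quantization is shown below. Reading "18, 3, 3" as hue, saturation and value bins is an interpretation of the slide, and the rule that sends low-saturation pixels to the grey bins (the `grey_saturation` threshold) is purely an assumption for illustration.

```python
import colorsys

HUES, SATS, VALS, GREYS = 18, 3, 3, 4
N_COLOR_BINS = HUES * SATS * VALS   # 162 chromatic bins
N_BINS = N_COLOR_BINS + GREYS       # 166 bins in total


def quantize(r, g, b, grey_saturation=0.1):
    """Map an RGB pixel (0..255 per channel) to one of the 166 bins."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < grey_saturation:
        # Assumed rule: nearly unsaturated pixels count as grey levels.
        return N_COLOR_BINS + min(int(v * GREYS), GREYS - 1)
    hue_bin = min(int(h * HUES), HUES - 1)
    sat_bin = min(int(s * SATS), SATS - 1)
    val_bin = min(int(v * VALS), VALS - 1)
    return (hue_bin * SATS + sat_bin) * VALS + val_bin


def global_histogram(pixels):
    """Count how many pixels fall into each of the 166 bins."""
    hist = [0] * N_BINS
    for r, g, b in pixels:
        hist[quantize(r, g, b)] += 1
    return hist


hist = global_histogram([(0, 0, 0), (255, 255, 255), (255, 0, 0)])
```

Each non-empty bin can then be treated like a word occurring in a text document, which is what makes text retrieval techniques applicable.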

20
Weighting schemes

Classical tf/idf
tf - term frequency
cf - collection frequency
j - feature number
Q - query with i = 1..N input images
k - possible result image
R - Relevance of an image in a query
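
The formula shown on this slide is not part of the transcript. The sketch below uses the symbols listed above in one common tf/idf-style form, which is an assumption and not necessarily the exact formula of the slide: a feature weight is the mean of tf_ij * R_i over the N query images, multiplied by log^2(1/cf_j), and the score of a result image k is the sum of tf_kj times that weight over all features j.

```python
import math


def feature_weights(query_images, relevances, cf):
    """Weight per feature j for a query Q.

    query_images: list of {feature j: tf_ij}, one dict per input image i
    relevances:   list of R_i, e.g. +1 (relevant) or -1 (non-relevant)
    cf:           {feature j: collection frequency in (0, 1]}
    """
    n = len(query_images)
    sums = {}
    for tf_i, r_i in zip(query_images, relevances):
        for j, tf in tf_i.items():
            sums[j] = sums.get(j, 0.0) + tf * r_i
    # Rare features (small cf) get a large weight, frequent ones a small weight.
    return {j: (s / n) * math.log(1.0 / cf[j]) ** 2 for j, s in sums.items()}


def score(candidate, weights):
    """Score of a possible result image k, given as {feature j: tf_kj}."""
    return sum(tf * weights.get(j, 0.0) for j, tf in candidate.items())


cf = {"rare": 0.5, "everywhere": 1.0}
weights = feature_weights([{"rare": 1.0, "everywhere": 1.0}], [1], cf)
score_rare = score({"rare": 2.0}, weights)
score_common = score({"everywhere": 5.0}, weights)
```

A feature that occurs in every image of the collection (cf = 1) gets weight zero, so only the rare feature contributes to the ranking in this example.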

21
Combination of visual and textual features

EasyIR text search engine, also open source
(EPFL)
Frequency-based techniques similar to GIFT
Stemming and stop word removal to improve
results, also for multilingual search
Mapping to MeSH terms reliably delivers only a few
terms, but high-quality results
Linear combination of normalized results of text
and visual system
The optimal factors vary depending on the
query
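
A linear combination of normalized result lists can be sketched as below. Dividing by the top score is only one possible normalization and is an assumption here, as are the function names and the `text_weight` parameter.

```python
def normalize(scores):
    """Scale scores so that the best result gets 1.0."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {image_id: 0.0 for image_id in scores}
    return {image_id: value / top for image_id, value in scores.items()}


def combine(visual, textual, text_weight=0.5):
    """Merge two {image id: score} result lists into one ranked list."""
    v, t = normalize(visual), normalize(textual)
    combined = {
        image_id: (1 - text_weight) * v.get(image_id, 0.0)
        + text_weight * t.get(image_id, 0.0)
        for image_id in set(v) | set(t)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


visual = {"a": 10.0, "b": 5.0}
textual = {"b": 2.0, "c": 1.0}
mixed = combine(visual, textual, text_weight=0.5)
visual_only = combine(visual, textual, text_weight=0.0)
```

With equal weights, image "b" wins because both systems return it. With `text_weight=0.0` the purely visual ranking is restored and "a" is first, which illustrates why a single fixed factor cannot be optimal for every query.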

22
Relevance feedback

One-image queries normally do not lead to very
good results
Mainly false positives
Several input images improve the query quality
enormously
Negative feedback is extremely important
Positive feedback often only reorders the
highest-ranked results
But problems with too much negative feedback in
many systems
Log files of a web demo allow user behavior
to be analyzed
Learning of feature weightings as an additional
factor
Long-term learning from the user interaction
Changes of feature sets during feedback
First tests promise good results

23
Long-term learning

Learn automatically from user interaction on
non-classified databases
Log files from past interaction are used to
improve future results
Images marked together by users in the same query
step are taken into account
Positive, negative, neutral
Images marked together have something in common
Learning can include several levels (same user,
same database, same domain, ...)

24
Using this as additional factor for weighting

The goal is learning on a feature basis, not on
an image basis
Positive and negative feature occurrences
Additional factor in the frequency-based
weighting for each feature
With much feedback a pure probability approach
might be possible, as well as on an image level
Results are improved significantly, although web
demo is not reliable
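
How marks from past query steps could be turned into an additional per-feature factor is sketched below. The counting rule and the ratio (1 + positive) / (1 + negative) are assumptions made for illustration. The slides only state that positive and negative feature occurrences are used.

```python
from itertools import combinations


def feature_factors(log_steps, image_features):
    """Derive a multiplicative factor per feature from logged feedback.

    log_steps:      list of {image id: mark}, mark is +1, -1 or 0 (neutral)
    image_features: {image id: set of features present in the image}
    """
    positive, negative = {}, {}
    for marks in log_steps:
        rated = [(img, m) for img, m in marks.items() if m != 0]
        for (a, mark_a), (b, mark_b) in combinations(rated, 2):
            if mark_a > 0 and mark_b > 0:
                target = positive   # both relevant: shared features helped
            elif mark_a * mark_b < 0:
                target = negative   # one relevant, one not: shared features misled
            else:
                continue            # both non-relevant: treated as no evidence
            for feature in image_features[a] & image_features[b]:
                target[feature] = target.get(feature, 0) + 1
    features = set(positive) | set(negative)
    return {
        f: (1 + positive.get(f, 0)) / (1 + negative.get(f, 0)) for f in features
    }


image_features = {"i1": {"f1", "f2"}, "i2": {"f1"}, "i3": {"f2"}}
log_steps = [{"i1": 1, "i2": 1, "i3": -1}]
factors = feature_factors(log_steps, image_features)
```

In this example "f1" is shared by two images marked relevant together and gets a factor above 1, while "f2" is shared by a relevant and a non-relevant image and gets a factor below 1. Such a factor could then be multiplied into the frequency-based weight of each feature.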

25
Casimage: a radiological case database

Case database for teaching
http://www.casimage.com/, interface developed
with the proprietary 4D software
>65,000 images, 9,000 images externally
accessible, 500 added per week
Case descriptions (textual) available in XML
Highly variable quality
Mix of French and English
Interface is compatible with the MIRC (Medical
Image Resource Center) standard of the RSNA

26
GIFT/casimage
27
GIFT integration

medGIFT -> casimage
Simple link from image to case
Important to get info on images
Casimage -> medGIFT
Constraint: no change to a running routine
application of the hospital
Simple button under an image with a link opening
a new browser window
PHP interface traces address and downloads the
images, then executes a query

28
Image pre-treatment
29
Lung segmentation

Concentrate visual search on an important region
of the image

30
Lung block analysis and classification

Segmentation of the lung
Cutting of the lung into blocks
Feature extraction from blocks
Classification of blocks into several classes
(8 in our case)
Learning database containing 112 annotated
regions (1000 blocks of size 32x32)
Features: co-occurrence matrices, Gabor filters,
grey level histograms, ...
SVMs reach 84% accuracy healthy/non-healthy, 85%
into 8 classes
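
The block cutting and one of the listed feature types can be sketched as follows. The image and the lung mask are plain nested lists here, the rule of keeping only blocks that lie completely inside the mask is an assumption, and the classifier itself (an SVM in the slides) is left out.

```python
def lung_blocks(image, mask, size=32):
    """Yield (row, col, block) for each size x size block fully inside the mask."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            inside = all(
                mask[y][x]
                for y in range(r, r + size)
                for x in range(c, c + size)
            )
            if inside:
                yield r, c, [row[c:c + size] for row in image[r:r + size]]


def grey_histogram(block, bins=8, max_grey=256):
    """Normalized grey level histogram of one block (grey values 0..max_grey-1)."""
    hist = [0] * bins
    for row in block:
        for grey in row:
            hist[grey * bins // max_grey] += 1
    total = sum(hist)
    return [count / total for count in hist]


# Tiny 4x4 example with 2x2 blocks: only the left half belongs to the "lung".
image = [[0, 0, 255, 255] for _ in range(4)]
mask = [[True, True, False, False] for _ in range(4)]
blocks = list(lung_blocks(image, mask, size=2))
features = [grey_histogram(block, bins=2) for _, _, block in blocks]
```

Each feature vector would then be passed to a classifier trained on the annotated blocks.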

31
Another problem: Noise around the object
Hospital logo
Text in the images
Specific problems
Large regions with no information
32
Object extraction

Mostly small structures with high frequencies
Object in the center, one large connected
component
Remove certain objects specifically (logo, grey
square)
Remove small structures
Query only on the image object
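
Keeping the one large connected component and dropping small structures can be sketched with a breadth-first search over a binary mask. How the mask is obtained (for example by thresholding) is not covered, and the 4-neighbourhood is an assumption.

```python
from collections import deque


def largest_component(mask):
    """Return a mask that keeps only the largest 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, component = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    keep = set(best)
    return [[(r, c) in keep for c in range(cols)] for r in range(rows)]


mask = [
    [1, 0, 0, 0],   # isolated pixel, e.g. part of a logo or text
    [0, 0, 1, 1],
    [0, 0, 1, 1],   # the 2x2 region is the main object
]
cleaned = largest_component(mask)
```

The isolated pixel in the corner is removed and only the 2x2 object remains, so feature extraction and the query can be restricted to it.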

33
Object extraction steps
34
Object extraction examples
35
User needs
36
User needs

How to find out what the user really needs?
They will not tell you by themselves
Future use of images in medicine
HON (health on the net) media search
Log files from the web search engine
Mainly patients searching for information
Surveys among various medical professionals
Students, librarians
Clinicians, researchers, lecturers
Survey at OHSU and Geneva among 33 persons
Practical experiences when dealing with a PACS

37
Log file analysis of HONmedia search

http://www.hon.ch/HONmedia
2000 searches per month
Preliminary results (Jan 2005)
More French than English (2/1), mainly 1-3 words
Mostly diagnosis and anatomic region, sometimes
combined
Leukemia, tumeur glomique (glomus tumor), fracture, ...
Many general questions
Childbirth, medical images, medical media, ...
Also XXX

38
Analysis of survey: Questions

For which tasks are images useful for you?
What type of images do you use for each task?
Where and how do you search for images?
How do you define whether an image found is
relevant or not?
What kind of search would be useful for you?
Separately for the following areas: research,
clinics, lecturer, student, librarian
18 participants in Geneva, 15 in Portland (OHSU)
Mainly researcher/clinician/lecturer combined

39
Analysis of survey: first results

Tasks are extremely different depending on
department, specific work, and experience
Mostly diagnostics and conference presentations
In diagnostics mainly radiographs and much CT,
for research and teaching CTs and illustrations
Most search in the PACS, but frequently also on
Google, in our teaching file, and on specialized
pages
Relevance is defined by experience, problems on
the web with bad resolution/quality
Most wanted a search by pathology added and the
possibility to find similar cases to a current
patient

40
Performance Evaluation
41
Overview image retrieval benchmarks

Birds-I, Benchathlon
SPIE Electronic Imaging
Personal proposals
C. Leung,
ImageEval
French only
ImageCLEF
Cross Language Evaluation Forum
Four tasks in total, two medical tasks for image
retrieval and classification

42
CLEF and ImageCLEF

Located at the Cross Language Evaluation Forum
(CLEF)
Goal is to evaluate the retrieval of images
through multi-lingual information retrieval
And not necessarily based on image information
2003: a first image retrieval task with 4
participants
Queries in different languages than the English
collection annotation, image is part of the query
2004: 17 participants for two tasks (200 runs)
Medical task for visual image retrieval added
where the query topic is an image, only, and the
text is English/French mixed
Evaluation of interactive image retrieval
2005: 24 participants for four tasks, >300 runs,
36 registrations
Medical retrieval and classification tasks

43
ImageCLEF 2005 examples
Show me x-ray images with fractures of the femur.
Zeige mir Röntgenbilder mit Brüchen des Oberschenkelknochens.
Montre-moi des fractures du fémur.
Show me chest CT images with emphysema.
Zeige mir Lungen CTs mit einem Emphysem.
Montre-moi des CTs pulmonaires avec un emphysème.
Show me any photograph showing malignant melanoma.
Zeige mir Bilder bösartiger Melanome.
Montre-moi des images de mélanomes malignes.
44
ImageCLEF results

Resources: 50,000 images for retrieval and 10,000
images for classification
Annotation in English/French/German
Query includes text and 1-3 images
3 types of queries (visual, mixed, semantic)
Average results are better using text than
images, best results are text+visual
130 runs submitted, mostly mixed, little feedback
Best result: IPAL/I2R (MAP 0.2821)
Best visual MAP 0.1455, best textual MAP 0.2084
Results vary extremely over queries
Classification task: 87.4% best rate for 57
classes
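
The MAP values quoted above are mean average precision scores. A minimal sketch of the standard computation, with hypothetical function names and toy data, is:

```python
def average_precision(ranked, relevant):
    """Average of the precision values at the rank of each relevant result."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked result list, set of relevant items), one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["a", "x", "b"], {"a", "b"}),   # AP = (1/1 + 2/3) / 2
    (["x", "a"], {"a"}),             # AP = 1/2
]
map_score = mean_average_precision(runs)
```

Because the measure is averaged over all query topics, a single MAP value hides the large variation between queries noted above.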

45
Conclusions
46
Conclusions

Content-based medical image retrieval can become
important in teaching, research and diagnostics
To use the inherently stored knowledge of images
Integration of various data sources and images
More is needed than a technical solution
Users need to be included in the development
Hospitals need to work with computer science
researchers (more communication)
Standardized evaluation is needed to identify
promising techniques

47
Questions?
