medGIFT

1
medGIFT

Knowledge In Motion, 15.12.2004

Henning Müller, University of Geneva, Service of Medical Informatics
2
Overview

Introduction to image retrieval
medGIFT and GIFT
Features
Feature weightings
Inverted file and pruning
Relevance feedback
MRML and user interfaces
Optimization, learning from log files
Evaluation
Pre-treatment of images, specialization
Conclusions

3
Content-based image retrieval

Query formulation with visual means (no text)
Images: QBE (query by example(s))
Positive and negative relevance feedback
Components
Feature extractor (continuous, binary features)
Image representation
Distance measure or feature weighting
Indexing structure for quick access (data base)
User interface for effective and efficient
interaction with the user
Treatment of relevance feedback

4
Framework
5
Interface
6
GIFT and medGIFT

GNU Image Finding Tool
http://www.gnu.org/software/gift/
GPL: GNU General Public License
Medical domain has certain differences
More importance on grey levels
More importance on texture features
Segmentation to index important regions
Or at least to remove background, normalize grey
levels, ...
Interface needs to display more information
Integration into medical applications is needed
Evaluation and reference databases are important

7
Features

Need to correspond to how we perceive an image
Colors, textures, shapes, objects, feelings, ...
Biederman 1987 (objects)
Tversky 1977 (binary stimulations instead of
continuous)
Segmentation of general images is impossible
Global features, saliency points, ...
Commonly used features
Local and global features
Colors, textures, shapes
Features that are invariant to rotations,
scaling, shifts, ...

8
medGIFT features

Image is scaled to 256x256 pixels
Global color features
Color histogram in the HSV space (Hue,
Saturation, Value), 18 hues, 3 saturations, 3
values, 4 grey levels
Local color features at several scales
HSV, image successively partitioned into four
blocks, mode color of each block
Global texture features (Gabor histogram)
4 directions, 3 scales, 10 quantisations of
strengths
Local Gabor blocks
Only smallest blocks of quadtree, Gabors of same
configuration as above
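The global color histogram described above has 18 x 3 x 3 + 4 = 166 bins. The sketch below shows one way to compute such a histogram with the standard colorsys module. The saturation threshold that decides when a pixel counts as grey, the uniform bin boundaries, and the bin ordering are assumptions of the example and are not taken from the GIFT source.

```python
import colorsys

N_HUES, N_SATS, N_VALS, N_GREYS = 18, 3, 3, 4
N_COLOR_BINS = N_HUES * N_SATS * N_VALS
GREY_SATURATION = 0.1  # assumed cut-off: below this a pixel counts as grey


def hsv_bin(r, g, b):
    """Map an RGB pixel (0-255 per channel) to a bin index 0..165."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < GREY_SATURATION:
        # Grey pixels: quantize brightness only; bins follow the color bins.
        return N_COLOR_BINS + min(int(v * N_GREYS), N_GREYS - 1)
    hq = min(int(h * N_HUES), N_HUES - 1)
    sq = min(int(s * N_SATS), N_SATS - 1)
    vq = min(int(v * N_VALS), N_VALS - 1)
    return (hq * N_SATS + sq) * N_VALS + vq


def color_histogram(pixels):
    """pixels: iterable of (r, g, b) tuples -> normalized 166-bin histogram."""
    hist = [0] * (N_COLOR_BINS + N_GREYS)
    count = 0
    for r, g, b in pixels:
        hist[hsv_bin(r, g, b)] += 1
        count += 1
    total = float(count) or 1.0
    return [c / total for c in hist]
```

The local color features would apply the same quantization to each block of the successive four-way partition and keep only the most frequent (mode) bin per block.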

9
Local color features
10
HSV vs RGB
11
Gabor filter responses

Measure energy in a direction/scale (pixel-based)

12
Wavelets
13
Features statistically similar to words

Features are quantised to correspond to word
distributions
Binary or almost binary
Gabor filter responses into ten strengths
Lowest band can be discarded
Colors in blocks are present or absent, each
color is one feature
Frequent and rare features exist
Zipf distribution
Advantage: we can use text retrieval techniques
Treat a large number of features (90,000
possible, 2000 per image)
Problem: some information can be lost
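The quantisation step above can be sketched as follows. This is a minimal illustration, not the GIFT code: the function names, the (direction, scale, band) token layout, and the linear band boundaries are assumptions made for the example. It turns continuous Gabor energies into discrete, word-like features and drops the lowest band.

```python
def quantize_strength(energy, max_energy, bands=10):
    """Map a continuous filter energy to a band index 0..bands-1.

    Linear band boundaries are an assumption of this sketch.
    """
    if max_energy <= 0:
        return 0
    ratio = min(max(energy / max_energy, 0.0), 1.0)
    return min(int(ratio * bands), bands - 1)


def gabor_tokens(energies, max_energy, bands=10):
    """Turn {(direction, scale): energy} into a set of discrete tokens.

    Each token (direction, scale, band) is either present or absent,
    like a word in a text document. Band 0 (weakest response) is dropped.
    """
    tokens = set()
    for (direction, scale), energy in energies.items():
        band = quantize_strength(energy, max_energy, bands)
        if band > 0:
            tokens.add((direction, scale, band))
    return tokens
```

Because each token is simply present or absent, an image becomes a sparse "document" of features, which is what makes text retrieval techniques applicable.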

14
Typical distance measures

Euclidean distance
City-block distance
Histogram intersection
In text retrieval
Frequency based feature weights
tf: term frequency
cf: collection frequency
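For reference, the three measures listed above can be written in a few lines. This is a generic sketch over equal-length feature vectors or normalized histograms; the function names are chosen for the example.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """Sum of absolute differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def histogram_intersection(a, b):
    """Sum of bin-wise minima; for normalized histograms 1.0 means identical."""
    return sum(min(x, y) for x, y in zip(a, b))
```

Note that the first two are distances (smaller is more similar), while histogram intersection is a similarity (larger is more similar).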

15
Histogram intersection (global color, texture)
16
Frequency-based feature weights

A feature frequent in an image describes this
image well. (tf high)
A feature frequent in the collection does not
distinguish well between images. (cf high)
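The two statements above can be illustrated with a simple weight of the form tf * log(1/cf). This formula is an assumption borrowed from classic text retrieval and is only a sketch; the transcript does not give the exact weighting GIFT uses. Here cf is taken as the fraction of images in the collection that contain the feature.

```python
import math


def collection_frequencies(images):
    """images: {image_id: {feature: tf}} -> {feature: fraction of images with it}."""
    counts = {}
    for features in images.values():
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1
    n = len(images)
    return {feature: c / n for feature, c in counts.items()}


def feature_weight(tf, cf):
    """High tf raises the weight; a feature present everywhere (cf = 1) gets 0."""
    return tf * math.log(1.0 / cf)


def score(query_features, image, cf):
    """Sum the weights of the query features that the image also contains."""
    return sum(
        feature_weight(image[f], cf[f])
        for f in query_features
        if f in image and f in cf
    )
```

In this sketch a feature that occurs in every image contributes nothing to the score, while a rare feature shared by query and image contributes a lot.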

17
GIFT weighting

All features weighted together allow small-scale
textures to dominate (high in number)
Evaluate and normalize four feature groups
separately and combine the normalized results
Much better final result with a separate
normalization
Interface allows partial selection of features
Each of the four groups has good and bad results
Normalization is done with the score of the
pseudo-image, not the highest-scoring one
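A minimal sketch of the separate normalization described above. It assumes each feature group yields a raw score per image and that the score the query pseudo-image would obtain against itself is known for each group; the group names, the data layout, and the plain averaging are hypothetical choices for the example.

```python
def combine_groups(group_scores, self_scores):
    """Normalize each feature group separately, then average the groups.

    group_scores: {group: {image_id: raw score}}
    self_scores:  {group: score of the query pseudo-image against itself}
    """
    combined = {}
    for group, scores in group_scores.items():
        norm = self_scores[group]
        if norm <= 0:
            continue  # group carries no usable information for this query
        for image_id, raw in scores.items():
            combined[image_id] = combined.get(image_id, 0.0) + raw / norm
    n_groups = len(group_scores)
    return {image_id: s / n_groups for image_id, s in combined.items()}
```

Usage: with raw color scores on a scale of a few units and texture scores on a scale of hundreds, both groups contribute values between 0 and 1 after normalization, so the numerous small-scale texture features no longer dominate.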

18
Inverted file

Access feature by feature instead of document by
document
Extremely fast access for rare features
Pentium IV 2.8 GHz, 9000 images: 0.5 seconds
Efficient for sparsely populated spaces
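The idea of feature-by-feature access can be sketched with a dictionary that maps each feature to the images containing it. This is an illustrative in-memory version with hypothetical names, not the on-disk structure GIFT uses.

```python
from collections import defaultdict


def build_inverted_file(images):
    """images: {image_id: set of features} -> {feature: [image_id, ...]}."""
    index = defaultdict(list)
    for image_id, features in images.items():
        for feature in features:
            index[feature].append(image_id)
    return index


def query(index, query_features):
    """Count shared features per image, touching only the lists of query features."""
    scores = defaultdict(int)
    for feature in query_features:
        for image_id in index.get(feature, ()):
            scores[image_id] += 1
    # Best match first; ties broken by image id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A rare feature has a short list, so it is cheap to evaluate, and images that share no feature with the query are never visited at all.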

19
Search pruning

Query process can be stopped before all features
are evaluated (features sorted by importance) to
improve speed -> rank of final top ten
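A sketch of pruning under simple assumptions: features carry a weight, are processed in decreasing weight order, and evaluation stops after a fixed fraction of them. The keep_fraction parameter and the stopping rule are hypothetical; the transcript does not state which criterion GIFT applies.

```python
def pruned_query(index, weighted_features, keep_fraction=0.5):
    """Evaluate only the most important query features.

    index:             {feature: [image_id, ...]} (an inverted file)
    weighted_features: {feature: weight} for the query
    keep_fraction:     share of features to evaluate (assumed stopping rule)
    """
    ordered = sorted(weighted_features.items(), key=lambda kv: -kv[1])
    n_keep = max(1, int(len(ordered) * keep_fraction))
    scores = {}
    for feature, weight in ordered[:n_keep]:
        for image_id in index.get(feature, ()):
            scores[image_id] = scores.get(image_id, 0.0) + weight
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))
```

Low-weight features tend to be the frequent ones with long lists, so skipping them saves the most time while changing the top of the ranking the least.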

20
MRML

Based on XML
Separates interface from search engine
Allows easy integration into various applications
<mrml session-id="1" transaction-id="44">
  <query-step session-id="1"
      resultsize="30"
      algorithm-id="algorithm-default">
    <user-relevance-list>
      <user-relevance-element
          image-location="http://viper.unige.ch/images/1.jpg"
          user-relevance="1"/>
    </user-relevance-list>
  </query-step>
</mrml>
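A client could assemble such a message with any XML library. The sketch below uses Python's standard xml.etree.ElementTree and only the element and attribute names visible in the example above; the function name and its parameters are hypothetical, and sending the message to a server is left out.

```python
import xml.etree.ElementTree as ET


def build_query(session_id, transaction_id, relevance,
                result_size=30, algorithm_id="algorithm-default"):
    """Build an MRML query step.

    relevance: list of (image URL, relevance) pairs, e.g. 1 for a positive example.
    """
    root = ET.Element("mrml", {
        "session-id": session_id,
        "transaction-id": transaction_id,
    })
    step = ET.SubElement(root, "query-step", {
        "session-id": session_id,
        "resultsize": str(result_size),
        "algorithm-id": algorithm_id,
    })
    rel_list = ET.SubElement(step, "user-relevance-list")
    for url, value in relevance:
        ET.SubElement(rel_list, "user-relevance-element", {
            "image-location": url,
            "user-relevance": str(value),
        })
    return ET.tostring(root, encoding="unicode")


message = build_query("1", "44", [("http://viper.unige.ch/images/1.jpg", 1)])
```

Because the interface only has to produce and parse such messages, it can be replaced or embedded in another application without touching the search engine.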

21
Relevance feedback

Different strategies
Create one pseudo-image as query (medGIFT)
Make separate queries for every image
Positive and negative feedback are important
Negative allows exploring new areas of feature
space
Problems with too much negative feedback
Kills result, returns only black images
Solution: Rocchio feedback (1971)
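The Rocchio idea is to average the positive and the negative examples separately before combining them, so that a large number of negative images cannot swamp the query. A minimal sketch over sparse feature vectors follows; the values of alpha, beta and gamma are common textbook defaults chosen for the example, not the settings used in medGIFT.

```python
def rocchio(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Build a pseudo-query from feedback.

    query, and each item of positives/negatives: {feature: weight}.
    Positive and negative sets are averaged separately, so the size of
    either set does not change its overall influence.
    """
    features = set(query)
    for image in positives + negatives:
        features.update(image)

    def mean(images, feature):
        if not images:
            return 0.0
        return sum(img.get(feature, 0.0) for img in images) / len(images)

    return {
        f: alpha * query.get(f, 0.0)
           + beta * mean(positives, f)
           - gamma * mean(negatives, f)
        for f in features
    }
```

With this form, marking three negative images that share a feature gives that feature the same negative weight as marking one, which avoids the "killed result" described above.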

22
Long-term learning

Analyze user behavior over longer period
Logfiles with interaction are stored in MRML
Analyze images that are marked together as
relevant or non-relevant in the same query step
(concentrate on pairs)
This can lead to image correlations
We want to learn on a feature basis to be more
general

23
A factor for long-term learning

Based on probability for association rules
Market basket analysis
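The factor itself is not reproduced in this transcript. As a rough illustration of the market basket idea only, the sketch below counts how often two images are marked relevant in the same query step and derives the confidence of the rule "a marked relevant implies b marked relevant". The log format and names are hypothetical, and the same counting could be applied to features instead of images.

```python
from collections import Counter
from itertools import combinations


def pair_confidence(query_steps):
    """query_steps: list of sets of image ids marked relevant in one step.

    Returns {(a, b): P(b marked | a marked)} estimated from the log.
    """
    single = Counter()
    pairs = Counter()
    for step in query_steps:
        single.update(step)
        pairs.update(combinations(sorted(step), 2))
    confidence = {}
    for (a, b), together in pairs.items():
        confidence[(a, b)] = together / single[a]
        confidence[(b, a)] = together / single[b]
    return confidence
```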

24
Evaluation

Define tasks
Suited for technology and users
Create databases
Including query tasks and a gold standard
Experts are extremely expensive
Compare several systems
To judge what works and what does not
Standardized conditions
Cycle that needs to be rerun to refine tasks and
become more realistic over time

25
Visual/textual retrieval

imageCLEF 2004 tasks
Queries were images only
Database contained text
Through automatic query expansion we can access
text for retrieval
Separate queries for text and images (easyIR,
medGIFT)
Combination of normalized results depends
strongly on the query task
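Combining the two result lists can be sketched as a weighted sum of normalized scores. The normalization by the top score and the visual_weight parameter are assumptions for the example; as noted above, the best weighting depends on the query task.

```python
def normalize(scores):
    """Scale scores so that the best result of a list gets 1.0."""
    top = max(scores.values(), default=0.0)
    if top <= 0:
        return {image_id: 0.0 for image_id in scores}
    return {image_id: s / top for image_id, s in scores.items()}


def combine(visual, textual, visual_weight=0.5):
    """Merge visual and textual result lists into one ranking.

    visual, textual: {image_id: raw score} from the two separate queries.
    Images missing from one list contribute 0 for that modality.
    """
    v, t = normalize(visual), normalize(textual)
    merged = {
        image_id: visual_weight * v.get(image_id, 0.0)
                  + (1.0 - visual_weight) * t.get(image_id, 0.0)
        for image_id in set(v) | set(t)
    }
    return sorted(merged, key=lambda image_id: (-merged[image_id], image_id))
```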

26
Pre-treatment of images for indexing
27
Retrieval result
28
Case-based retrieval
Combined results, normalized (take into account
incomplete data)
29
Specialized retrieval
30
Conclusions

medGIFT can be a step towards visual knowledge
management
Only one component that needs to be integrated
with other tools (textIR, data warehouse)
Good optimization data is extremely important
Generation of specialised databases and ground
truth is essential for evaluation
Scenarios for applications need to be defined

31
Questions?
