medGIFT - PowerPoint PPT Presentation

1
medGIFT
  • Knowledge In Motion, 15.12.2004

Henning Müller, University of Geneva,
Service of Medical Informatics
2
Overview
  • Introduction to image retrieval
  • medGIFT and GIFT
  • Features
  • Feature weightings
  • Inverted file and pruning
  • Relevance feedback
  • MRML and user interfaces
  • Optimization, learning from log files
  • Evaluation
  • Pre-treatment of images, specialization
  • Conclusions

3
Content-based image retrieval
  • Query formulation with visual means (no text)
  • Images: QBE (query by example(s))
  • Positive and negative relevance feedback
  • Components
  • Feature extractor (continuous, binary features)
  • Image representation
  • Distance measure or feature weighting
  • Indexing structure for quick access (database)
  • User interface for effective and efficient
    interaction with the user
  • Treatment of relevance feedback

4
Framework
5
Interface
6
GIFT and medGIFT
  • GNU Image Finding Tool
  • http://www.gnu.org/software/gift/
  • GPL: GNU General Public License
  • Medical domain has certain differences
  • More importance on grey levels
  • More importance on texture features
  • Segmentation to index important regions
  • Or at least to remove background, normalize grey
    levels, ...
  • Interface needs to display more information
  • Integration into medical applications is needed
  • Evaluation and reference databases are important

7
Features
  • Need to correspond to how we perceive an image
  • Colors, textures, shapes, objects, feelings, ...
  • Biederman 1987 (objects)
  • Tversky 1977 (binary stimulations instead of
    continuous)
  • Segmentation of general images is impossible
  • Global features, saliency points, ...
  • Commonly used features
  • Local and global features
  • Colors, textures, shapes
  • Features that are invariant to rotations,
    scaling, shifts, ...

8
medGIFT features
  • Image is scaled to 256x256 pixels
  • Global color features
  • Color histogram in the HSV space (Hue,
    Saturation, Value), 18 hues, 3 saturations, 3
    values, 4 grey levels
  • Local color features at several scales
  • HSV, image successively partitioned into four
    blocks, mode (most frequent) color of each block
  • Global texture features (Gabor histogram)
  • 4 directions, 3 scales, 10 quantisations of
    strengths
  • Local Gabor blocks
  • Only smallest blocks of quadtree, Gabors of same
    configuration as above
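
The global color histogram above has 18 x 3 x 3 + 4 = 166 bins. A minimal sketch of such a quantisation follows. The bin counts come from the slide; the bin ordering and the saturation cut-off used to decide that a pixel is grey (`GREY_SAT_THRESHOLD`) are assumptions for illustration and are not taken from medGIFT.

```python
import colorsys

# Bin counts taken from the slide; the bin layout itself is an assumption.
N_HUES, N_SATS, N_VALS, N_GREYS = 18, 3, 3, 4
GREY_SAT_THRESHOLD = 0.1  # hypothetical cut-off below which a pixel counts as grey


def hsv_bin(r, g, b):
    """Map an RGB pixel (0-255 per channel) to one of 166 histogram bins."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < GREY_SAT_THRESHOLD:
        # Grey pixels go to the last 4 bins, chosen by brightness.
        return N_HUES * N_SATS * N_VALS + min(int(v * N_GREYS), N_GREYS - 1)
    hi = min(int(h * N_HUES), N_HUES - 1)
    si = min(int(s * N_SATS), N_SATS - 1)
    vi = min(int(v * N_VALS), N_VALS - 1)
    return (hi * N_SATS + si) * N_VALS + vi


def color_histogram(pixels):
    """Return a normalised 166-bin histogram for an iterable of RGB tuples."""
    hist = [0.0] * (N_HUES * N_SATS * N_VALS + N_GREYS)
    count = 0
    for r, g, b in pixels:
        hist[hsv_bin(r, g, b)] += 1.0
        count += 1
    return [x / count for x in hist] if count else hist
```

The same `hsv_bin` function could be applied per block to obtain the mode color for the local color features.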

9
Local color features
10
HSV vs RGB
11
Gabor filter responses
  • Measure energy in a direction/scale (pixel-based)
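
As a rough illustration of "energy in a direction/scale", the sketch below correlates a small grey-level patch with an even and an odd Gabor kernel and sums the squared responses. The function name and the parameters `wavelength`, `theta` and `sigma` are assumptions for this example; the filter bank actually used by GIFT is not given on the slide.

```python
import math


def gabor_energy(patch, wavelength, theta, sigma):
    """Gabor energy at the centre of a square patch with odd side length.

    patch: list of rows of grey values.
    wavelength: period of the sinusoid in pixels (sets the scale).
    theta: filter direction in radians.
    sigma: width of the Gaussian envelope in pixels.
    """
    half = len(patch) // 2
    even = odd = 0.0
    for row, line in enumerate(patch):
        for col, value in enumerate(line):
            x, y = col - half, row - half
            # Coordinate along the filter direction.
            u = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * u / wavelength
            even += value * envelope * math.cos(phase)
            odd += value * envelope * math.sin(phase)
    # Energy = squared magnitude of the complex filter response.
    return even * even + odd * odd
```

A patch of vertical stripes with a period of four pixels gives a much higher energy for `theta = 0` than for `theta = math.pi / 2`, which is what makes the response usable as a directional texture feature.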

12
Wavelets
13
Features statistically similar to words
  • Features are quantised to correspond to word
    distributions
  • Binary or almost binary
  • Gabor filter responses into ten strengths
  • Lowest band can be discarded
  • Colors in blocks present or absent, certain color
    is one feature
  • Frequent and rare features exist
  • Zipf distribution
  • Advantage: we can use text retrieval techniques
  • Treat a large number of features (90,000
    possible, 2000 per image)
  • Problem: some information can be lost
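
The quantisation into word-like features can be sketched as follows: a continuous Gabor response is mapped to one of ten strength bands, the lowest band is dropped, and each remaining (block, direction, scale, band) combination becomes one binary feature. The tuple layout of the feature identifiers and the linear banding are assumptions for illustration.

```python
N_STRENGTHS = 10  # number of strength bands, from the slide


def quantise_energy(energy, max_energy):
    """Map a non-negative energy to a strength band 0..9 (linear banding)."""
    if max_energy <= 0:
        return 0
    band = int(energy / max_energy * N_STRENGTHS)
    return min(max(band, 0), N_STRENGTHS - 1)


def gabor_block_features(block_id, energies, max_energy):
    """Turn {(direction, scale): energy} into a set of binary feature ids.

    Band 0 (the weakest responses) is discarded, as the slide suggests.
    """
    features = set()
    for (direction, scale), energy in energies.items():
        band = quantise_energy(energy, max_energy)
        if band > 0:
            features.add(("gabor", block_id, direction, scale, band))
    return features
```

An image is then simply the set (or bag) of features that are present in it, much like a text document is a bag of words.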

14
Typical distance measures
  • Euclidean distance
  • City-block distance
  • Histogram intersection
  • In text retrieval
  • Frequency based feature weights
  • tf: term frequency
  • cf: collection frequency
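
For reference, the three vector measures listed above can be written in a few lines. This is a generic sketch for equal-length feature vectors; the histogram intersection is given in its simple form for histograms that are already normalised to sum to one.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """City-block (L1, Manhattan) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def histogram_intersection(a, b):
    """Similarity in [0, 1] for two normalised histograms (1 = identical)."""
    return sum(min(x, y) for x, y in zip(a, b))
```

Note that the first two are distances (smaller is more similar), whereas histogram intersection is a similarity (larger is more similar).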

15
Histogram intersection (global color, texture)
16
Frequency-based feature weights
  • A feature frequent in an image describes this
    image well. (tf high)
  • A feature frequent in the collection does not
    distinguish well between images. (cf high)
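
One common way to turn these two observations into a number is a tf x log(1/cf) weight, as used in text retrieval. The sketch below uses that form as an assumption; the exact weighting functions implemented in GIFT are not given on the slide and may differ.

```python
import math


def feature_weight(tf, cf):
    """Weight of one feature.

    tf: frequency of the feature in the image (higher = more descriptive).
    cf: fraction of collection images containing the feature, in (0, 1].
    """
    if not 0.0 < cf <= 1.0:
        raise ValueError("cf must be in (0, 1]")
    return tf * math.log(1.0 / cf)


def score(query, image, cf):
    """Sum the weights of the features shared by query and image.

    query, image: {feature: tf}; cf: {feature: collection frequency}.
    """
    total = 0.0
    for feature, q_tf in query.items():
        if feature in image:
            total += q_tf * feature_weight(image[feature], cf[feature])
    return total
```

A feature that occurs in every image (cf = 1) gets weight zero and therefore never influences the ranking.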

17
GIFT weighting
  • All features weighted together allow small-scale
    textures to dominate (high in number)
  • Evaluate and normalize four feature groups
    separately and combine the normalized results
  • Much better final result with a separate
    normalization
  • Interface allows partial selection of features
  • Each of the four groups has good and bad results
  • Normalization done with the pseudo-image, not the
    highest-scoring one
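
A minimal sketch of the separate normalization: each feature group's raw scores are divided by the score the query pseudo-image obtains against itself in that group, and only then are the groups added. The group names and the plain sum used for combination are assumptions for illustration.

```python
def combine_group_scores(scores_by_group, self_scores):
    """Normalise each feature group separately, then add the results.

    scores_by_group: {group: {image_id: raw_score}}
    self_scores: {group: raw score of the query pseudo-image against itself}
    """
    combined = {}
    for group, scores in scores_by_group.items():
        norm = self_scores[group]
        if norm <= 0:
            continue  # group carries no information for this query
        for image_id, raw in scores.items():
            combined[image_id] = combined.get(image_id, 0.0) + raw / norm
    return combined


# Hypothetical raw scores: the texture group has far larger raw values.
scores = {
    "color_global": {"a": 2.0, "b": 1.0},
    "texture_local": {"a": 100.0, "b": 400.0},
}
self_scores = {"color_global": 4.0, "texture_local": 400.0}
result = combine_group_scores(scores, self_scores)
```

Without the per-group division, the texture scores in this example would swamp the color scores purely because of their scale.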

18
Inverted file
  • Access feature by feature instead of document by
    document
  • Extremely fast access for rare features
  • Pentium IV 2.8 GHz, 9000 images: 0.5 seconds
  • Efficient for sparsely populated spaces
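
The idea of feature-by-feature access can be sketched with a small in-memory inverted file: each feature points to the images that contain it, so a query only touches images sharing at least one feature with it. The class and method names are hypothetical, and the score is a plain product of frequencies to keep the example short.

```python
from collections import defaultdict


class InvertedFile:
    """Maps each feature to the images that contain it."""

    def __init__(self):
        self.postings = defaultdict(dict)  # feature -> {image_id: tf}

    def add_image(self, image_id, features):
        """features: {feature: tf} for one image."""
        for feature, tf in features.items():
            self.postings[feature][image_id] = tf

    def query(self, query_features):
        """Return [(image_id, score)] sorted by decreasing score."""
        scores = defaultdict(float)
        for feature, q_tf in query_features.items():
            # Only images that contain this feature are visited.
            for image_id, tf in self.postings.get(feature, {}).items():
                scores[image_id] += q_tf * tf
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = InvertedFile()
index.add_image("a", {"f1": 1.0, "f2": 1.0})
index.add_image("b", {"f2": 1.0})
```

A rare feature has a short posting list, which is why access is extremely fast for rare features and slow only for features present in most images.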

19
Search pruning
  • Query process can be stopped before all features
    are evaluated (features sorted by importance) to
    improve speed -> rank of final top ten
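
A minimal sketch of such pruning: the query features are sorted by weight and only the most important fraction is evaluated. The parameter `keep_fraction` and the fixed-fraction stopping rule are assumptions; other stopping criteria (for example a time limit) are equally possible.

```python
def pruned_query(postings, query_weights, keep_fraction=0.5):
    """Evaluate only the most important query features.

    postings: {feature: {image_id: tf}}
    query_weights: {feature: weight}, higher = more important
    Returns image ids sorted by decreasing score.
    """
    ordered = sorted(query_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    n_keep = max(1, int(len(ordered) * keep_fraction)) if ordered else 0
    scores = {}
    for feature, weight in ordered[:n_keep]:
        for image_id, tf in postings.get(feature, {}).items():
            scores[image_id] = scores.get(image_id, 0.0) + weight * tf
    return sorted(scores, key=lambda i: (-scores[i], i))


postings = {
    "rare": {"a": 1.0},
    "common": {"a": 1.0, "b": 1.0, "c": 1.0},
}
weights = {"rare": 3.0, "common": 0.1}
```

Here evaluating only the top half of the features skips the long posting list of the common feature, and the top-ranked image stays the same.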

20
MRML
  • Based on XML
  • Separates interface from search engine
  • Allows easy integration into various applications
    <mrml session-id="1" transaction-id="44">
      <query-step session-id="1"
                  resultsize="30"
                  algorithm-id="algorithm-default">
        <user-relevance-list>
          <user-relevance-element
              image-location="http://viper.unige.ch/images/1.jpg"
              user-relevance="1"/>
        </user-relevance-list>
      </query-step>
    </mrml>
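
Because MRML is plain XML, a client can build a query message like the one on this slide with any XML library. The sketch below uses Python's `xml.etree.ElementTree`; the function name `build_query` is hypothetical, the element and attribute names are copied from the slide, and transport to the server is left out.

```python
import xml.etree.ElementTree as ET


def build_query(session_id, transaction_id, relevance,
                result_size=30, algorithm_id="algorithm-default"):
    """Build an MRML query-step message as a string.

    relevance: list of (image_url, user_relevance) pairs,
               e.g. 1 for a positive and -1 for a negative example.
    """
    mrml = ET.Element("mrml", {
        "session-id": session_id,
        "transaction-id": transaction_id,
    })
    step = ET.SubElement(mrml, "query-step", {
        "session-id": session_id,
        "resultsize": str(result_size),  # attribute name as written on the slide
        "algorithm-id": algorithm_id,
    })
    rel_list = ET.SubElement(step, "user-relevance-list")
    for url, rel in relevance:
        ET.SubElement(rel_list, "user-relevance-element", {
            "image-location": url,
            "user-relevance": str(rel),
        })
    return ET.tostring(mrml, encoding="unicode")


message = build_query("1", "44", [("http://viper.unige.ch/images/1.jpg", 1)])
```

Keeping the message format independent of the search engine is what allows the same engine to sit behind different user interfaces.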

21
Relevance feedback
  • Different strategies
  • Create one pseudo-image as query (medGIFT)
  • Make separate queries for every image
  • Positive and negative feedback are important
  • Negative allows exploring new areas of feature
    space
  • Problems with too much negative feedback
  • Kills result, returns only black images
  • Solution: Rocchio feedback (1971)
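
Rocchio's formula averages the positive and the negative examples separately before combining them, so a large number of negative images cannot dominate the pseudo-image. A minimal sketch over sparse feature vectors follows; the default values of `alpha`, `beta` and `gamma` are common textbook choices and are assumptions here, not the settings used in medGIFT.

```python
def rocchio(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Return the updated query vector.

    query: {feature: weight}
    positives, negatives: lists of {feature: weight} vectors
    new = alpha * query + beta * mean(positives) - gamma * mean(negatives)
    """
    new_query = {f: alpha * w for f, w in query.items()}
    for docs, factor in ((positives, beta), (negatives, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for f, w in doc.items():
                # Dividing by len(docs) makes each group count once,
                # however many images it contains.
                new_query[f] = new_query.get(f, 0.0) + factor * w / len(docs)
    return new_query


updated = rocchio({}, [{"a": 1.0}], [{"b": 1.0}] * 4, beta=1.0, gamma=0.5)
```

In this example four negative images together contribute only -0.5 to feature "b", while the single positive image contributes the full 1.0 to feature "a".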

22
Long-term learning
  • Analyze user behavior over longer period
  • Logfiles with interaction are stored in MRML
  • Analyze images that are marked together as
    relevant or non-relevant in the same query step
    (concentrate on pairs)
  • This can lead to image correlations
  • We want to learn on a feature basis to be more
    general

23
A factor for long-term learning
  • Based on probability for association rules
  • Market basket analysis
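
The factor shown on the original slide is not reproduced in this transcript. As a generic illustration of the market basket idea, the sketch below computes the confidence of association rules "image a marked relevant -> image b marked relevant" from logged query steps. It works on image pairs; extending it to a per-feature factor, as the previous slide asks for, is not attempted here.

```python
from collections import Counter
from itertools import combinations


def rule_confidence(steps):
    """Confidence of the rule a -> b for every co-marked image pair.

    steps: list of sets of image ids marked relevant in the same query step.
    Returns {(a, b): P(b marked | a marked)} estimated from the log.
    """
    single = Counter()
    pair = Counter()
    for marked in steps:
        single.update(marked)
        for a, b in combinations(sorted(marked), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[a] for (a, b), n in pair.items()}


log = [{"x", "y"}, {"x", "y"}, {"x", "z"}, {"y"}]
conf = rule_confidence(log)
```

In this toy log, "z" was only ever marked together with "x", so the rule z -> x has confidence 1.0, while x -> z has confidence 1/3.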

24
Evaluation
  • Define tasks
  • Suited for technology and users
  • Create databases
  • Including query tasks and a gold standard
  • Experts are extremely expensive
  • Compare several systems
  • To allow judging what works and what not
  • Standardized conditions
  • Cycle that needs to be rerun to refine tasks and
    become more realistic over time

25
Visual/textual retrieval
  • ImageCLEF 2004 tasks
  • Queries were images only
  • Database contained text
  • Through automatic query expansion we can access
    text for retrieval
  • Separate queries for text and images (easyIR,
    medGIFT)
  • Combination of normalized results depends
    strongly on the query task
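
Combining a visual and a textual result list can be sketched as a weighted sum of normalised scores. Dividing by the top score and the parameter `visual_weight` are assumptions for illustration; as the slide notes, the best weighting depends strongly on the query task.

```python
def normalise(scores):
    """Scale scores so that the best result has score 1.0."""
    top = max(scores.values(), default=0.0)
    return {k: (v / top if top > 0 else 0.0) for k, v in scores.items()}


def combine(visual, textual, visual_weight=0.5):
    """Merge two {image_id: score} result lists into one ranking."""
    v, t = normalise(visual), normalise(textual)
    ids = set(v) | set(t)
    combined = {
        i: visual_weight * v.get(i, 0.0) + (1 - visual_weight) * t.get(i, 0.0)
        for i in ids
    }
    return sorted(combined, key=lambda i: (-combined[i], i))


visual = {"a": 10.0, "b": 5.0}
textual = {"b": 2.0, "c": 1.0}
```

With equal weights, image "b" ranks first because both systems return it; with `visual_weight=1.0` the ranking falls back to the visual order.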

26
Pre-treatment of images for indexing
27
Retrieval result
28
Case-based retrieval
Combined results, normalized (take into account
incomplete data)
29
Specialized retrieval
30
Conclusions
  • medGIFT can be a step towards visual knowledge
    management
  • Only one component that needs to be integrated
    with other tools (textIR, data warehouse)
  • Good optimization data is extremely important
  • Generation of specialised databases and ground
    truth is essential for evaluation
  • Scenarios for applications need to be defined

31
Questions?