Enriching content-based image retrieval with automatically extracted MeSH terms

1
Enriching content-based image retrieval with
automatically extracted MeSH terms
  • GMDS 2004, 28.9.2004

Henning Müller, Service of Medical Informatics,
Geneva University Hospitals
2
Overview
  • Introduction
  • Scenarios for content-based visual data access
  • Casimage, medGIFT and imageCLEF
  • Extracting MeSH terms
  • Multi-modal multi-lingual image retrieval
  • Some results
  • Conclusions

3
Introduction
  • Hospitals produce a rising amount of visual data
  • Images, videos, ...
  • The Geneva radiology department alone currently
    (2004) produces more than 20,000 images a day
  • Pathology, dermatology and haematology are
    digital
  • Other departments will be digital soon and might
    need further analysis (psychology, ...)
  • Tools are needed to exploit the knowledge stored in
    these data beyond access by patient ID
  • Content-based visual data access allows images and
    other visual data to be managed automatically

4
An example
5
Scenarios for content-based medical IR
  • Teaching
  • Find visually similar cases with differing
    diagnoses
  • Allow students to browse annotated databases
  • Research
  • Optimise choice of cases for studies
  • Include visual features into studies
  • Diagnostic aid
  • Case-based reasoning, evidence-based medicine
  • Visual knowledge management
  • Specialized applications are necessary

6
casimage
  • Radiology teaching file
  • >60,000 images of more than 12,000 cases
    submitted so far (500 images added per week)
  • Integration into PACS environment
  • Level/windowing on insertion
  • Images in JPEG
  • No control of textual input
  • Web-based interface
  • http://www.casimage.com/

7
The casimage dataset
  • Casimage database from a radiology teaching file
  • 9,000 images from 2,000 cases
  • Annotations in French and English, poor text
    quality
  • Spelling errors, empty case notes, abbreviations,
    etc.
  • Queries are 26 images chosen by a radiologist to
    represent the dataset well
  • Queries do not include any text
  • Data set used for the imageCLEF initiative to
    evaluate image retrieval systems

8
medGIFT
  • Based on open source image retrieval engine GIFT
    (http://www.gnu.org/software/gift/)
  • Visual queries are possible by submitting one or
    several query images (positive and negative)
  • Gabor filter responses to describe textures
  • Local and global colour/grey level features
  • Use of techniques from text retrieval
  • Features that are rare in the collection are more
    important (tf/idf weighting)
  • A feature that occurs frequently in an image is a
    good descriptor of that image
  • Inverted file structure for efficient data access
  • Relevance feedback (pos./neg.) to refine queries
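
The text-retrieval ideas listed above (tf/idf weighting, an inverted file, positive and negative query images) can be sketched roughly as follows. This is a minimal illustration and not GIFT's actual weighting scheme; the feature names, the toy collection and the functions `build_inverted_file` and `search` are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy collection: each image is a bag of quantised visual
# features (the feature names are invented for illustration).
images = {
    "img1": ["gabor_3", "grey_low", "grey_low", "edge_7"],
    "img2": ["gabor_3", "grey_high", "edge_7"],
    "img3": ["gabor_9", "grey_low", "grey_low", "grey_low"],
}

def build_inverted_file(collection):
    """Map each feature to {image_id: relative frequency in that image}."""
    inverted = defaultdict(dict)
    for image_id, features in collection.items():
        for feature, count in Counter(features).items():
            inverted[feature][image_id] = count / len(features)
    return inverted

def search(inverted, n_images, positive, negative=()):
    """Score images by tf/idf; negative examples subtract from the query."""
    query = Counter()
    for features in positive:
        query.update(features)
    for features in negative:
        query.subtract(features)
    scores = defaultdict(float)
    for feature, weight in query.items():
        postings = inverted.get(feature, {})
        if not postings:
            continue
        # Features present in few images get a larger weight.
        idf = math.log(n_images / len(postings))
        for image_id, tf in postings.items():
            scores[image_id] += weight * tf * idf
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

inverted_file = build_inverted_file(images)
ranking = search(inverted_file, len(images), positive=[images["img3"]])
with_feedback = search(
    inverted_file, len(images),
    positive=[images["img3"]], negative=[images["img1"]],
)
print(ranking)
print(with_feedback)
```

Only the images that share at least one feature with the query are touched, which is the point of the inverted file: the cost depends on the query features rather than on the size of the collection.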

9
medGIFT user interface
Query image
Diagnosis
Link to casimage
User selection (pos./neg./neutral)
Similarity score
10
Extraction of MeSH terms (easyIR)
  • The case text is a mix of French and English, of
    rather poor quality, and stored in XML format
  • Preprocessing is necessary
  • Removal of XML tags and unimportant fields
  • Stop word removal
  • Stemming
  • MeSH terms can then be extracted from the
    remaining textual data (in French and English)
  • Queries can be executed through free text
  • Case notes will be ordered by their similarity
    to the query
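
The preprocessing steps above (tag removal, stop word removal, stemming, then term lookup) could look roughly like this. It is a sketch under stated assumptions and not the easyIR implementation: the stop list, the crude suffix stemmer and the three-entry vocabulary keyed by stem are hypothetical stand-ins for full French/English resources and the complete MeSH thesaurus.

```python
import re
import unicodedata

# Assumptions for illustration only: tiny stop list, crude stemmer,
# and a three-entry vocabulary mapping stems to MeSH headings.
STOP_WORDS = {"the", "of", "a", "in", "with", "le", "la", "de", "du", "une", "et"}
MESH_BY_STEM = {
    "fractur": "Fractures, Bone",
    "femur": "Femur",
    "pneumon": "Pneumonia",
}

def strip_xml(text):
    """Replace XML tags with spaces, keeping only the text content."""
    return re.sub(r"<[^>]+>", " ", text)

def strip_accents(text):
    """Fold accented letters to ASCII so French and English forms align."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def stem(word):
    """Very crude stemmer: drop one common suffix, keep at least 4 letters."""
    for suffix in ("ies", "ia", "ie", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def extract_mesh(case_xml):
    """Return the sorted MeSH headings whose stem occurs in the case text."""
    text = strip_accents(strip_xml(case_xml).lower())
    tokens = [t for t in re.findall(r"[a-z]+", text) if t not in STOP_WORDS]
    stems = {stem(t) for t in tokens}
    return sorted(MESH_BY_STEM[s] for s in stems if s in MESH_BY_STEM)

case = (
    "<case><title>Fracture du fémur</title>"
    "<note>Fractures of the femur with pneumonia</note></case>"
)
terms = extract_mesh(case)
print(terms)
```

Because French and English word forms are reduced to the same stems, a single vocabulary lookup serves both languages in this toy setting; real case notes with spelling errors and abbreviations would need more robust matching.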

11
Multi-modal multi-lingual image retrieval
  • Text is only available through automatic query
    expansion as initial queries are images
  • Visual query
  • The text of the first result (or of the first three)
    is taken as the textual query, and a visual query is
    executed with the images
  • Not much is known on the quality of the first
    results
  • Case scores are expanded to all images of the
    case and normalized
  • Visual and textual results are merged (80%
    visual/20% textual, ...)
  • The optimal ratio still needs to be found
  • Several alternating visual/textual steps might work best
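
The merging step described above can be illustrated with a small sketch: textual case scores are copied to every image of the case, both score lists are normalised, and the two are combined linearly. The normalisation by the maximum score, the sample scores and the function names are assumptions; the slides do not say how the normalisation was done.

```python
def normalise(scores):
    """Scale scores into [0, 1] by dividing by the maximum (an assumption)."""
    top = max(scores.values(), default=0.0)
    return {k: (v / top if top > 0 else 0.0) for k, v in scores.items()}

def expand_case_scores(case_scores, case_images):
    """Give every image of a case the textual score of that case."""
    return {
        image: score
        for case, score in case_scores.items()
        for image in case_images.get(case, [])
    }

def merge(visual, textual, visual_weight=0.8):
    """Weighted linear combination of normalised visual and textual scores."""
    v, t = normalise(visual), normalise(textual)
    merged = {
        image: visual_weight * v.get(image, 0.0)
        + (1.0 - visual_weight) * t.get(image, 0.0)
        for image in set(v) | set(t)
    }
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores: the visual scores are close together,
# the textual case scores are spread out.
visual_scores = {"a1": 0.90, "b1": 0.88, "c1": 0.86}
case_scores = {"caseB": 4.0, "caseC": 1.0}
case_images = {"caseA": ["a1"], "caseB": ["b1", "b2"], "caseC": ["c1"]}

textual_scores = expand_case_scores(case_scores, case_images)
merged_ranking = merge(visual_scores, textual_scores, visual_weight=0.8)
print([image for image, _ in merged_ranking])
```

In this toy data the best visual match drops to third place once the textual scores are mixed in, even with only 20% textual weight, because the visual scores lie so close to each other.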

12
Problems and tasks to do
  • Textual and visual queries are performed in
    different programs (Linux/Windows)
  • Not completely automated
  • Infrastructure for query interfaces will need to
    be created to automate these runs
  • Optimal combination of the weighting depends on
    the type of query and will need to be explored
  • More analysis of the results is needed
  • Not only a simple linear combination, but also
    raising only those images that rank high in the text
    results
  • This matters because the visual similarity scores are
    often very close, in contrast to textual similarity scores
13
Results
  • Small number of grey levels performs best
  • Relevance feedback is very important
  • Results in mean average precision (averaged over
    26 queries):
  • Visual: 0.3157
  • Query expansion visual (1): 0.3100
  • Query expansion visual/textual (1): 0.3749 (better
    than all submitted automatic runs)
  • Relevance feedback visual: 0.3791
  • Relevance feedback visual/textual: 0.4214 (best
    run in competition)
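
For readers unfamiliar with the measure, mean average precision can be computed as in the sketch below. It uses the standard definition (precision at each rank where a relevant item appears, averaged per query, then averaged over queries); the sample rankings and relevance sets are invented and unrelated to the figures above.

```python
def average_precision(ranked, relevant):
    """Average of the precision values at the ranks of relevant items."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of the average precision over (ranking, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical two-query example.
runs = [
    (["a", "x", "b", "y"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "c"], {"c"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

Relevant items that are never retrieved still count in the denominator, so a run is penalised for missing them as well as for ranking them low.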

14
Results: a visual example
Visual retrieval
Visual/textual retrieval
15
Conclusions
  • Visual retrieval will become important to manage
    the rising amount of image (or visual) data
    produced in hospitals
  • Whenever textual data is available for retrieval
    it should be used even if the quality is mediocre
  • It never decreased retrieval quality
  • To use low-quality text for retrieval, a number
    of preprocessing steps are necessary
  • Combinations of visual and textual results need
    to be performed with care
  • Small changes can have a large influence
  • More work on combining visual/textual is needed