Document/query expansion based on selecting significant concepts for context-based retrieval of medical images.

Affiliation

ReDCAD Laboratory, National School of Engineering of Sfax, Sfax University, Tunisia.

Abstract

In the medical image retrieval literature, there are two main approaches: content-based retrieval, which uses the visual information contained in the image itself, and context-based retrieval, which uses the metadata and labels associated with the images. We present work that falls in the context-based category, where queries are composed of medical keywords and the documents are metadata that succinctly describe the medical images. A main difference between context-based image retrieval and textual document retrieval is that in image retrieval the narrative description is very brief and typically cannot describe the entire image content, which negatively affects retrieval quality. One solution offered in the literature is to add new relevant terms to both the query and the documents using expansion techniques. Nevertheless, using native terms to retrieve images has several disadvantages, such as term ambiguity. Several studies have shown that mapping text to concepts can improve the semantic representation of textual information. However, using concepts in the retrieval process has its own problems, such as erroneous semantic relations between concepts in the semantic resource. In this paper, we propose a new expansion method for medical text (query/document) based on retro-semantic mapping between textual terms and UMLS concepts that are relevant in medical image retrieval. More precisely, we propose mapping the medical text of queries and documents into concepts and then applying a concept-selection method to keep only the most significant concepts. The most representative term (preferred name) identified in the UMLS for each selected concept is then added to the initial text. Experiments carried out with the ImageCLEF 2009 and 2010 datasets showed that the proposed approach significantly improves retrieval accuracy and outperforms the approaches offered in the literature.
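
The three steps described above (map text to concepts, keep the most significant concepts, append each kept concept's preferred name to the original text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy concept table stands in for the UMLS and uses made-up identifiers rather than real CUIs, the word-boundary matching stands in for a real concept mapper, and ranking by match count is only a placeholder for the concept-selection method, which the abstract does not specify.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    cid: str              # hypothetical identifier, NOT a real UMLS CUI
    preferred_name: str   # term appended to the text during expansion
    synonyms: tuple       # other surface forms that map to this concept


# Toy stand-in for a semantic resource such as the UMLS (assumption).
TOY_CONCEPTS = (
    Concept("X1", "myocardial infarction", ("heart attack", "mi")),
    Concept("X2", "radiography", ("x-ray", "xray")),
    Concept("X3", "thorax", ("chest",)),
)


def map_to_concepts(text, concepts=TOY_CONCEPTS):
    """Map text to concepts; return {concept: number of matched mentions}."""
    lowered = text.lower()
    counts = {}
    for concept in concepts:
        names = (concept.preferred_name,) + concept.synonyms
        n = sum(
            len(re.findall(r"\b" + re.escape(name) + r"\b", lowered))
            for name in names
        )
        if n:
            counts[concept] = n
    return counts


def select_concepts(counts, top_k=2):
    """Placeholder selection: keep the top_k most frequently matched concepts."""
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])  # stable sort
    return [concept for concept, _ in ranked[:top_k]]


def expand_text(text, concepts=TOY_CONCEPTS, top_k=2):
    """Append the preferred name of each selected concept to the text."""
    selected = select_concepts(map_to_concepts(text, concepts), top_k)
    lowered = text.lower()
    additions = [
        c.preferred_name for c in selected if c.preferred_name not in lowered
    ]
    return " ".join([text] + additions)


if __name__ == "__main__":
    print(expand_text("chest x-ray after heart attack"))
```

The same `expand_text` function would be applied to both queries and document metadata before indexing and retrieval, so that both sides share the added preferred names.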

Keywords

Concepts, Expansion, Information retrieval, Medical images, Preferred name, UMLS