Comparative study using inverse ontology cogency and alternatives for concept recognition in the annotated National Library of Medicine database.

Shannon GJ(1), Rayapati N(2), Corns SM(3), Wunsch DC 2nd(4).
Author information:
(1)Applied Computational Intelligence Laboratory, Missouri University of Science and Technology, Rolla, MO 65409, United States of America.
(2)Guise AI, Inc., Rolla, MO 65401, United States of America.
(3)Applied Computational Intelligence Laboratory, Missouri University of Science and Technology, Rolla, MO 65409, United States of America.
(4)Applied Computational Intelligence Laboratory, Missouri University of Science and Technology, Rolla, MO 65409, United States of America; National Science Foundation, ECCS Division, Alexandria, VA 22314, United States of America.

Abstract

This paper introduces inverse ontology cogency, a biologically inspired concept recognition process and distance function that is competitive with alternative methods. It is a novel distance measure used to select the optimum mapping between ontology-specified concepts and phrases in free-form text. Automated named entity recognition, or concept recognition, is a common task in natural language processing. We also apply a multi-layer perceptron and text processing method to named entity recognition as an alternative to recurrent neural network methods. Similarities between confabulation theory and existing language models are discussed. The paper compares the approach with MetaMap from the National Library of Medicine (NLM), a popular tool used in medicine to map free-form text to concepts in a medical ontology. The NLM provides a database of medical literature manually annotated with concept labels, a unique and valuable source of ground truth that permits comparison with MetaMap performance. Different feature set combinations are compared to demonstrate the effectiveness of inverse ontology cogency for entity recognition. Results indicate that using both inverse ontology cogency and corpora cogency improved concept recognition precision by 20% over the best published MetaMap results, demonstrating a new, effective approach for identifying medical concepts in text. This is the first time cogency has been explicitly invoked for reasoning with ontologies, and the first time it has been used on medical literature where high-quality ground truth is available for quality assessment.