Quantification of concept coverage and interrater agreement during large-scale medical vocabulary mapping

Title Quantification of concept coverage and interrater agreement during large-scale medical vocabulary mapping
Publication Type thesis
School or College School of Medicine
Department Biomedical Informatics
Author Eagon, John Christopher
Date 1997-06
Description The Large-Scale Vocabulary Test (LSVT) was sponsored by the NLM to determine the concept coverage of a combination of medical controlled vocabularies including UMLS, SNOMED, Read, and LOINC. For this study, a diverse group of 10,538 terms was selected from patient problem lists recorded at 65 Veterans Affairs Medical Centers and internal medicine ambulatory care practices, nursing shift notes and emergency transport patient records, and history and physical examination terms from the Iliad diagnostic expert system. Seventeen raters agreed by consensus on rating rules and a set of coded comments to evaluate whether matches were acceptable and to indicate the relationships between submitted and matched terms. Raters independently rated 24 term overlapping test sets before and after rating large numbers of disjoint sets of terms. The percentage of responses equal to the group mode was calculated. Raters agreed with the modal response of the group 63-99% of the time, depending on the difficulty of the rating task. The mean modal response rate was 82-83% for the choice of a match term and 75-78% for the choice of a relationship, and these levels of agreement did not drift over time. For the 10,538 terms rated in a disjoint manner, 92% of submitted terms resulted in acceptable matches. Submitted terms were judged synonymous with the match term in 49% of cases, more specific in 35% (usually due to modifiers), broader in 2%, and associated in 6%. Synonymous matches were less common for the Iliad history and physical terms (29%) than for the other term sources (61-65%). Match failures (8%) were primarily due to failure of the search engine to recognize misspellings, abbreviations, and synonymous phrases; absence of an acceptable root concept from the LSVT target dataset was rare (0.3%).
A manual search for terms related to the match failures allowed estimates of the percentages of the valid submitted terms that had representation in the target vocabularies as a synonym (55%) and within one hierarchical generation (80-87%). These data suggest that future improvements in concept representation would best be achieved by implementing a simple information model that allows combination of a single root concept and a modifier concept.
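The agreement measure named in the description, the percentage of responses equal to the group mode, can be illustrated with a minimal Python sketch. This is not the thesis's own procedure: the abstract does not say whether the percentage was computed per rater, per term, or pooled, so the pooled version below is an assumption, and the function name `modal_agreement`, the term keys, and the category labels are hypothetical.

```python
from collections import Counter


def modal_agreement(ratings_by_term):
    """Fraction of all responses that equal the modal response for their term.

    ratings_by_term maps a term to the list of responses given by the raters
    who rated it. Pooling over all terms is an assumption of this sketch.
    """
    matching = 0
    total = 0
    for responses in ratings_by_term.values():
        if not responses:
            continue  # skip terms nobody rated
        counts = Counter(responses)
        # The size of the largest group is the same whichever tied value
        # is treated as the mode, so ties need no special handling.
        matching += max(counts.values())
        total += len(responses)
    if total == 0:
        raise ValueError("no responses to score")
    return matching / total


# Hypothetical example data: two terms, four raters each.
example = {
    "chest pain": ["synonym", "synonym", "narrower", "synonym"],
    "htn": ["synonym", "synonym", "synonym", "synonym"],
}
print(f"{modal_agreement(example):.1%}")
```

In this example seven of the eight responses match the mode for their term. A per-rater variant would instead count, for each rater, the share of that rater's responses that equal the group mode.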
Type Text
Publisher University of Utah
Subject MESH Medical Informatics; Subject Headings; Medicine
Dissertation Institution University of Utah
Dissertation Name MS
Language eng
Relation is Version of Digital reproduction of "Quantification of concept coverage and interrater agreement during large-scale medical vocabulary mapping". Spencer S. Eccles Health Sciences Library.
Rights Management © John Christopher Eagon.
Format application/pdf
Format Medium application/pdf
Format Extent 1,466,791 bytes
Identifier undthes,4181
Source Original: University of Utah Spencer S. Eccles Health Sciences Library (no longer available)
Master File Extent 1,466,839 bytes
ARK ark:/87278/s6v989zp
Setname ir_etd
ID 192009
Reference URL https://collections.lib.utah.edu/ark:/87278/s6v989zp