Rethinking information delivery: using a natural language processing application for point-of-care data discovery

Stoddart, Joan


Publication Type	journal article
School or College	University Libraries
Department	Spencer S. Eccles Health Sciences Library
Creator	Stoddart, Joan
Other Author	Workman, T. Elizabeth
Title	Rethinking information delivery: using a natural language processing application for point-of-care data discovery
Date	2012-01-01
Description	Objective: This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need that can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs. Methods: The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool. Results: Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57. Conclusion: This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users.
Type	Text
Publisher	Medical Library Association
Volume	100
Issue	2
First Page	113
Last Page	120
Dissertation Institution	University of Utah
Language	eng
Bibliographic Citation	Workman, E.T., & Stoddart, J. M. (2012). Rethinking information delivery: using a natural language processing application for point-of-care data discovery. Journal of the Medical Library Association, 100(2), 113-20.
Format Medium	application/pdf
Format Extent	280,532 bytes
Identifier	uspace,17451
ARK	ark:/87278/s6sb4qf4
Setname	ir_uspace
ID	707806
OCR Text	Rethinking information delivery: using a natural language processing application for point-of-care data discovery†

T. Elizabeth Workman, PhD, MLIS; Joan M. Stoddart, MALS, AHIP

See end of article for authors' affiliations.

DOI: http://dx.doi.org/10.3163/1536-5050.100.2.009

Objective: This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need that can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs.

Methods: The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool.

Results: Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57.

Conclusion: This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users.

INTRODUCTION

Clinicians often encounter information needs in their work of caring for patients. In their 2005 study, Ely and colleagues discovered that physicians developed an average of 5.5 questions for each half-day observation, yet could not find answers to 41% of the questions for which they pursued answers [1].
Ely cited time constraints as one of the barriers preventing clinicians from finding answers. In another study, Chambliss and Conley also found that discovering answers is excessively time consuming [2]. Chambliss and Conley determined that references found in MEDLINE could answer or nearly answer 71% of clinicians' answerable questions; however, PubMed is not a tool designed exclusively for point-of-care information delivery. It generally returns excessive, irrelevant data, even when diverse search strategies are used [3]. Clinicians can spend an average of 30 minutes answering a question using references from MEDLINE [4]. This time span is, by and large, due to the process of literature appraisal, which is naturally lengthened by excessive retrieval [5]. This information discovery process is not practical for a busy clinical setting [4].

Semantic MEDLINE

Natural language processing (NLP) applications such as Semantic MEDLINE can filter PubMed results for a user's specific information need and summarize them to facilitate literature appraisal [6]. Semantic MEDLINE‡, a resource developed by the National Library of Medicine (NLM), if enhanced by an adaptive algorithm known as Combo [7], can simplify MEDLINE results for many information needs. The user activates the Semantic MEDLINE application by submitting a search query expressing an information need to PubMed. Semantic MEDLINE then uses the individual processes of SemRep, Summarization, and Visualization to quickly transform the citations' title and abstract text into a compact form and identify data that are salient to a specific information need, which then can be displayed in a visual graph. Currently, NLM hosts an online Semantic MEDLINE application that consists of a publicly accessible demonstration site and a restricted-access portal. The following sections describe these individual processes. This study evaluated a separate, enhanced Semantic MEDLINE system that accommodates additional information needs. This paper also briefly describes how an organization could develop it to serve its own users.

* Based on a presentation given at MLA '11, the 111th annual meeting of the Medical Library Association; Minneapolis, MN; 15 May 2011.
† Funded by the National Library of Medicine, grant number T15LM007123.
This article has been approved for the Medical Library Association's Independent Reading Program (http://www.mlanet.org/education/irp/).
‡ A public demonstration of the National Library of Medicine's Semantic MEDLINE, without the Combo enhancement, is available at http://skr3.nlm.nih.gov/SemMedDemo/.

Highlights
- PubMed data can potentially serve clinicians at the point of care.
- Semantic MEDLINE, when enhanced with the Combo algorithm, can summarize PubMed results to provide information needed at the point of care.

Implications
- Natural language processing applications that summarize text may facilitate appraisal of PubMed results for automated decision support.
- Natural language processing applications offer an opportunity for libraries to provide value-added information delivery.
- Natural language processing applications can be tailored to serve specific user groups.
- Combo-enhanced Semantic MEDLINE could complement commercial decision support products or independently provide point-of-care information.

J Med Lib Assoc 100(2) April 2012 113

SemRep

SemRep [8], a rule-based NLP application in Semantic MEDLINE, interprets the meaning of PubMed title and abstract text and rephrases it into compact declarations called semantic predications.
For example, consider the following citation abstract text:

"Sandoglobulin significantly reduced the incidence of pneumonia (28 cases in the IGIV group, 43 cases in the placebo group, p=0.0111)" [9]

SemRep rephrases the text with this semantic predication:

Sandoglobulin_PREVENTS_Pneumonia

SemRep identifies "Sandoglobulin" and "Pneumonia" as the respective subject and object of the text and maps them to the Unified Medical Language System (UMLS) Metathesaurus preferred concepts "Sandoglobulin" and "Pneumonia" [10]. It also recognizes the phrase "reduced the incidence of" as the concept that binds the subject and object terms, mapping it to the predicate "PREVENTS," as found in the UMLS Semantic Network. SemRep also identifies the logical UMLS semantic group classifications associated with the arguments, which in this case are "Pharmacologic Substance" (associated with "Sandoglobulin") and "Disease or Syndrome" (associated with "Pneumonia").

Summarization

Semantic MEDLINE's Summarization phase identifies SemRep semantic predications that are relevant to a user's indicated information need. This process begins by prompting the user to select a topic from a list of UMLS preferred concepts that appear in the SemRep data. A summarization software application in Semantic MEDLINE processes the SemRep output according to the following sequential phases:
- Relevance gathers semantic predications containing the user-selected seed topic. For example, if the chosen topic were "Septicemia," this filter would collect the semantic predication "Blood culture_DIAGNOSES_Septicemia."
- Connectivity augments Relevance predications with those that share a non-seed argument's semantic type. For example, in the above predication "Blood culture_DIAGNOSES_Septicemia," the semantic type of the non-seed argument "Blood culture" is "Laboratory Procedure." This filter would augment the Relevance semantic predications with others such as "Measurement of serum lipid level_DIAGNOSES_Sepsis of the newborn," because "Laboratory Procedure" is also the semantic type of the subject argument "Measurement of serum lipid level."
- Novelty eliminates vague predications, such as "pharmaceutical preparation_TREATS_patients," that present information that users already likely know and are of limited use.
- Saliency limits final output to predications that occur with adequate frequency. For example, if "Blood culture_DIAGNOSES_Septicemia" occurred enough times, all occurrences would be included in the final output.

To operationalize the final Saliency phase, the summarization software in this study used a statistical algorithm known as Combo. Combo [7] analyzes predicate frequencies using an adaptation of the Kullback-Leibler Divergence [11] and measures the strength of predicate/semantic type pairings with Riloff's RlogF metric [12] and PredScal, a scaling metric developed for the Combo algorithm. Prior to this approach, summarization depended on conventional, static applications, called schemas, which are limited to specified "subject_predicate_object" patterns. A different schema was required to summarize for each subheading-type refinement, limiting use to five options: treatment of disease [13], substance interaction [14], diagnosis [15], pharmacogenomics [16], and genetic etiology of disease [17]. Because of its advanced computational methodology, Combo adapts to the properties of each set of SemRep output in determining what is relevant to the user's information need, thus enabling summarization for many subheading concepts.

Visualization

The semantic predications produced by the Summarization phase can be visually displayed.
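The four Summarization phases (Relevance, Connectivity, Novelty, Saliency) can be illustrated with a minimal Python sketch. It is not the authors' Perl software: the `Predication` class, the function names, the `GENERIC_CONCEPTS` set, and the `min_count` threshold are all hypothetical, and the Saliency step uses a plain frequency cutoff as a stand-in for the Combo algorithm described in [7]. The semantic types assigned to "Septicemia", "Sepsis of the newborn", and "patients" are illustrative assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """One SemRep-style semantic predication: subject_PREDICATE_object."""
    subject: str
    subject_type: str  # UMLS semantic type of the subject
    predicate: str
    obj: str
    obj_type: str      # UMLS semantic type of the object

    def __str__(self):
        return f"{self.subject}_{self.predicate}_{self.obj}"


# Hypothetical set of overly general concepts used by the Novelty sketch.
GENERIC_CONCEPTS = {"pharmaceutical preparation", "patients"}


def relevance(preds, seed):
    """Keep predications that contain the seed topic as subject or object."""
    return [p for p in preds if seed in (p.subject, p.obj)]


def connectivity(preds, relevant, seed):
    """Add predications that share a semantic type with a non-seed argument."""
    types = {p.obj_type if p.subject == seed else p.subject_type
             for p in relevant}
    return [p for p in preds
            if seed in (p.subject, p.obj)
            or p.subject_type in types or p.obj_type in types]


def novelty(preds):
    """Drop predications whose arguments are too general to be informative."""
    return [p for p in preds
            if p.subject.lower() not in GENERIC_CONCEPTS
            and p.obj.lower() not in GENERIC_CONCEPTS]


def saliency(preds, min_count):
    """Stand-in for Combo: keep all occurrences of frequent predications."""
    counts = Counter(preds)
    return [p for p in preds if counts[p] >= min_count]


def summarize(preds, seed, min_count=2):
    """Run the four phases in sequence for one seed topic."""
    relevant = relevance(preds, seed)
    connected = connectivity(preds, relevant, seed)
    return saliency(novelty(connected), min_count)


# Sample data built from the predications quoted in the text.
blood = Predication("Blood culture", "Laboratory Procedure", "DIAGNOSES",
                    "Septicemia", "Disease or Syndrome")
lipid = Predication("Measurement of serum lipid level", "Laboratory Procedure",
                    "DIAGNOSES", "Sepsis of the newborn", "Disease or Syndrome")
vague = Predication("pharmaceutical preparation", "Pharmacologic Substance",
                    "TREATS", "patients", "Patient or Disabled Group")
igiv = Predication("Sandoglobulin", "Pharmacologic Substance", "PREVENTS",
                   "Pneumonia", "Disease or Syndrome")
sample = [blood, blood, lipid, lipid, vague, vague, vague, igiv]

summary = summarize(sample, "Septicemia")
for predication in summary:
    print(predication)
```

With this sample and the seed topic "Septicemia", Relevance keeps the two "Blood culture" occurrences, Connectivity adds the "Measurement of serum lipid level" predications through the shared "Laboratory Procedure" type, and the "Sandoglobulin" predication is never collected because it shares neither the seed nor that semantic type.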
Figure 1 presents an interface used by NLM to display Summarization output. Because the data have a compact structure, users can quickly focus on desired data. For example, in Figure 1, the Summarization seed topic is "Septicemia," and the user has limited displayed output to items containing the predicate "DIAGNOSES." In Figure 2, the user has clicked on the arc connecting "Septicemia" and "blood culture" and is presented with the citations addressing blood culture's use as a diagnostic tool for septicemia.

Objective

The objective of this study was to evaluate the effectiveness of Semantic MEDLINE, with the Combo statistical algorithm enhancement, in identifying decision support information for disease prevention. The authors wanted to explore its potential use as a point-of-care information delivery application. They wanted to determine if this approach could retrieve recommended preventive interventions found in a commercial, manually annotated product. Prior efforts in applying Semantic MEDLINE, with the Combo algorithm, in a simulated database curation task to identify information relevant to genetic disease etiology were successful [7]. The authors wanted to evaluate the system in a simulated clinical decision support task. They chose to evaluate its performance in retrieving prevention information because the concept is fluid and especially difficult to capture with such an NLP approach. For example, preventing congestive heart failure includes treating hypertension in vulnerable patients. To prevent lung cancer, clinicians counsel patients on smoking cessation.
Therefore, the authors hypothesized that, in addition to finding relevant output in the form of "Intervention X_PREVENTS_Disease Y," they would also find relevant semantic predications containing other predicates, such as "TREATS." Currently, there is no conventional static schema in NLM's Semantic MEDLINE designed to accommodate a disease prevention subheading refinement. The results of this study may offer commentary on the potential enhancement offered by Combo-driven Summarization in expanding Semantic MEDLINE's functionality. This study also served as a pilot for a larger project to examine Semantic MEDLINE's efficiency, when enhanced with the Combo algorithm, in aiding decision support for disease prevention and drug treatment (see footnote 1).

METHODS

Disease topics and data

The authors chose three disease topics: acute pancreatitis, coronary artery disease, and malaria. These three diseases have various etiologies and call for a variety of types of preventive interventions. These differences in disease characteristics and ranges of interventions motivated their selection. The authors executed the following PubMed searches and downloaded the resulting citations.
Acute pancreatitis search session:
#11 Search #8 OR #9
#9 Search (pancreatitis/prevention and control[mesh] NOT Pancreatitis, Chronic[mesh]) AND "systematic review" Limits: Review, Publication Date to 2010/08/31
#8 Search pancreatitis/prevention and control[mesh] NOT Pancreatitis, Chronic[mesh] Limits: Clinical Trial, Meta-Analysis, Randomized Controlled Trial, Publication Date to 2010/08/31

Coronary artery disease search session:
#13 Search #10 OR #11
#11 Search coronary artery disease/prevention and control[mesh] AND "systematic review" Limits: Review, Publication Date to 2010/10/31
#10 Search coronary artery disease/prevention and control[mesh] Limits: Clinical Trial, Meta-Analysis, Randomized Controlled Trial, Publication Date to 2010/10/31

Malaria search session:
#15 Search #12 OR #13
#13 Search Malaria/prevention and control[mesh] AND "systematic review" Limits: Review, Publication Date to 2010/10/31
#12 Search Malaria/prevention and control[mesh] Limits: Clinical Trial, Meta-Analysis, Randomized Controlled Trial, Publication Date to 2010/10/31

Figure 1 Interface used by the National Library of Medicine to display Summarization output
Figure 2 Citations addressing blood culture's use as a diagnostic tool for septicemia

Footnote 1: The first author and others describe the results of the "text summarization as a decision support aid" project in a manuscript currently under review elsewhere.

The search sessions were conducted February 7, 2011. To garner evidence-based data, retrieval was focused on clinical trials, meta-analyses, randomized controlled trials, and systematic reviews. Retrieval was also limited to match the time period represented by the study's evaluative reference standards, as described below. Two rationales were behind the search queries' structure.
First, in evaluating Combo-enhanced Semantic MEDLINE for other related projects (addressing genetic disease etiology [7] and drug treatment), information retrieval for text summarization was based on a single disease topic, paired with a subheading-type concept, while drawing on all citations in the database (instead of selected intricate subsets). This provided some standardization across all projects, thus facilitating the eventual comparison of all results. Researchers combined Medical Subject Headings (MeSH) terms with subheadings, keyword phrases (e.g., "systematic reviews"), and publication types when needed. Second, this specific study was designed to simulate an instance in which a clinician would create the search query. Realistically, clinicians' searching skills vary; one can expect anything from a very general keyword search to a more sophisticated search profiting from many of the PubMed value-added search tools. The search queries employed in this study represented a type of middle ground in this spectrum.

Semantic MEDLINE processing

The citations were processed with SemRep. SemRep output was processed with the Combo algorithm-enhanced Summarization application. The authors selected the following UMLS preferred concepts as seed topics for the Summarization phase:
- Pancreatitis (for the acute pancreatitis citations)
- Coronary Arteriosclerosis and Coronary heart disease (for the coronary artery disease citations)
- Malaria (for the malaria citations)

Evaluation

To evaluate the results, the authors compiled a reference standard for each disease, consisting of preventive interventions recommended by DynaMed, a commercial decision support product. The authors chose DynaMed because it was one of three top-ranked products in a recent study [18], presented information in a straightforward bulleted list structure, and was readily available.
Preventive interventions prefaced with text such as "controversial or not well established with evidence" were not included in the study's reference standards. The preventive intervention reference standards for the three disease topics are listed in Tables 1-3. As previously mentioned, the authors noted the most recently published primary articles that DynaMed used in identifying recommendations and then limited the dates of citation retrieval to avoid including data published after DynaMed's source references. This approach to data acquisition was used in a similar study conducted by other investigators [13]. One of the authors (Workman) captured DynaMed data addressing prevention of the three diseases on February 6, 2011.

The primary analysis examined Semantic MEDLINE output in the general form "Intervention X_PREVENTS_Disease Y" for summarized output for each of the three disease topic groups, along with the associated citation from which each semantic predication originated. If a citation's text confirmed the retrieval of a reference standard intervention, it was counted as a true positive (i.e., a reference standard intervention that theoretically should have been retrieved). For example, if the citation included wording such as "[the intervention] is recommended for prevention of [the disease]," the intervention received a true positive status. Because UMLS preferred concepts are sometimes broad in nature, the authors determined that if a general term was associated with citation text containing a reference standard intervention's precise wording, the reference standard would receive a true positive status (this is also demonstrated in the "Results" section). All reference standard interventions not represented in the system's output were classified as false negatives (i.e., reference standard interventions that theoretically should have been retrieved by the system but were not).
Table 1 Semantic MEDLINE recall* of DynaMed preventive intervention reference standard for acute pancreatitis†

DynaMed prevention intervention | Primary analysis (PREVENTS) | Secondary analysis (other predicates)
Guidewire cannulation | TP | N/A
Nonsteroidal anti-inflammatory drugs (NSAIDs) | TP | N/A
Octreotide | TP | N/A
Prophylactic nitroglycerin | TP | N/A
Interleukin 10 (IL-10) | TP | N/A
Recall: 100%

* Recall is the percentage of reference standard interventions found in the system output for each disease topic.
† TP=true positive; FN=false negative; N/A=not applicable, found in primary analysis.

Table 2 Semantic MEDLINE recall* of DynaMed preventive intervention reference standard for coronary artery disease†

DynaMed prevention intervention | Primary analysis (PREVENTS) | Secondary analysis (other predicates)
Proper diet | TP | N/A
Aerobic exercise | FN | FN
Smoking cessation | FN | TP
Modifiable lifestyles | TP | N/A
Weight loss | TP | N/A
Treatment of diabetes | FN | TP
Treatment of hypertension | TP | N/A
Treatment of hyperlipidemia | TP | N/A
Prophylactic low-dose aspirin | TP | N/A
Use of ACE inhibitors | TP | N/A
Complete avoidance of tobacco smoke | FN | FN
Angiotensin receptor blockers | TP | N/A
Aldosterone blockade | FN | FN
Beta blockers | TP | N/A
Influenza vaccine | FN | FN
Recall | 60% | 73%

* Recall is the percentage of reference standard interventions found in the system output for each disease topic.
† TP=true positive; FN=false negative; N/A=not applicable, found in primary analysis.

The authors limited the primary analysis to examining output in the form of "Intervention X_PREVENTS_Disease Y," because if a clinician were to use Semantic MEDLINE as a decision support tool for preventive care, that physician would likely begin by reviewing data with the "PREVENTS" predicate. Findings were measured according to recall, precision, and F-score (the weighted harmonic mean of precision and recall).
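The evaluation measures can be written out in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the F-score is taken to be the balanced form (beta = 1), since the text calls it a weighted harmonic mean without stating the weight. The counts come from Table 2 (15 reference interventions, 9 found in the primary analysis, 2 more in the secondary analysis), and the precision and recall passed to `f_score` are the averages reported in the abstract.

```python
def recall(true_positives: int, reference_size: int) -> float:
    """Share of reference standard interventions found in the system output."""
    return true_positives / reference_size


def precision(matching_groups: int, output_groups: int) -> float:
    """Share of named intervention groups in the output that lead to citation
    text containing a reference standard intervention."""
    return matching_groups / output_groups


def f_score(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision p and recall r.

    beta = 1 (an assumption here) weights precision and recall equally.
    """
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Coronary artery disease counts tallied from Table 2.
cad_primary = recall(9, 15)        # PREVENTS predications only
cad_secondary = recall(9 + 2, 15)  # adding other predicates

# Average precision and final average recall as reported in the abstract.
overall_f = f_score(0.45, 0.79)

print(f"{cad_primary:.0%} {cad_secondary:.0%} {overall_f:.2f}")  # 60% 73% 0.57
```

Averages and F-scores recomputed from the rounded published percentages can differ slightly from the reported values because of rounding in the inputs.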
Recall consisted of the percentage of reference standard interventions found in the system output for each disease topic. Precision scores were calculated in the primary analysis by grouping the interventions in the summarized data by name and assessing what percentage of these groups led to related citation text containing a reference standard intervention.

The secondary analysis examined semantic predications that included predicates other than "PREVENTS." The authors applied the same strategy, using the associated citation data to confirm a given reference standard intervention's true positive status. Because the authors' primary interest was whether these additional data supplied additional reference standard interventions, these findings were factored into the final recall calculations, yielding one precision score and two recall scores for each of the three disease topics. Reference standard interventions already identified in the primary analysis received the designation "N/A," or not applicable, in the secondary analysis.

RESULTS

Data acquisition and processing

One of the authors (Workman) performed the information retrieval phase, SemRep processing, Summarization processing using the Combo algorithm-enhanced software, and evaluation of the output. The 3 PubMed search sessions retrieved a total of 3,276 citations; the acute pancreatitis session produced 156 citations, while the coronary artery disease and malaria sessions yielded 2,440 and 680 citations, respectively. SemRep produced 999 semantic predications from the acute pancreatitis citations, 14,781 semantic predications from the coronary artery disease citations, and 3,374 semantic predications from the malaria citations.
Using the associated SemRep disease topic outputs, Summarization identified 1,397 unique semantic predications salient to the "Coronary Arteriosclerosis" and "Coronary heart disease" seed topics, 178 semantic predications salient to the "Pancreatitis" seed topic, and 389 semantic predications salient to the "Malaria" seed topic.

Evaluation: primary analysis

Semantic MEDLINE with the Combo algorithm enhancement produced an average recall of 70% in the initial examination of output in the form of "Intervention X_PREVENTS_Disease Y." The average precision was 45%, resulting in an F-score of 0.54. The primary analysis recall results for each disease topic are listed in Tables 1-3. Precision results are indicated in Table 4.

Evaluation: secondary analysis

Examination of output semantic predications containing predicates other than "PREVENTS" identified additional reference standard interventions and increased average recall to 79%, with an adjusted F-score of 0.57. Reference standard results for each disease topic group are listed in Tables 1-3. Because all reference standard interventions for acute pancreatitis appeared in the primary analysis, no secondary analysis was necessary for this disease topic.

DISCUSSION

Findings of two analyses

Interesting patterns emerged from both analyses.
Table 3 Semantic MEDLINE recall* of DynaMed preventive intervention reference standard for malaria†

DynaMed prevention intervention | Primary analysis (PREVENTS) | Secondary analysis (other predicates)
Long sleeves | FN | FN
Long pants | FN | FN
Window screens | FN | FN
Mosquito nets | TP | N/A
Insecticide-treated clothes | FN | FN
Insecticide-treated nets | TP | N/A
Insect repellent | TP | N/A
Indoor spraying | FN | FN
Insecticide treatment of livestock | FN | FN
Atovaquone/proguanil | TP | N/A
Trimethoprim-sulfamethoxazole | FN | FN
"Antimalarial agents" | TP | N/A
Artesunate plus amodiaquine or sulfadoxine-pyrimethamine | FN | TP
Mefloquine | TP | N/A
Sulfadoxine-pyrimethamine | TP | N/A
Amodiaquine | TP | N/A
Pyrimethamine plus dapsone | FN | TP
Routine malaria chemoprophylaxis (i.e., during pregnancy) | TP | N/A
Chloroquine | TP | N/A
Recombinant vaccine based on fusion of circumsporozoite protein and HBsAg | FN | FN
RTS,S/AS02 (vaccine) | FN | FN
RTS,S/AS02A (vaccine) | TP | N/A
RTS,S/AS01E (vaccine) | FN | TP
RTS,S/AS02D (vaccine) | FN | TP
MSP/RESA (vaccine) | TP | N/A
Vitamin A supplementation | TP | N/A
Recall | 50% | 65%

* Recall is the percentage of reference standard interventions found in the system output for each disease topic.
† TP=true positive; FN=false negative; N/A=not applicable, found in primary analysis.

Table 4 Precision results by disease topic, from primary analysis of data using DynaMed reference standards

Disease topic | Precision
Acute pancreatitis | 29%
Coronary artery disease | 45%
Malaria | 61%
Average precision | 45%

In the primary analysis (examining output in the form "Intervention X_PREVENTS_Disease Y"), eighteen of the twenty-seven true positive findings for all three disease topics were pharmaceutical-type substances or supplements in the associated reference standards.
The additional nine true positives consisted of other types of interventions, ranging from behavior issues (e.g., diet) to therapeutic techniques (e.g., guidewire cannulation). In this study, Semantic MEDLINE with the Combo algorithm enhancement was more efficient at expressing preventive drug and supplement interventions with the "PREVENTS" predicate than at expressing other kinds of interventions.

The secondary analysis confirmed the hypothesis that some reference standard interventions would be expressed with predicates other than "PREVENTS." The secondary analysis found two of the six interventions not found in the primary analysis for coronary artery disease and four of the thirteen interventions not located for malaria. The relevant semantic predications located in the secondary analysis included the following.

Coronary artery disease:
Diabetic care_USES_Glucose control
Secondary prevention_TREATS_Coronary arteriosclerosis ("Secondary prevention" referencing smoking cessation)

Malaria:
Prophylactic treatment_USES_Amodiaquine
Prophylactic treatment_USES_Artesunate
Prescription of prophylactic anti-malarial_USES_Pyrimethamine
Malaria Vaccines_TREATS_Child
Malaria Vaccines_TREATS_Infant

As noted earlier, all reference standard interventions for acute pancreatitis were found in the primary analysis.

As predicted, in some cases in both analyses, raw Semantic MEDLINE output did not precisely identify a reference standard item, but the associated citation text named the specific intervention. For example, the semantic predication, "Cannulation_PREVENTS_Pancreatitis," does not specifically name guidewire cannulation for acute pancreatitis; however, the associated citation text, "GW [guidewire] cannulation is associated with a higher cannulation success rate and less PEP [post-ERCP pancreatitis] after pancreatic duct entry" [19], identifies the specific cannulation technique corresponding to the reference standard intervention.
Nevertheless, for a reference standard intervention to receive true positive status, the specific intervention had to be named in the citation text. For example, in multiple instances, "exercise" was mentioned as a preventive intervention in citations associated with the system output for coronary artery disease. Because the precise term "aerobic exercise" did not occur, the reference standard intervention aerobic exercise received a false negative status for recall assessment. To fully utilize Semantic MEDLINE with the Combo enhancement as a decision support tool, a clinician should consult the system's output of semantic predications and its associated citation text. An ideal interface would likely combine both, allowing the user to simultaneously review interesting semantic predications and their associated citations.

Precision and variety of output

The performance scores reflect in part the percentage of reference standard interventions included in the output. However, a clinician may find the additional preventive interventions mentioned in Semantic MEDLINE's output useful. For example, the reference standard for acute pancreatitis prevention included five interventions (Table 1). Semantic MEDLINE additionally identified antibiotic prophylaxis [20] and ulinastatin [21] as potential preventive interventions, based on the findings of randomized controlled trials. The associated DynaMed text does not discuss these potential interventions. However, other interventions in Semantic MEDLINE's output may not suit a clinical need. For example, Semantic MEDLINE also identified nafamostat mesilate [22] as a potential preventive intervention. The associated citation text notes that this intervention is "partially effective" and highlights independent risk factors associated with the disease.
It is again recommended that a Semantic MEDLINE user consult the citation text (and the original article, if desired) associated with a semantic predication, to assess the relevance and strength of evidence pertaining to the original information need. Ideally, an interface (such as the one used by NLM) would present citation text with its associated semantic predication for simultaneous viewing, along with immediate access to the original PubMed record, where links to full text might be present.

CONCLUSION

Based on these findings, Semantic MEDLINE with the Combo algorithm enhancement potentially serves as a decision support resource. It is a flexible approach to point-of-care information delivery that could be integrated into multiple environments. The authors developed the summarization software with Perl, an interpreted programming language that is compatible with multiple platforms. This Perl application provided adequate computing speeds for this project; however, to increase speed, the software could also be coded with a compiled language like Java. A locally accessible database of SemRep output for several years' worth of MEDLINE data is also needed (for a more detailed description of how the system works, please see Workman and Hurdle [7]). Currently, there is not a publicly accessible, Combo-enhanced Semantic MEDLINE web portal; however, this paper provides a brief description of how a library could customize its own application to serve its particular clients.

While a robust Combo-enhanced Semantic MEDLINE is still under development, it offers interesting options for customized search systems. Even though there is no existing application outside of NLM, the framework for a system that accommodates even more information needs exists and could be translated into a product that suits an organization's particular requirements.
Libraries could partner with the organizations they serve to customize Combo-enhanced Semantic MEDLINE for their specific user groups. For example, a library serving a health care organization could conduct user studies for various clientele groups to determine their information needs and preferences. The outcomes of these user studies would enable a web designer to tailor a graphic interface for each user group. The designer could create an interface for consumers and patients, using the simplified, summarized output as a means to assist users in navigating and understanding PubMed citation text. Another interface could assist clinicians in executing searches and accessing desired data on a single screen, organized according to their collective preferences and workflow-driven needs. Because Semantic MEDLINE, with the Combo algorithm enhancement, is a dynamic application, users would be free to build and execute their own searches. Resources would be needed (e.g., a trained web designer, hardware, and software) to create a system customized for an institution's needs. A parent organization such as a hospital or health care system should contribute these resources if the sponsoring library cannot.

Combo-enhanced Semantic MEDLINE could either complement existing decision support products or stand alone. Because it automatically produces information relevant to multiple topics and subheading refinements, this application can potentially address the information needs of many individual users. A technician could implement the summarization software, SemRep semantic predication database, and desired interface to serve clients' information needs. No subscription or licensing fees would be required. Each decision support application contributes point-of-care information in its distinctive way. Each product also has requirements that enable its practical use.
Commercial products often require payment of very expensive fees and possibly some onsite technical support. At present, Combo-enhanced Semantic MEDLINE would require substantial onsite technical support to establish the customized, user-centered application described in this paper. Organizations should consider their own resources and needs in choosing what value-added products they provide to their clientele.

This study is an example of a technology created in part by librarians and demonstrates a new, dynamic approach to information delivery. It surpasses the functionality of simple information retrieval, freeing users from the difficult, unrealistic task of reviewing many citations, providing instead compact summarizations of text that have been filtered for individual information needs. This approach to information delivery could reinforce the importance of libraries as vital components in the organizations they serve.

Limitations

This study has limitations that warrant mention. It examined the performance of Combo algorithm-enhanced Semantic MEDLINE in terms of three disease topics, for a single subheading-type refinement. However, in an earlier study [7], the application demonstrated improved performance for a different disease topic (bladder cancer) and subheading-type refinement (genetic disease etiology) over Semantic MEDLINE with conventional, static schema summarization. Other recent research has also examined Combo-enhanced Semantic MEDLINE's performance while processing data for additional disease topics and an additional subheading refinement, with positive results. As previously noted, a manuscript describing this larger project has been submitted to another publication.

The authors evaluated output using recommendations found in a single product (DynaMed). Similar comparisons using other commercial decision support products may shed additional light on the application's performance.

ACKNOWLEDGMENTS

The authors express gratitude to Dr. Thomas Rindflesch and Dr. Marcelo Fiszman for their essential work in text summarization. They also thank the National Library of Medicine for funding this work through grant number T15LM007123.

REFERENCES

1. Ely JW, Osheroff JA, Chambliss ML, Ebell MH, Rosenbaum ME. Answering physicians' clinical questions: obstacles and potential solutions. J Am Med Inform Assoc. 2005 Mar-Apr;12(2):217-24.
2. Chambliss ML, Conley J. Answering clinical questions. J Fam Pract. 1996 Aug;43(2):140-4.
3. Golder S, McIntosh HM, Duffy S, Glanville J. Developing efficient search strategies to identify reports of adverse effects in MEDLINE and EMBASE. Health Info Lib J. 2006 Mar;23(1):3-12.
4. Hersh WR, Hickam DH. How well do physicians use electronic information retrieval systems? a framework for investigation and systematic review. JAMA. 1998 Oct 21;280(15):1347-52.
5. Hoogendam A, Stalenhoef AF, Robbe PF, Overbeke AJ. Analysis of queries sent to PubMed at the point of care: observation of search behaviour in a medical teaching hospital. BMC Med Inform Decis Mak. 2008;8:42.
6. Fiszman M, Rindflesch TC, Kilicoglu H. Abstraction summarization for managing the biomedical research literature. Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics; 2004. p. 76-83.
7. Workman TE, Hurdle JF. Dynamic summarization of bibliographic-based data. BMC Med Inform Decis Mak. 2011;11:6.
8. Rindflesch TC, Fiszman M. The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text. J Biomed Inform. 2003 Dec;36(6):462-77.
9. Glinz W, Grob PJ, Nydegger UE, Ricklin T, Stamm F, Stoffel D, Lasance A. Polyvalent immunoglobulins for prophylaxis of bacterial infections in patients following multiple trauma. a randomized, placebo-controlled study. Intensive Care Med. 1985;11(6):288-94.
10. Lindberg DA, Humphreys BL, McCray AT. The Unified Medical Language System. Methods Inf Med. 1993 Aug;32(4):281-91.
11. Kullback S, Leibler RA. On information and sufficiency. Ann Math Stat. 1951;22(1):79-86.
12. Riloff E. Automatically generating extraction patterns from untagged text. Proceedings of the Thirteenth National Conference on Artificial Intelligence. Menlo Park, CA: The AAAI Press/MIT Press; 1996. p. 1044-9.
13. Fiszman M, Demner-Fushman D, Kilicoglu H, Rindflesch TC. Automatic summarization of MEDLINE citations for evidence-based medical treatment: a topic-oriented evaluation. J Biomed Inform. 2009 Oct;42(5):801-13.
14. Fiszman M, Rindflesch TC, Kilicoglu H. Summarizing drug information in MEDLINE citations. AMIA Annu Symp Proc. 2006:254-8.
15. Sneiderman C, Demner-Fushman D, Fiszman M, Rosemblat G, Lang FM, Norwood D, Rindflesch TC. Semantic processing to enhance retrieval of diagnosis citations from MEDLINE. AMIA Annu Symp Proc. 2006:1104.
16. Ahlers CB, Fiszman M, Demner-Fushman D, Lang FM, Rindflesch TC. Extracting semantic predications from MEDLINE citations for pharmacogenomics. Pac Symp Biocomput. 2007:209-20.
17. Workman TE, Fiszman M, Hurdle JF, Rindflesch TC. Biomedical text summarization to support genetic database curation: using Semantic MEDLINE to create a secondary database of genetic information. J Med Lib Assoc. 2010 Oct;98(4):273-81. DOI: http://dx.doi.org/10.3163/1536-5050.98.4.003.
18. Banzi R, Liberati A, Moschetti I, Tagliabue L, Moja L. A review of online evidence-based practice point-of-care information summary providers. J Med Internet Res. 2010;12(3):e26.
19. Cheung J, Tsoi KK, Quan WL, Lau JY, Sung JJ. Guidewire versus conventional contrast cannulation of the common bile duct for the prevention of post-ERCP pancreatitis: a systematic review and meta-analysis. Gastrointest Endosc. 2009 Dec;70(6):1211-9.
20. Raty S, Sand J, Pulkkinen M, Matikainen M, Nordback I. Post-ERCP pancreatitis: reduction by routine antibiotics. J Gastrointest Surg. 2001 Jul-Aug;5(4):339-45; discussion: 45.
21. Tsujino T, Komatsu Y, Isayama H, Hirano K, Sasahira N, Yamamoto N, Toda N, Ito Y, Nakai Y, Tada M, Matsumura M, Yoshida H, Kawabe T, Shiratori Y, Omata M. Ulinastatin for pancreatitis after endoscopic retrograde cholangiopancreatography: a randomized, controlled trial. Clin Gastroenterol Hepatol. 2005 Apr;3(4):376-83.
22. Choi CW, Kang DH, Kim GH, Eum JS, Lee SM, Song GA, Kim DU, Kim ID, Cho M. Nafamostat mesylate in the prevention of post-ERCP pancreatitis and risk factors for post-ERCP pancreatitis. Gastrointest Endosc. 2009 Apr;69(4):e11-8.

AUTHORS' AFFILIATIONS

T. Elizabeth Workman, PhD, MLIS, Liz@yahoo.com, Postdoctoral Research Associate, Department of Biomedical Informatics, University of Utah, HSEB 5775, 26 South 2000 East, Salt Lake City, UT 84112; Joan M. Stoddart, MALS, AHIP, joan.stoddart@utah.edu, Deputy Director, Spencer S. Eccles Health Sciences Library, University of Utah, 10 North 1900 East, Salt Lake City, UT 84112

Received July 2011; accepted November 2011

Workman and Stoddart 120 J Med Lib Assoc 100(2) April 2012
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6sb4qf4