Empirical study of automated dictionary construction for information extraction in three domains

Publication Type Manuscript
School or College College of Engineering
Department Computing, School of
Creator Riloff, Ellen M.
Title Empirical study of automated dictionary construction for information extraction in three domains
Date 1996
Description A primary goal of natural language processing researchers is to develop a knowledge-based natural language processing (NLP) system that is portable across domains. However, most knowledge-based NLP systems rely on a domain-specific dictionary of concepts, which represents a substantial knowledge-engineering bottleneck. We have developed a system called AutoSlog that addresses the knowledge-engineering bottleneck for a task called information extraction. AutoSlog automatically creates domain-specific dictionaries for information extraction, given an appropriate training corpus. We have used AutoSlog to create a dictionary of extraction patterns for terrorism, which achieved 98% of the performance of a handcrafted dictionary that required approximately 1500 person-hours to build. In this paper, we describe experiments with AutoSlog in two additional domains: joint ventures and microelectronics. We compare the performance of AutoSlog across the three domains, discuss the lessons learned about the generality of this approach, and present results from two experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog.
Type Text
Publisher Elsevier
First Page 1
Last Page 39
Subject Information extraction; AutoSlog; Across domains
Subject LCSH Information retrieval; Natural language processing (Computer science)
Language eng
Bibliographic Citation Riloff, E. M. (1996). Empirical study of automated dictionary construction for information extraction in three domains. Artificial Intelligence Journal, 85, 1-39.
Rights Management (c) Elsevier http://www.elsevier.com
Format Medium application/pdf
Format Extent 5,697,724 bytes
Identifier ir-main,12416
ARK ark:/87278/s6bv810p
Setname ir_uspace
ID 704812
Reference URL https://collections.lib.utah.edu/ark:/87278/s6bv810p