Automatically generating extraction patterns from untagged text

Publication Type Journal Article
School or College College of Engineering
Department Computing, School of
Creator Riloff, Ellen M.
Title Automatically generating extraction patterns from untagged text
Date 1996
Description Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or semantic tags. In particular, all previous dictionary construction systems for information extraction have used an annotated training corpus or some form of annotated input. We have developed a system called AutoSlog-TS that creates dictionaries of extraction patterns using only untagged text. AutoSlog-TS is based on the AutoSlog system, which generated extraction patterns using annotated text and a set of heuristic rules. By adapting AutoSlog and combining it with statistical techniques, we eliminated its dependency on tagged text. In experiments with the MUC-4 terrorism domain, AutoSlog-TS created a dictionary of extraction patterns that performed comparably to a dictionary created by AutoSlog, using only preclassified texts as input.
Type Text
Publisher Association for the Advancement of Artificial Intelligence (AAAI)
First Page 1
Last Page 6
Subject Information extraction; Automatically generating; Extraction patterns; Untagged text; Corpus-based; AutoSlog-TS; AutoSlog system; MUC-4; Dictionary construction
Subject LCSH Information retrieval; Corpora (Linguistics); Natural language processing (Computer science)
Language eng
Bibliographic Citation Riloff, E. M. (1996). Automatically generating extraction patterns from untagged text. Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), 1-6.
Rights Management (c)AAAI http://www.aaai.org/
Format Medium application/pdf
Format Extent 1,976,724 bytes
Identifier ir-main,12415
ARK ark:/87278/s65t43m5
Setname ir_uspace
ID 702594
Reference URL https://collections.lib.utah.edu/ark:/87278/s65t43m5