Publication Type |
Journal Article |
School or College |
College of Engineering |
Department |
Computing, School of |
Creator |
Riloff, Ellen M. |
Title |
Automatically generating extraction patterns from untagged text |
Date |
1996 |
Description |
Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or semantic tags. In particular, all previous dictionary construction systems for information extraction have used an annotated training corpus or some form of annotated input. We have developed a system called AutoSlog-TS that creates dictionaries of extraction patterns using only untagged text. AutoSlog-TS is based on the AutoSlog system, which generated extraction patterns using annotated text and a set of heuristic rules. By adapting AutoSlog and combining it with statistical techniques, we eliminated its dependency on tagged text. In experiments with the MUC-4 terrorism domain, AutoSlog-TS, using only preclassified texts as input, created a dictionary of extraction patterns that performed comparably to a dictionary created by AutoSlog. |
Type |
Text |
Publisher |
Association for the Advancement of Artificial Intelligence (AAAI) |
First Page |
1 |
Last Page |
6 |
Subject |
Information extraction; Automatically generating; Extraction patterns; Untagged text; Corpus-based; AutoSlog-TS; AutoSlog system; MUC-4; Dictionary construction |
Subject LCSH |
Information retrieval; Corpora (Linguistics); Natural language processing (Computer science) |
Language |
eng |
Bibliographic Citation |
Riloff, E. M. (1996). Automatically generating extraction patterns from untagged text. Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), 1-6. |
Rights Management |
© AAAI, http://www.aaai.org/ |
Format Medium |
application/pdf |
Format Extent |
1,976,724 bytes |
Identifier |
ir-main,12415 |
ARK |
ark:/87278/s65t43m5 |
Setname |
ir_uspace |
ID |
702594 |
Reference URL |
https://collections.lib.utah.edu/ark:/87278/s65t43m5 |