Publication Type | Journal Article
School or College | College of Engineering
Department | Computing, School of
Creator | Riloff, Ellen M.
Other Author | Patwardhan, Siddharth
Title | Effective information extraction with semantic affinity patterns and relevant regions
Date | 2007
Description | We present an information extraction system that decouples the tasks of finding relevant regions of text and applying extraction patterns. We create a self-trained relevant sentence classifier to identify relevant regions, and use a semantic affinity measure to automatically learn domain-relevant extraction patterns. We then distinguish primary patterns from secondary patterns and apply the patterns selectively in the relevant regions. The resulting IE system achieves good performance on the MUC-4 terrorism corpus and ProMed disease outbreak stories. This approach requires only a few seed extraction patterns and a collection of relevant and irrelevant documents for training.
Type | Text
Publisher | Association for Computational Linguistics
First Page | 1
Last Page | 11
Subject | Information extraction; Semantic affinity patterns; Relevant regions; MUC-4 terrorism corpus; ProMed disease outbreak stories
Subject LCSH | Information retrieval
Language | eng
Bibliographic Citation | Patwardhan, S., & Riloff, E. M. (2007). Effective information extraction with semantic affinity patterns and relevant regions. Proceedings of the 2007 Conference on Empirical Methods in Natural Language Processing (EMNLP-07), 1-11.
Rights Management | © Patwardhan, S., & Riloff, E. M.
Format Medium | application/pdf
Format Extent | 108,829 bytes
Identifier | ir-main,12405
ARK | ark:/87278/s6hh735g
Setname | ir_uspace
ID | 702515
Reference URL | https://collections.lib.utah.edu/ark:/87278/s6hh735g