Publication Type |
Journal Article |
School or College |
College of Engineering |
Department |
Computing, School of |
Creator |
Riloff, Ellen M. |
Other Author |
Bean, David L. |
Title |
Corpus-based identification of non-anaphoric noun phrases |
Date |
1999 |
Description |
Coreference resolution involves finding antecedents for anaphoric discourse entities, such as definite noun phrases. But many definite noun phrases are not anaphoric because their meaning can be understood from general world knowledge (e.g., "the White House" or "the news media"). We have developed a corpus-based algorithm for automatically identifying definite noun phrases that are non-anaphoric, which has the potential to improve the efficiency and accuracy of coreference resolution systems. Our algorithm generates lists of non-anaphoric noun phrases and noun phrase patterns from a training corpus and uses them to recognize non-anaphoric noun phrases in new texts. Using 1600 MUC-4 terrorism news articles as the training corpus, our approach achieved 78% recall and 87% precision at identifying such noun phrases in 50 test documents. |
Type |
Text |
Publisher |
Association for Computational Linguistics |
First Page |
373 |
Last Page |
380 |
Subject |
Corpus-based identification; Non-anaphoric noun phrases; Coreference resolution; MUC-4; Discourse entity; DE |
Subject LCSH |
Information retrieval; Natural language processing (Computer science); Corpora (Linguistics) |
Language |
eng |
Bibliographic Citation |
Bean, D. L., & Riloff, E. M. (1999). Corpus-based identification of non-anaphoric noun phrases. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), 373-80. |
Rights Management |
(c) Bean, D. L., & Riloff, E. M. |
Format Medium |
application/pdf |
Format Extent |
1,067,314 bytes |
Identifier |
ir-main,12429 |
ARK |
ark:/87278/s6571wgw |
Setname |
ir_uspace |
ID |
705591 |
Reference URL |
https://collections.lib.utah.edu/ark:/87278/s6571wgw |