Corpus-based identification of non-anaphoric noun phrases

Publication Type Journal Article
School or College College of Engineering
Department Computing, School of
Creator Riloff, Ellen M.
Other Author Bean, David L.
Title Corpus-based identification of non-anaphoric noun phrases
Date 1999
Description Coreference resolution involves finding antecedents for anaphoric discourse entities, such as definite noun phrases. But many definite noun phrases are not anaphoric because their meaning can be understood from general world knowledge (e.g., "the White House" or "the news media"). We have developed a corpus-based algorithm for automatically identifying definite noun phrases that are non-anaphoric, which has the potential to improve the efficiency and accuracy of coreference resolution systems. Our algorithm generates lists of non-anaphoric noun phrases and noun phrase patterns from a training corpus and uses them to recognize non-anaphoric noun phrases in new texts. Using 1600 MUC-4 terrorism news articles as the training corpus, our approach achieved 78% recall and 87% precision at identifying such noun phrases in 50 test documents.
Type Text
Publisher Association for Computational Linguistics
First Page 373
Last Page 380
Subject Corpus-based identification; Non-anaphoric noun phrases; Coreference resolution; MUC-4; Discourse entity; DE
Subject LCSH Information retrieval; Natural language processing (Computer science); Corpora (Linguistics)
Language eng
Bibliographic Citation Bean, D. L., & Riloff, E. M. (1999). Corpus-based identification of non-anaphoric noun phrases. Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), 373-80.
Rights Management (c) Bean, D. L., & Riloff, E. M.
Format Medium application/pdf
Format Extent 1,067,314 bytes
Identifier ir-main,12429
ARK ark:/87278/s6571wgw
Setname ir_uspace
ID 705591
Reference URL https://collections.lib.utah.edu/ark:/87278/s6571wgw