Corpus-based semantic lexicon induction with web-based corroboration

Publication Type Journal Article
School or College College of Engineering
Department Computing, School of
Creator Riloff, Ellen M.
Other Author Igo, Sean P.
Title Corpus-based semantic lexicon induction with web-based corroboration
Date 2009
Description Various techniques have been developed to automatically induce semantic dictionaries from text corpora and from the Web. Our research combines corpus-based semantic lexicon induction with statistics acquired from the Web to improve the accuracy of automatically acquired domain-specific dictionaries. We use a weakly supervised bootstrapping algorithm to induce a semantic lexicon from a text corpus, and then issue Web queries to generate co-occurrence statistics between each lexicon entry and semantically related terms. The Web statistics provide a source of independent evidence to confirm, or disconfirm, that a word belongs to the intended semantic category. We evaluate this approach on 7 semantic categories representing two domains. Our results show that the Web statistics dramatically improve the ranking of lexicon entries, and can also be used to filter incorrect entries.
Type Text
Publisher Association for Computational Linguistics
First Page 1
Last Page 9
Subject Corpus-based; Text corpora; Domain-specific dictionaries; Bootstrapping algorithm
Subject LCSH Programming languages (Electronic computers) -- Semantics; Information retrieval; Corpora (Linguistics)
Language eng
Bibliographic Citation Igo, S. P., & Riloff, E. M. (2009). Corpus-based semantic lexicon induction with web-based corroboration. NAACL-09 Workshop on Unsupervised and Minimally Supervised Learning of Lexical Semantics, 1-9.
Rights Management (c) Igo, S. P., & Riloff, E. M.
Format Medium application/pdf
Format Extent 94,892 bytes
Identifier ir-main,12420
ARK ark:/87278/s6m623bn
Setname ir_uspace
ID 702348
Reference URL https://collections.lib.utah.edu/ark:/87278/s6m623bn