Publication Type |
honors thesis |
School or College |
College of Engineering |
Department |
Computing |
Faculty Mentor |
Vivek Srikumar |
Creator |
Yehle, Tobin |
Title |
Memoized parsing with derivatives |
Year graduated |
2016 |
Date |
2016-04 |
Description |
Due to the computational complexity of parsing, constituent parsing is not preferred for tasks involving large corpora. However, the high similarity between sentences in natural language suggests that it is not necessary to pay the full computational price for each new sentence. In this work, we present a new parser for phrase structure grammars that takes advantage of this fact by caching partial parses for reuse on later matching sentences. The algorithm we present is the first probabilistic extension of the family of derivative parsers, which repeatedly apply the Brzozowski derivative to find a parse tree. We show that the new algorithm is easily adaptable to natural language parsing: we introduce a folded variant of the parser that keeps the size of lexical grammars small, thus allowing for efficient implementations of complex parsing models. |
Type |
Text |
Publisher |
University of Utah |
Subject |
Parsing (Computer grammar) |
Language |
eng |
Rights Management |
(c) Tobin Yehle |
Format Medium |
application/pdf |
Format Extent |
25,055 bytes |
Identifier |
honors/id/87 |
Permissions Reference URL |
https://collections.lib.utah.edu/details?id=1315246 |
ARK |
ark:/87278/s6bw0rwb |
Setname |
ir_htoa |
ID |
205739 |
Reference URL |
https://collections.lib.utah.edu/ark:/87278/s6bw0rwb |