Publication Type |
Journal Article |
School or College |
College of Engineering |
Department |
Computing, School of |
Creator |
Freire, Juliana |
Other Author |
Nguyen, Hoa; Kang, Eun Yong |
Title |
Automatically extracting form labels |
Date |
2008 |
Description |
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to retrieve and integrate data that reside in online databases and that are hidden behind form interfaces, including schema matching and clustering, and hidden-Web crawlers. Whereas previous approaches to this problem have relied on heuristics and manually specified extraction rules, our technique makes use of learning classifiers to identify form labels. Our preliminary experiments show this approach is promising and has high accuracy. |
Type |
Text |
Publisher |
Institute of Electrical and Electronics Engineers (IEEE) |
First Page |
1498 |
Last Page |
1500 |
Subject |
Learning classifiers |
Subject LCSH |
Data mining; Database management; Machine learning |
Language |
eng |
Bibliographic Citation |
Nguyen, H., Kang, E. Y., & Freire, J. (2008). Automatically extracting form labels. Proceedings of ICDE, 1498-1500. |
Rights Management |
(c) 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. |
Format Medium |
application/pdf |
Format Extent |
1,044,983 bytes |
Identifier |
ir-main,12330 |
ARK |
ark:/87278/s6th954d |
Setname |
ir_uspace |
ID |
705669 |
Reference URL |
https://collections.lib.utah.edu/ark:/87278/s6th954d |