Techniques in manifold learning: intrinsic dimension and principal surface estimation

Title	Techniques in manifold learning: intrinsic dimension and principal surface estimation
Publication Type	dissertation
School or College	College of Science
Department	Mathematics
Author	Purcell, Michael Patrick
Date	2010-05
Description	Intrinsic dimension estimation is a fundamental problem in manifold learning. In applications, high-dimensional data frequently exhibit an underlying lower-dimensional structure that, if understood, would allow for faster and more complete analysis of the data. Understanding this underlying structure requires first determining its dimension. In this dissertation, a new intrinsic dimension estimator is proposed. This estimator does not estimate the intrinsic dimension of a set directly; rather, it estimates the dimension of the sampling measure. This approach acknowledges the fact that in applications the lower-dimensional structure is unknown; the only information about the set that is available to researchers is what is collected via some sampling method. Theoretical performance guarantees are proven, showing that the estimator performs well under only very mild restrictions. Finally, the results of several numerical experiments are provided as evidence that the estimator performs as well as or better than the estimators that have been proposed in the literature. Having determined the intrinsic dimension of a set, it remains to examine the geometry of the underlying lower-dimensional structure. This dissertation examines a new technique called Kernel Map Manifolds, proposed by Samuel Gerber to do precisely this. The Kernel Map Manifolds algorithm uses the complementary ideas of principal surfaces and kernel regression to estimate the geometry of the underlying structure of sample data. This algorithm relies on a conjecture about the nature of the class of minimizers of a distance function. If this conjecture is true, then a gradient descent method can be employed to produce estimated coordinate maps for a principal surface of a distribution. 
While this conjecture is not addressed directly herein, it is shown that if the coordinate map of a principal surface of a given distribution is known, then sample data can be used in conjunction with this knowledge to produce accurate estimates of the principal surface. Consequently, if the conjecture is true, the Kernel Map Manifolds algorithm will produce accurate estimates of the underlying lower-dimensional geometry of the set.
Type	Text
Publisher	University of Utah
Subject	Intrinsic dimension; Principal surface; Manifold learning
Subject LCSH	Manifolds (Mathematics)
Dissertation Institution	University of Utah
Dissertation Name	PhD
Language	eng
Rights Management	©Michael Patrick Purcell
Format	application/pdf
Format Medium	application/pdf
Format Extent	1,075,265 bytes
Identifier	us-etd2,151299
Source	Original in Marriott Library Special Collections, QA3.5 2010.P87
ARK	ark:/87278/s6377qc4
Setname	ir_etd
ID	193754
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6377qc4