Description |
Problems involving network inference and prediction from network data arise in many domains, including biology, ecology, economics, sociology, and neuroscience. These problems are not trivial to solve using existing methods. First, when inferring weighted undirected networks from data, the estimates must satisfy a matrix constraint that ensures the resulting networks are valid. Second, traditional machine learning methods are not designed to work naturally with network objects as input data. Machine learning methods are needed that can accommodate network-valued input beyond vectorized correlation matrices, which do not fully convey information about network topology or constraints. The methods described in this work address these concerns with three aims: 1) to learn network structure from multivariate data while enforcing positive-definiteness, a necessary condition for valid networks; 2) to make classification and regression predictions directly on the manifold of valid networks and also from their derived topological features; and 3) to properly control for confounding covariates in prediction models. While these methods are applicable to general network problems, they are especially pertinent to brain network analysis from resting-state functional magnetic resonance imaging (rsfMRI) data. The first method estimates a sparse functional brain network from a subject's image data as a positive-definite Gaussian graphical model, under the challenging setting of high dimensionality and low sample size typical of neuroimaging datasets. Then, given the network structures for all subjects in a population dataset, this dissertation explores whether topological features of networks carry information for predicting phenotype.
This dissertation then introduces methods for making classification and regression predictions from network-valued inputs, treated as objects on the Riemannian manifold of symmetric positive-definite matrices. Results on an rsfMRI autism dataset show that these network representations bring state-of-the-art improvements in prediction performance. Lastly, this work examines how to train and interpret prediction models in the presence of confounding information. The methods discussed here are used to learn the components of brain connectivity that drive the prediction of behavior and disease. |