Developing computational methods for studying nonmodel organism genetics and human disease with next-generation sequencing data

Title	Developing computational methods for studying nonmodel organism genetics and human disease with next-generation sequencing data
Publication Type	dissertation
School or College	School of Medicine
Department	Human Genetics
Author	Hu, Hao
Date	2012-12
Description	The rapidly decreasing cost of sequencing is revolutionizing genetics. Two applications of next-generation sequencing data are of particular importance in this regard. First, high-throughput sequencing now offers a fast and inexpensive means to investigate the genomes and genetics of nonmodel organisms. Second, human personal genomics data offer a unique opportunity for discovering the genetic basis of human traits and diseases. My PhD research has focused on developing computational methods to study genetics using next-generation sequencing data. In the first chapter of my thesis, I present a series of genome-based studies of the venomous cone snail Conus bullatus, a source of pharmaceutically important small cysteine-rich peptides called conopeptides or conotoxins. Using high-coverage transcriptome sequence from its venom duct together with low-coverage genomic reads, I have developed new methods to characterize key genomic traits in the absence of a complete reference genome, including genome size, sequence diversity, repeat content and mobile element densities. I have also developed an in silico transcriptomics pipeline for conotoxin discovery, and have used it to identify novel conotoxins as well as candidate enzymes that are likely to be involved in the posttranslational processing of conotoxins. In the second and third chapters of my thesis, I describe a probabilistic disease-gene search algorithm, VAAST (the Variant Annotation, Analysis and Search Tool), for finding damaged genes and their disease-causing variants; I also describe a powerful new extension to the original code base, called VAAST 2.0. In these chapters, I demonstrate that VAAST is both an accurate rare Mendelian disease-gene finder and a powerful means for identifying genes and alleles underlying common diseases.
I have also carried out systematic population-genetic simulations to benchmark the performance of VAAST and VAAST 2.0 under different genetic scenarios, and these demonstrate that VAAST 2.0 is the most robust and broadly applicable method available today for identification of genes involved in common genetic diseases such as breast cancer, hypertriglyceridemia and Crohn disease.
Type	Text
Publisher	University of Utah
Subject MESH	Conus Snail; Mollusk Venoms; Gene Expression Profiling; Genomic Structural Variation; Epistasis, Genetic; Databases, Genetic; Algorithms
Dissertation Institution	University of Utah
Dissertation Name	Doctor of Philosophy
Language	eng
Relation is Version of	Digital reproduction of Developing Computational Methods for Studying Nonmodel Organism Genetics and Human Disease with Next-Generation Sequencing Data. Spencer S. Eccles Health Sciences Library. Print version available at J. Willard Marriott Library Special Collections.
Rights Management	Copyright © Hao Hu 2012
Format	application/pdf
Format Medium	application/pdf
Format Extent	6,664,462 bytes
Source	Original in Marriott Library Special Collections
ARK	ark:/87278/s6f50xbx
Setname	ir_etd
ID	196333
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6f50xbx