Tools and techniques for genome annotation and analysis

Title Tools and techniques for genome annotation and analysis
Publication Type dissertation
School or College School of Medicine
Department Human Genetics
Author Holt, Carson Hinton
Date 2011-08
Description Whole genome sequencing projects have expanded our understanding of evolution, organism development, and human disease. Now advances in second-generation technologies are making whole genome sequencing routine even for small laboratories. However, advances in annotation technology have not kept pace with genome sequencing, and annotation has become the major bottleneck for many genome projects (especially those with limited bioinformatics expertise). At the same time, challenges associated with genomics research extend beyond merely annotating genomes, as annotations must be subjected to diverse downstream analyses, the complexities of which can confound smaller research groups. Additionally, with improvements in genome assembly and the wide availability of next-generation transcriptome data (mRNA-seq), researchers have the opportunity to re-annotate previously published genomes, which creates new difficulties for data integration and management that are not well addressed by existing tools. In response to the challenges facing second-generation genome projects, I have developed the annotation pipeline MAKER2 together with accessory software for downstream analysis and data management. The MAKER2 annotation pipeline finds repeats within a genome, aligns ESTs and cDNAs, identifies sites of protein homology, and produces database-ready gene annotations in association with supporting evidence. However, MAKER2 can go beyond structural annotation to identify and integrate functional annotations. MAKER2 also provides researchers with the capability to re-annotate legacy genome datasets and to incorporate mRNA-seq. Additionally, MAKER2 supports distributed parallelization on computer clusters, thus providing a scalable solution for datasets of any size. Annotations produced by MAKER2 can be directly loaded into many popular downstream annotation analysis and management tools from the Generic Model Organism Database Project.
By using MAKER2 with these tools, research groups can quickly build genome annotations, perform analyses, and distribute their data to the wider scientific community. Here I describe the internal architecture of MAKER2, and document its computational capabilities. I also describe my work to annotate and analyze eight emerging model organism genomes in collaboration with their associated genome projects. Thus, in the course of my thesis work, I have addressed a specific need within the scientific community for easy-to-use annotation and analysis tools while also expanding our understanding of evolution and biology.
Type Text
Publisher University of Utah
Subject MESH Databases, Genetic; Genome; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Software; Expressed Sequence Tags; Genetic Techniques; Sensitivity and Specificity; Oomycetes; Eukaryota
Dissertation Institution University of Utah
Dissertation Name Doctor of Philosophy
Language eng
Relation is Version of Digital reproduction of Tools and Techniques for Genome Annotation and Analysis. Spencer S. Eccles Health Sciences Library. Print version available at J. Willard Marriott Library Special Collections.
Rights Management Copyright © Carson Hinton Holt 2011
Format application/pdf
Format Medium application/pdf
Format Extent 7,649,831 bytes
Source Original in Marriott Library Special Collections, QH9.7 2011.H65
ARK ark:/87278/s6tj1vwv
Setname ir_etd
ID 196477
Reference URL https://collections.lib.utah.edu/ark:/87278/s6tj1vwv