Tools and techniques for genome annotation and analysis

Title	Tools and techniques for genome annotation and analysis
Publication Type	dissertation
School or College	School of Medicine
Department	Human Genetics
Author	Holt, Carson Hinton
Date	2011-08
Description	Whole genome sequencing projects have expanded our understanding of evolution, organism development, and human disease. Now advances in second-generation technologies are making whole genome sequencing routine even for small laboratories. However, advances in annotation technology have not kept pace with genome sequencing, and annotation has become the major bottleneck for many genome projects (especially those with limited bioinformatics expertise). At the same time, challenges associated with genomics research extend beyond merely annotating genomes, as annotations must be subjected to diverse downstream analyses, the complexities of which can confound smaller research groups. Additionally, with improvements in genome assembly and the wide availability of next-generation transcriptome data (mRNA-seq), researchers have the opportunity to re-annotate previously published genomes, which creates new difficulties for data integration and management that are not well addressed by existing tools. In response to the challenges facing second-generation genome projects, I have developed the annotation pipeline MAKER2 together with accessory software for downstream analysis and data management. The MAKER2 annotation pipeline finds repeats within a genome, aligns ESTs and cDNAs, identifies sites of protein homology, and produces database-ready gene annotations in association with supporting evidence. However, MAKER2 can go beyond structural annotation to identify and integrate functional annotations. MAKER2 also provides researchers with the capability to re-annotate legacy genome datasets and to incorporate mRNA-seq. Additionally, MAKER2 supports distributed parallelization on computer clusters, thus providing a scalable solution for datasets of any size. Annotations produced by MAKER2 can be directly loaded into many popular downstream annotation analysis and management tools from the Generic Model Organism Database Project.
By using MAKER2 with these tools, research groups can quickly build genome annotations, perform analyses, and distribute their data to the wider scientific community. Here I describe the internal architecture of MAKER2, and document its computational capabilities. I also describe my work to annotate and analyze eight emerging model organism genomes in collaboration with their associated genome projects. Thus, in the course of my thesis work, I have addressed a specific need within the scientific community for easy-to-use annotation and analysis tools while also expanding our understanding of evolution and biology.
Type	Text
Publisher	University of Utah
Subject MESH	Databases, Genetic; Genome; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Software; Expressed Sequence Tags; Genetic Techniques; Sensitivity and Specificity; Oomycetes; Eukaryota
Dissertation Institution	University of Utah
Dissertation Name	Doctor of Philosophy
Language	eng
Relation is Version of	Digital reproduction of Tools and Techniques for Genome Annotation and Analysis. Spencer S. Eccles Health Sciences Library. Print version available at J. Willard Marriott Library Special Collections.
Rights Management	Copyright © Carson Hinton Holt 2011
Format	application/pdf
Format Medium	application/pdf
Format Extent	7,649,831 bytes
Source	Original in Marriott Library Special Collections, QH9.7 2011.H65
ARK	ark:/87278/s6tj1vwv
Setname	ir_etd
ID	196477
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6tj1vwv