Biomolecular simulation data management in heterogeneous environments

Title Biomolecular simulation data management in heterogeneous environments
Publication Type dissertation
School or College School of Medicine
Department Biomedical Informatics
Author Thibault, Julien Charles Victor
Date 2014-12
Description Over 40 years ago, the first computer simulation of a protein was reported: the atomic motions of a 58 amino acid protein were simulated for a few picoseconds. With today's supercomputers, simulations of large biomolecular systems with hundreds of thousands of atoms can reach biologically significant timescales. Through dynamics information, biomolecular simulations can provide new insights into molecular structure and function to support the development of new drugs or therapies. While the recent advances in high-performance computing hardware and computational methods have enabled scientists to run longer simulations, they have also created new challenges for data management. Investigators need to use local and national resources to run these simulations and store their output, which can reach terabytes of data on disk. Because of the wide variety of computational methods and software packages available to the community, no standard data representation has been established to describe the computational protocol and the output of these simulations, preventing data sharing and collaboration. Data exchange is also limited due to the lack of repositories and tools to summarize, index, and search biomolecular simulation datasets. In this dissertation, a common data model for biomolecular simulations is proposed to guide the design of future databases and APIs. The data model was then extended to a controlled vocabulary that can be used in the context of the semantic web. Two different approaches to data management are also proposed. The iBIOMES repository offers a distributed environment where input and output files are indexed via common data elements. The repository includes a dynamic web interface to summarize, visualize, search, and download published data. A simpler tool, iBIOMES Lite, was developed to generate summaries of datasets hosted at remote sites where user privileges and/or IT resources might be limited. These two informatics-based approaches to data management offer new means for the community to keep track of distributed and heterogeneous biomolecular simulation data and create collaborative networks.
Type Text
Publisher University of Utah
Subject MESH Computer Simulation; Computational Biology; Biological Ontologies; Biomedical Enhancement; Datasets as Topic; Databases as Topic; Computing Methodologies; Models, Molecular; Molecular Dynamics Simulation; Database Management Systems; Software Design; Programming Languages; Vocabulary, Controlled; Metadata; Information Dissemination; Search Engine
Dissertation Institution University of Utah
Dissertation Name Doctor of Philosophy
Language eng
Relation is Version of Digital version of Biomolecular Simulation Data Management in Heterogeneous Environments
Rights Management Copyright © Julien Charles Victor Thibault 2014
Format application/pdf
Format Medium application/pdf
Format Extent 6,525,862 bytes
Source Original in Marriott Library Special Collections
ARK ark:/87278/s65t80sp
Setname ir_etd
ID 1422299
Reference URL https://collections.lib.utah.edu/ark:/87278/s65t80sp