A Conceptual Architecture for Reproducible On-demand Data Integration for Complex Diseases

Identifier 001_RR2016_Conceptual_Architecture_GOURIPEDDI.pdf
Title A Conceptual Architecture for Reproducible On-demand Data Integration for Complex Diseases
Creator Ramkiran Gouripeddi; Karen Eilbeck; Mollie Cummins; Katherine Sward; Bernie LaSalle; Kathryn Peterson; Randy Madsen; Phillip Warner; Willard Dere; Julio C. Facelli, Department of Biomedical Informatics, University of Utah
Subject Biomedical Research; Research; Research Methods; Rare Disease Research
Description Eosinophilic Esophagitis is a complex and emerging condition characterized by poorly defined phenotypes and associated with both genetic and environmental conditions. Understanding such diseases requires researchers to seamlessly navigate across multiple scales (e.g., metabolome, proteome, genome, phenome, exposome) and models (sources using different stores, formats, and semantics), interrogate existing knowledge bases, and obtain results in formats of choice to answer different types of research questions. All of these tasks need to be performed in a way that supports reproducibility and shareability of the methods used for selecting data sources, designing research queries, executing queries, and understanding results and their quality. We present a higher level of formalizations for building multi-source data platforms on demand based on the principles of meta-process modeling, and provide reproducible and shareable data query and interrogation workflows and artifacts. A framework based on these formalizations consists of a layered abstraction of processes to support administrative and research end users. Top layer (meta-process): an extendable library of computable generic process concepts (PC) that are stored in a metadata repository (MDR) [1] and describe steps/phases in the translational research life cycle. Middle layer (process): methods to generate on-demand queries by assembling instantiated PC into query processes and rules; researchers design query processes using PC and evaluate their feasibility and validity by leveraging metadata content in the MDR. Bottom layer (execution): interaction with a hyper-generalized federation platform (e.g., OpenFurther [1]) that performs complex interrogation and integration queries that require consideration of interdependencies and precedence across the selected sources. This framework can be implemented using process exchange formats (e.g., DAX, BPMN) and scientific workflow systems (e.g., Pegasus [2], Apache Taverna [3]).
All content (PC, rules, and workflows) and the assembling and executing mechanisms are shareable. The content, design, and development of the framework are informed by user-centered design methodology and consist of researcher-centric and integration-centric components to provide robust and reproducible workflows. References: 1. Gouripeddi R, Facelli JC, et al. FURTHeR: An Infrastructure for Clinical, Translational and Comparative Effectiveness Research. AMIA Annual Fall Symposium. 2013; Wash, DC. 2. Pegasus. The Pegasus Project. 2016; https://pegasus.isi.edu/. 3. Apache Software Foundation. Apache Taverna. 2016; https://taverna.incubator.apache.org/.
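The three layers described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every class, function, field, and source name here (ProcessConcept, Step, MetadataRepository, plan_execution, "ehr", "air_quality") is a hypothetical stand-in, and the execution layer is reduced to checking feasibility against metadata and ordering steps by precedence rather than calling a real federation platform.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class ProcessConcept:
    """Top layer (assumed shape): a generic, reusable research step."""
    name: str
    required_fields: frozenset = frozenset()


@dataclass
class Step:
    """Middle layer (assumed shape): a concept instantiated on one source."""
    step_id: str
    concept: ProcessConcept
    source: str
    depends_on: tuple = ()


@dataclass
class MetadataRepository:
    """Hypothetical MDR: the concept library plus per-source field metadata."""
    concepts: dict = field(default_factory=dict)
    source_fields: dict = field(default_factory=dict)

    def is_feasible(self, step: Step) -> bool:
        # Feasible if the source exposes every field the concept requires.
        available = self.source_fields.get(step.source, set())
        return step.concept.required_fields <= available


def plan_execution(mdr: MetadataRepository, steps: list) -> list:
    """Bottom layer stand-in: validate steps, then order them by precedence."""
    infeasible = [s.step_id for s in steps if not mdr.is_feasible(s)]
    if infeasible:
        raise ValueError(f"infeasible steps: {infeasible}")
    # Map each step to the set of steps that must run before it.
    graph = {s.step_id: set(s.depends_on) for s in steps}
    return list(TopologicalSorter(graph).static_order())


# Illustrative data only; concept and field names are invented for the example.
mdr = MetadataRepository(
    concepts={
        "select_cohort": ProcessConcept("select_cohort", frozenset({"diagnosis"})),
        "link_exposure": ProcessConcept("link_exposure", frozenset({"zip_code"})),
    },
    source_fields={
        "ehr": {"diagnosis", "zip_code"},
        "air_quality": {"zip_code", "pm25"},
    },
)
steps = [
    Step("s2", mdr.concepts["link_exposure"], "air_quality", depends_on=("s1",)),
    Step("s1", mdr.concepts["select_cohort"], "ehr"),
]
order = plan_execution(mdr, steps)
```

In this sketch the query process is plain data, so it could in principle be serialized and shared. Mapping it onto an exchange format such as DAX or BPMN is left out because the abstract does not specify how that is done.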
Relation is Part of 2016 Research Reproducibility Conference & Lectures
Publisher Spencer S. Eccles Health Sciences Library, University of Utah
Date Digital 2016
Date 2016
Format application/pdf
Rights Management Copyright 2016. For further information regarding the rights to this collection, please visit: https://NOVEL.utah.edu/about/copyright
Language eng
ARK ark:/87278/s68s92ks
Type Text
Setname ehsl_rr
ID 1400673
Reference URL https://collections.lib.utah.edu/ark:/87278/s68s92ks