Publication Date:
2019
Abstract:
The continuous growth of experimental data generated by Next Generation Sequencing (NGS) machines has led to the adoption of advanced techniques to manage them intelligently. The advent of the Big Data era posed new challenges that led to the development of novel methods and tools, which were originally devised to address problems in computational science but can now be widely applied to biomedical data. In this work, we address two biomedical data management issues: (i) how to reduce the redundancy of genomic and clinical data, and (ii) how to make this large amount of data easily accessible. First, we propose an approach to optimally organize genomic and clinical data that takes data redundancy into account, together with a method that saves as much space as possible by exploiting NoSQL technologies. Then, we propose design principles for organizing biomedical data and making them easily accessible through a collection of Application Programming Interfaces (APIs), which together form a flexible framework that we call OpenOmics. To validate our approach, we apply it to data extracted from the Genomic Data Commons repository. OpenOmics is free and open source, so that anyone can extend the set of provided APIs with new features able to answer specific biological questions. The APIs are hosted on GitHub at https://github.com/fabio-cumbo/open-omics-api/, are publicly queryable at http://bioinformatics.iasi.cnr.it/openomics/api/routes, and are documented at https://openomics.docs.apiary.io/.
Iris type:
01.01 Journal article
List of contributors:
Cumbo, Fabio; Cappelli, Eleonora
Published in: