Cloud Computing in Bioinformatics: current solutions and challenges
Cannataro M
2016-01-01
Abstract
MOTIVATIONS The availability of high-throughput technologies and the application of genomics and pharmacogenomics studies to large populations are producing an increasing amount of experimental and clinical data, as well as specialized databases spread over the Internet. The storage, preprocessing and analysis of experimental data are becoming the main bottleneck of the analysis pipeline. Managing omics data requires both space for data storage and services for data preprocessing, analysis, and sharing. The resulting scenario comprises a set of bioinformatics tools, often implemented as web services, for the management and analysis of data stored in geographically distributed biological databases [1]. Cloud computing may play an important role in many phases of the bioinformatics analysis pipeline, from data management and processing to data integration and analysis, including data exploration and visualization, because it offers massively scalable computing and storage, data sharing, and on-demand access to resources and applications anytime and anywhere. It may therefore represent the key technology for facing those issues [2].

METHODS This work reviews the main academic and industrial cloud-based bioinformatics solutions developed in recent years; moreover, it highlights the main issues and problems related to the use of such platforms for the storage and analysis of patients’ data. Specifically, the analysed solutions fall under the following service models:
- Data as a Service (DaaS): it provides data storage in a dynamic virtual space hosted by the cloud and gives access to up-to-date data from a wide range of devices connected to the web.
- Software as a Service (SaaS): several cloud-based tools that execute different bioinformatics tasks, e.g. mapping applications, sequence alignment, and gene expression analysis, have been proposed and made available.
- Platform as a Service (PaaS): unlike SaaS solutions, PaaS solutions allow users to customize the deployment of bioinformatics applications and to retain complete control over their instances and associated data.
- Infrastructure as a Service (IaaS): this service model offers a computing infrastructure that includes servers (typically virtualized) with specific computational capability and/or storage. The user controls all the deployed storage resources, operating systems and bioinformatics applications.
For each analysed solution, the main technical characteristics are reported, together with the security and privacy issues that arise when storing and analysing patients’ data.

RESULTS The application of cloud computing in bioinformatics concerns the efficient storage, retrieval and integration of experimental data, and their efficient, high-throughput preprocessing and analysis.