A lightweight method for designing and massive processing of genomics pipelines

Cinaglia P.

2024-01-01

Abstract

The analysis of genomics data involves complex, multistep tasks that, in addition to consuming a substantial amount of computational resources, require operator intervention to execute. A pipeline, comprising preconfigured steps, is designed to maximize efficiency and reproducibility. In this context, pipelines facilitate the automation and orchestration of complex processes, minimizing the need for manual intervention. In this paper, we propose a FLexible ENgine for massive processing of Pipelines in genomics (FLENP). It orchestrates processing according to the steps defined in an ad hoc template, maximizing CPU usage during the computation. Our method exhibited high robustness and inherent flexibility. Furthermore, it proved inexpensive in terms of memory and does not introduce significant latency during execution. We also evaluated its performance by designing a template for a real use case: the transcript-level quantification of RNA-seq experiments.
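As a rough illustration of the general idea described in the abstract (steps declared once in a template, then executed over many samples while keeping the available CPU cores busy), a minimal Python sketch might look like the following. It is not FLENP's code or its template format, which this record does not describe. The `{sample}` placeholder, the list-of-commands template layout, and the names `render`, `run_sample`, and `run_all` are all assumptions made for illustration, and the example steps are stand-ins for real tools.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def render(template, sample):
    """Substitute the hypothetical {sample} placeholder in every step."""
    return [[tok.replace("{sample}", sample) for tok in step] for step in template]


def run_sample(template, sample):
    """Run the steps for one sample in order; stop at the first failure."""
    outputs = []
    for argv in render(template, sample):
        done = subprocess.run(argv, capture_output=True, text=True)
        if done.returncode != 0:
            raise RuntimeError(f"{sample}: step failed: {argv}")
        outputs.append(done.stdout.strip())
    return sample, outputs


def run_all(template, samples, workers=None):
    """Process samples concurrently, one worker per CPU core by default.

    Threads are sufficient here because the actual work happens in
    child processes; each thread only waits for its subprocess.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda s: run_sample(template, s), samples))


if __name__ == "__main__":
    # Stand-in steps: a real template would invoke trimming, alignment,
    # or quantification tools instead of the Python interpreter.
    template = [
        [sys.executable, "-c", "print('trim {sample}')"],
        [sys.executable, "-c", "print('quantify {sample}')"],
    ]
    for name, logs in run_all(template, ["S1", "S2"]).items():
        print(name, logs)
```

In this sketch the steps within a sample run sequentially, since each one usually depends on the previous output, while independent samples run side by side. That split is a design assumption of the example, not a statement about how FLENP schedules work.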

Year: 2024

Keywords: Bioinformatics; Genomics; Parallel processing; Pipelines; Workflow

Appears in types: 1.1 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12317/102055
