cPEA: a parallel method to perform pathway enrichment analysis using multiple pathways databases
Agapito, G.; Cannataro, M.
2020-01-01
Abstract
Genes and proteins activate or inhibit biological pathways inside and outside the cells of every living organism. The key to understanding their functional roles is to deduce the relationship between pathways and genes/proteins. Pathway enrichment analysis (PEA), an essential method in omics research, identifies the biological role of genes/proteins in their biological context. A large number of PEA methods and tools are available; nevertheless, only a few can perform PEA using information from multiple databases in the same analysis. Many of these databases were initially developed to use their own pathway representation format, resulting in a heterogeneous collection of resources that are extremely difficult to combine and use. Soft computing enables approximate solutions to problems that are difficult to solve precisely, such as merging and integrating structured and unstructured data, or data from different databases. Integrating and merging biological pathways from diverse data sources is challenging because of the different pathway data representations used. Parallel preprocessing methods that deal with approximation and imprecision can help integrate heterogeneous pathway data. We implemented an automatic methodology to perform PEA using pathways from different databases, together with a method to compute topological scores to rank enriched pathways. This methodology is available in a software framework called cross-pathway enrichment analysis (cPEA). The reported results show good performance in terms of execution time and reduced memory consumption, making it possible to improve PEA by using pathways from different databases.
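
To make the idea of enrichment across several databases concrete, the following is a minimal sketch of a generic over-representation test (a one-sided hypergeometric tail) applied to pathways drawn from more than one source. It is not the cPEA implementation: the abstract does not specify the statistic cPEA uses, and the function names, the toy database names (`db_a`, `db_b`), the gene labels, and the universe size are all assumptions chosen for illustration. Topological scoring and parallel preprocessing are not covered here.

```python
from math import comb


def hypergeom_pvalue(universe_size, pathway_size, query_size, overlap):
    """One-sided tail P(X >= overlap) of the hypergeometric distribution.

    Assumes the query genes and pathway genes all belong to a common
    universe of `universe_size` genes (an illustrative assumption).
    """
    upper = min(pathway_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(query_genes, databases, universe_size):
    """Score every pathway from every database against the query gene list.

    `databases` maps a hypothetical database name to a dict of
    pathway name -> iterable of gene identifiers. Pathways with no
    overlap are skipped. Results are sorted by ascending p-value.
    """
    query = set(query_genes)
    results = []
    for db_name, pathways in databases.items():
        for pathway_name, members in pathways.items():
            member_set = set(members)
            overlap = len(query & member_set)
            if overlap == 0:
                continue
            p = hypergeom_pvalue(universe_size, len(member_set), len(query), overlap)
            results.append((p, db_name, pathway_name, overlap))
    return sorted(results)


# Toy data: two hypothetical databases with already-harmonised gene identifiers.
toy_databases = {
    "db_a": {"pathway_1": ["a", "b", "c", "d", "e"]},
    "db_b": {"pathway_2": ["a", "x", "y"]},
}
ranked = enrich(["a", "b", "c", "z"], toy_databases, universe_size=20)
for p, db_name, pathway_name, overlap in ranked:
    print(f"{db_name}\t{pathway_name}\toverlap={overlap}\tp={p:.4f}")
```

The sketch assumes gene identifiers have already been mapped to a common namespace, which is the hard part when databases use different representation formats. In practice the resulting p-values would also need multiple-testing correction.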