A novel network-based methodology for analysis of COVID-19 data
Milano M.;Cannataro M.
2021-01-01
Abstract
The novel COVID-19 pandemic has posed unprecedented challenges to society and the health sector all over the globe. Here, we present a new network-based methodology for analyzing COVID-19 data measures and apply it to a real dataset. The methodology analyzes a set of homogeneous datasets (i.e., COVID-19 data from several regions) in three steps: a statistical test identifies similar and dissimilar datasets, the resulting similarity information is mapped onto a graph, and a community detection algorithm is then used to visualize and analyze the initial dataset. The methodology and its implementation as an R function are publicly available at https://github.com/mmilano87/analyzeC19D. We evaluated diverse Italian COVID-19 data made publicly available by the Italian Protezione Civile Department at https://github.com/pcm-dpc/COVID-19/. We considered the data provided for each Italian region in two periods, February 24 to April 26, 2020 (first wave) and September 28 to November 29, 2020 (second wave), and then compared the two periods. Similarity matrices of the Italian regions were built for ten COVID-19 data measures using statistical analysis and then mapped to undirected networks, in which each node represents an Italian region and an edge connects two statistically similar regions. Finally, clusters of regions with similar behaviour were found using network-based community detection algorithms. The experiments show the communities formed by the Italian regions and how those communities change across the ten data measures and over time.
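The pipeline described in the abstract (pairwise statistical test, similarity graph, community detection) can be illustrated with a minimal Python sketch. This is not the authors' R implementation, and several choices here are assumptions: the abstract does not name the statistical test, so a two-sided Mann-Whitney U test with a normal approximation is used; an edge is drawn when the test fails to reject equality at a hypothetical threshold `alpha = 0.05`; and connected components serve as the crudest possible stand-in for a real community detection algorithm. The region labels and values in `toy` are invented purely for illustration.

```python
import math
from itertools import combinations


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction).

    Assumption: the source does not say which test is used; this is a stand-in.
    """
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def similarity_graph(series_by_region, alpha=0.05):
    """Undirected graph: nodes are regions, edges join regions the test cannot tell apart."""
    graph = {region: set() for region in series_by_region}
    for a, b in combinations(series_by_region, 2):
        if mann_whitney_p(series_by_region[a], series_by_region[b]) >= alpha:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Stand-in for community detection: group nodes reachable from one another."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Invented toy values for one data measure in four hypothetical regions.
toy = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [2, 3, 4, 5, 6, 7, 8, 9],
    "C": [100, 110, 120, 130, 140, 150, 160, 170],
    "D": [105, 115, 125, 135, 145, 155, 165, 175],
}
communities = connected_components(similarity_graph(toy))
print(communities)
```

In practice one such graph would be built per data measure and per period, and the stand-in grouping step would be replaced by a proper community detection algorithm from a graph library.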