Motivation: Prediction of node and graph labels are prominent network science tasks. Data analyzed in these tasks are sometimes related: entities represented by nodes in a higher-level (higher scale) network can themselves be modeled as networks at a lower level. We argue that systems involving such entities should be integrated with a 'network of networks' (NoNs) representation. Then, we ask whether entity label prediction using multi-level NoN data via our proposed approaches is more accurate than using each of single-level node and graph data alone, i.e. than traditional node label prediction on the higher-level network and graph label prediction on the lower-level networks. To obtain data, we develop the first synthetic NoN generator and construct a real biological NoN. We evaluate accuracy of considered approaches when predicting artificial labels from the synthetic NoNs and proteins' functions from the biological NoN. Results: For the synthetic NoNs, our NoN approaches outperform or are as good as node- and network-level ones depending on the NoN properties. For the biological NoN, our NoN approaches outperform the single-level approaches for just under half of the protein functions, and for 30% of the functions, only our NoN approaches make meaningful predictions, while node- and network-level ones achieve random accuracy. So, NoN-based data integration is important. Availability and implementation: The software and data are available at https://nd.edu/~cone/NoNs.

Modeling multi-scale data via a network of networks

Guzzi, Pietro Hiram
Conceptualization
;
2022-01-01

Abstract

Motivation: Prediction of node and graph labels are prominent network science tasks. Data analyzed in these tasks are sometimes related: entities represented by nodes in a higher-level (higher scale) network can themselves be modeled as networks at a lower level. We argue that systems involving such entities should be integrated with a 'network of networks' (NoNs) representation. Then, we ask whether entity label prediction using multi-level NoN data via our proposed approaches is more accurate than using each of single-level node and graph data alone, i.e. than traditional node label prediction on the higher-level network and graph label prediction on the lower-level networks. To obtain data, we develop the first synthetic NoN generator and construct a real biological NoN. We evaluate accuracy of considered approaches when predicting artificial labels from the synthetic NoNs and proteins' functions from the biological NoN. Results: For the synthetic NoNs, our NoN approaches outperform or are as good as node- and network-level ones depending on the NoN properties. For the biological NoN, our NoN approaches outperform the single-level approaches for just under half of the protein functions, and for 30% of the functions, only our NoN approaches make meaningful predictions, while node- and network-level ones achieve random accuracy. So, NoN-based data integration is important. Availability and implementation: The software and data are available at https://nd.edu/~cone/NoNs.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.12317/75409
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 36
  • ???jsp.display-item.citation.isi??? 24
social impact