Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. In the present study different topic modeling techniques are compared, as Latent Dirichlet Allocation (LDA) and topic modeling based on BERT transformer, to extract meaningful insights in the Italian narration of COVID-19 pandemic. In particular, the main focus was the characterization of Post-acute Sequelae of COVID-19, (i.e., PASC) writings as opposed to writings by health professionals and general reflections on COVID-19, (i.e., non-PASC) writings, modeled as a semi-supervised task. The results show that the BERTopic-based approach outperforms the LDA-base approach by grouping in the same cluster the 97.26% of analyzed documents, and reaching an overall accuracy of 91.97%.

Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration

Scarpino, Ileana;Zucco, Chiara;Vallelunga, Rosarina;Luzza, Francesco;Cannataro, Mario
2022-01-01

Abstract

Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. In the present study different topic modeling techniques are compared, as Latent Dirichlet Allocation (LDA) and topic modeling based on BERT transformer, to extract meaningful insights in the Italian narration of COVID-19 pandemic. In particular, the main focus was the characterization of Post-acute Sequelae of COVID-19, (i.e., PASC) writings as opposed to writings by health professionals and general reflections on COVID-19, (i.e., non-PASC) writings, modeled as a semi-supervised task. The results show that the BERTopic-based approach outperforms the LDA-base approach by grouping in the same cluster the 97.26% of analyzed documents, and reaching an overall accuracy of 91.97%.
2022
BERTopic
LDA
narrative medicine
text mining
topic modeling
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.12317/79997
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 7
social impact