Integrating trajectory inference and self-explainable predictive models to explore cell state transitions in breast cancer at single-cell resolution
Lappano, Rosamaria
2026-01-01
Abstract
Introduction: Breast cancer is characterized by a highly heterogeneous cellular environment composed of diverse malignant clones and components of the tumor microenvironment (TME) that collectively influence disease progression. Single-cell RNA sequencing (scRNA-seq) offers a powerful tool to dissect this complexity, enabling high-resolution characterization of tumor heterogeneity and functional interactions within the TME. Moreover, it supports the discovery of clinically relevant subpopulations and potential therapeutic targets.

Methods: In this study, we present a novel scRNA-seq dataset from an infiltrating ductal breast cancer, profiling over 5,000 cells and identifying six distinct clusters spanning cancer and TME populations. To explore the molecular drivers of cell state transitions, we integrate pseudotime trajectory inference with interpretable, tree-based machine learning. This combined approach enables the identification of key genes and expression thresholds associated with dynamic phenotypic shifts.

Results: Our analysis identified six distinct cellular clusters representing both malignant and TME populations. The integration of pseudotime inference with interpretable machine learning uncovered key genes and specific expression thresholds associated with transcriptional reprogramming and dynamic phenotypic transitions during tumor evolution.

Discussion: Unlike black-box models, our framework provides transparent, rule-based insights into transcriptional reprogramming processes underlying tumor progression. The resulting dataset, together with an accessible and transparent analytical pipeline, represents a valuable resource for the breast cancer research community and establishes a foundation for future studies aimed at refining molecular classification and advancing precision therapy development.
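The sketch below illustrates how a tree-based model can turn pseudotime-ordered cells into a gene and an expression threshold. It is not the authors' pipeline. It assumes synthetic data, two hypothetical genes (`GENE_A`, `GENE_B`), an arbitrary pseudotime cutoff of 0.5 separating "early" from "late" cells, and a depth-one decision tree (a stump) chosen by Gini impurity in place of whatever tree model the study used.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted Gini) for the best single cut on one gene."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_thr, best_imp = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / n
        if imp < best_imp:
            best_thr, best_imp = (lo + hi) / 2.0, imp
    return best_thr, best_imp


def fit_stump(expression, labels):
    """Choose the gene whose best threshold separates the labels most cleanly."""
    results = {gene: best_split(vals, labels) for gene, vals in expression.items()}
    gene = min(results, key=lambda g: results[g][1])
    threshold, impurity = results[gene]
    return gene, threshold, impurity


# Synthetic cells (assumption): pseudotime in [0, 1], labels from a 0.5 cutoff.
rng = random.Random(0)
pseudotime = [rng.random() for _ in range(200)]
labels = [1 if t >= 0.5 else 0 for t in pseudotime]

# Hypothetical genes: GENE_A rises along pseudotime, GENE_B is pure noise.
expression = {
    "GENE_A": [10.0 * t + rng.gauss(0.0, 0.5) for t in pseudotime],
    "GENE_B": [rng.gauss(5.0, 1.0) for _ in pseudotime],
}

gene, threshold, impurity = fit_stump(expression, labels)
print(f"best split: {gene} at {threshold:.2f} (weighted Gini {impurity:.3f})")
```

The output is a rule that can be read directly ("cells above this expression level of this gene are mostly late"), which is the kind of transparency the abstract contrasts with black-box models. A deeper tree would chain several such rules.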


