Pathformer: a hierarchical vision transformer for pan-cancer grade classification, survival prediction, and biomarker status inference from whole-slide histopathology images

https://doi.org/10.55529/jaimlnn.61.112.126

Authors

  • Zayyanu Yunusa, Computer Science, Iconic Open University of Nigeria, Bakura, Nigeria.

Keywords:

Computational Pathology, Vision Transformer, Whole-Slide Image, Multiple Instance Learning, Survival Prediction, Biomarker Inference.

Abstract

Computational pathology is a key application area of artificial intelligence (AI): it assists with the analysis of large, complex whole-slide images (WSIs) that are typically challenging for pathologists to assess by eye. Current deep learning approaches to WSI analysis still have several limitations, including the inability to process gigapixel WSIs as a whole, difficulty accounting for the spatial context of WSI patches, and reliance on a single model optimized for one clinical goal. This research addresses these problems by presenting PathFormer, a hierarchical vision Transformer tailored for efficient and interpretable WSI analysis. PathFormer uses windowed self-attention with 32 non-overlapping small patches in its lower layers and global attention in its higher layers, with a computational complexity of O(N log N). A gated attention-based multiple instance learning (MIL) aggregator produces a slide-level representation from variable-length patch sequences (8,000 to 25,000 patches per slide). A total of 4,312 WSIs from The Cancer Genome Atlas (TCGA), spanning seven cancer types, were used to train and validate the model, and 1,247 WSIs from CPTAC and institutional cohorts were used for external validation. PathFormer performed strongly on multiple clinical tasks. For cancer grade classification, it obtained a mean AUROC of 0.941, significantly higher than the 0.921 achieved by TransMIL (p < 0.01). For survival prediction, it achieved a mean C-index between 0.774 and 0.812, exceeding that of SurvTRACE (0.748). Trained on the same data, the model also reached an AUROC of 0.924 for MSI-H status prediction and 0.938 for IDH1 mutation inference. In addition, 79% of the activation maps produced by the model correlated with pathologist annotations, indicating good interpretability.
In summary, PathFormer offers a single, interpretable, and convenient framework for computational pathology.
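
As a rough illustration of the gated attention-based MIL aggregation mentioned above, the sketch below pools a variable number of patch embeddings into one slide-level vector. It assumes the commonly used gated-attention form, in which each patch score is w · (tanh(V h) ⊙ sigmoid(U h)) and the scores are normalized with a softmax; the abstract does not confirm that PathFormer uses exactly this form. The function names, the dimensions, and the random weights are all hypothetical, and pure Python is used only to keep the example self-contained.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gated_attention_pool(patches, V, U, w):
    """Pool patch embeddings into one slide-level vector.

    Assumed form (not confirmed by the abstract):
    score_k = w . (tanh(V h_k) * sigmoid(U h_k)), weights = softmax(scores).
    """
    scores = []
    for h in patches:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate_branch = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(
            sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate_branch))
        )

    # Numerically stable softmax over the patches of a single slide.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of patch embeddings gives the slide vector.
    dim = len(patches[0])
    slide = [sum(a * h[d] for a, h in zip(weights, patches)) for d in range(dim)]
    return slide, weights


def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for learned weights or patch features."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM = 8, 4  # illustrative sizes, not taken from the paper
V = random_matrix(rng, HIDDEN_DIM, EMBED_DIM)
U = random_matrix(rng, HIDDEN_DIM, EMBED_DIM)
w = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

# Two toy "slides" with different patch counts share the same aggregator.
for n_patches in (5, 12):
    patches = random_matrix(rng, n_patches, EMBED_DIM)
    slide_vec, attn = gated_attention_pool(patches, V, U, w)
    print(n_patches, len(slide_vec), round(sum(attn), 6))
```

Because the softmax is taken over however many patches a slide contains, the same parameters handle sequences of any length, and the resulting attention weights are the kind of per-patch quantity that can be rendered as a heatmap for comparison with pathologist annotations.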

Published

2026-05-04

How to Cite

Zayyanu Yunusa. (2026). Pathformer: a hierarchical vision transformer for pan-cancer grade classification, survival prediction, and biomarker status inference from whole-slide histopathology images. Journal of Artificial Intelligence, Machine Learning and Neural Network, 6(1), 112–126. https://doi.org/10.55529/jaimlnn.61.112.126
