Finding Prognostic Patterns in Gigapixel Images
Risk-stratifying patients through deep learning
From detecting and classifying cells and tissue to predicting biomarkers and patient outcomes, computational pathology applications are becoming increasingly complex. Simpler tasks rely upon pathologists’ annotations of specific features in the tissue – but biomarkers and outcomes are more complex. Algorithms must decipher whole-slide images without any prior knowledge of which characteristics or tissue regions are important. And the task becomes even more difficult when machines are asked to forecast patients’ prognoses over time.
Risk stratification can already be done using cancer staging, molecular features, or clinical variables. Improving prognostic insights – predicting the likely outcome for a patient following the standard treatment – is a greater challenge and an active area of research. Take, for example, ductal carcinoma in situ (DCIS), a pre-invasive form of breast cancer. Many such cancers do not become invasive – but which ones will? There is a great deal of interobserver variability amongst pathologists assessing such lesions (1,2). But researchers at Georgia State University have developed an algorithm that can predict the risk of local recurrence of DCIS within 10 years using digitized whole-slide images (3). Risk stratification could also involve forecasting tasks; for example, predicting whether a distant metastasis will occur or how long a patient is likely to live. Regardless of the target, the challenges in creating such algorithms are similar.
H&E whole-slide images are large, and tissue appearance is diverse. And unlike tasks such as finding mitoses or segmenting tissue types, outcome prediction offers nothing for pathologists to annotate; they cannot mark which regions of the tissue are associated with patient outcome – at least not with any degree of certainty.
Capturing prognostic patterns
Traditional approaches to predicting patient outcomes from histopathology mimic the work of a pathologist. They take features that pathologists already use to stratify patients and create algorithms to automate the extraction of these characteristics. For instance, cellular diversity is associated with lower survival rates in non-small cell lung cancer, whether it is assessed by a pathologist or extracted by an algorithm (4). One major advantage of this approach is interpretability: because pathologists already understand what these features look like, the results from automated methods make sense.
If pathologists could already predict patient outcomes from histopathology reliably, this approach would suffice. However, gigapixel whole-slide images are visually overwhelming: huge, intricate in appearance, and internally heterogeneous. If pathologists cannot reliably stratify tumors by risk, an algorithm that takes the same approach is unlikely to do much better.
Enter deep learning. Over the last decade, deep learning has revolutionized speech recognition, language translation, facial recognition, and many other tasks. The keys to its success are large amounts of data and end-to-end learning: image features don’t need to be predefined, because the model learns them itself. It can learn complex and abstract properties beyond the reach of human visual processing – all based on a training set of labeled images.
Such models learn to extract patterns that are predictive of some target – for instance, whether a tumor is low- or high-grade. For outcome prediction, the target could be the time to a particular event – such as cancer recurrence or death.
Of course, many factors determine how long a patient lives. The morphology of their tumor is only one piece of the puzzle. Predicting the time to event (recurrence or death) directly is challenging for two additional reasons: only a fraction of patients experience the event during the study period, and some leave the study before their outcome can be recorded, so their survival times are censored. Instead of predicting exactly how long a patient is likely to live, most survival models therefore take a contrastive approach: is patient A likely to live longer than patient B? If the model incorrectly predicts which patient lived longer, it is penalized – and, from each incorrect prediction, it adapts to perform slightly better on subsequent examples.
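To make this concrete, here is a minimal sketch of one such contrastive (pairwise ranking) loss in PyTorch. The function name and the margin value are illustrative choices, not a prescription from any cited study. Only "comparable" pairs contribute: patient i must have an observed event that occurred before patient j's follow-up ended, which is how censored records can be used without being discarded.

```python
import torch

def pairwise_ranking_loss(risk, time, event):
    """Hinge loss over comparable patient pairs for censored survival data.

    risk:  (N,) predicted risk scores (higher = worse prognosis)
    time:  (N,) follow-up time for each patient
    event: (N,) 1 if the event (e.g., recurrence or death) was observed,
           0 if the patient was censored
    """
    # Pair (i, j) is comparable only if patient i's event was actually
    # observed and happened before patient j's follow-up ended. Censored
    # patients can only appear as the longer-surviving member of a pair.
    comparable = (time.unsqueeze(1) < time.unsqueeze(0)) & (event.unsqueeze(1) == 1)

    # Patient i failed first, so the model should give i the higher risk;
    # a hinge with margin 1 penalizes violations of that ordering.
    margins = 1.0 - (risk.unsqueeze(1) - risk.unsqueeze(0))
    violations = torch.clamp(margins, min=0.0)[comparable]
    if violations.numel() == 0:  # no comparable pairs in this batch
        return risk.sum() * 0.0
    return violations.mean()
```

The widely used Cox partial likelihood embodies the same comparative logic in probabilistic form.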
From hypothesis to risk prediction
Deep learning models consist of multiple layers, with higher-level concepts built upon lower-level ones. Each layer of the network has a set of weights that are used to compute a representation for the next layer. The weights are like a hypothesis regarding what properties to look for in the image. Models often have over 100 layers and, across all layers, upwards of 10 million weights to tune.
After passing an image into the network, each layer computes a new representation based on the output from the previous layer. At the end of the network, it predicts the target – in this case, a patient risk score. The goal in training the network is to minimize erroneous predictions.
Training a model involves adjusting each weight a little bit at a time to lower the total error. This way, the model improves its hypothesis about which image properties are important. With each new patient image and the associated survival time, the model gets a little better. After seeing many images – and artificial variations of each to provide extra examples – the model learns to predict patient risk.
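As a sketch of what "adjusting each weight a little bit" means in code, the PyTorch fragment below runs one gradient descent step on a deliberately tiny network. The architecture, batch size, and synthetic data are illustrative only; the loss is the ranking loss sketched earlier.

```python
import torch
import torch.nn as nn

# A toy network: each layer builds a new representation from the
# previous one, ending in a single risk score per image.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 1),  # final layer outputs the risk score
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

patches = torch.randn(8, 3, 224, 224)       # synthetic batch of RGB images
time = torch.rand(8) * 120                   # synthetic follow-up (months)
event = torch.randint(0, 2, (8,)).float()    # 1 = observed, 0 = censored

risk = model(patches).squeeze(1)             # predicted risk, shape (8,)
loss = pairwise_ranking_loss(risk, time, event)  # from the earlier sketch

optimizer.zero_grad()
loss.backward()    # compute how every weight contributed to the error
optimizer.step()   # nudge each weight slightly to reduce that error
```

Repeating this step over many batches, with image augmentations supplying the "artificial variations" mentioned above, is essentially all that training amounts to.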
But there is one more challenge in handling gigapixel histopathology images: their size. Whole-slide images can be more than 100,000 pixels across – far too large to fit in the memory of the graphics processing units (GPUs) used to train deep learning models, which rules out naive end-to-end training. Most solutions break the images into small patches. In some studies, a (human) pathologist identifies tumor regions for training, whereas in others, the deep learning model is trained on all tissue patches (5). Another approach is to first cluster the patches by visual appearance, then use a subset of patches from each cluster (6). Regardless of the chosen approach, risk predictions from the patches must then be aggregated into a final risk score for the patient – often by a model that learns to select the most informative patches, as sketched below.
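One common aggregation strategy is attention-based multiple-instance learning, in the spirit of reference 6. The simplified, hypothetical implementation below assumes patch features have already been extracted by an encoder network; the class name and feature dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Aggregate per-patch features into one slide-level risk score."""
    def __init__(self, feat_dim=512, hidden=128):
        super().__init__()
        # Scores how informative each patch is for the prediction.
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1)
        )
        self.risk_head = nn.Linear(feat_dim, 1)

    def forward(self, patch_feats):  # (num_patches, feat_dim)
        weights = torch.softmax(self.attention(patch_feats), dim=0)
        slide_feat = (weights * patch_feats).sum(dim=0)  # weighted average
        return self.risk_head(slide_feat), weights

# One slide: 1,000 patch feature vectors from a pretrained encoder.
feats = torch.randn(1000, 512)
model = AttentionMIL()
risk, weights = model(feats)
```

The learned weights double as a crude interpretability tool: patches that receive high attention are the ones the model considered most informative for its risk estimate.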
The power of histopathology
Histopathology brings with it some unique challenges – and advantages. Outcome prediction models specific to histopathology data have been built to overcome those challenges. Federated learning, for instance, trains a shared model on datasets held at different centers without the data ever leaving them, preserving patient privacy (7). And models can learn to predict survival across multiple types of cancer simultaneously (8,9).
Histopathology is, of course, not the only modality with a demonstrated ability to predict patient outcomes. Whole-slide images can be combined with genomic and clinical features to improve outcome predictions, all within the same deep learning model (9,10,11). Some pan-cancer studies have found that clinical data and gene expression carry most of the prognostic signal, with histopathology features providing no additional predictive power (9,12). However, deep learning methods for whole-slide images are still new – and they have yet to reach their full potential.
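The cited fusion models differ in their details, but the simplest shared pattern is late fusion: encode each modality separately, concatenate the embeddings, and predict a single risk score. A hypothetical sketch, with all dimensions illustrative:

```python
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    """Concatenate per-modality embeddings, then predict one risk score."""
    def __init__(self, img_dim=512, gene_dim=200, clin_dim=10):
        super().__init__()
        self.gene_enc = nn.Sequential(nn.Linear(gene_dim, 64), nn.ReLU())
        self.clin_enc = nn.Sequential(nn.Linear(clin_dim, 16), nn.ReLU())
        self.head = nn.Linear(img_dim + 64 + 16, 1)

    def forward(self, img_feat, gene_expr, clinical):
        # img_feat: slide-level embedding from an image model (see above)
        fused = torch.cat(
            [img_feat, self.gene_enc(gene_expr), self.clin_enc(clinical)],
            dim=-1,
        )
        return self.head(fused)  # one risk score from all modalities

fused_model = LateFusion()
risk = fused_model(torch.randn(512), torch.randn(200), torch.randn(10))
```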
Histopathology provides a unique perspective that a single genomic profile cannot: a spatial view of the tumor. Researchers are just beginning to understand the role that intratumoral heterogeneity plays in tumor progression (13,14,15). These spatial variations can be captured from images far more efficiently than with genomic profiling.
Although interpretability remains a challenge for deep learning models, they can exploit the spatial nature of whole-slide images to indicate which regions of the tissue are most associated with a poor outcome. In some cases, the highlighted regions are not even in the tumor itself, but in the adjacent stroma (16) – information that can provide new insights for pathologists.
H&E histology is a routine part of the pathology pipeline. It is cheaper and faster than molecular analyses. As the transition to digital pathology accelerates, these whole-slide images provide many new opportunities to capitalize on artificial intelligence. Prognostic models with deep learning are promising, even if they are just beginning to show their potential. Perhaps all we need is a larger dataset to allow us to find the most prognostic patterns in these gigapixel images.
- MR Van Bockstal et al., “Interobserver variability in ductal carcinoma in situ of the breast,” Am J Clin Pathol, 154, 596 (2020). PMID: 32566938.
- EJ Groen et al., “Prognostic value of histopathological DCIS features in a large-scale international interrater reliability study,” Breast Cancer Res Treat, 183, 759 (2020). PMID: 32734520.
- S Klimov et al., “A whole slide image-based machine learning approach to predict ductal carcinoma in situ (DCIS) recurrence risk,” Breast Cancer Res, 21, 83 (2019). PMID: 31358020.
- C Lu et al., “A prognostic model for overall survival of patients with early-stage non-small cell lung cancer: a multicentre, retrospective study,” Lancet Digit Health, 2, e594 (2020). PMID: 33163952.
- AZ Shirazi et al., “DeepSurvNet: deep survival convolutional network for brain cancer survival rate classification based on histopathological images,” Med Biol Eng Comput, 58, 1031 (2020). PMID: 32124225.
- J Yao et al., “Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks,” Med Image Anal, 65, 101789 (2020). PMID: 32739769.
- M Andreux et al., “Federated survival analysis with discrete-time Cox models” (2020). Available at: https://bit.ly/3agkRLe.
- E Wulczyn et al., “Deep learning-based survival prediction for multiple cancer types using histopathology images,” PLoS One, 15, e0233678 (2020). PMID: 32555646.
- LA Vale-Silva, K Rohr, “MultiSurv: Long-term cancer survival prediction using multimodal deep learning” (2020). Available at: https://bit.ly/3r1MBJA.
- J Hao et al., “PAGE-Net: Interpretable and integrative deep learning for survival analysis using histopathological images and genomic data,” Pac Symp Biocomput, 25, 355 (2020). PMID: 31797610.
- RJ Chen et al., “Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis,” IEEE Trans Med Imaging, [Online ahead of print] (2020). PMID: 32881682.
- T Zhong et al., “Examination of independent prognostic power of gene expressions and histopathological imaging features in cancer,” Cancers (Basel), 11, 361 (2019). PMID: 30871256.
- AA Alizadeh et al., “Toward understanding and exploiting tumor heterogeneity,” Nat Med, 21, 846 (2015). PMID: 26248267.
- N McGranahan, C Swanton, “Biological and therapeutic impact of intratumor heterogeneity in cancer evolution,” Cancer Cell, 27, 15 (2015). PMID: 25584892.
- R Natrajan et al., “Microenvironmental heterogeneity parallels breast cancer progression: a histology–genomic integration analysis,” PLoS Med, 13, e1001961 (2016). PMID: 26881778.
- P Courtiol et al., “Deep learning-based classification of mesothelioma improves prediction of patient outcome,” Nat Med, 25, 1519 (2019). PMID: 31591589.
Founder of machine learning consulting firm Pixel Scientia Labs, which solves image analysis tasks for pathology applications. She recently completed a doctoral degree in Computer Science at the University of North Carolina at Chapel Hill, North Carolina, USA.