Extracting the Right Data for Patient Care
Hidden information in digital H&E images can revolutionize pathology-oncology crosstalk
Applying artificial intelligence (AI) to digital pathology demonstrably streamlines workflows. Image quality improves, image acquisition and viewing are more efficient, teaching and research are augmented, and clinical sample testing is simplified. Nevertheless, significant challenges remain for deployment of AI solutions in digital pathology (1) – and the biggest of these is noise.
“Technical” noise stems from variability in slide and image preparation; sources include debris or contaminants, tissue tears or folds, retraction artifacts, hematoxylin and eosin (H&E) staining protocol variations, differences in staining intensity (e.g., due to tissue thickness or local image defocusing/aberration) and format variations between scanner platforms. “Biological” noise, by contrast, arises from tumor heterogeneity – and therefore harbors clinically useful information. In a perfect world, AI-based digital pathology tools would cancel out technical noise while capturing the diagnostic value hidden in biological noise. How close are we to this ideal?
Unfortunately, standard AI techniques cannot accommodate the pervasive variability of H&E histopathology images; current deep learning algorithms lack the capacity for systematic generalization and abstraction that this variability demands. Furthermore, if we attempt to break down variability – for instance, from a cell or tissue perspective – we generate a huge number of patterns. This in itself causes significant problems, because we cannot collect enough structured or labeled data of sufficient quality to account for all such variation. Single-task-oriented AI algorithms with binary discrimination capabilities are already challenged by the demands of a multi-task-oriented diagnostic environment; data overload further complicates the situation.
Note, too, that deep learning algorithms use network architectures optimized for fast, heavy parallel computation of spatial and temporal correlations using layers of feed-forward or feedback loops. This means they are not necessarily optimized for pathologists’ needs – the efficient interrogation of complexities associated with tumor biology – and consequently do not support rational decision-making. Moreover, their reliance on graphics processing unit (GPU) clusters and heavy-duty clinical system integration makes them costly.
Clearly, we need a new approach: an analytical framework that accommodates biological noise by quantitatively addressing tumor oncogenic vulnerabilities, delivered as an objective automated tool that requires little or no human interaction beyond quality management of source images.
Beyond minimizing technical noise as a first step, such a platform would need to extract the key tumor microenvironment interactions and the biological signatures of crucial oncogenic drivers from H&E biopsy or resection images as pan-cancer digital biomarkers. These biomarkers need to be explainable, expressed on continuous scales, and open to independent validation by direct or indirect orthogonal tests. They also need to be extracted rapidly (preferably in near real time, as appropriate to the clinical task at hand) without demanding enormous computing power.
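To make the idea of minimizing technical noise more concrete, here is a minimal sketch of one simple way to reduce differences in staining intensity between two images. It is an illustration only, not the method of any particular platform: it converts RGB pixels to optical density (the Beer-Lambert relation) and matches each channel's mean and spread to a reference image. The function names, the background value of 255, and the toy pixel values are all assumptions made for this example; real whole-slide pipelines use more sophisticated stain separation.

```python
import math

def to_optical_density(rgb, background=255.0):
    """Convert one RGB pixel (0-255) to optical density (Beer-Lambert).

    Assumes a pure white background of 255; values below 1 are clamped
    to avoid taking the logarithm of zero.
    """
    return tuple(-math.log10(max(c, 1.0) / background) for c in rgb)

def from_optical_density(od, background=255.0):
    """Convert optical density back to RGB, clamped to the 0-255 range."""
    return tuple(min(255.0, max(0.0, background * 10 ** (-d))) for d in od)

def channel_stats(pixels_od):
    """Return (mean, population standard deviation) for each of 3 channels."""
    n = len(pixels_od)
    stats = []
    for ch in range(3):
        values = [p[ch] for p in pixels_od]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        stats.append((mean, math.sqrt(var)))
    return stats

def normalize_stain_intensity(source_rgb, reference_rgb):
    """Match the per-channel optical-density statistics of source to reference.

    Hypothetical helper: a simplified mean/spread matching, not a full
    stain-separation method.
    """
    src_od = [to_optical_density(p) for p in source_rgb]
    ref_od = [to_optical_density(p) for p in reference_rgb]
    src_stats = channel_stats(src_od)
    ref_stats = channel_stats(ref_od)
    out = []
    for p in src_od:
        matched = []
        for ch in range(3):
            s_mean, s_std = src_stats[ch]
            r_mean, r_std = ref_stats[ch]
            scale = r_std / s_std if s_std > 0 else 1.0
            # Optical density cannot be negative, so clamp at zero.
            matched.append(max(0.0, (p[ch] - s_mean) * scale + r_mean))
        out.append(from_optical_density(matched))
    return out

# Toy data: a pale "source" stain and a darker "reference" stain.
source = [(200, 150, 210), (120, 60, 160), (230, 200, 235)]
reference = [(180, 100, 190), (90, 30, 140), (220, 170, 225)]
normalized = normalize_stain_intensity(source, reference)
```

Working in optical density rather than raw RGB is a deliberate choice here: stain concentration relates approximately linearly to optical density, so scaling in that space is a closer proxy for "more or less stain" than scaling pixel intensities directly. Note that this kind of correction addresses only one technical noise source and leaves biological variability untouched.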
Tumors are inherently biologically heterogeneous, and this heterogeneity contributes to the diagnostic challenges pathologists face in routine reporting, including tumor grading, staging, and prognostication. Evaluation of host responses to the tumor (e.g., quantifying tumor-infiltrating lymphocytes) and of key molecular markers that bear on both treatment selection and response (e.g., HER2, Ki67, PD-1/PD-L1) is equally vulnerable to this variability. As a result, any diagnostic solution that can address heterogeneity would offer invaluable support in precisely these challenging scenarios.
In summary, interrogating tumor biology to extract the right kind of hidden data returns information of great value to pathologists and oncologists. With no input beyond pre-treatment biopsy or resection whole-slide images, such a tool would deliver outputs that support rational diagnostic and therapeutic decisions. This next step forward in AI-based diagnostic support represents a true patient-centric democratization of digital pathology, avoiding ancillary testing and additional tissue requirements while still offering universal accessibility, faster turnaround times, better affordability and, crucially, greater diagnostic accuracy for patients.
1. HR Tizhoosh, L Pantanowitz, “Artificial intelligence and digital pathology: challenges and opportunities,” J Pathol Inform, 9, 38 (2018). PMID: 30607305.
Founder and Chief Scientist at 4D Path, Newton, Massachusetts, USA.
Founder and CKO/CTO at 4D Path, Newton, Massachusetts, USA.