October 07, 2025
Technical vs. Biological Diversity: Designing for Generalization

AI in pathology can achieve high accuracy in controlled environments but often struggles to generalize across scanners, stains, and patient populations. This article explores why both technical and biological diversity are critical for reliable cancer diagnostics.

PAICON
From Data to Diagnostics
AI Model Generalization, Data Diversity in Pathology, Precision Oncology

AI in pathology has reached notable milestones: detecting cancer patterns, classifying tissue types, and even predicting molecular biomarkers from digital slides. Despite this rapid progress, a persistent challenge remains: ensuring that models trained under specific technical conditions perform consistently across laboratories and scanners.

In other words, success in precision medicine depends on one thing above all: generalization.

When Technical Diversity Becomes a Challenge

Pathology data are far from uniform. Whole-slide images can differ substantially in resolution, staining intensity, or color tone depending on the laboratory environment, scanner type, or technician expertise. Even subtle differences in tissue thickness or how a slide is digitized can shift pixel values enough to change how an AI model interprets the image.

Such inconsistencies, often described as batch effects, can cause models to rely on technical cues rather than genuine biological features. For example, a convolutional neural network trained to identify cancer in histopathology slides might associate a particular stain hue or scanner artifact with malignancy, achieving deceptively high internal accuracy but failing when applied to slides from other laboratories.

Recent studies have underscored this risk. Stacke et al. demonstrated that deep learning models trained on histopathology images from one scanner significantly underperform when evaluated on images from other scanners, even within the same cancer type [1]. Similarly, Tellez et al. showed that color normalization and stain augmentation techniques substantially improve cross-domain performance, highlighting the sensitivity of AI to staining variations [2].
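To make the idea of stain augmentation concrete, here is a minimal sketch in plain Python. It is not the method used in [2]; it assumes a simple random hue, saturation, and brightness jitter as a stand-in, and the function names, jitter ranges, and pixel values are all hypothetical. The point is only that each training tile is shown to the model under slightly different color conditions, so that stain hue becomes a less reliable shortcut.

```python
import colorsys
import random


def jitter_pixel(rgb, hue_shift, sat_scale, val_scale):
    """Apply one HSV perturbation to an (r, g, b) pixel with 0-255 channels."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0               # hue is circular, so wrap around
    s = min(1.0, max(0.0, s * sat_scale))   # clamp saturation to [0, 1]
    v = min(1.0, max(0.0, v * val_scale))   # clamp brightness to [0, 1]
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r, g, b))


def augment_tile(tile, rng, max_hue=0.03, max_scale=0.15):
    """Draw one random perturbation per tile and apply it to every pixel.

    The ranges are illustrative assumptions, not tuned or published values.
    """
    hue_shift = rng.uniform(-max_hue, max_hue)
    sat_scale = 1.0 + rng.uniform(-max_scale, max_scale)
    val_scale = 1.0 + rng.uniform(-max_scale, max_scale)
    return [
        [jitter_pixel(p, hue_shift, sat_scale, val_scale) for p in row]
        for row in tile
    ]


# A made-up 2x2 "tile" of pinkish-purple pixels, for illustration only.
rng = random.Random(0)
tile = [[(180, 90, 160), (220, 200, 230)],
        [(120, 60, 140), (240, 235, 245)]]
augmented = augment_tile(tile, rng)
```

In practice the perturbation would be applied to image arrays inside the training data loader, with a fresh draw every time a tile is sampled.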

These findings make one message clear: technical diversity in pathology imaging can undermine reproducibility unless actively managed through harmonization, standardization, and quality control practices.
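As a rough illustration of what harmonization can mean at the pixel level, the sketch below shifts and rescales each color channel of one scanner's pixels so that its mean and standard deviation match those of a reference. This is a deliberately simplified assumption: established color normalization methods operate in other color spaces or on estimated stain components, and the function names and pixel values here are hypothetical.

```python
from statistics import mean, pstdev


def channel_stats(pixels):
    """Return (mean, std) for each channel of a flat list of (r, g, b) pixels."""
    return [(mean(ch), pstdev(ch)) for ch in zip(*pixels)]


def match_to_reference(pixels, reference):
    """Map each channel of `pixels` onto the reference mean and spread."""
    src_stats = channel_stats(pixels)
    ref_stats = channel_stats(reference)
    matched = []
    for pixel in pixels:
        new_pixel = []
        for value, (s_mean, s_std), (r_mean, r_std) in zip(pixel, src_stats, ref_stats):
            # Avoid dividing by zero when a channel is constant.
            scale = r_std / s_std if s_std > 0 else 1.0
            mapped = (value - s_mean) * scale + r_mean
            new_pixel.append(min(255, max(0, round(mapped))))
        matched.append(tuple(new_pixel))
    return matched


# Made-up pixels: scanner_b is the reference with a constant per-channel offset.
reference = [(200, 120, 180), (220, 160, 210), (180, 90, 150), (240, 200, 230)]
scanner_b = [(170, 110, 190), (190, 150, 220), (150, 80, 160), (210, 190, 240)]
harmonized = match_to_reference(scanner_b, reference)
```

Because the toy offset is constant per channel, matching the statistics recovers the reference pixels exactly. Real scanner differences are not this simple, which is why harmonization is paired with quality control and multi-site validation.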

Beyond the Scanner: The Biological Dimension

While technical harmonization can stabilize model performance, biological diversity introduces an additional dimension of complexity. Differences in patient demographics, such as ethnicity, age, and sex, can influence tumor morphology and disease expression.

Models trained predominantly on homogeneous cohorts may fail to generalize to underrepresented populations, leading to uneven diagnostic outcomes. Achieving equity in AI-based diagnostics therefore requires datasets that capture both technical and biological diversity, reflecting not only how the data are produced but also who they represent.
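One practical way to check for uneven performance is to report accuracy per subgroup rather than as a single pooled number. The sketch below is a minimal example under stated assumptions: the record fields (`scanner`, `age_band`), the helper names, and the eight records are all invented for illustration and do not come from any real cohort.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Compute accuracy separately for each value of `key` in the records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[key]
        total[group] += 1
        correct[group] += int(record["label"] == record["prediction"])
    return {group: correct[group] / total[group] for group in total}


def worst_group_gap(per_group):
    """Difference between the best and worst performing group."""
    return max(per_group.values()) - min(per_group.values())


# Synthetic records: a technical attribute (scanner) and a biological one (age band).
records = [
    {"scanner": "A", "age_band": "<50", "label": 1, "prediction": 1},
    {"scanner": "A", "age_band": "50+", "label": 0, "prediction": 0},
    {"scanner": "A", "age_band": "50+", "label": 1, "prediction": 1},
    {"scanner": "A", "age_band": "<50", "label": 0, "prediction": 0},
    {"scanner": "B", "age_band": "<50", "label": 1, "prediction": 0},
    {"scanner": "B", "age_band": "50+", "label": 0, "prediction": 0},
    {"scanner": "B", "age_band": "50+", "label": 1, "prediction": 1},
    {"scanner": "B", "age_band": "<50", "label": 0, "prediction": 1},
]

overall = sum(r["label"] == r["prediction"] for r in records) / len(records)
by_scanner = accuracy_by_group(records, "scanner")
by_age = accuracy_by_group(records, "age_band")
```

In this toy data the pooled accuracy is 0.75, yet it is 1.0 on scanner A and 0.5 on scanner B, and the same split appears between the two age bands. A single headline metric would hide both gaps.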

From Diversity to Equity in AI

Technical and biological diversity are not separate problems; they are complementary challenges that must be addressed together to achieve trustworthy and generalizable AI. Models that perform reliably across scanners, stains, and patient backgrounds are essential for making precision oncology accessible to all.

At PAICON, we work to close this gap through robust data harmonization, multi-site validation, and inclusion of globally sourced datasets. These efforts aim to ensure that AI tools for cancer diagnostics perform consistently in every clinical environment.

Learn more about how PAICON addresses data diversity and validation in AI-driven oncology here

References

  1. Stacke K, Eilertsen G, Unger J, Lundström C. Measuring domain shift for deep learning in histopathology. IEEE J Biomed Health Inform. 2020;24(11):3253–62.

  2. Tellez D, Litjens G, Bándi P, Bulten W, Bokhorst JM, Ciompi F, van der Laak J. Quantifying the effects of data augmentation and stain color normalization in convolutional neural networks for computational pathology. Med Image Anal. 2019;58:101544.
