
A pathologist–AI collaboration framework for enhancing diagnostic accuracies and efficiencies

Abstract

In pathology, the deployment of artificial intelligence (AI) in clinical settings is constrained by limitations in data collection and in model transparency and interpretability. Here we describe a digital pathology framework, nuclei.io, that incorporates active learning and human-in-the-loop real-time feedback for the rapid creation of diverse datasets and models. We validate the effectiveness of the framework via two crossover user studies that leveraged collaboration between the AI and the pathologist, including the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. In both studies, nuclei.io yielded considerable diagnostic performance improvements. Collaboration between clinicians and AI will aid digital pathology by enhancing accuracies and efficiencies.
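The active-learning loop at the core of the framework can be illustrated with a generic uncertainty-sampling sketch: a classifier trained on the pathologist's current labels proposes the nuclei it is least certain about, and those are routed back to the expert for labeling. This is a minimal, hypothetical illustration of the general technique, not nuclei.io's actual implementation; the feature vectors and class labels below are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def uncertainty_sampling_round(model, X_unlabeled, k=5):
    """Pick the k examples the model is least certain about for expert review."""
    proba = model.predict_proba(X_unlabeled)
    # Margin between the top-2 class probabilities: a small margin
    # means the model cannot decide, so the example is most informative.
    sorted_p = np.sort(proba, axis=1)
    margins = sorted_p[:, -1] - sorted_p[:, -2]
    return np.argsort(margins)[:k]

# Toy data standing in for nucleus feature vectors and pathologist labels.
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(40, 8))
y_labeled = (X_labeled[:, 0] > 0).astype(int)
X_pool = rng.normal(size=(100, 8))  # unlabeled nuclei awaiting review

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_labeled, y_labeled)
query_idx = uncertainty_sampling_round(model, X_pool, k=5)
print(query_idx)  # indices a pathologist would label next
```

In a human-in-the-loop system, the newly labeled examples are appended to the training set and the model is refit, so each round of feedback sharpens the next round's predictions.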


Fig. 1: Overview of the nuclei.io framework and of the study design.
Fig. 2: Evaluation of the performance of the pathologist–AI collaboration in the PC nuclei.io expert-trained ML study.
Fig. 3: Evaluation of the improvement in pathologist–AI collaboration time in the PC study.
Fig. 4: Study design of an individualized ML model for the detection of CRC LN metastasis.
Fig. 5: Time improvement and results of the individualized ML model.


Data availability

The data supporting the results in this study are available within the paper and its Supplementary Information. The deidentified nuclei image patches and pathologists’ annotations are available at https://huangzhii.github.io/nuclei-HAI. Source data are provided with this paper.

Code availability

The source code of nuclei.io is available at https://huangzhii.github.io/nuclei-HAI.


Acknowledgements

J.Z. is supported by the Chan-Zuckerberg Biohub Investigator Award. We thank M. Yuksekgonul and F. Bianchi for their helpful suggestions in improving our manuscript.

Author information

Authors and Affiliations

Authors

Contributions

Z.H. conducted study design, software development, experimental setup, data analysis, data visualization and manuscript writing. E.Y. provided numerous insights and participated in the PC study. J.S. provided numerous insights and participated in the CRC LN study. D.G. provided feedback on both studies and participated in the CRC LN study. F.E. helped collect data for the PC study and participated in it. B.L. and J.N. participated in both studies. D.B., A.M.D., C.K. and R.R. participated in the most time-consuming CRC LN study. A.G., A.L.C.-G., B.E.H., Y.L., E.E.R., T.B.T. and X.Z. participated in the second most time-consuming PC study. A.F. helped with data collection. E.J.F. and K.S.M. were partially involved in designing the study. T.J.M. and J.Z. oversaw the project, conducted study design, experimental setup, data analysis and manuscript writing.

Corresponding authors

Correspondence to Thomas J. Montine or James Zou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biomedical Engineering thanks Jakob Nikolas Kather and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Time comparison for colorectal cancer lymph node identification study.

(a) Overall time comparison between AI-assisted mode and unassisted mode. (b) Time comparison between AI-assisted mode and unassisted mode compared within lymph node positive cases (LN+) and lymph node negative cases (LN-). (c) Time comparison between AI-assisted mode and unassisted mode compared across the 8 pathologists. Note: A few slides were missed/skipped by some pathologists during the experiments and were thus excluded from the final comparison, leading to a reduced sample size (N < 137). (d) Time comparison between AI-assisted mode and unassisted mode stratified by different pathologist groups and lymph node status. P-values were calculated using a two-sided t-test without adjustment. For the boxplots, the interior horizontal line represents the median value, the upper and lower box edges represent the 75th and 25th percentile, and the upper and lower bars represent the 90th and 10th percentiles, respectively.

Source data
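The two-sided t-test used for the timing comparisons above can be reproduced with SciPy. The per-slide reading times below are simulated placeholders chosen only to illustrate the procedure; they are not the study's measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical per-slide reading times in seconds (placeholder values).
t_assisted = rng.normal(loc=40, scale=10, size=30)    # AI-assisted mode
t_unassisted = rng.normal(loc=55, scale=12, size=30)  # unassisted mode

# scipy's ttest_ind is two-sided by default, matching the caption.
stat, p = stats.ttest_ind(t_assisted, t_unassisted)
print(f"t = {stat:.2f}, p = {p:.4f}")
```

Because the comparison is unadjusted, each p-value stands alone; stratified comparisons (by lymph node status or by pathologist) simply repeat the same test within each subgroup.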

Extended Data Fig. 2 A lymph node from an experimental slide.

The experimental slide is used for evaluation, with tumor regions highlighted in green rectangles, and tumor cells highlighted in red scatters.

Extended Data Fig. 3 Evaluating the sensitivity of the individualized model to inspection errors (false negatives).

(a) Approach to calculating the ratio of positive nuclei inside the tumor region (green contour) to those in the lymph node; this ratio is also known as sensitivity (TP/P). (b) Comparison of this ratio across lymph nodes when false negatives appear. The tumor regions were manually annotated for all lymph node slides. Abbreviations: lymph node (LN), isolated tumor cells (ITC), micro-metastasis (micromet), macro-metastasis (macromet). P-values were calculated using a two-sided Spearman test without adjustment, via the Python ‘scipy’ package. For the boxplots, the interior horizontal line represents the median value, the upper and lower box edges represent the 75th and 25th percentiles, and the upper and lower bars represent the 90th and 10th percentiles, respectively.

Source data
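The sensitivity ratio and the two-sided Spearman test described above can be sketched with SciPy. The counts and metastasis-size classes below are hypothetical placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-lymph-node counts: positive nuclei inside the annotated
# tumor region (TP) and total positive nuclei in the node (P).
tp = np.array([12, 30, 5, 80, 2, 45])
p_total = np.array([15, 33, 9, 85, 6, 50])
sensitivity = tp / p_total  # the TP/P ratio from the caption

# Placeholder metastasis size classes: 0 = ITC, 1 = micromet, 2 = macromet.
size_class = np.array([0, 1, 0, 2, 0, 1])

# spearmanr is two-sided by default, matching the caption's description.
rho, pval = spearmanr(sensitivity, size_class)
print(f"rho = {rho:.3f}, p = {pval:.4f}")
```

Spearman's rank correlation is a natural choice here because the metastasis size classes are ordinal rather than continuous.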

Extended Data Fig. 4 A screenshot of the plasma cell classifier applied to an external slide from colorectal tissue.

In the screenshot, green squares are generated by the program, which are the prediction results for potential plasma cells (N = 27). Upon further manual verification, we highlighted five potential false positives with red circles.

Supplementary information

Supplementary Information

Supplementary figures and tables.

Reporting Summary

Source data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, Z., Yang, E., Shen, J. et al. A pathologist–AI collaboration framework for enhancing diagnostic accuracies and efficiencies. Nat. Biomed. Eng. (2024). https://doi.org/10.1038/s41551-024-01223-5

