Comparative Study

JAMA. 2017 Dec 12;318(22):2199-2210.

doi: 10.1001/jama.2017.14585.

Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer

Babak Ehteshami Bejnordi¹, Mitko Veta², Paul Johannes van Diest³, Bram van Ginneken¹, Nico Karssemeijer¹, Geert Litjens⁴, Jeroen A W M van der Laak⁴; the CAMELYON16 Consortium; Meyke Hermsen⁴, Quirine F Manson³, Maschenka Balkenhol⁴, Oscar Geessink⁴,⁵, Nikolaos Stathonikos³, Marcory C R F van Dijk⁶, Peter Bult⁴, Francisco Beca⁷, Andrew H Beck⁷,⁸, Dayong Wang⁷,⁸, Aditya Khosla⁸,⁹, Rishab Gargeya¹⁰, Humayun Irshad⁷, Aoxiao Zhong¹¹, Qi Dou¹¹,¹², Quanzheng Li¹¹, Hao Chen¹², Huang-Jing Lin¹², Pheng-Ann Heng¹², Christian Haß¹³, Elia Bruni¹³, Quincy Wong¹⁴, Ugur Halici¹⁵,¹⁶, Mustafa Ümit Öner¹⁵, Rengul Cetin-Atalay¹⁷, Matt Berseth¹⁸, Vitali Khvatkov¹⁹, Alexei Vylegzhanin¹⁹, Oren Kraus²⁰, Muhammad Shaban²¹, Nasir Rajpoot²¹,²², Ruqayya Awan²³, Korsuk Sirinukunwattana²¹, Talha Qaiser²¹, Yee-Wah Tsang²², David Tellez⁴, Jonas Annuscheit²⁴, Peter Hufnagl²⁴, Mira Valkonen²⁵, Kimmo Kartasalo²⁴,²⁶, Leena Latonen²⁷, Pekka Ruusuvuori²⁴,²⁸, Kaisa Liimatainen²⁴, Shadi Albarqouni²⁹, Bharti Mungal²⁹, Ami George²⁹, Stefanie Demirci²⁹, Nassir Navab²⁹, Seiryo Watanabe³⁰, Shigeto Seno³⁰, Yoichi Takenaka³⁰, Hideo Matsuda³⁰, Hady Ahmady Phoulady³¹, Vassili Kovalev³², Alexander Kalinovsky³², Vitali Liauchuk³², Gloria Bueno³³, M Milagro Fernandez-Carrobles³³, Ismael Serrano³³, Oscar Deniz³³, Daniel Racoceanu³⁴,³⁵, Rui Venâncio³⁶

Affiliations

¹ Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, the Netherlands.
² Medical Image Analysis Group, Eindhoven University of Technology, Eindhoven, the Netherlands.
³ Department of Pathology, University Medical Center Utrecht, Utrecht, the Netherlands.
⁴ Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands.
⁵ Laboratorium Pathologie Oost Nederland, Hengelo, the Netherlands.
⁶ Rijnstate Hospital, Arnhem, the Netherlands.
⁷ BeckLab, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts.
⁸ PathAI, Cambridge, Massachusetts.
⁹ Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts.
¹⁰ Harker School, San Jose, California.
¹¹ Center for Clinical Data Science, Gordon Center for Medical Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts.
¹² Chinese University of Hong Kong, Hong Kong, China.
¹³ ExB Research and Development GmbH, Munich, Germany.
¹⁴ Munich Business School, Munich, Germany.
¹⁵ Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Turkey.
¹⁶ Neuroscience and Neurotechnology, Graduate School of Natural and Applied Sciences, Middle East Technical University, Ankara, Turkey.
¹⁷ Cancer System Biology Laboratory, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey.
¹⁸ NLP LOGIX, Jacksonville, Florida.
¹⁹ Smart Imaging Technologies, Houston, Texas.
²⁰ Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario, Canada.
²¹ Tissue Image Analytics Lab, Department of Computer Science, University of Warwick, Coventry, United Kingdom.
²² Department of Pathology, University Hospitals Coventry and Warwickshire National Health Service Foundation Trust, Coventry, United Kingdom.
²³ Department of Computer Science and Engineering, Qatar University, Doha, Qatar.
²⁴ Hochschule für Technik und Wirtschaft, Berlin, Germany.
²⁵ BioMediTech Institute and Faculty of Medicine and Life Sciences, Tampere University of Technology, Tampere, Finland.
²⁶ BioMediTech Institute and Faculty of Biomedical Science and Engineering, Tampere University of Technology, Tampere, Finland.
²⁷ Prostate Cancer Research Center, Faculty of Medicine and Life Sciences and BioMediTech, University of Tampere, Tampere, Finland.
²⁸ Faculty of Computing and Electrical Engineering, Tampere University of Technology, Pori, Finland.
²⁹ Technical University of Munich, Munich, Germany.
³⁰ Department of Bioinformatic Engineering, Osaka University.
³¹ University of South Florida, Tampa, Florida.
³² Biomedical Image Analysis Department, United Institute of Informatics Problems, Belarus National Academy of Sciences, Minsk, Belarus.
³³ Visilab, University of Castilla-La Mancha, Ciudad Real, Spain.
³⁴ INSERM, Laboratoire d'Imagerie Biomédicale, Sorbonne Universités, Pierre and Marie Curie University, Paris, France.
³⁵ Pontifical Catholic University of Peru, San Miguel, Lima, Peru.
³⁶ Sorbonne University, Pierre and Marie Curie University, Paris, France.

PMID: 29234806
PMCID: PMC5820737
DOI: 10.1001/jama.2017.14585

Babak Ehteshami Bejnordi et al. JAMA. 2017.

Abstract

Importance: Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency.

Objective: Assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin-stained tissue sections of lymph nodes of women with breast cancer and compare it with pathologists' diagnoses in a diagnostic setting.

Design, setting, and participants: Researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining was provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC).

Exposures: Deep learning algorithms submitted as part of a challenge competition or pathologist interpretation.

Main outcomes and measures: The presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor.

Results: The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P < .001). The top 5 algorithms had a mean AUC that was comparable with the pathologist interpreting the slides in the absence of time constraints (mean AUC, 0.960 [range, 0.923-0.994] for the top 5 algorithms vs 0.966 [95% CI, 0.927-0.998] for the pathologist WOTC).
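As an illustration of how an AUC can be obtained from ordinal confidence ratings such as the 5-level scale described above, the following is a minimal sketch. It assumes the rank-based (Mann-Whitney) definition of AUC, which may differ from the exact statistical procedure used in the study; the function name and the toy data are hypothetical and do not come from the article.

```python
from itertools import product

# Ordinal confidence scale described in the abstract, lowest to highest.
LEVELS = ["definitely normal", "probably normal", "equivocal",
          "probably tumor", "definitely tumor"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def auc_from_ratings(ratings, has_metastasis):
    """AUC as the probability that a randomly chosen positive slide is
    rated higher than a randomly chosen negative slide (ties count 0.5)."""
    pos = [RANK[r] for r, y in zip(ratings, has_metastasis) if y]
    neg = [RANK[r] for r, y in zip(ratings, has_metastasis) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT study data): 3 positive and 3 negative slides.
toy_ratings = ["definitely tumor", "probably tumor", "equivocal",
               "equivocal", "probably normal", "definitely normal"]
toy_truth = [True, True, True, False, False, False]
toy_auc = auc_from_ratings(toy_ratings, toy_truth)
```

In the toy data, 8 of the 9 positive-negative pairs are ordered correctly and 1 is tied, so the sketch yields 8.5/9.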

Conclusions and relevance: In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting.

Conflict of interest statement

Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Veta reported receiving grant funding from Netherlands Organization for Scientific Research. Dr van Ginneken reported being a co-founder of and holding shares in Thirona and receiving grant funding and royalties from Mevis Medical Solutions. Dr Karssemeijer reported holding shares in Volpara Solutions, QView Medical, and ScreenPoint Medical BV; receiving consulting fees from QView Medical; and being an employee of ScreenPoint Medical BV. Dr van der Laak reported receiving personal fees from Philips, ContextVision, and Diagnostic Services Manitoba. Dr Manson reported receiving grant funding from Dutch Cancer Society. Mr Geessink reported receiving grant funding from Dutch Cancer Society. Dr Beca reported receiving personal fees from PathAI and Nvidia and owning stock in Nvidia. Dr Li reported receiving grant funding from the National Institutes of Health. Dr Ruusuvuori reported receiving grant funding from Finnish Funding Agency for Innovation. No other disclosures were reported.

Figures

**Figure 1. FROC Curves of the Top 5 Performing Algorithms vs Pathologist WOTC for the Metastases Identification Task (Task 1) From the CAMELYON16 Competition**
CAMELYON16 indicates Cancer Metastases in Lymph Nodes Challenge 2016; CULab, Chinese University Lab; FROC, free-response receiver operator characteristic; HMS, Harvard Medical School; MGH, Massachusetts General Hospital; MIT, Massachusetts Institute of Technology; WOTC, without time constraint. The range on the x-axis is linear between 0 and 0.125 (blue) and base 2 logarithmic scale between 0.125 and 8. Teams were those organized in the CAMELYON16 competition. Task 1 was measured on the 129 whole-slide images in the test data set, of which 49 contained metastatic regions. The pathologist did not produce any false-positives and achieved a true-positive fraction of 0.724 for detecting and localizing metastatic regions.
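
A single point on an FROC curve pairs a lesion-level true-positive fraction with a mean number of false-positives per normal slide at one score threshold. The sketch below shows that arithmetic only. It assumes each detection has already been matched to an annotated lesion (or to none) by some upstream rule that is not reproduced here; the tuple layout, function name, and toy detections are hypothetical and are not taken from the challenge code or data.

```python
def froc_point(detections, n_lesions, n_normal_slides, threshold):
    """Return (true-positive fraction, mean false-positives per normal slide).

    detections: iterable of (score, lesion_id, on_normal_slide), where
    lesion_id is None when the detection does not hit an annotated lesion.
    """
    kept = [d for d in detections if d[0] >= threshold]
    # A lesion counts once, no matter how many detections fall inside it.
    found = {lesion for _, lesion, _ in kept if lesion is not None}
    false_pos = sum(1 for _, lesion, normal in kept
                    if lesion is None and normal)
    return len(found) / n_lesions, false_pos / n_normal_slides


# Hypothetical toy detections (NOT study data): 4 lesions, 80 normal slides.
toy_detections = [
    (0.95, "L1", False),
    (0.90, "L2", False),
    (0.85, "L1", False),  # second hit on the same lesion
    (0.70, None, True),   # false positive on a normal slide
    (0.60, "L3", False),
    (0.30, None, True),   # dropped at a threshold of 0.5
]
tpf, fp_per_normal = froc_point(toy_detections, n_lesions=4,
                                n_normal_slides=80, threshold=0.5)
```

With these toy inputs, 3 of 4 lesions are found and 1 false positive is spread over 80 normal slides, which gives 1/80 = 0.0125 false-positives per normal slide. Sweeping the threshold traces out the full curve.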

**Figure 2. Probability Maps Generated by the Top 3 Algorithms From the CAMELYON16 Competition**
For abbreviations, see the legend of Figure 3. The color scale bar (top right) indicates the probability for each pixel to be part of a metastatic region. For additional examples, see eFigure 5 in the Supplement. A, Four annotated micrometastatic regions in whole-slide images of hematoxylin and eosin–stained lymph node tissue sections taken from the test set of Cancer Metastases in Lymph Nodes Challenge 2016 (CAMELYON16) dataset. B-D, Probability maps from each team overlaid on the original images.

**Figure 3. ROC Curves of the Top-Performing Algorithms vs Pathologists for Metastases Classification (Task 2) From the CAMELYON16 Competition**
AUC indicates area under the receiver operating characteristic curve; CAMELYON16, Cancer Metastases in Lymph Nodes Challenge 2016; CULab, Chinese University Lab; HMS, Harvard Medical School; MGH, Massachusetts General Hospital; MIT, Massachusetts Institute of Technology; ROC, receiver operator characteristic; WOTC, without time constraint; WTC, with time constraint. The blue in the axes on the left panels corresponds with the blue on the axes in the right panels. Task 2 was measured on the 129 whole-slide images (for the algorithms and the pathologist WOTC) and corresponding glass slides (for the 11 pathologists WTC) in the test data set, of which 49 contained metastatic regions. A, A machine-learning system achieves superior performance to a pathologist if the operating point of the pathologist lies below the ROC curve of the system. The top 2 deep learning–based systems outperform all the pathologists WTC in this study. All the pathologists WTC scored glass slide images using 5 levels of confidence: definitely normal, probably normal, equivocal, probably tumor, definitely tumor. To generate estimates of sensitivity and specificity for each pathologist, negative was defined as confidence levels of definitely normal and probably normal; all others as positive. B, The mean ROC curve was computed using the pooled mean technique. This mean is obtained by joining all the diagnoses of the pathologists WTC and computing the resulting ROC curve as if it were 1 person analyzing 11 × 129 = 1419 cases.
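
The dichotomization and pooling described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: the function names and the toy ratings are hypothetical. Only the rule that "definitely normal" and "probably normal" count as negative, and the idea of treating all readers' diagnoses as if they came from 1 person, are taken from the legend.

```python
# Confidence levels treated as a negative call, per the figure legend.
NEGATIVE_LEVELS = {"definitely normal", "probably normal"}


def sensitivity_specificity(ratings, truth):
    """Operating point after collapsing the 5-level scale to 2 classes."""
    tp = fn = tn = fp = 0
    for rating, has_metastasis in zip(ratings, truth):
        called_positive = rating not in NEGATIVE_LEVELS
        if has_metastasis:
            tp += called_positive
            fn += not called_positive
        else:
            fp += called_positive
            tn += not called_positive
    return tp / (tp + fn), tn / (tn + fp)


def pool_readers(ratings_by_reader, truth):
    """Join all readers' ratings as if 1 person had read every case."""
    pooled_ratings = [r for reader in ratings_by_reader for r in reader]
    pooled_truth = list(truth) * len(ratings_by_reader)
    return pooled_ratings, pooled_truth


# Hypothetical toy data (NOT study data): 2 readers, 4 slides.
toy_truth = [True, True, False, False]
reader_a = ["definitely tumor", "probably normal",
            "equivocal", "definitely normal"]
reader_b = ["probably tumor", "equivocal",
            "probably normal", "definitely normal"]
pooled_ratings, pooled_truth = pool_readers([reader_a, reader_b], toy_truth)
pooled_sens, pooled_spec = sensitivity_specificity(pooled_ratings, pooled_truth)
```

Pooling 2 readers over 4 slides gives 8 cases here, in the same way that 11 readers over 129 slides give 1419 cases. The same pooled ratings could be fed to an ROC routine to obtain a mean curve.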

Comment in

Deep Learning Algorithms for Detection of Lymph Node Metastases From Breast Cancer: Helping Artificial Intelligence Be Seen.
Golden JA. JAMA. 2017 Dec 12;318(22):2184-2186. doi: 10.1001/jama.2017.14580. PMID: 29234791.
Using Free-Response Receiver Operating Characteristic Curves to Assess the Accuracy of Machine Diagnosis of Cancer.
Moskowitz CS. JAMA. 2017 Dec 12;318(22):2250-2251. doi: 10.1001/jama.2017.18686. PMID: 29234793.
Not Just Digital Pathology, Intelligent Digital Pathology.
Acs B, Rimm DL. JAMA Oncol. 2018 Mar 1;4(3):403-404. doi: 10.1001/jamaoncol.2017.5449. PMID: 29392271.
AI diagnostics need attention.
[No authors listed]. Nature. 2018 Mar 15;555(7696):285. doi: 10.1038/d41586-018-03067-x. PMID: 29542717.
Machine Learning Compared With Pathologist Assessment.
van Smeden M, Van Calster B, Groenwold RHH. JAMA. 2018 Apr 24;319(16):1725-1726. doi: 10.1001/jama.2018.1466. PMID: 29710156.
