Mitigating Calibration Bias Without Fixed Attribute Grouping for Improved Fairness in Medical Imaging Analysis

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 (MICCAI 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14222))

Abstract

Trustworthy deployment of deep learning medical imaging models in real-world clinical practice requires that they be calibrated. However, models that are well calibrated overall can still be poorly calibrated for a sub-population, potentially leading a clinician to unwittingly make poor decisions for this group based on the model's recommendations. Although existing methods have been shown to mitigate biases across subgroups in terms of model accuracy, this work focuses on the open problem of mitigating calibration biases in the context of medical image analysis. Our method does not require subgroup attributes during training, permitting the flexibility to mitigate biases for different choices of sensitive attributes without re-training. To this end, we propose a novel two-stage method, Cluster-Focal, which first identifies poorly calibrated samples and clusters them into groups, and then introduces a group-wise focal loss to reduce calibration bias. We evaluate our method on skin lesion classification with the public HAM10000 dataset, and on predicting future lesional activity for multiple sclerosis (MS) patients. In addition to considering traditional sensitive attributes (e.g., age, sex) with demographic subgroups, we also consider biases among groups with different image-derived attributes, such as lesion load, which are required in medical image analysis. Our results demonstrate that our method effectively controls calibration error in the worst-performing subgroups while preserving prediction performance, and that it outperforms recent baselines.
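The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes an equal-width-bin expected calibration error (ECE) as the calibration metric, the standard focal loss -(1 - p_t)^gamma * log(p_t), and a simple "mean of per-cluster means" aggregation for the group-wise loss. The clustering stage is omitted, so `cluster_ids` is taken as given. All function names, the bin count, and the default `gamma` are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    bins = defaultdict(list)
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    n = len(confidences)
    ece = 0.0
    for members in bins.values():
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


def worst_group_ece(confidences, correct, groups, n_bins=10):
    """Return the subgroup with the highest ECE and the per-subgroup ECEs."""
    by_group = defaultdict(lambda: ([], []))
    for conf, hit, g in zip(confidences, correct, groups):
        by_group[g][0].append(conf)
        by_group[g][1].append(hit)
    per_group = {
        g: expected_calibration_error(c, h, n_bins)
        for g, (c, h) in by_group.items()
    }
    worst = max(per_group, key=per_group.get)
    return worst, per_group


def focal_loss(p_true, gamma):
    """Focal loss for one sample, where p_true is the probability of the true class."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def group_wise_focal_loss(p_true, cluster_ids, gamma=2.0):
    """Assumed aggregation: average the mean focal loss of each cluster,
    so that small clusters weigh as much as large ones."""
    per_cluster = defaultdict(list)
    for p, k in zip(p_true, cluster_ids):
        per_cluster[k].append(focal_loss(p, gamma))
    cluster_means = [sum(v) / len(v) for v in per_cluster.values()]
    return sum(cluster_means) / len(cluster_means)
```

Because `worst_group_ece` takes the grouping as an argument at evaluation time, the same predictions can be audited against age, sex, or an image-derived attribute such as lesion load without retraining, which mirrors the flexibility the abstract describes.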

C. Shui and J. Szeto—Equal contribution.

Acknowledgements

This paper was supported by the Canada Institute for Advanced Research (CIFAR) AI Chairs program and the Natural Sciences and Engineering Research Council of Canada (NSERC). The MS portion of this paper was supported by the International Progressive Multiple Sclerosis Alliance (PA-1412-02420), the companies who generously provided the MS data: Biogen, BioMS, MedDay, Novartis, Roche/Genentech, and Teva, Multiple Sclerosis Society of Canada, Calcul Quebec, and the Digital Research Alliance of Canada.

Corresponding author

Correspondence to Changjian Shui.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 105 KB)

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shui, C., Szeto, J., Mehta, R., Arnold, D.L., Arbel, T. (2023). Mitigating Calibration Bias Without Fixed Attribute Grouping for Improved Fairness in Medical Imaging Analysis. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14222. Springer, Cham. https://doi.org/10.1007/978-3-031-43898-1_19

  • DOI: https://doi.org/10.1007/978-3-031-43898-1_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43897-4

  • Online ISBN: 978-3-031-43898-1

  • eBook Packages: Computer Science, Computer Science (R0)
