
Debiasing Counterfactuals in the Presence of Spurious Correlations

  • Conference paper
  • First Online:
Clinical Image-Based Procedures, Fairness of AI in Medical Imaging, and Ethical and Philosophical Issues in Medical Imaging (CLIP 2023, EPIMI 2023, FAIMI 2023)

Abstract

Deep learning models can perform well in complex medical imaging classification tasks even when they base their conclusions on spurious correlations (i.e. confounders) that are prevalent in the training dataset rather than on the causal image markers of interest, which limits their ability to generalize across the population. Explainability based on counterfactual image generation can expose such confounders but does not provide a strategy to mitigate the bias. In this work, we introduce the first end-to-end training framework that integrates both (i) popular debiasing classifiers (e.g. distributionally robust optimization (DRO)), to avoid latching onto spurious correlations, and (ii) counterfactual image generation, to unveil generalizable imaging markers relevant to the task. Additionally, we propose a novel metric, the Spurious Correlation Latching Score (SCLS), to quantify the extent of the classifier's reliance on the spurious correlation as exposed by the counterfactual images. Through comprehensive experiments on two public datasets (with simulated and real visual artifacts), we demonstrate that the debiasing method (i) learns generalizable markers across the population and (ii) successfully ignores spurious correlations and focuses on the underlying disease pathology.
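The full method is described in the paper itself; as a rough illustration of the two ingredients named in the abstract, the sketches below show (a) a group distributionally robust optimization (group-DRO) training step in the spirit of Sagawa et al. and (b) a classifier-guided, latent-shift style counterfactual probe. The model interfaces, group labels, and hyperparameters are assumptions made for illustration, not the authors' implementation.

A minimal group-DRO sketch, assuming per-sample group labels are available:

```python
import torch
import torch.nn.functional as F

def group_dro_step(model, optimizer, images, labels, group_ids,
                   group_weights, n_groups, eta=0.01):
    """One group-DRO update: upweight the worst-performing group, then
    minimize the reweighted loss so the classifier cannot succeed by
    latching onto a shortcut that only helps the majority group."""
    logits = model(images)
    per_sample_loss = F.cross_entropy(logits, labels, reduction="none")

    # Average loss within each group present in the batch.
    group_losses = []
    for g in range(n_groups):
        mask = group_ids == g
        group_losses.append(per_sample_loss[mask].mean() if mask.any()
                            else per_sample_loss.new_zeros(()))
    group_losses = torch.stack(group_losses)

    # Exponentiated-gradient update: weights of high-loss groups grow.
    group_weights = group_weights * torch.exp(eta * group_losses.detach())
    group_weights = group_weights / group_weights.sum()

    # Robust objective: weighted sum of per-group losses.
    loss = (group_weights * group_losses).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return group_weights, loss.item()
```

And a minimal counterfactual probe in the latent-shift style: walk an autoencoder's latent code along the classifier gradient until the prediction flips, then inspect what changed in the image. The `autoencoder.encode`/`decode` and `classifier` interfaces are assumed, not the paper's generator.

```python
import torch

def counterfactual_probe(autoencoder, classifier, image, target_class,
                         step=0.1, max_steps=50):
    """`image` is a single batched tensor (1, C, H, W). Returns the
    counterfactual and an absolute difference map; if the map highlights
    a visual artifact rather than the pathology, the classifier has
    latched onto the spurious correlation."""
    z = autoencoder.encode(image).detach().requires_grad_(True)
    for _ in range(max_steps):
        recon = autoencoder.decode(z)
        prob = torch.softmax(classifier(recon), dim=1)[0, target_class]
        if prob > 0.5:            # prediction has flipped to the target class
            break
        prob.backward()           # gradient of target-class probability w.r.t. z
        with torch.no_grad():
            z += step * z.grad    # shift the latent code toward the target class
        z.grad = None
    counterfactual = autoencoder.decode(z).detach()
    return counterfactual, (counterfactual - image).abs()
```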



Acknowledgements

The authors are grateful for funding provided by the Natural Sciences and Engineering Research Council of Canada, the Canadian Institute for Advanced Research (CIFAR) Artificial Intelligence Chairs program, the Mila - Quebec AI Institute technology transfer program, Microsoft Research, Calcul Quebec, and the Digital Research Alliance of Canada. S.A. Tsaftaris acknowledges the support of Canon Medical and the Royal Academy of Engineering via the Research Chairs and Senior Research Fellowships scheme (grant RCSRF1819/8/25), and of the UK's Engineering and Physical Sciences Research Council (EPSRC) via grant EP/X017680/1.

Author information


Corresponding author

Correspondence to Amar Kumar.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 304 KB)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kumar, A., et al. (2023). Debiasing Counterfactuals in the Presence of Spurious Correlations. In: Wesarg, S., et al. (eds.) Clinical Image-Based Procedures, Fairness of AI in Medical Imaging, and Ethical and Philosophical Issues in Medical Imaging. CLIP EPIMI FAIMI 2023. Lecture Notes in Computer Science, vol. 14242. Springer, Cham. https://doi.org/10.1007/978-3-031-45249-9_27


  • DOI: https://doi.org/10.1007/978-3-031-45249-9_27

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-45248-2

  • Online ISBN: 978-3-031-45249-9

  • eBook Packages: Computer Science, Computer Science (R0)
