Human-centered AI: how can we support end-users to
interact with AI?
TRAIL seminar – Paris – 7 April 2023
Katrien Verbert
Augment/HCI – Department of Computer Science – KU Leuven
@katrien_v
2
Thank you for the invitation!
Human-centered AI: how can we support end-users to interact with AI?
Human-Computer Interaction group
Explainable AI - recommender systems – visualization – intelligent user interfaces
Augment Katrien Verbert
ARIA Adalberto Simeone
Computer
Graphics
Phil Dutré
LIIR Sien Moens
NLP Miryam de Lhoneux
E-media
Vero Vanden Abeele
Luc Geurts
Human-centered AI
5
Human-Centered AI (HCAI) is an emerging discipline intent on creating AI
systems that amplify and augment rather than displace human abilities.
HCAI seeks to preserve human control in a way that ensures artificial
intelligence meets our needs while also operating transparently, delivering
equitable outcomes, and respecting privacy.
https://research.ibm.com/blog/what-is-human-centered-ai
 Explaining model outcomes to increase user trust and acceptance
 Enabling users to interact with the explanation process to improve the model
New forms of human-AI interactions
Models
Explaining prediction models
7
Gutiérrez, F., Ochoa, X., Seipp, K., Broos, T., & Verbert, K. (2019). Benefits and trade-offs of different
model representations in decision support systems for non-expert users. In Human-Computer Interaction–
INTERACT 2019
Learning
analytics &
human
resources
Media
consumption
Wellbeing,
food &
health
Precision
agriculture
FinTech &
Insurtech
What do end-users really need?
Explanations
9
Millecamp, M., Htun, N. N., Conati, C., & Verbert, K. (2019, March). To explain or not to explain: the
effects of personal characteristics when explaining music recommendations. In Proceedings of the 2019
Conference on Intelligent User Interfaces (pp. 397-407). ACM.
media
Personal characteristics
Need for cognition
• Measurement of the tendency for an individual to engage in, and enjoy, effortful cognitive activities
• Measured by test of Cacioppo et al. [1984]
Visualisation literacy
• Measurement of the ability to interpret and make meaning from information presented in the form of
images and graphs
• Measured by test of Boy et al. [2014]
Locus of control (LOC)
• Measurement of the extent to which people believe they have power over events in their lives
• Measured by test of Rotter et al. [1966]
Visual working memory
• Measurement of the ability to recall visual patterns [Tintarev and Masthoff, 2016]
• Measured by Corsi block-tapping test
Musical experience
• Measurement of the ability to engage with music in a flexible, effective and nuanced way
[Müllensiefen et al., 2014]
• Measured using the Goldsmiths Musical Sophistication Index (Gold-MSI)
Tech savviness
• Measured by confidence in trying out new technology
10
Study design
 Within-subjects design: 105 participants recruited with Amazon Mechanical Turk
 Baseline version (without explanations) compared with explanation interface
 Pre-study questionnaire for all personal characteristics
 Task: based on a chosen scenario for creating a playlist, explore songs and rate all songs in the final playlist
 Post-study questionnaire:
 Recommender effectiveness
 Trust
 Good understanding
 Use intentions
 Novelty
 Satisfaction
 Confidence
Results
12
The interaction effect between NFC (divided into
4 quartiles Q1-Q4) and interfaces in terms of confidence
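As a sketch of how such an interaction effect can be inspected, the snippet below (hypothetical column names and values, not the study's analysis code) splits NFC scores into quartiles and compares mean confidence per quartile and interface:

```python
import pandas as pd

# Hypothetical results table: one row per participant and interface condition.
df = pd.DataFrame({
    "nfc": [12, 25, 33, 41, 18, 29, 36, 45],
    "interface": ["baseline", "baseline", "baseline", "baseline",
                  "explanation", "explanation", "explanation", "explanation"],
    "confidence": [3, 4, 4, 5, 4, 4, 3, 2],
})

# Split NFC into quartiles Q1-Q4 and compare mean confidence per
# quartile and interface, mirroring the reported interaction plot.
df["nfc_quartile"] = pd.qcut(df["nfc"], 4, labels=["Q1", "Q2", "Q3", "Q4"])
print(df.groupby(["nfc_quartile", "interface"], observed=True)["confidence"].mean())
```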
Design implications
 Explanations should be personalised for different groups of end-
users.
 Users should be able to choose whether or not they want to see
explanations.
 Explanation components should be flexible enough to present
varying levels of details depending on a user’s preference.
13
User control
Users tend to be more satisfied when they have control over
how recommender systems produce suggestions for them
Control recommendations
Douban FM
Control user profile
Spotify
Control algorithm parameters
TasteWeights
media
Controllability vs. cognitive load
Additional controls may increase cognitive load
(Andjelkovic et al. 2016)
Ivana Andjelkovic, Denis Parra, and John O’Donovan. 2016. Moodplay: Interactive mood-based music
discovery and recommendation. In Proc. of UMAP’16. ACM, 275–279.
Different levels of user control
16
Level | Recommender component | Controls
low | Recommendations (REC) | Rating, removing, and sorting
medium | User profile (PRO) | Select which user profile data will be considered by the recommender
high | Algorithm parameters (PAR) | Modify the weight of different parameters
Jin, Y., Tintarev, N., & Verbert, K. (2018, September). Effects of personal characteristics on music recommender
systems with different levels of controllability. In Proceedings of the 12th ACM Conference on Recommender
Systems (pp. 13-21). ACM.
User profile (PRO) Algorithm parameters (PAR) Recommendations (REC)
8 control settings
No control
REC
PAR
PRO
REC*PRO
REC*PAR
PRO*PAR
REC*PRO*PAR
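To make the 2x2x2 factorial structure behind these eight settings concrete, a minimal sketch (illustrative only) that enumerates all conditions:

```python
from itertools import product

# The three binary factors of the 2x2x2 design: each control
# component (REC, PRO, PAR) is either present or absent.
components = ["REC", "PRO", "PAR"]

# Enumerate all 8 conditions, from "No control" to REC*PRO*PAR.
for flags in product([False, True], repeat=3):
    enabled = [c for c, on in zip(components, flags) if on]
    print("*".join(enabled) if enabled else "No control")
```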
Study design
 Between-subjects – 240 participants recruited with AMT
 Independent variable: settings of user control
 2x2x2 factorial design
 Dependent variables:
 Acceptance (ratings)
 Cognitive load (NASA-TLX), Musical Sophistication, Visual Memory
 Framework of Knijnenburg et al. [2012]
Results
 Main effects: from REC to PRO to PAR → higher cognitive load
 Two-way interactions do not necessarily result in higher cognitive load: adding an additional control component to PAR increases acceptance, and PRO*PAR induces less cognitive load than PRO or PAR alone
 High musical sophistication leads to higher perceived quality and thereby higher acceptance
19
Jin, Y., Tintarev, N., & Verbert, K. (2018, September). Effects of personal characteristics on music
recommender systems with different levels of controllability. In Proceedings of the 12th ACM Conference on
Recommender Systems (pp. 13-21). ACM.
What if the stakes are higher?
20
Learning
analytics &
human
resources
Media
consumption
health
Precision
agriculture
FinTech &
Insurtech
Learning analytics
Src: Steve Schoettler
22
Gutiérrez Hernández F., Seipp K., Ochoa X., Chiluiza K., De Laet T., Verbert K. (2018). LADA: A learning
analytics dashboard for academic advising. Computers in Human Behavior, pp 1-13. doi:
10.1016/j.chb.2018.12.004
LADA: a learning analytics dashboard for
study advisors
Study design
23
Evaluation @KU Leuven Monitoraat
N = 12
6 Experts (4F, 2M)
6 Laymen (1F, 5M)
Evaluation @ESPOL (Ecuador)
N = 14
8 Experts (3F, 5M)
6 Laymen (6M)
Results
What worked
✚ Valuable tool for more accurate and efficient decision-making.
✚ Users evaluated significantly more scenarios.
What didn’t work
− More transparency needed to increase trust.
− Model didn’t behave as expected.
− LADA didn’t meet our users’ needs.
24
Gutiérrez Hernández F., Seipp K., Ochoa X., Chiluiza K., De Laet T., Verbert K. (2018). LADA: A learning
analytics dashboard for academic advising. Computers in Human Behavior, pp 1-13. doi:
10.1016/j.chb.2018.12.004
Strategies
25
Design science research
26
Fraefel, U. (2014, November). Professionalization of pre-service teachers through university-school partnerships. In
Conference Proceedings of WERA Focal Meeting, Edinburgh.
Data-centric explanations
Charleer, S., Moere, A. V., Klerkx, J., Verbert, K., & De Laet, T. (2018). Learning analytics
dashboards to support adviser-student dialogue. IEEE Transactions on Learning
Technologies, 11(3), 389-399.
Do not oversimplify: show uncertainty
 reality is complex
 measurement is limited
 individual circumstances
 need for nuance
 trigger reflection
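One lightweight way to surface such uncertainty, in the spirit of the dashboards' use of opacity (see Editor's Notes), is sketched below with made-up data; the encoding maps lower confidence to lower alpha:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical predictions with uncertainty: more uncertain
# predictions are drawn more transparently (lower alpha).
students = ["A", "B", "C", "D"]
predicted = np.array([0.82, 0.55, 0.64, 0.31])    # predicted success
uncertainty = np.array([0.05, 0.30, 0.15, 0.40])  # higher = less sure

alphas = 1.0 - uncertainty  # opacity encodes confidence
for x, (p, a) in enumerate(zip(predicted, alphas)):
    plt.bar(x, p, alpha=a, color="steelblue")
plt.xticks(range(len(students)), students)
plt.ylabel("Predicted probability of success")
plt.title("Opacity encodes prediction uncertainty")
plt.show()
```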
29
Charleer S., Gutiérrez Hernández F., Verbert K. (2019). Supporting job mediator and job seeker through an actionable dashboard. In:
Proceedings of the 24th International Conference on Intelligent User Interfaces (ACM IUI 2019).
Actionable explanations
30
User-centred design
31
Final evaluation
32
66 job seekers (age 33.9 ± 9.5, 18F)
8 Training Programs, 4 Groups, 1 Hour.
ResQue Questionnaire + two open questions.
Users explored the tool freely.
All interactions were logged.
33
Results
34
Results
Take-away messages
 Explanations contribute to user empowerment
 Key difference between actionable and non-actionable
parameters
 Need for customization and contextualization
 Need for simplification
35
Strategies
36
Explaining
model
behaviour
In-situ
decision
support
Explaining model behavior
37
Precision
agriculture
AHMoSe
Rojo, D., Htun, N. N., Parra, D., De Croon, R., & Verbert, K. (2021). AHMoSe: A knowledge-based visual
support system for selecting regression machine learning models. Computers and Electronics in Agriculture,
187, 106183.
AHMoSe Visual Encodings
39
Model Explanations
(SHAP)
Model + Knowledge Summary
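As an illustration of the SHAP-based model explanations that AHMoSe visualizes, a minimal sketch with a hypothetical model and made-up data (not the AHMoSe code or the vineyard dataset):

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-in for grape-quality data: 48 cells and a few
# vineyard features (made up, not the actual study dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(48, 4))
y = X[:, 0] * 0.8 - X[:, 2] * 0.3 + rng.normal(scale=0.1, size=48)

model = RandomForestRegressor(random_state=0).fit(X, y)

# TreeExplainer yields one additive contribution per feature and per
# cell; AHMoSe-style views summarize these next to expert knowledge.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(shap_values.shape)  # (48, 4): contribution per cell and feature
```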
Case Study – Grape Quality Prediction
40
 Grape Quality Prediction Scenario [Tag14]
 Data
 Years 2010 and 2011 (train), 2012 (test)
 48 cells (Central Greece)
 Knowledge-based rules
[Tag14] Tagarakis, A., et al. "A fuzzy inference system to model
grape quality in vineyards." Precision Agriculture 15.5 (2014):
555-578.
Source: [Tag14]
Simulation Study
 AHMoSe vs full AutoML approach to support model selection.
41
Scenario | RMSE (AutoML) | RMSE (AHMoSe) | Difference
Scenario A (complete knowledge) | 0.430 | 0.403 | ▼ 6.3%
Scenario B (incomplete knowledge) | 0.458 | 0.385 | ▼ 16.0%
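For reference, the RMSE reported above is computed as follows; the values in the snippet are made up, not the study data:

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up example values, only to show the computation.
print(rmse(np.array([3.0, 4.2, 5.1]), np.array([2.8, 4.5, 5.0])))
```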
Qualitative Evaluation
 10 open-ended questions
 5 viticulture experts and 4 ML experts.
 Thematic Analysis: potential use cases, trust, usability, and
understandability.
Qualitative Evaluation - Trust
43
 Showing the dis/agreement of model outputs with experts’ knowledge can promote trust.
“The thing that makes us trust the models is the fact that most of
the time, there is a good agreement between the values
predicted by the model and the ones obtained for the knowledge
of the experts.”
– Viticulture Expert
In-situ decision support
44
45
https://augment.cs.kuleuven.be/demos
Design and Evaluation
46
Gutiérrez F., Cardoso B., Verbert K. (2017). PHARA: a personal health augmented reality assistant to support
decision-making at grocery stores. In: Proceedings of the International Workshop on Health Recommender
Systems co-located with ACM RecSys 2017 (Paper No. 4) (10-13).
What if the stakes are really high?
Learning
analytics &
human
resources
Media
consumption
health
Precision
agriculture
FinTech &
Insurtech
Explaining predictions
48
https://www.imec-int.com/en/what-we-offer/research-portfolio/discrete
health
RECOMMENDER ALGORITHMS
MACHINE LEARNING
INTERACTIVE DASHBOARDS
SMART ALERTS
RICH CARE PLANS
OPEN IoT ARCHITECTURE
Explaining predictions health
50
Gutiérrez Hernández, F. S., Htun, N. N., Vanden Abeele, V., De Croon, R., & Verbert, K. (2021). Explaining call
recommendations in nursing homes: a user-centered design approach for interacting with knowledge-based
health decision support systems. In Proceedings of the 26th International Conference on Intelligent User Interfaces.
ACM.
Explaining predictions health
Evaluation
 12 nurses used the app for three months
 Data collection
 Interaction logs
 ResQue questionnaire
 Semi-structured interviews
51
 12 nurses during 3 months
52
Results
 Iterative design process identified several important features, such as the pending list, the overview, and the feedback shortcut to encourage feedback.
 Explanations seem to contribute to better support for healthcare professionals.
 Results indicate a better understanding of the call notifications, as users could see the reasons for the calls.
 More trust in the recommendations and increased perceptions of transparency and control.
 Interaction patterns indicate that users engaged well with the interface, although some users did not use all features to interact with the system.
 Need for further simplification and personalization.
53
54
55
Explaining recommendations
Word cloud Feature importance Feature importance+ %
Szymanski, M., Vanden Abeele, V., & Verbert, K. (2022). Explaining health recommendations to lay users:
The dos and don’ts. APEx-UI workshop at ACM IUI 2022.
health
Why do users (dis)like the explanation?
56
Textual or visual?
57
health
Study design
58
Results
 Hybrid explanations were rated more useful than both the textual and the visual explanations.
 Users with a higher NFC tend to score the hybrid explanations lower in terms of trust, transparency and usefulness compared to unimodal explanations.
59
Results
 Participants with low NFC have a better perception of hybrid
explanations
 Participants with high NFC have a better perception of
unimodal explanations
60
61
Data-centric explanations Health
Combining XAI methods to address different
dimensions of explainability
 Increasing actionability through interactive what-if analysis
 Explanations through actionable features instead of non-
actionable features
 Color-coded visual indicators for easy identification of patients
with high risk
 Data-centric directive explanations
62
Bhattacharya, A., Ooge, J., Stiglic, G., & Verbert, K. (2023, March). Directive Explanations for Monitoring the Risk of Diabetes
Onset: Introducing Directive Data-Centric Explanations and Combinations to Support What-If Explorations. In Proceedings of the
28th International Conference on Intelligent User Interfaces (pp. 204-219).
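To illustrate the kind of interactive what-if exploration over actionable features described above, a minimal sketch with a hypothetical risk model and made-up features (not the system from the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical diabetes-risk toy data: [BMI, physical activity (h/week)],
# both treated as actionable features. Entirely made up for illustration.
rng = np.random.default_rng(1)
X = rng.normal(loc=[28, 3], scale=[4, 1.5], size=(200, 2))
y = (X[:, 0] - 2 * X[:, 1] + rng.normal(size=200) > 22).astype(int)

model = LogisticRegression().fit(X, y)

# What-if: lower BMI by 2 for one patient and compare predicted risk.
patient = np.array([[31.0, 2.0]])
what_if = patient + np.array([[-2.0, 0.0]])
before = model.predict_proba(patient)[0, 1]
after = model.predict_proba(what_if)[0, 1]
print(f"risk before: {before:.2f}, after BMI -2: {after:.2f}")
```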
Next steps: model steering
63
Steering Approaches
Steering approaches
64
Feature selection / Feature filtering
Steering Approaches
65
Awareness through:
• Scores and measures
• Visualizations
• Issue description
Steering approaches
66
Next steps: Fair AI
FinTech - InsurTech
Data-centric explanation methods for fraud detection
Explanations in high-stakes domains will become mandatory under EU regulations
Transparent and interactive data matching
 Insurance premium simulations
 Link with external data sources
E.g. occupational accidents, absenteeism data
67
https://human-centered.ai/project/explainable-ai-fwf-32554/
Take-away messages
 Involvement of end-users has been key to coming up with interfaces tailored to the needs of non-expert users
 Actionable vs non-actionable parameters
 Domain expertise of users and need for cognition are important personal characteristics
 Need for personalisation and simplification
 Data-centric explanations provide a powerful solution
68
Collaborations
Peter Brusilovsky, Nava Tintarev, Cristina Conati, Denis Parra, Bart Knijnenburg, Jurgen Ziegler
Questions?
katrien.verbert@cs.kuleuven.be
@katrien_v
Thank you!
https://augment.cs.kuleuven.be/
Editor's Notes
1. The prediction model shows the impact of the food product on the weight of the participant. Opacity is used to represent the uncertainty of this prediction. (Point to third card.)
2. “Insight vs. information overload”: most users prefer more information (a holistic overview of inputs), but some users experienced information overload. → Future work: do personal characteristics such as NFC influence this?