Showing 1–20 of 20 results for author: Ribeiro, M T

  1. arXiv:2306.03280

    cs.HC

    AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms

    Authors: Zana Buçinca, Chau Minh Pham, Maurice Jakesch, Marco Tulio Ribeiro, Alexandra Olteanu, Saleema Amershi

    Abstract: While demands for change and accountability for harmful AI consequences mount, foreseeing the downstream effects of deploying AI systems remains a challenging task. We developed AHA! (Anticipating Harms of AI), a generative framework to assist AI practitioners and decision-makers in anticipating potential harms and unintended consequences of AI systems prior to development or deployment. Given an…

    Submitted 5 June, 2023; originally announced June 2023.

  2. arXiv:2305.17804

    cs.CL

    Targeted Data Generation: Finding and Fixing Model Weaknesses

    Authors: Zexue He, Marco Tulio Ribeiro, Fereshte Khani

    Abstract: Even when aggregate accuracy is high, state-of-the-art NLP models often fail systematically on specific subgroups of data, resulting in unfair outcomes and eroding user trust. Additional data collection may not help in addressing these weaknesses, as such challenging subgroups may be unknown to users, and underrepresented in the existing and new data. We propose Targeted Data Generation (TDG), a f…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  3. arXiv:2305.12219

    cs.LG cs.AI cs.CL

    Collaborative Development of NLP models

    Authors: Fereshte Khani, Marco Tulio Ribeiro

    Abstract: Despite substantial advancements, Natural Language Processing (NLP) models often require post-training adjustments to enforce business rules, rectify undesired behavior, and align with user values. These adjustments involve operationalizing "concepts"--dictating desired model responses to certain inputs. However, it's difficult for a single entity to enumerate and define all possible concepts, ind…

    Submitted 24 May, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

  4. arXiv:2304.09991

    cs.HC cs.AI cs.CL

    Supporting Human-AI Collaboration in Auditing LLMs with LLMs

    Authors: Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi

    Abstract: Large language models are becoming increasingly pervasive and ubiquitous in society via deployment in sociotechnical systems. Yet these language models, be it for classification or generation, have been shown to be biased and behave irresponsibly, causing harm to people at scale. It is crucial to audit these language models rigorously. Existing auditing tools leverage either or both humans and AI…

    Submitted 30 November, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 21 pages, 3 figures

    Journal ref: In Proceedings of the 2023 AAAI and ACM Conference on AI, Ethics, and Society. Association for Computing Machinery, New York, NY, USA, 913-926

  5. arXiv:2303.12712

    cs.CL cs.AI

    Sparks of Artificial General Intelligence: Early experiments with GPT-4

    Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

    Abstract: Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an earl…

    Submitted 13 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  6. arXiv:2303.09014

    cs.CL

    ART: Automatic multi-step reasoning and tool-use for large language models

    Authors: Bhargavi Paranjape, Scott Lundberg, Sameer Singh, Hannaneh Hajishirzi, Luke Zettlemoyer, Marco Tulio Ribeiro

    Abstract: Large language models (LLMs) can perform complex reasoning in few- and zero-shot settings by generating intermediate chain of thought (CoT) reasoning steps. Further, each reasoning step can rely on external tools to support computation beyond the core LLM capabilities (e.g. search/running code). Prior work on CoT prompting and tool use typically requires hand-crafting task-specific demonstrations…

    Submitted 15 March, 2023; originally announced March 2023.

  7. ScatterShot: Interactive In-context Example Curation for Text Transformation

    Authors: Tongshuang Wu, Hua Shen, Daniel S. Weld, Jeffrey Heer, Marco Tulio Ribeiro

    Abstract: The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples, resulting in underspecified in-context functions that fall short on unseen cases. Further, it is hard to know when "enough" examples have been included even for kno…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: IUI 2023: 28th International Conference on Intelligent User Interfaces

  8. arXiv:2212.04089

    cs.LG cs.CL cs.CV

    Editing Models with Task Arithmetic

    Authors: Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi

    Abstract: Changing how pre-trained models behave -- e.g., improving their performance on a downstream task or mitigating biases learned during pre-training -- is a common practice when developing machine learning systems. In this work, we propose a new paradigm for steering the behavior of neural networks, centered around task vectors. A task vector specifies a direction in the weight space of a pr…

    Submitted 31 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: In Proceedings of the 11th International Conference on Learning Representations (ICLR 2023)

  9. arXiv:2212.02774

    cs.CV

    Adaptive Testing of Computer Vision Models

    Authors: Irena Gao, Gabriel Ilharco, Scott Lundberg, Marco Tulio Ribeiro

    Abstract: Vision models often fail systematically on groups of data that share common semantic characteristics (e.g., rare objects or unusual scenes), but identifying these failure modes is a challenge. We introduce AdaVision, an interactive process for testing vision models which helps users identify and fix coherent failure modes. Given a natural language description of a coherent group, AdaVision retriev…

    Submitted 16 August, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: ICCV camera-ready

  10. arXiv:2211.03318

    cs.CL

    Fixing Model Bugs with Natural Language Patches

    Authors: Shikhar Murty, Christopher D. Manning, Scott Lundberg, Marco Tulio Ribeiro

    Abstract: Current approaches for fixing systematic problems in NLP models (e.g. regex patches, finetuning on more data) are either brittle, or labor-intensive and liable to shortcuts. In contrast, humans often provide corrections to each other through natural language. Taking inspiration from this, we explore natural language patches -- declarative statements that allow developers to provide corrective feed…

    Submitted 20 November, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022 [Fixed fig-1]

  11. arXiv:2205.00130

    cs.CL cs.LG

    ExSum: From Local Explanations to Model Understanding

    Authors: Yilun Zhou, Marco Tulio Ribeiro, Julie Shah

    Abstract: Interpretability methods are developed to understand the working mechanisms of black-box models, which is crucial to their responsible deployment. Fulfilling this goal requires both that the explanations generated by these methods are correct and that people can easily and reliably understand them. While the former has been addressed in prior work, the latter is often overlooked, resulting in info…

    Submitted 29 April, 2022; originally announced May 2022.

    Comments: NAACL 2022. The project website is at https://yilunzhou.github.io/exsum/

  12. arXiv:2106.02112

    cs.LG

    Finding and Fixing Spurious Patterns with Explanations

    Authors: Gregory Plumb, Marco Tulio Ribeiro, Ameet Talwalkar

    Abstract: Image classifiers often use spurious patterns, such as "relying on the presence of a person to detect a tennis racket", which do not generalize. In this work, we present an end-to-end pipeline for identifying and mitigating spurious patterns for such models, under the assumption that we have access to pixel-wise object-annotations. We start by identifying patterns such as "the model's prediction fo…

    Submitted 17 August, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  13. arXiv:2104.14403

    cs.LG cs.CV

    Do Feature Attribution Methods Correctly Attribute Features?

    Authors: Yilun Zhou, Serena Booth, Marco Tulio Ribeiro, Julie Shah

    Abstract: Feature attribution methods are popular in interpretable machine learning. These methods compute the attribution of each input feature to represent its importance, but there is no consensus on the definition of "attribution", leading to many competing methods with little systematic evaluation, complicated in particular by the lack of ground truth attribution. To address this, we propose a dataset…

    Submitted 15 December, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: AAAI 2022. Video summary at https://www.youtube.com/watch?v=kAodFw6jvvo

  14. arXiv:2101.00288

    cs.CL

    Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models

    Authors: Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, Daniel S. Weld

    Abstract: While counterfactual examples are useful for analysis and training of NLP models, current generation methods either rely on manual labor to create very few counterfactuals, or only instantiate limited types of perturbations such as paraphrases or word substitutions. We present Polyjuice, a general-purpose counterfactual generator that allows for control over perturbation types and locations, train…

    Submitted 1 June, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: ACL 2021, main conference, long paper

  15. arXiv:2006.14779

    cs.AI cs.CL cs.HC cs.LG

    Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance

    Authors: Gagan Bansal, Tongshuang Wu, Joyce Zhou, Raymond Fok, Besmira Nushi, Ece Kamar, Marco Tulio Ribeiro, Daniel S. Weld

    Abstract: Many researchers motivate explainable AI with studies showing that human-AI team performance on decision-making tasks improves when the AI explains its recommendations. However, prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team. Can explanations help lead to complementary performance, where team accuracy is higher than eith…

    Submitted 12 January, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: CHI'21

  16. arXiv:2005.04118

    cs.CL cs.LG

    Beyond Accuracy: Behavioral Testing of NLP models with CheckList

    Authors: Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh

    Abstract: Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP mod…

    Submitted 8 May, 2020; originally announced May 2020.

    Journal ref: Association for Computational Linguistics (ACL), 2020

  17. arXiv:1611.07579

    stat.ML cs.AI cs.LG

    Programs as Black-Box Explanations

    Authors: Sameer Singh, Marco Tulio Ribeiro, Carlos Guestrin

    Abstract: Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility. However, it is not clear what kind of explanations, such as linear models, decision trees, and rule lists, are the appropriate family to consider, and different tasks and models may benefit from differe…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  18. arXiv:1611.05817

    stat.ML cs.AI cs.LG

    Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance

    Authors: Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

    Abstract: At the core of interpretable machine learning is the question of whether humans are able to make accurate predictions about a model's behavior. Assumed in this question are three properties of the interpretable output: coverage, precision, and effort. Coverage refers to how often humans think they can predict the model's behavior, precision to how accurate humans are in those predictions, and effo…

    Submitted 17 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  19. arXiv:1606.05386

    stat.ML cs.LG

    Model-Agnostic Interpretability of Machine Learning

    Authors: Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

    Abstract: Understanding why machine learning models behave the way they do empowers both system designers and end-users in many ways: in model selection, feature engineering, in order to trust and act upon the predictions, and in more intuitive user interfaces. Thus, interpretability has become a vital concern in machine learning, and work in the area of interpretable models has found renewed interest. In s…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

  20. arXiv:1602.04938

    cs.LG cs.AI stat.ML

    "Why Should I Trust You?": Explaining the Predictions of Any Classifier

    Authors: Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

    Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy…

    Submitted 9 August, 2016; v1 submitted 16 February, 2016; originally announced February 2016.