Enhancing Counterfactual Explanations for Improved Understanding

This article explores the concept of generating interpretable, diverse, and plausible counterfactual explanations within explainable AI (XAI). It highlights the challenges with current methods, introduces an instance-guided approach, and emphasizes the importance of good counterfactuals. The discussion covers the characteristics of good counterfactual explanations and related work in XAI, focusing on feature-based and example-based methods. The need for maximizing similarity, sparsity, availability, and diversity in counterfactuals is emphasized to provide valuable insights for users. Lastly, it examines different approaches, such as perturbation and instance-based methods, aiming to improve the quality of counterfactual explanations.

Uploaded by len_bra on Sep 11, 2024

Presentation Transcript

A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations. Norwegian University of Science and Technology, TDT55, Sep 26, 2023.

Context. Authors: Barry Smyth, University College Dublin, Ireland (AI: CBR, ML, recommender systems, user modelling and personalization); Mark T. Keane, University College Dublin, Ireland (cognitive science and AI: CBR, ML and XAI). Published 2021.

Counterfactual explanations. Counterfactuals can be used in XAI. Perturbation-based approaches are the most popular: they manipulate feature values to change the prediction, but can generate invalid data points or involve feature values that do not naturally occur. Instance-based approaches generate synthetic counterfactuals grounded in the dataset and built from naturally occurring feature values. Current methods are incomplete (they fail to identify good counterfactuals) and inefficient (there is no guarantee that they generate the best possible counterfactuals).

Goal. Present an instance-guided method for generating counterfactuals that addresses the shortcomings of other similar approaches. It is a case-based solution and covers multi-class problems.

Good counterfactual explanations. Similar: maximally similar to the target query. Sparse: differ in as few features as possible. Plausible: modify features/values that make sense to the user. Available: available for a majority of targets. Diverse: use a variety of features to offer counterfactuals that highlight different perspectives.
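The similarity and sparsity criteria can be checked mechanically. The following is a minimal sketch, assuming numeric feature vectors and Euclidean distance; the helper names `diff_features`, `distance` and `is_sparse`, the tolerance, and the example values are illustrative and not taken from the slides.

```python
from math import sqrt

def diff_features(a, b, tol=1e-9):
    """Indices of the features whose values differ between a and b."""
    return [i for i, (u, v) in enumerate(zip(a, b)) if abs(u - v) > tol]

def distance(a, b):
    """Euclidean distance; smaller means the counterfactual is more similar."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def is_sparse(a, b, max_diffs=2):
    """True if b changes at most max_diffs features of a."""
    return len(diff_features(a, b)) <= max_diffs

# Hypothetical query and candidate counterfactual.
query = (0.2, 0.5, 0.9, 0.1)
candidate = (0.2, 0.7, 0.9, 0.4)

changed = diff_features(query, candidate)  # features 1 and 3 differ
sparse = is_sparse(query, candidate)       # two changes, so within the limit
how_far = distance(query, candidate)
```

Plausibility, availability and diversity are properties of a whole set of counterfactuals over many targets, so they cannot be judged from a single pair like this.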

Related work. XAI in general: feature-based methods and example-based explanations. Counterfactual explanations: discussed from a psychological perspective and a legal perspective.

Related work. Counterfactuals, perturbation approaches: blind perturbation can generate counterfactuals that lack sparsity and diversity and that contain out-of-distribution feature values. Some methods modify the loss function to minimize the number of different features; others extend the optimization function to deal with diversity. Counterfactuals, instance-based approaches: FACE (Feasible and Actionable Counterfactual Explanations) and KS20.

Related work: KS20. A good counterfactual p′ to p exists if class(p′) ≠ class(p) and they differ in ≤ 2 features. Idea: good counterfactual pairs are rare, but any that do exist can be adapted in different ways to construct new good counterfactuals. Method: find a nearest like neighbour (NLN) q (class(p) = class(q)) such that there exists a q′ where class(q′) ≠ class(q) and q and q′ differ in ≤ 2 features. The differences between q and q′ are used to identify the feature values in p that need to be changed to produce p′.
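The KS20 adaptation step can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: it assumes numeric features, squared Euclidean distance for "nearest", and that the changed values in p are copied from q′. The function name `ks20_candidate` and the data are hypothetical.

```python
def diffs(a, b):
    """Indices where a and b differ (exact comparison, fine for discretised values)."""
    return [i for i in range(len(a)) if a[i] != b[i]]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def ks20_candidate(p, p_class, pairs):
    """pairs: list of (q, q_class, q_prime, q_prime_class) good counterfactual pairs.

    Returns a candidate p' built from the pair whose q is the nearest like
    neighbour of p, or None if no pair shares p's class.
    """
    like = [pair for pair in pairs if pair[1] == p_class]
    if not like:
        return None
    q, _, q_prime, _ = min(like, key=lambda pair: sq_dist(p, pair[0]))
    candidate = list(p)
    for i in diffs(q, q_prime):   # only the pair's difference features change
        candidate[i] = q_prime[i]
    return tuple(candidate)

pairs = [
    ((1.0, 2.1, 3.0), "A", (1.0, 2.1, 5.0), "B"),
    ((4.0, 4.0, 4.0), "A", (4.0, 0.0, 4.0), "B"),
]
p_prime = ks20_candidate((1.0, 2.0, 3.0), "A", pairs)
```

The candidate still has to be checked against the underlying classifier, since copying values does not guarantee that the class actually changes.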

Related work: KS20 limitations. It generates only one type of counterfactual, because the nearest counterfactual pair specifies one set of difference features. For multi-class problems it is desirable to consider counterfactuals from several classes. The nearest counterfactual pair q-q′ may not yield the best available counterfactual: a counterfactual generated from another counterfactual pair q2-q2′ may be even more similar to p. Counterfactuals from a pair other than the nearest one may also be preferable if their set of difference features is more plausible/actionable.

Method. Build on KS20 by considering the k > 1 nearest-neighbour counterfactual pairs. Definitions: I = set of training cases. An explanation case xc contains a target query instance x and a nearby counterfactual instance x′, where class(x) ≠ class(x′) and x and x′ differ in ≤ 2 features. xc is also associated with a set of match features (matches(x, x′)) and a set of difference features (diffs(x, x′)). XC contains the xc built from instances in I. Each xc acts as a template for generating new counterfactuals by identifying which features can be changed (diffs) and which cannot (matches).
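One way to represent explanation cases and build XC from I is sketched below. The `ExplanationCase` class, the `build_xc` helper and the choice to pair each x with its nearest qualifying unlike instance are assumptions made for illustration; the slides only say that x′ is "nearby".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExplanationCase:
    x: tuple
    x_class: str
    x_prime: tuple
    x_prime_class: str

    @property
    def diffs(self):
        """Feature indices that change between x and x' (may be altered)."""
        return tuple(i for i, (a, b) in enumerate(zip(self.x, self.x_prime)) if a != b)

    @property
    def matches(self):
        """Feature indices shared by x and x' (kept fixed)."""
        return tuple(i for i, (a, b) in enumerate(zip(self.x, self.x_prime)) if a == b)

def build_xc(instances, max_diffs=2):
    """instances: list of (features, label). Returns the explanation cases."""
    cases = []
    for x, x_class in instances:
        unlike = [
            (f, c) for f, c in instances
            if c != x_class and 0 < sum(a != b for a, b in zip(x, f)) <= max_diffs
        ]
        if not unlike:
            continue  # x has no good counterfactual in the data
        x_prime, x_prime_class = min(
            unlike, key=lambda fc: sum((a - b) ** 2 for a, b in zip(x, fc[0]))
        )
        cases.append(ExplanationCase(x, x_class, x_prime, x_prime_class))
    return cases

# Hypothetical training cases I.
I = [((1, 0, 0), "A"), ((1, 1, 0), "B"), ((0, 1, 1), "B"), ((1, 0, 1), "A")]
XC = build_xc(I)
```

Storing only x and x′ and deriving matches and diffs on demand keeps the two sets consistent by construction.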

Method: Algorithm. 1) Identify the k > 1 nearest xcs with d differences. 2) Construct candidate counterfactuals. 3) Validate the candidate counterfactuals.

Method: Algorithm (step 1). Find the nearest xcs based on the similarity between the target query p and each xc.x (class(p) = class(xc.x)). Select the k nearest neighbours.
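Step 1 amounts to a filtered k-nearest-neighbour lookup. The sketch below assumes explanation cases stored as dictionaries and Euclidean distance as the (inverse) similarity measure; the field names and the function `nearest_xcs` are hypothetical.

```python
from math import dist  # available from Python 3.8

def nearest_xcs(p, p_class, xcs, k):
    """Return the k explanation cases whose x is closest to p and shares p's class."""
    same_class = [xc for xc in xcs if xc["x_class"] == p_class]
    return sorted(same_class, key=lambda xc: dist(p, xc["x"]))[:k]

xcs = [
    {"x": (0.0, 0.0), "x_class": "A", "x_prime": (0.0, 1.0), "x_prime_class": "B"},
    {"x": (0.5, 0.5), "x_class": "A", "x_prime": (0.5, 0.0), "x_prime_class": "B"},
    {"x": (0.1, 0.1), "x_class": "B", "x_prime": (0.9, 0.1), "x_prime_class": "A"},
    {"x": (0.2, 0.1), "x_class": "A", "x_prime": (0.2, 0.8), "x_prime_class": "B"},
]
selected = nearest_xcs((0.1, 0.1), "A", xcs, k=2)
```

The third case is the closest to p overall but is skipped because its x belongs to a different class than p.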

Method: Algorithm (step 2). Define a set of nearest unlike neighbours, NUN. It consists of all xc.x′ and also includes the instances in I with the same class as xc.x′. Generate a counterfactual (cf) from each instance in the NUN set: use the sets matches and diffs for the instance with the target p. The counterfactual is the union of these sets (if the number of elements in diffs is d).
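Candidate construction for a single explanation case might look like the sketch below. It rests on one reading of the slide that should be treated as an assumption: the xc's difference features act as the template, match-feature values come from p, and difference-feature values come from each NUN. The size check (keep candidates that change at least one and at most d features) and the name `generate_candidates` are likewise illustrative.

```python
def diffs(a, b):
    """Indices where a and b differ."""
    return [i for i in range(len(a)) if a[i] != b[i]]

def generate_candidates(p, xc_x, xc_x_prime, xc_x_prime_class, instances, d=2):
    """instances: list of (features, label) training cases I."""
    # NUN set: x' itself plus every training instance sharing its class.
    nuns = [xc_x_prime] + [
        f for f, label in instances
        if label == xc_x_prime_class and f != xc_x_prime
    ]
    template = diffs(xc_x, xc_x_prime)  # features that are allowed to change
    candidates = []
    for nun in nuns:
        cf = list(p)                    # match features keep p's values
        for i in template:
            cf[i] = nun[i]              # difference features take the NUN's values
        cf = tuple(cf)
        if 0 < len(diffs(p, cf)) <= d and cf not in candidates:
            candidates.append(cf)
    return candidates

I = [
    ((1.0, 4.0, 3.5), "B"),
    ((0.0, 5.0, 9.0), "B"),
    ((1.0, 2.0, 0.0), "B"),
    ((2.0, 2.0, 2.0), "A"),
]
cands = generate_candidates(
    p=(1.0, 2.0, 3.0),
    xc_x=(1.0, 2.0, 3.5),
    xc_x_prime=(1.0, 4.0, 3.5),
    xc_x_prime_class="B",
    instances=I,
)
```

The third class-B instance yields no candidate because it agrees with p on the template feature, so nothing would change.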

Method: Algorithm (step 3). Validate that class(cf) ≠ class(p). If there are more than two classes, there are two alternatives: 1) the counterfactual is valid if its class differs from p's class; 2) M(cf) = class(xc.x′), where M is the underlying classifier, i.e., confirm that the counterfactual has the same class as the counterfactual instance used to produce it. The second alternative is a stronger test because it accepts only one class, compared to n − 1 valid classes in the first alternative.
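The two validation alternatives differ only in which predicted classes they accept. A minimal sketch, assuming the classifier M is available as a plain callable; `toy_model` is a made-up three-class stand-in and `is_valid` is a hypothetical helper name.

```python
def is_valid(cf, p_class, nun_class, model, strict=False):
    """Validate a candidate counterfactual against the classifier.

    strict=False: any class other than p's class is accepted (n - 1 classes).
    strict=True:  only the class of the counterfactual instance used to
                  build cf is accepted (exactly one class).
    """
    predicted = model(cf)
    if strict:
        return predicted == nun_class
    return predicted != p_class

def toy_model(instance):
    """Stand-in classifier: thresholds on the first feature."""
    if instance[0] < 1.0:
        return "A"
    if instance[0] < 2.0:
        return "B"
    return "C"

cf = (2.5, 0.0)
weak = is_valid(cf, p_class="A", nun_class="B", model=toy_model)
strong = is_valid(cf, p_class="A", nun_class="B", model=toy_model, strict=True)
```

Here the candidate lands in class C: it passes the first test because it left class A, but fails the second because it did not reach class B.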

Method. This method gives better coverage, plausibility and diversity than KS20. Coverage: generating more counterfactual candidates improves the chances of producing valid counterfactuals. Plausibility: the approach has the potential to generate counterfactuals that are even more similar to the target problem than those associated with a single nearest explanation case. Diversity: different explanation cases may rely on different combinations of match/difference features.

Experiments. 10 common datasets; d = 2 and d = 3; 1 ≤ k ≤ 100.

Evaluation. 10% of the training instances are used as target problems, XC consists of twice as many instances as there are target problems, and the remaining instances are used to train the underlying classifier.
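The split described here works out to roughly 10% targets, 20% for XC and 70% for the classifier. A small sketch, assuming a simple random shuffle; the function name, the seed and the use of integers as stand-in instances are illustrative choices, and the slides do not say how the instances were sampled.

```python
import random

def split_instances(instances, target_frac=0.1, seed=0):
    """Split into (targets, xc_pool, train) with the xc_pool twice the size of targets."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(instances)
    rng.shuffle(shuffled)
    n_targets = int(len(shuffled) * target_frac)
    n_xc = 2 * n_targets
    targets = shuffled[:n_targets]
    xc_pool = shuffled[n_targets:n_targets + n_xc]
    train = shuffled[n_targets + n_xc:]
    return targets, xc_pool, train

targets, xc_pool, train = split_instances(range(100))
```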

Evaluation metrics. Test coverage: fraction of target problems associated with a good counterfactual. Relative distance: ratio of the distance between the closest counterfactual produced and its target problem p to the distance between xc.x and xc.x′. A relative distance < 1 means the new counterfactual is closer to p than xc.x was to xc.x′. Feature diversity: fraction of unique difference features that appear in the produced counterfactuals.
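The three metrics can be computed as in the sketch below. Two points are assumptions rather than statements from the slides: distances are Euclidean, and feature diversity is normalised by the total number of features. The function names and example values are hypothetical.

```python
from math import dist  # available from Python 3.8

def coverage(cfs_per_target):
    """Fraction of targets that received at least one good counterfactual."""
    return sum(1 for cfs in cfs_per_target if cfs) / len(cfs_per_target)

def relative_distance(p, cfs, xc_x, xc_x_prime):
    """Distance from p to its closest counterfactual, relative to d(xc.x, xc.x')."""
    return min(dist(p, cf) for cf in cfs) / dist(xc_x, xc_x_prime)

def feature_diversity(p, cfs, n_features):
    """Share of features used as a difference feature by at least one counterfactual."""
    used = set()
    for cf in cfs:
        used.update(i for i in range(n_features) if cf[i] != p[i])
    return len(used) / n_features

p = (0.0, 0.0, 0.0, 0.0)
cfs = [(1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 2.0, 2.0)]

cov = coverage([cfs, [], cfs[:1], []])
rel = relative_distance(p, cfs, xc_x=(0.0, 0.0, 0.0, 1.0), xc_x_prime=(2.0, 0.0, 0.0, 1.0))
div = feature_diversity(p, cfs, n_features=4)
```

In this example the closest counterfactual is at distance 1 from p while the explanation-case pair is 2 apart, so the relative distance falls below 1.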

Results. Fig a: The ability to produce good counterfactuals generally increases with k. Fig b: The relative distance decreases as k increases. Thus, by considering additional explanation cases, even those that are further away from the target problem, the method can generate good counterfactuals that are closer to the target.

Results. Fig c: The feature diversity increases with k, but not all datasets produce counterfactuals with a high level of diversity. Fig d: The best value of k varies, but in practice values in the range 10 ≤ k ≤ 20 perform well in terms of coverage, relative distance and diversity for all datasets.

Conclusion. Contributions: a unifying approach for instance-based counterfactual generation and a systematic evaluation of instance-based techniques. Future research: prediction tasks, not only classification tasks, and a comparison with perturbation-based methods.