Reference
Shapley Residuals: Quantifying the limits of the Shapley value for explanations,
Advances in Neural Information Processing Systems (2021)
Abstract
Popular feature importance techniques compute additive approximations to nonlinear models by first defining a cooperative game describing the value of different
subsets of the model’s features, then calculating the resulting game’s Shapley
values to attribute credit additively between the features. However, the specific
modeling settings in which the Shapley values are a poor approximation for the
true game have not been well-described. In this paper we utilize an interpretation
of Shapley values as the result of an orthogonal projection between vector spaces
to calculate a residual representing the kernel component of that projection. We
provide an algorithm for computing these residuals, characterize different modeling
settings based on the value of the residuals, and demonstrate that they capture information about model predictions that Shapley values cannot. Shapley residuals can
thus act as a warning to practitioners against overestimating the degree to which
Shapley-value-based explanations give them insight into a model.
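To make the core idea concrete, below is a minimal, self-contained sketch (not the paper's algorithm): it computes a toy three-player game's Shapley values by exact enumeration, builds the additive game those values induce, and measures the gap between the two. This gap is a crude global analogue of the residual the abstract describes; the paper itself defines per-feature residuals via an orthogonal projection of the game on the hypercube. The game definition and function names here are illustrative assumptions.

```python
import numpy as np
from itertools import combinations
from math import comb

def shapley_values(v, n):
    """Exact Shapley values of a cooperative game v: frozenset -> float."""
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                S = frozenset(S)
                # Standard Shapley weight |S|!(n-1-|S|)!/n! = 1/(n * C(n-1, |S|))
                weight = 1.0 / (n * comb(n - 1, size))
                phi[i] += weight * (v[S | {i}] - v[S])
    return phi

# Toy 3-player game with a pure interaction between players 0 and 1:
# v(S) = 2*x0 + 1*x2 + 3*(x0 AND x1), where xi = 1 iff player i is in S.
n = 3
v = {}
for size in range(n + 1):
    for S in combinations(range(n), size):
        S = frozenset(S)
        v[S] = 2.0 * (0 in S) + 1.0 * (2 in S) + 3.0 * (0 in S and 1 in S)

phi = shapley_values(v, n)  # -> [3.5, 1.5, 1.0]

# Additive summary of the game implied by the Shapley values:
# v_hat(S) = v(empty) + sum of phi_i over i in S.
# The leftover v - v_hat is the part no additive attribution can represent.
residual = {S: v[S] - (v[frozenset()] + sum(phi[i] for i in S)) for S in v}

print("Shapley values:", phi)
print("residual norm:", np.linalg.norm(list(residual.values())))
```

If the game were additive (here, if the interaction coefficient were zero), the residual would vanish, matching the abstract's point that Shapley values fully describe a game only in such settings; a nonzero residual flags exactly the interaction structure the additive attribution discards.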