Reference
Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration.
Advances in Neural Information Processing Systems 32 (2019)
Abstract
Class probabilities predicted by most multiclass classifiers are uncalibrated, often
tending towards over-confidence. With neural networks, calibration can be improved by temperature scaling, a method to learn a single corrective multiplicative
factor for inputs to the last softmax layer. On non-neural models the existing
methods apply binary calibration in a pairwise or one-vs-rest fashion. We propose
a natively multiclass calibration method applicable to classifiers from any model
class, derived from Dirichlet distributions and generalising the beta calibration
method from binary classification. It is easily implemented with neural nets since it
is equivalent to log-transforming the uncalibrated probabilities, followed by one linear layer and softmax. Experiments demonstrate improved probabilistic predictions
according to multiple measures (confidence-ECE, classwise-ECE, log-loss, Brier
score) across a wide range of datasets and classifiers. Parameters of the learned
Dirichlet calibration map provide insight into the biases of the uncalibrated model.
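The calibration map described in the abstract (log-transform the uncalibrated probabilities, then apply one linear layer and softmax) can be sketched as below. This is an illustrative sketch, not the authors' implementation; the function name, the clipping constant, and the identity/zero parameter choices in the usage note are assumptions.

```python
import numpy as np

def dirichlet_calibrate(probs, W, b):
    """Apply a Dirichlet calibration map: softmax(W @ ln(p) + b).

    probs : (n, k) array of uncalibrated class probabilities
    W     : (k, k) weight matrix of the linear layer
    b     : (k,) bias vector of the linear layer
    """
    eps = 1e-12  # clip to avoid log(0); the constant is an assumption
    logits = np.log(np.clip(probs, eps, 1.0)) @ W.T + b
    # numerically stable softmax over classes
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)
```

With W set to the identity matrix and b to the zero vector, the map is the identity on the probability simplex, so fitting W and b (e.g. by minimising log-loss on a validation set) recovers temperature scaling as a special case when W is a scalar multiple of the identity.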