Aleatoric and epistemic uncertainty in machine learning

This talk introduces the notions of aleatoric and epistemic uncertainty for probabilistic models and gives an overview of recent developments in this area. It is mainly based on the review paper by Hüllermeier and Waegeman (2021), listed in the references below.

Aleatoric uncertainty arises from inherent randomness or variability in the data, such as measurement noise or natural variation, and cannot be reduced by gathering more data. Epistemic uncertainty, on the other hand, stems from a lack of knowledge about the data-generating process or the model, such as model misspecification or insufficient training data, and can in principle be reduced by collecting more data or using better models. Unfortunately, in real-world applications it is often difficult to distinguish between the two types of uncertainty.
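
One common way to make this distinction operational is the entropy decomposition used with deep ensembles: the entropy of the averaged prediction (total uncertainty) splits into the average entropy of the individual members (aleatoric) and the disagreement among them (epistemic). The following is a minimal sketch under that assumption; the function name and example numbers are illustrative, not from the talk.

```python
import numpy as np

def uncertainty_decomposition(member_probs):
    """Entropy-based uncertainty split for an ensemble classifier.

    member_probs: array of shape (n_members, n_classes) with each
    ensemble member's predicted class probabilities for one input.
    Returns (total, aleatoric, epistemic) in nats, with
    total = aleatoric + epistemic.
    """
    eps = 1e-12  # guard against log(0)
    mean_probs = member_probs.mean(axis=0)
    # Total uncertainty: entropy of the averaged predictive distribution.
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Aleatoric part: average entropy of the individual members,
    # i.e. the noise each model sees in the data.
    aleatoric = -np.mean(
        np.sum(member_probs * np.log(member_probs + eps), axis=1)
    )
    # Epistemic part: the remainder, the mutual information between
    # the prediction and the choice of model (member disagreement).
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Members that agree yield low epistemic uncertainty; members that
# disagree yield high epistemic uncertainty at similar totals.
agree = np.array([[0.70, 0.30], [0.72, 0.28], [0.68, 0.32]])
disagree = np.array([[0.95, 0.05], [0.50, 0.50], [0.05, 0.95]])
print(uncertainty_decomposition(agree))
print(uncertainty_decomposition(disagree))
```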

Several methods for estimating and quantifying uncertainty are discussed, including calibration, likelihood-based methods, and conformal prediction. Less well-known methods, rooted outside of probability theory, are also introduced, along with their respective advantages and disadvantages.
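
As a concrete illustration of one of these techniques, here is a minimal sketch of split conformal prediction for regression, which turns a held-out calibration set of absolute residuals into a prediction interval with finite-sample coverage under exchangeability. The function name and data are hypothetical, not from the talk.

```python
import numpy as np

def split_conformal_interval(residuals_cal, y_pred_new, alpha=0.1):
    """Split conformal prediction interval for regression.

    residuals_cal: |y - y_hat| on a held-out calibration set.
    y_pred_new: point prediction(s) for new inputs.
    Returns (lower, upper) covering the true value with probability
    >= 1 - alpha, assuming the data are exchangeable.
    """
    n = len(residuals_cal)
    # Conformal quantile level with finite-sample correction.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    q = np.quantile(residuals_cal, min(q_level, 1.0), method="higher")
    return y_pred_new - q, y_pred_new + q

# Example with a synthetic calibration set of 100 residuals.
rng = np.random.default_rng(0)
residuals = np.abs(rng.normal(0, 1, size=100))
lower, upper = split_conformal_interval(residuals, y_pred_new=3.2, alpha=0.1)
print(lower, upper)
```

Note that the interval width is driven entirely by the calibration residuals, so conformal prediction quantifies total predictive uncertainty without relying on the model being well-specified.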

References

Eyke Hüllermeier and Willem Waegeman. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods. Machine Learning 110(3):457–506, 2021.
