Easy Uncertainty Quantification (EasyUQ): Generating Predictive Distributions from Single-Valued Model Output
SIAM Review (IF 10.8), Pub Date: 2024-02-08, DOI: 10.1137/22m1541915. Eva-Maria Walz, Alexander Henzi, Johanna Ziegel, Tilmann Gneiting
SIAM Review, Volume 66, Issue 1, Pages 91-122, February 2024.
How can we quantify uncertainty if our favorite computational tool---be it a numerical, statistical, or machine learning approach, or just any computer model---provides single-valued output only? In this article, we introduce the Easy Uncertainty Quantification (EasyUQ) technique, which transforms real-valued model output into calibrated statistical distributions, based solely on training data of model output--outcome pairs, without any need to access model input. In its basic form, EasyUQ is a special case of the recently introduced isotonic distributional regression (IDR) technique that leverages the pool-adjacent-violators algorithm for nonparametric isotonic regression. EasyUQ yields discrete predictive distributions that are calibrated and optimal in finite samples, subject to stochastic monotonicity. The workflow is fully automated, without any need for tuning. The Smooth EasyUQ approach supplements IDR with kernel smoothing, to yield continuous predictive distributions that preserve key properties of the basic form, including both stochastic monotonicity with respect to the original model output and asymptotic consistency. For the selection of kernel parameters, we introduce multiple one-fit grid search, a computationally much less demanding approximation to leave-one-out cross-validation. We use simulation examples and forecast data from weather prediction to illustrate the techniques. In a study of benchmark problems from machine learning, we show how EasyUQ and Smooth EasyUQ can be integrated into the workflow of neural network learning and hyperparameter tuning, and we find EasyUQ to be competitive with conformal prediction as well as more elaborate input-based approaches.
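The core construction described above can be illustrated with a small sketch. Basic EasyUQ estimates, for each threshold z among the observed outcomes, the conditional probability P(Y ≤ z | model output) by an isotonic (here antitonic, by stochastic monotonicity) regression of the indicators 1{y_i ≤ z} on the model outputs x_i, solved with the pool-adjacent-violators algorithm; Smooth EasyUQ then smooths the resulting discrete distribution with a kernel. The following toy implementation (all function names are hypothetical; the actual method handles ties, interpolation between training outputs, and bandwidth selection far more carefully) only illustrates the idea:

```python
import math

def pav_nondecreasing(values):
    """Pool-adjacent-violators: least-squares nondecreasing fit."""
    blocks = []  # each block holds [pooled mean, pooled count]
    for v in values:
        m, w = float(v), 1.0
        while blocks and blocks[-1][0] > m:  # violator: pool with predecessor
            m0, w0 = blocks.pop()
            m = (m0 * w0 + m * w) / (w0 + w)
            w += w0
        blocks.append([m, w])
    fit = []
    for m, w in blocks:
        fit.extend([m] * int(w))
    return fit

def easyuq_fit(x, y):
    """Basic EasyUQ sketch: for each threshold z among the observed
    outcomes, estimate P(Y <= z | x_i) by antitonic regression of the
    indicators 1{y_i <= z} on the sorted model outputs x_i."""
    pairs = sorted(zip(x, y))
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    zs = sorted(set(ys))
    cdf_rows = []
    for z in zs:
        ind = [1.0 if b <= z else 0.0 for b in ys]
        # P(Y <= z | x) is nonincreasing in x, so run PAV on the reversed order
        cdf_rows.append(pav_nondecreasing(ind[::-1])[::-1])
    # per-training-point discrete CDFs over the common support zs
    per_obs = [[cdf_rows[k][i] for k in range(len(zs))] for i in range(len(xs))]
    return xs, zs, per_obs

def easyuq_predict(model, x_new):
    """Predictive CDF for a new model output: take the fitted CDF at the
    nearest training output (a crude rule; IDR proper interpolates)."""
    xs, zs, per_obs = model
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x_new))
    return zs, per_obs[i]

def smooth_pdf(zs, cdf_vals, y, h=0.5):
    """Smooth-EasyUQ flavour: convolve the discrete predictive
    distribution with a Gaussian kernel of bandwidth h."""
    dens, prev = 0.0, 0.0
    for z, f in zip(zs, cdf_vals):
        w, prev = f - prev, f  # jump of the discrete CDF at support point z
        dens += w * math.exp(-0.5 * ((y - z) / h) ** 2) / (h * math.sqrt(2 * math.pi))
    return dens
```

Because PAV is an order-preserving operator, the fitted CDF values are automatically nondecreasing across thresholds, so each training point receives a valid discrete predictive distribution; this is the finite-sample calibration-and-monotonicity structure the abstract refers to, here in miniature.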
Updated: 2024-02-08
如果我们最喜欢的计算工具(无论是数值、统计或机器学习方法,还是任何计算机模型),我们如何量化不确定性仅提供单值输出?在本文中,我们介绍了简单不确定性量化(EasyUQ)技术,该技术仅基于模型输出-结果对的训练数据,将实值模型输出转换为校准的统计分布,而无需访问模型输入。就其基本形式而言,EasyUQ 是最近引入的等渗分布回归 (IDR) 技术的一个特例,该技术利用池相邻违规者算法进行非参数等渗回归。 EasyUQ 产生离散预测分布,这些分布在有限样本中经过校准和优化,并服从随机单调性。工作流程完全自动化,无需任何调整。 Smooth EasyUQ 方法通过核平滑补充了 IDR,以产生连续的预测分布,保留基本形式的关键属性,包括相对于原始模型输出的随机单调性和渐近一致性。对于内核参数的选择,我们引入了多个一次性网格搜索,这是一种计算要求较低的留一交叉验证近似。我们使用模拟示例和天气预报的预测数据来说明这些技术。在机器学习基准问题的研究中,我们展示了 EasyUQ 和 Smooth EasyUQ 如何集成到神经网络学习和超参数调整的工作流程中,我们发现 EasyUQ 与保形预测以及更精细的基于输入的方法相比具有竞争力。