Inferring heterogeneous treatment effects of crashes on highway traffic: A doubly robust causal machine learning approach
Transportation Research Part C: Emerging Technologies (IF 7.6), Pub Date: 2024-03-01, DOI: 10.1016/j.trc.2024.104537
Shuang Li, Ziyuan Pu, Zhiyong Cui, Seunghyeon Lee, Xiucheng Guo, Dong Ngoduy
Accurately estimating the causal effects of crashes on highway traffic is crucial for mitigating their negative impacts. Previous studies have developed a range of methods, based on traditional causal inference theory and on machine learning, to estimate these impacts. Because the structures and variable dimensions of traditional causal inference models are pre-defined, they cannot accommodate the characteristics of individual crashes; they can only estimate average causal effects for crashes in certain categories, e.g., crash type, crash severity, and location. Machine learning-based algorithms, in turn, cannot be used for causal reasoning because they rely on correlation rather than causation. However, since the impacts of crashes on traffic status vary with influential factors such as time period and location, heterogeneous causal effects are essential for better understanding those impacts and for developing crash intervention strategies. To address these issues, this study proposes a novel doubly robust causal machine learning framework to infer heterogeneous treatment effects of crashes on highway traffic status. Doubly Robust Learning (DRL), which integrates machine learning techniques to perform the predictive tasks, is applied in the framework because of its stronger robustness. Considering that treatment predictors and colliders may bias the estimation results, a Conditional Shapley Value Index (CSVI) is proposed for selecting confounders from numerous factors. A 3-year crash dataset of 3594 real highway crashes in Washington is used to demonstrate the designed experiments, including constructing confidence intervals, evaluating estimation errors, and analyzing the sensitivity of variable selection to various CSVI thresholds. The results capture the distinctive propagation and dissipation processes of congestion caused by various types of crashes. They also validate the effectiveness of variable selection and show superior estimation accuracy compared with the selected baseline models. Future work includes considering spatial–temporal causal relationships and predicting counterfactual real-time traffic conditions.
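To make the doubly robust idea in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes hypothetical simulated data with a single binary covariate (peak vs. off-peak period), a binary treatment (crash or no crash), and speed as the outcome. The nuisance models are simple stratum means rather than machine learning models, and cross-fitting is omitted. All function names, parameters, and numbers are illustrative assumptions.

```python
import random
from collections import defaultdict


def simulate(n=4000, seed=0):
    """Hypothetical data: x=1 is a peak period, t=1 means a crash occurred,
    y is the observed speed in km/h. The true crash effect on speed is
    -30 at peak and -10 off-peak (heterogeneous by construction)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        # Crashes are more likely at peak, so x confounds t and y.
        t = 1 if rng.random() < (0.6 if x else 0.2) else 0
        baseline = 60.0 if x else 90.0
        effect = -30.0 if x else -10.0
        y = baseline + effect * t + rng.gauss(0.0, 5.0)
        rows.append((x, t, y))
    return rows


def naive_difference(rows):
    """Pooled difference in mean speed, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def dr_cate(rows, misspecify_outcome=False):
    """Doubly robust (AIPW) estimate of the conditional effect per stratum x.

    Pseudo-outcome for each unit:
        mu1 - mu0 + t*(y - mu1)/e - (1 - t)*(y - mu0)/(1 - e)
    where e is the propensity and mu1, mu0 are outcome-model predictions.
    With misspecify_outcome=True the outcome model is deliberately wrong
    (constant zero) while the propensity model stays correct.
    """
    by_x = defaultdict(list)
    for x, t, y in rows:
        by_x[x].append((t, y))

    cate = {}
    for x, cell in by_x.items():
        # Propensity model: share of treated units in this stratum.
        e = sum(t for t, _ in cell) / len(cell)
        if misspecify_outcome:
            mu1 = mu0 = 0.0
        else:
            y1 = [y for t, y in cell if t == 1]
            y0 = [y for t, y in cell if t == 0]
            mu1 = sum(y1) / len(y1)
            mu0 = sum(y0) / len(y0)
        pseudo = [
            mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
            for t, y in cell
        ]
        cate[x] = sum(pseudo) / len(pseudo)
    return cate


if __name__ == "__main__":
    rows = simulate()
    print("naive difference:", naive_difference(rows))
    print("DR estimate per stratum:", dr_cate(rows))
    print("DR with wrong outcome model:", dr_cate(rows, misspecify_outcome=True))
```

In this toy setting the naive pooled difference mixes the crash effect with the peak/off-peak baseline gap, while the per-stratum doubly robust estimates stay close to the simulated effects even when the outcome model is wrong, because the propensity model is still correct. A practical pipeline would replace the stratum means with fitted machine learning models and add cross-fitting.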
Updated: 2024-03-01