Nature Machine Intelligence (IF 18.8), Pub Date: 2024-12-13, DOI: 10.1038/s42256-024-00932-5. Authors: Shaohua Fan, Renzhe Xu, Qian Dong, Yue He, Cheng Chang, Peng Cui
Survival analysis estimates the impact of covariates on the expected time until an event occurs. It is widely used in disciplines such as the life sciences and healthcare, where it informs decision-making and helps improve survival outcomes. Existing methods usually assume that training and testing data follow similar distributions, but real-world data come from varying sources, producing unpredictable distribution shifts that undermine their reliability. Survival analysis methods should therefore base predictions on features that remain stable across diverse cohorts rather than on spurious correlations. To this end, we propose a stable Cox model, with theoretical guarantees for identifying stable variables, that jointly optimizes an independence-driven sample reweighting module and a weighted Cox regression model. In extensive evaluations on simulated and real-world omics and clinical data, stable Cox not only generalizes well across diverse independent test sets but also significantly stratifies patient subtypes using the identified biomarker panels.
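As a rough illustration of the two ingredients named in the abstract, the sketch below writes down a weighted Cox partial likelihood and a simple decorrelation penalty on sample weights. It is not the authors' implementation: the function names, the Breslow-style handling of ties, and the use of squared off-diagonal weighted covariance as a stand-in for the "independence-driven" objective are all assumptions made here for illustration, and the abstract does not specify how the two parts are actually coupled.

```python
import numpy as np

def weighted_cox_nll(beta, X, time, event, w):
    """Negative weighted Cox partial log-likelihood (Breslow-style risk sets).

    Hypothetical helper: each subject i with an observed event contributes
    w_i * (eta_i - log sum_{j at risk} w_j * exp(eta_j)).
    """
    X, time, w = np.asarray(X, float), np.asarray(time, float), np.asarray(w, float)
    event = np.asarray(event).astype(bool)
    eta = X @ np.asarray(beta, float)          # linear predictor per subject
    nll = 0.0
    for i in np.flatnonzero(event):
        at_risk = time >= time[i]              # subjects still under observation
        log_denom = np.log(np.sum(w[at_risk] * np.exp(eta[at_risk])))
        nll -= w[i] * (eta[i] - log_denom)
    return nll / np.sum(w[event])              # normalize by total event weight

def weighted_decorrelation_loss(X, w):
    """Sum of squared off-diagonal entries of the weighted covariance of X.

    Assumed proxy for independence: weights that drive this toward zero
    make the covariates pairwise uncorrelated in the reweighted sample.
    """
    X, w = np.asarray(X, float), np.asarray(w, float)
    p = w / w.sum()                            # normalized sample weights
    Xc = X - p @ X                             # center with the weighted mean
    cov = (Xc * p[:, None]).T @ Xc             # weighted covariance matrix
    off_diag = cov - np.diag(np.diag(cov))
    return float(np.sum(off_diag ** 2))
```

Under these assumptions, a joint procedure could alternate between updating the weights `w` to reduce the decorrelation loss and updating `beta` to reduce the weighted Cox loss; covariates whose coefficients stay clearly non-zero after reweighting would then be candidates for the stable variables.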
Title: Stable Cox regression for survival analysis under distribution shifts