Estimation of minimal data set sizes for machine learning predictions in digital mental health interventions
npj Digital Medicine (IF 12.4) Pub Date: 2024-12-18, DOI: 10.1038/s41746-024-01360-w
Kirsten Zantvoort, Barbara Nacke, Dennis Görlich, Silvan Hornstein, Corinna Jacobi, Burkhardt Funk
Artificial intelligence promises to revolutionize mental health care, but small dataset sizes and a lack of robust methods raise concerns about the generalizability of results. To provide insight into minimal necessary dataset sizes, we explore domain-specific learning curves for digital intervention dropout prediction based on 3654 users from a single study (ISRCTN13716228, 26/02/2016). Prediction performance is analyzed as a function of dataset size (N = 100–3654), feature group (F = 2–129), and algorithm choice (from Naive Bayes to Neural Networks). The results substantiate the concern that small datasets (N ≤ 300) overestimate predictive power. For uninformative feature groups, in-sample prediction performance was negatively correlated with dataset size. Sophisticated models overfitted on small datasets but achieved the best holdout test results on larger datasets. While N = 500 mitigated overfitting, performance did not converge until N = 750–1500. Consequently, we propose minimum dataset sizes of N = 500–1000. As such, this study offers an empirical reference for researchers designing or interpreting AI studies on Digital Mental Health Intervention data.
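The abstract's core analysis, comparing in-sample and holdout performance across increasing training-set sizes and model complexities, can be illustrated with a minimal learning-curve sketch. The synthetic data, feature count, subsample sizes, and model choices below are illustrative assumptions only, not the authors' actual dataset or pipeline.

```python
# Minimal sketch of a learning-curve analysis for dropout prediction.
# Synthetic data and model choices are illustrative assumptions, not the
# study's real features, labels, or modeling pipeline.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Synthetic stand-in for intervention-usage features and a binary dropout label.
X, y = make_classification(n_samples=3654, n_features=129, n_informative=20,
                           random_state=42)
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

models = {
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

# Subsample the training pool at increasing sizes and compare in-sample
# vs. holdout AUC; a large gap at small N signals overfitting.
for n in [100, 300, 500, 750, 1000, 1500, 2500]:
    idx = rng.choice(len(X_train_full), size=n, replace=False)
    X_sub, y_sub = X_train_full[idx], y_train_full[idx]
    for name, model in models.items():
        model.fit(X_sub, y_sub)
        in_sample = roc_auc_score(y_sub, model.predict_proba(X_sub)[:, 1])
        holdout = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
        print(f"N={n:4d} {name:13s} "
              f"in-sample AUC={in_sample:.3f} holdout AUC={holdout:.3f}")
```

Plotting in-sample and holdout AUC against N produces the kind of learning curve the study uses to judge when performance estimates stabilize.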