npj Digital Medicine (IF 12.4), Pub Date: 2024-07-17, DOI: 10.1038/s41746-024-01176-8
Vanessa R Weir, Katelyn Dempsey, Judy Wawira Gichoya, Veronica Rotemberg, An-Kwok Ian Wong
Increasing evidence shows that noninvasive assessment tools, such as pulse oximetry, temperature probes, and AI skin diagnosis benchmarks, are less accurate in patients with darker skin tones. The FDA is exploring regulatory strategies that would add skin tone criteria to device evaluation in order to improve performance across diverse skin tones. However, there is no consensus on how prospective studies should assess skin tone to account for this bias. Several tools are available for skin tone assessment, including administered visual scales (e.g., Fitzpatrick Skin Type, Pantone, Monk Skin Tone) and color measurement tools (e.g., reflectance colorimeters, reflectance spectrophotometers, cameras), but none is consistently used or validated across multiple medical domains. Accurate and consistent skin tone measurement depends on many factors, including standardized environments, lighting, the body parts assessed, patient conditions, and the choice of skin tone assessment tool(s). Because race and ethnicity are inadequate proxies for skin tone, these considerations can help standardize how the effect of skin tone is measured in studies of tools such as AI dermatology diagnosis, pulse oximetry, and temporal thermometers. Skin tone bias in medical devices is likely due to systemic factors that lead to inadequate validation across diverse skin tones. Researchers have an opportunity to apply skin tone assessment methods, with standardized considerations, in prospective studies of noninvasive tools that may be affected by skin tone. We propose considerations that researchers must take into account to improve device robustness to skin tone bias.
Title (translated from the Chinese): A survey of skin tone assessment in prospective research