Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients?
Clinical Orthopaedics and Related Research (IF 4.2). Pub Date: 2024-09-25. DOI: 10.1097/corr.0000000000003263. Paul G Guirguis, Mark P Youssef, Ankit Punreddy, Mina Botros, Mattie Raiford, Susan McDowell
BACKGROUND
Patients and caregivers may experience immense distress when receiving the diagnosis of a primary musculoskeletal malignancy and subsequently turn to internet resources for more information. It is not clear whether these resources, including Google and ChatGPT, offer patients information that is readable; readability is a measure of how easy text is to understand. Because many patients turn to Google and artificial intelligence resources for healthcare information, we thought it important to ascertain whether the information they find is readable and easy to understand. The objective of this study was to compare the readability of Google search results and ChatGPT answers to frequently asked questions and to assess whether these sources meet NIH recommendations for readability.
QUESTIONS/PURPOSES
(1) What is the readability of ChatGPT-3.5 as a source of patient information for the three most common primary bone malignancies compared with top online resources from Google search? (2) Do ChatGPT-3.5 responses and online resources meet NIH readability guidelines for patient education materials?
METHODS
This was a cross-sectional analysis of the 12 most common online questions about osteosarcoma, chondrosarcoma, and Ewing sarcoma. To be consistent with other studies of similar design that utilized national society frequently asked questions lists, questions were selected from the American Cancer Society and categorized based on content, including diagnosis, treatment, and recovery and prognosis. Google was queried using all 36 questions, and top responses were recorded. Author types, such as hospital systems, national health organizations, or independent researchers, were recorded. ChatGPT-3.5 was provided each question in independent queries without further prompting. Responses were assessed with validated reading indices to determine readability by grade level. An independent t-test was performed with significance set at p < 0.05.
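The abstract does not specify which reading indices or tooling were used. As a rough illustration only, the sketch below scores two hypothetical sets of response texts with two commonly used validated readability indices (Flesch-Kincaid grade and Gunning fog, via the textstat package) and compares the groups with an independent t-test via scipy. All variable names and sample texts are placeholders, not the authors' data or code.

```python
# A minimal sketch of the readability comparison described above.
# Assumptions: responses are plain-text strings; Flesch-Kincaid and
# Gunning fog stand in for the unspecified "validated reading indices".
import textstat
from scipy import stats

def grade_levels(responses: list[str]) -> list[float]:
    """Mean of two grade-level indices for each response text."""
    return [
        (textstat.flesch_kincaid_grade(text) + textstat.gunning_fog(text)) / 2
        for text in responses
    ]

# Hypothetical stand-ins for the 36 recorded responses from each source.
google_answers = [
    "Osteosarcoma is a cancer that starts in the bones.",
    "Doctors treat it with surgery and chemotherapy.",
]
chatgpt_answers = [
    "Osteosarcoma is a malignant osseous neoplasm arising from mesenchymal cells.",
    "Management typically encompasses neoadjuvant chemotherapy and wide surgical resection.",
]

google_grades = grade_levels(google_answers)
chatgpt_grades = grade_levels(chatgpt_answers)

# Independent t-test with significance at p < 0.05, as in the Methods.
t_stat, p_value = stats.ttest_ind(google_grades, chatgpt_grades)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, significant: {p_value < 0.05}")
```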
RESULTS
Google (n = 36) and ChatGPT-3.5 (n = 36) answers were recorded, 12 for each of the three cancer types. Reading grade levels based on mean readability scores were 11.0 ± 2.9 and 16.1 ± 3.6, respectively. This corresponds to an eleventh-grade reading level for Google and a fourth-year undergraduate reading level for ChatGPT-3.5. Google answers were more readable across all individual indices, with no difference in word count. No difference in readability was present across author type, question category, or cancer type. Of 72 total responses across both search modalities, none met NIH readability criteria at the sixth-grade level.
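NIH guidance recommends patient education materials at or below a sixth-grade reading level. Continuing the hypothetical sketch above, a compliance check could look like the following; per the reported results, none of the 72 responses would pass.

```python
# Continuing the sketch above: flag responses at or below the NIH
# sixth-grade recommendation. The grade lists are the hypothetical
# values computed earlier, not the study's data.
NIH_MAX_GRADE = 6.0

all_grades = google_grades + chatgpt_grades  # 72 responses in the study
compliant = [g for g in all_grades if g <= NIH_MAX_GRADE]
print(f"{len(compliant)} of {len(all_grades)} responses meet the NIH criterion")
```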
CONCLUSION
Google material was presented at a high school reading level, whereas ChatGPT-3.5 was at an undergraduate reading level. The readability of both resources was inadequate based on NIH recommendations. Improving readability is crucial for better patient understanding during cancer treatment. Physicians should assess patients' needs, offer them tailored materials, and guide them to reliable resources to prevent reliance on online information that is hard to understand.
LEVEL OF EVIDENCE
Level III, prognostic study.
Updated: 2024-09-25