Enhancing Dietary Supplement Question Answer via Retrieval-Augmented Generation (RAG) with LLM
medRxiv - Health Informatics Pub Date : 2024-09-12 , DOI: 10.1101/2024.09.11.24313513
Yu Hou , Rui Zhang

Objective: To enhance the accuracy and reliability of dietary supplement (DS) question answering by integrating a novel Retrieval-Augmented Generation (RAG) LLM system with an updated, integrated DS knowledge base and by providing a user-friendly interface.

Materials and Methods: We developed iDISK2.0 by integrating updated data from multiple trusted sources, including NMCD, MSKCC, DSLD, and NHPD, and applied advanced integration strategies to reduce noise. We then combined iDISK2.0 with a RAG system, leveraging the strengths of large language models (LLMs) and a biomedical knowledge graph (BKG) to address the hallucination issues inherent in standalone LLMs. The system enhances answer generation by using an LLM (GPT-4.0) to retrieve contextually relevant subgraphs from the BKG based on the entities identified in the query. A user-friendly interface was built to give easy access to DS knowledge through conversational text input.

Results: iDISK2.0 encompasses 174,317 entities across seven types, six types of relationships, and 471,063 attributes. The iDISK2.0-RAG system significantly improved the accuracy of DS-related information retrieval. In our evaluations, the system achieved over 95% accuracy on True/False and multiple-choice questions, outperforming standalone LLMs. The interface also enabled efficient interaction, allowing users to enter free-form text queries and receive accurate, contextually relevant responses. The integration process minimized data noise and ensured that the most up-to-date and comprehensive DS information was available to users.

Conclusion: Integrating iDISK2.0 with a RAG system effectively addresses the limitations of LLMs, providing a robust solution for accurate DS information retrieval. This study underscores the importance of combining structured knowledge graphs with advanced language models to enhance the precision and reliability of information retrieval systems, ultimately supporting better-informed decisions in DS-related research and healthcare.
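
The retrieval step described in the abstract (identify entities in the query, pull the matching subgraph from the knowledge graph, and pass it to the LLM as context) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Triple` type, the function names, and the toy graph with placeholder entries are hypothetical and are not taken from iDISK2.0 or the authors' code. Plain substring matching stands in for the LLM-based entity identification, and the final LLM call is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One edge of a toy knowledge graph: head --relation--> tail."""
    head: str
    relation: str
    tail: str


# Placeholder graph; these entries are invented and are NOT iDISK2.0 data.
TOY_GRAPH = [
    Triple("supplement_a", "interacts_with", "drug_x"),
    Triple("supplement_a", "has_ingredient", "ingredient_m"),
    Triple("supplement_b", "is_effective_for", "condition_y"),
]


def find_entities(query, graph):
    """Return graph entities whose names occur in the query.

    Assumption: case-insensitive substring matching replaces the
    LLM-based entity identification mentioned in the abstract.
    """
    text = query.lower()
    names = {t.head for t in graph} | {t.tail for t in graph}
    return sorted(n for n in names if n.replace("_", " ") in text)


def retrieve_subgraph(entities, graph):
    """Keep every triple that touches at least one identified entity."""
    wanted = set(entities)
    return [t for t in graph if t.head in wanted or t.tail in wanted]


def build_prompt(query, subgraph):
    """Format the retrieved triples as grounding context for an LLM."""
    facts = "\n".join(f"- {t.head} {t.relation} {t.tail}" for t in subgraph)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {query}"


if __name__ == "__main__":
    question = "Does supplement a interact with drug x?"
    entities = find_entities(question, TOY_GRAPH)
    context = retrieve_subgraph(entities, TOY_GRAPH)
    print(build_prompt(question, context))
```

Restricting the prompt to retrieved triples is what ties the generated answer to the knowledge graph; a real system would replace the string matching with a proper entity linker and send the prompt to an LLM.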

Updated: 2024-09-13