Moving Beyond ChatGPT: Local Large Language Models (LLMs) and the Secure Analysis of Confidential Unstructured Text Data in Social Work Research, Research on Social Work Practice



Research on Social Work Practice (IF 1.7), Pub Date: 2024-09-30, DOI: 10.1177/10497315241280686
Brian E. Perron, Hui Luan, Bryan G. Victor, Oliver Hiltz-Perron, Joseph Ryan

Purpose: Large language models (LLMs) have demonstrated remarkable abilities in natural language tasks. However, their use in social work research is limited by confidentiality and security concerns when processing sensitive data. This study addresses these challenges by evaluating the performance of local LLMs (LocalLLMs) in classifying and extracting substance-related problems from unstructured child welfare investigation summaries. LocalLLMs allow researchers to analyze data on their own computers without transmitting information to external servers for processing. Methods: Four state-of-the-art LocalLLMs (Mistral-7b, Mixtral-8x7b, Llama3-8b, and Llama3-70b) were tested using zero-shot prompting on 2,956 manually coded summaries. Results: The LocalLLMs achieved exceptional results comparable to human experts in classification and extraction, demonstrating their potential to unlock valuable insights from confidential, unstructured child welfare data. Conclusions: This study highlights the feasibility of using LocalLLMs to efficiently analyze large amounts of textual data while addressing the confidentiality issues associated with proprietary LLMs.
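
The zero-shot workflow described in the abstract (prompt a locally hosted model with each summary, then compare its labels to the manual codes) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the prompt wording, the binary YES/NO label scheme, the choice of metrics, and the `generate` callable, which stands in for whatever local inference runtime is used. The abstract does not specify any of these details.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical prompt wording; the study's actual prompt is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are reviewing a child welfare investigation summary.\n"
    "Does the summary describe a substance-related problem?\n"
    "Answer with exactly one word: YES or NO.\n\n"
    "Summary:\n{summary}\n\nAnswer:"
)


def build_prompt(summary: str) -> str:
    """Zero-shot prompt: task instructions plus the text, with no labeled examples."""
    return PROMPT_TEMPLATE.format(summary=summary.strip())


def parse_label(response: str) -> int:
    """Map a free-text model reply to 1 (YES) or 0 (NO)."""
    words = response.strip().upper().split()
    first = words[0].strip(".,:;!\"'") if words else ""
    if first == "YES":
        return 1
    if first == "NO":
        return 0
    raise ValueError(f"Unparseable response: {response!r}")


def classify(summaries: Iterable[str], generate: Callable[[str], str]) -> List[int]:
    """Label each summary. `generate` is an assumed wrapper around a locally
    hosted model, so no text is sent to an external server."""
    return [parse_label(generate(build_prompt(s))) for s in summaries]


def agreement_metrics(human: List[int], model: List[int]) -> Dict[str, float]:
    """Compare model labels against manual codes, treated as ground truth."""
    if len(human) != len(model):
        raise ValueError("Label lists must have the same length")
    pairs = list(zip(human, model))
    tp = sum(1 for h, m in pairs if h == 1 and m == 1)
    tn = sum(1 for h, m in pairs if h == 0 and m == 0)
    fp = sum(1 for h, m in pairs if h == 0 and m == 1)
    fn = sum(1 for h, m in pairs if h == 1 and m == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Passing the model in as a plain callable keeps the sketch independent of any particular runtime: the same loop could be run once per model and the resulting metrics compared side by side.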


Updated: 2024-09-30
