ProtChat: An AI Multi-Agent for Automated Protein Analysis Leveraging GPT-4 and Protein Language Model.
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2024-12-17, DOI: 10.1021/acs.jcim.4c01345
Huazhen Huang, Xianguo Shi, Hongyang Lei, Fan Hu, Yunpeng Cai

Large language models (LLMs) have transformed natural language processing, enabling advanced human-machine communication. Similarly, in computational biology, protein sequences are interpreted as natural language, facilitating the creation of protein large language models (PLLMs). However, applying PLLMs requires specialized preprocessing and script development, increasing the complexity of their use. Researchers have integrated LLMs with PLLMs to develop automated protein analysis tools to address these challenges, simplifying analytical workflows. Existing technologies often require substantial human intervention for specific protein-related tasks, maintaining high barriers to implementing automated protein analysis systems. Here, we propose ProtChat, an AI multiagent system for protein analysis that integrates the inference capabilities of PLLMs with the task-planning abilities of LLMs. ProtChat integrates GPT-4 with multiple PLLMs, like ESM and MASSA, to automate tasks such as protein property prediction and protein-drug interactions without human intervention. This AI agent enables users to input instructions directly, significantly improving efficiency and usability, making it suitable for researchers without a computational background. Experiments demonstrate that ProtChat can automate complex protein tasks accurately, avoiding manual intervention and delivering results rapidly. This advancement opens new research avenues in computational biology and drug discovery. Future applications may extend ProtChat's capabilities to broader biological data analysis. Our code and data are publicly available at github.com/SIAT-code/ProtChat.

Updated: 2024-12-17