The Block Copolymer Phase Behavior Database
Journal of Chemical Information and Modeling ( IF 5.6 ) Pub Date : 2024-08-10 , DOI: 10.1021/acs.jcim.4c00242
Nathan J Rebello 1 , Akash Arora 1 , Hidenobu Mochigase 1 , Tzyy-Shyang Lin 1 , Jiale Shi 1 , Debra J Audus 2 , Eric S Muckley 3 , Ardiana Osmani 1 , Bradley D Olsen 1

The Block Copolymer Database (BCDB) is a platform that allows users to search, submit, visualize, benchmark, and download experimental phase measurements and their associated characterization information for di- and multiblock copolymers. To the best of our knowledge, there is no widely accepted data model for publishing experimental and simulation data on block copolymer self-assembly. The proposed data schema, which carries traceable information, can accommodate any number of blocks. At the time of publication, it contains over 5400 block copolymer melt phase measurements in total, mined from the literature and manually curated, along with simulation data points of the phase diagram, generated from self-consistent field theory, that can rapidly be augmented. The database can be accessed via the Community Resource for Innovation in Polymer Technology (CRIPT) web application and the Materials Data Facility. The chemical structure of each polymer is encoded in BigSMILES, an extension of the Simplified Molecular-Input Line-Entry System (SMILES) into the macromolecular domain, and users can search repeat units and functional groups using the SMARTS (SMILES Arbitrary Target Specification) search syntax. Users can also query characterization and phase information using Structured Query Language (SQL) and download custom sets of block copolymer data to train machine learning models. Finally, a protocol is presented in which GPT-4, an AI-powered large language model, is used to rapidly screen block copolymer papers from the literature using only the abstract text and to determine whether they contain BCDB data, allowing the database to grow as the number of published papers on the World Wide Web increases. The F1 score for this model is 0.74. This platform is an important step in making polymer data more accessible to the broader community.
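
For readers unfamiliar with the metric, the F1 score reported for the abstract-screening step is the harmonic mean of precision and recall. The sketch below shows how such a score could be computed for a binary "contains BCDB data / does not" screen. The function name and the example labels are hypothetical and are not taken from the paper or its evaluation set.

```python
from typing import Sequence


def f1_score(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Return the F1 score of a binary screen.

    predicted[i] is True if the screen flagged paper i as containing data;
    actual[i] is True if a human curator confirmed that it does.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if a and not p)

    # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
    # of precision and recall. Defined as 0.0 when there are no positives.
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels for six abstracts (illustration only, not BCDB data).
screen_says_yes = [True, True, False, True, False, False]
curator_says_yes = [True, False, False, True, True, False]

score = f1_score(screen_says_yes, curator_says_yes)
print(f"F1 = {score:.2f}")
```

In this made-up example the screen finds two of the three relevant abstracts and raises one false alarm, so precision and recall are both 2/3 and the F1 score equals 2/3 as well.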

Updated: 2024-08-10