Introducing MagBERT: A language model for magnesium textual data mining and analysis
Journal of Magnesium and Alloys (IF 15.8), Pub Date: 2024-08-24, DOI: 10.1016/j.jma.2024.08.010
Surjeet Kumar, Russlan Jaafreh, Nirpendra Singh, Kotiba Hamad, Dae Ho Yoon

Magnesium (Mg)-based materials hold immense potential for various applications owing to their light weight and high strength-to-weight ratio. However, to fully harness the potential of Mg alloys, structured analytics are essential for gaining insights from centuries of accumulated knowledge, and this in turn requires efficient information extraction from the vast corpus of scientific literature. In this work, we introduce MagBERT, a BERT-based language model trained specifically for Mg-based materials. Trained on a dataset of approximately 370,000 abstracts focused on Mg and its alloys, MagBERT is designed to understand the intricate details and specialized terminology of this domain. Through rigorous evaluation, we demonstrate the effectiveness of MagBERT for information extraction using a fine-tuned named entity recognition (NER) model, named MagNER. This NER model can extract mechanical, microstructural, and processing properties related to Mg alloys. For instance, we have created an Mg alloy dataset that includes properties such as ductility, yield strength, and ultimate tensile strength (UTS), along with standard alloy names. MagBERT is a novel advancement in the development of Mg-specific language models, marking a significant milestone in the discovery of Mg alloys and in textual information extraction. By making the pre-trained weights of MagBERT publicly accessible, we aim to accelerate research and innovation in the field of Mg-based materials through efficient information extraction and knowledge discovery.
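
As a rough illustration of the extraction step the abstract describes, token-level NER predictions are commonly turned into entities by grouping BIO tags into spans. The sketch below shows only that grouping step in plain Python. It is not the authors' MagNER code: the function name `decode_bio`, the label names (`ALLOY`, `PROPERTY`, `PROCESS`), and the example sentence with its tags are all assumptions made for illustration, and no model is loaded.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans.

    Hypothetical helper for illustration; not part of MagBERT or MagNER.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    entities: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the span being built, if there is one.
        nonlocal current_label, current_tokens
        if current_label is not None:
            entities.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A B- tag, or an I- tag that does not continue the open span,
            # starts a new entity.
            flush()
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continuation of the open span.
            current_tokens.append(token)
        else:
            # "O" (outside any entity) ends the open span.
            flush()

    flush()
    return entities


# Made-up example sentence and tags, as a fine-tuned NER model might emit them.
example_tokens = ["AZ31", "exhibited", "improved", "yield", "strength", "after", "extrusion"]
example_tags = ["B-ALLOY", "O", "O", "B-PROPERTY", "I-PROPERTY", "O", "B-PROCESS"]

for label, text in decode_bio(example_tokens, example_tags):
    print(f"{label}: {text}")
```

Rows of an alloy-property dataset like the one mentioned in the abstract could then be assembled from such (label, text) pairs, although the abstract does not say how the authors built theirs.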

Updated: 2024-08-24