Microplastics and Trash Cleaning and Harmonization (MaTCH): Semantic Data Ingestion and Harmonization Using Artificial Intelligence
Environmental Science & Technology (IF 10.8), Pub Date: 2024-11-11, DOI: 10.1021/acs.est.4c02406
Hannah Hapich, Win Cowger, Andrew B. Gray

With the rapid expansion of microplastic research and reliance on semantic descriptors, there is an increasing need for plastic pollution data harmonization. Data standards have been developed but are seldom implemented across research sectors, geographic regions, environmental media, or size classes of plastic pollution. Harmonization of existing data is currently hindered by increasingly large datasets using thousands of different categorical variable descriptors, as well as various metrics used to describe particle abundance and differing size ranges studied across groups. For this study, we used manually developed relational databases to build an algorithm utilizing artificial intelligence capable of automatically curating harmonized, more usable datasets describing micro to macro plastic pollution in the environment. The study algorithm MaTCH (microplastics and trash cleaning and harmonization) can harmonize datasets with different formats, nomenclature, methods, and measured particle characteristics with an accuracy of 71–94% when matching semantically. All other non-semantic corrections are reported within a 95% confidence interval and with model uncertainty. All steps of the algorithm are integrated in an open-source software tool for the benefit of the scientific community and ease of integration for all plastic pollution data.
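To make the idea of harmonizing categorical descriptors concrete, the following is a minimal sketch, not the MaTCH implementation. It maps free-text particle descriptors onto a small controlled vocabulary using an alias table, with plain lexical similarity (Python's `difflib`) standing in for the AI-based semantic matching described above. The vocabulary, the alias table, the function names, and the 0.75 threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical controlled vocabulary and alias table (illustrative only;
# these are NOT taken from the MaTCH relational databases).
CANONICAL = ["fiber", "fragment", "film", "foam", "pellet"]
ALIASES = {"nurdle": "pellet", "filament": "fiber", "styrofoam": "foam"}


def normalize(term):
    """Lowercase, treat hyphens/underscores as spaces, collapse whitespace."""
    cleaned = term.lower().replace("-", " ").replace("_", " ")
    return " ".join(cleaned.split())


def harmonize(term, threshold=0.75):
    """Return (canonical_term_or_None, score) for one raw descriptor.

    Known aliases are resolved directly with score 1.0. Otherwise the term is
    compared lexically with every canonical term, and the best candidate is
    accepted only if its similarity reaches the (assumed) threshold.
    """
    key = normalize(term)
    if key in ALIASES:
        return ALIASES[key], 1.0
    best, best_score = None, 0.0
    for candidate in CANONICAL:
        score = SequenceMatcher(None, key, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


if __name__ == "__main__":
    for raw in ["Nurdle", "Fragments", "Fibre", "sediment core"]:
        print(raw, "->", harmonize(raw))
```

Descriptors that fall below the threshold come back as `None` rather than being forced into a category, so they can be flagged for manual review. A semantic approach would replace the lexical `SequenceMatcher` score with a similarity computed from learned text representations.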

Updated: 2024-11-12