A reinforcement learning approach with masked agents for chemical process flowsheet design
AIChE Journal (IF 3.5). Pub Date: 2024-08-30. DOI: 10.1002/aic.18584. Simone Reynoso-Donzelli, Luis Alberto Ricardez-Sandoval
This study introduces two novel Reinforcement Learning (RL) agents for the design and optimization of chemical process flowsheets (CPFs): a discrete masked Proximal Policy Optimization (mPPO) agent and a hybrid masked Proximal Policy Optimization (mHPPO) agent. The novelty of this work lies in the use of masking within the hybrid framework, that is, the incorporation of expert input or design rules that allows actions to be excluded from the agent's decision spectrum. This work is distinguished from others by seamlessly integrating masked agents with rigorous unit operation (UO) models, that is, advanced thermodynamic and conservation balance equations, in its simulation environment to design and optimize CPFs. The efficacy of these agents, along with performance comparisons, is evaluated through case studies, including one that employs the chemical process simulator ASPEN Plus®. The results of these case studies show that the agents learn, that is, each agent is able to find viable flowsheet designs that meet the stipulated process flowsheet design requirements, for example, achieving a user-defined product quality.
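The masking idea described above can be sketched in a few lines: design rules mark some actions invalid in the current state, and the policy assigns those actions zero probability by setting their logits to negative infinity before the softmax. This is a minimal illustrative sketch of action masking in general (not the authors' mPPO/mHPPO implementation); the logits and mask values are hypothetical.

```python
import numpy as np

def masked_policy(logits: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Return a probability distribution over actions in which
    masked-out actions (mask == 0) can never be sampled.

    Invalid actions receive a logit of -inf, so exp(-inf) = 0
    and they get exactly zero probability after normalization.
    """
    masked_logits = np.where(mask.astype(bool), logits, -np.inf)
    shifted = masked_logits - masked_logits.max()  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical example: four candidate unit operations, the last
# two excluded by a design rule (e.g., a separator cannot follow
# this state), so the agent chooses only among the first two.
logits = np.array([1.2, 0.3, 2.0, -0.5])
mask = np.array([1, 1, 0, 0])
probs = masked_policy(logits, mask)
```

In a PPO-style agent the same masking is applied consistently during both action sampling and the log-probability computation in the policy update, so the gradient never flows through excluded actions.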