Reinforcement learning based gasoline blending optimization: Achieving more efficient nonlinear online blending of fuels
Chemical Engineering Science (IF 4.1), Pub Date: 2024-07-27, DOI: 10.1016/j.ces.2024.120574. Muyi Huang, Renchu He, Xin Dai, Wenli Du, Feng Qian
The online optimization of gasoline blending provides substantial benefits for refineries. However, its effectiveness is challenged by the nonlinear blending mechanism, fluctuations in oil properties, and blending model mismatch. This paper proposes a novel online optimization method based on a deep reinforcement learning (DRL) framework to address these issues. A gasoline blending environment with an integrated constraint-mapping mechanism is established from real-world data. A state design incorporating historical data and a reward design with embedded dynamic coefficients are applied in the Markov Decision Process (MDP) formulation of the blending system. The Soft Actor-Critic (SAC) algorithm, which offers high training stability and data efficiency, is employed to learn a blending recipe adjustment policy. The SAC agent maximizes both reward and action entropy, thereby exploring the environment widely and learning more comprehensive policies. Compared with a traditional method, the proposed method achieves better economic performance and shows robustness under property fluctuations and component oil switching. Furthermore, it maintains performance by automatically adapting to system drift.
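For reference, the entropy-regularized objective that a SAC agent maximizes, in its standard form (following Haarnoja et al.; the paper's specific reward shaping with dynamic coefficients is not reproduced here), can be written as:

$$ J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi} \Big[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big] $$

where $r(s_t, a_t)$ is the blending reward, $\mathcal{H}$ is the policy entropy, and $\alpha$ is a temperature parameter trading off reward against exploration. Maximizing the entropy term is what drives the wide exploration and more comprehensive policies described in the abstract.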
Updated: 2024-07-27