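Profile
Education and work experience:
Received the B.Eng. degree from the School of Management and Engineering, Nanjing University, in 2015, and the Ph.D. degree from the Department of Systems Engineering and Engineering Management, City University of Hong Kong, in 2019. Joined the Department of Control and Systems Engineering, School of Management and Engineering, Nanjing University, as a faculty member in November 2019. From August to October 2019, was a short-term visiting researcher at the University of New South Wales, Australia.
Research results have been published in international journals including IEEE Transactions on Neural Networks and Learning Systems, IEEE/ASME Transactions on Mechatronics, IEEE Transactions on Cybernetics, and IEEE Transactions on Systems, Man, and Cybernetics: Systems, and at international conferences including the AAAI Conference on Artificial Intelligence.
Teaching and student training: currently teaches undergraduate and graduate courses including "Deep Reinforcement Learning" and "Introduction to Automation".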
Research Areas
Reinforcement Learning (RL)
1. Transfer RL: uses principles and methods such as Bayesian inference, hierarchical Bayesian models, latent variable models, adaptive re-weighting, and meta-learning to transfer knowledge effectively between RL agents. Closely related or overlapping learning paradigms include incremental learning, multi-task learning, continual learning, and lifelong learning.
2. Hierarchical RL: uses principles and methods such as model ensembles, mixture-of-experts models, and Bayesian inference to build efficient and practical hierarchical decision-making mechanisms within the option framework, which is based on the semi-Markov decision process.
3. Evolutionary computation for RL: uses highly parallelizable evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), to provide scalable solutions to RL problems with short computation times.
4. Rule-based RL: uses rules, expert knowledge, human demonstrations, and other bionics-inspired principles and methods to improve RL performance and bring it closer to the way humans learn.
5. Multi-agent RL: uses principles and methods such as depthwise convolution, mean-field approximation, and game theory to improve real-time communication and cooperative decision-making among multiple agents.
Recent Papers
[1] Zhi Wang, Chunlin Chen*, and Daoyi Dong, "Lifelong incremental reinforcement learning with online Bayesian inference," IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2021.3055499, 2021. [pdf] [code]
[2] Yuanyang Zhu, Zhi Wang*, Chunlin Chen, and Daoyi Dong, "Rule-based reinforcement learning for efficient robot navigation with space reduction," IEEE/ASME Transactions on Mechatronics, DOI: 10.1109/TMECH.2021.3072675, 2021. [pdf] [supplementary materials]
[3] Zhi Wang, Han-Xiong Li*, and Chunlin Chen, "Incremental reinforcement learning in continuous spaces via policy relaxation and importance weighting," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 6, pp. 1870-1883, 2020. [pdf] [code]
[4] Zhi Wang, Chunlin Chen*, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn, "Incremental reinforcement learning with prioritized sweeping for dynamic environments," IEEE/ASME Transactions on Mechatronics, vol. 24, no. 2, pp. 621-632, 2019. [pdf] [code]
[5] Zhi Wang, Han-Xiong Li*, and Chunlin Chen, "Reinforcement learning based optimal sensor placement for spatiotemporal modeling," IEEE Transactions on Cybernetics, vol. 50, no. 6, pp. 2861-2871, 2020. [pdf]
[6] Zhi Wang and Han-Xiong Li*, "Incremental learning for online modeling of distributed parameter systems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 12, pp. 2612-2622, 2019. [pdf]
[7] Zhi Wang and Han-Xiong Li*, "Dissimilarity analysis based multimode modeling for complex distributed parameter systems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 5, pp. 2789-2797, 2021. [pdf]
Conference Papers:
[1] Zhi Wang, Wei Bi*, Yan Wang, and Xiaojiang Liu, "Better fine-tuning via instance weighting for text classification," in: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019, pp. 7241-7248. [pdf] [supplementary materials]
[2] Donghan Xie, Zhi Wang, Chunlin Chen, and Daoyi Dong, "IEDQN: Information exchange DQN with a centralized coordinator for traffic signal control," in: Proceedings of International Joint Conference on Neural Networks (IJCNN), 2020.
[3] Zhi Wang and Han-Xiong Li, "Incremental learning based subspace modeling for distributed parameter systems," in: Proceedings of International Joint Conference on Neural Networks (IJCNN), 2019.
[4] Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn, "A novel incremental learning scheme for reinforcement learning in dynamic environments," in: Proceedings of World Congress on Intelligent Control and Automation (WCICA), 2016.