Zhi Wang (王志) - Nanjing University - School of Management and Engineering

Biography

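Education and work experience: Zhi Wang received the B.E. degree from the School of Management and Engineering, Nanjing University, in 2015, and the Ph.D. degree from the Department of Systems Engineering and Engineering Management, City University of Hong Kong, in 2019. Wang was a short-term visiting researcher at the University of New South Wales, Australia, from August to October 2019, and joined the faculty of the Department of Control and Systems Engineering, School of Management and Engineering, Nanjing University, in November 2019. Research results have been published in international journals including IEEE Transactions on Neural Networks and Learning Systems, IEEE/ASME Transactions on Mechatronics, IEEE Transactions on Cybernetics, and IEEE Transactions on Systems, Man, and Cybernetics: Systems, and at international conferences including the AAAI Conference on Artificial Intelligence.

Teaching and student training: Currently teaches undergraduate and graduate courses including "Deep Reinforcement Learning" and "Introduction to Automation".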

Research Areas

Reinforcement Learning (RL)
1. Transfer RL: Applies Bayesian inference, hierarchical Bayesian models, latent variable models, adaptive re-weighting, meta-learning, and related principles and methods to transfer knowledge effectively between RL agents. Closely related or overlapping learning paradigms include incremental learning, multi-task learning, continual learning, and lifelong learning.
2. Hierarchical RL: Applies model ensembles, mixture-of-experts models, Bayesian inference, and related principles and methods to build efficient and practical hierarchical decision-making mechanisms under the option framework, which is based on semi-Markov decision processes.
3. Evolutionary computation for RL: Applies highly parallelizable evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), to provide scalable solutions with short computation time for RL problems.
4. Rule-based RL: Applies bio-inspired principles and methods that draw on rules, expert knowledge, and human demonstrations to improve RL performance and bring it closer to the way humans learn.
5. Multi-agent RL: Applies depthwise convolution, mean-field approximation, game theory, and related principles and methods to improve real-time communication and cooperative decision-making among multiple agents.

Recent Publications

Journal Papers:
[1] Zhi Wang, Chunlin Chen*, and Daoyi Dong, "Lifelong incremental reinforcement learning with online Bayesian inference," IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2021.3055499, 2021.
[2] Yuanyang Zhu, Zhi Wang*, Chunlin Chen, and Daoyi Dong, "Rule-based reinforcement learning for efficient robot navigation with space reduction," IEEE/ASME Transactions on Mechatronics, DOI: 10.1109/TMECH.2021.3072675, 2021.
[3] Zhi Wang, Han-Xiong Li*, and Chunlin Chen, "Incremental reinforcement learning in continuous spaces via policy relaxation and importance weighting," IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 6, pp. 1870-1883, 2020.
[4] Zhi Wang, Chunlin Chen*, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn, "Incremental reinforcement learning with prioritized sweeping for dynamic environments," IEEE/ASME Transactions on Mechatronics, vol. 24, no. 2, pp. 621-632, 2019.
[5] Zhi Wang, Han-Xiong Li*, and Chunlin Chen, "Reinforcement learning based optimal sensor placement for spatiotemporal modeling," IEEE Transactions on Cybernetics, vol. 50, no. 6, pp. 2861-2871, 2020.
[6] Zhi Wang and Han-Xiong Li*, "Incremental learning for online modeling of distributed parameter systems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 12, pp. 2612-2622, 2019.
[7] Zhi Wang and Han-Xiong Li*, "Dissimilarity analysis based multimode modeling for complex distributed parameter systems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 51, no. 5, pp. 2789-2797, 2021.

Conference Papers:
[1] Zhi Wang, Wei Bi*, Yan Wang, and Xiaojiang Liu, "Better fine-tuning via instance weighting for text classification," in: Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2019, pp. 7241-7248.
[2] Donghan Xie, Zhi Wang, Chunlin Chen, and Daoyi Dong, "IEDQN: Information exchange DQN with a centralized coordinator for traffic signal control," in: Proceedings of International Joint Conference on Neural Networks (IJCNN), 2020.
[3] Zhi Wang and Han-Xiong Li, "Incremental learning based subspace modeling for distributed parameter systems," in: Proceedings of International Joint Conference on Neural Networks (IJCNN), 2019.
[4] Zhi Wang, Chunlin Chen, Han-Xiong Li, Daoyi Dong, and Tzyh-Jong Tarn, "A novel incremental learning scheme for reinforcement learning in dynamic environments," in: Proceedings of World Congress on Intelligent Control and Automation (WCICA), 2016.