Profile
Ph.D., Associate Professor
LAMDA Group
School of Artificial Intelligence
National Key Laboratory for Novel Software Technology
Nanjing University, P. R. China
Short Bio
I am an associate professor at the School of Artificial Intelligence, Nanjing University, and a member of the LAMDA group, led by Prof. Zhi-Hua Zhou. From July 2014 to June 2019, I was an associate professor at the School of Computer Science and Technology, Soochow University. I received my Ph.D. degree in 2012 from the School of Computer Science and Technology, University of Science and Technology of China, advised by Prof. Xiaoping Chen. From September 2018 to March 2019, I worked with Prof. Mykel J. Kochenderfer as a visiting scholar at the Stanford Intelligent Systems Laboratory (SISL). From November 2012 to June 2014, I was a research fellow at the School of Computing, National University of Singapore, under Prof. David Hsu and Prof. Wee Sun Lee. Before that, from October 2010 to October 2011, I visited the Rutgers Laboratory for Real-Life Reinforcement Learning (RL3), directed by Prof. Michael L. Littman, as a visiting research student. I also worked briefly as a research engineer at Huawei's Noah's Ark Lab in 2012.
Teaching
Multi-Agent Systems (for undergraduate students, Spring 2021, 2022) [textbook]
Control Theory and Methods (for undergraduate and graduate students, Fall 2020, 2021) [textbook]
Reinforcement Learning (for graduate students, Fall 2020, 2021, with Prof. Yang Yu) [textbook]
Intelligent Systems: Design and Application (for undergraduate and graduate students, Spring 2020, 2021) [textbook]
Intelligent Application Modeling (for undergraduate students, July 2019) [a summer course co-developed with Tencent]
Research Interests
Reinforcement learning, including deep reinforcement learning and multi-agent reinforcement learning
Probabilistic planning, particularly in partially observable Markov decision processes
Imitation learning based on generative adversarial nets
Recent Publications
Lei Yuan, Jianhao Wang, Fuxiang Zhang, Chenghe Wang, Zongzhang Zhang, Yang Yu, and Chongjie Zhang. Multi-Agent Incentive Communication via Decentralized Teammate Modeling. In: Proceedings of the 36th Conference on Artificial Intelligence (AAAI-2022), Vancouver, Canada, 2022.
Fan-Ming Luo, Shengyi Jiang, Yang Yu, Zongzhang Zhang, and Yi-Feng Zhang. Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy. In: Proceedings of the 36th Conference on Artificial Intelligence (AAAI-2022), Vancouver, Canada, 2022.
Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, and Jianye Hao. Adaptive Online Packing-guided Search for POMDPs. In: Advances in Neural Information Processing Systems 34 (NeurIPS-2021), Virtual Conference, 2021.
Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, and Yang Yu. Cross-Modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning. In: Advances in Neural Information Processing Systems 34 (NeurIPS-2021), Virtual Conference, 2021.
Yan Zheng, Jianye Hao, Zongzhang Zhang, Zhaopeng Meng, Tianpei Yang, Yanran Li, and Changjie Fan. Efficient Policy Detecting and Reusing for Non-Stationarity in Markov Games. Autonomous Agents and Multi-Agent Systems, 2021, 35(2): 1-29.
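Zixuan Chen, Zongzhang Zhang, Zhiyuan Pan, and Linjing Zhang. A Planning Network Model Based on Generalized Asynchronous Value Iteration (in Chinese). Journal of Software (软件学报), 2021, 32(11): 3496-3511.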
Cong Fei, Bing Wang, Yuzheng Zhuang, Zongzhang Zhang, et al. Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI-2020), pages 2929-2935, Yokohama, Japan, 2020.
Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, et al. Efficient Deep Reinforcement Learning via Adaptive Policy Transfer. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI-2020), pages 3094-3100, Yokohama, Japan, 2020.
Yan Zheng, Jianye Hao, Zongzhang Zhang, Zhaopeng Meng, and Xiaotian Hao. Efficient Multiagent Policy Optimization Based on Weighted Estimators in Stochastic Environments. Journal of Computer Science and Technology, 2020, 35(2): 268-280.
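Jiahao Lin, Zongzhang Zhang, Chong Jiang, and Jianye Hao. A Survey of Imitation Learning Based on Generative Adversarial Nets (in Chinese). Chinese Journal of Computers (计算机学报), 2020, 43(2): 326-351.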
Xiaobai Ma, Katherine R. Driggs-Campbell, Zongzhang Zhang, and Mykel J. Kochenderfer. Monte-Carlo Tree Search for Policy Optimization. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI-2019), pages 3116-3122, Macao, China, 2019.
Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, and Changjie Fan. A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents. In: Advances in Neural Information Processing Systems 31 (NeurIPS-2018), pages 960-970, Montreal, Canada, 2018.
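Quan Liu, Jianwei Zhai, Zongzhang Zhang, Shan Zhong, Qian Zhou, Peng Zhang, and Jin Xu. A Survey on Deep Reinforcement Learning (in Chinese). Chinese Journal of Computers (计算机学报), 2018, 41(1): 1-27.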
Zongzhang Zhang, Zhiyuan Pan, and Mykel J. Kochenderfer. Weighted Double Q-learning. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI-2017), pages 3455-3461, Melbourne, Australia, 2017.
Zongzhang Zhang, Qiming Fu, Xiaofang Zhang, and Quan Liu. Reasoning and Predicting POMDP Planning Complexity via Covering Numbers. Frontiers of Computer Science, 2016, 10(4): 726-740.
Zongzhang Zhang, David Hsu, Wee Sun Lee, Zhan Wei Lim, and Aijun Bai. PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces. In: Proceedings of the 25th International Conference on Automated Planning and Scheduling (ICAPS-2015), pages 249-257, Jerusalem, Israel, 2015.
Zongzhang Zhang, David Hsu, and Wee Sun Lee. Covering Number for Efficient Heuristic-Based POMDP Planning. In: Proceedings of the 31st International Conference on Machine Learning (ICML-2014), pages 28-36, Beijing, China, 2014.
Aijun Bai, Feng Wu, Zongzhang Zhang, and Xiaoping Chen. Thompson Sampling based Monte-Carlo Planning in POMDPs. In: Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS-2014), pages 28-36, Portsmouth, USA, 2014.
Zongzhang Zhang, Michael L. Littman, and Xiaoping Chen. Covering Number as a Complexity Measure for POMDP Planning and Learning. In: Proceedings of the 26th Conference on Artificial Intelligence (AAAI-2012), pages 1853-1859, Toronto, Ontario, Canada, 2012.
Zongzhang Zhang and Xiaoping Chen. FHHOP: A Factored Hybrid Heuristic Online Planning Algorithm for Large POMDPs. In: Proceedings of the 28th Conference on Uncertainty in Artificial Intelligence (UAI-2012), pages 934-943, Catalina Island, CA, USA, 2012.
Academic Service
Editorial Board Member: Intelligent Computing (an AAAS/Science partner journal, 2022 - 2024)
Young Associate Editor: Frontiers of Computer Science (2019 - 2022)
Senior Program Committee Member: IJCAI 2020-2021; AAAI 2019; ICAPS 2021; ECAI 2020
Member of the Novel Program Committee Board: IJCAI 2022-2024
Program Committee Member/Reviewer: AAAI 2018, 2020, 2022; ICML 2019-2022; IJCAI 2013, 2017-2019; NeurIPS 2018-2021; AAMAS 2021; ICLR 2021-2022; AISTATS 2022; ICAPS 2020; ECML-PKDD 2020; CoRL 2020; IJCNN 2020; CCDM 2020; ACML 2017-2019; PRICAI 2018-2019; ICA 2017-2019; ADPRL 2018; DAI 2019-2021; SSCI 2019; CCFAI 2019
Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal of Artificial Intelligence Research, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Cybernetics, ACM Transactions on Intelligent Systems and Technology, Machine Learning, Pattern Recognition, IEEE Computational Intelligence Magazine, Information Sciences, Frontiers of Computer Science, Neurocomputing, Knowledge-Based Systems, Applied Intelligence, Expert Systems with Applications, SCIENTIA SINICA Informationis (中国科学:信息科学), Chinese Journal of Computers (计算机学报), Journal of Software (软件学报), Acta Automatica Sinica (自动化学报), Journal of Computer Research and Development (计算机研究与发展)
Workshop Co-chair: Asian Workshop on Reinforcement Learning (AWRL) 2016-2018, PRICAI 2018's Workshop on Methods and Applications of Reinforcement Learning
Local Organizing Committee Chair: DAI 2020, MLA 2020
Professional Organization Membership: AAAI Member, IEEE Member, CCF Senior Member
Reviewer Award: ICLR 2021's Outstanding Reviewer, NeurIPS 2019's Top Reviewer