Profile

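Shaohuai Shi is a Professor in the School of Computer Science and Technology at Harbin Institute of Technology, Shenzhen. In 2022 he was selected for a national-level young talent program, and he holds a Category B distinguished position under the "Pengcheng Peacock Plan". He received his PhD from Hong Kong Baptist University in 2020 and was a Research Assistant Professor in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology from 2020 to 2022.

His research interests are distributed machine learning systems and high-performance computing. He has published more than 30 papers in these areas, including papers in top journals and conferences such as TPDS, INFOCOM, ICLR, AAAI, and MLSys, and holds one granted U.S. patent. Two of his papers received Best Paper Awards, at IEEE DataCom 2018 and IEEE INFOCOM 2021. His work has been cited more than 2,000 times on Google Scholar, with an h-index of 21. One representative work was deployed on a very large cluster of 2,048 GPUs and, in 2018, set a world record for the fastest training time on a million-image-scale image recognition task.

His academic service includes serving as program committee co-chair of the EMDL workshop at ACM MobiSys 2021, as a program committee member for top conferences such as NeurIPS, AAAI, and ICDCS, and as a reviewer for top journals such as TPDS, TMC, and TNSE.

Work Experience

October 2023 to present: Professor, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
September 2022 to September 2023: Assistant Professor, School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
September 2020 to August 2022: Research Assistant Professor, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
April 2019 to May 2020: Research Intern, NVIDIA (Nvidia AI Technology Centre)
February 2013 to March 2016: Research Assistant / Senior Research Assistant, Hong Kong Baptist University

Education

March 2016 to August 2020: PhD in Computer Science, Hong Kong Baptist University
September 2010 to January 2013: Master's degree in Computer Science and Technology, Harbin Institute of Technology
September 2006 to July 2010: Bachelor's degree in Software Engineering, South China University of Technology

Courses Taught

Fall 2023: Parallel Processing and Architecture, Harbin Institute of Technology, Shenzhen (master's students)
Fall 2023: Computer Architecture, Harbin Institute of Technology, Shenzhen (undergraduates)
Spring 2023: Computer Systems, Harbin Institute of Technology, Shenzhen (undergraduates)
Spring 2022: High Performance Computing, The Hong Kong University of Science and Technology (undergraduates)
Spring 2021: High Performance Computing, The Hong Kong University of Science and Technology (undergraduates)

Awards and Prizes

2021, Best Paper Award of IEEE INFOCOM 2021.
2020, Yakun Scholarship Scheme for Mainland Postgraduate Students, Hong Kong Baptist University.
2018-2020, RPg Performance Award Scheme, Hong Kong Baptist University.
2018, Best Paper Award of IEEE DataCom 2018.
2018, Teaching Assistant Performance Award, Hong Kong Baptist University.
2017, Alibaba Tianchi Healthcare AI Competition, Ranked 7th out of 2887.
2012, Graduate National Scholarship, Harbin Institute of Technology.
2010, First Prize of the Second National CUDA Programming Competition, NVIDIA.
2009, Outstanding Prize of the First National CUDA Programming Competition, NVIDIA.
2007-2010, National Scholarship and Merit Student, South China University of Technology.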

Research Areas

Distributed machine learning systems
GPU computing
Parallel and distributed systems
Deep learning

Recent Publications

Hucheng Liu, Shaohuai Shi, Xuan Wang, Zoe Lin Jiang, and Qian Chen, “Performance Analysis and Optimizations of Matrix Multiplications on ARMv8 Processors,” Design, Automation and Test in Europe Conference (DATE), Valencia, Spain, March 25-27, 2024.
Zhenheng Tang, Yuxin Wang, Xin He, Longteng Zhang, Xinglin Pan, Qiang Wang, Rongfei Zeng, Shaohuai Shi, Bingsheng He, and Xiaowen Chu, “FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs,” Symposium on Large Language Models (LLM 2023) with IJCAI 2023, Macao, China, August 21, 2023.
Lin Zhang, Longteng Zhang, Shaohuai Shi, Xiaowen Chu, and Bo Li, “Evaluation and Optimization of Gradient Compression for Distributed Deep Learning,” IEEE ICDCS 2023, Hong Kong, China, July 2023.
Lin Zhang, Shaohuai Shi, Xiaowen Chu, Wei Wang, Bo Li, and Chengjian Liu, “Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining,” IEEE ICDCS 2023, Hong Kong, China, July 2023.
Shaohuai Shi, Qing Yang, Yang Xiang, Shuhan Qi, and Xuan Wang, “An Efficient Split Fine-tuning Framework for Edge and Cloud Collaborative Learning,” The Design Automation Conference (DAC) 2023 (Poster), Moscone West, San Francisco, July 2023.
Lin Zhang, Shaohuai Shi, and Bo Li, “Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation,” ICLR 2023, Kigali, Rwanda, May 2023.
Lin Zhang, Shaohuai Shi, and Bo Li, “Accelerating Distributed K-FAC with Efficient Collective Communication and Scheduling,” IEEE INFOCOM 2023, New York Area, U.S.A., May 2023.
Shaohuai Shi, Xinglin Pan, Xiaowen Chu, and Bo Li, “PipeMoE: Accelerating Mixture-of-Experts through Adaptive Pipelining,” IEEE INFOCOM 2023, New York Area, U.S.A., May 2023.
Zhenheng Tang, Shaohuai Shi, Bo Li, and Xiaowen Chu, “GossipFL: A Decentralized Federated Learning Framework with Sparsified and Adaptive Communication,” IEEE Transactions on Parallel and Distributed Systems (TPDS), December 2022.
Lin Zhang, Shaohuai Shi, Wei Wang, and Bo Li, “Scalable K-FAC Training for Deep Neural Networks with Distributed Preconditioning,” IEEE Transactions on Cloud Computing (TCC), September 2022.
Qiang Wang, Shaohuai Shi, Kaiyong Zhao, and Xiaowen Chu, “EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching,” ECCV 2022, Tel Aviv, Israel, October 2022.
Zhenheng Tang, Yonggang Zhang, Shaohuai Shi, Xin He, Bo Han, and Xiaowen Chu, “Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning,” ICML 2022, Baltimore, Maryland, July 2022.
Zhenheng Tang, Zhikai Hu, Shaohuai Shi, Yiu-Ming Cheung, Yilun Jin, Zhenghang Ren, and Xiaowen Chu, “Data Resampling for Federated Learning with Non-IID Labels,” International Workshop on Federated and Transfer Learning for Data Sparsity and Confidentiality in Conjunction with IJCAI 2021 (FTL-IJCAI’21), Virtual Event, August 2021.
Shaohuai Shi, Lin Zhang, and Bo Li, “Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks,” IEEE ICDCS 2021, Virtual Event, July 2021.
Shaohuai Shi, Xiaowen Chu, and Bo Li, “Exploiting Simultaneous Communications to Accelerate Data Parallel Distributed Deep Learning,” IEEE INFOCOM 2021, Virtual Event, May 2021. (Best Paper Award, 3 out of 1266 submissions)
Shaohuai Shi*, Xianhao Zhou*, Shutao Song*, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, and Xiaowen Chu, “Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters,” The 4th Conference on Machine Learning and Systems (MLSys) 2021, Virtual Event, April 2021.
Xin He, Shihao Wang, Xiaowen Chu, Shaohuai Shi, Jiangping Tang, Jiyong Zhang, Xin Liu, Chenggang Yan, and Guiguang Ding, “Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans,” AAAI 2021, Virtual Event, February 2021.
Shaohuai Shi, Xiaowen Chu, and Bo Li, “MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning,” IEEE Transactions on Parallel and Distributed Systems (TPDS), 2021.
Shaohuai Shi, Zhenheng Tang, Xiaowen Chu, Chengjian Liu, Wei Wang, and Bo Li, “A Quantitative Survey of Communication Optimizations in Distributed Deep Learning,” IEEE Network, 2020.
Zhenheng Tang, Shaohuai Shi, and Xiaowen Chu, “Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection,” IEEE ICDCS 2020 (Poster), Singapore, December 2020.
Shaohuai Shi, Qiang Wang, and Xiaowen Chu, “Efficient Sparse-Dense Matrix-Matrix Multiplication on GPUs Using the Customized Sparse Storage Format,” IEEE ICPADS 2020, Hong Kong, December 2020.
Yuxin Wang, Qiang Wang, Shaohuai Shi, Xin He, Zhenheng Tang, Kaiyong Zhao, and Xiaowen Chu, “Benchmarking the Performance and Power of AI Accelerators for AI Training,” The 3rd High Performance Machine Learning Workshop (HPML 2020), Melbourne, Australia, November 2020.
Shaohuai Shi, Qiang Wang, Xiaowen Chu, Bo Li, Yang Qin, Ruihao Liu, and Xinxiao Zhao, “Communication-Efficient Distributed Deep Learning with Merged Gradient Sparsification on GPUs,” IEEE INFOCOM 2020, Toronto, Canada, July 2020.
Shaohuai Shi, Zhenheng Tang, Qiang Wang, Kaiyong Zhao, and Xiaowen Chu, “Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees,” The 24th European Conference on Artificial Intelligence (ECAI), Santiago de Compostela, Spain, June 2020.
Qiang Wang*, Shaohuai Shi*, Shizhen Zheng, Kaiyong Zhao, and Xiaowen Chu, “FADNet: A Fast and Accurate Network for Disparity Estimation,” International Conference on Robotics and Automation (ICRA), Paris, France, June 2020.
Xin He, Shihao Wang, Shaohuai Shi, Zhenheng Tang, Yuxin Wang, Zhihao Zhao, Jing Dai, Ronghao Ni, Xiaofeng Zhang, Xiaoming Liu, Zhili Wu, Wu Yu, and Xiaowen Chu, “Computer-Aided Clinical Skin Disease Diagnosis Using CNN and Object Detection Models,” KDDBHI Workshop 2019, IEEE BigData Conference, Los Angeles, CA, December 2019.
Shaohuai Shi, Kaiyong Zhao, Qiang Wang, Zhenheng Tang, and Xiaowen Chu, “A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification,” IJCAI 2019, Macao, China, August 2019.
Shaohuai Shi, Qiang Wang, Kaiyong Zhao, Zhenheng Tang, Yuxin Wang, Xiang Huang, and Xiaowen Chu, “A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks,” IEEE ICDCS 2019, Texas, USA, July 2019.
Shaohuai Shi, Xiaowen Chu, and Bo Li, “MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms,” IEEE INFOCOM 2019, Paris, France, May 2019.
Xianyan Jia*, Shutao Song*, Shaohuai Shi*, Wei He, Yangzihao Wang, Haidong Rong, Feihu Zhou, Liqiang Xie, Zhenyu Guo, Yuanzhou Yang, Liwei Yu, Tiegang Chen, Guangxiao Hu, and Xiaowen Chu, “Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes,” NeurIPS 2018 Workshop on Systems for ML and Open Source Software, Montreal, Canada, December 2018.
Shaohuai Shi, Qiang Wang, Xiaowen Chu, and Bo Li, “A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning,” IEEE ICPADS 2018, Singapore, December 2018.
Shaohuai Shi, Qiang Wang, and Xiaowen Chu, “Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs,” IEEE DataCom 2018, Athens, Greece, August 2018. (Best Paper Award)
Shaohuai Shi, Pengfei Xu, and Xiaowen Chu, “Supervised Learning Based Algorithm Selection for Deep Neural Networks,” IEEE ICPADS 2017, Shenzhen, China, December 2017.
Pengfei Xu, Shaohuai Shi, and Xiaowen Chu, “Performance Evaluation of Deep Learning Tools in Docker Containers,” The 3rd International Conference on Big Data Computing and Communications (BigCom), Chengdu, China, August 2017.
Shaohuai Shi, Qiang Wang, Pengfei Xu, and Xiaowen Chu, “Benchmarking State-of-the-art Deep Learning Software Tools,” The 7th International Conference on Cloud Computing and Big Data (CCBD), Macao, China, November 2016.
Jingjing Chen, You Li, Xiaowen Chu, Shaohuai Shi, Tang Tao, Lin Cui, Zhiling Xu, and Jianliang Xu, “Ebanshu: An Interactivity-aware Blended Virtual Learning Environment,” The 9th International Conference on Internet and Web Applications and Services (ICIW), Paris, France, July 2014.
Shuhan Qi, Xuan Wang, and Shaohuai Shi, “Mixed Precision Method for GPU-based FFT,” The 14th IEEE International Conference on Computational Science and Engineering, Dalian, China, August 2011.
Jiangfeng Peng, Hu Chen, and Shaohuai Shi, “The GPU-based String Matching System in Advanced AC Algorithm,” The 10th IEEE International Conference on Computer and Information Technology, West Yorkshire, UK, June 2010.

Preprints

Zhenheng Tang, Shaohuai Shi, Xiaowen Chu, Wei Wang, and Bo Li, “Communication-Efficient Distributed Deep Learning: A Comprehensive Survey,” March 2020.
Shaohuai Shi, Xiaowen Chu, Ka Chun Cheung, and Simon See, “Understanding Top-k Sparsification in Distributed Deep Learning,” 2019.
Shaohuai Shi and Xiaowen Chu, “Speeding Up Convolutional Neural Networks by Exploiting the Sparsity of Rectifier Units,” April 2017.
