Utility Aware Optimal Data Selection for Differentially Private Federated Learning in IoV
IEEE Internet of Things Journal (IF 8.2), Pub Date: 7-12-2024, DOI: 10.1109/jiot.2024.3427132
Jiancong Zhang, Shining Li, Changhao Wang

Federated learning trains models over distributed datasets, so the choice of which datasets participate has a significant impact on model performance. Personalized differential privacy, however, introduces heterogeneity into vehicular datasets: stronger privacy protection may reduce a local model's contribution to convergence of the global model. The goal of this paper is therefore to dynamically optimize the combination of datasets to tackle this heterogeneity in differentially private federated learning in the Internet of Vehicles (IoV). This is extremely challenging without direct data access or a visible training process. To address it, we propose an efficient hierarchical data selection method. First, utility is evaluated using a convergence bound derived from the noise function and the cost function. Accordingly, a set of high-value clients is selected to maximize the potential contribution of the combination to the global model. Then, we design an optimization function based on the unknown variables within the convergence bound and develop a low-complexity algorithm to approximate the sampling probability. Meanwhile, the aggregation weight of each model is adjusted to ensure unbiased estimation. Experimental results on two real-world trajectory datasets show that the scheme can reduce the meter error by 8.90% and 15.97%, respectively, and improve the convergence speed by 23.9% and 27.1%, respectively.
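
The abstract's last methodological step (sampling clients with non-uniform probability while adjusting aggregation weights to keep the estimate unbiased) can be illustrated with a generic importance-weighting sketch. This is not the paper's algorithm: the function names, the sampling-with-replacement scheme, and the toy numbers below are all assumptions chosen for illustration. The idea is that if client i is drawn with probability q_i and has data weight p_i, scaling its update by p_i / (K q_i) over K draws makes the expected aggregate equal to the full-participation weighted average.

```python
import random

def sample_clients(probs, k, rng):
    """Draw k client indices with replacement according to probs (assumed scheme)."""
    return rng.choices(range(len(probs)), weights=probs, k=k)

def unbiased_aggregate(updates, data_weights, probs, selected):
    """Importance-weighted sum: each draw of client i adds p_i / (K * q_i) * update_i."""
    k = len(selected)
    dim = len(updates[0])
    agg = [0.0] * dim
    for i in selected:
        scale = data_weights[i] / (k * probs[i])
        for d in range(dim):
            agg[d] += scale * updates[i][d]
    return agg

def full_aggregate(updates, data_weights):
    """Reference value: weighted average over all clients (full participation)."""
    dim = len(updates[0])
    return [sum(w * u[d] for w, u in zip(data_weights, updates)) for d in range(dim)]

# Hypothetical toy values, not taken from the paper.
updates = [[1.0, -2.0], [0.5, 0.0], [-1.0, 4.0]]  # per-client model updates
data_weights = [0.5, 0.3, 0.2]                    # p_i, e.g. data-size shares
probs = [0.2, 0.3, 0.5]                           # q_i, sampling probabilities

rng = random.Random(0)
selected = sample_clients(probs, k=2, rng=rng)
estimate = unbiased_aggregate(updates, data_weights, probs, selected)
```

A single round's `estimate` will generally differ from `full_aggregate(updates, data_weights)`; only its expectation over the random draws matches. How the sampling probabilities themselves are chosen (the optimization over the convergence bound) is not specified in the abstract and is not modeled here.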

Updated: 2024-08-22