Nanhu News Network report (Correspondent: Chen Jun) — Recently, Professor Chen Hong's team at the College of Informatics of our university published a machine-learning theory paper at NeurIPS 2023 (Thirty-seventh Conference on Neural Information Processing Systems, CCF Class A), a leading international conference in artificial intelligence. The paper, titled "Fine-Grained Theoretical Analysis of Federated Zeroth-Order Optimization," systematically analyzes the stability and generalization of federated zeroth-order learning algorithms.
In recent years, federated learning, which addresses data-privacy protection and distributed computing, has attracted broad attention from researchers in artificial intelligence. For practical scenarios where gradient information is difficult to obtain, the federated zeroth-order optimization algorithm (FedZO) was proposed: it incorporates zeroth-order optimization to bypass gradient computation, allowing multiple local clients to collaboratively train a global model while preserving data privacy. However, the theoretical foundations of FedZO remain relatively underexplored.
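The zeroth-order idea mentioned above can be illustrated with a standard two-point gradient estimator, which approximates a gradient using only function evaluations. This is a minimal sketch for intuition; the function, smoothing radius `mu`, and number of directions are illustrative choices, not taken from the paper:

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_dirs=500, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u over random
    Gaussian directions u, so only function values are needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_dirs):
        u = rng.standard_normal(d)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / num_dirs

# Sanity check: f(x) = ||x||^2 has true gradient 2x.
f = lambda x: float(x @ x)
x = np.array([1.0, -2.0])
est = zo_gradient(f, x, num_dirs=2000, rng=np.random.default_rng(0))
```

Because the estimate is built from finite differences along random directions, it is unbiased (up to O(mu) smoothing error) but noisy, which is exactly why stability and generalization analyses of zeroth-order methods require dedicated tools.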
In light of this, the paper develops algorithmic-stability tools to establish a generalization analysis of federated zeroth-order optimization, gives sufficient conditions for generalization guarantees, and identifies the main factors that affect generalization. Specifically, the paper first characterizes the quantitative relationship between the generalization error of FedZO in the general setting and L1 on-average model stability; it then establishes stability analyses for the synchronous FedZO algorithm under Lipschitz conditions, the synchronous FedZO algorithm under heavy-tailed conditions, and the asynchronous FedZO algorithm, and derives optimal optimization error bounds for the latter two settings. Notably, the theoretical results are consistent with existing experimental observations of federated zeroth-order learning algorithms.
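As a rough illustration of the synchronous setting analyzed above (a sketch of the general FedZO pattern, not the paper's exact algorithm or hyperparameters): in each round every client estimates gradients from function values on its local objective, takes a few local steps, and the server averages the resulting models:

```python
import numpy as np

def zo_grad(loss, w, mu=1e-4, num_dirs=10, rng=None):
    """Two-point zeroth-order gradient estimate, averaged over random directions."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(w)
    for _ in range(num_dirs):
        u = rng.standard_normal(w.shape[0])
        g += (loss(w + mu * u) - loss(w - mu * u)) / (2 * mu) * u
    return g / num_dirs

def fedzo_round(w_global, client_losses, lr=0.05, local_steps=5, rng=None):
    """One synchronous round: each client runs local zeroth-order steps
    starting from the global model, then the server averages the models."""
    rng = np.random.default_rng() if rng is None else rng
    updated = []
    for loss in client_losses:
        w = w_global.copy()
        for _ in range(local_steps):
            w -= lr * zo_grad(loss, w, rng=rng)
        updated.append(w)
    return np.mean(updated, axis=0)

# Toy run: two clients with quadratic losses; the average loss is
# minimized at the mean of the two targets, i.e. [0.5, 0.5].
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
clients = [lambda w, t=t: float((w - t) @ (w - t)) for t in targets]
rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(50):
    w = fedzo_round(w, clients, rng=rng)
```

In the asynchronous variant studied in the paper, clients would return updates at different times instead of being averaged in lockstep, which is what necessitates the new error decomposition mentioned in the abstract.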
Chen Jun, a doctoral student (Class of 2022) at the College of Informatics, is the first author of the paper, and Professor Chen Hong is the corresponding author. Professor Bin Gu of MBZUAI and Hao Deng of the College of Informatics also contributed to the research. The work was supported by a General Program grant of the National Natural Science Foundation of China, among other sources.
Conference link: https://neurips.cc/
[English abstract] The federated zeroth-order optimization (FedZO) algorithm enjoys the advantages of both zeroth-order optimization and federated learning, and has shown exceptional performance on black-box attack and softmax regression tasks. However, there is no generalization analysis for FedZO, and the convergence rate established for it is slower than in the corresponding first-order optimization setting. This paper aims to establish systematic theoretical assessments of FedZO by developing the analysis technique of on-average model stability. We establish the first generalization error bound of FedZO under the Lipschitz continuity and smoothness conditions. Then, refined generalization and optimization bounds are provided by replacing the bounded-gradient assumption with heavy-tailed gradient noise and utilizing the second-order Taylor expansion for gradient approximation. With the help of a new error decomposition strategy, our theoretical analysis is also extended to the asynchronous case. For FedZO, our fine-grained analysis fills the theoretical gap in the generalization guarantees and sharpens the convergence characterization of the computing algorithm.
Reviewed by: Hao Deng