Scaffold Extensions for Client Drift Mitigation in Federated Learning: A Synthesis of Approaches, Limitations, and Future Directions
Date
2026
Author
Muthii, James Mburu
Wanjau, Stephen K.
Njenga, Stephen
Abstract
Client drift arising from non-independent and identically distributed (non-IID) data across participating clients remains one of the most critical obstacles to effective Federated Learning (FL). The Scaffold algorithm, which introduces control variates to correct local gradient updates, has emerged as one of the most prominent variance reduction methods for mitigating this drift. Although numerous extensions to Scaffold have been proposed, no systematic review has focused exclusively on the Scaffold algorithm and its control variate mechanism for client drift mitigation. The research community therefore lacks a consolidated understanding of how Scaffold has been extended, what limitations persist, and which characteristics remain underexplored. This study addresses that gap through a systematic literature review guided by the PRISMA 2020 guidelines. Seven electronic databases were searched for publications from 2016 to 2026, yielding 1,847 records. After duplicate removal, screening, and full-text eligibility assessment, 33 studies were included; the eligibility criteria required each study to address Scaffold or control variates for client drift in FL and to cover at least two performance metrics. Data were synthesized thematically using frequency counts and tabular summaries. The review identifies nine distinct extension approaches. Variance reduction via gradient estimation techniques was the most prevalent (11 studies, 34%), followed by integration with advanced optimization algorithms (8 studies, 25%); together these account for 59% of the reviewed work. Twelve Scaffold characteristics were targeted for extension, with variance reduction the most commonly modified (37%, rising to 50% with combined categories), while communication mechanism, privacy budget allocation, and similarity-based approaches remained significantly underexplored.
Recurring limitations across all approaches included communication and computational overhead, hyperparameter sensitivity, restrictive theoretical assumptions, performance degradation under extreme data heterogeneity, and limited large-scale empirical validation. A notable finding is that similarity-based approaches for client drift mitigation are largely absent from the literature, with only one study employing a similarity measure. The review, therefore, recommends future investigation of similarity-based methods as adaptive control variates within the Scaffold protocol, alongside prioritization of communication-efficient, privacy-preserving designs validated at scale. This research was self-sponsored with no external funding.
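For readers unfamiliar with the control variate mechanism that the reviewed extensions build on, the following is a minimal sketch of the base Scaffold update, not code from any reviewed study. It assumes a scalar model, full client participation, and toy quadratic client objectives f_i(w) = 0.5 (w - t_i)^2. The function names (`local_update`, `scaffold_round`), the targets, and the step sizes are hypothetical choices made only for illustration.

```python
def local_update(x, c, c_i, grad_fn, lr, steps):
    """One client's local pass, starting from the server model x."""
    y = x
    for _ in range(steps):
        # Corrected step: local gradient, minus the client control variate,
        # plus the server control variate.
        y -= lr * (grad_fn(y) - c_i + c)
    # Refresh the client variate from the net local movement
    # (the cheaper of the two update options in the original Scaffold paper).
    c_i_new = c_i - c + (x - y) / (steps * lr)
    # Return the model delta, the variate delta, and the new client variate.
    return y - x, c_i_new - c_i, c_i_new


def scaffold_round(x, c, client_cs, grad_fns, lr=0.1, steps=5, server_lr=1.0):
    """One communication round with every client participating (assumption)."""
    n = len(grad_fns)
    results = [local_update(x, c, c_i, g, lr, steps)
               for c_i, g in zip(client_cs, grad_fns)]
    x += server_lr * sum(r[0] for r in results) / n  # aggregate model deltas
    c += sum(r[1] for r in results) / n              # aggregate variate deltas
    return x, c, [r[2] for r in results]


# Hypothetical non-IID setup: each client pulls toward a different optimum t_i.
targets = [-4.0, 1.0, 9.0]
grad_fns = [lambda w, t=t: w - t for t in targets]  # gradient of 0.5*(w - t)^2

x, c, client_cs = 0.0, 0.0, [0.0] * len(targets)
for _ in range(50):
    x, c, client_cs = scaffold_round(x, c, client_cs, grad_fns)
```

In this toy setting the server model `x` settles at the mean of the client optima, and each entry of `client_cs` approaches that client's gradient at the shared optimum, which is the quantity the correction term is meant to track. The extension approaches catalogued in the review modify parts of this loop, for example how the variates are estimated or communicated.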
Collections
- Journal Articles (CI) [139]
