FedExIT - Missing Class-agnostic Semi-Supervised Federated Learning with Extreme Imbalance Tackling Scheme

Saha P., Mishra D., Wagner F., Kamnitsas K., Noble JA.

Most Federated Learning schemes assume either that all clients possess fully annotated, balanced data or that labels pertain to the same set of classes for each client. In this paper, working towards a more general, realistic, and practical framework, we relax both assumptions, accommodating: (a) absence of annotated data in most clients, (b) non-IID client data distribution, (c) highly imbalanced client class distribution, and (d) non-identical client class sets with missing classes in different clients. To this end, we propose FedExIT (Federated Learning with Extreme Imbalance Tackling), which has three components. First, it includes a theoretically grounded inter-class proximity factor to tackle severe imbalance and missing classes. Second, for the unlabeled clients, FedExIT introduces a confidence margin-weighted dual mean teacher that trains the student model with uncertainty-aware guidance from two teacher models. As the commonly used mean teacher is rather unstable in the early training phase, we leverage a foundation model, DINOv2, finetuned on a labeled client, as an auxiliary teacher. Third, to further reduce classifier bias, FedExIT leverages a client-adaptive classifier finetuning strategy that generates balanced, synthetic embeddings around global prototypes in the feature space of each client. We conduct experiments on seven well-known datasets: (i) three overall balanced datasets, viz. SVHN, CIFAR-10, and CIFAR-100; (ii) three overall imbalanced datasets, viz. CIFAR-10 LT, CIFAR-100 LT, and ISIC-2018; and (iii) a large dataset, iNaturalist 2021, with 10,000 classes (to check scalability). We simulate several FL settings with varying numbers of clients, proportions of labeled clients, and degrees of heterogeneity, which demonstrate the superiority of FedExIT over 10 baseline methods.
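The abstract does not give the exact weighting rule for the dual teacher, so the following is only an illustrative sketch and not the authors' implementation. It assumes that the "confidence margin" is the gap between a teacher's top two class probabilities, and that the two teachers' predictions are blended in proportion to their margins. The function names (`confidence_margin`, `dual_teacher_target`, `ema_update`) and the `eps` and `decay` parameters are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_margin(probs):
    """Assumed definition: gap between the two largest class probabilities.

    Requires at least two classes.
    """
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def dual_teacher_target(logits_a, logits_b, eps=1e-8):
    """Blend two teachers' predictions, weighting each by its margin.

    Returns the blended soft target and the pair of teacher weights.
    This weighting rule is an assumption, not taken from the paper.
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    m_a, m_b = confidence_margin(p_a), confidence_margin(p_b)
    # eps keeps the weights defined (0.5 each) when both margins are zero.
    w_a = (m_a + eps) / (m_a + m_b + 2 * eps)
    w_b = 1.0 - w_a
    target = [w_a * a + w_b * b for a, b in zip(p_a, p_b)]
    return target, (w_a, w_b)


def ema_update(teacher, student, decay=0.99):
    """Standard mean-teacher step: teacher weights follow an exponential
    moving average of the student weights (flat lists of floats here)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


if __name__ == "__main__":
    # Teacher A is confident about class 0; teacher B is nearly uniform.
    target, weights = dual_teacher_target([4.0, 0.0, 0.0], [0.1, 0.0, 0.0])
    print("soft target:", target)
    print("teacher weights:", weights)
```

In this sketch the more decisive teacher dominates the soft target, which is one plausible way an auxiliary teacher could steady training while the mean teacher is still unreliable.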

DOI

10.1016/j.inffus.2025.104080

Type

Journal article

Publisher

Elsevier

Publication Date

2026-06-01

Volume

130

Pages

104080 - 104080

Keywords

46 Information and Computing Sciences, 4611 Machine Learning
