Improving Diffusion Models for Class-imbalanced Training Data via Capacity Manipulation
Feng Hong, Jiangchao Yao, Yifei Shen, Dongsheng Li, Ya Zhang, Yanfeng Wang
Abstract
While diffusion models have achieved remarkable performance in image generation, they often struggle with the imbalanced datasets frequently encountered in real-world applications, resulting in significant performance degradation on minority classes. In this paper, we identify model capacity allocation as a key and previously underexplored factor contributing to this issue, providing a perspective that is orthogonal to existing research. Our empirical experiments and theoretical analysis reveal that majority classes monopolize an unnecessarily large portion of the model's capacity, thereby restricting the representation of minority classes. To address this, we propose Capacity Manipulation (CM), which explicitly reserves model capacity for minority classes. Our approach leverages a low-rank decomposition of model parameters and introduces a capacity manipulation loss to allocate appropriate capacity for capturing minority knowledge, thus enhancing minority class representation. Extensive experiments demonstrate that CM consistently and significantly improves the robustness of diffusion models on imbalanced datasets, and when combined with existing methods, further boosts overall performance.
Capacity manipulation improves diffusion models' handling of class-imbalanced data by reserving capacity for minority classes via low-rank decomposition.
- Identifies model capacity allocation as key factor in class imbalance performance degradation
- Proposes capacity manipulation loss using low-rank decomposition to explicitly reserve model capacity for minority classes
- Demonstrates robustness to extreme imbalance ratios across various image datasets and training scenarios
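The core mechanism described above — reserving part of a layer's capacity for minority classes via a low-rank decomposition of the parameters — can be sketched as follows. This is a speculative illustration, not the paper's implementation: the LoRA-style split `W = W_base + B @ A`, the rank `r`, and the zero initialization of `B` are assumptions for exposition, and the paper's actual capacity manipulation loss is not reproduced here.

```python
import numpy as np

# Illustrative sketch (assumed form, not the paper's exact method):
# a shared base weight plus a rank-r branch B @ A reserved for
# minority-class knowledge, in the spirit of low-rank adaptation.
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2  # r ranks explicitly reserved for minority classes

W_base = rng.normal(size=(d_out, d_in))   # capacity shared by all classes
B = np.zeros((d_out, r))                  # reserved low-rank factors
A = rng.normal(size=(r, d_in)) * 0.01

def forward(x, use_reserved=True):
    """Apply the effective weight: base plus the reserved low-rank branch."""
    W = W_base + (B @ A if use_reserved else 0.0)
    return W @ x

x = rng.normal(size=d_in)
# With B initialized to zero, the reserved branch starts as a no-op,
# so majority-class behavior is untouched until minority updates fill it.
assert np.allclose(forward(x, use_reserved=True),
                   forward(x, use_reserved=False))
```

The zero-initialized `B` keeps the reserved branch inert at the start of training; a capacity manipulation loss would then steer updates so that minority-class knowledge is captured in the `B @ A` factors rather than competing with majority classes for the shared base capacity.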
- Low-rank decomposition
- Capacity manipulation loss
- Diffusion models
- iNaturalist
- ImageNet
Limitations
- While CM demonstrates robustness to extreme imbalance ratios (e.g., IR=500 on iNaturalist), it is limited in scenarios of absolute sample scarcity, where the reserved capacity lacks sufficient data to learn meaningful representations, degrading generation quality (from the paper)
- The applicability and adaptation of capacity manipulation to other data modalities (e.g., video, 3D data) and other types of generative models remain unexplored (from the paper)
Future work
- Evaluate the applicability and potential adaptations of capacity manipulation for different data modalities (e.g., video, 3D data) and other types of generative models (from the paper)
- Address the few-shot to zero-shot transition when the absolute number of minority samples is extremely low (from the paper)
Author keywords
- Imbalance
- Diffusion Models
Related orals
Depth Anything 3: Recovering the Visual Space from Any Views
DA3 predicts spatially consistent 3D geometry from arbitrary camera views using a plain transformer and depth-ray targets.
Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
VIST3A stitches text-to-video models with 3D reconstruction systems and aligns them via reward finetuning for high-quality text-to-3D generation.
Radiometrically Consistent Gaussian Surfels for Inverse Rendering
RadioGS introduces radiometric consistency supervision for inverse rendering to accurately model indirect illumination in Gaussian-based representations.
True Self-Supervised Novel View Synthesis is Transferable
Presents XFactor, the first geometry-free self-supervised model for transferable novel view synthesis without 3D inductive biases.
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
Introduces parallel decoding for autoregressive image generation with flexible ordering, achieving a 3.4x latency reduction.