ICLR 2026 Orals

Let Features Decide Their Own Solvers: Hybrid Feature Caching for Diffusion Transformers

Shikang Zheng, Guantao Chen, Qinming Zhou, Yuqi Lin, Lixuan He, Chang Zou, Peiliang Cai, Jiacheng Liu, Linfeng Zhang

LLMs & Reasoning · Thu, Apr 23 · 10:54 AM–11:04 AM · 201 A/B · Avg rating: 7.00 (4–10)

Abstract

Diffusion Transformers offer state-of-the-art fidelity in image and video synthesis, but their iterative sampling process remains a major bottleneck due to the high cost of transformer forward passes at each timestep. To mitigate this, feature caching has emerged as a training-free acceleration technique that reuses hidden representations. However, existing methods often apply a uniform caching strategy across all feature dimensions, ignoring their heterogeneous dynamic behaviors. We therefore adopt a new perspective by modeling hidden feature evolution as a mixture of ODEs across dimensions, and introduce HyCa, a hybrid ODE-solver-inspired caching framework that applies dimension-wise caching strategies. HyCa achieves near-lossless acceleration across diverse tasks and models, including a 5.55× speedup on FLUX, a 5.56× speedup on HunyuanVideo, and a 6.24× speedup on Qwen-Image and Qwen-Image-Edit, all without retraining.
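To make the dimension-wise idea concrete, the following is a minimal NumPy sketch, not the authors' code, of what a hybrid caching step could look like. It assumes each hidden dimension has been pre-assigned either a zeroth-order "reuse" rule or a first-order "extrapolate" rule; the function name `hybrid_cache_step` and all shapes are illustrative.

```python
import numpy as np

# Hypothetical sketch of dimension-wise hybrid caching (illustrative, not HyCa's code).
# Assumption: each hidden dimension was assigned one of two update rules offline:
#   0 -> reuse the cached value (zeroth-order step)
#   1 -> linearly extrapolate from the two most recent cached features (first-order step)

def hybrid_cache_step(cached_prev, cached_curr, strategy_per_dim):
    """Predict next-step features without running a transformer forward pass.

    cached_prev, cached_curr : np.ndarray, shape (num_dims,)
        Hidden features stored at the two most recent fully computed timesteps.
    strategy_per_dim : np.ndarray, shape (num_dims,), values in {0, 1}
    """
    reuse = cached_curr                                       # hold the cached value
    extrapolate = cached_curr + (cached_curr - cached_prev)   # continue the local trend
    return np.where(strategy_per_dim == 1, extrapolate, reuse)

# Toy usage: a drifting dimension benefits from extrapolation,
# while a nearly static dimension is served well by plain reuse.
prev = np.array([0.10, 0.50])
curr = np.array([0.20, 0.50])
strategies = np.array([1, 0])
print(hybrid_cache_step(prev, curr, strategies))  # -> [0.3 0.5]
```

The intuition matching the abstract is that slowly varying dimensions lose little from plain reuse, while fast-moving dimensions are better approximated by a higher-order-style step, so cached timesteps can stand in for full forward passes with little quality loss.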

One-sentence summary · Auto-generated by claude-haiku-4-5-20251001

HyCa uses hybrid ODE solvers with dimension-wise caching strategies to accelerate diffusion transformers by 5–6× without retraining.

Contributions · Auto-generated by claude-haiku-4-5-20251001
  • Models hidden feature evolution as a mixture of ODEs across dimensions in diffusion transformers
  • Introduces the HyCa framework, applying dimension-wise caching strategies instead of a uniform strategy
  • Achieves a 5.55× speedup on FLUX and 6.24× on Qwen-Image without retraining
  • Demonstrates compatibility with distilled models and LoRA fine-tuning
Methods used · Auto-generated by claude-haiku-4-5-20251001
  • ODE solvers
  • Feature clustering (see the sketch after this list)
  • Caching strategies
  • Diffusion models
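As a rough illustration of the feature-clustering step referenced above, the sketch below groups feature dimensions by simple statistics of their temporal changes collected from a short calibration run. The function `assign_strategies`, the choice of descriptors, and the use of k-means are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical offline clustering of feature dimensions by their dynamics.
# `calib_feats` would hold hidden features recorded at consecutive timesteps
# during a short calibration run; names and shapes are illustrative.

def assign_strategies(calib_feats, n_clusters=2):
    """Group feature dimensions by how their values change over time.

    calib_feats : np.ndarray, shape (num_timesteps, num_dims)
    Returns one cluster id per dimension; each cluster can then be mapped
    to a caching rule (e.g., reuse vs. extrapolate).
    """
    deltas = np.diff(calib_feats, axis=0)                   # per-step change, (T-1, D)
    # Describe each dimension by the mean and variance of its per-step changes.
    descriptors = np.stack([deltas.mean(axis=0), deltas.var(axis=0)], axis=1)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(descriptors)

# Toy usage: random-walk calibration features for 10 timesteps and 64 dimensions.
rng = np.random.default_rng(0)
calib = rng.normal(size=(10, 64)).cumsum(axis=0)
print(assign_strategies(calib)[:8])  # one cluster label per dimension
```

Each resulting cluster of dimensions could then be tied to one of the update rules sketched earlier, which is the sense in which features "decide their own solvers."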
Datasets used · Auto-generated by claude-haiku-4-5-20251001
  • FLUX text-to-image
  • HunyuanVideo
  • Qwen-Image
  • Qwen-Image-Edit
Limitations (author-stated) · Auto-generated by claude-haiku-4-5-20251001

Authors did not state explicit limitations.

Future work (author-stated) · Auto-generated by claude-haiku-4-5-20251001
  • Extend the mixture-of-ODE perspective to other generative models (from the paper)
  • Explore learning-based caching strategies to further enhance efficiency (from the paper)

Author keywords

  • Generative models
  • Efficient ML
  • Diffusion Transformer Acceleration
  • Feature Caching
