ICLR 2026 Orals

Structured Flow Autoencoders: Learning Structured Probabilistic Representations with Flow Matching

Yidan Xu, Yixin Wang, XuanLong Nguyen

Causal & Statistical Methods · Sat, Apr 25 · 3:51 PM–4:01 PM · 201 A/B · Avg rating: 6.00 (range 4–8)
Author-provided TL;DR

A framework that composes any probabilistic graphical model with flow matching, jointly learning structured latent representations and high-fidelity generative models through a single objective.

Abstract

Flow matching is a powerful approach for high-fidelity density estimation, but it often fails to capture the latent structure of complex data. Probabilistic models like variational autoencoders (VAEs), on the other hand, learn structured representations but underperform in sample quality. We propose Structured Flow Autoencoders (SFA), a family of probabilistic models that augments graphical models with conditional continuous normalizing flow (CNF) likelihoods, enabling flow-matching-based structured representation learning. At the core of SFA is a novel flow matching objective that explicitly accounts for latent variables, allowing joint learning of the CNF likelihood and posterior. SFA applies broadly to graphical models with continuous and mixture latents, as well as latent dynamical systems. Empirical studies across image, video, and RNA-seq data show that SFA consistently outperforms VAEs and their structured extensions in generation quality, representation utility, and scalability to large datasets. Compared to generative models like latent flow matching (LatentFM), SFA also produces more diverse samples, suggesting better coverage of the data distribution.
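
For intuition, the "flow matching objective that explicitly accounts for latent variables" can be written schematically. The form below is an illustrative assumption, not the paper's exact objective: a linear-path conditional flow matching loss whose velocity field v_θ is conditioned on a latent z drawn from an amortized posterior q_φ(z|x₁), plus a penalty matching that posterior to the graphical-model prior p(z):

$$
\mathcal{L}(\theta,\varphi)=\mathbb{E}_{x_1\sim p_{\mathrm{data}},\; z\sim q_\varphi(z\mid x_1),\; t\sim\mathcal{U}[0,1],\; x_0\sim\mathcal{N}(0,I)}\big\|\,v_\theta(x_t,t,z)-(x_1-x_0)\,\big\|^2+\beta\,\mathrm{KL}\big(q_\varphi(z\mid x_1)\,\|\,p(z)\big),\qquad x_t=(1-t)\,x_0+t\,x_1 .
$$

Replacing a plain Gaussian prior p(z) with a mixture or a latent dynamical model would yield the structured variants the abstract describes.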

One-sentence summary · Auto-generated by claude-haiku-4-5-20251001

Structured Flow Autoencoders integrate flow matching with graphical models for structured representation learning.

Contributions · Auto-generated by claude-haiku-4-5-20251001
  • A structured conditional flow matching objective that explicitly accounts for latent variables
  • Enables joint learning of the CNF likelihood and the posterior for probabilistic graphical models (a minimal training sketch follows this list)
  • Framework applicable to continuous, mixture, and dynamical latent structures
  • Outperforms VAEs and latent flow matching on generation quality and representation utility
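
As a concrete illustration of the joint objective named above, here is a minimal sketch, assuming a Gaussian encoder in place of the structured posterior, a linear probability path, and a KL-weighted single loss. All module names, dimensions, and the exact loss terms are hypothetical, not the paper's implementation:

```python
import torch
import torch.nn as nn

class LatentFlowAutoencoder(nn.Module):
    """Hypothetical sketch (not the paper's implementation): a Gaussian
    encoder stands in for the structured posterior q(z|x), and a
    latent-conditioned velocity field v(x_t, t, z) is trained jointly
    with it through a single flow-matching loss plus a KL regularizer."""

    def __init__(self, data_dim=784, latent_dim=16, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(data_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, 2 * latent_dim),  # posterior mean and log-variance
        )
        self.velocity = nn.Sequential(
            nn.Linear(data_dim + latent_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, data_dim),
        )

    def loss(self, x1, beta=1e-3):
        # Amortized Gaussian posterior with the reparameterization trick.
        mu, logvar = self.encoder(x1).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()

        # Linear probability path between noise x0 and data x1.
        x0 = torch.randn_like(x1)
        t = torch.rand(x1.size(0), 1, device=x1.device)
        xt = (1 - t) * x0 + t * x1
        target = x1 - x0  # velocity of the linear path

        # One loss trains both the velocity field and the encoder.
        v = self.velocity(torch.cat([xt, z, t], dim=-1))
        flow_matching = ((v - target) ** 2).sum(-1).mean()
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1).mean()
        return flow_matching + beta * kl
```

Training then reduces to backpropagating loss(batch) on minibatches; swapping the Gaussian encoder for a mixture or dynamical posterior is where the structured variants would differ.
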
Methods used · Auto-generated by claude-haiku-4-5-20251001
  • Flow matching
  • Graphical models
  • Continuous normalizing flows
  • Variational autoencoders
Limitations (author-stated) · Auto-generated by claude-haiku-4-5-20251001
  • The CNF posterior variant remains computationally expensive because an ODE solver must run at every gradient step (illustrated in the sketch after this list)
  • Principled strategies for selecting the posterior family remain an open question
  • Designing UNet-based decoders that condition on a learned stochastic latent is an architectural challenge
  • Standard conditioning mechanisms may be insufficient when the latent carries distributional uncertainty
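
To see why the CNF-posterior variant is expensive, consider what a single posterior sample costs: integrating a neural ODE. Below is a minimal sketch using torchdiffeq; the (z, t, x) conditioning signature of velocity_net is an assumption for illustration:

```python
import torch
from torchdiffeq import odeint  # pip install torchdiffeq

def cnf_posterior_sample(velocity_net, x, z0):
    """Hypothetical illustration: one sample z ~ q(z|x) from a CNF
    posterior requires solving an ODE whose right-hand side is a
    neural network, so every gradient step pays for many network
    evaluations inside the solver (plus the backward/adjoint pass)."""

    def ode_rhs(t, z):
        # Velocity field conditioned on the observation x; the
        # (z, t, x) argument order is assumed for illustration.
        return velocity_net(z, t.expand(z.size(0), 1), x)

    t_span = torch.tensor([0.0, 1.0])
    z_traj = odeint(ode_rhs, z0, t_span, rtol=1e-5, atol=1e-5)
    return z_traj[-1]  # latent at t = 1

```
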
Future work (author-stated) · Auto-generated by claude-haiku-4-5-20251001

Authors did not state explicit future directions.

Author keywords

  • Flow Matching
  • Probabilistic Model
  • Representation Learning
  • Probabilistic Graphical Model
  • Autoencoder
