Structured Flow Autoencoders: Learning Structured Probabilistic Representations with Flow Matching
Yidan Xu, Yixin Wang, XuanLong Nguyen
A framework that composes any probabilistic graphical model with flow matching, jointly learning structured latent representations and high-fidelity generative models through a single objective.
Abstract
Flow matching is a powerful approach for high-fidelity density estimation, but it often fails to capture the latent structure of complex data. Probabilistic models like variational autoencoders (VAEs), on the other hand, learn structured representations but underperform in sample quality. We propose Structured Flow Autoencoders (SFA), a family of probabilistic models that augments graphical models with conditional continuous normalizing flow (CNF) likelihoods, enabling flow-matching-based structured representation learning. At the core of SFA is a novel flow matching objective that explicitly accounts for latent variables, allowing joint learning of the CNF likelihood and posterior. SFA applies broadly to graphical models with continuous and mixture latents, as well as latent dynamical systems. Empirical studies across image, video, and RNA-seq data show that SFA consistently outperforms VAEs and their structured extensions in generation quality, representation utility, and scalability to large datasets. Compared to generative models like latent flow matching (LatentFM), SFA also produces more diverse samples, suggesting better coverage of the data distribution.
Structured Flow Autoencoders integrate flow matching with graphical models for structured representation learning.
- A structured conditional flow matching objective that explicitly accounts for latent variables
- Enables joint learning of CNF likelihood and posterior for probabilistic graphical models
- Framework applicable to continuous, mixture, and dynamical latent structures
- Outperforms VAEs and latent flow matching on generation quality and representation utility
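To make the objective concrete, below is a minimal sketch of one Monte Carlo estimate of a latent-conditional flow matching loss, in the spirit of the structured objective described above: a latent `z` is drawn from a posterior over the data, and the velocity model is regressed onto the target velocity of a linear noise-to-data path while conditioning on `z`. The `encoder` and `vector_field` functions here are hypothetical stand-ins (a toy Gaussian posterior and a linear velocity model), not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x):
    # Hypothetical Gaussian posterior q(z|x): a fixed linear map plus noise,
    # standing in for a learned amortized encoder.
    mu = 0.5 * x.mean(axis=1, keepdims=True)
    return mu + 0.1 * rng.standard_normal(mu.shape)

def vector_field(x_t, t, z, W):
    # Hypothetical linear velocity model v_theta(x_t, t, z):
    # concatenate the point on the path, the time, and the latent.
    feats = np.concatenate([x_t, np.full((x_t.shape[0], 1), t), z], axis=1)
    return feats @ W

def structured_cfm_loss(x, W):
    """One Monte Carlo estimate of a latent-conditional flow matching loss.

    Linear probability path: x_t = (1 - t) * x0 + t * x with noise endpoint x0,
    target velocity u = x - x0, and the velocity model conditioned on z ~ q(z|x).
    """
    n, d = x.shape
    z = encoder(x)                       # sample latent from the (toy) posterior
    x0 = rng.standard_normal((n, d))     # noise endpoint of the path
    t = rng.uniform()                    # random time in [0, 1]
    x_t = (1 - t) * x0 + t * x           # point on the conditional path
    u = x - x0                           # target velocity for the linear path
    v = vector_field(x_t, t, z, W)
    return float(np.mean((v - u) ** 2))

x = rng.standard_normal((8, 4))
W = 0.01 * rng.standard_normal((4 + 1 + 1, 4))  # weights for [x_t, t, z] -> velocity
loss = structured_cfm_loss(x, W)
```

In the actual framework the posterior and CNF likelihood are learned jointly through this single regression-style objective; the sketch only shows where the latent enters the flow matching loss.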
- Flow matching
- Graphical models
- Continuous normalizing flows
- Variational autoencoders
Limitations
- The CNF posterior variant remains computationally expensive, requiring an ODE solve at each gradient step
- Principled selection strategies for the posterior family remain an open question
- Designing UNet-based decoders that condition on the learned stochastic latent poses an architectural challenge
- Standard conditioning mechanisms may be insufficient when the latent carries distributional uncertainty
Authors did not state explicit future directions.
Author keywords
- Flow Matching
- Probabilistic Model
- Representation Learning
- Probabilistic Graphical Model
- Autoencoder
Related orals
Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks
Develops causal structure learning framework for Hawkes processes identifying latent confounder subprocesses.
CauKer: Classification Time Series Foundation Models Can Be Pretrained on Synthetic Data
Generates diverse synthetic time series for pretraining foundation models with clear scaling laws.
Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Optimization
Solves optimal multi-draft speculative sampling via convex optimization achieving 90% acceptance rates.
Conformal Robustness Control: A New Strategy for Robust Decision
CRC optimizes prediction set construction under explicit robustness constraints instead of coverage for more efficient robust decisions.