On the Wasserstein Geodesic Principal Component Analysis of probability measures
Nina Vesseron, Elsa Cazelles, Alice Le Brigant, Klein
Abstract
This paper focuses on Geodesic Principal Component Analysis (GPCA) on a collection of probability distributions using the Otto-Wasserstein geometry. The goal is to identify geodesic curves in the space of probability measures that best capture the modes of variation of the underlying dataset. We first address the case of a collection of Gaussian distributions, and show how to lift the computations to the space of invertible linear maps. For the more general setting of absolutely continuous probability measures, we leverage a novel approach to parameterizing geodesics in Wasserstein space with neural networks. Finally, we compare our approach to classical tangent PCA through various examples and provide illustrations on real-world datasets.
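As a point of reference for the Gaussian case, the sketch below evaluates the standard closed-form Wasserstein geodesic between two Gaussian distributions, in which the optimal transport map is linear. It is not the paper's GPCA algorithm, only the well-known formula that makes the Gaussian setting tractable. The helper names (`sqrtm_spd`, `gaussian_ot_map`, `gaussian_geodesic`, `w2_squared`) are hypothetical and chosen for illustration, and both covariance matrices are assumed to be symmetric positive definite.

```python
import numpy as np


def sqrtm_spd(S):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    # Clip tiny negative eigenvalues caused by round-off before taking roots.
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def gaussian_ot_map(S0, S1):
    """Linear part T of the optimal map from N(m0, S0) to N(m1, S1).

    T = S0^{-1/2} (S0^{1/2} S1 S0^{1/2})^{1/2} S0^{-1/2}, assuming S0 is invertible.
    """
    R = sqrtm_spd(S0)
    R_inv = np.linalg.inv(R)
    return R_inv @ sqrtm_spd(R @ S1 @ R) @ R_inv


def gaussian_geodesic(m0, S0, m1, S1, t):
    """Mean and covariance of the Wasserstein geodesic at time t in [0, 1]."""
    T = gaussian_ot_map(S0, S1)
    A = (1.0 - t) * np.eye(len(m0)) + t * T  # interpolated linear map
    return (1.0 - t) * m0 + t * m1, A @ S0 @ A.T


def w2_squared(m0, S0, m1, S1):
    """Squared 2-Wasserstein distance between two Gaussians."""
    R = sqrtm_spd(S0)
    cross = sqrtm_spd(R @ S1 @ R)
    return float(np.sum((m0 - m1) ** 2) + np.trace(S0 + S1 - 2.0 * cross))


# Illustrative inputs (made up for this example, not taken from the paper).
m0, S0 = np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 4.0]])
m1, S1 = np.array([1.0, -1.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
m_half, S_half = gaussian_geodesic(m0, S0, m1, S1, 0.5)
```

Because the curve is a constant-speed geodesic, the squared distance from the starting Gaussian to the point at time t should equal t² times the squared distance between the endpoints, which gives a simple numerical sanity check.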
Geodesic PCA for probability distributions using Wasserstein geometry with neural network parametrization for continuous distributions.
- Method for exact GPCA on Gaussian distributions by lifting the computations to the space of invertible linear maps
- Novel neural network approach to parameterize geodesics in Wasserstein space for absolutely continuous distributions
- Sampling capability from any point along geodesic components without empirical approximations
- Geodesic Principal Component Analysis
- Wasserstein geometry
- neural network parametrization
- optimal transport
GPCA and tangent PCA (TPCA) yield similar results for most Gaussian distributions, except those whose covariance matrices lie near the boundary of the SPD cone
from the paper
Develop more fundamental theories to explain Intrinsic Entropy measurements
from the paper
Explore convex function parametrization without imposing hard architectural constraints
from the paper
Author keywords
- wasserstein PCA
- optimal transport
- deep learning