Hyperparameter Trajectory Inference with Conditional Lagrangian Optimal Transport
Harry Amad, Mihaela van der Schaar
Abstract
Neural networks (NNs) often have critical behavioural trade-offs that are set at design time with hyperparameters—such as reward weights in reinforcement learning or quantile targets in regression. Post-deployment, however, user preferences can evolve, making initial settings undesirable and necessitating potentially expensive retraining. To circumvent this, we introduce the task of Hyperparameter Trajectory Inference (HTI): to learn, from observed data, how an NN's conditional output distribution changes with its hyperparameters, and to construct a surrogate model that approximates the NN at unobserved hyperparameter settings. HTI requires extending existing trajectory inference approaches to incorporate conditions, exacerbating the challenge of ensuring inferred paths are feasible. We propose an approach based on conditional Lagrangian optimal transport, jointly learning the Lagrangian function governing hyperparameter-induced dynamics along with the associated optimal transport maps and geodesics between observed marginals, which form the surrogate model. We incorporate inductive biases based on the manifold hypothesis and least-action principles into the learned Lagrangian, improving surrogate model feasibility. We empirically demonstrate that our approach reconstructs NN outputs across various hyperparameter spectra better than alternative approaches.
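For context, the (unconditional) Lagrangian optimal transport problem underlying the approach can be sketched in its standard dynamic form; this is the classical Benamou–Brenier-style formulation, not the paper's exact conditional objective:

```latex
% Dynamic Lagrangian OT between marginals \mu_0 and \mu_1:
% find a density path \rho(x,t) and velocity field v(x,t) minimizing the action
\inf_{(\rho, v)} \; \int_0^1 \int L\big(x, v(x,t), t\big)\, \rho(x,t)\, \mathrm{d}x\, \mathrm{d}t
\quad \text{s.t.} \quad
\partial_t \rho + \nabla \cdot (\rho v) = 0, \qquad
\rho(\cdot, 0) = \mu_0, \qquad \rho(\cdot, 1) = \mu_1 .
```

With the kinetic-energy Lagrangian $L(x, v, t) = \tfrac{1}{2}\lVert v \rVert^2$ this recovers Wasserstein-2 geodesics. In the conditional setting the abstract describes, the observed marginals are indexed by hyperparameter values and the Lagrangian $L$ is itself learned (with manifold and least-action inductive biases), so that geodesics under the learned $L$ interpolate the NN's output distribution to unobserved hyperparameter settings; the precise conditional objective is given in the paper.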
Hyperparameter Trajectory Inference uses conditional Lagrangian optimal transport to reconstruct neural network outputs across hyperparameter spectra without expensive retraining.
- Proposes Hyperparameter Trajectory Inference (HTI) task to learn how conditional output distributions change with hyperparameters
- Develops approach based on conditional Lagrangian optimal transport jointly learning Lagrangian function and optimal transport maps
- Incorporates inductive biases from manifold hypothesis and least-action principles to improve surrogate model feasibility
- Conditional Lagrangian optimal transport
- Trajectory inference
- Optimal transport maps
Limitations (from the paper)
- HTI will be challenging when the underlying dynamics are chaotic, making inference from sparse samples inherently difficult
- The method is applicable only for varying a single, continuous hyperparameter
- Relatively simple settings are demonstrated; further investigation across a wider range of hyperparameter landscapes is warranted

Future directions (from the paper)
- Explore extensions to handle multiple hyperparameters simultaneously
Author keywords
- hyperparameter
- optimal transport
- trajectory inference
- manifold learning
- interpolation