ICLR 2026 Orals

Temporal superposition and feature geometry of RNNs under memory demands

Pratyaksh Sharma, Alexandra Maria Proca, Lucas Prieto, Pedro A. M. Mediano

Interpretability & Mechanistic Understanding · Fri, Apr 24 · 10:42 AM–10:52 AM · 204 A/B · Avg rating: 7.50 (6–8)
Author-provided TL;DR

We study the feature geometry of RNNs under memory demands and characterize their representational strategies using a novel framework of temporal superposition.

Abstract

Understanding how populations of neurons represent information is a central challenge across machine learning and neuroscience. Recent work in both fields has begun to characterize the representational geometry and functionality underlying complex distributed activity. For example, artificial neural networks trained on data with more features than neurons compress data by representing features non-orthogonally in so-called *superposition*. However, the effect of time (or memory), an additional capacity-constraining pressure, on underlying representational geometry in recurrent models is not well understood. Here, we study how memory demands affect representational geometry in recurrent neural networks (RNNs), introducing the concept of temporal superposition. We develop a theoretical framework in RNNs with linear recurrence trained on a delayed serial recall task to better understand how properties of the data, task demands, and network dimensionality lead to different representational strategies, and show that these insights generalize to nonlinear RNNs. Through this, we identify an effectively linear, dense regime and a sparse regime where RNNs utilize an interference-free space, characterized by a phase transition in the angular distribution of features and a decrease in spectral radius. Finally, we analyze the interaction of spatial and temporal superposition to observe how RNNs mediate different representational tradeoffs. Overall, our work offers a mechanistic, geometric explanation of the representational strategies RNNs learn, how they depend on capacity and task demands, and why.
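
To make the setup concrete, here is a minimal sketch (not the authors' code) of the kind of model and task the abstract describes: an RNN with linear recurrence trained to recall sparse feature vectors after a fixed delay. All names and hyperparameters (n_features, n_hidden, delay, the 0.1 activation probability) are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the authors' implementation) of a
# linear-recurrence RNN on a k-step delayed serial recall task.
import torch

n_features, n_hidden, delay, seq_len, batch = 8, 4, 3, 12, 64

W_in = torch.nn.Parameter(0.1 * torch.randn(n_hidden, n_features))
W_rec = torch.nn.Parameter(0.1 * torch.randn(n_hidden, n_hidden))
W_out = torch.nn.Parameter(0.1 * torch.randn(n_features, n_hidden))
opt = torch.optim.Adam([W_in, W_rec, W_out], lr=1e-2)

for step in range(2000):
    # Sparse inputs: each feature is active independently with low probability.
    x = (torch.rand(batch, seq_len, n_features) < 0.1).float()
    h = torch.zeros(batch, n_hidden)
    loss = 0.0
    for t in range(seq_len):
        # Linear recurrence: no nonlinearity on the hidden state.
        h = h @ W_rec.T + x[:, t] @ W_in.T
        if t >= delay:
            # The readout must reproduce the input from `delay` steps ago.
            loss = loss + ((h @ W_out.T - x[:, t - delay]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```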

One-sentence summary · Auto-generated by claude-haiku-4-5-20251001

Studies temporal superposition in RNNs, showing how memory demands affect representational geometry and how RNNs learn different encoding strategies.

Contributions · Auto-generated by claude-haiku-4-5-20251001
  • Introduces concept of temporal superposition in RNNs
  • Develops theoretical framework characterizing how memory demands lead to different representational strategies
  • Identifies a phase transition between a dense, effectively linear regime and a sparse, interference-free regime
Methods used · Auto-generated by claude-haiku-4-5-20251001
  • Recurrent neural networks
  • Representation learning
  • Linear recurrence models
  • Geometric analysis
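
As a companion to the geometric-analysis item above, a small NumPy sketch (assumed diagnostics, not the authors' analysis code) of the two quantities the abstract highlights: the spectral radius of the recurrent weights, and the distribution of pairwise angles between learned feature directions, here taken (as an assumption) to be the columns of W_in.

```python
# Sketch of two geometric probes named in the abstract; the random
# matrices below stand in for trained weights.
import numpy as np

rng = np.random.default_rng(0)
W_rec = rng.normal(scale=0.3, size=(4, 4))  # stand-in recurrent weights
W_in = rng.normal(size=(4, 8))              # columns = feature directions

# Spectral radius: largest |eigenvalue| of the recurrence; the abstract
# reports a decrease in the sparse, interference-free regime.
rho = np.abs(np.linalg.eigvals(W_rec)).max()

# Pairwise angles between unit-normalized feature directions; a phase
# transition in this distribution separates the dense and sparse regimes.
U = W_in / np.linalg.norm(W_in, axis=0, keepdims=True)
cos = (U.T @ U)[np.triu_indices(8, k=1)]
angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
print(f"spectral radius = {rho:.3f}; mean pairwise angle = {angles.mean():.1f} deg")
```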
Limitations (author-stated) · Auto-generated by claude-haiku-4-5-20251001
  • Assumes temporal independence of features and studies small RNNs
  • Sparsity assumption for temporal independence may be strong and task-dependent
  • Studies k-delay task; extending to tasks requiring manipulation of input information is challenging
  • Linear representation hypothesis may not capture realistic overparameterized modern models
Future work (author-stated) · Auto-generated by claude-haiku-4-5-20251001
  • Characterize geometry and behavior for tasks requiring manipulation of input information with varying memory demands
  • Verify whether theoretical framework captures realistic settings in overparameterized modern models
  • Extend to larger-width RNNs and longer-term dependencies

Author keywords

  • RNNs
  • superposition
  • representational geometry
  • features
  • capacity
  • memory demands
