ICLR 2026 Orals

LLMs & Reasoning

Language models, chain-of-thought, reasoning, RLHF, alignment post-training, and evaluation of LLM capabilities.

DepthLM: Metric Depth from Vision Language Models

DepthLM shows that VLMs can match pure vision models at metric depth estimation using text-based supervised fine-tuning and visual prompting, without architecture changes.

Avg rating: 6.67 (4–10) · Zhipeng Cai et al.

Hallucination Begins Where Saliency Drops

A gradient-aware diagnostic tool that uses saliency to identify where hallucination patterns emerge, with SGRS and LocoRE interventions proposed to reduce output errors (a minimal saliency sketch follows below).

Avg rating: 6.00 (4–8) · Xiaofeng Zhang et al.
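To make the saliency idea above concrete, here is a minimal sketch of gradient-based token saliency for a causal LM. It is a generic illustration only, not the paper's SGRS or LocoRE methods; the model name and the particular saliency definition (gradient norm of the top next-token score with respect to each input embedding) are assumptions.

```python
# Generic gradient-based token saliency for a causal LM (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, not the paper's
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "The Eiffel Tower is located in"
ids = tok(text, return_tensors="pt").input_ids

# Embed the input and track gradients with respect to the embeddings.
embeds = model.get_input_embeddings()(ids).detach().requires_grad_(True)
out = model(inputs_embeds=embeds)
top_score = out.logits[0, -1].max()  # score of the top next-token candidate
top_score.backward()

# Saliency of each prompt token = L2 norm of the gradient of that score
# w.r.t. the token's embedding; a sharp drop over the prompt is the kind
# of signal the paper associates with hallucination onset.
saliency = embeds.grad[0].norm(dim=-1)
for t, s in zip(tok.convert_ids_to_tokens(ids[0].tolist()), saliency.tolist()):
    print(f"{t:>12s}  {s:.4f}")
```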

In-Place Test-Time Training

The In-Place TTT framework enables LLMs to perform test-time training by adapting their MLP projection matrices through an update aligned with next-token prediction (a minimal illustration follows below).

Avg rating: 7.33 (6–8) · Guhao Feng et al.
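The sketch below illustrates the general idea of test-time training restricted to MLP projection matrices: take a few next-token-prediction gradient steps on the incoming prompt itself, then generate with the adapted model. It is not the paper's exact In-Place TTT procedure; the model name, learning rate, and step count are assumptions.

```python
# Test-time adaptation of only the MLP projection weights (illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze everything except the MLP projection weights.
mlp_params = []
for name, p in model.named_parameters():
    if "mlp" in name and name.endswith("weight"):
        p.requires_grad_(True)
        mlp_params.append(p)
    else:
        p.requires_grad_(False)

opt = torch.optim.SGD(mlp_params, lr=1e-4)
prompt = "Question: What is 17 * 24? Answer:"
ids = tok(prompt, return_tensors="pt").input_ids

# A few gradient steps on the prompt (labels = inputs gives the standard
# next-token prediction loss), then generate with the adapted weights.
model.train()
for _ in range(3):
    loss = model(input_ids=ids, labels=ids).loss
    opt.zero_grad()
    loss.backward()
    opt.step()

model.eval()
print(tok.decode(model.generate(ids, max_new_tokens=20)[0]))
```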

LLMs Get Lost In Multi-Turn Conversation

A study showing that LLMs exhibit an average 39% performance drop in multi-turn conversations and fail to recover from wrong contextual assumptions.

Avg rating: 8.00 (6–10) · Philippe Laban et al.

Modality-free Graph In-context Alignment

The MF-GIA framework uses gradient fingerprints to enable graph neural networks to perform in-context learning across heterogeneous domains without modality assumptions.

Avg rating: 6.00 (4–8) · Wei Zhuo et al.

Multiplayer Nash Preference Optimization

MNPO extends Nash learning to the multiplayer regime, aligning LLMs with heterogeneous human preferences via an n-player game formulation (one possible formulation is sketched below).

Avg rating: 6.00 (4–8) · Fang Wu et al.
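For intuition, here is one natural n-player generalization of the two-player Nash-learning-from-preferences objective, written in LaTeX. The notation (payoff as average pairwise win rate under a preference model) is an illustrative assumption, not necessarily the formulation used in the MNPO paper.

```latex
% Payoff of player i: average win rate of its policy against the others.
\[
  u_i(\pi_i, \pi_{-i})
  \;=\;
  \frac{1}{n-1}\sum_{j \neq i}
  \mathbb{E}_{x \sim \rho,\; y_i \sim \pi_i(\cdot \mid x),\; y_j \sim \pi_j(\cdot \mid x)}
  \bigl[\, \mathcal{P}(y_i \succ y_j \mid x) \,\bigr]
\]
% Solution concept: a Nash equilibrium of the resulting n-player game.
\[
  \pi_i^{*} \in \arg\max_{\pi_i} \, u_i(\pi_i, \pi_{-i}^{*})
  \qquad \text{for every player } i \in \{1, \dots, n\},
\]
% where \mathcal{P} is a learned preference model and \rho is the prompt distribution.
```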

Pre-training under infinite compute

Shows that the optimal weight decay is 30x larger than standard practice and that ensembling reaches a lower loss asymptote, enabling data-efficient pre-training at scale (an ensembling sketch follows below).

Avg rating: 7.50 (6–8) · Konwoo Kim et al.
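The sketch below shows the kind of ensembling the blurb refers to: average the predictive distributions of K independently trained models and evaluate the cross-entropy of the mixture. The tiny models and random data are synthetic placeholders, not the paper's setup.

```python
# Ensembling language models by averaging predictive distributions (toy example).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab, dim, seq, K = 100, 32, 16, 4

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)
    def forward(self, ids):
        return self.head(self.emb(ids))  # (batch, seq, vocab) logits

# K "independently trained" models (here: independently initialized stand-ins).
models = [TinyLM() for _ in range(K)]
ids = torch.randint(0, vocab, (8, seq))
targets = torch.randint(0, vocab, (8, seq))

def ce(logprobs):
    # Cross-entropy of log-probabilities against the targets.
    return F.nll_loss(logprobs.reshape(-1, vocab), targets.reshape(-1))

# Single-model loss vs. loss of the averaged predictive distribution.
single = ce(F.log_softmax(models[0](ids), dim=-1))
probs = torch.stack([F.softmax(m(ids), dim=-1) for m in models]).mean(0)
ensemble = ce(torch.log(probs))
print(f"single model: {single.item():.3f}   {K}-model ensemble: {ensemble.item():.3f}")
```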

Premise Selection for a Lean Hammer

LeanHammer combines neural premise selection with symbolic automation to build the first end-to-end hammer for the Lean proof assistant (a premise-selection sketch follows below).

Avg rating: 6.50 (4–8) · Thomas Zhu et al.
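As a rough picture of neural premise selection as retrieval: embed the goal and all library premises, then hand the top-k most similar premises to the symbolic automation. The encoder, premise strings, and scoring below are stand-ins for illustration, not LeanHammer's actual components.

```python
# Premise selection as embedding retrieval (toy illustration).
import torch
import torch.nn.functional as F

def embed(texts):
    # Stand-in encoder: hashed bag-of-words vectors. A real system would
    # use a trained neural encoder here.
    vecs = torch.zeros(len(texts), 256)
    for i, t in enumerate(texts):
        for w in t.split():
            vecs[i, hash(w) % 256] += 1.0
    return F.normalize(vecs, dim=-1)

premises = [
    "Nat.add_comm : a + b = b + a",
    "Nat.mul_comm : a * b = b * a",
    "List.length_append : (l1 ++ l2).length = l1.length + l2.length",
]
goal = "x + y = y + x"

scores = embed([goal]) @ embed(premises).T       # cosine similarities
topk = scores[0].topk(k=2).indices.tolist()      # indices of top-k premises
print([premises[i] for i in topk])               # candidates passed to automation
```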

Softmax Transformers are Turing-Complete

Proves that length-generalizable softmax transformers with chain-of-thought and relative positional encoding are Turing-complete.

Avg rating: 5.50 (2–10) · Hongjian Jiang et al.

Transformers are Inherently Succinct

Proves that transformers with unique-hard attention are exponentially more succinct than finite automata and LTL formulas, while their verification is EXPSPACE-complete.

Avg rating: 7.00 (4–8) · Pascal Bergsträßer et al.