Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People
Gabriel Grand, Valerio Pepe, Joshua B. Tenenbaum, Jacob Andreas
We introduce a collaborative Battleship task to evaluate information-seeking in humans and agents; insights from Bayesian Experimental Design (BED) yield inference-time strategies for building resource-rational agents in discovery settings.
Abstract
Many emerging applications of AI—from scientific discovery to medical diagnosis—require agents to seek information strategically: forming hypotheses, asking targeted questions, and making decisions under uncertainty. In high-stakes settings with limited resources, do language models (LMs) behave like rational agents? Drawing on insights from human cognition, we develop methods to evaluate and enhance agentic information-seeking. First, we introduce a decision-oriented dialogue task called Collaborative Battleship, in which a Captain must balance exploration (asking questions) and action (taking shots), while a Spotter must supply accurate, contextually-grounded answers. Compared to human players (N=42), we find that many LM agents struggle to ask informative questions, produce accurate answers, and identify high-utility actions. To address these gaps, we develop novel Monte Carlo inference strategies for LMs inspired by Bayesian Experimental Design (BED). For Spotter agents, our approach boosts accuracy by up to 14.7% absolute over LM-only baselines; for Captain agents, it raises expected information gain (EIG) by up to 0.227 bits (94.2% of the achievable noise ceiling). Combined, these components yield sharper targeting (+0.303–0.374 F1), and enable weaker LMs, such as Llama-4-Scout, to outperform both humans (8% → 82% win rate) and frontier models (0% → 67% win rate vs. GPT-5) at ≈1% of GPT-5's cost. We replicate these findings on Guess Who?, where our methods significantly boost accuracy (+28.3–42.4 p.p.), demonstrating their general applicability for building information-seeking agents.
Develops methods for LMs to ask informative questions and make decisions under uncertainty using Bayesian Experimental Design.
- Introduces Collaborative Battleship, a decision-oriented dialogue task that instantiates the core components of Bayesian Experimental Design for evaluating agentic information-seeking
- Monte Carlo inference strategies inspired by BED boost Spotter accuracy by up to 14.7% absolute and raise Captain expected information gain to 94.2% of the achievable noise ceiling
- Demonstrates that weaker LMs equipped with these methods can outperform both humans and frontier models as resource-rational agents, at roughly 1% of GPT-5's cost
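The expected information gain (EIG) figures above can be made concrete with a minimal Monte Carlo estimator. This is a sketch, not the paper's implementation: it assumes a deterministic, noiseless answer function and equally weighted hypothesis samples, in which case the EIG of a question equals the entropy of the answer distribution the question induces over the samples.

```python
import math
from collections import Counter

def expected_info_gain(hypotheses, answer_fn):
    """Monte Carlo EIG estimate under two simplifying assumptions:
    answers are deterministic and noiseless, and hypothesis samples are
    equally weighted. Then EIG(question) = entropy of the answer
    distribution induced by the samples."""
    counts = Counter(answer_fn(h) for h in hypotheses)
    n = len(hypotheses)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy example: 4 equally likely hidden states; a yes/no question
# that splits them 2/2 yields exactly 1 bit of expected information.
hypotheses = ["A", "B", "C", "D"]
question = lambda h: h in ("A", "B")
print(expected_info_gain(hypotheses, question))  # → 1.0
```

A question whose answer is the same for every sampled hypothesis has zero EIG, which is why uninformative questions are wasted turns for the Captain.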
- Bayesian Experimental Design
- Monte Carlo inference
- Language models
- In-context learning
- Collaborative Battleship task
- Guess Who? task
- Pragmatic behaviors are not explicitly modeled; incorporating the Rational Speech Acts framework could improve agent sophistication (from the paper)
- A fixed epsilon does not account for differences in reliability across individual information sources (from the paper)
- The approach relies on efficient sampling from a generative world model; more general settings may require learning the model via code synthesis or diffusion (from the paper)
- Building agents that collaborate effectively with people is increasingly important, and Collaborative Battleship provides an ideal setting for studying this (from the paper)
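The fixed-epsilon limitation noted above can be illustrated with a soft Bayesian update over hypothesis samples. This sketch assumes a single global error rate `eps` applied uniformly to all answers, regardless of which information source produced them; the limitation is precisely that a per-source reliability would be more faithful.

```python
def posterior_weights(hypotheses, answer_fn, observed, eps=0.1):
    """Reweight equally likely hypothesis samples after observing an
    answer, assuming every answer is correct with probability 1 - eps
    and flipped with probability eps (one fixed eps for all sources)."""
    w = [(1 - eps) if answer_fn(h) == observed else eps for h in hypotheses]
    z = sum(w)
    return [x / z for x in w]

# Toy example: observing "yes" to a question true of A and B keeps
# C and D alive with small weight, rather than ruling them out.
weights = posterior_weights(["A", "B", "C", "D"],
                            lambda h: h in ("A", "B"),
                            observed=True, eps=0.1)
print(weights)
```

With `eps=0`, the update collapses to hard filtering of inconsistent hypotheses; a nonzero `eps` keeps them alive with small weight, which is robust to occasional answer errors but treats reliable and unreliable answerers identically.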
Author keywords
- Bayesian experimental design
- information-seeking
- question asking
- Collaborative Battleship
- expected information gain (EIG)
- explore-exploit tradeoffs
- resource rationality
- probabilistic inference
- Monte Carlo sampling
- symbolic grounding
- code generation
- reasoning
- decision-oriented dialogue
- cognitive modeling
- human behavior
- language model agents
- scientific discovery
Related orals
Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models
Benchmarks practical privacy risks in differential privacy-adapted LLMs, revealing that distribution shifts and model choice impact effectiveness.
Half-order Fine-Tuning for Diffusion Model: A Recursive Likelihood Ratio Optimizer
Proposes Recursive Likelihood Ratio optimizer for efficient fine-tuning of diffusion models with lower variance gradient estimation.
Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
Demonstrates LLMs can be finetuned to generate harmful steganographically-hidden outputs while appearing benign to safety systems.
Reducing Belief Deviation in Reinforcement Learning for Active Reasoning of LLM Agents
Proposes T3 algorithm to detect belief deviation in LLM agents and truncate trajectories for improved reinforcement learning in active reasoning tasks.
RefineStat: Efficient Exploration for Probabilistic Program Synthesis
RefineStat enforces semantic constraints and applies diagnostic-aware refinement for synthesizing valid probabilistic programs from smaller language models.