Premise Selection for a Lean Hammer
Thomas Zhu, Joshua Clune, Jeremy Avigad, Albert Q. Jiang, Sean Welleck
LeanHammer integrates neural premise selection with symbolic reasoning to automate theorem proving in Lean.
Abstract
Neural methods are transforming automated reasoning for proof assistants, yet integrating these advances into practical verification workflows remains challenging. A *hammer* is a tool that combines premise selection, translation to external automatic theorem provers, and proof reconstruction to automate tedious reasoning steps. We present LeanPremise, a novel neural premise selection system, and we combine it with existing translation and proof reconstruction components to create LeanHammer, the first end-to-end, domain-general hammer for the Lean proof assistant. Unlike existing Lean premise selectors, LeanPremise is specifically trained for use with a hammer in dependent type theory. It also dynamically adapts to user-specific contexts, enabling it to effectively recommend premises from libraries outside LeanPremise's training data as well as lemmas defined locally by the user. With comprehensive evaluations, we show that LeanPremise enables LeanHammer to solve 21% more goals than existing premise selectors and generalizes well to diverse domains. Our work helps bridge the gap between neural retrieval and symbolic reasoning, making formal verification more accessible to researchers and practitioners.
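The hammer pipeline described above (select premises, translate the goal to an external prover, reconstruct the proof in Lean) is invoked from a single tactic call. A minimal usage sketch, assuming a `hammer` tactic is exposed as in typical hammer tools; the exact tactic name and behavior are illustrative, not taken from the source:

```lean
-- Hypothetical sketch of hammer usage in Lean 4.
-- The tactic would: (1) retrieve candidate premises such as Nat.add_comm,
-- (2) translate the goal and premises to an external automatic theorem prover,
-- (3) reconstruct the found proof inside Lean's kernel.
example (a b : Nat) : a + b = b + a := by
  hammer
```

In practice the user writes `hammer` in place of a tedious manual proof, and the tool either closes the goal or fails without affecting soundness, since every reconstructed proof is rechecked by Lean's kernel.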
LeanHammer combines neural premise selection with symbolic automation to form the first end-to-end hammer for the Lean proof assistant.
- Develops LeanPremise, a neural premise selection system trained specifically for hammer use in dependent type theory
- Creates LeanHammer, the first end-to-end, domain-general hammer for the Lean proof assistant
- Enables dynamic adaptation to user-specific contexts and recommendation of premises from libraries outside training data
- Neural premise selection
- Automated theorem proving
- Dependent type theory
- Mathlib
- miniCTX-v2
Authors did not state explicit limitations.
Authors did not state explicit future directions.
Author keywords
- premise selection
- interactive theorem proving
- automated reasoning
- contrastive learning