ICLR 2026 Orals

Transformers are Inherently Succinct

Pascal Bergsträßer, Ryan Cotterell, Anthony Widjaja Lin

LLMs & Reasoning Fri, Apr 24 · 4:27 PM–4:37 PM · Amphitheater Avg rating: 7.00 (4–8)

Abstract

We propose succinctness as a measure of the expressive power of a transformer in describing a concept. To this end, we prove that transformers are highly expressive in that they can represent formal languages substantially more succinctly than standard representations such as finite automata and Linear Temporal Logic (LTL) formulas. As a by-product of this expressivity, verifying even simple properties of transformers is shown to be provably intractable (i.e., EXPSPACE-complete).
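To get intuition for a succinctness gap, a classic example (not the paper's construction, just a standard illustration from automata theory) is the family L_n = { w over {a,b} : the n-th symbol from the end is 'a' }. A nondeterministic automaton for L_n needs only about n states, but any deterministic automaton must remember the last n symbols and so needs 2^n states. The sketch below determinizes the small NFA by the subset construction and counts the reachable DFA states:

```python
def reachable_dfa_states(n):
    """Determinize the (n+1)-state NFA for L_n via the subset
    construction and count the reachable DFA states (expected: 2**n)."""

    # NFA: state 0 self-loops on both symbols; on 'a' it may also guess
    # that this 'a' is the n-th symbol from the end and move to state 1.
    # States 1..n-1 advance on any symbol; state n (accepting) has no
    # outgoing transitions, so the guess dies unless the word ends there.
    def step(subset, sym):
        out = set()
        for q in subset:
            if q == 0:
                out.add(0)
                if sym == 'a':
                    out.add(1)
            elif q < n:
                out.add(q + 1)
        return frozenset(out)

    # Breadth-first exploration of the DFA's reachable subset-states.
    start = frozenset({0})
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for sym in 'ab':
            t = step(s, sym)
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return len(seen)

for n in range(1, 7):
    print(n, reachable_dfa_states(n))  # second column grows as 2**n
```

The paper's results are of this flavor but stronger: transformers (with unique-hard attention) can be exponentially more succinct than both finite automata and LTL formulas, and that compression is exactly what makes their verification EXPSPACE-hard.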

One-sentence summary · Auto-generated by claude-haiku-4-5-20251001

Proves that transformers with unique-hard attention are exponentially more succinct than finite automata and LTL formulas, but that verifying them is EXPSPACE-complete.

Contributions
  • Demonstrates that transformers represent formal languages substantially more succinctly than automata
  • Proves that verification of even simple transformer properties is EXPSPACE-complete
  • Establishes succinctness as a measure of expressive power for transformers
Methods used
  • Formal language theory
  • Transformer verification
  • Computational complexity
Limitations (author-stated)

Authors did not state explicit limitations.

Future work (author-stated)
  • Develop automatic tools for analyzing, verifying, and explaining transformers
  • Exploit techniques from automated verification to verify transformers in practice
  • Study subclasses of transformers that cannot encode large counters, for lower verification complexity
  • Investigate learnability of succinct transformers
  • Study succinctness of fixed finite-precision softmax and average-hard attention transformers

Author keywords

  • Transformers
  • complexity
  • expressivity
  • automata
  • LTL
