TRACE: Your Diffusion Model is Secretly an Instance Edge Detector
Sanghyun Jo, Ziseok Lee, Wooyeol Lee, Jonghyun Choi, Jaesik Park, Kyungsu Kim
TRACE turns pretrained diffusion models into annotation-free instance edge generators for instance and panoptic segmentation.
Abstract
High-quality instance and panoptic segmentation has traditionally relied on dense instance-level annotations such as masks, boxes, or points, which are costly, inconsistent, and difficult to scale. Unsupervised and weakly-supervised approaches reduce this burden but remain limited by the semantic biases of their backbones and by human priors, often producing merged or fragmented outputs. We present TRACE (TRAnsforming diffusion Cues to instance Edges), showing that text-to-image diffusion models secretly function as instance edge annotators. TRACE identifies the Instance Emergence Point (IEP), where object boundaries first appear in self-attention maps, extracts boundaries through Attention Boundary Divergence (ABDiv), and distills them into a lightweight one-step edge decoder. This design removes the need for per-image diffusion inversion, achieving 81× faster inference while producing sharper and more connected boundaries. On the COCO benchmark, TRACE improves unsupervised instance segmentation by +5.1 AP, and in tag-supervised panoptic segmentation it outperforms point-supervised baselines by +1.7 PQ without using any instance-level labels. These results reveal that diffusion models encode hidden instance boundary priors, and that decoding these signals offers a practical and scalable alternative to costly manual annotation. **Project Page:** https://shjo-april.github.io/TRACE.
TRACE reveals diffusion models encode hidden instance boundary priors and leverages them for unsupervised instance segmentation without dense annotations.
- Identifies Instance Emergence Point where object boundaries first appear in diffusion self-attention
- Extracts boundaries through Attention Boundary Divergence, enabling 81× faster inference
- Achieves a +5.1 AP improvement in unsupervised instance segmentation without instance-level labels
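The paper does not give the exact form of Attention Boundary Divergence, but the idea it names — scoring a pixel as a boundary when its self-attention distribution diverges sharply from its neighbours' — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name, the use of symmetric KL divergence, and the 4-neighbour comparison are all assumptions.

```python
import numpy as np

def attention_boundary_divergence(attn, eps=1e-8):
    """Hypothetical ABDiv-style sketch: score each pixel by how much its
    self-attention distribution diverges from its right and bottom
    neighbours' distributions.

    attn: (H, W, N) array; attn[y, x] is a probability distribution over
          N attended locations for the query pixel (y, x).
    Returns an (H, W) edge-strength map.
    """
    # Smooth and renormalise so the log terms are well defined.
    p = attn + eps
    p = p / p.sum(axis=-1, keepdims=True)

    def sym_kl(a, b):
        # Symmetric KL divergence, computed row-wise over distributions.
        return 0.5 * ((a * np.log(a / b)).sum(-1) + (b * np.log(b / a)).sum(-1))

    edges = np.zeros(p.shape[:2])
    # Divergence to the right neighbour.
    edges[:, :-1] += sym_kl(p[:, :-1], p[:, 1:])
    # Divergence to the bottom neighbour.
    edges[:-1, :] += sym_kl(p[:-1, :], p[1:, :])
    return edges
```

Under this reading, pixels inside one instance attend to similar regions and score near zero, while the score spikes where two instances meet — which is why the signal only becomes usable once attention maps separate into per-instance patterns (the Instance Emergence Point the paper identifies).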
- Diffusion models
- Instance segmentation
- Self-attention analysis
- Edge detection
- COCO
Authors did not state explicit limitations.
Extend to video panoptic segmentation, medical imaging, and open-vocabulary grouping that combines text cues with TRACE.
Author keywords (from the paper)
- diffusion
- unsupervised instance segmentation
- weakly-supervised panoptic segmentation
- inference dynamics
- attention
Related orals
Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)
RealUID provides universal distillation for matching models without GANs, incorporating real data into one-step generator training.
GLASS Flows: Efficient Inference for Reward Alignment of Flow and Diffusion Models
GLASS Flows samples Markov transitions via inner flow matching models to improve inference-time reward alignment in flow and diffusion models.
Neon: Negative Extrapolation From Self-Training Improves Image Generation
Neon inverts model degradation from self-training by extrapolating away from it, improving generative models with minimal compute.
Generative Human Geometry Distribution
Introduces distribution-over-distribution model combining geometry distributions with two-stage flow matching for human 3D generation.
Cross-Domain Lossy Compression via Rate- and Classification-Constrained Optimal Transport
Cross-domain lossy compression unifies rate and classification constraints via optimal transport framework.