ICLR 2026 Orals

Probabilistic Kernel Function for Fast Angle Testing

Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa

Efficiency, Systems & Kernels Thu, Apr 23 · 3:27 PM–3:37 PM · 201 C Avg rating: 8.00 (8–8)
Author-provided TL;DR

We propose two probabilistic kernel functions for angle testing, which can be used to accelerate vector-based similarity search.

Abstract

In this paper, we study the angle testing problem in the context of similarity search in high-dimensional Euclidean spaces and propose two projection-based probabilistic kernel functions, one designed for angle comparison and the other for angle thresholding. Unlike existing approaches that rely on random projection vectors drawn from Gaussian distributions, our approach leverages reference angles and adopts a deterministic structure for the projection vectors. Notably, our kernel functions do not require asymptotic assumptions, such as the number of projection vectors tending to infinity, and can be theoretically and experimentally shown to outperform Gaussian-distribution-based kernel functions. We apply the proposed kernel function to Approximate Nearest Neighbor Search (ANNS) and demonstrate that our approach achieves a 2.5x--3x higher query-per-second (QPS) throughput compared to the widely-used graph-based search algorithm HNSW. Our code and data are available at https://github.com/KejingLu-810/KS.

One-sentence summary·Auto-generated by claude-haiku-4-5-20251001(?)

Proposes probabilistic kernel functions for angle testing enabling efficient approximate nearest neighbor search.

Contributions·Auto-generated by claude-haiku-4-5-20251001(?)
  • Two projection-based probabilistic kernel functions for angle comparison and thresholding
  • Leverages reference angles and deterministic projection vector structure without asymptotic assumptions
  • Achieves 2.5-3x higher query-per-second throughput compared to HNSW algorithm
Methods used·Auto-generated by claude-haiku-4-5-20251001(?)
  • Probabilistic kernel functions
  • Angle testing
  • Projection-based methods
  • Approximate nearest neighbor search
  • Graph-based search
Datasets used·Auto-generated by claude-haiku-4-5-20251001(?)
  • SIFT
  • Word2Vec
  • GIST
Limitations (author-stated)·Auto-generated by claude-haiku-4-5-20251001(?)

Authors did not state explicit limitations.

Future work (author-stated)·Auto-generated by claude-haiku-4-5-20251001(?)

Authors did not state explicit future directions.

Author keywords

  • Randomized algorithm
  • Locality Sensitive Hashing
  • Directional statistics

Related orals

Something off? Let us know →