A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
Hideaki Kim, Tomoharu Iwata
Abstract
The representer theorem is a cornerstone of kernel methods, which aim to estimate latent functions in reproducing kernel Hilbert spaces (RKHSs) in a nonparametric manner. Its significance lies in converting inherently infinite-dimensional optimization problems into finite-dimensional ones over dual coefficients, thereby enabling practical and computationally tractable algorithms. In this paper, we address the problem of estimating the latent triggering kernels--functions that encode the interaction structure between events--for linear multivariate Hawkes processes based on observed event sequences within an RKHS framework. We show that, under the principle of penalized least squares minimization, a novel form of representer theorem emerges: a family of transformed kernels can be defined via a system of simultaneous integral equations, and the optimal estimator of each triggering kernel is expressed as a linear combination of these transformed kernels evaluated at the data points. Remarkably, the dual coefficients are all analytically fixed to unity, obviating the need to solve a costly optimization problem to obtain the dual coefficients. This leads to a highly efficient estimator capable of handling large-scale data more effectively than conventional nonparametric approaches. Empirical evaluations on synthetic and real-world datasets reveal that the proposed method achieves competitive predictive accuracy while substantially improving computational efficiency compared to state-of-the-art kernel method-based estimators.
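Read schematically (the notation below is assumed for illustration and is not taken from the paper itself), the stated result says that the penalized least-squares estimate of each triggering kernel is a plain sum of transformed kernels over the observed events:

```latex
% Schematic statement; all symbols here are illustrative assumptions.
% \phi_{uv} is the triggering kernel from dimension v to dimension u,
% and \{t_i^v\} are the observed event times on dimension v.
\[
  \hat{\phi}_{uv}(t) \;=\; \sum_{i} \tilde{k}_{uv}\!\bigl(t,\, t_i^{v}\bigr),
\]
% where the transformed kernels \tilde{k}_{uv} are characterized by a
% system of simultaneous integral equations involving the base RKHS
% kernel and the data, rather than by fitting dual coefficients.
```

Because every coefficient in the sum is fixed to one, the usual finite-dimensional optimization over dual coefficients disappears entirely.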
A representer theorem for Hawkes processes under penalized least squares minimization shows that all dual coefficients are analytically fixed to unity.
- Novel representer theorem for multivariate Hawkes processes showing optimal estimators are linear combinations of transformed kernels
- Analytical solution with dual coefficients fixed to unity, eliminating the need for costly optimization (see the sketch after this list)
- Efficient estimator handling large-scale data better than conventional nonparametric RKHS-based approaches
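As a rough illustration of that unit-coefficient structure, the toy Python sketch below solves a single Fredholm integral equation of the second kind on a grid as a stand-in for the paper's system of simultaneous integral equations (the kernels `k` and `g`, the grid, and all parameter values are assumptions, not the paper's), then evaluates the estimator as a plain sum of the transformed kernel over event times.

```python
import numpy as np

# Toy sketch of the "unit dual coefficients" structure described in the
# abstract. The paper's actual system of simultaneous integral equations
# is NOT reproduced here; as a stand-in we solve one Fredholm equation
# of the second kind,
#     k_tilde(t, s) + int_0^T g(t, r) k_tilde(r, s) dr = k(t, s),
# by Nystrom discretization, then evaluate the estimator as a plain sum
# of the transformed kernel over event times, with no coefficients fit.

rng = np.random.default_rng(0)
T = 10.0                                        # observation window [0, T]
grid = np.linspace(0.0, T, 200)
dt = grid[1] - grid[0]

k = lambda t, s: np.exp(-0.5 * (t - s) ** 2)    # base RKHS kernel (assumed Gaussian)
g = lambda t, r: 0.1 * np.exp(-np.abs(t - r))   # stand-in integral-equation kernel

# Nystrom discretization: (I + G * dt) K_tilde = K, solved column by column.
G = g(grid[:, None], grid[None, :])
K = k(grid[:, None], grid[None, :])
K_tilde = np.linalg.solve(np.eye(len(grid)) + G * dt, K)

def k_tilde(t, s):
    """Transformed kernel at (t, s), via nearest-grid lookup."""
    i = np.argmin(np.abs(grid - t))
    j = np.argmin(np.abs(grid - s))
    return K_tilde[i, j]

# Representer-theorem structure: a linear combination of transformed
# kernels at the data points with every dual coefficient equal to one.
event_times = np.sort(rng.uniform(0.0, T, size=30))  # synthetic events
phi_hat = lambda t: sum(k_tilde(t, ti) for ti in event_times)

print(phi_hat(2.5))
```

In the actual method, the transformed kernels would come from the paper's multivariate system and the sum would run over the events of the relevant dimension; the point of the sketch is only that no dual coefficients are fitted.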
- Kernel methods
- Reproducing kernel Hilbert spaces (RKHS)
- Penalized least squares
- Representer theorem
Limitations
- The linear Hawkes process assumption does not guarantee non-negativity of the intensity function, so post-hoc clipping is required
- Computational complexity scales cubically with the dimensionality U, though the method is empirically robust for moderate dimensions up to a few hundred
- Rigorous theoretical analysis of learning guarantees is left open
Author keywords
- Hawkes processes
- kernel methods
- representer theorem
- point processes
- least squares loss
Related orals
On The Surprising Effectiveness of a Single Global Merging in Decentralized Learning
Shows decentralized learning with single global merging achieves convergence rates matching parallel SGD under data heterogeneity.
Non-Convex Federated Optimization under Cost-Aware Client Selection
Develops efficient federated optimization algorithm with cost-aware client selection achieving best communication and local complexity.
Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
Analyzes phase retrieval learning dynamics with anisotropic data, deriving explicit scaling laws and three-phase trajectories.
Quantitative Bounds for Length Generalization in Transformers
Quantitative bounds show training length required for length generalization depends on periodicity, locality, alphabet size, and model norms.
Difficult Examples Hurt Unsupervised Contrastive Learning: A Theoretical Perspective