CauKer: Classification Time Series Foundation Models Can Be Pretrained on Synthetic Data

Shifeng Xie, Vasilii Feofanov, Jianfeng Zhang, Themis Palpanas, Ievgen Redko

Causal & Statistical Methods · Sat, Apr 25 · 10:54 AM–11:04 AM · 201 C · Avg rating: 6.00 (4–8)

Abstract

Time series foundation models (TSFMs) have recently gained significant attention due to their strong zero-shot capabilities and widespread real-world applications. Such models typically require a computationally costly pretraining on large-scale, carefully curated collections of real-world sequences. To allow for a sample-efficient pretraining of TSFMs, we propose CauKer, a novel algorithm designed to generate diverse, causally coherent synthetic time series with realistic trends, seasonality, and nonlinear interactions. CauKer combines Gaussian Process (GP) kernel composition with Structural Causal Models (SCM) to produce data for sample-efficient pretraining of state-of-the-art classification TSFMs having different architectures and following different pretraining approaches. Additionally, our experiments reveal that CauKer-generated datasets exhibit clear scaling laws for both dataset size (10K to 10M samples) and model capacity (1M to 783M parameters), unlike real-world datasets, which display irregular scaling behavior.
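The abstract names two ingredients, GP kernel composition and a structural causal model, without giving details. The following is a minimal illustrative sketch of how those two ideas could fit together, not the authors' implementation. Every function name, the kernel bank, the activation set, the edge probability, and the noise scale are assumptions made for this example: each node of a random DAG draws a GP sample from a randomly composed kernel (sum or product of two base kernels), and non-root nodes pass a weighted mix of their parents through a nonlinearity.

```python
import numpy as np

# Base kernels evaluated on a time grid t (all hyperparameters are illustrative).
def rbf_kernel(t, length_scale=0.2):
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def periodic_kernel(t, period=0.25, length_scale=1.0):
    d = np.abs(t[:, None] - t[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / length_scale ** 2)

def linear_kernel(t, offset=0.0):
    return (t[:, None] - offset) * (t[None, :] - offset)

def sample_gp(kernel_matrix, rng, jitter=1e-6):
    """Draw one zero-mean GP sample via an eigendecomposition of the covariance."""
    n = kernel_matrix.shape[0]
    eigvals, eigvecs = np.linalg.eigh(kernel_matrix + jitter * np.eye(n))
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative eigenvalues
    return eigvecs @ (np.sqrt(eigvals) * rng.standard_normal(n))

def generate_series(n_steps=128, n_nodes=4, seed=0):
    """Hypothetical generator: GP samples propagated through a random DAG."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_steps)
    kernels = [rbf_kernel(t), periodic_kernel(t), linear_kernel(t)]
    activations = [np.tanh, np.sin, lambda x: np.maximum(x, 0.0)]

    def random_kernel():
        # Sums and elementwise products of valid kernels are valid kernels.
        i, j = rng.choice(len(kernels), size=2, replace=False)
        a, b = kernels[i], kernels[j]
        return a + b if rng.random() < 0.5 else a * b

    nodes = []
    for k in range(n_nodes):  # node index order doubles as topological order
        gp_draw = sample_gp(random_kernel(), rng)
        parents = [p for p in range(k) if rng.random() < 0.5]
        if not parents:
            nodes.append(gp_draw)  # root node: pure GP sample
        else:
            weights = rng.normal(size=len(parents))
            mix = sum(w * nodes[p] for w, p in zip(weights, parents))
            f = activations[rng.integers(len(activations))]
            nodes.append(f(mix) + 0.1 * gp_draw)  # nonlinear mechanism plus GP noise
    return t, np.stack(nodes)

if __name__ == "__main__":
    t, series = generate_series()
    print(series.shape)
```

Under these assumptions, the kernel choice controls trend and seasonality of each node, while the DAG and activations introduce nonlinear dependence between nodes; any node's series could then serve as one synthetic pretraining sample.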

One-sentence summary · Auto-generated by claude-haiku-4-5-20251001

CauKer generates diverse synthetic time series for pretraining classification TSFMs, and the generated datasets exhibit clear scaling laws.

Contributions · Auto-generated by claude-haiku-4-5-20251001

Proposes the CauKer algorithm, which combines Gaussian Process kernel composition with Structural Causal Models
Generates causally coherent synthetic data with realistic trends, seasonality, and nonlinear interactions
Demonstrates that TSFMs pretrained on CauKer-generated data match the performance of models pretrained on larger real-world datasets
Reveals clear scaling laws for synthetic data, unlike the irregular scaling behavior of real-world datasets

Methods used · Auto-generated by claude-haiku-4-5-20251001

Gaussian processes
Structural causal models
Synthetic data generation

Limitations (author-stated) · Auto-generated by claude-haiku-4-5-20251001

Considered only two models following different pre-training paradigms
Did not consider large-scale forecasting benchmarks such as Time-300B

Future work (author-stated) · Auto-generated by claude-haiku-4-5-20251001

Authors did not state explicit future directions.

Author keywords

Time Series Foundation Model
Time Series Classification


