Optimally Combining Forecasts

Online learning with provable performance guarantees

May 19, 2026


Imagine you have multiple models forecasting asset returns.

Maybe one is a linear autoregressive model that performs well in trending markets. Another is a mean-reversion model, and a third is a more complex machine learning model that captures nonlinear patterns. Each has its strengths and weaknesses, and most importantly, you don’t know in advance which one will be best tomorrow.

One naive approach is to pick the model that performed best in-sample and deploy it. Another is to weight all models equally. Both ignore everything you learn as new data arrives.

What if there were a principled way to combine your models that:

Allocates more weight to models that are currently performing well and less to models that are performing poorly.
Requires no retraining, no hyperparameter tuning, and no rebalancing decisions.
Comes with a mathematical guarantee that you perform nearly as well as the best model.

In this article, we present an algorithm that achieves all of the above, demonstrate its behavior, and show how it shines when markets move through regimes in which different models perform best.

I write about quantitative trading the way it’s actually practiced:

Robust models and portfolios, combining signals and strategies, understanding the assumptions behind your models.

Topics I write about include portfolio construction, market making, risk management, research methodology, and more.

If this way of thinking resonates, you’ll probably like what I publish.

What you’ll learn:

The theoretical framework of online learning and how it applies to forecast combination.
What regret is, why minimizing it is the right objective, and how it differs from standard in-sample loss minimization.
Why square loss has a special property called 2-mixability that enables a provably optimal combination algorithm.
How the aggregating forecaster works, why it is minimax optimal, and how to implement it from scratch in Python.
How to extend the algorithm with polynomial discounting to handle non-stationary markets with regime changes.
Why the optimal discount rate depends on regime persistence, and how to tune it empirically.
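
To make the idea of regret concrete before the full treatment, here is a minimal sketch of a textbook exponentially weighted average of forecasts under square loss. It is a simpler relative of the aggregating forecaster, not the algorithm developed in the article, and it includes no discounting. The function names, the synthetic two-regime data, and the choice eta = 0.5 are assumptions made for illustration. The sketch also assumes that forecasts and outcomes are scaled to [0, 1]; under that assumption square loss is exp-concave with eta = 0.5, and the standard regret bound for this scheme is ln(N) / eta.

```python
import numpy as np


def exp_weighted_combination(forecasts, outcomes, eta=0.5):
    """Combine N forecasts online with exponential weights (illustrative sketch).

    forecasts: array of shape (T, N); outcomes: array of shape (T,).
    Assumes all values lie in [0, 1] so that eta = 0.5 is a valid choice.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    T, N = forecasts.shape
    cum_loss = np.zeros(N)           # cumulative square loss of each model
    combined = np.empty(T)           # combined forecast at each step
    weights_hist = np.empty((T, N))  # weights used at each step
    for t in range(T):
        # Weights depend only on past losses; subtracting the minimum
        # leaves the normalized weights unchanged and avoids underflow.
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        w /= w.sum()
        weights_hist[t] = w
        combined[t] = w @ forecasts[t]
        # The outcome is revealed only after the forecast has been made.
        cum_loss += (forecasts[t] - outcomes[t]) ** 2
    return combined, weights_hist


def regret(forecasts, outcomes, combined):
    """Cumulative loss of the combination minus that of the best single model."""
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    model_losses = ((forecasts - outcomes[:, None]) ** 2).sum(axis=0)
    return ((combined - outcomes) ** 2).sum() - model_losses.min()


# Synthetic example (hypothetical data): model 0 is accurate in the first
# half, model 1 in the second half, and model 2 always predicts 0.5.
rng = np.random.default_rng(0)
T = 500
outcomes = rng.uniform(0.0, 1.0, size=T)
noise_small = rng.normal(0.0, 0.05, size=T)
noise_large = rng.normal(0.0, 0.30, size=T)
first_half = np.arange(T) < T // 2
model_0 = outcomes + np.where(first_half, noise_small, noise_large)
model_1 = outcomes + np.where(first_half, noise_large, noise_small)
model_2 = np.full(T, 0.5)
forecasts = np.clip(np.column_stack([model_0, model_1, model_2]), 0.0, 1.0)

combined, weights = exp_weighted_combination(forecasts, outcomes)
r = regret(forecasts, outcomes, combined)
print(f"regret vs. best single model: {r:.4f}")
print(f"bound ln(N) / eta:            {np.log(3) / 0.5:.4f}")
```

Note that the weights are driven by cumulative loss over the whole history, so this plain version reacts slowly once a long track record has built up. That slow reaction is the motivation for the discounting extension listed above.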

VertoxQuant
