Ensemble Methods

Techniques that combine predictions from multiple models to reduce variance and improve out-of-sample robustness.

Ensemble methods improve prediction by combining the outputs of multiple models rather than relying on a single model's forecast. In quantitative signal research, ensembles are widely used to generate more stable composite alpha signals and to reduce the risk that any single model overfits in-sample.
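
The variance argument can be checked numerically. The sketch below uses only the Python standard library and assumes that every model's error has unit variance and that all pairs of models share the same error correlation rho. Both are simplifying assumptions made for illustration. Under them, the variance of the average of N models is rho + (1 - rho) / N. The function names are hypothetical.

```python
import random
import statistics


def simulate(n_models, rho, n_trials=10000, seed=0):
    """Empirical variance of the average of n_models unit-variance errors
    whose pairwise correlation is rho (assumed equal for every pair)."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        common = rng.gauss(0.0, 1.0)  # error component shared by all models
        errors = [
            (rho ** 0.5) * common + ((1 - rho) ** 0.5) * rng.gauss(0.0, 1.0)
            for _ in range(n_models)
        ]
        averages.append(sum(errors) / n_models)
    return statistics.pvariance(averages)


def theoretical(n_models, rho):
    """Variance of the average under the same equal-correlation assumption."""
    return rho + (1 - rho) / n_models


for rho in (0.0, 0.3, 0.9):
    print(f"rho={rho}: simulated={simulate(10, rho):.3f}, "
          f"theoretical={theoretical(10, rho):.3f}")
```

The more correlated the models' errors are, the less averaging helps, which is why ensembles tend to benefit from diverse base models.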

Common ensemble techniques

  • Bagging (Bootstrap AGGregatING): trains many models on different bootstrap samples of the data and averages their predictions. This mainly reduces variance. Random forests are the canonical example.
  • Boosting: trains models sequentially, with each new model fitted to the errors of the ensemble built so far. This mainly reduces bias. XGBoost, LightGBM, and CatBoost are widely used in finance.
  • Stacking: uses a meta-model to learn how to combine the base models' predictions. The meta-model is trained on held-out (out-of-fold) predictions so that it does not simply reward base models that overfit.
  • IC-weighted averaging: a signal-research-specific ensemble that weights each sub-signal by its recent rolling information coefficient (IC, the correlation between the signal and subsequent returns), giving more influence to the signals that have been most predictive recently.
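
The first three techniques in the list can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not a recommended research setup: the data-generating process is invented, scikit-learn's GradientBoostingRegressor stands in for the boosting libraries named above, and cv=5 is plain K-fold, which ignores time ordering (financial data typically needs a time-aware validation scheme).

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): nonlinear target plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] * X[:, 2] + rng.normal(scale=0.3, size=600)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # Bagging: trees fitted on bootstrap samples, predictions averaged.
    "bagging (random forest)": RandomForestRegressor(
        n_estimators=100, random_state=0
    ),
    # Boosting: trees added sequentially, each fitted to the remaining errors.
    "boosting (gradient boosting)": GradientBoostingRegressor(random_state=0),
    # Stacking: a ridge meta-model fitted on out-of-fold base predictions.
    "stacking": StackingRegressor(
        estimators=[
            ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
            ("gb", GradientBoostingRegressor(random_state=0)),
        ],
        final_estimator=Ridge(),
        cv=5,
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # out-of-sample R^2
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

The scores are only meaningful relative to one another on this toy problem; which technique does best depends on the data.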

Ensemble methods are also a form of signal aggregation: rather than committing to a single signal, they blend several, with weights that are either fixed in advance (simple averaging) or estimated from the data (stacking, IC-weighted averaging).
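
As one possible reading of IC-weighted averaging, the sketch below computes a per-date rank IC for each sub-signal, averages it over a trailing window, and uses the result as blend weights. Several choices here are assumptions rather than part of the definition: the array layout, the 20-period window, the one-period return horizon, clipping negative ICs to zero, and falling back to equal weights when no signal has a positive rolling IC. The function names are hypothetical.

```python
import numpy as np


def rank_ic(signal, fwd_ret):
    """Spearman rank correlation between one cross-section of signal values
    and the returns that followed (ties are not handled specially)."""
    s_rank = signal.argsort().argsort()
    r_rank = fwd_ret.argsort().argsort()
    return np.corrcoef(s_rank, r_rank)[0, 1]


def ic_weighted_composite(signals, fwd_rets, window=20):
    """signals: (T, K, N) array of K sub-signals for N assets on T dates.
    fwd_rets: (T, N) array; fwd_rets[t] is the one-period return realised
    after signals[t] was observed."""
    T, K, N = signals.shape
    ics = np.array([
        [rank_ic(signals[t, k], fwd_rets[t]) for k in range(K)]
        for t in range(T)
    ])
    composite = np.full((T, N), np.nan)
    weights = np.full((T, K), np.nan)
    for t in range(window, T):
        # Only ICs from dates before t, whose returns are known by date t.
        rolling_ic = ics[t - window:t].mean(axis=0)
        w = np.clip(rolling_ic, 0.0, None)  # assumption: ignore negative ICs
        w = w / w.sum() if w.sum() > 0 else np.full(K, 1.0 / K)
        weights[t] = w
        composite[t] = w @ signals[t]  # weighted blend across sub-signals
    return composite, weights


# Synthetic example: sub-signal 0 is predictive, sub-signal 1 is pure noise.
rng = np.random.default_rng(0)
T, K, N = 120, 2, 200
signals = rng.normal(size=(T, K, N))
fwd_rets = 0.2 * signals[:, 0, :] + rng.normal(size=(T, N))
composite, weights = ic_weighted_composite(signals, fwd_rets)
print("final weights:", weights[-1])
```

Using only ICs from earlier dates avoids look-ahead: the weight applied on a given date depends solely on returns that had already been realised.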
