Feature Importance

A measure of each input variable's contribution to a machine learning model's predictions.

Feature importance quantifies how much each input variable (feature) contributes to a model's predictions or to its predictive performance, depending on the measure used. In quantitative signal research, the features are candidate alpha signals or derived factors, and the model predicts future asset returns or alpha scores.

Common measures

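  • Mean Decrease in Impurity (MDI): for tree-based models (Random Forest, gradient-boosted trees), the total reduction in the impurity metric (e.g., MSE) from splits on each feature, weighted by the number of samples reaching each split and averaged across trees. Fast to compute, but biased toward high-cardinality features (continuous variables or categoricals with many levels), and because it is computed on training data it can credit features the model has merely overfit.
  • Permutation importance: the drop in model performance when a feature's values are randomly shuffled, breaking the feature's relationship with the target. Generally more reliable than MDI, especially when evaluated on held-out data, and it works for any model type. Importance can be understated when features are strongly correlated, because the model can recover the shuffled information from the remaining features.
  • SHAP values (SHapley Additive exPlanations): game-theoretic attribution of each prediction to each feature. Provides both global importance (e.g., the mean absolute SHAP value per feature) and per-prediction explanations, enabling regime analysis and feature interaction detection.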
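
As a rough illustration of the two model-agnostic measures above, the sketch below computes permutation importance and exact Shapley values for a toy linear model using only NumPy. Everything here is an assumption made for illustration: the weights w, the synthetic data, and the helper names permutation_importance and exact_shapley are hypothetical, and the Shapley routine enumerates every coalition, so it is only practical for a handful of features. Production work would more likely use an established library.

```python
import itertools
import math

import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base = np.mean((y - predict(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(X[:, j])  # break the link to y
            scores[j] += np.mean((y - predict(Xp)) ** 2) - base
    return scores / n_repeats


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one row x by enumerating all coalitions.

    Features outside a coalition are replaced by `baseline` (an assumed
    reference point, here the column means of the data).
    """
    d = len(x)

    def value(S):
        z = baseline.copy()
        idx = np.array(S, dtype=int)
        z[idx] = x[idx]
        return predict(z[None, :])[0]

    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for S in itertools.combinations(others, r):
                phi[j] += weight * (value(S + (j,)) - value(S))
    return phi


# Toy setup (hypothetical): a fixed linear "model" whose third feature is irrelevant.
rng = np.random.default_rng(42)
w = np.array([2.0, -1.0, 0.0])
X = rng.normal(size=(500, 3))
y = X @ w + 0.1 * rng.normal(size=500)


def predict(A):
    return A @ w


perm = permutation_importance(predict, X, y)   # global ranking of features
baseline = X.mean(axis=0)
phi = exact_shapley(predict, X[0], baseline)   # attribution for a single row

print("permutation importance:", perm)
print("Shapley values for row 0:", phi)
```

For a linear model with this baseline, each Shapley value reduces to the weight times the feature's deviation from the baseline, and the values sum to the difference between the row's prediction and the baseline prediction. The irrelevant third feature receives zero permutation importance because shuffling it does not change any prediction.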

Feature importance analysis in quant research serves two purposes: identifying which signals actually drive model predictions, and detecting whether the model has learned spurious correlations from the training data that will not persist out-of-sample.
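
One way to probe for spurious correlations, sketched below under stated assumptions, is to compare permutation importance measured on the training data with the same measure on held-out data. The setup is entirely synthetic and hypothetical: one genuine signal column, twenty pure-noise columns, a deliberately small training set, and an ordinary least squares fit standing in for the model. Features whose importance is visible in-sample but vanishes out-of-sample are candidates for having been overfit.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base = np.mean((y - predict(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(X[:, j])
            scores[j] += np.mean((y - predict(Xp)) ** 2) - base
    return scores / n_repeats


rng = np.random.default_rng(7)
n_train, n_test, n_noise = 40, 2000, 20  # assumed sizes, chosen to force overfitting


def make_data(n):
    X = rng.normal(size=(n, 1 + n_noise))
    y = X[:, 0] + rng.normal(size=n)  # only column 0 carries real signal
    return X, y


X_tr, y_tr = make_data(n_train)
X_te, y_te = make_data(n_test)

# OLS with 21 features on 40 rows will also fit the noise columns.
coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)


def predict(A):
    return A @ coef


imp_in = permutation_importance(predict, X_tr, y_tr)    # in-sample
imp_out = permutation_importance(predict, X_te, y_te)   # out-of-sample

spurious_in = imp_in[1:].sum()    # importance credited to noise, in-sample
spurious_out = imp_out[1:].sum()  # the same quantity on held-out data

print("noise importance in-sample:    ", spurious_in)
print("noise importance out-of-sample:", spurious_out)
print("top out-of-sample feature:     ", int(np.argmax(imp_out)))
```

In this sketch the noise columns collectively look important on the training set, while on held-out data their importance collapses toward zero and only the genuine signal column stands out. With real market data the gap is rarely this clean, so the comparison is better treated as a diagnostic than as a test.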
