Feature Importance

A measure of each input variable's contribution to a machine learning model's predictions.

Feature importance quantifies how much each input variable (feature) contributes to a model's predictive accuracy. In quantitative signal research, features are candidate alpha signals or derived factors, and the model predicts future asset returns or alpha scores.

Common measures

Mean Decrease in Impurity (MDI): for tree-based models (Random Forest, gradient-boosted trees), the total reduction in the impurity metric (e.g., MSE for regression) achieved by splits on each feature, accumulated across all trees. Fast to compute, but derived from the training data only and biased toward high-cardinality features, including continuous ones.
Permutation importance: the drop in model performance when a feature's values are randomly shuffled, which breaks the feature's relationship with the target. It works for any model type and, when computed on held-out data, is generally more reliable than MDI, although strongly correlated features can mask each other's importance.
SHAP values (SHapley Additive exPlanations): a game-theoretic attribution of each individual prediction to the input features. They provide per-prediction explanations and, when aggregated (e.g., as mean absolute SHAP value), global importance, which enables regime analysis and feature interaction detection.
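To make the game-theoretic idea behind SHAP concrete, the following is a minimal sketch of exact Shapley attribution for a single prediction. It is not how SHAP libraries compute values in practice: it enumerates every feature coalition (feasible only for a handful of features) and assumes that "missing" features are replaced by a single baseline value. The names `shapley_values` and `toy_model`, the feature names, and the numbers are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Assumption: a feature outside the coalition takes its baseline value.
    Cost grows as 2^n, so this is only a teaching sketch.
    """
    n = len(x)

    def coalition_value(coalition):
        # Features in the coalition keep their real value, others the baseline.
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (coalition_value(s | {i}) - coalition_value(s))
    return phi


def toy_model(z):
    # Hypothetical alpha model: momentum has a main effect and interacts
    # with value; the third input is ignored by the model.
    momentum, value_factor, unused = z
    return 2.0 * momentum + momentum * value_factor


x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
# phi is approximately [3.0, 1.0, 0.0]: the interaction term (worth 2.0 here)
# is split evenly between momentum and value, and the unused input gets zero.
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The attributions sum to the difference between the prediction and the baseline prediction, which is the additivity property that lets per-prediction values be aggregated into a global importance ranking.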

Feature importance analysis in quant research serves two purposes: identifying which signals actually drive model predictions, and detecting whether the model has learned spurious correlations from the training data that will not persist out-of-sample.
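One way to probe for spurious correlations is to compare permutation importance measured in-sample with the same measure on held-out data. The sketch below uses only the standard library and a deliberately overfit 1-nearest-neighbour lookup as the model; the synthetic data, the two feature roles (one true signal, one pure noise), and all function names are illustrative assumptions rather than a recommended research setup.

```python
import random


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    base = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - base)
        importances.append(sum(increases) / n_repeats)
    return importances


def make_data(n, rng):
    # Column 0 is a genuine signal; column 1 is noise unrelated to the target.
    X, y = [], []
    for _ in range(n):
        signal, noise = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([signal, noise])
        y.append(2.0 * signal + rng.gauss(0, 0.5))
    return X, y


rng = random.Random(42)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(100, rng)


def predict_1nn(row):
    """1-nearest-neighbour lookup: memorises the training set, noise included."""
    nearest = min(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], row)),
    )
    return y_train[nearest]


in_sample = permutation_importance(predict_1nn, X_train, y_train)
out_of_sample = permutation_importance(predict_1nn, X_test, y_test)
print("in-sample     [signal, noise]:", in_sample)
print("out-of-sample [signal, noise]:", out_of_sample)
```

Because the model reproduces the training targets exactly, shuffling even the noise column degrades in-sample performance, so the noise feature appears to matter. On held-out data its importance should typically sit near zero while the true signal remains clearly important. A feature that ranks highly in-sample but not out-of-sample is a candidate spurious correlation.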
