Feature importance quantifies how much each input variable (feature) contributes to a model's predictive accuracy. In quantitative signal research, features are candidate alpha signals or derived factors, and the model predicts future asset returns or alpha scores.
Common measures
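- Mean Decrease in Impurity (MDI) — for tree-based models (Random Forest, gradient-boosted trees): the total reduction in the impurity metric (e.g., MSE for regression) achieved by splits on each feature, weighted by the number of samples reaching each split and aggregated across all trees. Fast to compute, but it is derived from the training data only and is biased toward high-cardinality features, such as continuous variables with many unique values.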
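- Permutation importance — the drop in model performance when a feature's values are randomly shuffled, breaking the feature's relationship with the target. Works for any model type and is generally more reliable than MDI, especially when computed on held-out data, although strongly correlated features can mask one another's importance.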
- SHAP values (SHapley Additive exPlanations) — game-theoretic attribution of each prediction to each feature. Provides both global importance (typically the mean absolute SHAP value per feature) and per-prediction explanations, enabling regime analysis and feature interaction detection.
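The first two measures can be compared side by side in a minimal sketch, assuming scikit-learn is available. The data is synthetic and the feature names (`momentum`, `value`, `noise_a`, `noise_b`) are hypothetical placeholders, not real signals; only the first two columns actually drive the target. SHAP is left out because it needs the separate `shap` package.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Synthetic stand-ins for candidate signals (names are hypothetical).
names = ["momentum", "value", "noise_a", "noise_b"]
X = rng.normal(size=(n, 4))
# Only the first two features drive the target; the rest are pure noise.
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# shuffle=False keeps the held-out block after the training block,
# mimicking a chronological split of time-ordered data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=False
)

model = RandomForestRegressor(
    n_estimators=100, min_samples_leaf=5, random_state=0
)
model.fit(X_train, y_train)

# MDI: computed from the training data during fitting, normalized to sum to 1.
mdi = model.feature_importances_

# Permutation importance: mean drop in R^2 on held-out data over 10 shuffles.
perm = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

for name, m, p in zip(names, mdi, perm.importances_mean):
    print(f"{name:10s} MDI={m:.3f}  permutation={p:.3f}")
```

On data like this, both measures should rank the two real drivers above the noise columns. Note that MDI still assigns the noise features a small positive share, whereas their held-out permutation importance should sit close to zero.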
Feature importance analysis in quant research serves two purposes: identifying which signals actually drive model predictions, and detecting whether the model has learned spurious correlations from the training data that will not persist out-of-sample.
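One heuristic for the second purpose is to compare permutation importance in-sample and out-of-sample: a feature that looks important on the training data but not on held-out data is a candidate spurious correlation. The sketch below assumes scikit-learn and uses synthetic data in which only feature 0 carries signal. The setup and the `gap` diagnostic are illustrative assumptions, not a standard test.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
n_train, n_test, n_features = 500, 500, 6

# Synthetic data: feature 0 is the only real driver, features 1-5 are noise.
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
y_train = X_train[:, 0] + rng.normal(size=n_train)
y_test = X_test[:, 0] + rng.normal(size=n_test)

# Fully grown trees (the default) let the forest memorize training noise.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Mean drop in R^2 when each feature is shuffled, in-sample and out-of-sample.
imp_in = permutation_importance(
    model, X_train, y_train, n_repeats=10, random_state=0
).importances_mean
imp_out = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
).importances_mean

# A large in-sample importance that vanishes out-of-sample suggests the
# model has fitted noise through that feature.
gap = imp_in - imp_out
for j in range(n_features):
    print(f"feature_{j}: in={imp_in[j]:.3f}  out={imp_out[j]:.3f}  gap={gap[j]:.3f}")
```

In this setup the noise features should show positive in-sample importance that collapses toward zero on the held-out set, while feature 0 stays important in both. With real return data the held-out set should come strictly after the training period, to avoid look-ahead leakage.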