Performance Metrics for Signal Evaluation

Key Takeaways

No single number evaluates a signal. You need predictive-power metrics, risk-adjusted return metrics, cost-aware metrics, and statistical-significance metrics together.
The information coefficient (IC) measures how well a signal ranks future returns; the information ratio (IR) measures active return per unit of active risk.
Grinold's Fundamental Law of Active Management links them: IR is approximately IC times the square root of the number of independent bets (breadth).
A raw Sharpe ratio from a backtest is biased upward by how many strategies you tried — corrections such as the deflated Sharpe ratio exist precisely to undo that bias.
The decisive question is whether the edge survives out-of-sample and net of realistic transaction costs, not whether the in-sample curve looks smooth.

A trading signal is only as good as your ability to measure it honestly. The hard part is not computing a return number — it is separating a genuine, repeatable edge from a backtest that looks good by luck. This guide walks through the metrics that matter for evaluating a signal, what each one actually tells you, and the traps that make a strong-looking number meaningless.

Predictive power: the information coefficient

Before any return can be earned, a signal has to actually predict something. The information coefficient (IC) measures that directly: it is the correlation between the signal's values today and the realised forward returns over the holding horizon. Practitioners usually compute the rank (Spearman) IC, which is robust to outliers and matches how signals are typically used to rank assets.
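As a minimal sketch of the rank IC described above, the snippet below computes a Spearman correlation for one cross-section using only the Python standard library. The function names (`ranks`, `pearson`, `rank_ic`) and the five-asset data are hypothetical, chosen purely for illustration.

```python
from statistics import mean

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def rank_ic(signal, forward_returns):
    """Spearman IC: Pearson correlation of the ranks."""
    return pearson(ranks(signal), ranks(forward_returns))

# Hypothetical cross-section: signal values and forward returns for five assets.
signal = [0.8, -0.2, 0.1, 1.5, -1.0]
fwd_ret = [0.012, -0.004, 0.006, 0.009, -0.015]
print(round(rank_ic(signal, fwd_ret), 3))  # 0.9 for this toy data
```

In practice the IC is computed per date and then averaged, so that its stability over time can be inspected alongside its level. The toy value here is far higher than anything a real signal would show.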

An IC is small by nature — useful signals in liquid markets often have ICs that look tiny to an outsider. What matters is that the IC is positive, stable across time, and stable across the assets you trade. Two further diagnostics help:

IC decay: compute the IC at several forward horizons. The horizon where IC peaks tells you the natural holding period; the rate at which it falls off tells you how fast the signal must be acted on.
Hit rate: the fraction of predictions with the correct sign. It is intuitive but incomplete — a signal can have a modest hit rate yet be very profitable if its correct calls are larger than its wrong ones.
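The two diagnostics above can be sketched as follows. This is an illustrative example with made-up data, not a production implementation: `spearman` uses the textbook rank-difference formula and assumes no tied values, and the horizons (1, 5 and 20 days) are arbitrary choices.

```python
def spearman(x, y):
    """Rank correlation via the classic d^2 formula; assumes no tied values."""
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0] * len(v)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def hit_rate(signal, fwd):
    """Fraction of non-zero predictions whose sign matches the forward return."""
    pairs = [(s, r) for s, r in zip(signal, fwd) if s != 0]
    return sum((s > 0) == (r > 0) for s, r in pairs) / len(pairs)

# Hypothetical data: one signal snapshot, forward returns at three horizons (days).
signal = [0.8, -0.2, 0.1, 1.5, -1.0, 0.4]
fwd_by_horizon = {
    1:  [0.001, -0.001, 0.003, 0.005, -0.004, 0.002],
    5:  [0.012, -0.004, 0.006, 0.009, -0.015, 0.007],
    20: [-0.010, 0.015, 0.002, 0.005, -0.020, 0.030],
}

# IC decay: the IC at each horizon, and the horizon where it peaks.
ic_decay = {h: spearman(signal, r) for h, r in fwd_by_horizon.items()}
peak_horizon = max(ic_decay, key=ic_decay.get)
print(ic_decay, peak_horizon)

# Hit rate versus payoff: only 2 of 5 calls are right, yet total P&L is positive
# because the winners are larger than the losers.
calls = [1, 1, 1, 1, 1]                       # always long, for simplicity
outcomes = [0.03, -0.01, -0.01, 0.04, -0.01]
print(hit_rate(calls, outcomes), sum(c * r for c, r in zip(calls, outcomes)))
```

In this toy data the IC peaks at the 5-day horizon and falls away by 20 days, which is the shape that would suggest a holding period of roughly a week.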

Risk-adjusted return: Sharpe, Sortino, and the information ratio

Return alone is not a verdict — it has to be judged against the risk taken to earn it. The standard measures are:

Sharpe ratio: excess return divided by the volatility of returns, annualised. It is the default common currency for comparing strategies, but it penalises upside and downside volatility equally.
Sortino ratio: a variant that divides by downside deviation only, which better reflects that investors fear losses more than they dislike upside swings.
Information ratio (IR): active return (return above a benchmark) divided by tracking error (the volatility of that active return). For a strategy measured against a benchmark, the IR is often more relevant than the raw Sharpe.
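The three ratios above can be sketched in a few lines. The assumptions here are ours, not the article's: returns are simple per-period returns, annualisation uses 252 periods per year, and downside deviation is the root-mean-square shortfall below a target, averaged over all observations (one common convention among several).

```python
from math import sqrt
from statistics import mean, stdev

PERIODS_PER_YEAR = 252  # assumption: daily returns

def sharpe(returns, rf=0.0, periods=PERIODS_PER_YEAR):
    """Annualised mean excess return over the volatility of excess returns."""
    excess = [r - rf for r in returns]
    return mean(excess) / stdev(excess) * sqrt(periods)

def sortino(returns, target=0.0, periods=PERIODS_PER_YEAR):
    """Like Sharpe, but the denominator counts only shortfalls below the target."""
    downside = sqrt(mean([min(r - target, 0.0) ** 2 for r in returns]))
    return (mean(returns) - target) / downside * sqrt(periods)

def information_ratio(returns, benchmark, periods=PERIODS_PER_YEAR):
    """Annualised mean active return over tracking error."""
    active = [r - b for r, b in zip(returns, benchmark)]
    return mean(active) / stdev(active) * sqrt(periods)

# Hypothetical daily returns for a strategy and its benchmark.
strategy = [0.010, -0.005, 0.007, -0.002, 0.004]
benchmark = [0.006, -0.004, 0.003, -0.003, 0.002]
print(sharpe(strategy), sortino(strategy), information_ratio(strategy, benchmark))
```

With so few observations the numbers are meaningless as estimates. The point is only to show which return series and which risk measure each ratio uses.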

Figure: Sortino, Calmar and especially the deflated Sharpe (which penalises the number of trials) give a more honest read than Sharpe alone. Values are relative and illustrative.

The Fundamental Law of Active Management

Richard Grinold's Fundamental Law of Active Management connects skill to results in a single, clarifying relationship: the information ratio is approximately the information coefficient multiplied by the square root of breadth, where breadth is the number of independent bets made per period.
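Written as a formula, with BR denoting breadth (our notation) and a purely illustrative set of numbers, the relationship reads:

```latex
\mathrm{IR} \approx \mathrm{IC} \times \sqrt{\mathrm{BR}}
\qquad\text{e.g.}\qquad
\mathrm{IC} = 0.05,\ \mathrm{BR} = 400
\;\Rightarrow\;
\mathrm{IR} \approx 0.05 \times 20 = 1.0
```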

The practical lesson is that a small per-bet edge (a low IC) can still produce a strong information ratio if it is applied across many independent opportunities. This is the mathematical argument for diversification across names and across time, and it explains why a "weak" signal traded broadly can beat a "strong" signal traded narrowly. The catch is the word independent: if your bets are correlated, your effective breadth is far lower than your trade count suggests, and the law flatters you.
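To see how much correlation can erode breadth, here is a rough sketch. The effective-breadth adjustment `n / (1 + (n - 1) * rho)` is not part of Grinold's law; it is a common back-of-envelope approximation that assumes every pair of bets shares the same correlation `rho`. All numbers are hypothetical.

```python
from math import sqrt

def fundamental_law_ir(ic, breadth):
    """IR ~= IC * sqrt(breadth), per the Fundamental Law."""
    return ic * sqrt(breadth)

def effective_breadth(n_bets, avg_correlation):
    """Approximate count of independent bets under equal pairwise correlation.

    Assumption: a single average correlation describes all pairs of bets.
    """
    return n_bets / (1 + (n_bets - 1) * avg_correlation)

ic = 0.05
naive_ir = fundamental_law_ir(ic, 400)        # treats all 400 bets as independent
n_eff = effective_breadth(400, 0.05)          # roughly 19 effective bets
adjusted_ir = fundamental_law_ir(ic, n_eff)   # far lower than the naive figure
print(naive_ir, round(n_eff, 1), round(adjusted_ir, 2))
```

Under these assumptions even a 5% average correlation cuts 400 nominal bets to about 19 effective ones, and the implied information ratio falls from about 1.0 to about 0.2.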

Accounting for cost and turnover

Every metric above should be computed on returns that already subtract realistic trading costs. A signal that looks excellent gross of costs but turns over its book aggressively can be unprofitable net. Two cost-aware diagnostics are essential:

Turnover: how much of the portfolio is replaced per period. High turnover multiplies cost exposure and shortens the runway before the edge is consumed.
Break-even cost: the level of per-trade cost at which the strategy's net return falls to zero. If your expected real-world cost is uncomfortably close to break-even, the signal is fragile regardless of its gross Sharpe.
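A simplified sketch of both diagnostics follows. It assumes a linear cost charged per unit of notional traded, measures turnover as the sum of absolute weight changes at a rebalance (two-way), and ignores weight drift between rebalances and market impact. The tickers, weights and returns are hypothetical.

```python
def two_way_turnover(prev_weights, new_weights):
    """Sum of absolute weight changes: notional traded per unit of capital."""
    names = set(prev_weights) | set(new_weights)
    return sum(abs(new_weights.get(n, 0.0) - prev_weights.get(n, 0.0)) for n in names)

def net_return(gross_return, turnover, cost_per_unit):
    """Gross return minus a linear cost on everything traded."""
    return gross_return - cost_per_unit * turnover

def break_even_cost(gross_return, turnover):
    """Per-unit cost at which the net return falls to zero."""
    return gross_return / turnover

prev = {"AAA": 0.5, "BBB": 0.5}
new = {"AAA": 0.2, "BBB": 0.5, "CCC": 0.3}

turnover = two_way_turnover(prev, new)       # 0.6 of capital traded
gross = 0.0012                               # 12 bps gross return per period
print(break_even_cost(gross, turnover))      # about 0.002, i.e. 20 bps per unit traded
print(net_return(gross, turnover, 0.0010))   # about 0.0006 net at a 10 bps cost
```

Here the strategy keeps half of its gross return at a 10 bps cost and loses all of it at 20 bps, which is the kind of margin the break-even figure is meant to expose.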

Drawdown and path metrics

Two strategies with the same average return can be wildly different to live through. Path-dependent metrics capture this:

Maximum drawdown: the largest peak-to-trough decline. It speaks to survivability and to the leverage you can responsibly run.
Calmar ratio: annualised return divided by maximum drawdown, a return-per-unit-of-pain measure.
Time under water: how long the strategy stays below a prior peak — a practical gauge of how much patience (from you and from any capital allocator) the strategy demands.
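The three path metrics can be computed in a single pass over the return series. This sketch assumes simple per-period returns compounded from a starting equity of 1.0, a geometric annualised return, and time under water measured as the longest run of consecutive periods below a prior peak. The function name and sample data are illustrative.

```python
def drawdown_stats(returns, periods_per_year=252):
    """Return (max drawdown, Calmar ratio, longest run of periods under water)."""
    equity = peak = 1.0
    max_dd = 0.0
    under_water = longest_under_water = 0
    for r in returns:
        equity *= 1.0 + r
        if equity >= peak:
            peak = equity            # new high-water mark resets the clock
            under_water = 0
        else:
            under_water += 1
            longest_under_water = max(longest_under_water, under_water)
            max_dd = max(max_dd, 1.0 - equity / peak)
    years = len(returns) / periods_per_year
    annualised = equity ** (1.0 / years) - 1.0
    calmar = annualised / max_dd if max_dd > 0 else float("inf")
    return max_dd, calmar, longest_under_water

# Hypothetical series of six periods, treated as one year for annualisation.
rets = [0.10, -0.10, -0.10, 0.05, 0.20, 0.10]
max_dd, calmar, tuw = drawdown_stats(rets, periods_per_year=6)
print(round(max_dd, 4), round(calmar, 3), tuw)
```

For this series the equity falls 19% from its first peak and spends three periods below it before making a new high.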

Statistical significance and the multiple-testing trap

This is where most promising signals quietly fail. A backtest Sharpe ratio is an estimate, and like any estimate it has error. Worse, if you tried many candidate signals and reported the best one, that best Sharpe is biased upward — you have partly measured luck, not skill.

t-statistic of returns: a basic check on whether the mean return is distinguishable from zero given the sample. For a given per-period Sharpe ratio it grows with the square root of the number of observations, so a longer history raises confidence.
Deflated Sharpe ratio: introduced by Bailey and López de Prado, it deflates the observed Sharpe by comparing it with the best Sharpe you would expect from that many trials by luck alone, and it also accounts for sample length and the non-normal shape of returns (skewness and fat tails). It directly attacks the "I tested 500 variants and kept the winner" problem.
Out-of-sample evaluation: the most honest test of all. Reserve data the signal was never tuned on, and judge it there. A metric computed only in-sample is a description of the past, not a forecast.
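The first two checks can be sketched as below. The deflated Sharpe follows our reading of the Bailey and López de Prado formulation, in which the result is a probability that the true Sharpe exceeds the benchmark set by the expected best of the trials. Sharpe ratios here are per-period (not annualised), `kurt` is non-excess kurtosis (3 for a normal distribution), `sr_variance` is the variance of Sharpe estimates across the trials, and `n_trials` must be at least 2. Treat it as an illustration to check against the original paper, not a reference implementation.

```python
from math import e, sqrt
from statistics import NormalDist, mean, stdev

EULER_GAMMA = 0.5772156649015329
STD_NORMAL = NormalDist()

def t_stat(returns):
    """t-statistic of the mean return against zero."""
    return mean(returns) / (stdev(returns) / sqrt(len(returns)))

def expected_max_sharpe(n_trials, sr_variance):
    """Approximate best Sharpe expected among n_trials strategies with no skill."""
    a = STD_NORMAL.inv_cdf(1 - 1 / n_trials)
    b = STD_NORMAL.inv_cdf(1 - 1 / (n_trials * e))
    return sqrt(sr_variance) * ((1 - EULER_GAMMA) * a + EULER_GAMMA * b)

def deflated_sharpe(sr, n_obs, n_trials, sr_variance, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe beats the luck-only benchmark."""
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return STD_NORMAL.cdf((sr - sr0) * sqrt(n_obs - 1) / denom)

# Hypothetical backtest: daily Sharpe 0.10 over 1250 days, Sharpe variance
# across trials of 0.0008. The same result looks weaker the more was tried.
for trials in (10, 100, 1000):
    print(trials, round(deflated_sharpe(0.10, 1250, trials, 0.0008), 3))
```

The loop makes the selection effect concrete: the observed Sharpe is identical in all three cases, and only the honest count of trials changes the verdict.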

Metric	What it answers	Main blind spot
Information coefficient	Does the signal predict returns?	Says nothing about cost or sizing
Sharpe / Sortino	Return per unit of risk	Inflated by multiple testing; assumes well-behaved returns
Information ratio	Skill versus a benchmark	Depends on a fair benchmark and true breadth
Max drawdown / Calmar	Survivability of the path	Backward-looking; worst case may be yet to come
Deflated Sharpe	Is the result real or selection bias?	Requires honest count of trials

Conclusion

Evaluating a signal well means triangulating: confirm it predicts (IC and its decay), confirm the return is well-paid for the risk (Sharpe, Sortino, IR), confirm it survives costs and turnover, confirm the path is liveable (drawdown), and confirm the result is not an artefact of trying many things (out-of-sample testing and a deflated Sharpe). A number is only evidence once it has survived all five.

Frequently asked questions

What is the information coefficient (IC)?

The IC is the correlation between a signal's values today and the realised forward returns over its holding horizon, usually computed as a rank correlation so it is robust to outliers. Useful ICs are small by nature; what matters is that the IC is positive and stable across time and across the assets you trade.

What does the Fundamental Law of Active Management say?

Richard Grinold's law states that the information ratio is approximately the information coefficient times the square root of breadth, where breadth is the number of independent bets per period. The practical lesson is that a small per-bet edge can still produce a strong information ratio if applied across many genuinely independent opportunities.

Why can't I trust a high backtest Sharpe ratio?

A backtest Sharpe is an estimate with error, and if you tried many candidate signals and kept the best, that Sharpe is biased upward — you have partly measured luck. The deflated Sharpe ratio of Bailey and López de Prado adjusts the figure downward for the number of trials and the shape of returns, directly addressing this selection problem.

What is the difference between the Sharpe ratio and the information ratio?

The Sharpe ratio divides excess return by the volatility of returns. The information ratio divides active return — return above a benchmark — by tracking error, the volatility of that active return. For a strategy measured against a benchmark, the information ratio is often the more relevant measure.

Which single metric should I use to evaluate a signal?

None — a single number is never a verdict. Confirm the signal predicts (IC and its decay), that the return is well-paid for its risk (Sharpe, Sortino, information ratio), that it survives costs and turnover, that its drawdown path is liveable, and that the result is not an artefact of multiple testing. A metric is only evidence once it has survived all of these.

Editorial Team

Micro Alphas publishes reference explainers on quantitative signal research — signal attribution, alpha decay, market microstructure, and the methods quant teams use to find and protect their edge. Figures are sourced; we correct errors.
