Risk-Adjusted Returns: A Practical Guide
Reference guide.
1. Why risk-adjusted returns matter
A 200% return on a 70% drawdown is closer to gambling than to trading. A 100% return on a 15% drawdown is closer to skill. The two records produce very different long-run outcomes when capital is allocated in real life: the high-drawdown trader runs out of capital, hits redemption walls, or psychologically blows up before the strategy compounds. The lower-drawdown trader keeps trading.
Risk-adjusted return ratios attempt to compress this difference into a single number. None of them is perfect, but each captures part of what a careful evaluation of a trader looks at. The Trading World Champion methodology uses risk-adjusted returns as one of four equally-weighted criteria for the same reason allocators do: they come closer than any other single measure to capturing trading skill rather than trading luck.
2. Maximum drawdown
The largest peak-to-trough decline in account value during a trading period. Reported as a percentage. A 14% maximum drawdown means that the worst point in the period was 14% below the prior peak.
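The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming the account history is available as a list of positive account values in time order; the function name `max_drawdown` and the sample equity curve are hypothetical and not taken from any particular library or record.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction (0.14 means 14%).

    Assumes `equity` is a non-empty sequence of positive account values
    in chronological order (an assumption made for this sketch).
    """
    if not equity:
        raise ValueError("equity must contain at least one value")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                     # highest value seen so far
        worst = max(worst, (peak - value) / peak)   # decline from that peak
    return worst


# Hypothetical equity curve: peak of 120, trough of 103.2, then new highs.
equity_curve = [100, 120, 103.2, 130, 125]
print(f"{max_drawdown(equity_curve):.1%}")  # 14.0%
```

The running peak is what makes this a peak-to-trough measure: each value is compared with the highest value seen before it, not with the starting balance.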
Max drawdown is the simplest risk metric and the one most allocators look at first. Most multi-year hedge funds run drawdowns in the 10-30% range; competition traders frequently run drawdowns of 40%+ because of higher leverage. A 70%+ drawdown almost always indicates the trader either lacked risk discipline or had the wrong sizing for their strategy.
Max drawdown alone is not enough. Two traders with identical 30% drawdowns can have very different records: one fell and recovered within six months; the other took five years to regain the prior peak. The first record is the stronger one.
3. Calmar ratio
Definition: annualised return divided by maximum drawdown over the same period.
Example: a 178% return with a 14% maximum drawdown produces a Calmar of 178 / 14 = 12.71.
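The arithmetic in this example can be reproduced with a short sketch. The helper names `annualise` and `calmar_ratio` are illustrative, and the compounding formula used to annualise a multi-year return is a common convention assumed here rather than something the guide specifies.

```python
def annualise(total_return, years):
    """Convert a cumulative return over `years` into a compound annual rate.

    Assumption for this sketch: geometric compounding.
    """
    return (1.0 + total_return) ** (1.0 / years) - 1.0


def calmar_ratio(annualised_return, max_drawdown):
    """Annualised return divided by maximum drawdown (both as fractions)."""
    if max_drawdown <= 0:
        raise ValueError("max_drawdown must be positive")
    return annualised_return / max_drawdown


# The single-year example from the text: 178% return, 14% maximum drawdown.
print(round(calmar_ratio(1.78, 0.14), 2))  # 12.71
```

For a record longer or shorter than one year, the return would be passed through `annualise` first, so that the numerator is always an annual rate.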
When to use it: short-period competition trading or single-year audited records, where a single peak-to-trough drawdown captures most of the relevant risk. For WCTC-style competition results in particular, Calmar is the most informative single number.
Reference points:
- Calmar < 1: more drawdown than annual return. Not skilled.
- Calmar 1-3: typical for solid hedge funds in normal years.
- Calmar 3-5: very strong year. Top decile of audited records.
- Calmar 5-10: exceptional. Top 1%.
- Calmar > 10: rare. Either exceptional or a measurement artefact (very short period, single big trade).
Champion examples:
- Darren O'Neill, 2023: 178% / 14% = Calmar 12.71. Among the highest single-year Calmars in the archive.
- Jim Simons / Medallion, 2018: ~40% / single-digit drawdown estimated, Calmar 6+.
- Ken Griffin / Citadel, 2022: 38.1% / drawdown not publicly disclosed but estimated low.
Calmar's main limitation
Calmar uses a single peak-to-trough drawdown, not the average drawdown or the drawdown duration. A trader who experiences one severe drawdown and recovers quickly has the same Calmar as a trader who experiences the same drawdown and stays underwater for years. The two are not equivalent.
4. Sharpe ratio
Definition: excess return per unit of total volatility, calculated as (return - risk-free rate) / standard deviation of returns.
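As a rough sketch of this formula, the function below computes an annualised Sharpe ratio from a list of periodic returns. Several things here are assumptions not stated in the guide: the name `sharpe_ratio`, the use of the sample standard deviation, monthly data by default, and annualisation by the square root of the number of periods per year (which presumes returns are roughly independent from period to period). The sample returns are made up.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualised Sharpe ratio from periodic returns (fractions, not percent).

    `risk_free_per_period` must be at the same frequency as `returns`.
    """
    excess = [r - risk_free_per_period for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    mean_excess = statistics.mean(excess)
    # Square-root-of-time scaling: an assumption that breaks down when
    # returns are autocorrelated.
    return mean_excess / volatility * math.sqrt(periods_per_year)


# Hypothetical monthly returns and a hypothetical 0.3% monthly risk-free rate.
monthly = [0.021, -0.008, 0.034, 0.012, -0.015, 0.027]
print(round(sharpe_ratio(monthly, risk_free_per_period=0.003), 2))
```

With only six observations, as in this toy input, the result is dominated by sampling noise, so the sketch shows the mechanics rather than a meaningful number.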
When to use it: multi-year fund-level performance comparisons. Sharpe is the standard institutional allocator metric and is reported in virtually every audited fund letter and database (BarclayHedge, HFR, eVestment).
Reference points:
- Sharpe < 0.5: weak. Not enough return for the volatility taken.
- Sharpe 0.5-1.0: typical for index-tracking strategies and many hedge funds in mediocre years.
- Sharpe 1.0-1.5: solid. Most multi-year hedge fund records sit here.
- Sharpe 1.5-2.0: very strong. Top decile sustained over 5+ years.
- Sharpe > 2.0: exceptional. Sustained over 10+ years it is essentially Renaissance Medallion territory.
- Sharpe > 3.0 over short periods: usually a measurement artefact (small sample, autocorrelation, a mismeasured risk-free rate, illiquid assets being marked to model).
Champion examples:
- Darren O'Neill, 2023: 2.57 Sharpe. An exceptional single-year number.
- Jim Simons / Medallion: estimated 2.0+ sustained over 30 years, the strongest sustained Sharpe at scale in trading history.
- Stanley Druckenmiller / Duquesne: estimated 2.0+ sustained over 30 years.
Sharpe's main limitation
Sharpe penalises upside volatility as much as downside volatility. A strategy that has occasional huge winning months (option-buying, deep value, distressed credit) is penalised by Sharpe even though the upside variance is exactly what the strategy is designed to deliver. The Sortino ratio addresses this.
5. Sortino ratio
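Definition: excess return per unit of downside volatility only. Calculated like Sharpe, but the denominator is the downside deviation: the volatility of returns that fall below a target (usually zero or the risk-free rate), so upside moves do not add to measured risk.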
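A minimal sketch of the calculation follows. It assumes one common convention for downside deviation: the root-mean-square of shortfalls below the target, averaged over all periods, with periods at or above the target counted as zero. Other conventions exist (for example, taking the standard deviation of the negative returns only), and the guide does not say which one it uses. The name `sortino_ratio`, the monthly default and the sample data are illustrative.

```python
import math


def sortino_ratio(returns, target_per_period=0.0, periods_per_year=12):
    """Annualised Sortino ratio from periodic returns (fractions).

    Downside deviation here is the root-mean-square shortfall below the
    target across all periods (an assumed convention for this sketch).
    """
    excess = [r - target_per_period for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Only shortfalls contribute; gains above the target count as zero.
    shortfalls_squared = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(shortfalls_squared) / len(excess))
    if downside_dev == 0:
        raise ValueError("no returns below target; Sortino is undefined")
    return mean_excess / downside_dev * math.sqrt(periods_per_year)


# Hypothetical asymmetric record: small losses, occasional large gains.
monthly = [-0.01, -0.012, 0.09, -0.008, -0.011, 0.12]
print(round(sortino_ratio(monthly), 2))
```

Because the large winning months never enter the denominator, a record like the hypothetical one above is not penalised for its upside variance.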
When to use it: asymmetric strategies. Distressed-debt trading, options-buying, deep-value equity, and concentrated activist campaigns all generate intentional upside variance that Sharpe penalises unfairly. Sortino corrects for this.
Reference points:
- Sortino is typically higher than Sharpe for the same trader (because total volatility ≥ downside volatility).
- For symmetric strategies (systematic futures, market-neutral equity), Sharpe and Sortino are similar.
- For asymmetric strategies, Sortino is meaningfully higher and is the more appropriate metric.
6. Drawdown duration
Not a ratio, but a critical complement to Calmar. Drawdown duration is the length of time the account value remained below its prior peak. A 30% drawdown that recovered in three months tells a different story from a 30% drawdown that took three years to recover.
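One way to measure this, sketched below, is to count the longest run of consecutive observations in which the account sat below its prior peak. The function name `longest_drawdown_duration` and the sample curve are hypothetical, and the result is expressed in observation periods (days, weeks or months, depending on how the equity curve was sampled).

```python
def longest_drawdown_duration(equity):
    """Longest run of consecutive observations spent below the prior peak.

    Assumes `equity` is a non-empty, chronologically ordered sequence of
    account values. The result is in units of the sampling period.
    """
    if not equity:
        raise ValueError("equity must contain at least one value")
    peak = equity[0]
    current = 0
    longest = 0
    for value in equity[1:]:
        if value >= peak:
            peak = value      # prior peak regained or exceeded
            current = 0
        else:
            current += 1      # still underwater
            longest = max(longest, current)
    return longest


# Hypothetical monthly curve: three months underwater after the 110 peak,
# then one more month below the later 111 peak.
print(longest_drawdown_duration([100, 110, 105, 99, 108, 111, 109]))  # 3
```

A drawdown that is still open at the end of the data is counted up to the last observation, so an ongoing drawdown is reported at its length so far rather than being ignored.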
The Trading World Champion methodology evaluates both depth and duration. A trader whose worst drawdown lasted 18 months is held to closer scrutiny than one whose worst drawdown lasted three weeks, even when the depth is identical.
7. Combining the metrics
Single-metric ranking is misleading. Allocators and serious editorial evaluators combine metrics:
| Question | Best metric |
|---|---|
| How risky was the path? | Maximum drawdown + drawdown duration |
| How efficient was return generation? | Sharpe (symmetric strategies) or Sortino (asymmetric) |
| How impressive is a single competition year? | Calmar |
| Is multi-year skill durable? | Sharpe sustained over 5+ years across regimes |
8. Common errors
- Quoting Calmar without specifying the time window. A 12.71 Calmar over one year is impressive; a 12.71 Calmar over one month is meaningless.
- Quoting Sharpe over too short a window. Six-month Sharpe ratios above 3.0 are easy to produce by chance. Sustained 5+ year Sharpe above 2.0 is a real signal.
- Confusing gross and net of fees. A 30% return at a Sharpe of 2 is far more impressive if it is quoted net of 2/20 fees than if it is quoted gross.
- Comparing across asset classes blindly. A Calmar of 5 in a futures competition is not the same as a Calmar of 5 in a long-equity hedge fund.
- Ignoring drawdown duration. A 50% drawdown that recovered in a year tells a different story from a 50% drawdown still ongoing.
- Treating a single metric as the answer. Allocators look at the full distribution of risk-adjusted measures, not the one number that flatters the manager.
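To make the gross-versus-net point in the list above concrete, the sketch below applies a deliberately simplified 2/20 fee structure to a single year's gross return. The simplifications are assumptions made for illustration: the management fee is charged on starting capital, the performance fee is charged on gains after the management fee, and there is no hurdle rate or high-water mark. Real fee terms vary from fund to fund, and the name `net_return` is hypothetical.

```python
def net_return(gross_return, management_fee=0.02, performance_fee=0.20):
    """Approximate one-year net return under a simplified 2/20 structure.

    Assumptions for this sketch: management fee on starting capital,
    performance fee on positive gains after the management fee,
    no hurdle rate and no high-water mark.
    """
    after_management = gross_return - management_fee
    incentive = performance_fee * max(after_management, 0.0)
    return after_management - incentive


# A 30% gross year under the simplified terms above.
print(f"{net_return(0.30):.1%}")  # 22.4%
```

Under these assumptions, a headline 30% gross return leaves investors with roughly 22.4%, which is why a quoted figure should always state which basis it uses.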