Historical performance lies! Beware of survivor bias.

You often come across studies of how well a specific fund or trading strategy has performed. Be aware that basically all studies based on historical data lie, or at least do not tell the full story. Two effects are highly underestimated: survivorship bias and fat tails. Both make poor strategies look good. In this article, I show you why.

Survivorship bias and estimation of fund performance

If you run a fund, you are very happy about good performance and unhappy about poor returns. So are your investors. What happens if your fund performs poorly for several years in a row? You lose your investors and finally close the fund. This fact is very important: once a fund is closed, its data usually disappears from the market data providers (Reuters, Bloomberg, Yahoo, etc.).

Take a look at the following example:

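| Fund | Return year 1 | Return year 2 | Return year 3 | Return year 4 | Total return |
|------|---------------|---------------|---------------|---------------|--------------|
| A    | +10%          | +10%          | +10%          | +10%          | +46%         |
| B    | -5%           | -5%           | -5%           | (closed)      | -14%         |
| C    | +10%          | +15%          | +5%           | +3%           | +37%         |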

Funds A, B and C represent all publicly available funds in the hypothetical bigdata-biology-cloud-computing subsector (BBCC sector). In this example, fund B performed so poorly that it was closed in year 4. Analysts who now perform a study on the BBCC sector will find historical data for funds A and C only. Consequently, they will report an average performance of about 9% p.a. in the BBCC sector. However, a real investor who had distributed her investments equally across all available funds (A, B and C) would have gained only about 5% p.a. This is roughly half of the performance seen by the analysts. This kind of error happens easily and thus often.
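The 9% and 5% figures can be re-derived from the table. The following is a minimal sketch, assuming an equal-weighted buy-and-hold investment and assuming that money left in fund B simply sits in cash (0%) during year 4; the article does not state either assumption, and the function names are made up for illustration.

```python
# Annual returns per fund from the table above; None marks the year fund B was closed.
returns = {
    "A": [0.10, 0.10, 0.10, 0.10],
    "B": [-0.05, -0.05, -0.05, None],
    "C": [0.10, 0.15, 0.05, 0.03],
}

def total_return(yearly):
    """Compound the yearly returns; a closed year counts as cash at 0% (an assumption)."""
    growth = 1.0
    for r in yearly:
        growth *= 1.0 + (r if r is not None else 0.0)
    return growth - 1.0

def annualized(total, years=4):
    """Convert a total return over `years` into a return per annum."""
    return (1.0 + total) ** (1.0 / years) - 1.0

def average_annual_return(funds):
    """Annualized return of an equal-weighted buy-and-hold portfolio of `funds`."""
    totals = [total_return(returns[name]) for name in funds]
    return annualized(sum(totals) / len(totals))

survivors_only = average_annual_return(["A", "C"])   # what the analysts see
all_funds = average_annual_return(["A", "B", "C"])   # what the investor got
print(f"survivors only: {survivors_only:.1%} p.a.")
print(f"all funds:      {all_funds:.1%} p.a.")
```

Under these assumptions the survivors-only figure comes out at roughly 9% p.a. and the all-funds figure at roughly 5% p.a., in line with the numbers quoted above.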

Survivorship bias and historical backtesting

A similar effect often occurs when people try to find a superior trading strategy in historical market data. You often see the following algorithm for fitting the parameters of a trading strategy:

  1. Data = Historical Market Data
  2. Choose values x_1 to x_n for the parameters of trading strategy S
  3. If performance( S(x_1, ..., x_n), Data ) is the maximum over all possible values of (x_1, ..., x_n)
       stop and return x_1 to x_n
     else
       goto 2.
     end

Clearly, this strategy S(x_1, ..., x_n) will perform great on the available historical data, because its parameters were fitted to exactly that data. But this says nothing about the future performance of the strategy. And usually, the strategy will not perform well in the future.
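The effect can be reproduced on data that contains no signal at all. The sketch below is only an illustration: the "market" is simulated zero-mean noise, and the toy rule with its two parameters (a lookback window x_1 and a direction x_2) is invented for this example, not taken from any real strategy.

```python
import random
from itertools import product

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical "market": zero-mean i.i.d. daily returns, so there is nothing to find.
data = [rng.gauss(0.0, 0.01) for _ in range(1000)]

def performance(params, returns):
    """Summed return of a toy rule S(x_1, x_2).

    x_1 is a lookback window; x_2 = +1 follows the sign of the trailing
    return sum, x_2 = -1 bets against it.
    """
    lookback, direction = params
    total = 0.0
    for t in range(lookback, len(returns)):
        trailing = sum(returns[t - lookback:t])
        position = direction if trailing > 0 else -direction
        total += position * returns[t]
    return total

# Exhaustive search over all parameter values, as in the algorithm above.
grid = list(product(range(1, 21), (+1, -1)))
best = max(grid, key=lambda params: performance(params, data))
best_score = performance(best, data)
print(f"best parameters {best}, in-sample summed return {best_score:+.1%}")
```

Because the grid contains every rule together with its exact opposite, the winner shows a profit on the data it was fitted to, even though the data is pure noise.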

How can you do better? A better approach is to divide the data into a training set and a test set. The resulting algorithm would be:

  1. Data_training = one half of Historical Market Data
  2. Data_test = other half of Historical Market Data
  3. Choose values x_1 to x_n for the parameters of trading strategy S
  4. If performance( S(x_1, ..., x_n), Data_training ) is the maximum over all possible values of (x_1, ..., x_n)
       If performance( S(x_1, ..., x_n), Data_test ) is good
         stop and return “Performance of S is good with parameters: ” + x_1 to x_n
       else
         stop and return “Strategy S is a bad strategy”
       end
     else
       goto 3.
     end
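The pseudocode above could be sketched in Python roughly as follows. The function name `fit_and_check`, the `threshold` that defines "good", the toy performance function and the toy return series are all assumptions made for illustration; an exhaustive search over a finite grid stands in for the goto loop.

```python
from itertools import product

def fit_and_check(performance, grid, data, threshold=0.0):
    """Optimize on the first half of `data`, then judge the winner on the second half.

    `performance(params, data)` and `threshold` (what counts as "good") are
    placeholders the caller must supply; neither comes from the article.
    """
    half = len(data) // 2
    training, test = data[:half], data[half:]
    # Steps 3-4: pick the parameters with the best training performance.
    best = max(grid, key=lambda params: performance(params, training))
    # Inner check: evaluate those parameters once on the unseen test half.
    test_score = performance(best, test)
    return best, test_score, test_score > threshold

# Toy usage with made-up numbers: hold the asset (with position size x_2)
# only on days whose previous return exceeded a cutoff x_1.
def toy_performance(params, returns):
    cutoff, size = params
    return sum(size * r for prev, r in zip(returns, returns[1:]) if prev > cutoff)

toy_returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02,
               -0.005, 0.01, 0.0, -0.015, 0.01, 0.02]
grid = list(product([-0.01, 0.0, 0.01], [0.5, 1.0]))
best, test_score, is_good = fit_and_check(toy_performance, grid, toy_returns)
verdict = "Performance of S is good" if is_good else "Strategy S is a bad strategy"
print(f"{verdict} (parameters {best}, test score {test_score:+.4f})")
```

The test half is touched exactly once, after the parameters are fixed; that single evaluation is what makes it "unseen" data.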

In this case, the trading strategy S is optimized on a training set of the historical data, and its performance on the test set is good as well. This still does not mean that the resulting strategy will perform well in the future. But at least it did perform well on unseen data. Now, this algorithm may return “Strategy S is a bad strategy”, which will be unsatisfactory for most people. Most people will then develop more and more strategies S_1 … S_m until the algorithm returns “Performance of S is good …”. But is this really a good strategy? After enough attempts, some strategy will pass the test set by chance alone. If you want an answer, make sure that you reserve some of the historical data for validation and check your strategy on this unseen data. So, for optimizing a trading strategy you need at least three data sets: a training set, a test set and a validation set. Again, this says nothing about the performance of your trading strategy in the future. But it is a good indication if your strategy passes this tough optimization procedure.
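The "keep developing strategies until one passes" loop can be simulated. The sketch below assumes a simulated market of zero-mean noise, random long/short signals as a stand-in for new strategy ideas, a single fitted parameter (trade with or against the signal), and an arbitrary bar of 30% summed return for "good"; all of these are illustrative choices, not part of the article.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 750
# Hypothetical market with zero-mean i.i.d. daily returns: no strategy has a real edge.
market = [rng.gauss(0.0, 0.01) for _ in range(N)]
TRAIN, TEST, VALID = slice(0, 250), slice(250, 500), slice(500, 750)
GOOD = 0.30  # assumed bar for "good": summed return above 30% on a segment

def pnl(signal, direction, segment):
    """Summed return of trading `direction * signal` over one data segment."""
    return direction * sum(s * r for s, r in zip(signal[segment], market[segment]))

attempts = 0
while True:
    attempts += 1
    # Stand-in for "developing another strategy S_m": a random long/short signal.
    signal = [rng.choice((-1, 1)) for _ in range(N)]
    # Fit the single parameter (trade with or against the signal) on the training set.
    direction = max((+1, -1), key=lambda d: pnl(signal, d, TRAIN))
    if pnl(signal, direction, TEST) > GOOD:
        break  # "Performance of S is good ..."

test_score = pnl(signal, direction, TEST)
validation_score = pnl(signal, direction, VALID)
print(f"strategy {attempts} passed the test set with {test_score:+.1%}")
print(f"same strategy on the validation set: {validation_score:+.1%}")
```

The selected strategy clears the test-set bar only because many candidates were tried. The validation segment played no part in that selection, so its score is a fresh draw from the same noise and gives a more honest picture.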

Survivorship bias and fat tails

Another effect that creates wrong views on the performance of financial assets or trading strategies is fat tails. There are several kinds of trading strategies that seem to exploit arbitrage in the market. For example, volatility arbitrage funds were popular a few years ago. They perform well on historical data and seem to deliver steady income with no risk. But suddenly, something happens:

Figure: Volatility Arbitrage Strategy (Source: surlytrader.com)

Suddenly, the performance of a few years is gone forever. And this might also happen with strategies that are perceived as riskless, like money market funds:

Figure: Money market funds can be risky, too. They were considered safe before the financial crisis; the chart shows that this has no longer been the case since then. (Source: Handelsblatt)
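The pattern behind both charts, steady small gains punctuated by a rare large loss, can be put into numbers. The figures below are assumed toy values, not estimates for any real fund; they only illustrate why a short backtest of such a strategy can look riskless.

```python
import math

# Assumed toy numbers, not taken from any real fund:
monthly_gain = 0.01    # +1% in a normal month
crash_loss = -0.50     # -50% in a crash month
crash_prob = 0.02      # chance that any given month is a crash month
backtest_months = 36   # length of the historical window being studied

# Chance that the backtest window contains no crash at all and so looks riskless.
p_clean_backtest = (1.0 - crash_prob) ** backtest_months

# Normal months needed to earn back a single crash.
months_to_recover = math.ceil(
    math.log(1.0 / (1.0 + crash_loss)) / math.log(1.0 + monthly_gain)
)

# Long-run expected log growth per month, including the rare crash.
expected_log_growth = (
    (1.0 - crash_prob) * math.log(1.0 + monthly_gain)
    + crash_prob * math.log(1.0 + crash_loss)
)

print(f"chance of a crash-free {backtest_months}-month backtest: {p_clean_backtest:.0%}")
print(f"normal months needed to recover one crash: {months_to_recover}")
print(f"expected log growth per month: {expected_log_growth:+.4f}")
```

With these assumptions, nearly half of all three-year backtests show nothing but steady gains, a single crash erases almost six years of normal months, and the expected long-run growth is negative.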

Conclusion

Survivorship bias significantly complicates the performance estimation of funds and trading strategies. And even if you do everything right, your performance might be hit by a market shift, as we saw for money market funds in 2008/2009. Finding a free lunch in the financial market is hard.
