The Official Blog of Retail Backtest

Real Vs. Hypothetical

Before investing with some fund or using some advisory service that one way or another provides active portfolio management, we naturally want to first look at past performance. I'm going to discuss the different kinds of performance histories that can be made available, and argue that the kind that is generally considered to be the gold standard is not necessarily what it seems to be.

The taxonomy is as follows: real-money, might-as-well-be-real-money, hypothetical and problematical. Those are my adjectives and compound adjectives for the different types of past performance records that we may encounter.

The Players

We are accustomed to being able to view the past performance histories of publicly-traded funds or mutual funds, online, and if a fund engages in active management we can take it upon ourselves to judge whether or not its past performance is as good as or better than that of other funds that hold similar securities. And it's not difficult. Higher math isn't required. You can just look at the charts.

Or, if you don't want to pick the funds or other securities yourself, if you want others to manage your money for you, there are investment advisory firms that will manage an account of yours at a brokerage for you. Generally you can get some information out of those people, if you press them, concerning how well they did for clients in the last 5-10 years versus some benchmark. A few firms of that type do offer algorithmic active management.

Then there are people who don't manage your money for you but just offer advice in the form of their published opinions, which you can take or leave. And there are different types of such persons. Consider writers of stock market letters or the like, who publish their own opinions concerning what securities you should hold and when. Nowadays you probably get their reports online or via email. I'm talking about quasi-quantitative analysts, who may talk about too-high P/E ratios, about how small caps are looking frothy compared to blue chips, about how various indicators look, about price chart patterns, etc. And it's somewhat rare that you get from them anything in the way of a firm indication as to how well you would have done in past years had you followed their advice. They often don't offer any really exact advice anyway, and so no one possessing all of the back issues of their letter could tell you for sure how well you would have done. But for those who are explicit, who tell you to buy this now or sell that now, there was the Hulbert Digest, which attempted to produce a score for every newsletter that it followed. But the Digest has closed its doors now.

The aforementioned funds and letter writers are legion. But among those who only offer advice as opinion are some of a rarer breed, the Retail Backtest types, who exactly specify positions to hold based on an algorithm. But they may not offer a history of real results, a record of actual trades with money in an account. And so you are concerned about that. Today's blog entry is for the purpose of suggesting to you that you shouldn't in all cases be so concerned. Please read on.

The Types of Past-Performance

  • Real Money: There is a real brokerage or custodian account, or there are a number of such real accounts, not "paper-money" accounts, which have been managed by a fund manager, an investment advisor or (rarely) even a newsletter writer following his own advice. So this is the "gold standard", as the history of the equity in those accounts matches how your own investments would have performed if you had participated.
  • Might-As-Well-Be-Real-Money: If a service provides exact advice as to what positions to hold and when and has been operating for some time, then it's possible to provide good estimates of how well anyone would have done with the service over the period of record, without reference to any actual results of anyone who followed the service. The estimates would be particularly good if the trading frequency was not too high and if the securities were highly liquid, which would mean smaller bid-ask spreads and less variance of fills from one investor to another.
  • Hypothetical: The service uses an algorithmic method or some other well-canonized approach to exactly determine what securities to hold and when and so it can compute what would have happened to an investor using the service had the service's program been available for use in the past, which it wasn't. Presently these are the unavoidable circumstances of the Retail Backtest project, which is still only in a startup/shakedown phase.
  • Problematical: Sometimes third parties attempt to determine how investors who follow some newsletter-type of service would have done with it, even though the service only approximately specifies when to buy and sell securities and so the third party then has to make some not-so-subtle assumptions about when trades should have been considered to have been initiated and later closed out.

I've left off of this list, for being too absurd to include even as problematical, the reports of characters who don't come close to clearly specifying entries and exits, who cite out of their history only the select recommended trades that did especially well.

I am prepared to lump "might-as-well-be-real-money" together with "real money" as being of essentially equal value. And there is nothing that can be done to repair "problematical" records of past performance so I'll not discuss those further. So that leaves "real money" versus "hypothetical". You may naturally prefer "real money". But you are going to have to dig a bit deeper.

Where's Your Walkthrough?

Consider this. Suppose the marketplace to be entirely random, so that in reality it doesn't help to have any scheme for picking which securities or funds to hold and when. What would happen to the various funds with their different approaches to investing? Why, they would all have somewhat different results, simply because they each only hold their own little piece of the marketplace and due to chance alone some would do better than others. So were that to be the case, then picking the strongest horse would certainly not work.
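To make that thought experiment concrete, here is a minimal Python sketch of it. Everything numeric in it is an assumption chosen for illustration (50 funds, 10 years, a 6% mean yearly return with 15% volatility), and `simulate_funds` is a made-up name. Every fund draws its returns from the same distribution, so no fund has any skill at all.

```python
import random

def simulate_funds(n_funds=50, n_years=10, mean=0.06, vol=0.15, seed=1):
    """Growth of $1 for each fund when yearly returns are pure noise.

    All funds share the same (assumed) mean and volatility, so any
    difference between them is due to chance alone.
    """
    rng = random.Random(seed)
    growth = []
    for _ in range(n_funds):
        value = 1.0
        for _ in range(n_years):
            value *= 1.0 + rng.gauss(mean, vol)
        growth.append(value)
    return growth

growth = simulate_funds()
best, worst = max(growth), min(growth)
print(f"best fund grew $1 to ${best:.2f}, worst fund to ${worst:.2f}")
```

The gap between the best and the worst fund in such a run comes entirely from luck, which is why a chart of past winners proves nothing by itself.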

Sad to say, although the marketplace is not entirely random, it's mostly random. And that means that you are going to have to be quite careful as to how you pick a fund to invest in based on the real-money performance histories of the funds. You will have to do what the various scholars, professors mostly, have done, which is to go back and see each year or each quarter what would happen in the next year or quarter if money had been invested only in the funds with the best historical performance (only a real-money record deserves the adjective "historical"). You can do a Google Scholar search using a string such as "past performance repeat persist fund". You'll find variations among the studies. One says that there is persistence but it's mainly due to managers who correctly select industry groups. Another cautions that what persistence there is is just due to some funds charging high fees (yielding persistently worse performance). Others refer to persistence only being present in the negative sense of bad-performing funds alone continuing to perform as in the past. One reports that hedge funds had persistent performance only after bear market periods. So in all, the picture is not simple. You have your work cut out for you if you are to make effective use of real-money price histories of any kind.
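The go-back-and-see procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reproduction of any particular study: the function names, the five-period lookback and the top-10% cutoff are all hypothetical choices, and the sample data is random noise with no skill in it.

```python
import random
from statistics import mean

def trailing_growth(series, end, lookback):
    """Growth of $1 over the `lookback` periods ending just before `end`."""
    g = 1.0
    for r in series[end - lookback:end]:
        g *= 1.0 + r
    return g

def persistence_test(returns, lookback=5, top_frac=0.10):
    """Average next-period return of past winners minus that of all funds.

    returns[f][t] is fund f's return in period t. At each period t the
    funds are ranked using only periods before t, so there is no look-ahead.
    """
    n_funds = len(returns)
    n_pick = max(1, round(n_funds * top_frac))
    excess = []
    for t in range(lookback, len(returns[0])):
        ranked = sorted(range(n_funds),
                        key=lambda f: trailing_growth(returns[f], t, lookback),
                        reverse=True)
        winners = ranked[:n_pick]
        picked = mean(returns[f][t] for f in winners)
        everyone = mean(returns[f][t] for f in range(n_funds))
        excess.append(picked - everyone)
    return mean(excess)

# Assumed sample data: 100 funds, 30 years, no skill anywhere.
rng = random.Random(42)
returns = [[rng.gauss(0.06, 0.15) for _ in range(30)] for _ in range(100)]
print(f"average excess return of past winners: {persistence_test(returns):+.4f}")
```

On skill-free data like this the figure should sit near zero and change sign from one seed to another. With real fund histories, a value well away from zero is what the persistence studies are looking for.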

And so, is there any service that we know of that is quite careful at picking funds based on their real-money performance? Hah! Yes. That's what Retail Backtest does! That's right. The Retail Backtest programs choose ETFs based on their price histories, which are real-money histories: anyone buying and holding any one of them would have results matching the published market price history of the fund. Where the aforementioned scholars "go back and see each year or each quarter what would happen in the next year or quarter if money had been invested only in the funds with the best historical performance", that's the "walkthrough" method of Retail Backtest except that the scholars generally don't get it quite right. They don't let the algorithm choose some of its own parameters (so no adaptability there) or determine others based on a followup step like Retail Backtest's "suboptimization". Some of the persistence that they claim may therefore need a "haircut" that they never applied.
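For readers who want to see the general shape of a walk-forward test in which the algorithm picks one of its own parameters, here is a rough Python sketch. It is emphatically not the Retail Backtest procedure, which is disclosed in the articles on the website. It is a generic illustration, and the function names, the candidate lookbacks and the random sample data are all assumptions. At each period the code checks which lookback would have worked best using only earlier data, then uses that lookback to choose the fund for the period.

```python
import random

def trailing_growth(series, end, lookback):
    """Growth of $1 over the `lookback` periods ending just before `end`."""
    g = 1.0
    for r in series[end - lookback:end]:
        g *= 1.0 + r
    return g

def top_fund(returns, t, lookback):
    """Index of the fund with the best trailing growth before period t."""
    return max(range(len(returns)),
               key=lambda f: trailing_growth(returns[f], t, lookback))

def rule_growth(returns, lookback, start, end):
    """Growth from holding the top fund, with a fixed lookback, over start..end-1."""
    g = 1.0
    for t in range(start, end):
        g *= 1.0 + returns[top_fund(returns, t, lookback)][t]
    return g

def walk_forward(returns, lookbacks, warmup):
    """Step through time, re-choosing the lookback from past data only."""
    first = max(lookbacks)          # earliest period every lookback can rank
    if warmup < first:
        raise ValueError("warmup must be at least the longest lookback")
    g, chosen = 1.0, []
    for t in range(warmup, len(returns[0])):
        best = max(lookbacks,
                   key=lambda lb: rule_growth(returns, lb, first, t))
        chosen.append(best)
        g *= 1.0 + returns[top_fund(returns, t, best)][t]
    return g, chosen

# Assumed sample data: 20 funds, 60 monthly returns of pure noise.
rng = random.Random(11)
returns = [[rng.gauss(0.005, 0.04) for _ in range(60)] for _ in range(20)]
total, chosen = walk_forward(returns, lookbacks=(3, 6, 12), warmup=24)
print(f"growth of $1: {total:.3f}; lookbacks used: {sorted(set(chosen))}")
```

The point of the design is that the parameter choice at each step is itself part of the test, so the reported result already carries the cost of having had to choose.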

But like the scholars who research persistence in fund performance, Retail Backtest does publish methods and results. The two main articles on the Retail Backtest website are reached via the Articles menu and are called "Does Anything Work?" and "Does Momentum Work?". The articles completely disclose the "walkthrough" plus "suboptimization" procedures of Retail Backtest, which are, I repeat, based solely on the real-money performance histories of funds.

So Retail Backtest's backtested program performance is labeled "hypothetical", by me, to match expectations in that regard. However, what Retail Backtest does is the same thing that you would be doing if you were to pick funds to invest in based on their past "real-money" performance. Yes, you would simply be competing with me, performing the same function. Granted, I do it dynamically whereas you may be inclined to a more static approach. But the dynamism yields adaptability (which is good).

So the question is, did you take the time to understand hypothesis testing? Your idea is that you will come out on top by selecting the best-performing funds, which you will determine using some criteria such as taking the last five years to be the performance period, demanding that the fund be in the top 10% and not the top 5%, etc. The figures of five years and 10% are parameters that you had to specify that you may have arbitrarily chosen without admitting to that (not good, you're in trouble already... don't you know that you'll get wildly different results depending upon your choices for those parameters?). That idea of yours is called a "hypothesis", as in "hypothetical". Welcome to the club!
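If you doubt that those parameter choices matter, a small experiment shows it. The Python sketch below is an illustration only: the data is random noise, and the function names and the grids of lookbacks and cutoffs are assumptions. It selects funds from the same history with a dozen different settings and reports how each selection fared afterward.

```python
import random
from statistics import mean

def growth(returns):
    """Growth of $1 over a sequence of period returns."""
    g = 1.0
    for r in returns:
        g *= 1.0 + r
    return g

def pick_and_hold(history, future, lookback, top_frac):
    """Pick the top `top_frac` of funds by growth over the last `lookback`
    periods of `history`; return their average growth over `future`."""
    n_pick = max(1, round(len(history) * top_frac))
    ranked = sorted(range(len(history)),
                    key=lambda f: growth(history[f][-lookback:]),
                    reverse=True)
    return mean(growth(future[f]) for f in ranked[:n_pick])

# Assumed sample data: 200 funds, 10 years of history, 5 years held afterward.
rng = random.Random(7)
n_funds = 200
history = [[rng.gauss(0.06, 0.15) for _ in range(10)] for _ in range(n_funds)]
future = [[rng.gauss(0.06, 0.15) for _ in range(5)] for _ in range(n_funds)]

results = {}
for lookback in (1, 3, 5, 10):
    for top_frac in (0.05, 0.10, 0.25):
        results[(lookback, top_frac)] = pick_and_hold(
            history, future, lookback, top_frac)

for (lookback, top_frac), g in sorted(results.items()):
    print(f"lookback {lookback:2d} yr, top {top_frac:4.0%}: $1 grew to ${g:.2f}")
```

Reporting only the best row of such a table, after the fact, is the arbitrary choice made without admitting to it.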

The standard that you must apply, when considering what to make, if anything, of my "hypothetical" results, is "Do I know how to do what Mike is doing better than he is doing it?"

© 2017 Michael C. O'Connor ∅ All Rights Reserved