“Of Course Everyone Should Learn from the Past”, He Had Said

False Tests of Discovery Over Chance

Here you are simply going to be reading about some of the basic principles of “hypothesis testing”. That's what statisticians call it. If that sounds too boring, you might consider that it is someone's life savings that are at stake, perchance yours. No kidding. This isn't an idle exercise. There will be a certain amount of specialization to suit our particular purposes, but the principles are applicable to diverse fields of inquiry. Such as what? Beyond finance, markets and economics... how about medicine? (And soon, for your entertainment and enlightenment, we'll specialize to painting houses!)

And with medicine, the development of pharmaceuticals for example, the testing that goes on isn't called “backtesting”, but it is closely analogous to what we must pursue with financial markets. The hypothesis is that drug XYZ will alleviate a certain condition, and the company hopes to go forward with selling the drug. So prior to the company's managers making their decision to proceed, backwards in time from that, volunteers were given the drug and the results noted and analyzed.

In finance we don't put volunteers at risk when we're backtesting. But we put ourselves and our customers at risk if we yearn so hard for a wonder drug, so to speak, a magic potion that will put us on Easy Street, that we do our backtesting in a stupid way.

Unlike the pharmaceutical companies, with computerized portfolio management we're pushing an “algorithm”— that's really just a giant formula that computes the position sizes that you should have for each security going forward, from one trading day to another (or from one month to another if you prefer). Our hypothesis is that if we use the algorithm then the condition of our being poorer than we'd otherwise be will be alleviated, and so we should go forward with it. In place of volunteers we have imaginary versions of ourselves using the algorithm in past years, and we ask the question “If those imaginary versions of us had done back then what we propose to do now, how would they have fared?”

It sounds so straightforward. In a way, it is. Those who have thought about it a great deal can usually get it about right. But they who don't are legion. And you can find them everywhere— all over the Internet, and too often at the office of your financial planner or investment advisor or even at a university. Indeed, the failed analysts are so numerous that one experienced practitioner of some of what I do has told me that the very word “backtest” is now utterly toxic, like garlic to a vampire. That would be thanks to inept practitioners who either didn't know what they were doing or didn't admit to the residual uncertainties that cannot be removed regardless of the analyst's skill and understanding.

So then, does the fact that some backtesters have blotted their copybooks have you wanting to proceed without seeing what would have happened in the past if you had then proceeded as you are about to now with your plan, whatever it is? That doesn't sound like what they call “due diligence”. Not to me.

The rub is that there's always the possibility that those imaginary versions of us, doing back then what we propose to do now, were just lucky— that whatever success they had with our algorithm would have been due entirely to chance (supposing that somehow, way back then, they had gotten the idea of using the algorithm— they didn't— an algorithm that we have only recently invented, very possibly influenced in some way by what has happened since). This gets interesting. We'll deal with it.


An Example

This won't be a practical example; it is almost ridiculous; it's really a parable. And so it is entirely made up out of whole cloth. It's about painting houses. Real painters know what to do and don't need this kind of help. But we don't want any previously-developed convictions about financial markets acting as intrusive thoughts just as we begin to go about figuring out the tricky parts of backtesting. Hence painters, not brokers.

So suppose that you're a big-time painting contractor, and you have a lot of new work coming up, starting about now. One of your standard paints is in reality a two-part system, the basic paint plus an additive that is needed to make the paint cure or dry to a nicely hardened finish with good coverage. So you want your painters to put in the right amount of additive, and you've been keeping records of the different amounts of additives that your painters have used and their results.

Let's say that the painters have been using between one and five measures of the additive per gallon, as they see fit (the measure could be some little thing like one of the smaller kitchen measuring cups). And then your inspector has some way of measuring the hardness of the paint after it's had time to cure and dry, and also some scheme for grading the quality of the coverage, and so is able to come up with a combined hardness-coverage score as a measure of the quality of the result. So let's say that the quality scores are “bad”, “acceptable” or “good”.

Suppose also that you the contractor have been collecting these data, the amounts of additive used and the quality scores, for quite some time— for years. And so you're finally getting around to deriving from the data the best amount of additive to use going forward, the opinions of your painters notwithstanding. Of course you're going to pick the number of measures per gallon that historically yielded the highest percentage of “good” quality scores and go with that. Right? What else? And so with that approach did you the painting contractor live happily ever after? We'll soon see.




Actual Painters, 10; Contractor, 0

So did the houses all get painted very well, with our contractor not having to pay for redoing any of the paint jobs? Sadly, no! You will recall that the plan was to override the opinions of the painters and select the number of measures of the additive that had historically produced the greatest percentage of “good” quality ratings... and then to thereafter use that number of measures of the additive for every gallon. “Yeah! That's the ticket. Of course I want to use the amount of additive that produced the best finish,” the contractor had cried.

Now the contractor had the idea that the basic purpose of the additive being supplied separately by the paint manufacturer was to give better shelf life to the paint. That is, had the additive come from the factory already in the paint, the paint would start to cure in the can if not used quickly. Indeed, that was actually part of the theory of use of the additive. And when he saw from the historical data that the best-performing number of measures of the additive produced a percentage of “good” ratings— we'll call that percentage P— that was only about 10% greater than the total percentage of “good” ratings in the whole dataset, he was not alarmed. He had always thought that while there would be an optimum amount to use, other variables beyond the control of his painters (e.g., the condition of the siding, the nature of the residue from the prior paint job, the humidity) could affect the results, and could therefore fairly often cause an other-than-historically-optimal amount of additive at any one particular house to perform as well as or even better than the historically-optimal amount. That, plus the fact that in the modest 10% margin he smelled real money, in the form of reduced costs for repainting that would accumulate, caused him to go ahead and bet the farm on the historically-optimal number of additive measures.

But as it turned out, it was actually crucial to get the amount of additive correct, to adjust the amount so as to suit the siding surface and the environmental conditions of the day, especially the temperature. And the painters all knew that, and so all along, through winters, springs, summers and falls, they had been choosing, as best they could, between one and five measures of the additive per gallon, with the greater amounts of additive being needed on the cooler days because the additive was in part a hardener, and hardening otherwise happens more slowly at lower temperatures. And due to that part of their vast knowledge of the trade they scored more good ratings than bad or acceptable ratings. And except for minor variations, such as the mere 10% margin that the contractor found, that was also why the percentages of “good” ratings were more or less evenly distributed among the numbers of measures of the additive in the historical record— because the painters were equally good at using the right amount of additive regardless of conditions, not because the amount of additive didn't matter. Ironically, the 10% margin that the contractor saw and jumped on was a mere fluctuation due to uncontrollable and unknown variables.

Not understanding any of that, the greedy contractor— we have to call him “greedy” or this would not be a proper parable— fixed the number of measures to be used per gallon at the historically optimal number. Given the fluctuations the thus-determined number could have been anything. But it happened that it was a number of measures of additive that was suitable for cool weather and he imposed the use of it just prior to a busy summer month. He became bankrupt and when last seen was living in his mother's basement.

MORAL: If you're going to beat the professionals at their own game, first get your logic straight.

Some Proper Procedures

Were there rules of hypothesis testing, of backtesting, of logic that the contractor should have followed? Hah! You could say that. The first rule is to remember to actually test your hypothesis.


An Ex Ante Testing Regime: the “Walkthrough”

That's right, the contractor tested absolutely nothing. He used the history to set the number of measures of additive going forward, but he never tested what happens when someone proceeds that way. The astonishing thing is that many who promote some system or other of theirs for trading stocks only do the like of what the contractor did, yet brazenly call it backtesting. There are even online sites with associated brokerage accounts that encourage customers to use the site's software to “backtest” on their own (all the better to shirk blame), where the software only finds out what would have happened if particular parameter values had been used in the past. The customer is led to find parameter values that would have worked well in the past, if only back in the past the future good performance of those values could somehow have been known.



If the contractor had painted a few houses and then stopped to check the results, that would have been a bit of an actual test. But to do a very good test or two it would not have been necessary to paint any additional houses at all. Read on.

By test your hypothesis we don't mean use real money, as the contractor effectively did going forward into the summer. No, no. We mean seeing what would have happened in the past had you used, back then, the scheme that you have just now defined— and it will be with imaginary money when we do it for securities. Here is an example of one testing regime that the contractor could have adopted: go back to each prior month-end and, for each, find the number of measures of the additive that produced the greatest percentage of good finishes prior to that date (it isn't best, but for now you could think of going all the way back to the first year of record if you please); then see what happened in the month that followed at those houses for which that putatively optimal number of measures of additive had been used. Do that for every one of those past months and tally up the next-month results for the supposedly-optimal number of measures. You can see that each end-of-month trial would be tantamount to a simulation, literally a “dry run” of what the contractor proposes to go forward with in the next month. That's the critical thing: the scheme thus backtested is the very same one that the contractor proposed to go forward with but never tested at all. With this revised scheme there is apparently no use of a parameter value that was determined after the fact (ex post). Let's call this backtesting method the “walkthrough” method.
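To make the walkthrough concrete, here is a minimal sketch in Python, with entirely made-up data standing in for the contractor's records. The record layout, the synthetic ratings and the function names are all invented for illustration; nothing here is drawn from any real data set.

    import random
    from collections import defaultdict

    random.seed(1)

    # Fabricated history: 60 months, 10 houses per month; each record is
    # (month_index, measures_used, rating). Real records would come from
    # the inspector, not a random-number generator.
    history = [(m, random.randint(1, 5), random.choice(["bad", "acceptable", "good"]))
               for m in range(60) for _ in range(10)]

    def best_measures(records):
        """The number of measures with the highest fraction of 'good' ratings."""
        good, total = defaultdict(int), defaultdict(int)
        for _, measures, rating in records:
            total[measures] += 1
            good[measures] += (rating == "good")
        return max(total, key=lambda k: good[k] / total[k])

    # The walkthrough: at each month-end, pick the historically best number of
    # measures using only data dated before that month, then score the month
    # that followed at the houses that happened to use that number.
    hits = trials = 0
    for month in range(1, 60):
        choice = best_measures([r for r in history if r[0] < month])
        for m, measures, rating in history:
            if m == month and measures == choice:
                trials += 1
                hits += (rating == "good")

    print(f"walkthrough 'good' fraction: {hits}/{trials} = {hits / trials:.2%}")

Each pass of the outer loop is one of the end-of-month “dry runs”: the optimization never sees the month on which it is about to be scored.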

He would find that things didn't go well in the month following most of those prior month-ends at those houses where the supposedly optimal amount of additive was used— for the reasons explained in the parable above. Note that we said that the painters were good, but not perfect, and so there would be a sufficient number of painters who used the supposedly-optimal number of measures of additive even when some other number of measures would have been better, and who therefore got bad finishes, that the contractor could have gotten enough data to see the error of his ways and to figure out what was really happening.

With regard to our other schemes, concerning stock market portfolio management, shouldn't we worry that choosing a parameter such as, say, a lookback period that defines a price momentum ratio is perchance a bit like choosing the amount of paint additive? That is, that unseen hands, professionals like the painters (also known as “the smart money”), constantly adjust the prices of the securities in response to things weighing on the marketplace for them (analogous to the temperature variations and other conditions that no one even recorded), so that the future prospects hinge not on a fixed momentum lookback period or the associated price ratios but instead on the extent to which said professionals have done their work well? Yes! We very well need to worry about that.

Certainly with the stock market we will always be able to see how we would have fared in the past with any specified program, and the walkthrough method that we have just defined will be one of our mainstays. We'll apply it to the determination of parameters that are embedded in our scheme just as the contractor should have applied it to the number of measures of additive.


Refuting the Null Hypothesis

In “Chance or Discovery? Part B” we go on to apply the walkthrough method to our version of the momentum and relative strength approach to portfolio management, but now we introduce another, supplementary method of ensuring that we don't adopt unreliable schemes. Ultimately it's one that will help us understand how good the scheme that is derived from the walkthrough really is, before putting it into practice. Here we can get into some fairly difficult statistical science if we go very far with it. Moreover, with any use of it, including the analysis of the financial markets, it's pretty easy to see that the null hypothesis idea we are about to consider has some hair on it. You can worry about whether or not you're doing it right. There are indeed different conceptions and implementations of it in the academic literature on finance.

But there is clearly a basic duty, to yourself if you are the backtester, to roundly refute the null hypothesis. To keep it simple, for the painting contractor we might imagine that the null hypothesis could be something like “the use of any given number of measures of additive has associated with it an equal probability of producing a ‘good’ rating”. For this to make sense it helps to put yourself in the contractor's shoes (or rather, in the shoes of the contractor after he's studied statistics a bit). He doesn't know about the radical influence of temperature, he only has records to work with, and his null hypothesis is about the outcomes in the aggregate and not the results at any one house— hence the use of the word “probability”. And no one ever tested all five numbers of measures of the additive on different parts of the same house at once for him. Note especially that with the null hypothesis we are not directly confronting the question of whether, at any given house, using some other number of measures of the additive would have yielded much different results. We don't have to, and we can't anyway, because we don't have data directly bearing on that question. The null hypothesis is something like the opposite of what the painting contractor obviously believes, and so he had better be able to refute its validity with his historical data before he actually does anything.

So the idea is that for the painting contractor's scheme to have been viable based on the historical data, the data must have been such as to demonstrate in a statistical way that it's quite unlikely that the null hypothesis could also account for the outcome that the contractor found (recall that his outcome was simply that a particular number of measures of additive was associated with more “good” ratings than others— that could hardly be regarded as unexpected... what else?— and that the margin over the percentage of good ratings among all of the paint jobs was a fairly meager 10%). If the null hypothesis can't be ruled out, then obviously you can't go forward with implementation and use of a scheme with respect to which the null hypothesis is utterly antithetical.

And the matter then arises— this is amusing— that we actually need to set some criterion concerning how roundly we should require the null hypothesis to be defeated before adopting and implementing our own hypothesis with respect to which it is antithetical. It's amusing because after all of our hard work at avoiding unwarranted assumptions we will find ourselves called upon to arbitrarily set a value that is actually of rather critical importance, as the value is to determine whether or not we consider the null hypothesis to have been, in effect, eliminated. In order to establish that criterion statisticians conduct a computation that they put forth as yielding the odds that mere chance would yield results meeting or bettering the historical results that would have followed from implementing the alternative hypothesis: “chance” means assuming that it is the null hypothesis that is true, not the painting contractor's plan, and “the alternative hypothesis” is a statistician's cynical way of referring to anyone's pet scheme (here, the contractor's original plan that he so fatefully actually put into effect). Often the arbitrary standard of one in twenty is adopted: You can proceed to implement a hypothesis if the odds that the results that you got could have been matched or exceeded if the antithetical null hypothesis were true are no greater than one in twenty. You're then entitled to refer to your findings as being “statistically significant”.

Given that there would have been variances or fluctuations due to unknown causes, it's clearly a tall order to refute the null hypothesis with the contractor's historical data, in the face of the stated rather uniformly good performance of the painters and the fact that the largest margin of benefit that he found for one number of measures of additive over the average performance was a mere 10%. But there are mathematical ways to get the answer, whatever it is— for the case that we stated of highly-skilled painters, or for the case in which the painters instead often used the wrong amounts of additive.


The Origin of the Null Hypothesis

Sir Ronald Aylmer Fisher (1890–1962) was a very notable English statistician, hence “Sir”, who put forth the use of a null hypothesis in hypothesis testing and figured out such things as how to calculate the odds that a given data set such as our painting contractor's with its five additive-amount categories and three assigned finish-quality categories is what it is due to chance alone.

You can read about the null hypothesis idea here. It's the basic Fisher conception of it, and of its use, that we'll adopt. We've already seen how the null hypothesis could plausibly be hard for the painting contractor to refute. Remember that the null hypothesis for him involves just supposing that any number of measures of additive used has the same probability of success. And we also learned that in reality the painters got approximately the same score distributions for each number of measures of additive, thanks to their skill in using the right amount of additive regardless of conditions. Therefore we actually know, without doing any math, that assuming the probability of a painter getting a “good” outcome to be exactly the same regardless of the number of measures of additive that he was found to have used— that's the null hypothesis à la Fisher— would be essentially consonant with the actual historical record, which showed approximately the same score distributions for each number of measures of additive used. The null hypothesis would therefore not be refuted by the historical data.

The good news is that for our own work with securities we'll be using a computationally simple and powerful method to attempt to refute the null hypothesis that's also easy to understand— because it's a bit like a simulation, just like the walkthrough method.


Implementing the Null Hypothesis

Well, we'll soon be considering our second, more-pertinent example, about stocks and ETFs, in Part B of this article. That will be a welcome change. But when we get there we won't have just a few simple discrete categories to deal with. We'd like a way of refuting the null hypothesis, by computing the odds that the data are what they are by mere chance, that is easily understandable yet flexible and easily applied to a range of circumstances, using much the same computer code each time. Enter Monte Carlo. Yes! The casino... not the French count who wasn't one of the Three Musketeers.

And we might as well also consider the alternative case of a painting contractor who, although not a smart cookie, is somehow smarter than some of his painters, some number of whom used the wrong amounts of the additive on numerous occasions. Even though his scheme was blind to the criticality of the proper number of measures and to the role of the temperature variable, if a number of his painters had been utterly terrible then the optimal amount of additive that he found from the historical data could actually have brought about a significant improvement. And had he used the proper walkthrough approach he likewise might have actually been met with improved results.

Walkthrough Implications for Odds Due to Chance

In such circumstances we still earnestly want to know the odds that the improvement could be due to chance. But it would be nice if, before proceeding at once with a calculation of that, we could first see some reason to believe that the walkthrough testing regime we have already developed tends to set us up to automatically avoid adopting hypotheses whose seemingly good performance could be due to chance.



So, now with the walkthrough method applied to the contractor's data, if the verdict at the end of the month that followed just one end-of-month determination of the optimal number of measures of additive was that the outcomes would have been better had the contractor's putatively optimum amount of additive been used, what are the odds that the better results could have been due to chance? We'd have to say 50-50, simply because here we are considering only two possibilities— better or not better— and the null hypothesis gives each of those two outcomes even odds.

But of course the contractor had many months of data. So if for two months in a row the contractor's optimum amount of additive produced better results, the odds of that being by chance fall to something like one in four (one-in-two times one-in-two, and see the next paragraph for the qualification that would give us exactly one in four). Looking better. You get the picture. The math gets more complicated if the results weren't better for every month, which would of course be the expected circumstance, but the general tendency here is that the walkthrough method, as it involves repeating the test many times, naturally diminishes our concerns about good results having happened merely by chance... provided only that the results are chiefly good.

If the painting contractor had elected to use only a month's worth of historical data prior to each end-of-month, then each of the trials would have been independent. And if all that we were interested in was whether the next month's outcomes were better or not, then the little math problem of computing the odds that the cumulative results were due to chance is the same as the one for determining the odds that flipping a coin F times produces heads at least H times, where H may range substantially away from F/2.
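For what it's worth, that little math problem is just a binomial tail computation. Here is a sketch, assuming the null hypothesis makes each month a fair coin flip (the function name is ours, invented for illustration):

    from math import comb

    def odds_at_least(H, F):
        """Probability of at least H heads in F fair coin flips."""
        return sum(comb(F, k) for k in range(H, F + 1)) / 2 ** F

    print(odds_at_least(2, 2))   # two better-months in a row: 0.25, one in four
    print(odds_at_least(9, 12))  # nine better-months out of twelve: about 0.073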

Thus the walkthrough scheme that we bestowed upon the painting contractor (and will also bestow upon all of our portfolio management programs) has a built-in tendency to indicate improbability of the null hypothesis if it shows substantial accumulated benefits over multiple trials. But we want odds, not just an indication of substantial benefits, because odds would amount to benchmarking our scheme's performance vis-à-vis the null hypothesis in a clear and meaningful way. And it is also insufficient for our purposes to classify the effect of the scheme in a simple binary way, as being beneficial or not beneficial. We're concerned with gradations. So where do we turn for help with this matter of chance? Back to the casino, of course! And while we won't be doing anything quite as simple as flipping coins, the required computer programming is exceedingly easy. And again, the computer doesn't care about how many manipulations it has to do.


The Basic Setup for Computing the Odds Due to Chance

So for the painting contractor's scheme, with or without the walkthrough, instead of just supposing that the null hypothesis is true and then trying to compute the odds that the results would be what they were on that basis, à la Fisher, we'll force the null hypothesis to be true by altering the data set so as to simulate every amount of additive having the same probability of producing a “good” finish— our null hypothesis. To do that we'll simply randomly shuffle the data on the numbers of measures of additive used, altering the correspondence between the amount of additive used and the finish quality in such a way that we have no expectation of the distribution of finish-quality ratings for any one of the numbers of measures of additive being found to differ from that of another. We'll do that by just scrambling the data on the numbers of measures of additive used in the entire multi-year history, like scrambled eggs— with a single scrambling involving many random exchanges of the numbers of measures of additive used on pairs of houses, whether painted on different days or on the same day. In that way each number of measures of additive used has a probability of having a “good” rating associated with it that is the same as for the other additive amounts. And likewise for the other finish-quality ratings.

And we will scramble the data many, many times, N times, recording after each scrambling the results that would have been obtained if the historical record had instead been the scrambled version of itself. For our purposes it would suffice to record, for every scrambling, the highest percentage of good ratings that was obtained using any of the numbers of measures of additive.
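Here is a minimal sketch of the scrambling, again with hypothetical data. Each record is reduced to a (measures, rating) pair, and one “scrambling” is done as a single whole-column shuffle, which produces the same randomized correspondence as a long series of pairwise exchanges:

    import random
    from collections import defaultdict

    random.seed(2)

    # Hypothetical multi-year record: (measures_used, rating) pairs.
    records = [(random.randint(1, 5), random.choice(["bad", "acceptable", "good"]))
               for _ in range(600)]

    def best_good_pct(pairs):
        """Highest fraction of 'good' ratings scored by any number of measures."""
        good, total = defaultdict(int), defaultdict(int)
        for measures, rating in pairs:
            total[measures] += 1
            good[measures] += (rating == "good")
        return max(good[k] / total[k] for k in total)

    N = 5000
    measures_col = [m for m, _ in records]
    ratings_col = [r for _, r in records]
    best_under_null = []
    for _ in range(N):
        random.shuffle(measures_col)  # one scrambling of the additive data
        best_under_null.append(best_good_pct(zip(measures_col, ratings_col)))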


First, Application to the Doofus Contractor's Scheme

Now then, let us first see what we could make of the contractor's own simple intention of finding the number of measures of additive that worked best in the past and immediately committing to using that thus-fixed parameter in the future, without prior testing via the walkthrough (we have already seen how foolhardy that could be). We simply look through our recorded results for all of the scramblings and tally up the number S of scramblings in which any one of the numbers of measures of additive scored a percentage of “good” ratings equal to or greater than the “good”-rating percentage P of the best-performing number of measures of additive in the original, unscrambled historical record (with which the contractor had computed his 10% margin of betterment). We then compare that number S of better-performing scramblings to the total number N of scramblings, and we have our odds that the contractor's original finding could have been due to mere chance. The odds are, we would say, “S in N”, where we fill in the values for the symbols, that the contractor's original tally of a 10% margin of benefit due to a “good”-rating percentage of P was due to chance; or in the argot of gamblers, the odds would be “S to N - S” (“in” and “to” have particular mathematical meanings in this context). For example, if the odds were, say, 2000 in 5000 we would state that as 2 in 5, or 2 to 3.
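Continuing the scrambling sketch above (and reusing its records, best_good_pct, N and best_under_null), the tally of S and the resulting odds amount to a few lines:

    P = best_good_pct(records)                    # best 'good' fraction in the real record
    S = sum(pct >= P for pct in best_under_null)  # scramblings doing as well or better
    print(f"odds due to chance: {S} in {N}, or {S} to {N - S}; p ~ {S / N:.3f}")

Whether that figure clears the one-in-twenty bar discussed earlier is then the question.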


We have already seen that with the painters being, unknown to the contractor, very good at what they do, it would be exceedingly unlikely that the odds that the contractor's 10% margin of benefit was due to chance would be as low as 1 in 20, the usual criterion for significance. But if the painters were rather uniformly bad then that could indeed have happened. However, even if it had happened, the contractor should not have taken much assurance from it, because we know that he chose his best number of measures of additive with hindsight only— never having tested whether hindsight works going forward, whether the historically best number of measures could continue to work well. That is why the walkthrough method is essential.

Now, Odds Due to Chance With the Walkthrough Method

It is our walkthrough method that tends to take care of that problem, as it tests, every month, the extent to which the historically best-performing number of measures of additive continues to produce good results. And so how do we apply the Monte Carlo method to the walkthrough implementation, after it has been run? It's simple. But first bear in mind that should implementation of the hypothesis with the proper, walkthrough approach produce bad results, then we're done.

There is no point in computing the odds concerning our bad results being due to chance because we've demonstrated that a simulation of what we want to do in the future that was as pure as we could manage didn't work in the past. So let's now take the case of the contractor who has a number of bad painters, who wisely did the proper walkthrough and who got outcomes that were better than the historical percentage of “good” results. What do we do?

We just look at the recorded walkthrough results and thereby obtain the value P', the percentage of “good” ratings for the entire walkthrough. P' now takes the place of the P of the contractor's simple scheme, but it is drawn solely from the outcomes at those houses at which the painters happened to have used the putatively optimal number of measures of additive that had been determined, month by month, from prior data in the historical record. We then form the list of those monthly optimal numbers of measures of additive, scramble that list, and apply the scrambled list to the actual historical record of finish ratings— tallying again the “good” ratings that would have occurred in each succeeding month had the scrambled value been used instead of the value that had been found during the first walkthrough, which was done with unscrambled data. Then we compute an S' to take the place of the S in the discussion above of the contractor's simple scheme, by repeating the walkthrough for each of many additional scramblings; S' is the number of scramblings (with repeated walkthroughs) for which the computed average walkthrough percentage of “good” ratings was equal to or greater than the “good”-rating percentage P'. And we compute the odds as before: S' in N, etc.
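Here is a sketch of that procedure, continuing the very first walkthrough sketch (it assumes the history records and the best_measures function defined there; everything else is again invented for illustration):

    import random

    random.seed(3)

    def walkthrough_pct(history, choices):
        """Fraction of 'good' ratings at houses that used each month's chosen measures."""
        hits = trials = 0
        for month, measures, rating in history:
            if month in choices and measures == choices[month]:
                trials += 1
                hits += (rating == "good")
        return hits / trials if trials else 0.0

    # The first walkthrough's monthly optima, found from prior data only.
    choices = {month: best_measures([r for r in history if r[0] < month])
               for month in range(1, 60)}
    P_prime = walkthrough_pct(history, choices)  # the P' of the text

    # Scramble the list of monthly optima and re-run the walkthrough, many times.
    N = 2000
    months, values = list(choices), list(choices.values())
    S_prime = 0
    for _ in range(N):
        random.shuffle(values)
        S_prime += (walkthrough_pct(history, dict(zip(months, values))) >= P_prime)

    print(f"odds that the walkthrough result was chance: {S_prime} in {N}")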


Is This Really an Accepted Way to Proceed?

So what about our plan to refute the null hypothesis by computing the odds that the “good” rating percentage that the contractor scored with his scheme or with the proper walkthrough version could have been reached or exceeded due to chance? That probability is actually called the “p value”. If you doubted that what we're doing has any counterpart in medicine, you should review this nice little tutorial from the National Institutes of Health. Confidence intervals are also discussed in that article— they are an alternative to p values— and we shall on occasion also have confidence intervals to offer. See for example the panel on the right-hand side of the table that is just below that chart at the top of this page.


We Should Go with the Worst?

If like the contractor we couldn't conceive of the walkthrough procedure, if we had taken his approach, we too would never in a million years have considered that it might be appropriate to do in the near future the opposite of what worked best in the past. No! Ahh... but with the walkthrough, and with a dose of humility (particularly if the endeavor is portfolio management), other possibilities come to mind.

Let us first consider a terribly important matter that is glossed over, if not utterly neglected, in the discourse above: the virtual necessity of not using the entire historical record to determine optimal parameter values, such as the number of measures of paint additive, but of using only a trailing history— a fixed interval of time that encompasses a substantially limited part of the historical record and trails the dates of determination of the optimal parameter values as we process the historical record with the walkthrough method. For example, above we raised the possibility of the contractor making use of only the prior month's data to determine the number of measures of additive to use in the succeeding month. That's a trailing history of but one month's duration; the entire historical record is not used at once.

Why is this ultimately necessary? It's because otherwise we would be testing a different scheme every month: what happens when we use 50 prior months to optimize the number of measures of additive, what happens when we use 51 prior months, 52 months, etc. Going from 50 to 52 isn't much of a difference but, with securities in particular, for which we might have, say, 20 years of relevant records, if at the start of our walkthrough we base the optimization on 2 years of data and at the end base it on 20 years of data, it's clear that we're not testing anything like the same scheme. And as a practical matter we could well worry about the early optimizations, with 2 years of data, being based only on short-term tendencies, and the last of the optimizations being insufficiently responsive even to intermediate-term tendencies.
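In the walkthrough sketch given earlier, a trailing history amounts to changing a single line: restrict the optimization to a trailing window rather than to everything prior (the six-month lookback below is just a hypothetical choice):

    lookback = 6  # trailing history, in months (a hypothetical choice)
    choice = best_measures([r for r in history if month - lookback <= r[0] < month])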

What would have happened if the contractor's painters were not good, if the contractor actually had the possibility of using the walkthrough method to improve his operations, and if he had used six trailing months instead of just one? Six months could be a winter and a spring, meaning cold and moderate temperatures and 3 to 5 measures of additive being best, whereas for immediate application to the succeeding hot summer 1 or 2 measures would have been optimal. (Remember, only we and his smart-money painters know about the role of temperature. He is in the dark about that and must resort to interpreting records, in the same way that we must presume that we are in the dark about what really moves the financial markets and must study financial records with computers.)

So the contractor would have had chronically bad results using the 6-month trailing period to get the putatively optimal numbers of measures of additive. In fact he might get such consistently terrible results as to wonder if he shouldn't do the opposite and use the worse-performing numbers of measures of additive. OK, we know that our contractor would never go that far. But with securities I would; I have.

We could sit in a chair and ordain that the stock market should exhibit something simple like momentum if we like, and we might be right, but for every one of us there's another who wants to tell us that the market can “get ahead of itself” and be due for a correction, or that it can become “oversold” and be due to rally— the antithesis of the persistence of momentum. And if those other guys are right then it might pay to do the opposite and assume that any show of strong momentum will be reversed.

However, as the accompanying “Chance or Discovery? Part B” article explains, it was actually found that it does pay to get out of the equities markets when momentum turns down and to get back in once a recovery rally starts, to not go against momentum. See also the Momentum Overview item under the Performance menu for charts showing the success of the momentum approach at avoiding debacles. But, with RB's New Program, which is still under testing, the good results that have been obtained thus far are based on using, for one of the parameters, the worse-performing values of the trailing history. The program does not make orthodox use of a measure of momentum.



— Mike O'Connor

Comments or Questions: write to Mike. Your comment will not be made public unless you give permission. Corrections are appreciated.

Update Frequency: Infrequent, as this article is not about current market conditions or other ongoing affairs.