Here are some thoughts on backtesting, in no particular order.
Backtesting is the single most important skill to develop as a system trader. Most of your ideas will stink, and being able to elegantly prove so in short order is critical to success. You only have so much time, so it is important to avoid spending undue time on bad ideas.
Backtesting is not merely an analytic, scientific process (i.e. develop a hypothesis then test the hypothesis). Aside from outright errors in your test, the number one thing to avoid is backfitting (or over-fitting) the data, which involves optimizing more and more variables until the results look good, but basically have very little predictive value because they have been fit exclusively to some optimal historic path through the data.
Backtesting does not involve looking at charts to find examples that conform to your hypothesis. Humans are biased toward information that proves them correct and are amazingly blind toward contrary data. I find this is particularly relevant on visually-presented data such as charts. Buy a backtesting software package and use it to judge the worth of your ideas. I suggest Amibroker.
You do not want the best performance possible when backtesting. You want the best performance that is based on the fewest variables, with each variable providing as broad a bell-shaped optimization curve as possible. You need to get to know your variables intimately. If a variable under optimization causes the results to thrash up and down, lose it. If a variable leads to poor results except at one magical value, lose it. If it ramps off nicely and then drops off a cliff… well, you might be able to use it, but pick a value a ways off the cliff even if it provides lesser performance.
That said, make sure you are using the right scale to determine the value of a variable to your system. A variable may have a beautiful, broad, bell-shaped curve at the right level of granularity that stays hidden if you optimize too coarsely.
Avoid using optimization steps that are too fine-grained. You might find a beautiful tree and never notice it is in a very ugly forest.
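One way to put the "broad curve, not magical value" idea into practice is to score each candidate value by the worst result in its neighborhood rather than by its own result. The sketch below is only an illustration: the function name, the sweep numbers, and the choice of "worst neighbor" as the robustness measure are all assumptions, not something taken from any backtesting package.

```python
def pick_robust_value(results, window=1):
    """Pick the parameter value whose neighbourhood has the best worst-case score.

    results: list of (param_value, score) pairs sorted by param_value.
    window:  how many neighbours on each side must also look good (assumed default).
    """
    best_value, best_floor = None, float("-inf")
    # Skip the edges so every candidate has a full neighbourhood on both sides.
    for i in range(window, len(results) - window):
        neighbourhood = [score for _, score in results[i - window:i + window + 1]]
        floor = min(neighbourhood)  # the weakest result near this value
        if floor > best_floor:
            best_value, best_floor = results[i][0], floor
    return best_value


# Hypothetical optimization sweep: (lookback, CAR in percent).
sweep = [(5, 10.0), (10, 14.0), (15, 16.0), (20, 15.0),
         (25, 13.0), (30, 40.0), (35, 2.0)]

peak = max(sweep, key=lambda r: r[1])[0]  # the "one magical value" sitting by a cliff
robust = pick_robust_value(sweep)         # a value in the middle of the broad hump
print(peak, robust)
```

In this made-up sweep the raw peak sits next to a collapse, while the robust pick gives up some performance for a neighborhood where nothing falls apart.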
It is remarkably easy to accidentally create a system that cannot be traded in the real world. This can happen in many different ways including: 1) using forward data, perhaps hidden away in an indicator; 2) creating entries or exits that depend on real-time orders that you won’t be able to match; 3) ignoring the possibility of the bid-ask spread and assuming stops always get optimally filled.
Thus, when you set up to backtest, you should start with some key ideas and make sure they are coded into all your systems. These sorts of questions should come to mind. What time of day can I place orders? What order types does my broker support? How much time will it take to keep up with the orders generated by my system? How much liquidity do I need in the stocks I trade?
Keep a chunk of historic data set aside for validation once you have a system. And don’t repeatedly test your optimized system on the validation data and then go back and optimize some more. Do that enough times and you have more or less optimized your system to the data that was supposedly going to be used to validate your system.
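A minimal sketch of that discipline, assuming the data is a chronologically ordered list of bars: the most recent slice is set aside, and the code refuses to hand it out more than a fixed number of times. The class name, the 25% hold-out, and the one-look limit are illustrative assumptions.

```python
class HoldOut:
    """Chronological split that limits how often the validation slice is read."""

    def __init__(self, bars, holdout_fraction=0.25, max_looks=1):
        if not 0 < holdout_fraction < 1:
            raise ValueError("holdout_fraction must be between 0 and 1")
        cut = int(len(bars) * (1 - holdout_fraction))
        self.development = list(bars[:cut])  # optimize on this as much as you like
        self._validation = list(bars[cut:])  # most recent data, kept out of sight
        self._looks_left = max_looks

    def validation(self):
        """Return the hold-out data, but only while looks remain."""
        if self._looks_left <= 0:
            raise RuntimeError(
                "validation data already used; tuning further would fit to it")
        self._looks_left -= 1
        return list(self._validation)


data = HoldOut(list(range(100)))  # stand-in for 100 daily bars
print(len(data.development), len(data.validation()))
```

The hard limit is crude, but it makes going back for "just one more" validation run a deliberate act rather than a habit.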
Use lots of data, and make sure the data spans different market environments. But don’t forget that the market shifts over time. For example, volume today is different than it was 10 years ago.
My own approach is to bias my system under test toward the negative so I don’t get burned by flaws in my data. So, for instance, I look ahead and rule out any trade that would return more than a particular percent. I do not, however, rule out flawed trades that would lead to a large loss. Your data vendor will almost always include flawed data. You’ll need to account for it in some manner, even if not in the manner I chose.
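As a rough sketch of that one-sided filter: trades with an implausibly large gain are thrown out as probable data errors, while large losses are kept, so any remaining bad data can only make the results look worse. The 50% cap and the function name are assumptions for illustration; the right cutoff depends on your holding period and data.

```python
def drop_suspicious_winners(trade_returns, max_gain=0.50):
    """Remove trades whose return exceeds max_gain; keep every loss, however large.

    trade_returns: per-trade returns as fractions (0.04 means +4%).
    max_gain:      assumed cutoff above which a gain is treated as a data flaw.
    """
    return [r for r in trade_returns if r <= max_gain]


# Hypothetical trade list: the +310% is probably a bad price or an unadjusted
# split, and so might the -62% be, but only the flattering one is removed.
trades = [0.04, -0.62, 3.10, 0.12, -0.08]
print(drop_suspicious_winners(trades))
```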
Don’t expect symmetry between shorts and longs. Market participants tend to bring different factors to bear on stocks going up versus stocks going down. Euphoria is different than panic. But you should generally check for symmetry as part of a practice of being thorough.
You do not simply care about the CAR (Compound Annual Return) and the DD (Drawdown). You want to know much, much more about the results. Picture your equity curve as land and the drawdown as water. What’s the ratio of land to water? In other words, the depth of a drawdown isn’t the only factor; its breadth matters as well. I suggest using a logarithmic scale for your returns. You want to see as linear a ramp up as possible, with drawdowns that may happen all the time but all look pretty similar. If you only look at the CAR and DD, you can easily end up with a system that performed fantastically at one particular moment in history and rather stinks the rest of the time.
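The land-and-water picture can be turned into numbers. The sketch below walks an equity curve and reports the deepest drawdown, the fraction of periods spent below a prior high ("water"), and the longest stretch underwater. The function name and the sample curve are made up for illustration.

```python
def drawdown_profile(equity):
    """Summarize both the depth and the breadth of drawdowns in an equity curve."""
    peak = equity[0]
    max_depth = 0.0   # deepest drop from a prior high, as a fraction
    underwater = 0    # number of periods spent below a prior high
    longest = 0       # longest run of consecutive underwater periods
    current = 0
    for value in equity:
        if value >= peak:
            peak = value
            current = 0
        else:
            underwater += 1
            current += 1
            longest = max(longest, current)
            max_depth = max(max_depth, 1 - value / peak)
    return {
        "max_depth": max_depth,
        "water_fraction": underwater / len(equity),
        "longest_underwater": longest,
    }


# Hypothetical equity curve sampled once per period.
curve = [100, 110, 105, 99, 108, 112, 120, 118, 125, 130]
print(drawdown_profile(curve))
```

Two systems with the same maximum depth can have very different water fractions, which is exactly the difference CAR and DD alone will not show.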
Get to know yourself as a trader and build into your backtests whatever assumptions are needed to maintain your sanity and even your enjoyment of the trading. As I’ve mentioned elsewhere, though it took me a few years, I eventually discovered that I do much better if I’m in cash at night, with all my positions closed out. Over time, I think I’ve sort of figured out how to make this a strength, but initially I abided by it even though it hurt my backtest results. Let me put this another way. When backtesting, it is easy to ignore yourself. Don’t.
A reason to be biased toward shorter-term trading systems: your data will provide a more valid test of your hypothesis. If you are using 6 or 7 years of data (and I wouldn’t recommend less even if it slows your computer down significantly when running tests), many trades a week will result in far more trials than a system dependent on several trades a year. I’m simply not comfortable drawing conclusions without thousands of trials across different market conditions. Thus, I have always and only tested relatively short-duration trades (1 to 10 days or so).
A reason to be biased against shorter-term trading systems: your frictional costs are higher. With System (my cleverly named system), I have to make about 3% a month to break even.
In general, picking up nickels in front of a steam roller is a bad idea. A system that only rarely has a huge loss but when it does, wipes out the results from many, many wins is very problematic. First and foremost, from a backtesting validity point of view, do you have enough loss samples to even consider your results valid? Better to have lots of wins and losses if you want more statistically valid results.
Hi Jay –
The Vanguard funds have worked out well for me. Besides capturing the market return, they’ve had the other benefits of requiring absolutely none of my time, and I sleep a little better at night.
Regarding efficient markets, I think it depends on what you’re investing in. Are you looking at very small companies with a relatively small number of transactions? I might be convinced. If you’re trading GE and Microsoft, or more generally stocks in the S&P 500, I’m less inclined to agree. With millions of shares traded per day, that’s an efficient market. The people trading GE (as a whole) know a lot more about GE than I do.
If you actually generate the returns you’re expecting trading stocks in the S&P 500, that is an amazing feat.
Two other comments, if you will. I think it’s one thing to be able to say, “We’re in a tech bubble” or “These tulips cost way too much.” But it’s really hard to invest against a bubble. Greenspan was warning of irrational exuberance a couple of years before the bubble popped. Shorting too early would have cost quite a lot, not to mention passing up the rest of the ride up.
Besides that, I can’t think of how you’d try to exploit popularity in the 1-5 day time frame. That’s mindblowing to me, and I’d love to hear more about that. Are you researching these companies, or is your selection based on some kind of technical analysis or trend detection? I’m intensely curious.
Finally, the big question for me. Why aren’t others discovering this, and competing away the returns? Why isn’t some mutual fund company advertising a fund that returned 70% on average over the past 6 years?
Greatly enjoying this,
I started writing a response in the comments, but thought I’d put in a post so others can join in if so inclined.
Okay, one point of clarification. When I discuss my system, I am not referring to Tarzan. Tarzan was/is an experiment, but I do not trade it currently, though there are a couple very good ideas cooked into it (I think). If you read the original post on Tarzan, you will see a reference to my early attempts at Collective2 which performed great but were ill received. That system was the alpha of my current trading system, the one I talk about when referring to my own trading.
A point of agreement. I think it is of the utmost importance to sleep well at night and have that as one of the bedrock requirements of my system design. For me, that entails being fully in cash every night, even though that tends to degrade the performance a bit in the long run. I’m just very prone to frustration at being burned by the overnight news cycle, even though it helps overall. So I simply don’t mess with it, and have actually tried to make it an advantage.
Another clarification. It seems that some of my comments are being taken to refer to macroeconomic conditions. Another of my goals, however, is to build a system that is as uncorrelated to the broader market as possible. So, for instance, in the past few days, when the market was moving up, I was almost entirely short, and did okay… which is not to say that happens every time.
Which brings me back to that post I need to write on my basic approach. Suffice to say I find it MUCH easier to gain an exploitable edge in the 1 day time frame than any other time frame I’ve evaluated. And no, I don’t actually know the names of the companies I’m buying or shorting day by day.
The last question is pertinent (though once again, those results are against a system I don’t actually trade)… I’ll probably need to address it at some point as well. But I think the answer is roughly this: 1) fast-trading systems degrade with the amount of capital invested, so my technique would probably be terrible for a mutual fund; and 2) it took me a couple thousand hours to get here, so though others may be able to get there much more quickly, it is probably a reasonable barrier to entry for the average personal investor/trader.
As I said before, swing trade systems don’t come naturally to me, so this was very challenging. The system holds 7 positions at a time with an average hold time in the neighborhood of 7 trading days, so it roughly turns over one position a day… which, in my opinion, qualifies as a modestly paced swing trade pattern.
However, the current return rate will not be sustained. In fact, I am confident that it is way out in front of its sustainable return rate, so much so that I felt compelled to add a warning at the front of the system overview. Here’s how the description currently reads.
WARNING: Please do not take the current C2-forecasted return rate as accurate… it is much too high. I expect Tarzan to make between 40% and 100% per year, with perhaps a 70% average. The system happens to be off to a fast start since I signed up for C2, which is gratifying, but I do not expect to make 200% (or higher) a year in the long run. I say all this because I do not want negative reviews from subscribers who had inflated expectations for the system. Tarzan is designed to achieve excellent returns year after year. It is not a get rich quick scheme.
Tarzan is intended to beat the market without consuming you as a trader or introducing extreme risk. Tarzan swings through the market, guiding you from trade to trade at a reasonable pace and with good success over time.
If you want a twitchy minute-by-minute system, there are probably many good (and even more bad) options on C2, but this is not one of them. However, you may well be interested in Tarzan if the following characteristics appeal to you.
1) Liquid stocks that provide solid fills
2) Both long and short positions
3) 7 trades active at a time
4) Positions generally held for 2 to 10 days (i.e. swing trades)
5) Orders entered while the market is closed to be filled the following open (this includes all orders: Buy to Open, Sell to Open, Sell to Close, Buy to Close) ~ yes, you can trade Tarzan while working a full time job
6) No leverage. You will use the margin in your account to allow you to close 1 position and open another one at the same time ~ obviously, this is up to you, but our C2 results are achieved with no leverage
The net result of all of this is this: each evening, you will typically have one position to manage. You will enter a BTC or STC against an existing position and enter a new BTO or STO, with both the entry and exit order set to execute at the next market open. Automated trading is also supported.
All entry and exit signals are generated by a proprietary trading system that enters with the trend after a correction and exits either when the correction shows itself to be the new trend or when the correction has itself been corrected.
The backtest results include 79 12-month returns (sampled at the beginning of each successive month). Of those 79 12-month periods, the following results apply:
12-month return ==== Percent of 79 samples above the return rate
0% or more ==== 100%
15% or more ==== 98.7%
25% or more ==== 91.1%
50% or more ==== 78.5%
Now, let’s break it down the same way, but using the 6-month profit (there are therefore 6 extra samples, or 85 total periods).
6-month return ==== Percent of 85 samples above the return rate
-10% or more ==== 100%
0% or more ==== 91.8%
10% or more ==== 80%
20% or more ==== 67%
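For anyone wanting to build tables like the two above from their own backtest, here is a minimal sketch. It assumes you have one return figure per month and that a new sample window starts at each successive month; under that assumption, 79 twelve-month samples and 85 six-month samples both correspond to 90 months of data. The function names and the placeholder input are illustrative, not the actual backtest data.

```python
def rolling_returns(monthly_returns, window):
    """Compound return of every run of `window` consecutive months."""
    samples = []
    for start in range(len(monthly_returns) - window + 1):
        growth = 1.0
        for r in monthly_returns[start:start + window]:
            growth *= 1 + r
        samples.append(growth - 1)
    return samples


def share_at_or_above(samples, threshold):
    """Fraction of samples whose return is at least `threshold`."""
    return sum(1 for s in samples if s >= threshold) / len(samples)


# Placeholder input: 90 months of a flat 1% return, just to show the sample counts.
months = [0.01] * 90
twelve = rolling_returns(months, 12)
six = rolling_returns(months, 6)
print(len(twelve), len(six))
for level in (0.0, 0.15, 0.25, 0.50):
    print(f"{level:.0%} or more ==== {share_at_or_above(twelve, level):.1%}")
```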
The compound annual return over that entire period was 82%. During that same period, the worst drawdown experienced was 31%. To be honest, I never expect the future to perform as well as the past due to such pitfalls as backfitting the data (hopefully a very minimal problem… I certainly worked hard to avoid this one), the gradual change in trader psychology over time, etc. My expectation is that the CAR will be in the neighborhood of 70%. I will continue to seek improvements in the system, but generally my focus will be on lowering that drawdown, which is high for my liking (I am targeting a 20% max DD), and not on increasing the CAR.
What is the takeaway? Like any good system, results do fine in the long run (6 months or more in most years). But dipping in and out of this system (and others) can potentially minimize the profit potential.
Frequently Asked Questions
What is the recommended account size?
The return rate is rather important in this calculation, so let’s assume 43% a year (see backtest results above). That works out to a compounded 3% a month. So an account with $2500 would make $75 a month, just enough to pay for the system. Now I hope to do a lot better than 43% most years, but it provides a conservative estimate, and would indicate an account would need to be in the $5000 range (or more) to make the monthly fee worthwhile (i.e. the remaining return is sufficient to beat the market). A higher return rate would obviously allow for a smaller account being successful. Likewise, a larger account increases the effective return rate (once the $75 is deducted).
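The arithmetic can be checked with a few lines of code. This sketch assumes the $75 fee is paid out of the account at the end of each month and that the monthly return is constant, which real trading will not honor; the function names are made up for illustration.

```python
def monthly_rate(annual_return):
    """Convert an annual return into the equivalent compounded monthly rate."""
    return (1 + annual_return) ** (1 / 12) - 1


def break_even_account(annual_return, monthly_fee):
    """Account size at which one month's gain just covers the fee."""
    return monthly_fee / monthly_rate(annual_return)


def net_annual_return(account, annual_return, monthly_fee):
    """Return after a year when the fee is deducted from the account each month."""
    balance = account
    rate = monthly_rate(annual_return)
    for _ in range(12):
        balance = balance * (1 + rate) - monthly_fee
    return balance / account - 1


print(f"monthly rate: {monthly_rate(0.43):.2%}")
print(f"break-even account: ${break_even_account(0.43, 75):,.0f}")
for size in (2500, 5000, 10000):
    print(f"${size}: net {net_annual_return(size, 0.43, 75):.1%} a year")
```

Running it for several account sizes shows how quickly the fixed fee stops mattering as the account grows.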
Where are the stops?
As 1) Tarzan is a mechanical system and 2) a loss stop cannot be proven to be of any value in the past 8 years… there are none. I like stops. I have worked hard to find a way to use them effectively with Tarzan, but it just does not work. Instead, I review each position each evening in accordance with strict exit rules for the system and exit losing positions the following morning. However, I understand that some traders will want to add a discretionary intraday stop, and that is fine, just please make it as large as you possibly can within your comfort zone. A close stop, even a seemingly large stop, will hurt performance over time. A loss stop will definitely help you on some trades, but it will bleed you dry over the long haul if the data is to be trusted. So add a stop if you would like, but keep it very far from the purchase price (like greater than 20%). My own approach is to use conservative money management in place of a stop, so that even in the worst case of a, say, 40% loss on a position, I would still do okay in the long run because I had not plowed my whole account into that trade. The backtested results (with that 30% drawdown and 80% CAR) use no stops and put 1/7 of the account into each position.
At the time, I was flummoxed as to why nobody would subscribe to my system. The other top performing systems had many subscribers, with some of them pulling in over $10k a month in subscription fees. So what was wrong with mine? I now believe it is because the style of trading to which I am attracted looks stupid to others. I opened positions at market open and closed them at market close. I wasn’t twitchy enough for day traders and I broke every rule of the swing traders by closing out positions “prematurely” and had way too many trades. I got lots of questions like, “Where are the stop losses” and I’d answer that I didn’t use them because I couldn’t prove they added value… but that’s not what folks wanted to hear.
I’ve continued to work on my system and utilize it for myself. There have been some ups and downs, but that’s not really what this post is about. I’ve missed Collective2, as it was one of the only public forums I had with my trading, and I found the interaction enjoyable. So I recently decided to give it another go by giving myself a very challenging (for me) goal. I didn’t focus on maximizing returns as I had with my previous system. Rather, I tried to design a system around a few key principles that I thought would make people happy.
Oddly enough, most traders (as far as I can tell… and numerous books agree with this observation) tend to focus on what makes them happy in the activity of trading rather than on making money. So I designed a system with the swing trader in mind:
1) All trades entered in the evening. Enter and exit positions on market open. Allows the system to be traded by those working full-time jobs.
2) Minimize the total number of trades while also minimizing the risk. I thought seven positions (both long and short) held at a time struck the right balance.
3) In general, let the profits run and cut the losses short.
4) Trades average 2 to 10 days. Here I was trying to meet the expectations of the trader looking for a bit of excitement without offending their sensibilities by dipping into the day-trader playbook.
Anyway, version 1 of Tarzan is now up and running. I still need to bring the max drawdown lower (it hits a 30% drawdown during 2002 in my backtesting), but I wanted to start developing a track record with it so I went ahead and put it on C2. As I alluded to above, classic swing systems are very difficult for me, so this was quite an achievement… or it will be, if it works. We shall see.
Platforms to backtest come in many shapes and sizes, but most of them have a common characteristic: they are insanely expensive. In spite of the cost, most have huge gaps in their capabilities. In particular, many are deficient at the portfolio level and instead assume you are trading a few specific equities rather than screening the broader market day by day.
Amibroker overcomes all these issues for the technical trader (not so much for the fundamental trader). It is both feature rich and affordable. I have spent quite literally thousands of hours developing my trading style and specific system, and most of that time was logged on Amibroker. If you want to focus on individual charts, it is feature rich. If you want to do portfolio testing, it is feature rich. I have encountered very few goods or services in my life of any sort that were such a remarkable value.
Now, I will hasten to add that Amibroker does have some flaws. However, the rich scripting language (AFL) has allowed me to overcome pretty much all the problems I’ve encountered in the generic features. For instance, I’m not a fan of the default stops (loss, profit, time, etc.), but have implemented my own using AFL. You can even turn off the default backtester itself and programmatically control the individual buy/sell decisions. Thus, I was able to create a system that includes both shorts and longs with separate max open positions for each. So the flaws, though real, tend to be surmountable and do not detract from the value of Amibroker.
I use Stockfetcher to: 1) execute my system on a daily basis; 2) explore the universe of stocks and chart patterns for new ideas; 3) provide initial testing on those new ideas.
The scripts used to screen for stocks are fairly easy to learn and are remarkably powerful. The only deficiency in the scripting logic is the lack of an OR statement (this is by design, according to the site owners), and even this small drawback is easily overcome with a clever use of the COUNT command.
At $9 a month, I’m not aware of a better value for those wanting to implement a mechanical trading system that uses EOD data (End of Day… in other words, the day’s Open, High, Low, Close, and Volume).
And now a bit of explanation for those wondering what a stock screener (or scanner as they are sometimes called) is or does. Simply put, a stock screener evaluates data from the universe of stocks (around 8000 of them on the major markets) and spits out a list of stocks that meet specified criteria at that moment. Additionally, it might provide data about the list of stocks produced. In the case of Stockfetcher, that data can be user defined.
Thus, to run my system, I use Stockfetcher to run 8 different screens each night and return a list of stocks for each screen with an associated score that I use to sort the stocks picked from the different screens. I then take the top 6 that were picked by my 4 long screens and the top 9 that were picked by my 4 short screens. But the whole ranking and picking moves us to another tool for another entry.
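The merge-and-rank step described above could be sketched roughly like this. It is not the actual ranking tool: the data layout, the rule that a higher score is better, and the handling of a stock that shows up in more than one screen (keep its best score) are all assumptions made for illustration, and the tickers are placeholders.

```python
def top_picks(screens, limit):
    """Merge hits from several screens and return the `limit` best symbols.

    screens: list of screens, each a list of (symbol, score) pairs.
    Assumes a higher score is better and keeps a symbol's best score
    when more than one screen returns it.
    """
    best = {}
    for hits in screens:
        for symbol, score in hits:
            if symbol not in best or score > best[symbol]:
                best[symbol] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [symbol for symbol, _ in ranked[:limit]]


# Placeholder output from two long screens and two short screens.
long_screens = [[("AAA", 0.9), ("BBB", 0.4)], [("CCC", 0.7), ("AAA", 0.5)]]
short_screens = [[("XXX", 0.8)], [("YYY", 0.6), ("ZZZ", 0.95)]]

print(top_picks(long_screens, 6))   # up to 6 longs
print(top_picks(short_screens, 9))  # up to 9 shorts
```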
The next morning, I closed out the position down almost 15%. I took a massive hit to my equity because I “knew” the stock was going to go up. I had been fooled by some modest success into thinking I could wing it. It was rather devastating, yet I remain thankful for the experience. It was the first trade in which I totally deviated from my system, and I got nailed. And when I say I deviated from my system, I mean I deviated massively. I threw out my money management rules and exposed far more money to a single trade than my system allowed, thus increasing my risk. And I expanded my position at a price point that was much higher than my system allowed.
In so doing, I learned (all the way down to my bones) an absolute, essential rule for system trading. I know nothing. Nothing at all about an individual trade. All my knowledge is at the system level. I have an edge, but it has nothing to do with any specific knowledge of any particular trade. Rather, I know that if I follow my system over enough time and enough trades, the odds tilt in my favor. (At least, that’s the hypothesis under test.)
Let’s say you are holding a bag with 100 marbles, 55 of them red and 45 of them blue. You reach in (without looking) and pull out a marble. It is red. You then return the marble to the bag and shake the bag up. You do this 5 times and each time you get a red marble. Your knowledge of what just happened (you got a red marble 5 times in a row) and your knowledge of the overall system odds in no way means you know the next marble will be red. Or blue, for that matter. Your knowledge remains strictly at the system level: that over enough trials you will tend toward an average of 55% of the marbles drawn being red.
Statistics 101, yet in the heat of a trade, it is surprisingly easy to forget. If you want to learn system (or mechanical) trading, I suggest you trade like Sgt. Schultz. When looking at an individual trade, always remind yourself, “I know nothing.”
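The marble bag is easy to simulate. This sketch draws with replacement from a bag that is 55% red, then compares the overall share of red draws with the share of red draws that come immediately after five reds in a row. The function names, the seed, and the number of draws are arbitrary choices for illustration.

```python
import random


def draw_marbles(n_draws, p_red=0.55, seed=0):
    """Simulate draws with replacement; True means a red marble."""
    rng = random.Random(seed)
    return [rng.random() < p_red for _ in range(n_draws)]


def red_rate(draws):
    """Share of all draws that were red."""
    return sum(draws) / len(draws)


def red_rate_after_streak(draws, streak=5):
    """Share of red draws among those that directly follow `streak` reds in a row."""
    following = [draws[i] for i in range(streak, len(draws))
                 if all(draws[i - streak:i])]
    return sum(following) / len(following)


draws = draw_marbles(100_000)
print(f"overall red rate:       {red_rate(draws):.3f}")
print(f"red rate after 5 reds:  {red_rate_after_streak(draws):.3f}")
```

Both figures hover around 55%. The streak tells you nothing about the next marble, which is the whole point.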
Picking up the story from where I left off, I began working on an entirely new system toward the end of the summer in 2006. There were numerous fits and starts along the way, and I ultimately changed to Interactive Brokers to put more emphasis on good, cheap execution and less emphasis on real-time data. I used Collective2 to do a lot of experimentation, which proved surprisingly helpful. By December I had all the elements of a new system in place and coming into the new year I’ve started fully trading it.
For now, you can see the results on my Collective2 system 1Day, which was used to try out a lot of ideas, but since the beginning of 2007 is simply tracking my actual trades (though to a different scale, as C2 forces you to start with $100k play money).
The great thing about my current approach is that it is easy. Well, easy to execute on a daily basis… it was rather hard to develop. And that is the key lesson I believe I learned from my previous stressful success. Sustainable success isn’t going to come from a system that is massively difficult to actually trade. A good trading system should be like a good compression codec, to use a techie metaphor. Take MP3s. The codec is designed to be front-loaded. All the time goes into the encoding. The encoding will max out your CPU and takes quite a bit of time. The decoding is quite easy and can be performed in real-time with minimal processing.
Likewise, developing a system is hard work, and as I believe I’ve amply demonstrated, takes a lot of time and involves a lot of failure (at least in my case it did). Actually trading the system, however, can be fairly straightforward. I’ve got a spreadsheet that I use each night to download the hits on the six screens I use. It then ranks them and puts them in a format to be uploaded to my broker. It takes about 5 or 10 minutes a night to do my trades. Encode hard. Decode easy.
So my new system succeeds at lowering the stress level… time will tell if it remains profitable.
It took some time, and there were some fits and starts, but I put together an approach that ultimately returned 140% profit in the following 8 months. Using no leverage, for what it’s worth. That’s the good. The bad was that it was killing me. The system was simply too stressful to trade while holding down a full-time job. Eventually, the stress led to mistakes, and what should have been a 10% drawdown turned into a 25% drop with the help of numerous mistakes on my part. Though I managed to sort my way through it and saw the beginnings of a nice recovery, I ultimately decided to set aside the system for that imaginary time in the future when I can trade full time.
So I had proven to myself that I could succeed, yet as I entered the summer of 2006 I felt I had to start from scratch, with far more emphasis on building a system that was relaxing, even if the returns were much lower. I had gained a new appreciation for the debilitating effects of stress. Stress doesn’t stay within neat boundaries; it spills over into all areas of your life. And I had also come to a better understanding of my strengths and weaknesses as a trader. I had developed a certain style, a personality, as a trader. Now I had to find out if I could invent an entirely new trading system that fit my style, gave good returns, and allowed me to relax.
I was starting to have some good ideas, but had no rhythm or discipline in my trading. I was all over the place, with no sustainable system to follow. That first spike was rather heady, as it reflected a doubling of my account in a matter of weeks (without using any leverage). But the same practices that led to the spike also led to the subsequent downturn. Another lesson learned.
Toward the end of the phase you’ll see the graph settle down. The spikes and drops smoothed out. But it also settled down, as in pointing down, generating negative returns.
At the end of these tumultuous months, I was back where I started. Or was I? In hindsight, I had learned just as much during this second phase as I had during the first phase of my trading, and this time I had not lost additional money. So school was still in session, but now I had a scholarship.