Is backtesting necessary before Walk Forward Analysis?

Suppose you backtest and don't like what you see. You're probably going to drop the rule, or modify it.
You shouldn't really take any action at all, since you have forward information at this point.

But if you're going to backtest and ignore the results, then what's the point of doing it?

To be honest I find the dichotomy artificial, and it's not a distinction I'd seen before, so some of the disagreement might be down to my not understanding the nomenclature. I don't really understand what you mean by 'forwardtesting'.

To me WFA is a form of backtesting*, and in my opinion the only valid one.

* Other time domains are fully in sample, half out of sample (fit on first period A, test on second period B), half and half (fit on A and test on B, then fit on B and test on A), and knock one out (fit on all years except 2002, test on 2002; fit on all years except 2003, test on 2003; and so on).
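The splits listed in that footnote can be sketched in a few lines. This is only an illustration: the function names and the (fit years, test years) pair convention are my own, not anything from the thread.

```python
def fully_in_sample(years):
    """Fit and test on exactly the same years."""
    return [(list(years), list(years))]


def half_out_of_sample(years):
    """Fit on the first half (period A), test on the second half (period B)."""
    mid = len(years) // 2
    return [(list(years[:mid]), list(years[mid:]))]


def half_and_half(years):
    """Fit on A and test on B, then fit on B and test on A."""
    mid = len(years) // 2
    a, b = list(years[:mid]), list(years[mid:])
    return [(a, b), (b, a)]


def knock_one_out(years):
    """Hold out each year in turn; fit on all the other years."""
    return [([y for y in years if y != held_out], [held_out])
            for held_out in years]


if __name__ == "__main__":
    for fit_years, test_years in knock_one_out(list(range(2002, 2006))):
        print("fit on", fit_years, "test on", test_years)
```

Note that only the half out of sample split fits strictly on data that precedes the test period; the half and half and knock one out schemes both fit on years that come after the year being tested.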

I think they mean "paper trading" or some other sort of demo trading of an optimized strategy on live data (unlike the past data used in WFA) when they say "forward testing". But you are absolutely right that WFA is the only valid backtesting method.
 
I am talking about this:
walkfwd2.gif


This is what I mean by Walk Forward Analysis, assuming the current date is the start of the year 2008. So all analysis and tests are done with historical data. From what I've read in several Wiley Trading series books, this is the only WFA definition there is. Optimizing the strategy from 2005 till 2008, and "paper trading" it in 2009, is called Forward Testing as far as I know. I am only talking about the process in the gif animation, done with historical data. Is it clear now?
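In code, the process in the animation is roughly the loop below. It's a minimal sketch under my own assumptions: `walk_forward`, `fit_direction` and `evaluate_direction` are hypothetical names, and the "strategy" is a toy that just goes long or short depending on the sign of past returns.

```python
def walk_forward(data_by_year, fit_length, fit, evaluate):
    """Fit on the previous `fit_length` years, then test on the next year."""
    years = sorted(data_by_year)
    results = {}
    for i in range(fit_length, len(years)):
        fit_years = years[i - fit_length:i]
        params = fit([data_by_year[y] for y in fit_years])
        # The test year is never seen by fit(), so this is out of sample.
        results[years[i]] = evaluate(params, data_by_year[years[i]])
    return results


def fit_direction(history):
    """Toy 'optimisation': go long (+1) if past returns summed to a gain, else short (-1)."""
    total = sum(r for year in history for r in year)
    return 1 if total > 0 else -1


def evaluate_direction(direction, returns):
    """Total return from holding the fitted direction through the test year."""
    return sum(direction * r for r in returns)


if __name__ == "__main__":
    # Made-up returns, purely to show the mechanics.
    data = {
        2005: [0.01, 0.02],
        2006: [-0.03, 0.01],
        2007: [0.02, 0.02],
        2008: [-0.01, -0.01],
    }
    print(walk_forward(data, 2, fit_direction, evaluate_direction))
```

The out-of-sample results for each test year are then stitched together and judged as a whole.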
 
Nice picture!

I'd personally call that 'rolling out of sample testing'. It's rolling, because we're using a fixed size window. It's definitely preferable to any kind of in sample testing, whatever you call it.

I personally prefer 'expanding out of sample testing', where you fit using a fixed start date and move the end date. That's because 3 years is, for my kind of relatively slow trading at least, an incredibly short time to do any kind of fitting, and the benefits of having more data outweigh the fact that you won't pick up on changes in market structure (I'm hoping to trade stuff that will 'always' work, not stop working in a few years' time).

If you're using the right kind of techniques, which can adapt to the amount of noise vs signal in the data, you can also begin testing with relatively little data; in this case that would mean optimising in 1998 and testing in 1999 for the first test. That means you don't throw away as much data.

Here is a (non animated) picture. Each row shows the years used to fit the rule we test in that year.

expanding.jpg
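To make the difference between the two schedules concrete, here is a small sketch that lists which years are used to fit the rule tested in each year. The function names and the 1998 to 2007 range are assumptions for illustration only.

```python
def rolling_windows(years, fit_length):
    """Fixed-size fitting window that slides forward one year at a time."""
    return [(years[i - fit_length:i], years[i])
            for i in range(fit_length, len(years))]


def expanding_windows(years, min_fit=1):
    """Fixed start date; the fitting window grows by one year per step."""
    return [(years[:i], years[i]) for i in range(min_fit, len(years))]


if __name__ == "__main__":
    years = list(range(1998, 2008))
    print("rolling, 3 year window:")
    for fit_years, test_year in rolling_windows(years, 3):
        print("  fit", fit_years, "-> test", test_year)
    print("expanding:")
    for fit_years, test_year in expanding_windows(years):
        print("  fit", fit_years, "-> test", test_year)
```

With a 3 year rolling window the first test year is 2001, whereas the expanding version can start testing in 1999 after fitting on 1998 alone, so less data is thrown away.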

Yes, the animation I posted is one of the two types of WFA: rolling WFA, so it is essentially the same test as what you call "rolling out of sample testing". However, the kind of out-of-sample testing you do is called anchored WFA (so if you see this term don't be shocked :)). So the bottom line is, you call it out-of-sample testing, I call it WFA. Nice to know that some traders out there don't backtest on the whole historical data and curve-fit as a consequence.
 
You might be interested in an exercise I did recently on a 'toy' system. The fitting was deciding the weights to give to 4 trading rules, and to 7 instruments.

Fully in sample, non robust fitting: Sharpe Ratio 0.84
'Anchored WFA', non robust fitting: Sharpe Ratio 0.30
'Anchored WFA', robust fitting method: Sharpe Ratio 0.52

This gives an indication of how fitting entirely in sample can give you an obscenely high expectation of performance compared to what you can realise in reality (and this on a toy system with relatively few degrees of freedom), and how important it is to use a robust fitting method that doesn't produce extreme parameters from very little data.
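For anyone unsure what is being compared, a Sharpe Ratio is the mean return divided by the standard deviation of returns, usually annualised. A minimal sketch, assuming daily returns, a zero risk-free rate and 252 trading days per year (all my assumptions; the post doesn't say how the figures above were computed):

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised Sharpe Ratio with a zero risk-free rate (assumed)."""
    mean = statistics.mean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    return (mean / stdev) * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Made-up daily returns, purely to show the call.
    print(sharpe_ratio([0.01, -0.01, 0.02, 0.0]))
```

The same function would be applied to the in-sample return series and to the stitched out-of-sample series, so the numbers are directly comparable.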

This is a masterpiece. This list is why I do WFA without bothering with backtesting with whole historical data. I hope people will see this.

Can you clarify how you distinguish a non-robust fit from a robust one? Is non-robust fitting what you get when you make the "steps" too small within the range when you are optimizing a parameter?
 
Well, the fitting I am doing here is portfolio allocation. So the non-robust method is just doing a one-period Markowitz optimisation on all the data that we have (if doing WFA, only the past data). This tends to produce very extreme portfolio weights. The robust method is non-parametric bootstrapping, where I subsample the past, optimise each subsample, and then take an average of the weights.

Except that when I'm deciding instrument allocations I'd also pool data from multiple instruments to sample from, while accounting for different cost levels (so assuming all instruments have the same pre-cost performance for all trading rules, but ending up with more allocation to faster rules only for cheaper instruments).

More generally robustness in fitting is about getting the degrees of freedom in your fit to match the amount of signal vs noise in the data. If you have a robust method and you put in random noise, then you'll get back a 'null' model (in portfolio allocation space, the 'null' model is equal weights; in more general fitting problems it might be something like 'this regression line has no slope or intercept'). But a non robust method would see patterns where none really existed. If you put in data with a lot of structure the robust method will then move the parameters away from the 'null' as much as the data justifies it.

Bootstrapping is a good way of doing this, and unlike some other methods it doesn't need 'tweaking'. Take the various Bayesian methods, for example: I like the concept a lot, but you need to come up with a prior and then work out how much to shrink. It's a lot of messing around.

So you could repeatedly sample periods from the past, fit your parameters on those periods, and then take an average of the parameter values.
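Here is a rough sketch of that idea for portfolio weights. It is not the exact code behind the numbers in this thread: `fit_weights` is a hypothetical stand-in for the single-period optimiser (mean-variance weights clipped to long-only and normalised, which is my assumption), and resampling individual observations with replacement is also an assumption; you could sample contiguous periods instead.

```python
import numpy as np


def fit_weights(returns):
    """Hypothetical one-period fit: w proportional to inv(cov) @ mean,
    clipped to long-only and normalised to sum to 1."""
    mean = returns.mean(axis=0)
    cov = np.cov(returns, rowvar=False)
    raw = np.clip(np.linalg.pinv(cov) @ mean, 0.0, None)
    if raw.sum() == 0:
        # Nothing looks attractive: fall back to the 'null' equal weights.
        return np.full(returns.shape[1], 1.0 / returns.shape[1])
    return raw / raw.sum()


def bootstrap_weights(returns, n_samples=200, sample_size=None, seed=0):
    """Resample the past, optimise each resample, average the weights."""
    rng = np.random.default_rng(seed)
    n_obs = returns.shape[0]
    sample_size = sample_size or n_obs
    all_weights = []
    for _ in range(n_samples):
        idx = rng.integers(0, n_obs, size=sample_size)  # with replacement
        all_weights.append(fit_weights(returns[idx]))
    return np.mean(all_weights, axis=0)


if __name__ == "__main__":
    # Pure noise: three assets with no real difference between them.
    rng = np.random.default_rng(42)
    noise = rng.normal(0.0, 0.01, size=(250, 3))
    print("single fit:  ", fit_weights(noise))
    print("bootstrapped:", bootstrap_weights(noise))
```

On pure noise like this you would generally expect the averaged weights to sit closer to equal weights than a single fit does, although any one run can vary.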

I would say it's good to do both. If you optimize first, you get an idea of the range of parameters that work well and where to start your walk forward analysis. In addition, you get your best-case results, which gives you a standard to judge how well the system performed out of sample using walk forward.
 
Wait, what's the difference between out-of-sample backtesting, forward testing and simtrading? In my book they are all one and the same. You run a parameterized strategy over a data set that has not been used to parameterize the strategy, and simulate fills/executions of potential trades. One and the same thing.

It's easier and quicker to refer to this: Developing A Plan. Observation, backtesting, forwardtesting, simtrading, real trading.
 