Is backtesting necessary before Walk Forward Analysis?

Do you Backtest before Walk Forward Analysis?


no, we are not talking about different terms, the terms all refer to one and the exact same way to apply data to a parameterized strategy. Nothing new, nothing different. And no, I did not say what you mentioned was incorrect because we are talking about different terms but because what you claimed was factually incorrect.

Yes I have. And the advanced stuff as well.

And I politely disagree that everything I said is incorrect, since it's clear that we're talking about different definitions of the terms 'WFA' and 'backtesting'.

Are you always this rude?
 
lol, there is and they all refer to the same exact approach. It is just sneaky snake oil salesmen inventing new terminology for the same old thing. Statistics has not been re-invented, it is just that the revenue streams of many of the "backtesting" platform providers are drying up and they have to come up with fancy terminology to hook newbies. Take a look at how many testing platforms claim to now feature "machine learning" and "genetic algorithms". As someone with an advanced degree in quant finance and statistics from one of the best academic institutions in the world and more than a decade at hedge fund and sell side quant trading desks, I can only chuckle and smirk. Most such platforms cannot even properly aggregate P&L, much less properly convert non-base-currency returns. There is a whole snake oil industry that feeds off newbies or the lazy, who love to hear stories about how 1 hour per day makes them rich.

backtesting (out-of-sample), simulation, paper trading, walk forward testing, forward testing: it is all one and exactly the same approach, namely running data that have not been used for strategy parameter calibration through a strategy for the purpose of performance evaluation.

Thanks. Good to have some helpful people. As I said in my other post, in my terminology it would be:

Observation & rule design,
backtesting (or simulation) done using rolling out of sample data,
paper trading,
real trading.

I hadn't heard the term 'walk forward testing' till about 6 months ago, and 'forward testing' not until yesterday; despite about a decade in the industry. Which goes to show there isn't a common dictionary on this stuff.
 
this is potentially one of the worst ways to test a strategy. You are basically overfitting the data not once, not twice, but around 7 times. And you basically completely re-calibrate the strategy from one period to another. The problem with this is that your strategy is overfitted to the most recent past period, but markets do not care about distinct, period-defined time buckets. Imagine you "optimized" (aka overfitted) your strategy to 2004-2007 data, just as in the last step of this animated graph, and then ran an out-of-sample backtest for 2008. You would discover that the strategy performance completely breaks down, even if the strategy may be sound, just because you limited your parameterization to one specific period.

Essentially, this approach grossly disregards market cycles and changes in market dynamics. And what I described above is your best-case scenario. Imagine you optimized the strategy to 2004-2007 data and traded 2008, you would be completely wiped out. Congratulations, another one bites the dust...simply because of an extremely poor way to develop and test a strategy.


I am talking about this:
walkfwd2.gif


This is what I mean by Walk Forward Analysis, assuming the current date is the start of the year 2008, so all analysis and tests are done with historical data. From what I've read in several Wiley Trading series books, this is the only WFA definition there is. Optimizing the strategy from 2005 till 2008 and then "paper trading" it in 2009 is called Forward Testing, as far as I know. I am only talking about the process in the gif animation, done with historical data. Is it clear now?
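To make the rolling process concrete, here is a minimal Python sketch of how the optimisation and test windows step through history. The animation itself is not reproduced here, so the window lengths (3 years to optimise, 1 year to test), the year range and the name `walk_forward_windows` are all assumptions for illustration, not the definition from any particular book.

```python
from typing import Iterator, List, Tuple


def walk_forward_windows(
    periods: List[int], train_len: int, test_len: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (optimisation window, test window) pairs that roll forward in time."""
    start = 0
    while start + train_len + test_len <= len(periods):
        train = periods[start : start + train_len]
        test = periods[start + train_len : start + train_len + test_len]
        yield train, test
        start += test_len  # step forward by one test window


if __name__ == "__main__":
    years = list(range(2000, 2008))  # hypothetical history: 2000..2007
    for train, test in walk_forward_windows(years, train_len=3):
        print(f"optimise on {train[0]}-{train[-1]}, test on {test[0]}-{test[-1]}")
```

Every test window lies strictly after the window it was optimised on, so the performance figures that get stitched together come only from data the optimiser never saw.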
 
lol, yes, so what are you surprised about? Of course this is the case. Let's drop all those fancy words for one second: If you optimize over the same data on which you measure performance then even a middle schooler understands that such results will be better than using unused data for performance evaluation. Has zero and nothing to do with "anchored WFA", robust, non-robust fitting, or whatever other poop you read in some TA books written by authors who must publish books because otherwise they could not make a living.

You might be interested in an exercise I did recently on a 'toy' system. The fitting was deciding the weights to give to 4 trading rules, and to 7 instruments.

Fully in sample using non robust fitting Sharpe Ratio 0.84
'anchored WFA' using non robust fitting Sharpe Ratio 0.30
'anchored WFA' using robust fitting method Sharpe Ratio 0.52

This gives an indication of how fitting entirely in sample can give you an obscenely high expectation of performance compared to what you can realise in reality (and this on a toy system with relatively few degrees of freedom), and how important it is to use a robust fitting method that doesn't produce extreme parameters from very little data.
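For anyone who wants to try a comparison like this themselves, here is a rough Python sketch of the two building blocks. It assumes that 'anchored' means the fitting window always starts at the first period and only grows, and that the Sharpe Ratio is annualised with a zero risk-free rate; the function names are made up for illustration, and the fitting methods behind the figures above are not shown.

```python
import math
import statistics
from typing import Iterator, List, Sequence, Tuple


def anchored_windows(
    periods: List[int], min_train: int
) -> Iterator[Tuple[List[int], int]]:
    """Yield (all periods seen so far, next period): the fit window only grows."""
    for i in range(min_train, len(periods)):
        yield periods[:i], periods[i]


def sharpe_ratio(returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualised Sharpe ratio of per-period returns, risk-free rate taken as zero."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)
```

The out-of-sample figure would come from fitting on each growing window, trading only the following period with those parameters, and computing `sharpe_ratio` on the concatenated out-of-sample returns.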
 
this is potentially one of the worst ways to test a strategy. You are basically overfitting the data not once, not twice, but around 7 times. And you basically completely re-calibrate the strategy from one period to another.

Essentially, this approach grossly disregards market cycles and changes in market dynamics.

Not true. It's certainly far better than fitting insample. As to whether it is overfitted that will depend on how robust the fitting method is, how many years of data are used to fit and so on.

Your reply isn't even consistent. Repeatedly fitting is bad, but you have to account for changes in market dynamics (which one assumes involves repeatedly refitting, unless you fit multiple models to the past and then try to predict which state is coming next using a Markov state machine or similar, an approach with way too many parameters for my liking that will probably end up overfitting).

Take a look at how many testing platforms claim to now feature "machine learning" and "genetic algorithms"

lol, yes, so what are you surprised about? Of course this is the case. Let's drop all those fancy words for one second: If you optimize over the same data on which you measure performance then even a middle schooler understands that such results will be better than using unused data for performance evaluation. Has zero and nothing to do with "anchored WFA", robust, non-robust fitting, or whatever other poop you read in some TA books written by authors who must publish books because otherwise they could not make a living.

That is obvious to me as well, but then so what? What we're talking about isn't "poop from TA books" but basic statistics. The guys in the TA books (which I haven't read by the way, despite your inference) might come up with silly names for it, but the whole point of this thread is to illustrate that out of sample is better than in sample fitting (which, when you strip away the jargon differences, is what we're talking about). If that is in a TA book, well, it might be a TA book written by an author of dubious motivation, but it's still correct.

It's clear from some replies that not everyone appreciates that, hence me making the point with some actual figures.

I'm also confused by how what I am saying is both 'obvious' and 'factually incorrect'!

I'm no fan of machine learning either, but that isn't what is being discussed here, is it?

I come to this place to learn and to try and pass on something of what I know, and with an open mind to learn from others. Sadly you have zero interest in learning anything, and despite your undoubted claimed expertise you have no desire to pass on your knowledge, only to ridicule and abuse those who disagree with you or know less than you (which is pretty much everybody else as far as I can tell).

Your reasons for doing this I can only imagine, but Freud would have a field day. Perhaps you are trying to reproduce the macho camaraderie you once had on a trading floor. I pity you. For someone who claims to spend 12 hours a day working still to have time to come on here is very strange. I suggest you try and spend more time doing other things. It might make you a better person.

I've seen this kind of behaviour from you on too many threads now. You are rude, and you are a bully. Worse still, you are boring. You, sir, are ignored.
 
... As someone with an advanced degree in quant finance and statistics from one of the best academic institutions in the world and more than a decade at hedge fund and sell side quant trading desks, I can only chuckle and smirk...

I need some verification of that. I myself am a newbie with very little live trading experience and a student at a university which is not one of the world's best. English is not my first language, but I do not wrongly use the word "parameterized" for "optimized" (maybe I'm misunderstanding this). So if what you are saying is true, then you either did not attend or did not listen in the classes you were supposed to, because as user globalarbtrader says, this is all basic statistics.
 
I really appreciate your help here. Thank you very much. Your previous post regarding non-robust vs robust fitting covered a topic that was unknown to me; I am going to research it when my exams are over. Again, thanks for the informative posts.
 
I would say it's good to do both. If you optimize first, you get an idea of the range of parameters that work well and where to start for your walk forward analysis. In addition, you also get your best-case results and have a standard against which to judge just how well the system performed out of sample using walk forward.

Is judging a walked-forward system (if that is the correct term) against a system curve-fitted to historical data the best way to go? Robert Pardo (if I am not mistaken) measures a "Perfect Profit" metric, which is the profit you would make by buying at every dip and selling/shorting at every peak (this metric changes from timeframe to timeframe). Comparing the walked-forward system to this metric is more logical in my opinion, but I would like to hear what you think as well.
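As a rough illustration of what I mean, here is a small Python sketch of one simplified reading of that metric: if you are long into every rise and short into every fall, with one unit and no costs, the profit is just the sum of the absolute bar-to-bar price changes. The function names and the `efficiency` ratio are my own assumptions for illustration and may differ from how Pardo defines things in detail.

```python
from typing import Sequence


def perfect_profit(prices: Sequence[float]) -> float:
    """Profit from being on the right side of every bar-to-bar move (one unit, no costs)."""
    return sum(abs(b - a) for a, b in zip(prices, prices[1:]))


def efficiency(strategy_profit: float, prices: Sequence[float]) -> float:
    """Strategy profit as a fraction of perfect profit over the same prices."""
    pp = perfect_profit(prices)
    return strategy_profit / pp if pp else 0.0
```

Because the sum depends on which bars you feed in, the same market gives a different number on daily bars than on hourly bars, which is why the metric changes with the timeframe.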
 
* again, you are mixing up terminology here: "Fitting in-sample" is a misnomer. All data used for fitting/calibration/parameterization is by definition in-sample.

* good luck with re-fitting a strategy constantly. How do you know the current dynamic continues or changes tomorrow? If your strategy is not robust to handle different market cycles then no "fitting" (overfitting) in the world will save you.

Feel free to ignore me, yet that does not change the fact that the terms you and your comrades used over 5 pages are all identical, they describe one and the same.

 
I am not a native English speaker either, but I am not sure which part of "parameterization" is incorrect:

Definition:
  1. Parametrization (or parameterization; also parameterisation, parametrisation) is the process of deciding and defining the parameters necessary for a complete or relevant specification of a model or geometric object.
It is equipping variables with values to specify a model/strategy. That is the exact same goal as what an optimization aims to achieve. Not sure where the confusion stems from.

And I find it funny that you accuse me of misleading you or others and/or using concepts wrongly. Feel free to take a look at the countless technical threads in which I have participated. If you still think I have no clue what I am talking about, then feel free to put me on ignore. But I recommend you take the content you read or hear from others and verify it for yourself. In that sense I am happy you are critical towards what I said, but I recommend you do the same with other content. Murray Ruggiero in another thread openly admitted that his software lacks some of the most basic features that I asked him about (tick data backtesting capabilities and correct currency conversions of non-base-currency performance data, among others). Let me know if you need a link. Some have a strong incentive to make backtesting sound complicated and complex, as if only their software features take the "right" approach. The difference here is that I have no incentive whatsoever other than helping newbies like yourself to cut down on needless complexity and just see the forest for what it really is, a bunch of trees.

You are listening to those who have an issue with the way I talk (which is obviously debatable) rather than focusing on the content. They have not been able to explain how all this hocus pocus terminology is different.

But again in summary: You divide your data set into 2 subsets: One in-sample set that you use to parameterize your strategy, I call it optimize because obviously you want to choose parameters that optimize your performance while hopefully keeping the robustness of a strategy intact. You then take the second, out-of-sample dataset, and test the strategy performance on that. How you slice and dice the subsets is left to your imagination but this is all there is to it. There is no forward, backward, robust, non robust, paper, testing. Just in-sample and out-of-sample. I challenge anyone to provide a comprehensible, logical, and concise proof to the contrary should anyone disagree.
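For the sake of illustration, the whole procedure fits in a few lines of Python. This is only a sketch under assumptions: the momentum rule, the candidate lookbacks and the function names `split_in_out`, `momentum_pnl` and `calibrate` are made up for the example and stand in for whatever strategy and objective you actually use.

```python
from typing import List, Sequence, Tuple


def split_in_out(
    data: Sequence[float], n_in_sample: int
) -> Tuple[Sequence[float], Sequence[float]]:
    """Split chronologically: the first n_in_sample points are in-sample, the rest out-of-sample."""
    return data[:n_in_sample], data[n_in_sample:]


def momentum_pnl(prices: Sequence[float], lookback: int) -> float:
    """Toy rule: hold +1/-1 unit in the direction of the last `lookback`-bar move."""
    pnl = 0.0
    for t in range(lookback, len(prices) - 1):
        move = prices[t] - prices[t - lookback]
        position = (move > 0) - (move < 0)  # sign of the move: +1, 0 or -1
        pnl += position * (prices[t + 1] - prices[t])
    return pnl


def calibrate(in_sample: Sequence[float], candidates: List[int]) -> int:
    """Pick the lookback with the highest in-sample P&L."""
    return max(candidates, key=lambda lb: momentum_pnl(in_sample, lb))


if __name__ == "__main__":
    # Hypothetical price series, purely for demonstration.
    prices = [100, 101, 103, 102, 104, 107, 106, 105, 108, 110, 109, 111]
    in_sample, out_of_sample = split_in_out(prices, 8)
    best = calibrate(in_sample, [1, 2, 3])
    print("chosen lookback:", best)
    print("out-of-sample P&L:", momentum_pnl(out_of_sample, best))
```

The only number that says anything about future performance is the last one, because the out-of-sample prices played no part in choosing the lookback.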

 