Do Neural Networks overfit less than Decision Trees?

Sprout · Jan 7, 2018

At the risk of being off-topic (and a novice in this domain to boot), is AlphaGo from DeepMind curve fitting as it teaches itself?

https://www.wired.com/story/this-more-powerful-version-of-alphago-learns-on-its-own/

Three Main components of Alphago:

Policy Network-
Trained on high level games to imitate the opponents.

Value Network-
Evaluates board positions and determines probability of winning from these positions.

Tree Search -
Looks through the different variations of the game from current positions to determine probability of future outcomes.

Netflix currently has a documentary on the competition between Alphago and the world’s best Go champion - Lee Sedol.

What was surprising was the crowd’s reaction when confronted with the idea of humans being ‘out-thought’ by a machine, and the huge humble pie that was served.

gon · Jan 7, 2018

Sprout said:
At the risk of being off-topic (and a novice in this domain to boot), is AlphaGo from DeepMind curve fitting as it teaches itself?

https://www.wired.com/story/this-more-powerful-version-of-alphago-learns-on-its-own/

Three Main components of Alphago:

Policy Network-
Trained on high level games to imitate the opponents.

Value Network-
Evaluates board positions and determines probability of winning from these positions.

Tree Search -
Looks through the different variations of the game from current positions to determine probability of future outcomes.

Netflix currently has a documentary on the competition between Alphago and the world’s best Go champion - Lee Sedol.

What was surprising was the crowd’s reaction when confronted with the idea of humans being ‘out-thought’ by a machine, and the huge humble pie that was served.

When you read about "policies", "values", "rewards", "agents" and so on, you are most likely dealing with reinforcement learning. The policy and value networks are trained with gradient descent; the tree search is a separate component (Monte Carlo tree search).

damien.dualsigma · Jan 7, 2018

All right so thanks everyone..
Now, the key question is.. do you need to make sure parameters introduced in random forest or neural networks or whatever ML algorithm are statistically relevant in driving the output function? By themselves or as an interaction with others? Or is ML going to figure out further meaningful relationships not actually statistically relevant at first sight?

gon · Jan 7, 2018

Introducing redundant parameters will reduce the performance of your system.

For instance, if you introduce an index and one stock that is very representative of that index and has the same dynamics, it will reduce the chances of getting a good model.

If you insert open and close prices, since they may be similar, that will probably reduce the performance too.

You can try using a correlation matrix and other feature selection techniques to avoid overloading your model with:
- Redundant predictors.
- Non-significant predictors.
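The correlation-matrix idea above can be sketched as follows. This is only a minimal illustration, assuming NumPy is available; the helper `drop_correlated`, the 0.95 threshold, and the simulated index/stock series are all made up for the example.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.95):
    """Greedily keep columns whose absolute correlation with every
    already-kept column is below `threshold` (hypothetical helper)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

rng = np.random.default_rng(0)
n = 500
index_ret = rng.normal(0, 0.01, n)                 # simulated index returns
stock_ret = index_ret + rng.normal(0, 0.0005, n)   # stock that tracks the index closely
other_ret = rng.normal(0, 0.01, n)                 # unrelated series

X = np.column_stack([index_ret, stock_ret, other_ret])
selected = drop_correlated(X, ["index", "stock", "other"])
print(selected)  # the near-duplicate "stock" column is dropped
```

A pairwise filter like this only catches redundancy between two columns at a time; it says nothing about whether a surviving predictor is actually significant for the target.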

Also, some models may require you to standardize the data you use, and others will require you to normalize it as well. Others may require you to scale your data within a certain range such as [0,1]. And for categorical data, you usually need to transform it into a binary (one-hot) representation...
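A rough sketch of those preparation steps, assuming NumPy; the function names, prices, and sector labels are invented for illustration and are not taken from any particular library.

```python
import numpy as np

def standardize(x):
    """Zero mean, unit variance (z-score)."""
    return (x - x.mean()) / x.std()

def min_max_scale(x):
    """Rescale linearly into the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def one_hot(labels):
    """One binary indicator column per category, in sorted category order."""
    categories = sorted(set(labels))
    matrix = np.array([[1 if lab == c else 0 for c in categories] for lab in labels])
    return categories, matrix

prices = np.array([10.0, 12.0, 11.0, 15.0, 12.0])       # made-up values
sectors = ["tech", "energy", "tech", "banks", "energy"]  # made-up categories

z = standardize(prices)
scaled = min_max_scale(prices)
categories, encoded = one_hot(sectors)
```

When a model is evaluated out of sample, the mean, standard deviation, minimum and maximum should be computed on the training data only and then reused on the test data, otherwise information leaks from the test set.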

If you are tempted to use preparation methods such as Principal Component Analysis, you must also be aware that it will not necessarily reduce all the features into something useful.
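One way to see why PCA is not automatically useful: it ranks directions by variance, not by how well they predict the target. The sketch below is a deliberately contrived case, assuming NumPy; the feature names `loud` and `quiet` and all numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
loud = rng.normal(0, 10.0, n)     # high-variance feature, unrelated to the target
quiet = rng.normal(0, 1.0, n)     # low-variance feature that drives the target
y = quiet + rng.normal(0, 0.1, n)

X = np.column_stack([loud, quiet])
Xc = X - X.mean(axis=0)           # PCA works on centered data

# PCA via SVD: the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)   # share of variance per component
scores = Xc @ Vt.T                # data projected onto each component

corr_pc1 = abs(np.corrcoef(scores[:, 0], y)[0, 1])
corr_pc2 = abs(np.corrcoef(scores[:, 1], y)[0, 1])
print(explained, corr_pc1, corr_pc2)
```

Here the first component carries almost all of the variance but is nearly uncorrelated with `y`, so keeping only the top component would throw away the one useful input. Standardizing the features first would change this picture; the point is only that the variance ranking and predictive value are different things.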

sle · Jan 7, 2018

gon said:
Random forests are much easier to configure and perform very well, so why not give them a try? And if you get a good model with them, then you can try to move on to NNs, which are a pain in the ass if you are not used to them.

The issue with random forests is the discrete (piece-wise constant) nature of the splits. That's a problem with almost all "classifier" methods: you end up with discrete populations, which is not necessarily optimal in finance.
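To make the discreteness concrete, here is a toy 1-D regression tree written from scratch with NumPy. It is a simplified sketch, not how any real random forest library is implemented; `fit_tree` and `predict` are hypothetical names. Even on a perfectly smooth target, the fitted tree can only output one value per leaf.

```python
import numpy as np

def fit_tree(x, y, depth):
    """Tiny 1-D regression tree: a leaf is a float (the mean of y),
    an internal node is a tuple (threshold, left_subtree, right_subtree)."""
    if depth == 0 or len(x) < 2:
        return float(y.mean())
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between identical x values
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, (xs[i - 1] + xs[i]) / 2, i)
    if best is None:
        return float(y.mean())
    _, threshold, i = best
    return (threshold,
            fit_tree(xs[:i], ys[:i], depth - 1),
            fit_tree(xs[i:], ys[i:], depth - 1))

def predict(tree, value):
    """Walk down the tree until a leaf (a plain float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if value <= threshold else right
    return tree

x = np.linspace(0.0, 1.0, 200)
y = 2.0 * x                       # a smooth, linear target
tree = fit_tree(x, y, depth=3)
preds = np.array([predict(tree, v) for v in x])
n_levels = len(np.unique(preds))
print(n_levels)                   # at most 2**3 distinct predicted values
```

Averaging many such trees, as a forest does, adds more steps but the output is still a step function of the inputs.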

gon said:
https://www.crcpress.com/Introducti...ition/Watt-McCleery-Hart/p/book/9781584886525

Looks like a pretty good book, especially if one is just getting started. Thanks!

gon said:
If you are tempted to use preparation methods such as Principal Component Analysis, you must also be aware that it will not necessarily reduce all the features into something useful.

Actually, PCA is a great tool for financial applications, if you know how to use it.

gon · Jan 7, 2018

sle said:
The issue with random forests is the discrete (piece-wise constant) nature of the splits. That's a problem with almost all "classifier" methods: you end up with discrete populations, which is not necessarily optimal in finance.

Looks like a pretty good book, especially if one is just getting started. Thanks!

Interesting point.

Just out of curiosity, do you know of any specific NN topology that is especially good for finance?

I am curious about it.

In the past I have been tempted to use NNs to parse charts directly, similar to this:

https://arxiv.org/ftp/arxiv/papers/1111/1111.5892.pdf

userque · Jan 7, 2018

damien.dualsigma said:
All right so thanks everyone..
Now, the key question is.. do you need to make sure parameters introduced in random forest or neural networks or whatever ML algorithm are statistically relevant in driving the output function? By themselves or as an interaction with others?

Yes.

damien.dualsigma said:
Or is ML going to figure out further meaningful relationships not actually statistically relevant at first sight?

A NN will build a network designed to forecast lottery drawings, based upon past drawings. It will deem some past drawing inputs more relevant than others.

NNs will find all sorts of 'relationships' in the training data, even if they are meaningless on unseen data.

I haven't used random forest much, but I believe the same holds true, but to a lesser degree.
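The lottery point can be illustrated with a small experiment on pure noise. As a stand-in for a neural network, this sketch uses an ordinary least-squares fit with many inputs (an assumption made to keep the example short and dependent on NumPy only); the sample sizes and seed are arbitrary.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination; can be negative out of sample."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(42)
n_train, n_test, n_inputs = 100, 100, 80

# Inputs and target are independent noise: there is nothing real to learn.
X_train = rng.normal(size=(n_train, n_inputs))
y_train = rng.normal(size=n_train)
X_test = rng.normal(size=(n_test, n_inputs))
y_test = rng.normal(size=n_test)

# Flexible model relative to the amount of data: 80 coefficients, 100 rows.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

r2_in = r_squared(y_train, X_train @ coef)
r2_out = r_squared(y_test, X_test @ coef)
print(r2_in, r2_out)
```

The in-sample fit looks strong while the out-of-sample R squared is negative, which is the same pattern a sufficiently flexible network shows when fed meaningless inputs.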

damien.dualsigma · Jan 7, 2018

userque said:
Yes.

Thank you.
One more “silly” question at this point:
- let’s say you have a model with 10 parameters and 1000 timesteps. Let’s say it gives you an R squared of 0.3 in sample, 0.2 out of sample.
How would you feel about it?
Would you run money on it?

Thanks all!

A NN will build a network designed to forecast lottery drawings, based upon past drawings. It will deem some past drawing inputs more relevant than others.

NNs will find all sorts of 'relationships' in the training data, even if they are meaningless on unseen data.

I haven't used random forest much, but I believe the same holds true, but to a lesser degree.

userque · Jan 7, 2018

@damien.dualsigma , I see you've quoted me, but I don't see any response you've made--other than just quoting my post. Am I missing something?

sle · Jan 7, 2018

damien.dualsigma said:
Thank you.
One more “silly” question at this point:
- let’s say you have a model with 10 parameters and 1000 timesteps. Let’s say it gives you an R squared of 0.3 in sample, 0.2 out of sample.
How would you feel about it?
Would you run money on it?

A good R^2 does not mean a good strategy, obviously, but this would be a good start. At the very least, I'd run it with IRL execution assumptions and see how it fares.
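A rough sketch of why a decent R^2 still needs an execution check, assuming NumPy. Everything here is hypothetical: the simulated returns are built to have an R^2 of about 0.2, the strategy simply trades the sign of the forecast, and the cost of 0.002 per unit of position change is an arbitrary figure.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def pnl_per_step(forecast, realized, cost_per_unit):
    """Hold +1/-1 according to the sign of the forecast and pay
    `cost_per_unit` for every unit of position change."""
    position = np.sign(forecast)
    gross = position * realized
    turnover = np.abs(np.diff(position, prepend=0.0))
    return gross - cost_per_unit * turnover

rng = np.random.default_rng(7)
n = 1000                                   # timesteps, as in the question
signal = rng.normal(0, 1, n)
forecast = 0.001 * signal                  # the model's predicted return
realized = forecast + rng.normal(0, 0.002, n)   # R^2 of about 0.2 by construction

r2 = r_squared(realized, forecast)
gross_total = pnl_per_step(forecast, realized, 0.0).sum()
net_total = pnl_per_step(forecast, realized, 0.002).sum()
print(r2, gross_total, net_total)
```

In this contrived setup the forecast flips sign about every other step, so turnover is high and the costs outweigh the gross profit even though the R^2 is respectable. A slower-moving signal with the same R^2 could look very different.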