Profit model correlation

dtrader98 · Apr 18, 2008

Quote from Jerry030:

You're welcome.

It is without a doubt the best open source data mining package out there, probably equivalent to a $4K to $6K commercial package. Their business model is very creative: give the package away for free, then make your profit by selling consulting services when people realize it will take many hundreds of hours to reach a highly skilled level of usage. Time being money, once you know what you need it's often more cost effective to pay to get that done in a few weeks than to spend months doing it yourself.

By system examples do you mean tutorial or real projects?

You can find some tutorials here: http://www.neuralmarkettrends.com/tutorials/

I don't know of published studies.

Idea: if there are a few of us who want to explore data mining with Rapid Miner and the financial markets, let's start a collaborative group.

For example:
1) Pick 3 or 4 markets and several time frames.
2) Create standard training and test data sets.
3) Put those on a private site.
4) Each month pick a characteristic to model: entry points, stops, profit targets, trend start, trend stop... there are over a dozen logical trading system components for any market or system strategy.
5) Everybody tries to create an optimal result using their preferred method: NN, decision tree, and so on. With Rapid Miner there are lots of potential methods and mixtures or design components.
6) At the end of the month everybody posts their best model as a RapidMiner file.
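
As a rough illustration of the "standard training and test data sets" step, here is a minimal Python sketch of a chronological split. The function name `chronological_split` and the sample closes are made up for illustration; nothing here comes from Rapid Miner itself. The one point it tries to show is that market data is split by time, not shuffled, so the test set always lies after the training set.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    bars: Sequence[float], train_fraction: float = 0.5
) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into train and test parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(bars) * train_fraction)
    # Everything before the cut is training data; everything after is held out.
    return list(bars[:cut]), list(bars[cut:])


# Hypothetical closing prices, oldest first (illustrative values only).
closes = [100.0, 101.5, 99.8, 102.2, 103.0, 101.1, 104.4, 105.0, 103.9, 106.2]
train, test = chronological_split(closes, train_fraction=0.5)
print(len(train), len(test))
```

If everyone in the group applies the same cut to the same price file, results from different methods stay comparable.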

What one person overlooks, someone else may discover. In any case each month the best solution becomes a kind of benchmark for further independent research or incorporation in your own trading system...... or minimally a tutorial lesson for those trying to learn that method. One person might already have their own great method for trade exit but is less than optimal at stop loss placement. So their participation in the group may really pay off if the collaborative research leads to better stop loss strategy that they can add to their trading system.

What does anybody think?

My thinking is that the work at least in terms of sharing monthly Rapid Miner model files and statistical results needs to be a private process for those contributing something to the effort.

Otherwise in typical Internet forum fashion you get 3 people posting anything useful, 110 people reading for what they can learn but contributing nothing and 27 people insisting their own ideas are much better but also contributing nothing useful. For examples look at most threads on ET or similar groups. Much talk, lots of people doing a kind of social network pecking order shuffle, with little concrete value.

So with this idea it would be the reverse: let your Rapid Miner model do the talking....sort of eliminate the pontifications and posturing and focus on objective results. This approach is standard in academic circles where researchers exchange data sets and experimental designs for peer review and validation of their theories.

Jerry030

"Otherwise in typical Internet forum fashion you get 3 people posting anything useful, 110 people reading for what they can learn but contributing nothing and 27 people insisting their own ideas are much better but also contributing nothing useful"

Don't forget the 50 that pop in and try to shred your ideas, but never back it up nor add anything useful to the discussion.

I like the way you think. Might be worthwhile to pursue, but you'd have to find someone to moderate the 'private' logistics.

Regarding examples, there was a good tutorial on financial markets at one of the sites you mentioned (it was on gold). I stepped through it, but the author didn't really draw any conclusions or show any projections.
It was like, "and so here you can look at correlations visually," and that was about it (lesson 5 or 6, I think).

I'd like to see something like that, but with more of a conclusion about predictability and results. A lot like the type of system methodology you described earlier.

I plan to play with it a bit more and see if I can come up with something useful.
What I really like is how many of the functions are already instantiated and can be quickly pulled into a graphical, tree-like environment, allowing us to play with many AI-type functions to quickly prototype.

Regarding papers, I see a few that look at things like hit rate, but they don't seem too comprehensive (i.e. they only look at a small subset of the data universe). Also, they tend to be proprietary in their approach.

Jerry030 · Apr 18, 2008

Quote from dtrader98:

*****
I like the way you think. Might be worthwhile to pursue, but you'd have to find someone to moderate the 'private' logistics.
*****

I can take care of that as I already have a couple of AI financial newsgroups on Yahoo and other places.

If I get a few more interested responses, I'll start a thread on ET and post some RM models and the results to see what kind of intellectual ferment has potential before setting up a private site.

good one · Apr 19, 2008

amazing video tutorials.
i've only just gone through the first one but would love to be involved in this project once i know what i'm doing.

TSGannGalt · Apr 21, 2008

Trader922 · Apr 21, 2008

Quote from Jerry030:

*****
I like the way you think. Might be worthwhile to pursue, but you'd have to find someone to moderate the 'private' logistics.
*****

I can take care of that as I already have a couple of AI financial newsgroups on Yahoo and other places.

If I get a few more interested responses, I'll start a thread on ET and post some RM models and the results to see what kind of intellectual ferment has potential before setting up a private site.

Jerry- I would most definitely be interested in participating in this group. I did my undergrad work in economics & had a lot of statistics work as well. I have been trading for several years and am interested in incorporating these types of techniques into my trading methods. I am currently beginning to learn Rapid Miner and would greatly appreciate working with a group of other individuals so I can learn at a much quicker pace.

Regards,
Eric

dtrader98 · Apr 28, 2008

hey all, is this thread doa? I have been playing with rapidminer a bit, and had some issues. For one, it seems to hog my computer and crash often (one downside to the nice gui based platform).

I was thinking maybe we could pick a simple experiment, like a neural net with an agreed upon data set, and compare results, just to make sure everyone can use the software properly. And if some of us neophytes have problems, maybe the expert users can chime in or we could collaboratively solve the stumbling blocks to get the ball rolling.

I have interacted a bit with the community over there, and the overwhelming conclusion is that the neural nets they have run are not good for stock market type predictions -- and these are users who are fluent in the language. There is a proprietary neuromaster tool that supposedly does this type of prediction, but I don't want to play with that tool as it isn't worth the cost to figure out if it works.

This seems contrary to what jerry mentioned earlier, so maybe we are just spinning our wheels on rapid miner? Would be nice to have maestro chime in, as he has a lot of experience in this area.

I've seen a few papers that claim at least an edge in direction (rather than magnitude) using different types of neural net weighting algorithms (e.g. steepest descent, GA, etc.).

Jerry030 · Apr 29, 2008

Quote from dtrader98:

"hey all, is this thread doa? I have been playing with rapidminer a bit, and had some issues. For one, it seems to hog my computer and crash often (one downside to the nice gui based platform)."

I'm still thinking about how to roll the idea out as a new thread here. Sorry it's taking a bit, but glad you are getting some experience with RM.

Did you download the new beta version 4.1 which has memory management options on the install screen?

"I was thinking maybe we could pick a simple experiment, like a neural net with an agreed upon data set, and compare results, just to make sure everyone can use the software properly. And if some of us neophytes have problems, maybe the expert users can chime in or we could collaboratively solve the stumbling blocks to get the ball rolling."

Excellent idea!

I'd offer the following suggestions:

1) Pick a time format for the bars: daily, 10 minute, or whatever.

2) Pick 1 specific instrument for each market type: stocks, futures and Forex.

3) Pick a simple common trading goal, such as trend start, open higher, gap opening, close outside of x day range, etc.

For my choice I'll pick CBOT Wheat Futures, daily bars and open to close relationships.

When you pick something you should be responsible for supplying the price file as results could vary depending on the source and quality of the data.

"I have interacted a bit with the community over there, and the overwhelming conclusion is that the neural nets they have run are not good for stock market type predictions -- and these are users who are fluent in the language. There is a proprietary neuromaster tool that supposedly does this type of prediction, but I don't want to play with that tool as it isn't worth the cost to figure out if it works."

The community over where?

Who sells or has neuromaster? I haven't heard of it.

"This seems contrary to what jerry mentioned earlier, so maybe we are just spinning our wheels on rapid miner? Would be nice to have maestro chime in, as he has a lot of experience in this area."

There is a basic law of economic ecology in all this: Software that can make money (in the market) is a difficult thing to create, but extremely valuable.

Some who say it can't be done have tried and failed; others haven't tried but have an economic interest in not trying (they're selling systems or services and don't want their market jumping ship); and others have tried and succeeded but don't want a bunch of people doing the same, as in some ways most markets are zero-sum games.

Personally I know it can be done. My personal best performance was a 300% increase in total account balance in 9 months with a 93% Win/Loss ratio. As they say, your performance may vary, a lot. (lol)

This stuff requires several things: 1) a lot of time, and 2) a mental skill that mixes the skills of a detective with the analytic ability of a systems designer and a philosopher's insight into dynamic processes, or something similar.

If you think you can load some data, press a few buttons and get winning trades... well, you can't, and those with that mindset should attend a seminar or buy a trading system where such results are guaranteed. There are lots of such things for sale. My spam filter is full of them.

"I've seen a few papers that claim at least an edge in direction (rather than magnitude) using different types of neural net weighting algorithms (e.g. steepest descent, GA, etc.)."

What you'll ever see in print is about 2% of what's going on. This is a domain where if you want it, you have to do it yourself or find somebody who has it to manage your money. The "how to" won't be published or sold as a package.

So I would offer that data mining is not for most of the people reading this or any newsgroup. Most will probably fail at it, which is true of trading in general as well I guess.

Jerry030

dtrader98 · Apr 29, 2008

Quote from Jerry030:

What you'll ever see in print is about 2% of what's going on. This is a domain where if you want it, you have to do it yourself or find somebody who has it to manage your money. The "how to" won't be published or sold as a package.

So I would offer that data mining is not for most of the people reading this or any newsgroup. Most will probably fail at it, which is true of trading in general as well I guess.

Jerry030

Thanks for the reply Jerry. Couple of points...

The software I was referring to is called stockneuromaster, and they seem to be hawking it all over a site that (if I recall correctly) is one of the ones you referenced earlier, or at least it is on a major site that advocates rapidminer:
http://www.neuralmarkettrends.com/2008/04/28/apple-inc-aapl-neural-net-model/#comment-1972
The neural market trends site has a few nice tutorials. But about the only commentary that refers to trading systems is the proprietary "stockneuromaster" program they are pitching. And as I mentioned, I don't plan to learn on that platform.

There is a forum that has a few discussions regarding the potential for trading systems. From what I gather, no one seems to have much luck developing reliable forecasting models with stocks there, nor do they even mention much about systems terminology (although, as you said, it's possible they just don't know enough to do so, miner software or not). The forum I looked over is:
http://sourceforge.net/forum/forum.php?forum_id=390413

I also spoke to someone pretty knowledgeable on RM, who mentioned in a few ways that predicting via RM is pretty complicated and subject to many errors (such as curve fitting, or normalizing input improperly when scaling regimes change widely between learning/training and validating, etc.).
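
To make the normalization pitfall concrete, here is a small Python sketch (not Rapid Miner; the names `fit_scaler` and `apply_scaler` and the numbers are invented for illustration). The idea, as I understand it, is that the scaling parameters are estimated on the training data only and then reused unchanged on the validation data, so a shift in regime shows up as out-of-range inputs instead of being silently rescaled away.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def fit_scaler(train: Sequence[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation from the training data only."""
    mu = mean(train)
    sigma = pstdev(train)
    if sigma == 0:
        raise ValueError("training data has zero variance; cannot scale")
    return mu, sigma


def apply_scaler(values: Sequence[float], mu: float, sigma: float) -> List[float]:
    """Apply a previously fitted z-score scaling to any data set."""
    return [(v - mu) / sigma for v in values]


# Illustrative numbers: the validation period trades in a very different range.
train = [10.0, 11.0, 12.0, 11.0, 10.0]
validate = [20.0, 22.0, 21.0]

mu, sigma = fit_scaler(train)              # parameters come from training data only
scaled_train = apply_scaler(train, mu, sigma)
scaled_validate = apply_scaler(validate, mu, sigma)

# The validation inputs land far outside anything the model saw in training.
print(max(scaled_train), min(scaled_validate))
```

Fitting the scaler on the combined data would hide that gap and leak information from the validation period into training.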

I think you are jumping a bit over my capability with the tool by going straight to designing a system with specific trading goals (breakout, etc.).
I'm just not certain of how to translate those goals via RM, yet, although an example would help.

From my limited personal knowledge of rapidminer, I was expecting to start with a set we could all run to begin with that is not too complex. This way everyone would be able to get a common, say, XML model to verify I/O conditions and response.

Like, say, a simple daily equity NN model (the Qs, for instance) with maybe 1,000 days of data for training; the end result would be to validate the one-day prediction over some period and quantify the success on an out-of-sample set.
This is not so much to validate a system, for me, but to gain a rudimentary knowledge of using rapidminer and ironing out the bugs with some others attempting and looking at potential pitfalls, etc...
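
Something like the following Python sketch is what I have in mind for "quantify the success", stated loosely. It is not a neural net: a naive persistence rule (tomorrow's direction equals today's) stands in for the model, and the price series is a synthetic random walk, not real Qs data. All function names are hypothetical. It only shows how a one-day direction forecast could be scored as a hit rate on the days after the 1,000-day training window.

```python
import random
from typing import List, Sequence


def daily_directions(closes: Sequence[float]) -> List[int]:
    """Return +1 for an up day and -1 otherwise, for each consecutive pair of closes."""
    return [1 if b > a else -1 for a, b in zip(closes, closes[1:])]


def hit_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of days on which the predicted direction matched the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def persistence_forecast(directions: Sequence[int]) -> List[int]:
    """Naive stand-in model: predict each day's direction as the previous day's."""
    return list(directions[:-1])


# Synthetic random-walk closes (assumption: NOT real market data).
rng = random.Random(42)
closes = [100.0]
for _ in range(1300):
    closes.append(closes[-1] * (1.0 + rng.gauss(0.0, 0.01)))

dirs = daily_directions(closes)
train_dirs, test_dirs = dirs[:1000], dirs[1000:]  # a real model would be fit on train_dirs

predicted = persistence_forecast(test_dirs)  # forecasts for test days 2..N
actual = test_dirs[1:]                       # what actually happened on those days
print(f"out-of-sample hit rate: {hit_rate(predicted, actual):.3f}")
```

Any model we build in the tool could be scored the same way, which would give everyone a common number to compare.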

For instance, many of the tutorials they show were performed in a former version of YALE and did not port over easily.

Don't want to bog you down here with the rudimentary stuff. Let me know if my idea seems reasonable, otherwise maybe I'll keep working on getting up to speed with the tool while I have time. I would love to see your wheat run, but want to make sure I'm contributing something equal.

Trader922 · Apr 29, 2008

All- I am certainly still interested in collaboratively working on a Rapid Miner project.

Wheat seems like a good choice to me for a market to study based on the recent events in the market.

Also, I have a dotnetnuke site I would be willing to customize to facilitate the project and storage of the data.

Jerry - How much data would you typically use for a project like this when using EOD data?

Regards,
Eric

Jerry030 · Apr 29, 2008

Time is very limited today so sorry for the very short reply below

1) Neuromaster...thanks for the info. That makes sense. The folks who made Rapid Miner free (open source) didn't do it to benefit humanity but to make money. Hence the general consulting service offering on their web site and also hence the neuromaster offering. That's fair and fine, IMO. We all need to make a living.

Keep in mind though that they have identified the market well. Most traders are the in-a-hurry kind of person. So why learn a very complex process (predictive modeling), when you can buy a packaged solution?

2) Specific on RM for the markets. The real issue isn't the software app but the process. Can a NN, or Decision Tree or any of the several dozen data mining paradigms RM incorporates work in the market? If the paradigm is implemented accurately the same method will work with any package. Think paradigm not package. Will a NN work? Yes or no...Decision tree ..Yes or no.

3) "I also spoke to someone pretty knowledgeable on RM, who mentioned in a few ways, that predicting via RM is pretty complicated and subject to many errors (such as curve fitting, normalizing input improperly with wide change in scaling regimes during learning/training vs. validating, etc...)."

Yep...these are issues and skills needed with most packages. The corporate grade stuff does all this automatically but we are talking software that is in the multiple tens of thousands of dollars retail. For a free package you've got to understand and do this yourself. Lack of understanding is why most people who try predictive modeling with NN fail. There is an IT term for it, GIGO: Garbage In, Garbage Out.

4)"I think you are jumping a bit over my capability on the use of the tool by jumping to designing a system with specific trading goals(breakout, etc.).
I'm just not certain of how to translate those goals via RM, yet, although an example would help."

Sorry about that.....you'll find I'll do that...keep reminding me not to.

"From my personal limited knowledge with rapidminer, I was expecting to start with a set we could all run to begin with that is not too complex. This way everyone would be able to get a common say xml model to verify I/O conditions and response."

An idea here may be to start with stuff that is easy to model, learn the process, then get into the markets. RM has some test datasets, and there are also public data mining data sets out there used to benchmark software. Or we can do a very simple market related test...let me know what people think.

"Like say a simple daily equity model (like Qs for instance) nn, say with maybe 1,000 days of data for training and the end result would be to validate the one day prediction over some period and quantify the success over an out of sample set.
This is not so much to validate a system, for me, but to gain a rudimentary knowledge of using rapidminer and ironing out the bugs with some others attempting and looking at potential pitfalls, etc..."

1000 bars is on the small side. You'd be better off with 10,000, even if you have to use hourly bars. You may not trade hourly bars, but you can learn with them.

Also, perhaps forget the NN part to start, as it is the hardest to master. Start with, say, a Decision Tree or Rule problem. Use its performance as a baseline, then get the NN to improve on it.

Or conversely, take the known structure of a simple trading system (there are many), say MACD crossover with RSI confirmation and ADX for exit, and model a prediction of its success or failure on each trade. There you free yourself from the complexity of data normalization, as most TIs are by their nature normalized: 0 to 100 for the RSI.
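
To show why the RSI needs no extra scaling, here is a minimal Python sketch of a simple-average RSI. Note the assumption: this is the plain-average variant, not Wilder's smoothed version that most charting packages use, and the function name and default period are just illustrative. Whatever the price level of the instrument, the output stays between 0 and 100.

```python
from typing import Sequence


def simple_rsi(closes: Sequence[float], period: int = 14) -> float:
    """Simple-average RSI over the last `period` price changes (not Wilder's smoothing)."""
    if period < 1 or len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        # No down moves in the window: 100 if anything rose, 50 if prices were flat.
        return 100.0 if gains > 0 else 50.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


# The scale of the prices does not matter; only the balance of up and down moves does.
cheap = [1.0, 2.0, 1.0, 2.0, 1.0]
expensive = [1000.0, 2000.0, 1000.0, 2000.0, 1000.0]
print(simple_rsi(cheap, period=4), simple_rsi(expensive, period=4))
```

So an RSI reading can go straight in as a model input, while raw prices would need the kind of careful scaling discussed in point 3.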

Jerry030