System Development with acrary

acrary · Jul 1, 2004

Quote from opmtrader:

I noticed that correlations are used in much of your work. I was wondering if you took sample size into account when performing these correlation calculations. I now use a "correlation rank" in my work after reading the following in Victor Niederhoffer's Practical Speculation:

"In practice, we have found any number above 0.10 or below -0.10 based on 100 or more observations to be useful. The general formula we follow for usefulness is that the correlation coefficient times the number should be greater than 10. Thus, for 50 observations, a correlation of 0.20 would be useful and for 20 observations, a correlation of 0.50 would be useful." (pg. 196 to 197)

I don't really think of sample size when viewing a correlation. All I'm trying to do is see if there is something that could require further followup.

I think Victor is estimating standard error and throwing out some ballpark numbers. For a better estimate:

Error estimate

E = (z * std. dev. of sample) / sqrt(number of samples in test)

z = number of std. dev. of normal distribution for the confidence level needed.
z = 3.08 = 99.8% confidence level
z = 2.58 = 99.0% confidence level
z = 1.96 = 95.0% confidence level
z = 1.645 = 90.0% confidence level

Ex. 20 trades in test
sample correlation = .50
sample std. dev. = .25

If we want to know the estimate of the mean to the 99% level then:

E = (2.58 * .25) / sqrt(20)

E = .1442

so with 99% certainty, we know the correlation is .50 +- .1442 (you can expect to see the correlation between .3558 and .6442 in the future)
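
The arithmetic above can be checked with a short Python sketch. The function name `margin_of_error` and its parameter names are only illustrative; the formula and the inputs (z = 2.58, std. dev. = .25, 20 trades) are the ones from the post.

```python
import math


def margin_of_error(z, sample_std, n):
    """Return E = z * sample_std / sqrt(n), the error estimate from the post."""
    if n <= 0:
        raise ValueError("n must be a positive number of samples")
    return z * sample_std / math.sqrt(n)


# Worked example from the post: 20 trades, correlation .50, std. dev. .25,
# z = 2.58 for the 99% confidence level.
r = 0.50
e = margin_of_error(z=2.58, sample_std=0.25, n=20)
low, high = r - e, r + e
print(f"E = {e:.4f}, interval = {low:.4f} to {high:.4f}")
# prints: E = 0.1442, interval = 0.3558 to 0.6442
```

Because n sits under a square root, it takes four times as many trades to cut E in half.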

opmtrader · Jul 1, 2004

Yes your way seems more complete. Thank you for that explanation. I'll have to revise my methods to be more accurate.

Not to take this off track with more questions but I was wondering if there is a correlation threshold that you find significant for further investigation? I know from a past post that you will test a correlation value that is supposedly significant with a value generated from a source you know is relatively insignificant. Is this method still a good approach?

Also do you ever fear that you might be missing a more specific setup by looking at broader correlation relationships? I think you rely on your own methods and creativity to drill down into the data but this might be a problem for others.

Lately I have been using scatterplots across a number of variables to do my strategy exploration. Let's say I want to study gaps. I might log variables like gap size, past 1 day return, past 2 day return ... next 1 day return, next 2 day return, ... 1 day autocorr, volume, etc. I'll then throw up scatterplots of all these relationships and start exploring. Find what matters and what doesn't. I think this is a more robust way of exploring a phenomenon in its entirety rather than the "write rule -> optimize" approach. Any comment on this approach? I'm still experimenting with approaches to generate new trading ideas. As I said I think organization is key. I'd add speed to that list as well.

Please ignore if this takes you off track.

bdixon619 · Jul 1, 2004

from page 89 October 1997, TASC, interview by Thom Hartle with Tushar Chande,

TH: So how did you benchmark these systems?

TC: I benchmarked them using correlation, return-variation ratio*, ... and proportion of profitable rolling three-, six-, and 12-month intervals. I also noted the average monthly return and standard deviation of monthly returns over the entire dataset. An individual trader can compare his system performance directly to these numbers to get a rough idea of a long-term, breakout-style, trend-following benchmark.

*the return-variation ratio is defined as the average monthly return divided by the standard deviation of monthly returns
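
As a rough illustration, the ratio defined in the footnote and the proportion of profitable rolling intervals could be computed as below. This is a sketch under stated assumptions, not the article's code: the function names are hypothetical, the sample (n-1) standard deviation is used, returns inside a rolling interval are simply summed rather than compounded, and the monthly figures are made up.

```python
import statistics


def return_variation_ratio(monthly_returns):
    """Average monthly return divided by the std. dev. of monthly returns.

    Assumption: sample (n-1) standard deviation.
    """
    return statistics.mean(monthly_returns) / statistics.stdev(monthly_returns)


def profitable_rolling_fraction(monthly_returns, window):
    """Fraction of rolling `window`-month intervals with a positive total.

    Assumption: returns inside an interval are summed, not compounded.
    """
    sums = [
        sum(monthly_returns[i:i + window])
        for i in range(len(monthly_returns) - window + 1)
    ]
    if not sums:
        raise ValueError("need at least `window` months of returns")
    return sum(1 for s in sums if s > 0) / len(sums)


# Made-up monthly returns in percent, for illustration only.
returns = [2.0, -1.0, 3.0, -2.0, 1.0, 4.0]
ratio = return_variation_ratio(returns)
frac3 = profitable_rolling_fraction(returns, window=3)
print(f"return-variation ratio = {ratio:.3f}")
print(f"profitable 3-month intervals = {frac3:.0%}")
```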

...

from page 90 of the same article,

TH: Whew! That's quite a handful. Could you narrow it down a bit?

TC: I know, it's complicated. I would focus on the return-variability ratio and proportion of profitable three-month periods. You can use statistical tests to check for significance or your subjective assessment of different systems.

...

from page 75 of the same article,

TH: Could you explain what a point estimate is?

TC: A point estimate means that even though the data may range from some low value to some high value, the summary statistic uses a single value in between.

TH: Is there some other kind of estimate too?

TC: The opposite of point estimates is interval estimates, which say that there is an x% confidence that the true value lies in an interval between two points.

A couple of other areas covered in the article have also been covered here. I think it is safe to say you have found an acceptable method for system testing. That you have been using this since 1988 is testament to your forward-thinking and enterprising nature.

bdixon619 · Jul 2, 2004

"While useless trades with zero expectancy arguably never hurt anybody, they do consume time, commissions and slippage!" --The Student

There's the writing on the wall as it applied to me when I was looking for ways to trade stocks more efficiently.

I started out reading P/E ratios of stocks. The inverse of this is:

EPS/Current Market Price of Common Stock.

Let's say I have a volatility breakout system that averages x amount of points per y days. (I never developed systems using dollars of profit, only points.) Then if I am using a rolling average of y days, all I have to do is divide the current day's average by the current day's closing price to give me a general idea how much a particular stock is earning as a percentage of its price.

Next, I wanted to find out the average trade return. Dividing the rolling average return by the number of trades gives me the average trade amount.

It is then possible to take either one or both of those figures (depending on which you find easier to understand) and divide by the commissions paid during the rolling period. I used 13-day rolling periods, so every day all of the numbers for my stocks would change a little. The 13-day window helps smooth the volatility of returns for easier comparison. The point is to identify those stocks that are breaking out based on their return statistics: trending stocks will have lower commissions relative to returns, while ranging stocks will have higher commissions.

Finally, I wanted to keep track of how many trades were taking place (on average) across these rolling 13-day periods.

These two operations produced either two or three columns of data. Plotting either of the first two against the third produced a scatter chart. By using the Autofilter in Excel, various return or trade combinations could be filtered out depending upon what goals were being focused on. Remember, my goal, since I am limited in capital, is to employ it in the most efficient manner.
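
For anyone who would rather build these columns outside Excel, here is one possible reading of the steps above as a Python sketch. The function name, the dictionary keys, and the idea of stating commission in points per trade are all assumptions made for illustration; the sample data is invented.

```python
def rolling_metrics(points, closes, trades, commission_per_trade, window=13):
    """Build one row of rolling statistics per day that has a full window.

    points: daily system result in points
    closes: daily closing prices
    trades: number of trades taken each day
    commission_per_trade: assumed to be in points so the units match
    """
    rows = []
    for end in range(window, len(points) + 1):
        total_points = sum(points[end - window:end])
        trade_count = sum(trades[end - window:end])
        commissions = trade_count * commission_per_trade
        rows.append({
            "day": end - 1,
            # rolling average points as a percentage of today's close
            "yield_pct": 100.0 * (total_points / window) / closes[end - 1],
            # points earned per trade over the window
            "avg_trade": total_points / trade_count if trade_count else 0.0,
            # points earned per point of commission paid
            "return_per_commission": (
                total_points / commissions if commissions else 0.0
            ),
            "trade_count": trade_count,
        })
    return rows


# Invented data: 14 days, one trade a day, one point earned per day.
rows = rolling_metrics(
    points=[1.0] * 14,
    closes=[100.0] * 14,
    trades=[1] * 14,
    commission_per_trade=0.05,
)
for row in rows:
    print(row)
```

Plotting `yield_pct` or `avg_trade` against `trade_count` gives the kind of scatter chart described above.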

acrary · Jul 6, 2004

Sorry, but I have to cut this journal short. I've asked Magna to close it.

I was offered a head trader position with an obscene compensation package over the weekend. I didn't think I'd want to work for anyone else again, but I guess we all have our price.

One of the conditions is the end of public communications. This is necessarily my last post on ET.

Take care and good luck.

Alan

Magna · Jul 6, 2004

At acrary's request, and with great regret, this journal has been closed.

acrary · Sep 21, 2005

Thank you Magna for re-opening this thread.

Here's a table that I've used with single systems to get an idea of profitability for a period. It uses profit factor and number of trades during a period to estimate the percentage of time the period will be profitable.

For example, if I have a system with a profit factor of 1.50 and it trades 50 times per month, then I'd expect this system by itself to be profitable about 89% of the time. If I wanted the system to be profitable 95% of the time, I can see I'd have to increase the trading frequency to about 90 trades per month.

I printed it out in wordpad in landscape mode for reference.

Chriz · Sep 21, 2005

Acrary, thank you for providing the table. I'm interested in the math behind it. Can you explain the calculation or point me to a page where I can find further information?

acrary · Sep 21, 2005

It's just a Monte Carlo sim using 100,000 passes per table entry and plugging in the profit factor as a multiplier for the winning trades. Sum the winners and losers until you've reached the number of trades in the pass. Then see if the pass is a winner or loser. The percentage in the table is the number of winners in relation to the 100,000 passes.
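
A minimal Python sketch of a simulation along these lines follows. The post does not say how trade sizes or the win rate were chosen, so those parts are assumptions: each trade is a coin flip (50% winners), sizes are drawn uniformly from 0 to 1, and winners are multiplied by the profit factor. With equal numbers of winners and losers drawn from the same size distribution, the expected gross profit over gross loss equals the profit factor. The function name and parameters are hypothetical.

```python
import random


def prob_profitable(profit_factor, n_trades, passes=100_000,
                    win_rate=0.5, seed=None):
    """Estimate the fraction of periods that end with a net profit.

    Assumptions (not stated in the post): win_rate of 50% and trade
    sizes drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    winning_passes = 0
    for _ in range(passes):
        total = 0.0
        for _ in range(n_trades):
            size = rng.random()
            if rng.random() < win_rate:
                total += profit_factor * size  # winner scaled by profit factor
            else:
                total -= size                  # loser
        if total > 0:
            winning_passes += 1
    return winning_passes / passes


# Fewer passes than the 100,000 in the post, to keep the demo quick.
p50 = prob_profitable(1.5, 50, passes=10_000, seed=1)
p90 = prob_profitable(1.5, 90, passes=10_000, seed=1)
print(f"PF 1.50, 50 trades: {p50:.1%} of periods profitable")
print(f"PF 1.50, 90 trades: {p90:.1%} of periods profitable")
```

Under these particular assumptions the estimates should land in the same neighborhood as the figures quoted for a 1.50 profit factor, but that agreement does not confirm the table was built this way. A different win rate or size distribution would shift the percentages.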

acrary · Sep 21, 2005

Jack, thanks for the spreadsheet. I'm no longer under contract. Anything I have worth posting, I'll post here.

I'm done for the day.
