Recommendations on time-series price prediction models?

jublin · Apr 15, 2021

longandshort said:
I worked briefly at a quant fund (I'm a trader and not a quant, but I worked closely with them to design and implement strategies they were developing) and they were spending a lot of time and money on:
1. Using volume to predict volatility
2. Using volatility to predict price change

Based on the research we had conducted, there was very little (if any) signal in prices themselves. They were looking to trade intra-day momentum, reversals, and sniff out larger orders.

Thanks for the info! Yeah, the training I did already included volume information. But the trained model is probably not as sophisticated at characterizing volatility as semi-manually designed models.

Also, sniffing out larger orders is very interesting. In manual trading, I also found it to be pretty useful (dark pools, etc.). Some trading platforms, like Webull, provide inferred capital flows categorized into buckets of large, medium, and small orders (but that doesn't seem very useful). A search online turns up some articles on methods to sniff out large orders, like this one: https://exegy-signum.com/insights/hiding-and-seeking-with-iceberg-orders/, but they require level 2 or 3 data, which I don't have.

ph1l · Apr 15, 2021

jublin said:
Hi, thanks, this is great. Indeed, the mentioned ETF increased from 4/6 to 4/12 and then went down. I didn't expect such a simple model to be able to predict with reasonable accuracy. I wonder if there is any article I can read about this in more depth? Is it mostly a frequentist model based on observation, or is there a Bayesian explanation for why there are such long-term oscillations?

As I mentioned in this post, The Profit Magic of Stock Transaction Timing by J.M. Hurst covers the concept. "Technical Analysis of the Financial Markets," by John J. Murphy covers this in the "Time Cycles" chapter.

"Decoding The Hidden Market Rhythm - Part 1: Dynamic Cycles," by Lars von Thienen covers a similar concept with detrended sums of sinusoids projecting turning points.

The cycles in the sinusoids and the trend of asset prices continually change, so they need to be recalculated (e.g., for each new bar).

I'd think an LSTM as you proposed in the first post of this thread could be used to do something similar.
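To make the "recalculate for each new bar" idea concrete, here is a minimal sketch, not the method from any of the books above. It assumes a single cycle with a known, fixed period plus a parabolic trend, and refits both by ordinary least squares every time a bar is added. The names fit_trend_plus_cycle and predict, the 20-bar period, and the synthetic series are all made up for illustration.

```python
import numpy as np

def _design(t, period):
    # Columns: constant, linear and quadratic trend, one cosine/sine pair.
    w = 2.0 * np.pi / period
    return np.column_stack([np.ones_like(t), t, t**2, np.cos(w * t), np.sin(w * t)])

def fit_trend_plus_cycle(t, y, period):
    """Least-squares fit of y ~ a + b*t + c*t^2 + A*cos(w*t) + B*sin(w*t)."""
    coef, *_ = np.linalg.lstsq(_design(t, period), y, rcond=None)
    return coef

def predict(coef, t, period):
    return _design(t, period) @ coef

# Synthetic, noise-free series (an assumption; real prices will not fit this well).
t = np.arange(89, dtype=float)
y = 50.0 + 0.1 * t + 3.0 * np.cos(2.0 * np.pi * t / 20.0)

# Walk forward: refit on all bars seen so far, then forecast the next bar.
forecasts = []
for end in range(60, len(t)):
    coef = fit_trend_plus_cycle(t[:end], y[:end], period=20.0)
    forecasts.append(predict(coef, t[end:end + 1], period=20.0)[0])
forecasts = np.array(forecasts)
```

On real data the period itself would also have to be re-estimated, which this sketch deliberately leaves out.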

userque · Apr 16, 2021

jublin said:
Hi, I agree that using a "recursive" approach is dicey and that it amplifies error over time. For sure, the one-step model can be applied "recursively" in prediction. By "being not recursive", I mean that the training phase didn't take into account any supervision from more than one day ahead. This resulted in an easy "local optimum": more or less, simply use today's price as the prediction of tomorrow's price. Using this setup, the training phase is not forced to dig any deeper. (By applying the trained model recursively in prediction, it indeed quickly converges to a constant price.)

What I meant to ask is whether there are any better practices that can force the model to try to predict multiple days, or even months, ahead. By "forcing" I mean that the time frame is taken into account in the training phase. I'm not sure how hard it is; this is my first training. Maybe I should try an easier problem formulation, like classification instead of predicting a price.

I trained for 100 epochs, but it clearly converged after about 5 and stayed there. It took less than an hour. I used 15 years of hourly data.

You can use multiple outputs, instead of just one.

You can train a model to forecast multiple bars ahead, non-recursively.

The loss function would take into account the multiple bars ahead, rather than just one.

You can "take into account" more than one bar ahead, without being recursive.
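A rough sketch of what that could look like on the data side, independent of any particular network: each training sample gets a vector of several future bars as its target, and the loss averages over all of them. The helper names make_windows and multi_horizon_mse and the window sizes are hypothetical. The "repeat the last price" baseline is included because it is the local optimum mentioned earlier in the thread.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slice a 1-D series into (inputs, targets) for direct multi-step forecasting."""
    X, Y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])                   # past n_in bars
        Y.append(series[start + n_in:start + n_in + n_out])    # next n_out bars
    return np.array(X), np.array(Y)

def multi_horizon_mse(y_true, y_pred):
    """Mean squared error averaged over every sample and every horizon."""
    return float(np.mean((y_true - y_pred) ** 2))

series = np.arange(20, dtype=float)      # toy series, rises by 1 per bar
X, Y = make_windows(series, n_in=5, n_out=3)

# Naive baseline: predict the last observed value for all three horizons.
naive = np.repeat(X[:, -1:], Y.shape[1], axis=1)
loss = multi_horizon_mse(Y, naive)
```

A model with three outputs trained against Y sees the error at every horizon at once, so copying the last price is penalized more the further ahead it looks, and nothing is fed back recursively.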

jublin · Apr 16, 2021

ph1l said:
As I mentioned in this post, The Profit Magic of Stock Transaction Timing by J.M. Hurst covers the concept. "Technical Analysis of the Financial Markets," by John J. Murphy covers this in the "Time Cycles" chapter.

"Decoding The Hidden Market Rhythm - Part 1: Dynamic Cycles," by Lars von Thienen covers a similar concept with detrended sums of sinusoids projecting turning points.

The cycles in the sinusoids and the trend of asset prices continually change, so they need to be recalculated (e.g., for each new bar).

I'd think an LSTM as you proposed in the first post of this thread could be used to do something similar.

Got it, thanks!

jublin · Apr 16, 2021

userque said:
You can use multiple outputs, instead of just one.

You can train a model to forecast multiple bars ahead, non-recursively.

The loss function would take into account the multiple bars ahead, rather than just one.

You can "take into account" more than one bar ahead, without being recursive.

Yeah, I forgot to mention I tried that, like using 5 days of data, instead of 1, in the training. The results were bad. I can probably tune the network to improve the results, but I don't think it's a good investment of time at this point. I was looking for an online article describing a widely accepted starting point for predicting multiple days ahead using deep learning. I can't find any. Everything I found only predicts one day ahead. Anyone can conceive of such a network, but I'm looking for an article showing an example where it actually works.

userque · Apr 16, 2021

jublin said:
Yeah, I forgot to mention I tried that, like using 5 days of data, instead of 1, in the training. The results were bad. I can probably tune the network to improve the results, but I don't think it's a good investment of time at this point. I was looking for an online article describing a widely accepted starting point for predicting multiple days ahead using deep learning. I can't find any. Everything I found only predicts one day ahead. Anyone can conceive of such a network, but I'm looking for an article showing an example where it actually works.

Did you standardize/normalize the multiple outputs ... as well as the inputs?
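For reference, one common way to do this (a sketch only, with hypothetical helper names and random stand-in data) is to compute the mean and standard deviation per column on the training set, apply them to both the inputs and the multi-bar targets, and undo the scaling on the model's outputs afterwards.

```python
import numpy as np

def fit_scaler(a):
    """Per-column mean and standard deviation, taken from training data only."""
    mean = a.mean(axis=0)
    std = a.std(axis=0)
    std = np.where(std == 0, 1.0, std)   # avoid dividing by zero on flat columns
    return mean, std

def transform(a, scaler):
    mean, std = scaler
    return (a - mean) / std

def inverse_transform(z, scaler):
    mean, std = scaler
    return z * std + mean

rng = np.random.default_rng(0)
X_train = 100.0 + 5.0 * rng.standard_normal((200, 5))   # stand-in inputs
Y_train = 100.0 + 5.0 * rng.standard_normal((200, 3))   # stand-in 3-bar targets

x_scaler = fit_scaler(X_train)
y_scaler = fit_scaler(Y_train)
Xz = transform(X_train, x_scaler)
Yz = transform(Y_train, y_scaler)

# Predictions made in the scaled space are mapped back to prices like this.
Y_back = inverse_transform(Yz, y_scaler)
```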

ph1l · Apr 16, 2021

jublin said:
Everything I found only predicts one day ahead. Anyone can conceive of such a network, but I'm looking for an article showing an example where it actually works.

This might be one.
https://analyticsindiamag.com/hands...t-neural-network-for-stock-market-prediction/

Our task is to predict stock prices for a few days, which is a time series problem. The LSTM model is very popular in time-series forecasting, and this is the reason why this model is chosen in this task.
...
The plot is shown in the below image.

I am illiterate in Python, so I can't tell how long "a few days" is. This line

Code:

X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

makes me think few == 1.
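For what it's worth, that reshape on its own probably doesn't say anything about the horizon. Assuming the article uses the usual Keras-style LSTM input layout of (samples, time steps, features), the trailing 1 is the number of features per time step (just the price). The number of days predicted would show up in the shape of the targets and the width of the last layer instead. A small sketch with made-up sizes (40 samples and 60 past bars are not from the article):

```python
import numpy as np

# Hypothetical sizes: 40 test samples, each a window of 60 past bars.
X_test = np.zeros((40, 60))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
print(X_test.shape)   # (samples, time steps, features per step)

# The horizon lives in the target array, not in the input reshape.
y_one_day = np.zeros((40, 1))     # one value per sample: 1 bar ahead
y_five_days = np.zeros((40, 5))   # five values per sample: 5 bars ahead
print(y_one_day.shape, y_five_days.shape)
```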

ph1l · Apr 19, 2021

userque said:
Rather than limit your models to cos and x^2, have you considered using 'all' functions and genetic programming?

For the same data I used in this post, I tried to fit a curve using genetic programming.

The generated, fitted function for the close prices is

Code:

y =
       0: R4 = R2 * cos (-81.5485)
       1: R0 = 50.7399 - R4
       2: R4 = R4 * cos (0.199104)
       3: R3 = sqrt (R4)
       4: R4 = 41.8048 * cos (R3)
       5: R4 = 4.3255 / R4
       6: R4 = atan2 (R4 / R0)
       7: R0 = R0 * cos (87.6408)
       8: R0 = R4 + R0
       9: R4 = R4 * sin (-4.13973)
      10: R0 = R4 + R0
      11: R2 = -44.0889 * sinh (R0)
      12: R4 = abs (R0)
      13: R3 = log (R4)
      14: R4 = atan2 (R2 / 70.0589)
      15: R0 = R4 + R0
      16: R2 = asinh (R0)
      17: R1 = 13.707 * sin (R3)
      18: R1 = R1 * sin (-37.162)
      19: R4 = 1.39025 * sin (R4)
      20: R1 = R1 * cos (68.1254)
      21: R3 = tanh (R1 * R3 + R4)
      22: R2 = R2 - R3
      23: R4 = asin (R1)
      24: R1 = sigmoid (-12.9007 * R1 + R4)
      25: R1 = R1 * cosh (R3)
      26: R2 = R1 + R2
      27: R2 = R2 * sin (1.16991)
      28: R2 = R2 * sin (-79.8879)
      29: R0 = 62.65 - R2
      return R0

As before, the only input value is time represented as the offset in calendar days from the start of the data.
R0, R1, etc. are registers which are initialized to the input value and get operated on by mathematical functions.
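To illustrate how such a register listing is evaluated, here is a small Python sketch (not the actual C++/OpenCL implementation). It assumes five registers, R0 through R4, since those are the only ones that appear in the listing, and it runs only the first two instructions. The rest would need rules for out-of-domain arguments (sqrt, log, asin) that aren't described here.

```python
import math

def run(program, t, n_registers=5):
    """Evaluate a register program for one input value t."""
    regs = [float(t)] * n_registers      # every register starts as the input
    for dest, op in program:
        regs[dest] = op(regs)            # each step overwrites one register
    return regs[0]                       # the listing returns R0

# First two instructions of the listing above:
#   0: R4 = R2 * cos (-81.5485)
#   1: R0 = 50.7399 - R4
first_two = [
    (4, lambda r: r[2] * math.cos(-81.5485)),
    (0, lambda r: 50.7399 - r[4]),
]

# t is the offset in calendar days from the start of the data.
values = [run(first_two, t) for t in range(3)]
```

Because R2 starts out equal to t, the result depends on time even though only constants appear in the listing.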

The fit to the data is comparable in closeness to the fit before, but the predicted future prices are very different.

The overall predicted direction for the fit before and this fit are both still up.

The prices and fitted curve with a parabolic, least squares trend of the fitted curve subtracted are:

Here, the detrended, fitted curve is pointing downward which is the opposite of the detrended, fitted curve from before.
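The detrending step itself can be sketched in a few lines of NumPy: fit a degree-2 polynomial to the fitted curve by least squares and subtract it. This is only an illustration with a synthetic curve; the function name detrend_parabolic and the numbers are made up.

```python
import numpy as np

def detrend_parabolic(t, fitted):
    """Subtract the least-squares parabola from a fitted curve."""
    coeffs = np.polyfit(t, fitted, deg=2)    # highest power first
    trend = np.polyval(coeffs, t)
    return fitted - trend, trend

# Synthetic stand-in for a fitted curve: parabola plus one 25-bar cycle.
t = np.arange(101, dtype=float)
curve = 60.0 + 0.05 * t + 0.001 * t**2 + 2.0 * np.sin(2.0 * np.pi * t / 25.0)

residual, trend = detrend_parabolic(t, curve)

# Sanity check: a pure parabola leaves (numerically) nothing behind.
flat, _ = detrend_parabolic(t, 1.0 + 2.0 * t + 3.0 * t**2)
```

What remains in residual is the oscillating part, which is what the detrended plots show.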

userque · Apr 19, 2021

ph1l said:
For the same data I used in this post, I tried to fit a curve using genetic programming.

The generated, fitted function for the close prices is

Code:

y =
       0: R4 = R2 * cos (-81.5485)
       1: R0 = 50.7399 - R4
       2: R4 = R4 * cos (0.199104)
       3: R3 = sqrt (R4)
       4: R4 = 41.8048 * cos (R3)
       5: R4 = 4.3255 / R4
       6: R4 = atan2 (R4 / R0)
       7: R0 = R0 * cos (87.6408)
       8: R0 = R4 + R0
       9: R4 = R4 * sin (-4.13973)
      10: R0 = R4 + R0
      11: R2 = -44.0889 * sinh (R0)
      12: R4 = abs (R0)
      13: R3 = log (R4)
      14: R4 = atan2 (R2 / 70.0589)
      15: R0 = R4 + R0
      16: R2 = asinh (R0)
      17: R1 = 13.707 * sin (R3)
      18: R1 = R1 * sin (-37.162)
      19: R4 = 1.39025 * sin (R4)
      20: R1 = R1 * cos (68.1254)
      21: R3 = tanh (R1 * R3 + R4)
      22: R2 = R2 - R3
      23: R4 = asin (R1)
      24: R1 = sigmoid (-12.9007 * R1 + R4)
      25: R1 = R1 * cosh (R3)
      26: R2 = R1 + R2
      27: R2 = R2 * sin (1.16991)
      28: R2 = R2 * sin (-79.8879)
      29: R0 = 62.65 - R2
      return R0

As before, the only input value is time represented as the offset in calendar days from the start of the data.
R0, R1, etc. are registers which are initialized to the input value and get operated on by mathematical functions.

The fit to the data is comparable in closeness to the fit before, but the predicted future prices are very different.
View attachment 257238
The overall predicted direction for the fit before and this fit are both still up.

The prices and fitted curve with a parabolic, least squares trend of the fitted curve subtracted are:
View attachment 257239
Here, the detrended, fitted curve is pointing downward which is the opposite of the detrended, fitted curve from before.

Thanks.

Looks like you're using a Python Library for the genetic programming?

I don't see where your function takes in any past closing prices as inputs, nor any other 'x' values as inputs; only constants.?! What is 'y' a function of ?????? The R variables are simply a way to simplify the notation for a long function.

I have a standalone genetic programming app. I'm tempted to run your data and see what pops out.

Did you hold out any data for validation? Doesn't seem likely with so little data.

ph1l · Apr 20, 2021

userque said:
Thanks.

Looks like you're using a Python Library for the genetic programming?

I don't see where your function takes in any past closing prices as inputs, nor any other 'x' values as inputs; only constants.?! What is 'y' a function of ?????? The R variables are simply a way to simplify the notation for a long function.

I have a standalone genetic programming app. I'm tempted to run your data and see what pops out.

Did you hold out any data for validation? Doesn't seem likely with so little data.

I wrote the genetic programming part in C++ and OpenCL. The calculations for the function are done in OpenCL with single-precision floating-point arithmetic. The controlling part is Perl and shell (Bash). The images are from gnuplot.

The only input to the function is time in the form of number of bars relative to the start of the data (0 through 88 calendar days for the example's data that was fitted). This allows the function to be applied for any time.

The attached inputData.csv has the input data with comma-separated format
<TICKER>,<DTYYYYMMDD>,<TIME>,<OPEN>,<HIGH>,<LOW>,<CLOSE>,<VOLUME>,<UNADJCLOSE>,<UNADJVOLUME>
Calendar days in the data when U.S. stock markets were closed are linearly-interpolated from the previous trading close price.
The candlestick chart has the <OPEN>,<HIGH>,<LOW>,<CLOSE> columns.
The function is fitted on the <CLOSE> column only.
The part of the fitted data, and of its parabolic least-squares trend, that extends past the candlesticks is the predicted data (12 bars).
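As an illustration of the gap filling, here is a tiny NumPy sketch. It assumes the closed days are interpolated in a straight line between the previous and the next trading close, which is one reading of the description above, and the day offsets and prices are made up.

```python
import numpy as np

# Calendar-day offsets on which the market was open, with their closes.
# Days 2 and 3 stand in for a weekend (hypothetical numbers).
trading_days = np.array([0, 1, 4])
closes = np.array([10.0, 11.0, 14.0])

# Every calendar day from the first to the last trading day.
all_days = np.arange(trading_days[0], trading_days[-1] + 1)

# Linear interpolation fills the closed days.
filled = np.interp(all_days, trading_days, closes)
print(filled)
```

This keeps the time axis evenly spaced in calendar days, so the bar number can be used directly as the function's input.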

The raw, fitted data including the extra 12 predicted bars is in the attached fitted.txt. This data looks like it has more precision than the actual data because Perl converts the single-precision floating-point values to double precision.
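The extra digits are easy to reproduce. The sketch below (Python rather than Perl, with a helper name of my own) round-trips a number through 32-bit storage and prints it as a double. The value 62.65 is just borrowed from the listing as a convenient example.

```python
import struct

def as_float32(x):
    """Store x as a 32-bit float, then read it back as a Python (64-bit) float."""
    return struct.unpack("f", struct.pack("f", x))[0]

price = 62.65
widened = as_float32(price)

# The widened value prints with many more digits, but they are not extra
# information: they are the rounding error of the single-precision value.
print(repr(price), repr(widened))
```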

The actual future data is in the attached unseendata.csv. Since this is recent data for an ETF, there isn't too much of it. This data wasn't used in any calculations or measurements.
