Forecasting stock prices with a feature fusion LSTM-CNN model using different representations

nooby_mcnoob · Dec 5, 2019

Another paper which says it will beat the returns of many active managers:

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0212320

I didn't really find anything super novel in this paper, except that I've tried to do exactly the same thing and failed. Obviously, I am stupid, but these guys included the parameters of their layers, so I will try to reproduce their results.

I don't know why they don't just include Jupyter notebooks. Ah well.

May I just say I absolutely love this website. All the important metadata about the paper is there with links. Much nicer than arXiv.

nooby_mcnoob · Dec 5, 2019

Oh shit, I figured out what they did differently: they used closing price + image, whereas I used OHLC + image. That makes a lot of sense.
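For illustration, here is a minimal sketch of what a "closing price + image" input pair could look like. It is not the paper's pipeline: the function names, the 8-pixel image height, and the one-pixel-per-bar line chart are all assumptions made up for this example. The idea is only that the same window of closes feeds a sequence branch (e.g. an LSTM) as numbers and an image branch (e.g. a CNN) as a rasterized chart.

```python
def normalize(closes):
    """Min-max scale a window of closing prices to [0, 1]."""
    lo, hi = min(closes), max(closes)
    span = hi - lo
    if span == 0:
        # Flat window: put every point in the middle of the range.
        return [0.5 for _ in closes]
    return [(c - lo) / span for c in closes]


def rasterize(closes, height=8):
    """Draw the window as a binary line chart, one pixel per bar.

    Row 0 is the top of the image, so the highest close lands in row 0.
    (Hypothetical helper; the image size is an arbitrary choice.)
    """
    scaled = normalize(closes)
    image = [[0] * len(closes) for _ in range(height)]
    for col, value in enumerate(scaled):
        row = (height - 1) - round(value * (height - 1))
        image[row][col] = 1
    return image


def make_sample(closes, height=8):
    """Bundle both representations of the same price window."""
    return {
        "sequence": normalize(closes),       # input for the sequence branch
        "image": rasterize(closes, height),  # input for the image branch
    }


closes = [10.0, 10.5, 10.2, 11.0, 10.8]
sample = make_sample(closes)
for row in sample["image"]:
    print("".join("#" if px else "." for px in row))
```

Swapping in OHLC would mean four numbers per bar in the sequence branch instead of one; the image branch would stay the same.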

IAS_LLC · Dec 6, 2019

This:
"Forecasting stock prices is an attractive pursuit for investors and researchers who want to beat the stock market. However, forecasting stock prices is difficult."

nooby_mcnoob · Dec 6, 2019

IAS_LLC said:
This:
"Forecasting stock prices is an attractive pursuit for investors and researchers who want to beat the stock market. However, forecasting stock prices is difficult."

I don't want to forecast, just want to pattern match an OHLC time series. I am not sure if I would ever make use of it, but why not try :-)

IAS_LLC · Dec 6, 2019

I too have attempted using a ConvNet-LSTM stack (didn't care for it... LSTM is too ad-hoc for my tastes), but I didn't run images through it... Not sure why you would when you have a more compact representation in the raw data that built the chart in the first place.

nooby_mcnoob · Dec 6, 2019

IAS_LLC said:
I too have attempted using a ConvNet-LSTM stack (didn't care for it... LSTM is too ad-hoc for my tastes), but I didn't run images through it... Not sure why you would when you have a more compact representation in the raw data that built the chart in the first place.

What do you mean by LSTM is too ad-hoc? All of deep learning is ad-hoc IMO

nooby_mcnoob · Dec 6, 2019

IAS_LLC said:
I too have attempted using a ConvNet-LSTM stack (didn't care for it... LSTM is too ad-hoc for my tastes), but I didn't run images through it... Not sure why you would when you have a more compact representation in the raw data that built the chart in the first place.

Also, they say that you experience degradation when your stack is too deep, and they had a way around it by short-circuiting (a shortcut connection). I never tried that, and I don't exactly understand it either, so that's another rabbit hole. From the paper:

A degradation problem may occur even if the network is deeply piled up. To solve this problem, He et al. [26] used a shortcut connection for residual learning, as shown in Fig 6b. In the case of a shortcut connection, the input X is mapped to the feature F(X) through the activation function without going through the weight layer.

[...] (Eq 5), as follows, where X is an input matrix and F(X) and H(X) are output matrices:

H(X) = F(X) + X    (5)

Setting the residual to zero makes the optimization easier. This method can solve the problem of degradation due to the deepening of the network [26].
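A toy sketch may make the shortcut idea less of a rabbit hole. It uses the usual residual form from He et al., H(X) = F(X) + X, on plain Python lists. The two-layer shape of F, the helper names, and the weights are assumptions for illustration, not the paper's layer configuration, and the activation that He et al. apply after the addition is left out to keep it short.

```python
def relu(v):
    """Element-wise ReLU on a vector."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Compute H(X) = F(X) + X, with F(X) = W2 * relu(W1 * X).

    The weight layers only have to learn the residual F(X);
    the input X reaches the output directly via the shortcut.
    """
    f = matvec(W2, relu(matvec(W1, x)))       # F(X): the learned residual
    return [fi + xi for fi, xi in zip(f, x)]  # shortcut: add X back


x = [1.0, -2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With all weights at zero the residual F(X) is zero, so the block
# passes its input through unchanged: H(X) = X.
print(residual_block(x, zeros, zeros))
```

This is the sense in which "setting the residual to zero" is easy: an extra block can fall back to the identity mapping instead of having to learn it through its weight layers, so stacking more blocks should not make the network worse.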

IAS_LLC · Dec 6, 2019

nooby_mcnoob said:
What do you mean by LSTM is too ad-hoc? All of deep learning is ad-hoc IMO

I agree with you.

But I believe there are better solutions for time series modeling than LSTM that are grounded in system-id and Bayesian estimation (at least for my purposes). HMMs are my go-to.
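To make the HMM suggestion concrete, here is a minimal sketch of the forward algorithm for a two-state hidden Markov model over discrete observations. The "bull"/"bear" reading of the states, the up/down coding of the observations, and every probability below are made-up assumptions for illustration, not a fitted model.

```python
import math


def forward(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM.

    obs   : list of observation indices
    start : start[s] = P(state_0 = s)
    trans : trans[p][s] = P(state_t = s | state_{t-1} = p)
    emit  : emit[s][o] = P(obs = o | state = s)

    Returns the filtered state probabilities after the last observation
    and the log-likelihood of the whole sequence.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Predict with the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        norm = sum(alpha)
        loglik += math.log(norm)           # accumulate log P(obs_t | past)
        alpha = [a / norm for a in alpha]  # rescale to avoid underflow
    return alpha, loglik


# Hypothetical two-regime model: state 0 = "bull", state 1 = "bear";
# observation 0 = up day, 1 = down day.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.7, 0.3], [0.3, 0.7]]

filtered, loglik = forward([0, 0, 1, 0], start, trans, emit)
print(filtered, loglik)
```

The filtered probabilities are a Bayesian estimate of the current hidden regime given everything seen so far, which is the kind of grounding in estimation theory the post is pointing at.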

Also, the stated advantage of LSTM with regard to vanishing gradients is solved in conventional ANNs using a leaky ReLU.
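A toy illustration of the leaky ReLU point, under strong simplifying assumptions: the weights are ignored and only the activation derivatives along one path are multiplied. The function names, the 0.01 slope, and the pre-activation values are arbitrary choices for this sketch.

```python
def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def leaky_relu_grad(z, slope=0.01):
    """Derivative of leaky ReLU: 1 for positive inputs, a small slope otherwise."""
    return 1.0 if z > 0 else slope


def chain_gradient(pre_activations, grad_fn):
    """Multiply activation derivatives along one path through the layers."""
    g = 1.0
    for z in pre_activations:
        g *= grad_fn(z)
    return g


# One negative pre-activation anywhere on the path zeroes the plain ReLU
# gradient, while the leaky version lets a small gradient through.
zs = [0.5, -1.0, 2.0]
print(chain_gradient(zs, relu_grad))        # 0.0
print(chain_gradient(zs, leaky_relu_grad))  # about 0.01
```

The gradient still shrinks by the slope factor at every negative unit, so this shows the gradient being kept alive rather than kept large.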

nooby_mcnoob · Dec 6, 2019

IAS_LLC said:
I agree with you.

But I believe there are better solutions for time series modeling than LSTM that are grounded in system-id and Bayesian estimation (at least for my purposes). HMMs are my go-to.

Also, the stated advantage of LSTM with regard to vanishing gradients is solved in conventional ANNs using a leaky ReLU.

I can definitely agree that LSTM isn't the best solution for our problem. What do you mean by system-id and HMM?

nooby_mcnoob · Dec 6, 2019

nooby_mcnoob said:
I can definitely agree that LSTM isn't the best solution for our problem. What do you mean by system-id and HMM?

Oh, Markov models.