Fitting ARIMA-GARCH model on Forex timeseries

Delpi · Mar 12, 2023

Hi, I'm building a trading strategy based on an ARIMA-GARCH model that makes one-step-ahead predictions on the forex market (it's quite a simple strategy, I know there's a high probability that it won't work profitably, I'm just trying to learn new stuff), but I'm not sure how long the time series should be to properly fit a model like that. I tried the last 10 years (1-hour timeframe, in other words 60,000 prices), but I think that's too much. Can you help me?

traider · Mar 12, 2023

Fit more to ensure no overfitting

Real Money · Mar 12, 2023

IMO, it's much better to combine a VWAP and residual based model.

Get the CME futures data.

longandshort · Mar 12, 2023

Arima garch not a good fit for raw fx data lol.

use garch to estimate covariance variance matrix

but the real drivers of fx (the channel range) are time varying real rate differentials, bop imbalances, and relative economic activity.

Kevin Schmit · Mar 12, 2023

Delpi said:
Hi, I'm building a trading strategy based on an ARIMA-GARCH model ...Can you help me?

Yes, probably. But you need to give us more information. What language are you operating in? What is the order of your ARIMA model, and how did you determine it? Assuming GARCH(1,1), what variety of GARCH are you using? Are you modeling in levels or in differences? Are you modeling in pairs (e.g. EURUSD) or in currencies (e.g. EUR)?

60k+ observations is likely adequate; it is not "too long." One thing to keep in mind is that the ARIMA-GARCH process itself (particularly in the mean (arima) model half) may be non-stationary, so the coefficients may vary over your ten-year time frame.

I know there's a high probability that it won't work profitably, I'm just trying to learn new stuff

I think this is a good idea. Fitting classic timeseries methods to fx returns or levels is worth your time and effort. Following Box, Jenkins, Engle, and Bollerslev is likely much more profitable than listening to the average ET poster, so be careful about taking the advice you get here. ARIMA-GARCH models on FX single currency indices aren't that bad, even on the one hour sampling frame. Almost always Theil's U well under one and highly significant vs. naive GBM fit.
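For readers unfamiliar with Theil's U, here is a minimal sketch of one common form: the ratio of the model's RMSE to the RMSE of a naive benchmark, so a value below one means the model beats the benchmark. The function name `theils_u` and the toy numbers are illustrative assumptions, and a no-change (random walk) forecast stands in for the naive fit.

```python
import numpy as np

def theils_u(actual, forecast, naive):
    """Model RMSE divided by benchmark RMSE; below 1 means the model wins."""
    actual = np.asarray(actual, dtype=float)
    model_rmse = np.sqrt(np.mean((np.asarray(forecast, dtype=float) - actual) ** 2))
    naive_rmse = np.sqrt(np.mean((np.asarray(naive, dtype=float) - actual) ** 2))
    return model_rmse / naive_rmse

# Toy log prices (made up for illustration).
log_price = np.array([0.100, 0.102, 0.101, 0.104, 0.103, 0.105])
actual = log_price[1:]
naive = log_price[:-1]                      # no-change benchmark: next value = last value
# Artificial "forecasts" that close half of each gap. They are built from the
# actual values purely to make the arithmetic obvious; a real forecast could not do this.
forecast = naive + 0.5 * (actual - naive)

u = theils_u(actual, forecast, naive)
print(round(u, 3))
```

Because every artificial forecast error is exactly half the naive error, the ratio comes out at one half here. On real data both sets of forecasts should be produced out of sample.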

longandshort · Mar 13, 2023

Kevin Schmit said:
I think this is a good idea. Fitting classic timeseries methods to fx returns or levels is worth your time and effort. Following Box, Jenkins, Engle, and Bollerslev is likely much more profitable than listening to the average ET poster, so be careful about taking the advice you get here. ARIMA-GARCH models on FX single currency indices aren't that bad, even on the one hour sampling frame. Almost always Theil's U well under one and highly significant vs. naive GBM fit.

Wouldn’t he want to convert raw prices into returns first? Also, wondering what you think about arima on returns within a BEER constrained channel.

Kevin Schmit · Mar 13, 2023

longandshort said:
Wouldn’t he want to convert raw prices into returns first?

That depends on the answer to my other questions. He might want to at least convert his price series to log or Isometric log ratio (fx rates are compositional, price/value matrix is transitive and ratio symmetric, and rank deficient by 1) values and then either difference manually (equivalent to log returns) or let arima do the differencing by not setting d (in arima(p,d,q)) order to zero. Although you'd think the result would be the same, with most software (assuming OP is not writing his arima routine from scratch, though that would be a good exercise) that is not the case. In R arima will initialize using a diffuse prior if d is fit, also it will not include an intercept, which it will include if d is not fit. Other implementations in other languages will have their own idiosyncrasies.
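A minimal sketch of the manual route, assuming made-up hourly closes: take logs, difference once, and confirm that the result is the log-return series. The differenced series could then be passed to an ARMA fit with d set to zero. For what it is worth, statsmodels' `ARIMA` documents a comparable default to the R behaviour described here (a constant is included only when the model has no integration), but check the documentation of whatever version you run.

```python
import numpy as np

# Made-up hourly closing prices for illustration.
prices = np.array([1.0650, 1.0662, 1.0655, 1.0671, 1.0668])

log_prices = np.log(prices)
log_returns = np.diff(log_prices)                  # manual first difference (d = 1)

# Differencing log prices is the same as taking log price ratios.
ratio_returns = np.log(prices[1:] / prices[:-1])
print(np.allclose(log_returns, ratio_returns))

# A forecast made on returns maps back to a level by undoing the difference.
rebuilt = log_prices[0] + np.cumsum(log_returns)
```

Doing the differencing outside the model also makes it easy to drop or adjust specific observations (weekend gaps, for instance) before the fit.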

Also, wondering what you think about arima on returns within a BEER constrained channel.

For the OP and his one hour ahead forecasts? Probably wouldn't bother. At that time frame I would expect returns to be dominated by short-term flows and nominal interest rate movements, the major components of BEER will have little impact. For my own fx models, I generally will decompose first into single currency indices and then decompose those in the frequency domain into various intrinsic mode functions (IMF's) using something similar to VMD (variational mode decomposition). Since my forecast horizon is roughly 10 hours (Asia, Europe, North America), and the total variance is dominated by the higher frequency IMF's, I generally don't bother forecasting the slower IMF's. So relevant features for me don't include BOP, differential inflation, terms of trade, or any other macro variables.

If you are forecasting longer term, then yes, arima with a BEER or even PEER channel sounds like a good idea. I'd do it in levels, though, over single currency indices transformed by Isometric Log Ratio.

Edit: also, doing the differencing himself rather than letting arima do it would give the OP more control over how he does the differencing.

Delpi · Mar 13, 2023

Kevin Schmit said:
Yes, probably. But you need to provide us more information. Are you modeling in levels or in differences? Are you modeling in pairs (e.g EURUSD) or in currencies (e.g. EUR)?

60k+ observations is likely adequate; it is not "too long." One thing to keep in mind is that the ARIMA-GARCH process itself (particularly in the mean (arima) model half) may be non-stationary, so the coefficients may vary over your ten-year time frame.

I think this is a good idea. Fitting classic timeseries methods to fx returns or levels is worth your time and effort. Following Box, Jenkins, Engle, and Bollerslev is likely much more profitable than listening to the average ET poster, so be careful about taking the advice you get here. ARIMA-GARCH models on FX single currency indices aren't that bad, even on the one hour sampling frame. Almost always Theil's U well under one and highly significant vs. naive GBM fit.

What language are you operating in?

I'm using Python, but in some parts of the script I've added a bit of R (only to find the best orders using the lowest AIC method, because it's easier in R than in Python).

What is the order of your ARIMA model and how did you determine that. Assuming GARCH(1,1), what variety of of GARCH are you using.

I tried several models and used the AIC to determine the best one. Actually, I'm only using the classic GARCH(1, 1), I haven't introduced more complex models yet.

In this table you can find the pairs I'm modelling with the relative ARIMA orders; I don't use returns (which should be stationary) but the closing prices, and I let the model difference the series.
I think it's off-topic, but anyway: I don't use the prices "as is"; I multiply them before use (e.g. for EURUSD I use a 1000x multiplier) because the variations in prices are too small and I had problems with the GARCH fitting (the error was "Inequality constraints incompatible").
The following is the function I use to make the one-step-ahead prediction:

Code:

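import arch
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def forecasting(data, distrib, order):
    # Mean model: fit ARIMA(p, d, q) and forecast the next value.
    arimafit = ARIMA(data, order=order).fit()
    resids = arimafit.resid
    pred_mu = float(np.asarray(arimafit.forecast(steps=1))[0])
    # Volatility model: GARCH(1, 1) on the ARIMA residuals.
    archmodel = arch.arch_model(resids, p=1, q=1, vol="GARCH", dist=distrib, rescale=False)
    archfit = archmodel.fit()
    forc = archfit.forecast(horizon=1)
    # Mean forecast of the residual model (arch_model defaults to a constant mean).
    pred_et = forc.mean["h.1"].iloc[-1]
    # One-step-ahead prediction: ARIMA forecast plus the forecast residual mean.
    return pred_mu + pred_et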
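A note on what the GARCH half contributes: `arch_model` is called above without a `mean` argument, so it uses its default constant-mean model, and the mean forecast of the residuals is just that fitted constant. The information a GARCH(1,1) adds is the one-step-ahead variance. The sketch below runs that recursion by hand; the parameter values and residuals are hypothetical and not fitted to anything.

```python
import numpy as np

def garch11_next_variance(resid, omega, alpha, beta):
    """Run sigma2[t+1] = omega + alpha * e[t]**2 + beta * sigma2[t] and
    return the variance forecast for the step after the last residual."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.var(resid)                 # start the recursion at the sample variance
    for e in resid:
        sigma2 = omega + alpha * e ** 2 + beta * sigma2
    return sigma2

# Hypothetical residuals (in scaled price units) and hypothetical parameters.
resid = np.array([0.10, -0.20, 0.05, 0.30, -0.10])
next_var = garch11_next_variance(resid, omega=0.01, alpha=0.10, beta=0.85)

# The variance forecast gives a band around a point forecast, not a new point forecast.
pred_mu = 1065.2                           # placeholder ARIMA forecast, made up
band = 1.96 * np.sqrt(next_var)
print(pred_mu - band, pred_mu + band)
```

In the `arch` package the fitted counterpart of `next_var` is available from the forecast object's `variance` attribute, which can be used for position sizing or a prediction interval.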

longandshort · Mar 13, 2023

Really appreciate the follow up!

Kevin Schmit said:
That depends on the answer to my other questions. He might want to at least convert his price series to log or Isometric log ratio (fx rates are compositional, price/value matrix is transitive and ratio symmetric, and rank deficient by 1) values and then either difference manually (equivalent to log returns) or let arima do the differencing by not setting d (in arima(p,d,q)) order to zero. Although you'd think the result would be the same, with most software (assuming OP is not writing his arima routine from scratch, though that would be a good exercise) that is not the case. In R arima will initialize using a diffuse prior if d is fit, also it will not include an intercept, which it will include if d is not fit. Other implementations in other languages will have their own idiosyncrasies.

Would you fit in Kalman filtering on your arima or do you prefer to analyze non-smoothed/filtered timeseries?

For the OP and his one hour ahead forecasts? Probably wouldn't bother. At that time frame I would expect returns to be dominated by short-term flows and nominal interest rate movements, the major components of BEER will have little impact. For my own fx models, I generally will decompose first into single currency indices and then decompose those in the frequency domain into various intrinsic mode functions (IMF's) using something similar to VMD (variational mode decomposition). Since my forecast horizon is roughly 10 hours (Asia, Europe, North America), and the total variance is dominated by the higher frequency IMF's, I generally don't bother forecasting the slower IMF's. So relevant features for me don't include BOP, differential inflation, terms of trade, or any other macro variables.

If you are forecasting longer term, then yes, arima with a BEER or even PEER channel sounds like a good idea. I'd do it in levels, though, over single currency indices transformed by Isometric Log Ratio.

Edit: also, doing the differencing himself rather than letting arima do it would give the OP more control over how he does the differencing.

Very helpful, thank you for that. I agree that decomposing any sort of model or price is the most useful first step. I bring up BEER (or really any longer term economic model) mainly as a proxy for economic analysis. Are you trading around informational catalysts or prefer something closer to trend following? Or something else!

Kevin Schmit · Apr 9, 2023

longandshort said:
Would you fit in Kalman filtering on your arima or do you prefer to analyze non-smoothed/filtered time series?

I don't actually use ar(i)ma, or even arima-X in my analysis as I find that kind of model too restrictive. I think that arma might be a good model for the OP to experiment with, even if it might only, on his hourly fx data, put out statistically significant but not economically significant (i.e. tradable) results. I would drop the i and just do an arma model. Difference or pseudo-difference outside the model and then run arma. In arima, the d term can bleed into p and q as compensation for slight (non-integer) over- or under-differencing. I would also drop the GARCH and instead run an instantaneous RV estimation via over-lapping five minute bars or realized kernels. Then use those vols, sampled hourly and adjusted for daily and weekend seasonals, to deflate his hourly returns (differenced) series (turn that series into standard scores). Then run the arma on that ss series.
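A rough sketch of the deflation step, under simplifying assumptions: simulated five-minute log returns, non-overlapping bars rather than overlapping bars or realized kernels, no daily or weekend seasonal adjustment, and a trailing six-hour window (a hypothetical choice) so that each hour is scaled only by volatility observed before it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hours, bars_per_hour = 48, 12            # twelve 5-minute bars per hour

# Simulated 5-minute log returns whose volatility doubles on the second day.
true_vol = np.repeat([0.0002, 0.0004], (n_hours // 2) * bars_per_hour)
r5 = rng.normal(0.0, true_vol)

bars = r5.reshape(n_hours, bars_per_hour)
hourly_ret = bars.sum(axis=1)              # hourly log return
hourly_rvar = (bars ** 2).sum(axis=1)      # realized variance within each hour

# Trailing volatility estimate that uses only hours strictly before hour t.
window = 6
vol = np.full(n_hours, np.nan)
for t in range(window, n_hours):
    vol[t] = np.sqrt(hourly_rvar[t - window:t].mean())

# Standard scores: hourly returns deflated by the volatility estimate.
z = hourly_ret / vol
```

The first `window` entries of `z` are NaN because no volatility estimate exists yet. The remaining values are the standard-score series an ARMA model would then be fitted to.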

I would not further smooth the series except to possibly bring in outliers.

I might recommend a Kalman filter to estimate the time-varying coefficients of his arma model. It is unlikely that the arma model will be stationary and, because it is easy to recast arma models in state-space representation, Kalman is a good way to get a decent estimate of time-current coefficients.
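To make the idea concrete, here is a minimal sketch for the simplest case, an AR(1) whose coefficient follows a random walk: y[t] = phi[t] * y[t-1] + noise and phi[t] = phi[t-1] + drift. The function name, the noise variances `q` and `r`, and the simulated data are all assumptions for illustration. A full ARMA(p, q) needs a vector state and is not shown.

```python
import numpy as np

def kalman_tv_ar1(y, q=1e-4, r=1.0, phi0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk AR(1) coefficient.
    q is the state (coefficient drift) variance, r the observation noise variance."""
    y = np.asarray(y, dtype=float)
    phi, p = phi0, p0
    path = np.empty(len(y) - 1)
    for t in range(1, len(y)):
        p = p + q                          # predict: the coefficient may have drifted
        h = y[t - 1]                       # the regressor acts as the observation matrix
        s = h * p * h + r                  # innovation variance
        k = p * h / s                      # Kalman gain
        phi = phi + k * (y[t] - h * phi)   # update with the one-step forecast error
        p = (1.0 - k * h) * p
        path[t - 1] = phi
    return path

# Simulated series whose AR coefficient switches from 0.6 to -0.3 halfway through.
rng = np.random.default_rng(1)
n = 600
true_phi = np.where(np.arange(n) < n // 2, 0.6, -0.3)
y = np.zeros(n)
for t in range(1, n):
    y[t] = true_phi[t] * y[t - 1] + rng.normal()

phi_path = kalman_tv_ar1(y, q=1e-3, r=1.0)
```

The ratio of `q` to `r` controls how quickly the estimate forgets old data: a larger `q` tracks changes faster but gives noisier coefficients.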

For my own work I might do a great deal of smoothing or denoising on both the left and right hand side of my models -- but different forms/methods for each side.

For time-varying parameters, weightings, and coefficients I usually prefer kernel methods to Kalman or similar state-space filters (though I may find myself at times using Godambe-style estimating functions). Kalman is essentially an exponential kernel, which I have found to be sub-optimal and to lack usable escorts (for cross validation -- see Hong 2017). Also Kalman does not play well with intercept adjustment as an ad-hoc band-aid for structural breaks.
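As an illustration of what a kernel-weighted coefficient estimate can look like (not necessarily the method used here, and without the cross-validation step), the sketch below estimates an AR(1) coefficient at time t by weighted least squares over past observations. The function name, the bandwidth, and the two kernel shapes are assumptions. The exponential option is included because it gives the geometric down-weighting that the comparison with Kalman refers to.

```python
import numpy as np

def kernel_ar1(y, t, bandwidth, kernel="gaussian"):
    """One-sided kernel-weighted least-squares estimate of an AR(1) coefficient at time t."""
    y = np.asarray(y, dtype=float)
    s = np.arange(1, t + 1)                # observations up to and including time t
    age = t - s                            # how far in the past each observation lies
    if kernel == "gaussian":
        w = np.exp(-0.5 * (age / bandwidth) ** 2)
    elif kernel == "exponential":
        w = np.exp(-age / bandwidth)
    else:
        raise ValueError("kernel must be 'gaussian' or 'exponential'")
    x, target = y[s - 1], y[s]
    return np.sum(w * x * target) / np.sum(w * x * x)

# Noise-free check series: y[t] = 0.5 * y[t-1], so any weighting should recover 0.5.
y = 0.5 ** np.arange(30)
print(kernel_ar1(y, t=29, bandwidth=5.0))
print(kernel_ar1(y, t=29, bandwidth=5.0, kernel="exponential"))
```

On real data the two kernels give different estimates, and the bandwidth would normally be chosen by some form of out-of-sample validation.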

I agree that decomposing any sort of model or price is the most useful first step.

I do a lot of decomposition, of all different kinds. For the OP, I suggest that he first decompose the forex pairs into their constituent parts (numerator and denominator or left and right CCY) and analyze the 9 CCY's in his table individually instead of via an oddly weighted sample of the 36 cross pairs.
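One way to sketch the pair-to-currency decomposition, assuming three currencies and made-up quotes instead of the full set of 9: write each log rate as the log value of the base currency minus the log value of the quote currency, then solve the resulting linear system. The system is rank deficient by one (adding a constant to every currency leaves all pairs unchanged), and the minimum-norm least-squares solution picks the version whose values sum to zero.

```python
import numpy as np

currencies = ["EUR", "USD", "JPY"]
# Made-up quotes; pair "XXXYYY" is the price of one XXX in YYY.
pairs = {"EURUSD": 1.0800, "USDJPY": 150.00, "EURJPY": 162.00}

idx = {c: i for i, c in enumerate(currencies)}
design = np.zeros((len(pairs), len(currencies)))
log_rates = np.zeros(len(pairs))
for row, (pair, rate) in enumerate(pairs.items()):
    base, quote = pair[:3], pair[3:]
    design[row, idx[base]] = 1.0           # log rate = log value(base) - log value(quote)
    design[row, idx[quote]] = -1.0
    log_rates[row] = np.log(rate)

# lstsq returns the minimum-norm solution for a rank-deficient system,
# which here means the log values are centred to sum to zero.
values, *_ = np.linalg.lstsq(design, log_rates, rcond=None)
for c in currencies:
    print(c, round(values[idx[c]], 4))
```

Repeating this at every timestamp gives one series per currency, and any pair can be recovered as the difference of two of them. The centring is similar in spirit to a centred log-ratio; the isometric log-ratio transform mentioned earlier in the thread is not shown.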

I would also suggest that he do the above suggested decomposition into vol and standard score series iteratively via alternating principal axis factoring (of the 9 CCY's) as, for example, EUR may be a combination of several latent factors, each having its own time-varying vol. Since the ss isolation depends on the vol isolation and vice-versa, an alternating optimization is called for.

Are you trading around informational catalysts or prefer something closer to trend following? Or something else!

My models are fairly standard classification and regression models. I have a lhs of approximately normally distributed real numbers (regression) or labels (classification) dv's, and a right hand side design matrix of mostly real numbers.

The dv's are almost always some transformation of 1 period ahead returns or some surrogate thereof. My forecast horizon is short, basically a third of a day (asia, europe, north america).

My go-to models are ensembles of something similar to separable nlls, and its classification analogs, with a strong linear part and a heavily constrained/regularized non-linear part. I also use a form of sweep (Little and Rubin 1987) to handle time-varying number of IV's and real-time missing IV's.

My two biggest secrets which aren't really secrets since everyone knows of them (except on ET) are:

1) p >> n
2) time-varying parameters and coefficients