Fitting ARIMA-GARCH model on Forex timeseries

traider · Apr 9, 2023

Kevin Schmit said:
I don't actually use ar(i)ma, or even arima-X, in my analysis, as I find that kind of model too restrictive. I think that arma might be a good model for the OP to experiment with, even if, on his hourly fx data, it might only put out statistically significant but not economically significant (i.e. tradable) results. I would drop the i and just do an arma model: difference or pseudo-difference outside the model and then run arma. In arima, the d term can bleed into p and q as compensation for slight (non-integer) over- or under-differencing. I would also drop the GARCH and instead run an instantaneous RV estimation via overlapping five-minute bars or realized kernels. Then use those vols, sampled hourly and adjusted for daily and weekend seasonals, to deflate his hourly returns (differenced) series, i.e. turn that series into standard scores. Then run the arma on that ss series.
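A rough sketch of the vol-deflation step described above, not the poster's actual code. It assumes 1-minute log prices for each hour, averages realized variance over the five offset grids of five-minute returns (one common way to use overlapping bars), and treats the seasonal adjustment as a multiplicative vol factor. The function names and the `seasonal` argument are made up for illustration.

```python
import math

def subsampled_rv(log_prices, step=5):
    """Realized variance from overlapping `step`-minute returns.

    log_prices: 1-minute log prices for one hour (e.g. 61 points).
    Each offset grid gives its own RV estimate; shorter grids are
    rescaled to the full number of intervals, then all are averaged.
    """
    n_full = (len(log_prices) - 1) // step
    estimates = []
    for offset in range(step):
        grid = log_prices[offset::step]
        rets = [b - a for a, b in zip(grid, grid[1:])]
        if rets:
            estimates.append(sum(r * r for r in rets) * n_full / len(rets))
    return sum(estimates) / len(estimates)

def standard_scores(hourly_returns, hourly_rv, seasonal=None):
    """Deflate hourly returns by sqrt(RV) times a seasonal vol factor.

    `seasonal` is an assumed multiplicative adjustment (1.0 = none).
    """
    if seasonal is None:
        seasonal = [1.0] * len(hourly_returns)
    return [r / (math.sqrt(v) * s)
            for r, v, s in zip(hourly_returns, hourly_rv, seasonal)]
```

The resulting standard-score series is what would then be passed to whatever arma fitting routine is in use.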

I would not further smooth the series except to possibly bring in outliers.

I might recommend a Kalman filter to estimate the time-varying coefficients of his arma model. It is unlikely that the arma model will be stationary and, because it is easy to recast arma models in state-space representation, Kalman is a good way to get a decent estimate of time-current coefficients.
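To make the state-space idea concrete, here is a minimal sketch for the simplest case only: an AR(1) whose single coefficient follows a random walk. The model, the noise variances `q` and `r`, and the starting values are all illustrative assumptions; a full arma model would need a vector state.

```python
def kalman_tv_ar1(y, q=1e-4, r=1e-6, phi0=0.0, p0=1.0):
    """Filtered path of a time-varying AR(1) coefficient.

    Assumed model (illustrative):
        y[t]   = phi[t] * y[t-1] + e[t],   Var(e) = r
        phi[t] = phi[t-1] + w[t],          Var(w) = q
    """
    phi, p = phi0, p0
    path = []
    for prev, cur in zip(y, y[1:]):
        p += q                         # predict: coefficient variance grows
        s = prev * p * prev + r        # innovation variance
        k = p * prev / s               # Kalman gain
        phi += k * (cur - prev * phi)  # update with the one-step forecast error
        p *= (1.0 - k * prev)          # posterior variance
        path.append(phi)
    return path
```

Larger `q` lets the coefficient move faster; `q = 0` reduces this to recursive least squares.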

For my own work I might do a great deal of smoothing or denoising on both the left and right hand side of my models -- but different forms/methods for each side.

For time-varying parameters, weightings, and coefficients I usually prefer kernel methods to Kalman or similar state-space filters (though I may find myself at times using Godambe-style estimating functions). Kalman is essentially an exponential kernel, which I have found to be sub-optimal and to lack usable escorts (for cross validation -- see Hong 2017). Also Kalman does not play well with intercept adjustment as an ad-hoc band-aid for structural breaks.
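As a toy illustration of swapping the kernel, the sketch below estimates an AR(1) coefficient at time `t` by one-sided kernel-weighted least squares, with the weighting function passed in. The two kernels and their `half_life` and `bandwidth` parameters are assumptions for the example, not the poster's choices.

```python
import math

def exponential_kernel(lag, half_life=50.0):
    """Weight halves every `half_life` observations into the past."""
    return 0.5 ** (lag / half_life)

def gaussian_kernel(lag, bandwidth=50.0):
    """One-sided Gaussian weight on past observations."""
    return math.exp(-0.5 * (lag / bandwidth) ** 2)

def kernel_ar1(y, t, kernel):
    """Kernel-weighted least-squares AR(1) coefficient using data up to t."""
    num = den = 0.0
    for s in range(1, t + 1):
        w = kernel(t - s)
        num += w * y[s] * y[s - 1]
        den += w * y[s - 1] ** 2
    return num / den
```

For example, `kernel_ar1(y, len(y) - 1, gaussian_kernel)` gives the current estimate; the kernel shape and width are what cross validation would then choose.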

I do a lot of decomposition, of all different kinds. For the OP, I suggest that he first decompose the forex pairs into their constituent parts (numerator and denominator or left and right CCY) and analyze the 9 CCY's in his table individually instead of via an oddly weighted sample of the 36 cross pairs.
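One simple way to do this decomposition, sketched under stated assumptions: treat each pair's log return as base-currency return minus quote-currency return, require every unordered pair to appear exactly once, and pin down the otherwise unidentified level by forcing the currency returns to sum to zero. Under those assumptions the least-squares solution has a closed form. The function name and the normalization are illustrative.

```python
def currency_factors(pair_returns):
    """Split pair log returns into per-currency returns.

    pair_returns: {(base, quote): log_return}, one entry per unordered pair.
    Assumes r(base/quote) = c[base] - c[quote] and sum(c) == 0, which gives
    c[i] = (sum of signed pair returns involving i) / number_of_currencies.
    """
    currencies = sorted({c for pair in pair_returns for c in pair})
    n = len(currencies)
    totals = dict.fromkeys(currencies, 0.0)
    for (base, quote), r in pair_returns.items():
        totals[base] += r    # currency i appears as numerator
        totals[quote] -= r   # currency i appears as denominator
    return {c: totals[c] / n for c in currencies}
```

With 9 currencies this turns the 36 cross-pair series into 9 single-currency series, each of which can then be modeled on its own.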

I would also suggest that he do the above suggested decomposition into vol and standard score series iteratively via alternating principal axis factoring (of the 9 CCY's) as, for example, EUR may be a combination of several latent factors, each having its own time-varying vol. Since the ss isolation depends on the vol isolation and vice-versa, an alternating optimization is called for.

My models are fairly standard classification and regression models. I have a lhs of approximately normally distributed real numbers (regression) or labels (classification) dv's, and a right hand side design matrix of mostly real numbers.

The dv's are almost always some transformation of 1 period ahead returns or some surrogate thereof. My forecast horizon is short, basically a third of a day (asia, europe, north america).

My go-to models are ensembles of something similar to separable nlls, and its classification analogs, with a strong linear part and a heavily constrained/regularized non-linear part. I also use a form of sweep (Little and Rubin 1987) to handle time-varying number of IV's and real-time missing IV's.

My two biggest secrets which aren't really secrets since everyone knows of them (except on ET) are:

1) p >> n
2) time-varying parameters and coefficients

Have you had success with HMM? They provide an alternative to 2), except the switching is more discrete.

Kevin Schmit · Apr 10, 2023

traider said:
Have you had success with HMM? They provide an alternative to 2), except the switching is more discrete.

Not for what I am doing now. I rely on deeply nested leave-block-out cross validation and generalized (rotated into complex plane) LBO and LFO cross validation that rely on clever tricks with additive hat/annihilator matrices to run in anything approaching real-time. I have not been able to figure out how to fit HMM's/HCRF's into that structure.
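For readers unfamiliar with hat-matrix shortcuts, here is the textbook leave-one-out case only, for a straight-line fit: the held-out residual equals the in-sample residual divided by (1 - h_ii), so no refitting is needed. This is a sketch of the basic identity, not the nested leave-block-out or complex-plane scheme described above, and the function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b

def loo_residuals(x, y):
    """Leave-one-out residuals from a single fit via hat-matrix leverages."""
    n = len(x)
    a, b = fit_line(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    out = []
    for xi, yi in zip(x, y):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx   # diagonal of the hat matrix
        out.append((yi - (a + b * xi)) / (1.0 - h))
    return out
```

The same idea extends to blocks of observations, where the scalar (1 - h_ii) becomes a small matrix that has to be inverted per block.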

Have used HMM, HMRFM (Hidden Markov Random Field Model), HCRFM (Hidden Conditional Random Field Model) extensively in the past. Most recently for basket pairs trading, modeling the baskets over three states: cointegrating, nearly cointegrating, and not-cointegrating, and then trading an SVECM on the projected error-correcting SN's within the basket only while in the cointegrating regime. Also some denoising of the SN series prior to fitting the SVECM.