Quote from comintel:
According to Wikipedia:
"Broadly speaking, StatArb is actually any strategy that is bottom-up, beta-neutral in approach and uses statistical/econometric techniques in order to provide signals for execution. Signals are often generated through a contrarian mean-reversion principle, but can also be formed by lead/lag effects, extreme psychological barriers[citation needed], corporate activity, as well as short-term momentum. This is usually referred to as a multi-factor approach to StatArb."
Good reference.
The step most people focus on is calculating how far the current spread is from its mean on a normalized basis. You take a difference of a difference and divide by the standard deviation to normalize your results. The process really amounts to first-differencing a time series, and it is what my system at
www.collective2.com/go/pairsqidqld is based on. Really ingenious, but the research that went into it dates back to the mid-'80s, when quants discovered pairs trading. They weren't fortunate enough to live in a day when ETFs could serve as eternal pairs, so they had to bother themselves with "finding" the pairs. We don't: we have perfectly negatively correlated pairs in the 2x leveraged and 2x leveraged inverse ETFs.
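To make the normalization step concrete, here is a minimal sketch of one way to read it. It is not the actual system: it assumes "difference of a difference" means the first difference of one series minus the first difference of the other, and the function names and the window length are hypothetical choices for illustration. How the two legs should be combined for an inverse pair is not spelled out above, so treat that part as an assumption too.

```python
from statistics import mean, stdev


def first_difference(series):
    """Return the period-over-period changes of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def spread_zscores(series_a, series_b, window=5):
    """Rolling z-score of diff(series_a) - diff(series_b).

    Assumption: the spread is the difference between the two first
    differences; `window` is an arbitrary illustrative lookback.
    """
    if len(series_a) != len(series_b):
        raise ValueError("both series must have the same length")
    spread = [
        da - db
        for da, db in zip(first_difference(series_a), first_difference(series_b))
    ]
    zscores = []
    for end in range(window, len(spread) + 1):
        recent = spread[end - window:end]
        sd = stdev(recent)
        # A flat window has no dispersion, so report a neutral score.
        zscores.append(0.0 if sd == 0 else (recent[-1] - mean(recent)) / sd)
    return zscores
```

A large positive or negative score says the latest spread is unusually far from its recent mean, which is the contrarian mean-reversion signal the quote describes.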
All good stuff.
Stat arb output looks like a regular linear regression, and mostly you'll find that things are perfectly priced, with very few mispricings. Incidentally, this is usually done in Stata, SAS, or Excel.
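One way to read "mispricings" in a regression setting is as the residuals: the observations that sit far from the fitted line. The sketch below is only an illustration under that assumption, using a single explanatory variable and made-up numbers; the names `fit_line`, `residual_scores`, `fundamental` and `value` are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_scores(x, y):
    """Residuals from the fitted line, scaled by their standard deviation."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    scale = pstdev(residuals)
    if scale < 1e-12:
        # Everything sits on the line: nothing looks mispriced.
        return [0.0] * len(residuals)
    return [r / scale for r in residuals]


# Hypothetical numbers, invented purely for illustration.
fundamental = [0.0, 1.0, 2.0, 3.0, 4.0]
value = [0.0, 1.0, 2.0, 4.0, 4.0]

scores = residual_scores(fundamental, value)
most_overpriced = max(range(len(scores)), key=lambda i: scores[i])
print(most_overpriced, [round(s, 2) for s in scores])
```

Most scores cluster near zero, and the few large ones are the candidates worth a closer look.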
One of my most recent ones took the log of market cap and regressed it on the log of revenue and about 27 other variables. The results look like this, with a high R^2 of 0.88:
      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F( 28,   471) =  133.61
       Model |  495.498531    28  17.6963761           Prob > F      =  0.0000
    Residual |  62.3830638   471  .132448118           R-squared     =  0.8882
-------------+------------------------------           Adj R-squared =  0.8815
       Total |  557.881595   499  1.11799919           Root MSE      =  .36393

------------------------------------------------------------------------------
logofmarke~p |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
logofrevenue |   .8916765   .0154872    57.58   0.000     .8612439    .9221091
        var4 |  -.2091459   .0369259    -5.66   0.000    -.2817059    -.136586
        var5 |  -.0752378   .0314276    -2.39   0.017    -.1369935   -.0134821
        var6 |    .005468   .0396742     0.14   0.890    -.0724924    .0834284
        var7 |    .561904   3.969468     0.14   0.887    -7.238154    8.361962
        var8 |   .0037096   .0089385     0.42   0.678    -.0138546    .0212739
        var9 |   2.968399    .633364     4.69   0.000      1.72383    4.212968
       var10 |    .012367   .0089514     1.38   0.168    -.0052227    .0299567
       var11 |  -.2553104   .3799248    -0.67   0.502    -1.001868    .4912469
       var12 |   .3935184   .1404802     2.80   0.005     .1174729    .6695639
       var13 |  -.8261104   .1359956    -6.07   0.000    -1.093344   -.5588772
       var14 |   -27.7728   2.798556    -9.92   0.000      -33.272    -22.2736
       var16 |  -.4961916   .1187044    -4.18   0.000    -.7294474   -.2629358
       var17 |   .3137915   .0928498     3.38   0.001     .1313404    .4962427
       var18 |    .440714   .1671892     2.64   0.009     .1121851     .769243
       var19 |   .7017086   3.978552     0.18   0.860    -7.116199    8.519616
       var20 |  -1.297751   .3029049    -4.28   0.000    -1.892963   -.7025383
       var21 |   1.136644   .2618217     4.34   0.000     .6221605    1.651127
       var22 |  -3.173427     .92185    -3.44   0.001    -4.984874   -1.361979
       var23 |  (dropped)
       var24 |   .0009775   .0061591     0.16   0.874    -.0111252    .0130802
       var25 |   .0974009   .0497012     1.96   0.051    -.0002626    .1950643
       var26 |  -.0144898   .0814906    -0.18   0.859    -.1746199    .1456403
       var27 |  -.1635945   .0436717    -3.75   0.000      -.24941    -.077779
       var28 |   .0018706      .0014     1.34   0.182    -.0008804    .0046217
       var29 |   .0029186   .0008508     3.43   0.001     .0012468    .0045904
       var30 |   .2466865   .0140288    17.58   0.000     .2191198    .2742533
       var31 |  -.1140925    .120899    -0.94   0.346    -.3516607    .1234757
       var32 |   .1789512   .1083369     1.65   0.099    -.0339323    .3918346
       _cons |   1.956292   .3624543     5.40   0.000     1.244065     2.66852
------------------------------------------------------------------------------
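For anyone reading Stata output for the first time, the summary statistics in the header follow directly from the sums of squares and degrees of freedom in the ANOVA block. This short check recomputes them from the numbers printed above using the standard textbook formulas; the variable names are mine.

```python
# Figures copied from the ANOVA block of the output above.
model_ss, model_df = 495.498531, 28
resid_ss, resid_df = 62.3830638, 471

total_ss = model_ss + resid_ss
total_df = model_df + resid_df

# Share of the variation in log market cap explained by the model.
r_squared = model_ss / total_ss
# Same, penalized for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * total_df / resid_df
# Ratio of the model mean square to the residual mean square.
f_stat = (model_ss / model_df) / (resid_ss / resid_df)
# Standard deviation of the residuals, in log units.
root_mse = (resid_ss / resid_df) ** 0.5

# A t statistic is the coefficient divided by its standard error.
t_logofrevenue = 0.8916765 / 0.0154872

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F({model_df}, {resid_df})    = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.5f}")
print(f"t(logofrevenue) = {t_logofrevenue:.2f}")
```

The recomputed values agree with the reported 0.8882, 0.8815, 133.61, .36393 and 57.58 to the precision shown.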