Hi all,
I have been modelling the effect that economic indicators and earnings have on equities. The equities I am looking at are not the ones in the limelight during the event. One example (a trade I currently have on) is how DHI will react to LEN's earnings announcement. Another is how a company will react to an economic release (I am currently long MA vol into the GDP numbers).
My process certainly overfits the data, especially for the economic indicator trades, and I was wondering if I could get some advice on how to reduce the bias. Once I get my data into a data frame, I work through each ticker in the top/bottom percentile to find the best candidate for the trade (time-consuming and definitely not efficient).
On a side note, if anyone knows of a cheaper source than Trading Economics please lmk.
Process for earnings: This is mostly done on Bberg. I look at the supply chain of each company and scan for the highest $ relationship divided by the supplier's/customer's mkt cap. I also use peers. Once a list of stocks is created, I compare how much the stocks move on the target company's earnings dates vs. how much they move on the target company's NON-earnings dates. Any interesting candidates get looked into further. The problem lies in how significant the move really is. I'll post code and the simple math down below.
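A minimal sketch of the two steps above, in base R. Everything in it is a placeholder: the tickers, dollar relationships, market caps, simulated returns and earnings dates are made up and do not come from Bberg. The random-draw check at the end is just one possible way to gauge how unusual the event-day move is. It is an assumption added for illustration, not part of the process described above.

```r
# Hypothetical supply-chain links to the target (USD millions).
links = data.frame(
  ticker       = c("AAA", "BBB", "CCC"),
  relationship = c(500, 200, 900),
  mkt.cap      = c(10000, 1000, 90000),
  stringsAsFactors = FALSE
)
# Exposure = dollar relationship relative to the candidate's own market cap.
links$exposure = links$relationship / links$mkt.cap
links = links[order(-links$exposure), ]

# Simulated daily returns for one candidate (not real prices).
set.seed(1)
dates   = seq(as.Date("2018-01-01"), by = "day", length.out = 250)
returns = rnorm(250, sd = 0.01)
earnings.dates = dates[c(30, 95, 160, 225)]  # hypothetical target earnings dates
is.event = dates %in% earnings.dates

# Average absolute move on event days vs. non-event days.
abs.ret        = abs(returns)
event.mean     = mean(abs.ret[is.event])
non.event.mean = mean(abs.ret[!is.event])
ratio          = event.mean / non.event.mean

# Illustrative check: how often do the same number of randomly chosen
# non-event days show an average move at least as large as the event days?
draws   = replicate(5000, mean(sample(abs.ret[!is.event], sum(is.event))))
p.value = mean(draws >= event.mean)
```

With only a handful of event dates, the share of random draws that match or beat the event-day average gives a rough sense of whether the ratio could be noise.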
Process for econ indicators: This is really biased and probably my biggest issue. We have GDP numbers this week, and what I have done is scan for the stocks that move the most on GDP announcement days. I use the mean, median and standard deviation of the returns. I am currently using the free data (last 4 quarters) at Trading Economics.
The code is in R, but I will write comments after ## so you can follow along. This screens for trades around economic releases. I am only using the 4 historical dates from Trading Economics (I usually use Bberg, but that is not a source I will have forever).
Code:
##Load the packages used below: rvest for the scraping (it also provides %>%) and stringr for str_replace.
library(rvest)
library(stringr)
##Get all unique tickers from the S&P 500, the Nasdaq 100 and the tickers that have weeklies. But let's get rid of BRK/B.
get.tickers = function(){
  spx.url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
  nas.url = "https://en.wikipedia.org/wiki/NASDAQ-100"
  cboe.url = "http://www.cboe.com/products/weeklys-options/available-weeklys"
  ##The table positions (1, 3 and 5) depend on the current layout of each page and may need adjusting.
  table.s = read_html(spx.url)%>%html_nodes("table")%>%.[1]%>%html_table(fill = T)
  spx.ticker = as.data.frame(table.s)$Symbol
  table.n = read_html(nas.url)%>%html_nodes("table")%>%.[3]%>%html_table(fill=T)
  nas100 = as.data.frame(table.n)$Ticker
  table.w = as.data.frame(read_html(cboe.url)%>%html_nodes("table")%>%.[5]%>%html_table(fill = T))
  colnames(table.w) = table.w[1, ] # the first row will be the header
  table.w = table.w[-1, ]
  ##Strip the asterisks from the weeklies tickers and drop everything after ZTS.
  tickers = str_replace(table.w$Ticker, '\\*', '')
  tickers = tickers[1:which(tickers == "ZTS")]
  all.tickers = unique(c(tickers, spx.ticker, nas100))
  return(all.tickers)
}
tickers = get.tickers()
##Logical indexing is safe even when BRK/B is absent (a negative which() on no match would drop every ticker).
tickers = tickers[tickers != "BRK/B"]
Code:
##Load quantmod for getSymbols, Ad and ROC (it also loads xts).
library(quantmod)
##GDP QoQ dates
dates.gdp = c("2018-10-26", "2018-11-28", "2018-12-21", "2019-02-28")
##Get the stock price data and returns for all dates and for the GDP dates, starting in 2018 since that is the year of our first GDP date.
##I am not using parallel programming for this call because I cannot get the data into an environment/list if I do.
hub = new.env()
lapply(tickers, getSymbols, from = "2018-01-01", env = hub)
adjusted = lapply(hub, Ad)
##ROC gives daily log returns. Note that na.omit drops every date on which any ticker has a missing value.
returns = do.call(merge, lapply(adjusted, ROC))
abs.returns = na.omit(abs(returns))
##Flag the GDP dates so the non-event statistics really exclude them.
is.event = index(abs.returns) %in% as.Date(dates.gdp)
non.event.mean = apply(abs.returns[!is.event, ], 2, mean)
non.event.median = apply(abs.returns[!is.event, ], 2, median)
event.mean = apply(abs.returns[is.event, ], 2, mean)
event.median = apply(abs.returns[is.event, ], 2, median)
sd.event.mean = apply(abs.returns[is.event, ], 2, sd)
##Log ratio of event-day to non-event-day moves: positive means the stock moves more on GDP days.
median.dif = log(event.median/non.event.median)
mean.dif = log(event.mean/non.event.mean)
all.data = as.data.frame(cbind(median.dif, mean.dif, sd.event.mean))
View(all.data)
Here are the top long vol trades for the GDP announcement. Then I need to make sure none of the stocks had earnings on the GDP dates, etc. You can see this becomes daunting. Any help here would be MUCH appreciated, thank you!
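A rough sketch of how the last two manual steps (taking the top slice of the screen and dropping names that had their own earnings on a GDP date) could be scripted in base R. The screen values, tickers and earnings calendar below are all hypothetical placeholders, and the 0.6 cutoff is only chosen so the toy data has something to show. A real screen would use a much higher percentile and an actual earnings calendar.

```r
# Hypothetical screen output, shaped like all.data above (rows = tickers).
screen = data.frame(
  median.dif = c(0.9, 0.1, 0.7, -0.4, 0.8),
  mean.dif   = c(0.8, 0.2, 0.6, -0.3, 0.9),
  row.names  = c("AAA", "BBB", "CCC", "DDD", "EEE")
)
dates.gdp = as.Date(c("2018-10-26", "2018-11-28", "2018-12-21", "2019-02-28"))

# Hypothetical earnings calendar: one row per ticker and earnings date.
earnings = data.frame(
  ticker = c("AAA", "CCC", "EEE"),
  date   = as.Date(c("2018-10-26", "2018-11-01", "2019-01-30")),
  stringsAsFactors = FALSE
)

# Tickers whose own earnings fell on a GDP date: their event-day move
# cannot be attributed to the GDP release alone.
contaminated = unique(earnings$ticker[earnings$date %in% dates.gdp])

# Keep the top slice by median.dif, then drop the contaminated names.
cutoff     = quantile(screen$median.dif, 0.6)
top        = rownames(screen)[screen$median.dif >= cutoff]
candidates = setdiff(top, contaminated)
print(candidates)
```

The same filter could be applied to the bottom slice by flipping the comparison against a low quantile.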