Yahoo Historical Data - Did they change the URL recently?

d08 · Jul 12, 2017

No, not at all. They seem to be fixing it yet again, but now it's inconsistent: the website shows one set of data and the downloaded CSV another. At least they are doing something, but it makes you wonder what kind of people they hire; they seem to be struggling with the basics.
Take a look at EFA.

Pre 19th June dividend on website:
19 Jun 2017,66.64,66.79,66.60,66.66,65.60,26706100

Downloaded CSV:
2017-06-19,66.639999,66.790001,66.599998,65.598000,66.660004,26706100

Notice that the 5th and 6th columns are swapped in the downloaded CSV. The correct adjusted close should be 65.60.
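A minimal sketch of a workaround for the swapped columns, assuming that every data row in the downloaded file is affected and that the intended column order is Date, Open, High, Low, Close, Adj Close, Volume. The function name swap_close_columns is made up for illustration; it only relies on awk.

```shell
#!/bin/sh

# Hypothetical helper: swap the 5th and 6th CSV fields on data rows.
# Assumes every data row has Close and Adj Close in the wrong order.
swap_close_columns() {
  awk 'BEGIN { FS = OFS = "," }
       # Only touch rows that start with a YYYY-MM-DD date; pass headers through.
       $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
         tmp = $5; $5 = $6; $6 = tmp
       }
       { print }'
}

# Example with the EFA row quoted above.
printf '%s\n' '2017-06-19,66.639999,66.790001,66.599998,65.598000,66.660004,26706100' |
  swap_close_columns
```

If only some rows are affected, an unconditional swap like this would corrupt the good ones, so it is worth comparing a few rows against the website first.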

inCom · Jul 12, 2017

I can relate to your frustration, which I guess is shared by all of us who have been using Yahoo so far, but truth be told, their data was not perfect even before May. Glitches, missing samples and "null"s have always been there, not to mention whole pieces of history that were missing if you downloaded today and magically reappeared the following day. Let's hope things get better. I'm not defending them, but they are clearly going through a transition phase, euphemistically speaking.

My only point is: maybe other sources are not bulletproof either. Nasdaq.com, for example, gives yet another set of numbers compared to Quandl. Which one is telling the truth? Would you be better off choosing a paid service if all of them gave different data? This is what I'd like to understand.

d08 · Jul 12, 2017

Which numbers differ? Yahoo's adjustment formula differs slightly from Quandl's, which is where the differences come from.

I contacted them regarding the missing history (beginning of July) and it was fixed the next day. I also contacted them because June 30 was missing for almost everything just a few days ago (same for 2016); that was fixed the next day too. So they're very responsive.
I doubt the paid sources are ideal either but I'll let others comment further.

toonerdy · Jul 16, 2017

I am grateful to Dennis Lee for writing updated Yahoo historical data fetching code at https://github.com/dennislwy/YahooFinanceAPI , discussed at https://stackoverflow.com/questions/44030983/yahoo-finance-url-not-working/44050039 .

Using his code as a guide, I have written the following shell script, which I hereby place in the public domain. It uses the grep, sed and wget programs to fetch Yahoo historical data for a stock symbol in comma-separated values (CSV) format.

Of course, I disclaim any responsibility for any defects, bad data, or any other results of your using this script. Use it at your own risk.

Code:

#!/bin/sh

cookiefile=/tmp/saved-cookies.$$.txt

if [ $# -lt 1 ] ; then
  echo "Usage: yahooquote-historic symbol" >&2
  exit 1
fi

symbol="$1"

download_summary() {
  wget --quiet --no-check-certificate --no-cache --keep-session-cookies \
    --output-document=- --save-cookies=${cookiefile} \
    "https://finance.yahoo.com/quote/${symbol}?p=${symbol}"
}

extract_crumb() {
  grep CrumbStore |
   sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

crumb=$(download_summary "$symbol" | extract_crumb)

period2=$(date +%s)

wget --quiet --no-check-certificate --output-document=- \
  --load-cookies=${cookiefile} \
  "https://query1.finance.yahoo.com/v7/finance/download/${symbol}?period1=1&period2=1500194366&interval=1d&events=history&crumb=${crumb}"

status=$?
rm -f "${cookiefile}"
exit "$status"
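To see what the extract_crumb step in the script above does, it can be run on its own against a sample line. The fragment below is made up for illustration (the real page is much longer, and its layout here is an assumption based only on what the sed expressions expect); the grep and sed commands are the ones from the script.

```shell
#!/bin/sh

# Same filter as in the script: keep the line that mentions CrumbStore,
# cut away everything around the "crumb" value, and turn \u002F back into "/".
extract_crumb() {
  grep CrumbStore |
    sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

# Made-up sample fragment, not real Yahoo output.
sample='..."CrumbStore":{"crumb":"abc\u002Fdef"},"Other":{}...'

printf '%s\n' "$sample" | extract_crumb
```

With this sample the pipeline prints abc/def. If the page layout changes so that no line contains CrumbStore, the function prints nothing and the download request goes out with an empty crumb.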

bashatrader · Jul 30, 2017

Be careful using data from Google and other sources. Check for missing entries. Google is missing dates across its entire database.
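One rough way to check a downloaded file for missing entries is to look for unusually long gaps between consecutive dates. The sketch below assumes a CSV whose first column is a YYYY-MM-DD date in ascending order; the function name find_date_gaps and the default threshold of 4 calendar days (enough to let a three-day weekend pass) are made up for illustration.

```shell
#!/bin/sh

# Hypothetical helper: report consecutive rows whose dates are more than
# max_gap calendar days apart. Reads CSV on stdin, date in the first column.
find_date_gaps() {
  max_gap="${1:-4}"
  awk -F, -v max_gap="$max_gap" '
    # Day number of a Gregorian date, counted from a fixed origin.
    # Months are shifted so that the year starts in March.
    function days(y, m, d) {
      if (m <= 2) { y -= 1; m += 12 }
      return 365 * y + int(y / 4) - int(y / 100) + int(y / 400) \
             + int((153 * (m - 3) + 2) / 5) + d
    }
    # Skip the header and any row that does not start with a date.
    $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
      split($1, p, "-")
      n = days(p[1] + 0, p[2] + 0, p[3] + 0)
      if (seen && n - prev > max_gap)
        print prev_date " -> " $1 " (" (n - prev) " days)"
      prev = n; prev_date = $1; seen = 1
    }'
}

# Example: the jump from 20 June to 28 June is reported.
printf '%s\n' 'Date,Close' '2017-06-19,1' '2017-06-20,1' '2017-06-28,1' |
  find_date_gaps 4
```

A calendar-gap check cannot tell a single missing trading day from a holiday, so a day missing next to a weekend will slip through; comparing against a second source is still needed for that.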

toonerdy · Jul 30, 2017

I made a mistake in the script I previously posted, which has the effect of providing historical data only up to July 16th, 2017. I had left a numeric Unix time code for a time on that date, 1500194366, in the final wget command in that script, instead of "${period2}". Here is a corrected version of the script.

Code:

#!/bin/sh

cookiefile=/tmp/saved-cookies.$$.txt

if [ $# -lt 1 ] ; then
  echo "Usage: yahooquote-historic symbol" >&2
  exit 1
fi

symbol="$1"

download_summary() {
  wget --quiet --no-check-certificate --no-cache --keep-session-cookies \
  --output-document=- --save-cookies=${cookiefile} \
  "https://finance.yahoo.com/quote/${symbol}?p=${symbol}"
}

extract_crumb() {
  grep CrumbStore |
  sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

crumb=$(download_summary "$symbol" | extract_crumb)

period2=$(date +%s)

wget --quiet --no-check-certificate --output-document=- \
  --load-cookies=${cookiefile} \
  "https://query1.finance.yahoo.com/v7/finance/download/${symbol}?period1=1&period2=${period2}&interval=1d&events=history&crumb=${crumb}"

status=$?
rm -f "${cookiefile}"
exit "$status"
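A quick way to check which moment a hard-coded time code such as 1500194366 corresponds to, and to compare it with the value that period2=$(date +%s) produces. This is a sketch that assumes a date command accepting -d @SECONDS (GNU coreutils and BusyBox do; BSD and macOS date use -r instead). The function name epoch_to_utc is made up for illustration.

```shell
#!/bin/sh

# Hypothetical helper: print a Unix time code as a UTC timestamp.
# Assumes a date(1) that understands "-d @SECONDS".
epoch_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1500194366      # the value left in the first script
epoch_to_utc "$(date +%s)"   # what period2=$(date +%s) gives right now
```

The first call prints 2017-07-16 08:39:26, which is why the earlier script never returned rows after that date.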

inCom · Aug 2, 2017

In the end I gave up chasing Yahoo and their mess and started using the Quandl API only. Their data seems to be quite stable and reliable, and it's true that they reply by email within 24h to any question. The API is also very straightforward and easy to use.

They don't offer ETFs such as SPY or QQQ in their free plan, so I will probably subscribe to some of their paid versions.

Case closed (for now).

d08 · Aug 19, 2017

Just an update. Yahoo data is still full of abnormalities at the moment. There are too many to list, and some appear to be deliberate, so that the data is unusable if downloaded.

d08 · Aug 19, 2017

colion said:
In addition to the data suppliers that you list, there is a new "kid" on the block that might be worth looking into:

https://www.tiingo.com/

Anyone tried them? Looks great but there's no way to properly evaluate their data quality without subscribing.

mokwit · Aug 19, 2017

There is talk on the forums of a new Yahoo historical data JSON API that does not require a crumb or cookie, with links to code:
https://forums.yahoo.net/t5/Yahoo-Finance-help/Is-Yahoo-Finance-API-broken/td-p/250503/page/22

https://stackoverflow.com/questions/44034229/yahoo-finance-api-stopped-working

https://stackoverflow.com/questions/44030983/yahoo-finance-url-not-working/44050039#44050039