Yahoo Historical Data - Did they change the URL recently?

No, not at all. They seem to be fixing it yet again, but now it's inconsistent: the website shows one set of data and the downloaded CSV another. At least they are doing something, but it makes you wonder what kind of people they hire; they seem to be struggling with the basics.
Take a look at EFA.

Pre 19th June dividend on website:
19 Jun 2017,66.64,66.79,66.60,66.66,65.60,26706100

Downloaded CSV:
2017-06-19,66.639999,66.790001,66.599998,65.598000,66.660004,26706100

Notice that the 5th and 6th columns are switched in the CSV: the close and the adjusted close have traded places. The correct adjusted close should be 65.60.
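
If you need a stopgap, the switch can be undone after the download. This is only a rough sketch: it assumes the CSV header is Date,Open,High,Low,Close,Adj Close,Volume and that every data row has the two columns switched, which I have not verified beyond the EFA row above. The function name swap_close_columns is my own.

```shell
#!/bin/sh
# Swap columns 5 and 6 on every data row, leaving the header row alone.
# Assumption: header is Date,Open,High,Low,Close,Adj Close,Volume and
# ALL data rows are affected; check a few rows by hand before relying on it.
swap_close_columns() {
  awk -F, 'BEGIN { OFS = "," }
    NR == 1 { print; next }
    { tmp = $5; $5 = $6; $6 = tmp; print }'
}

# The EFA row from the downloaded CSV above:
printf '%s\n' \
  'Date,Open,High,Low,Close,Adj Close,Volume' \
  '2017-06-19,66.639999,66.790001,66.599998,65.598000,66.660004,26706100' |
  swap_close_columns
```

For the EFA row this puts 66.660004 back in the close column and 65.598000 in the adjusted close column. Once Yahoo fixes the file on their side the swap would of course corrupt good data, so it should not be applied blindly.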
 
I can relate to your frustration, which I guess is shared by all of us who have been using Yahoo so far, but truth be told, their data was not perfect even before May. Glitches, missing samples and "null"s have always been there. Not to mention whole pieces of history that were missing if you downloaded today and magically reappeared the following day. Let's hope things will get better. I'm not defending them, but they are clearly going through a transition phase, to put it euphemistically.

My point is only this: maybe other sources are not bulletproof either. Nasdaq.com, for example, gives numbers that differ from Quandl's. Which one is telling the truth? Would you be better off choosing a paid service if all of them gave different data? This is what I'd like to understand.
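
Whatever the source, it is cheap to screen a downloaded file for the "null"s mentioned above before using it. A minimal sketch, assuming a comma-separated file with one header row and the date in the first column; the function name find_null_rows and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Print "line-number: row" for each data row that has a field equal to
# the literal string "null" or an empty field. The header row is skipped.
find_null_rows() {
  awk -F, 'NR > 1 {
    for (i = 2; i <= NF; i++)
      if ($i == "null" || $i == "") { print NR ": " $0; break }
  }'
}

# Hypothetical sample: the second data row is all nulls.
printf '%s\n' \
  'Date,Open,High,Low,Close,Adj Close,Volume' \
  '2017-06-29,10.0,10.5,9.9,10.2,10.2,1000' \
  '2017-06-30,null,null,null,null,null,null' |
  find_null_rows
```

This only catches rows that are present but empty. Dates that are missing altogether would need a comparison against a trading calendar or a second source.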
 
Which numbers differ? Yahoo's adjustment formula is slightly different from Quandl's, and that is where the differences come from.

I contacted them about the missing history (beginning of July) and it was fixed the next day. I also contacted them a few days ago because June 30 was missing for almost everything (same for 2016), and that was fixed the next day too. So they're very responsive.
I doubt the paid sources are ideal either but I'll let others comment further.
 
I am grateful to Dennis Lee for writing updated Yahoo historical data fetching code at https://github.com/dennislwy/YahooFinanceAPI , discussed at https://stackoverflow.com/questions/44030983/yahoo-finance-url-not-working/44050039 .

Using his code as a guide, I have written the following shell script, which I hereby place in the public domain. It uses the grep, sed and wget programs to fetch Yahoo historical data for a stock symbol in comma-separated values (CSV) format.

Of course, I disclaim any responsibility for any defects, bad data, or any other results of your using this script. Use it at your own risk.

Code:
#!/bin/sh

cookiefile=/tmp/saved-cookies.$$.txt

if [ $# -lt 1 ] ; then
  echo "Usage: yahooquote-historic symbol" >&2
  exit 1
fi

symbol="$1"

download_summary() {
  wget --quiet --no-check-certificate --no-cache --keep-session-cookies \
    --output-document=- --save-cookies=${cookiefile} \
    "https://finance.yahoo.com/quote/${symbol}?p=${symbol}"
}

extract_crumb() {
  grep CrumbStore |
   sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

crumb=$(download_summary "$symbol" | extract_crumb)

period2=$(date +%s)

wget --quiet --no-check-certificate --output-document=- \
  --load-cookies=${cookiefile} \
  "https://query1.finance.yahoo.com/v7/finance/download/${symbol}?period1=1&period2=1500194366&interval=1d&events=history&crumb=${crumb}"

status=$?
rm -f "${cookiefile}"
exit "$status"
 
I made a mistake in the script I posted previously: it returns historical data only up to July 16th, 2017. I had left a literal Unix timestamp from that date, 1500194366, in the final wget command instead of "${period2}". Here is a corrected version of the script.

Code:
#!/bin/sh

cookiefile=/tmp/saved-cookies.$$.txt

if [ $# -lt 1 ] ; then
  echo "Usage: yahooquote-historic symbol" >&2
  exit 1
fi

symbol="$1"

download_summary() {
  wget --quiet --no-check-certificate --no-cache --keep-session-cookies \
    --output-document=- --save-cookies=${cookiefile} \
    "https://finance.yahoo.com/quote/${symbol}?p=${symbol}"
}

extract_crumb() {
  grep CrumbStore |
  sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

crumb=$(download_summary "$symbol" | extract_crumb)

period2=$(date +%s)

wget --quiet --no-check-certificate --output-document=- \
  --load-cookies=${cookiefile} \
  "https://query1.finance.yahoo.com/v7/finance/download/${symbol}?period1=1&period2=${period2}&interval=1d&events=history&crumb=${crumb}"

status=$?
rm -f "${cookiefile}"
exit "$status"
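
In case it helps anyone adapt the script, here is the crumb extraction step run on its own. The input line is a made-up stand-in for the quote page, not real Yahoo output, and the crumb value is hypothetical; the function itself is the same as in the script.

```shell
#!/bin/sh
# Same extract_crumb as in the script: keep the line mentioning CrumbStore,
# cut everything before it and everything from the first "}" after it,
# strip the '"crumb":"' prefix and the trailing quote, and turn the
# JSON escape \u002F back into a plain slash.
extract_crumb() {
  grep CrumbStore |
  sed 's/^.*CrumbStore/CrumbStore/;s/}.*$//;s/.*"crumb":"//;s/"$//;s|\\u002F|/|g'
}

# Hypothetical page fragment with a crumb containing an escaped slash.
printf '%s\n' \
  '<script>{"user":{},"CrumbStore":{"crumb":"abc\u002Fdef"},"x":1}</script>' |
  extract_crumb
```

On that sample the function prints abc/def. If Yahoo changes the page layout the grep will find nothing and the crumb will be empty, so it is worth checking that "${crumb}" is non-empty before the final wget.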
 
In the end I gave up chasing Yahoo and their mess and switched to using only the Quandl API. Their data seems to be quite stable and reliable, and it's true that they reply by email within 24 hours to any question. The API is also straightforward and easy to use.

They don't offer ETFs such as SPY or QQQ in their free plan, so I will probably subscribe to one of their paid plans.

Case closed (for now).
 
Just an update: Yahoo data is still full of abnormalities at the moment. There are too many to list, and some appear to be deliberate, so that the data is unusable if downloaded.
 