BUG: Yahoo finance changed chart base url. Updated _get_hist_yahoo #5812
do the current tests for this fail? |
I didn't run the tests, although I probably should have. This is what I saw before:
In [19]: get_data_yahoo('AAPL', start='1/1/2010', end='12/31/2013')
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-19-26f3355b8c21> in <module>()
----> 1 get_data_yahoo('AAPL', start='1/1/2010', end='12/31/2013')
/usr/local/anaconda/lib/python2.7/site-packages/pandas-0.12.0_1327_g12eb775-py2.7-macosx-10.5-x86_64.egg/pandas/io/data.pyc in get_data_yahoo(symbols, start, end, retry_count, pause, adjust_price, ret_index, chunksize, name)
394 """
395 return _get_data_from(symbols, start, end, retry_count, pause,
--> 396 adjust_price, ret_index, chunksize, 'yahoo', name)
397
398
/usr/local/anaconda/lib/python2.7/site-packages/pandas-0.12.0_1327_g12eb775-py2.7-macosx-10.5-x86_64.egg/pandas/io/data.pyc in _get_data_from(symbols, start, end, retry_count, pause, adjust_price, ret_index, chunksize, source, name)
340 # If a single symbol, (e.g., 'GOOG')
341 if isinstance(symbols, (compat.string_types, int)):
--> 342 hist_data = src_fn(symbols, start, end, retry_count, pause)
343 # Or multiple symbols, (e.g., ['GOOG', 'AAPL', 'MSFT'])
344 elif isinstance(symbols, DataFrame):
/usr/local/anaconda/lib/python2.7/site-packages/pandas-0.12.0_1327_g12eb775-py2.7-macosx-10.5-x86_64.egg/pandas/io/data.pyc in _get_hist_yahoo(sym, start, end, retry_count, pause)
194 '&g=d' +
195 '&ignore=.csv')
--> 196 return _retry_read_url(url, retry_count, pause, 'Yahoo!')
197
198
/usr/local/anaconda/lib/python2.7/site-packages/pandas-0.12.0_1327_g12eb775-py2.7-macosx-10.5-x86_64.egg/pandas/io/data.pyc in _retry_read_url(url, retry_count, pause, name)
173
174 raise IOError("after %d tries, %s did not "
--> 175 "return a 200 for url %r" % (retry_count, name, url))
176
177
IOError: after 3 tries, Yahoo! did not return a 200 for url 'http://ichart.yahoo.com/table.csv?s=AAPL&a=0&b=1&c=2010&d=11&e=31&f=2013&g=d&ignore=.csv'
This is what I see after making the change:
In [3]: get_data_yahoo('AAPL', start='1/1/2010', end='12/31/2013')
Out[3]:
Open High Low Close Volume Adj Close
Date
2010-01-04 213.43 214.50 212.38 214.01 17633200 206.93
2010-01-05 214.60 215.59 213.25 214.38 21496600 207.29
2010-01-06 214.38 215.23 210.75 210.97 19720000 203.99
2010-01-07 211.75 212.00 209.05 210.58 17040400 203.61
2010-01-08 210.30 212.00 209.06 211.98 15986100 204.97
2010-01-11 212.80 213.00 208.45 210.11 16508200 203.16
2010-01-12 209.19 209.77 206.42 207.72 21230700 200.85
2010-01-13 207.87 210.93 204.10 210.65 21639000 203.68
2010-01-14 210.11 210.46 209.02 209.43 15460500 202.50
2010-01-15 210.93 211.60 205.87 205.93 21216700 199.12
2010-01-19 208.33 215.19 207.24 215.04 26071700 207.92
2010-01-20 214.91 215.55 209.50 211.73 21862600 204.72
2010-01-21 212.08 213.31 207.21 208.07 21719800 201.18
2010-01-22 206.78 207.50 197.16 197.75 31491700 191.21
2010-01-25 202.51 204.70 200.19 203.07 38060700 196.35
2010-01-26 205.95 213.71 202.58 205.94 66682500 199.12
2010-01-27 206.85 210.58 199.53 207.88 61520300 201.00
2010-01-28 204.93 205.50 198.70 199.29 41910800 192.70
2010-01-29 201.08 202.20 190.25 192.06 44498300 185.70
2010-02-01 192.37 196.00 191.30 194.73 26781300 188.29
2010-02-02 195.91 196.32 193.38 195.86 24940800 189.38
2010-02-03 195.17 200.20 194.42 199.23 21976000 192.64
2010-02-04 196.73 198.37 191.57 192.05 27059000 185.69
2010-02-05 192.63 196.00 190.85 195.46 30368100 188.99
2010-02-08 195.69 197.88 194.00 194.12 17081100 187.70
2010-02-09 196.42 197.50 194.75 196.19 22603100 189.70
2010-02-10 195.89 196.60 194.26 195.12 13227200 188.66
2010-02-11 194.88 199.75 194.06 198.67 19655200 192.10
2010-02-12 198.11 201.64 195.50 200.38 23409600 193.75
2010-02-16 201.94 203.69 201.52 203.40 19419200 196.67
2010-02-17 204.19 204.31 200.86 202.55 15585600 195.85
2010-02-18 201.63 203.89 200.92 202.93 15100900 196.21
2010-02-19 201.86 203.20 201.11 201.67 14838200 195.00
2010-02-22 202.34 202.50 199.19 200.42 13948700 193.79
2010-02-23 200.00 201.33 195.71 197.06 20539100 190.54
2010-02-24 198.23 201.44 197.84 200.66 16448800 194.02
2010-02-25 197.38 202.86 196.89 202.00 23754500 195.32
2010-02-26 202.38 205.17 202.00 204.62 18123600 197.85
2010-03-01 205.75 209.50 205.45 208.99 19646200 202.07
2010-03-02 209.93 210.83 207.74 208.85 20233800 201.94
2010-03-03 208.94 209.87 207.94 209.33 13287600 202.40
2010-03-04 209.28 210.92 208.63 210.71 13072900 203.74
2010-03-05 214.94 219.70 214.63 218.95 32129300 211.70
2010-03-08 220.01 220.09 218.25 219.08 15353200 211.83
2010-03-09 218.31 225.00 217.89 223.02 32866400 215.64
2010-03-10 223.83 225.48 223.20 224.84 21293500 217.40
2010-03-11 223.91 225.50 223.32 225.50 14489300 218.04
2010-03-12 227.37 227.73 225.75 226.60 14868700 219.10
2010-03-15 225.38 225.50 220.25 223.84 17625100 216.43
2010-03-16 224.18 224.98 222.51 224.45 15961000 217.02
2010-03-17 224.90 226.45 223.27 224.12 16105600 216.70
2010-03-18 224.10 225.00 222.61 224.65 12218200 217.22
2010-03-19 224.79 225.24 221.23 222.25 19980200 214.90
2010-03-22 220.47 226.00 220.15 224.75 16300700 217.31
2010-03-23 225.64 228.78 224.10 228.36 21515400 220.80
2010-03-24 227.64 230.20 227.51 229.37 21349300 221.78
2010-03-25 230.92 230.97 226.25 226.65 19367300 219.15
2010-03-26 228.95 231.95 228.55 230.90 22888400 223.26
2010-03-29 233.00 233.87 231.62 232.39 19312300 224.70
2010-03-30 236.60 237.48 234.25 235.85 18832500 228.05
... ... ... ... ... ...
[1005 rows x 6 columns] |
The existing tests should fail if the URL is broken, so maybe something is suppressing the error. |
I just came across this error as well. I don't see any tests for get_data_yahoo. |
No, it should just check that it works. |
You were right, it is in the tests. The test passes because it is skipped when it does not receive a 200 for the URL:
test_read_yahoo (main.TestDataReader) ... SKIP: Skipping test after 3 tries, Yahoo! did not return a 200 for url 'http://ichart.yahoo.com/table.csv?s=GS&a=0&b=1&c=2010&d=11&e=31&f=2013&g=d&ignore=.csv' |
Hmm, @cpcloud, do you know how these tests are supposed to validate? |
It looks like it only checks for a 200 code, but urllib2 error is actually
On Tue, Dec 31, 2013 at 8:42 PM, jreback [email protected] wrote:
|
The reason that this wasn't caught in the tests is that the The I don't have a suggestion on how to fix that. |
We probably need a way to check temporarily (e.g. before a release) that the network tests actually run and that the URLs they use are valid. |
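One way such a pre-release check could look is sketched below. It is only an illustration: `url_returns_200` and `broken_urls` are hypothetical helper names, not part of pandas, and the code is written for Python 3 (`urllib.request`) rather than the Python 2.7 `urllib2` seen in the traceback. The `opener` argument is an assumption added so the check can be exercised without touching the network.

```python
import urllib.error
import urllib.request


def url_returns_200(url, timeout=10, opener=urllib.request.urlopen):
    """Return True only if `url` answers with HTTP 200.

    `opener` is injectable (hypothetical convenience) so the check can be
    tested with a fake instead of a real network call.
    """
    try:
        response = opener(url, timeout=timeout)
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) subclasses URLError; OSError covers
        # connection failures and timeouts.
        return False
    try:
        return response.getcode() == 200
    finally:
        response.close()


def broken_urls(urls, **kwargs):
    """Return the subset of `urls` that did not answer with HTTP 200."""
    return [u for u in urls if not url_returns_200(u, **kwargs)]
```

Run against the list of data-source URLs before tagging a release, a non-empty result from `broken_urls` would be treated as a failure rather than a skip.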
Thanks for the fix. We'll move the @network problem to its own issue. |
That error message is actually a little misleading. To be clear, DataReader fails with the equivalent of: IOError("after 3 tries, Yahoo! did not return a 200 for url"), which gets caught by the network decorator because it's an IOError and converted into: |
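The masking described here can be reproduced with a small stand-in. This is a simplified sketch, not pandas' actual `@network` implementation: `network_sketch` and `fake_test_read_yahoo` are hypothetical names, and the only behaviour assumed is the one stated in this thread, namely that an IOError raised inside the test is turned into a skip.

```python
import functools
import unittest


def network_sketch(test_func):
    """Simplified stand-in for a network-test decorator (not pandas' real one)."""
    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        try:
            return test_func(*args, **kwargs)
        except IOError as exc:
            # Every IOError is treated as "no connectivity", so the test is
            # reported as skipped instead of failed.
            raise unittest.SkipTest("Skipping test %s" % exc)
    return wrapper


@network_sketch
def fake_test_read_yahoo():
    # Mimics the failure in this thread: the URL itself is wrong, but the
    # reader reports it as an IOError.
    raise IOError("after 3 tries, Yahoo! did not return a 200 for url")
```

Because a permanently broken URL and a temporary outage both surface as IOError, the test runner cannot tell them apart and the suite stays green.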
...so now every pandas version out there except git master is broken? nice move, yahoo. |
cc @yarikoptic. You should probably add this as a distro patch for 0.13.0. |
@y-p would it be difficult for us to put out a new release right now with this patch? (even if we had to create a separate branch or something) |
We could tag a 0.13.1 right now, but we may still have to put out a 0.13.2 in two weeks If we put out 0.13.0 and commit to a 0.13.1 in a couple of weeks no matter what, we can Not sure, it's just that getting the binaries up is a bottleneck. But if you and @jreback feel differently, we can do that. |
I think the current plan is fine; then do a 0.13.1 in a few weeks. We can post an easy monkey patch for the Yahoo fix to the issue / mailing list (just monkey patch the function, it's pretty small). |
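A minimal sketch of such a monkey patch follows. It takes a slightly different route from replacing `_get_hist_yahoo` itself: it wraps the module's `_retry_read_url` and rewrites the host before the request is made. `patch_yahoo_base_url` is a hypothetical helper name, and the sketch assumes that `_get_hist_yahoo` calls `_retry_read_url(url, retry_count, pause, name)` by its module-level name, as the traceback in this thread suggests.

```python
OLD_BASE = "http://ichart.yahoo.com/"
NEW_BASE = "http://ichart.finance.yahoo.com/"


def patch_yahoo_base_url(module):
    """Wrap module._retry_read_url so old-host URLs go to the new host."""
    original = module._retry_read_url
    if getattr(original, "_yahoo_patched", False):
        return  # already patched, do not wrap twice

    def patched(url, *args, **kwargs):
        if url.startswith(OLD_BASE):
            url = NEW_BASE + url[len(OLD_BASE):]
        return original(url, *args, **kwargs)

    patched._yahoo_patched = True
    module._retry_read_url = patched


# Intended use on an affected install (assumes pandas.io.data exists there):
#     import pandas.io.data as web
#     patch_yahoo_base_url(web)
```

Wrapping the reader avoids copying the query-string logic, at the cost of touching every URL that passes through `_retry_read_url`; URLs for other hosts are passed through unchanged.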
I thought 0.13.0 was released already... let's not grow zombies |
0.13.0 has been tagged. We're waiting for wes to push the binaries to pypi, after which We hope that on top of the RC that is enough to ensure a stable release is available That's the plan for now. |
On Sat, 04 Jan 2014, y-p wrote:
sounds good to me, besides I would have forgot about 0.13.0 in my case -- currently I have in the patch queue (will upload to Debian deb_skip_sequencelike_on_armel unfortunately those numexpr "fixes" Yaroslav O. Halchenko, Ph.D. |
Okay, for now I'm going to block using numexpr with pandas for versions |
yep better - |
It seems Yahoo has made the old ichart URL work again. I am using 0.12.
In [5]: web.get_data_yahoo('HMIN','12/23/2010','12-12-2013')
Out[5]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 748 entries, 2010-12-23 00:00:00 to 2013-12-12 00:00:00
Data columns (total 6 columns):
Open 748 non-null values
High 748 non-null values
Low 748 non-null values
Close 748 non-null values
Volume 748 non-null values
Adj Close 748 non-null values
dtypes: float64(5), int64(1) |
that's good, was a good opportunity to expose the URLs used anyways, so |
The start of the old url was:
http://ichart.yahoo.com/
and Yahoo now uses
http://ichart.finance.yahoo.com/
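For reference, the query string seen in the traceback can be rebuilt on top of either base as sketched below. `build_hist_url` is a hypothetical name, not the pandas function; the parameter layout (`a`/`d` as zero-based months, `b`/`e` as days, `c`/`f` as years, `g=d`, `ignore=.csv`) is inferred from the failing URL quoted earlier in this thread, not from any Yahoo documentation.

```python
from datetime import date

OLD_BASE = "http://ichart.yahoo.com/"
NEW_BASE = "http://ichart.finance.yahoo.com/"


def build_hist_url(sym, start, end, base=NEW_BASE):
    """Build a table.csv URL in the format shown in the traceback."""
    params = [
        ("s", sym),
        # Months are zero-based in this format (January -> 0).
        ("a", start.month - 1), ("b", start.day), ("c", start.year),
        ("d", end.month - 1), ("e", end.day), ("f", end.year),
        ("g", "d"),
        ("ignore", ".csv"),
    ]
    query = "&".join("%s=%s" % (key, value) for key, value in params)
    return base + "table.csv?" + query


url = build_hist_url("AAPL", date(2010, 1, 1), date(2013, 12, 31))
```

Passing `base=OLD_BASE` with the same arguments reproduces the URL from the original IOError, which makes the one-host difference between the two easy to see.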