Replace Yahoo iCharts API #331
pandas_datareader/yahoo/daily.py
Outdated
params=self.params, headers=self.headers)
out = str(self._sanitize_response(response))
# Matches: {"crumb":"AlphaNumeric"}
regex = re.search(r'{"crumb" ?: ?"([A-Za-z0-9.]{11,})"}', out)
Not sure if it is just me, but I had to change the regex a bit to make it work:
regex = re.search(r'"crumb":"([A-Za-z0-9.]{11})"', out)
I guess stripping out {} might be the key, as I saw the crumb show up in the middle of the array:
"UHAccountSwitchStore":{"site":"fpctx","crumb":"KV.dLYWGrgK","sendRequest":false,"isEnabled":true}
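A minimal sketch of the looser match discussed here, assuming the page body is already available as a string. The sample text is the fragment quoted above wrapped in a made-up outer object, and `find_crumb` is a hypothetical helper, not something in pandas-datareader.

```python
import re

# Hypothetical sample: the quoted fragment embedded in a larger JSON-like blob.
SAMPLE = ('{"context":{"UHAccountSwitchStore":{"site":"fpctx",'
          '"crumb":"KV.dLYWGrgK","sendRequest":false,"isEnabled":true}}}')

# No surrounding braces, so the key may sit anywhere inside an object.
CRUMB_RE = re.compile(r'"crumb" ?: ?"([A-Za-z0-9.]{11,})"')


def find_crumb(text):
    """Return the first crumb-like token in text, or None if there is none."""
    match = CRUMB_RE.search(text)
    return match.group(1) if match else None


print(find_crumb(SAMPLE))
```

The brace-anchored pattern from the diff finds nothing in this sample because `"crumb"` follows a comma rather than an opening brace.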
Good note, I'll push an update. All of my test cases found the crumbs nested in brackets but that's not very sustainable.
💯 👍
Latest commit addresses the remaining tests. I'm noticing Yahoo's API fails periodically. I tried addressing this with a pause multiplier and cookie refresher for subsequent requests, which significantly improves the success rate, but I still hit "Unable to read URL" errors every so often. Simply running the test again resolves it, but there ought to be a better way of getting around this. Anyone have ideas here? Also, I'm failing the test_yahoo_DataReader test but I can't figure out why - the frames appear to be identical to me. Not sure what's gone wrong with Eurostat, but I can't get that one to pass locally either, even on master.
…stock still failing (pydata#315)
…dling for empty data requests, and updates some test logic for pandas 0.20.x (notably ix deprecation)
The old API bombed on me during initial fetches, but not so much on incremental updates of historical data, so I suspect the limit had to do with bandwidth per minute as opposed to requests per minute. I say this because I had 4 processes consuming a queue of symbols and fetching them one per process at a time. It could be more convenient and faster to fetch multiple symbols at once, but for large date ranges, or for many medium-sized requests in a short period, a bandwidth-per-minute limit would explain why it bombs out. On the "Unable to read URL" errors, have you tried giving every fetch a second chance inside the fetcher instead of in the test?
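A rough sketch of the "second chance inside the fetcher" idea combined with a pause multiplier. Everything here is an assumption for illustration: `fetch_with_retry` is a hypothetical helper, `fetch` stands in for whatever performs the HTTP request, and `IOError` is only a guess at the exception raised for "Unable to read URL".

```python
import time


def fetch_with_retry(fetch, url, retries=3, pause=0.1, multiplier=2.0):
    """Call fetch(url), retrying on IOError with a growing pause.

    fetch      -- callable taking a URL and returning the response body
    retries    -- total number of attempts before giving up
    pause      -- seconds to wait after the first failure
    multiplier -- factor applied to the pause after each failure
    """
    delay = pause
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except IOError:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            time.sleep(delay)
            delay *= multiplier
```

With the retry inside the fetcher, a test would only fail once every attempt has failed, rather than on the first transient error.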
pandas_datareader/yahoo/daily.py
Outdated
params=self.params, headers=self.headers)
out = str(self._sanitize_response(response))
# Matches: {"crumb":"AlphaNumeric"}
regex = re.search(r'"crumb" ?: ?"([A-Za-z0-9.\\]{11,})"', out)
This token can contain some special characters and escaped unicode. For my local workaround this seems to work consistently (python3, may need adjusting for compat)
pat = r'"CrumbStore":{"crumb":"([^"]+)"}'
token = re.findall(pat, out)[0]
token = token.encode('ascii').decode('unicode-escape')
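To illustrate what the decode step does, here is a small self-contained run of the same three lines. The page fragment and the crumb value are made up for the example; only the pattern and the decode call come from the comment above.

```python
import re

# Hypothetical page fragment: the crumb contains an escaped "/" (\u002F).
SAMPLE = r'"CrumbStore":{"crumb":"abc\u002Fdef.GHI"},"other":1'

pat = r'"CrumbStore":{"crumb":"([^"]+)"}'
raw = re.findall(pat, SAMPLE)[0]

# Turn the literal backslash-u sequence into the character it stands for.
token = raw.encode('ascii').decode('unicode-escape')

print(raw)    # abc\u002Fdef.GHI
print(token)  # abc/def.GHI
```

A character class such as `[A-Za-z0-9.]` would capture the escaped form at best, which is not the token the server expects.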
Thanks, Chris! I've updated the pattern per your suggestion; it does seem to handle edge cases that I was missing.
ideally you can add some more tests to cover the actual error conditions. not sure how tricky that is though.
conda.recipe/bld.bat
Outdated
@@ -1,8 +0,0 @@
"%PYTHON%" setup.py install
if errorlevel 1 exit 1
these are removed in master (so rebase to remove them from your PR)
@@ -85,6 +86,8 @@ def _read_url_as_StringIO(self, url, params=None):
response = self._get_response(url, params=params)
text = self._sanitize_response(response)
out = StringIO()
if len(text) == 0:
ideally return a more informative error (e.g. service name / url)
Updated the error to include subclass and requested URL, and cleaned up my PR.
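For readers following along, a sketch of what an empty-response error naming the subclass and URL could look like. This is not the code from the PR: `EmptyResponseError`, `ExampleReader`, and `_text_to_stringio` are hypothetical names, and the message wording is an assumption.

```python
from io import StringIO


class EmptyResponseError(IOError):
    """Hypothetical error type for a response with an empty body."""


class ExampleReader:
    """Hypothetical stand-in for a reader subclass."""

    def _text_to_stringio(self, text, url):
        # Fail early with the reader class and URL, so the caller can tell
        # which service and which request came back empty.
        if len(text) == 0:
            raise EmptyResponseError(
                "{} request returned no data; check URL for invalid "
                "inputs: {}".format(type(self).__name__, url))
        out = StringIO()
        out.write(text)
        out.seek(0)  # rewind so the caller reads from the start
        return out
```

Using `type(self).__name__` means each subclass reports its own name without overriding the method.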
@@ -132,17 +127,18 @@ def test_get_data_multiple_symbols(self):
def test_get_data_multiple_symbols_two_dates(self):
pan = web.get_data_yahoo(['GE', 'MSFT', 'INTC'], 'JAN-01-12',
'JAN-31-12')
result = pan.Close.ix['01-18-12']
assert len(result) == 3
result = pan.Close['01-18-12'].transpose()
.T is common
pls add a release note (will push a new whatsnew a bit later)
Are these changes only in the master branch and not on PyPI?
@arose13 These have not yet been merged in.
@rgkimball my bad, is that because of the CI tests?
No worries - these changes have been approved, we just need some release documentation.
@rgkimball a couple of tests are failing; can you fix up the tests as needed (if they are applicable to this change)?
The Swiss and Tokyo exchanges are not working. Tokyo isn't showing up in the English version of Yahoo Finance for some reason, but Swiss does.
Not all assets are returned as a numeric type. For instance, the symbol 'BRK-A' is returned with a string data type and I must apply numeric coercion to it.
Is this still ongoing, and will it be merged into the new 0.4.1 release?
Eventually, that's the plan. I don't have time to work on this until next week, unfortunately.
@rgkimball I found that there are many 'null' strings in the data with the new API. Would it be possible to fix that? 2016-06-08,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0
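One possible way to handle both the string-typed columns and the 'null' placeholders is to coerce each cell while parsing. The sketch below uses only the standard library; the CSV body, its column names, and `to_number` are hypothetical and are not taken from Yahoo's actual response or from this PR.

```python
import csv
import io
import math

# Hypothetical CSV body; the second row mimics the 'null' placeholders.
RAW = """Date,Open,High,Low,Close,Adj Close,Volume
2016-06-08,0.600000,0.600000,0.600000,0.600000,0.600000,0
2016-06-09,null,null,null,null,null,null
"""


def to_number(cell):
    """Convert a CSV cell to float, mapping 'null' or '' to NaN."""
    if cell in ("null", ""):
        return math.nan
    return float(cell)


rows = []
for record in csv.DictReader(io.StringIO(RAW)):
    date = record.pop("Date")
    rows.append((date, {key: to_number(value) for key, value in record.items()}))

print(rows[0][1]["Close"])              # 0.6
print(math.isnan(rows[1][1]["Close"]))  # True
```

When working in pandas directly, the `na_values` argument of `read_csv` and `pd.to_numeric(..., errors='coerce')` serve a similar purpose.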
Regarding the many 'null' strings in the data with the new API, see
Can you see if you can get this passing?
Thanks - what's the status of this?
I just submitted a documentation change, PR #349, and Travis fails with a number of errors: https://travis-ci.org/pydata/pandas-datareader/builds/244005471
superseded by #355
Based on multiple issues raised with this and similar libraries, it appears that Yahoo's iCharts API is being phased out. The new API requires a cookie and updated URL/parameter structure, implemented by this PR. See:
This will fix standard pulls for interval prices for a single stock, but not yet for multiple stocks - if the maintainers agree with the approach I can continue to debug and build this out, otherwise we can scrap it.