Replace Yahoo iCharts API #331

rgkimball · 2017-05-17T18:35:13Z

Based on multiple issues raised with this and similar libraries, it appears that Yahoo's iCharts API is being phased out. The new API requires a cookie and updated URL/parameter structure, implemented by this PR. See:

This will fix standard pulls for interval prices for a single stock, but not yet for multiple stocks - if the maintainers agree with the approach I can continue to debug and build this out, otherwise we can scrap it.

heyuhere · 2017-05-17T21:55:46Z

pandas_datareader/yahoo/daily.py

+                                      params=self.params, headers=self.headers)
+        out = str(self._sanitize_response(response))
+        # Matches: {"crumb":"AlphaNumeric"}
+        regex = re.search(r'{"crumb" ?: ?"([A-Za-z0-9.]{11,})"}', out)


Not sure if it is just me, but I had to change the regex a bit to make it work:

regex = re.search(r'"crumb":"([A-Za-z0-9.]{11})"', out)

I guess stripping out {} might be the key as I saw the crumb show up in the middle of the array:

"UHAccountSwitchStore":{"site":"fpctx","crumb":"KV.dLYWGrgK","sendRequest":false,"isEnabled":true}

Good note, I'll push an update. All of my test cases found the crumbs nested in brackets but that's not very sustainable.
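
To illustrate the difference on the fragment quoted above, here is a minimal sketch that runs both patterns against it. The variable names (`strict`, `relaxed`, `crumb`) are illustrative only and are not taken from the PR.

```python
import re

# Fragment quoted in the review comment: the crumb sits in the middle of an object.
out = ('"UHAccountSwitchStore":{"site":"fpctx","crumb":"KV.dLYWGrgK",'
       '"sendRequest":false,"isEnabled":true}')

# Pattern from the diff: requires "crumb" to be the only key inside the braces.
strict = re.search(r'{"crumb" ?: ?"([A-Za-z0-9.]{11,})"}', out)

# Pattern from the review comment: no surrounding braces required.
relaxed = re.search(r'"crumb":"([A-Za-z0-9.]{11})"', out)

# Fall back to None when nothing matches instead of raising AttributeError.
crumb = relaxed.group(1) if relaxed else None
```

The brace-anchored pattern finds nothing here because the crumb is preceded by another key, while the relaxed pattern captures the 11-character token.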

rgkimball · 2017-05-18T16:59:31Z

Latest commit addresses the remaining tests. I'm noticing Yahoo's API fails periodically - I tried addressing this with a pause multiplier and cookie refresher for subsequent requests, which significantly improves successful requests, but I still hit "Unable to read URL" errors every so often. This is resolved by simply running the test again, but there ought to be a better way of getting around this. Anyone have ideas here?

Also, I'm failing the test_yahoo_DataReader test but I can't figure out why - the frames appear to be identical to me.

Not sure what's gone wrong with Eurostat but I can't get that one to pass locally either, even on master.


aking1012 · 2017-05-19T09:48:21Z

When the old API bombed on me (it did during initial fetches, but not so much on incremental updates of historical data), I suspected it had to do with bandwidth per minute as opposed to requests per minute. I say this because I had 4 processes consuming a queue of symbols to fetch, each going after one symbol at a time. It could be more convenient and faster to fetch multiple symbols at once, but for large chunks of time, or small periods with medium chunks of time, a bandwidth-per-minute limit making it bomb out would make sense.

On the "Unable to read URL" errors, have you tried giving every fetch a second chance inside the fetcher instead of in the test?
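
As a rough sketch of what a retry inside the fetcher could look like, combined with the pause multiplier mentioned earlier: the function `fetch_with_retry`, its parameters, and the `flaky_fetch` stand-in are all hypothetical and are not the code in this PR.

```python
import time


def fetch_with_retry(fetch, retries=3, pause=0.1, multiplier=2.0, sleep=time.sleep):
    """Call fetch() up to `retries` times, lengthening the pause after each failure."""
    delay = pause
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except IOError:
            if attempt == retries:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay *= multiplier  # wait longer before the next attempt


# Stand-in for a request that fails twice, then succeeds (hypothetical).
calls = []


def flaky_fetch():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("Unable to read URL")
    return "Date,Open\n2017-05-18,1.0\n"


# Record the pauses instead of actually sleeping, to keep the demo fast.
pauses = []
result = fetch_with_retry(flaky_fetch, sleep=pauses.append)
```

Passing `sleep` as an argument is only a convenience for testing; a real fetcher would use the default `time.sleep`.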

chris-b1 · 2017-05-19T17:59:52Z

pandas_datareader/yahoo/daily.py

+                                      params=self.params, headers=self.headers)
+        out = str(self._sanitize_response(response))
+        # Matches: {"crumb":"AlphaNumeric"}
+        regex = re.search(r'"crumb" ?: ?"([A-Za-z0-9.\\]{11,})"', out)


This token can contain some special characters and escaped unicode. For my local workaround this seems to work consistently (python3, may need adjusting for compat)

pat = r'"CrumbStore":{"crumb":"([^"]+)"}'
token = re.findall(pat, out)[0]
token = token.encode('ascii').decode('unicode-escape')

Thanks, Chris! I've updated the pattern per your suggestion; it does seem to handle edge cases that I was missing.
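
For illustration, here is the workaround applied to a made-up page fragment whose crumb contains an escaped slash. The `out` string is a hypothetical example, not real Yahoo output.

```python
import re

# Hypothetical page fragment: the crumb contains an escaped slash (\u002F).
out = 'window.data = {"CrumbStore":{"crumb":"abc\\u002Fdef12"}};'

# Capture everything between the quotes rather than a fixed character class.
pat = r'"CrumbStore":{"crumb":"([^"]+)"}'
token = re.findall(pat, out)[0]

# Turn the literal \u002F sequence into the character it stands for.
decoded = token.encode('ascii').decode('unicode-escape')
```

Capturing `[^"]+` avoids enumerating which characters a token may contain, and the decode step converts the escape sequence into the actual character.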


jreback

ideally you can add some more tests to cover the actual error conditions. not sure how tricky that is though.

jreback · 2017-05-19T20:59:25Z

conda.recipe/bld.bat

@@ -1,8 +0,0 @@
-"%PYTHON%" setup.py install
-if errorlevel 1 exit 1
-


these are removed in master (so rebase to remove them from your PR)

jreback · 2017-05-19T21:00:02Z

pandas_datareader/base.py

@@ -85,6 +86,8 @@ def _read_url_as_StringIO(self, url, params=None):
        response = self._get_response(url, params=params)
        text = self._sanitize_response(response)
        out = StringIO()
+        if len(text) == 0:


ideally return a more informative error (e.g. service name / url)

Updated the error to include subclass and requested URL, and cleaned up my PR.
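
A minimal sketch of what such an error could look like follows. The helper `ensure_not_empty`, the message wording, and the example URL are assumptions for illustration; the PR's actual change lives in `_read_url_as_StringIO`.

```python
def ensure_not_empty(text, reader_name, url):
    """Return text unchanged, or raise an IOError naming the reader and URL."""
    if len(text) == 0:
        raise IOError(
            "{} request returned no data; check the URL for invalid "
            "inputs: {}".format(reader_name, url)
        )
    return text


# Sample values only: the reader name is passed as a plain string here.
try:
    ensure_not_empty("", "YahooDailyReader", "https://example.invalid/download")
except IOError as exc:
    message = str(exc)
```

Including both the reader name and the URL lets a user tell which service failed and reproduce the request by hand.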

jreback · 2017-05-19T21:00:29Z

pandas_datareader/tests/yahoo/test_yahoo.py

@@ -132,17 +127,18 @@ def test_get_data_multiple_symbols(self):
    def test_get_data_multiple_symbols_two_dates(self):
        pan = web.get_data_yahoo(['GE', 'MSFT', 'INTC'], 'JAN-01-12',
                                 'JAN-31-12')
-        result = pan.Close.ix['01-18-12']
-        assert len(result) == 3
+        result = pan.Close['01-18-12'].transpose()


.T is common


jreback

pls add a release note (will push a new whatsnew a bit later)

arose13 · 2017-05-21T14:25:01Z

Are these changes only in the master branch and not yet on PyPI?

rgkimball · 2017-05-21T14:32:54Z

@arose13 These have not yet been merged in.

arose13 · 2017-05-21T14:33:43Z

@rgkimball my bad, is that because of the CI tests?

rgkimball · 2017-05-21T14:36:13Z

No worries - these changes have been approved, we just need some release documentation.

jreback · 2017-05-22T12:47:15Z

@rgkimball a couple of tests are failing; can you fix up the tests as needed (if they are applicable to this change)?

mijcs18 · 2017-05-23T10:45:24Z

The Swiss and Tokyo exchanges are not working. Tokyo isn't showing up in the English version of Yahoo Finance for some reason, but Swiss does.

willfleury · 2017-05-23T12:40:29Z

Not all assets are returned as a numeric type. For instance, the symbol 'BRK-A' is returned as a string data type, and I must apply numeric coercion to it: df.apply(pd.to_numeric, errors='coerce')
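
As a small sketch of that coercion on made-up data (the frame below is hypothetical and only mimics string-typed price columns):

```python
import pandas as pd

# Hypothetical frame where prices arrived as strings, one of them unparseable.
df = pd.DataFrame({"Open": ["1.5", "null"], "Close": ["2.5", "3.0"]})

# Apply pd.to_numeric column by column; unparseable values become NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")
```

With `errors="coerce"`, values that cannot be parsed turn into NaN rather than raising, so the columns end up with a float dtype.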

bkcollection · 2017-05-31T14:00:05Z

Is this still ongoing, and will it be merged into the new 0.4.1 release?

rgkimball · 2017-05-31T14:12:47Z

Eventually, that's the plan. I don't have time to work on this until next week, unfortunately.

bkcollection · 2017-06-03T07:06:29Z

@rgkimball I found that there are many 'null' strings in the data with the new API. Would it be possible to fix that?

2016-06-08,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0
2016-06-09,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0
2016-06-10,AWG,null,null,null,null,null,null
2016-06-13,AWG,null,null,null,null,null,null
2016-06-14,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,1000
2016-06-15,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,2000
2016-06-16,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0
2016-06-17,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0

OlegShteynbuk · 2017-06-05T01:43:03Z

Regarding the many 'null' strings in the data with the new API, see
pandas-dev/pandas#16471
#342 has a temporary fix.
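
One possible way to handle such rows when parsing, sketched on a few of the lines quoted above. This is not necessarily what #342 does, and the column names are assumed, since the quoted data has no header row.

```python
from io import StringIO

import pandas as pd

# A few of the rows quoted above, including two with 'null' in every field.
raw = """2016-06-09,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,0
2016-06-10,AWG,null,null,null,null,null,null
2016-06-13,AWG,null,null,null,null,null,null
2016-06-14,AWG,0.600000,0.600000,0.600000,0.600000,0.600000,1000
"""

# Column names are an assumption made for this sketch.
columns = ["Date", "Symbol", "Open", "High", "Low", "Close", "Adj Close", "Volume"]

# Explicitly treat the string 'null' as a missing value while parsing.
df = pd.read_csv(StringIO(raw), header=None, names=columns, na_values=["null"])

# Optionally drop the rows that carry no price at all.
clean = df.dropna(subset=["Close"])
```

Whether to drop the empty rows or keep them as NaN depends on the caller; keeping them preserves the original date index.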

jreback · 2017-06-10T10:11:18Z

can u see if u can get this passing

gusgordon · 2017-06-14T03:04:45Z

Thanks - what's the status of this?

gliptak · 2017-06-17T16:08:45Z

I just submitted a documentation change, PR #349, and Travis fails with a number of errors: https://travis-ci.org/pydata/pandas-datareader/builds/244005471

gliptak · 2017-06-24T16:14:25Z

After #352, #351 and #350 are on master, I will open a PR to update for the Yahoo API changes. Thanks

jreback · 2017-07-02T15:11:08Z

superseded by #355

This was referenced May 17, 2017

Can't get Crude Oil data from Yahoo! Finance #290

Closed

Having weird issues with the data reader fetching yahoo finance #332

Closed

heyuhere reviewed May 17, 2017


rgkimball added 9 commits May 18, 2017 20:15

Replaces ichart API for single-stock price exports from Yahoo, multi-…

c86cbf8

…stock still failing (pydata#315)

Restores change necessary for Google to function

a9bcc9e

Fixes yahoo-actions per API endpoint update

605eb43

Update regex pattern for crumbs, per heyuhere's review

77f349b

'v' is no longer a valid interval value

01cfe68

Fixes Yahoo intervals and cases where the Yahoo cookie could not be e…

5de5d1e

…xtracted.

Implements multi-stock queries to Yahoo API

9e1c1e0

Adds a pause multiplier for subsequent requests from Yahoo, error han…

61fc09e

…dling for empty data requests, and updates some test logic for pandas 0.20.x (notably ix deprecation)

Check object type before checking contents

683ab75

chris-b1 reviewed May 19, 2017


Replacement regex logic for additional Yahoo cookie token structures,…

9c1d506

… per chris-b1

rgkimball mentioned this pull request May 19, 2017

Issues with the data reader fetching yahoo finance #315

Closed

jreback approved these changes May 19, 2017


Improved error handling and refactoring test to best practices, per j…

3561dbd

…reback review.

rgkimball force-pushed the fix-yahoo branch from 2d10233 to 3561dbd Compare May 19, 2017 21:45

jreback reviewed May 19, 2017


rgkimball force-pushed the fix-yahoo branch from f24dd8d to a7bacac Compare May 19, 2017 22:40

gusgordon mentioned this pull request May 24, 2017

Yahoo URL Error quantopian/pyfolio#385

Closed

Applies additional null type, per OlegShteynbuk pydata#342

b3cc479

rgkimball force-pushed the fix-yahoo branch from a7bacac to cea91ad Compare June 8, 2017 23:20

rgkimball added 2 commits June 9, 2017 09:27

Update test to sort DF columns.

159d25e

REL: Release 0.4.1 pydata#315

6324028

rgkimball force-pushed the fix-yahoo branch from cea91ad to 6324028 Compare June 9, 2017 15:28

gliptak mentioned this pull request Jun 18, 2017

Fails on Yahoo API change davidastephens/pandas-finance#5

Closed

This was referenced Jun 28, 2017

Yahoo Finance data broken again #354

Closed

Replace Yahoo iCharts API #355

Merged

jreback closed this Jul 2, 2017
