-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Patch for skipping seek() when loading Excel files from urllib3.response.HTTPResponse object #28826
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm OK thanks. I think this was intentionally removed here https://github.com/pandas-dev/pandas/pull/26233/files#r279220473 so @tdamsma might have thoughts as well.
In any case is there a way to add a test for this?
I just tested with 25.1 (on osx), can't reproduce. I took this from #20437: import pandas
url = "https://raw.github.com/pandas-dev/pandas/master/pandas/tests/io/data/test1.xls"
df = pandas.read_excel(url)
print(df) I believe there are tests for downloading from a url included in the test suite. |
Thanks @tdamsma ! I actually was not able to reproduce this on master either. @hellbe can you try there? |
Oh, slight difference in use actually. The reason is that I downloaded first using
Throws |
So in my case I can easily fix my code by passing the url directly instead. However, there might be other situations where extra functionality from https://requests.kennethreitz.org/en/master/user/quickstart/#raw-response-content So So I think the patch is still valid, but maybe there's a nicer way to mitigate? |
Updated the issue with the above information. |
Updated the PR commit to match above and refer to new issue |
Rebased |
Surely possible to create a test case specifically for reading urllib3's HTTPResponse objects but I'm not sure if it's meaningful tbh |
Probably. Let me check. I still have the old PR here waiting for me. (I'm at a code sprint at the moment, Might tackle it again this week.) |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Discovered that 0.25.x reintroduces #20437. Proposing the same fix here.