-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: read_html - file path cannot be pathlib.Path type #37736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @inspurwusixuan for the PR!
Can you add tests? As reviewers that's the first thing we look for
You'll also need a whatsnew entry (probably 1.2)
Thanks for the reply! @arw2019
Do you mean adding test cases in the test_html.py?
Can you explain a little bit more what is the whatsnew entry and where should I start? This is actually my first open-source contribution and I'm still trying to get familiar with everything. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a test in test_html.py. You can for example try to read pandas/tests/io/data/html/spam.html
one time with the path as a string and then with pathlib.Path and then compare that both create the same DataFrame.
The whatsnew is here. You probably want to add one line in the Bug fix I/O section.
Yes that sounds like the place
We document user-facing changes to the library on the whatsnew docs page. Here is the most recent one: The entry for this patch should probably go under
Welcome to pandas! |
Thanks for the comments! I will follow up with the test case and whatsnew entry. @arw2019 |
Hello @inspurwusixuan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-11-10 23:34:15 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - mostly minor comments
Co-authored-by: William Ayd <[email protected]>
Co-authored-by: William Ayd <[email protected]>
doc/source/whatsnew/v1.2.0.rst
Outdated
@@ -507,6 +507,11 @@ I/O | |||
- Bug in :class:`HDFStore` was dropping timezone information when exporting :class:`Series` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`) | |||
- :func:`read_csv` was closing user-provided binary file handles when ``engine="c"`` and an ``encoding`` was requested (:issue:`36980`) | |||
- Bug in :meth:`DataFrame.to_hdf` was not dropping missing rows with ``dropna=True`` (:issue:`35719`) | |||
<<<<<<< HEAD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge issue here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed merge issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm outside of merge issue @arw2019 commented on
thanks @inspurwusixuan |
Fix
read_html
TypeError when parsingpathlib.Path
object.