TST: Test non-nanosecond datetimes in PyArrow Parquet dataframes #59393
Conversation
@natmokval @WillAyd Please review when you have time. Thank you!
All tests pass in my Docker container, but the CI build fails. I'll look into it.
Thanks for the PR, though I'm wondering if this is that much different than the test_read_dtype_backend_pyarrow_config that already exists?
@jbrockmendel does the most with datetimes so may want to chime in here as well
Also, this does potentially solve the example from the OP in #49236, although we may not want to mark this as closing that. Not sure if there is a need for the requested keyword beyond this one example.
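For reference, here is a minimal sketch of the pyarrow-level workaround discussed in #49236: passing timestamp_as_object=True to Table.to_pandas(), which read_parquet does not currently expose. The file name and timestamp value are illustrative, not taken from the issue.

```python
import datetime

import pyarrow as pa
import pyarrow.parquet as pq

# Illustrative sketch of the workaround from #49236: a timestamp far outside the
# datetime64[ns] range, stored at millisecond resolution.
table = pa.table(
    {"ts": pa.array([datetime.datetime(2500, 1, 1)], type=pa.timestamp("ms"))}
)
pq.write_table(table, "example.parquet")

# read_parquet offers no way to forward this keyword to to_pandas(), hence the
# request for a to_pandas_kwargs argument in #49236.
df = pq.read_table("example.parquet").to_pandas(timestamp_as_object=True)
print(df["ts"].dtype)  # object column holding datetime.datetime instances
```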
@WillAyd Thank you very much for your review!
While
Sounds good and thanks for explaining. I think it's ok to add - generally more tests are good.
Whoops sorry for the bad guidance on ensure_clean()
@mroeschke thanks for the suggestions!
Co-authored-by: Matthew Roeschke <[email protected]>
The failing environment uses pandas 1.5.3, which fails with the same error as in the original issue ticket.
@mroeschke could you please help with my question above? Thanks!
You'll probably need to specify the
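The suggestion above is truncated; assuming it refers to gating the test on a minimum dependency version, one common pytest pattern is sketched below. The library and version number are placeholders, not values taken from this conversation.

```python
import pytest

# Hypothetical sketch only: skip the test module unless the installed pyarrow is
# at least the given version. Both the module name and "13.0.0" are placeholders.
pa = pytest.importorskip("pyarrow", minversion="13.0.0")
```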
Thanks @EduardAkhmetshin |
This PR tests that the current version of pandas supports non-nanosecond PyArrow Parquet data without using the timestamp_as_object parameter.
- Related issue: ENH: to_pandas_kwargs in read_parquet for pyarrow engine #49236
- Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
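As a rough illustration of the behaviour this PR exercises (not the PR's actual test code), the sketch below round-trips millisecond-resolution timestamps through Parquet without timestamp_as_object; it assumes pandas >= 2.0 and a pyarrow version that preserves non-nanosecond resolutions in to_pandas().

```python
import pyarrow as pa
import pyarrow.parquet as pq

import pandas as pd


def test_non_nano_parquet_roundtrip(tmp_path):
    # Write a pyarrow table holding millisecond-resolution timestamps to Parquet.
    table = pa.table(
        {"ts": pa.array([1_000, 2_000, 3_000], type=pa.timestamp("ms"))}
    )
    path = tmp_path / "non_nano.parquet"
    pq.write_table(table, path)

    # Read it back with the pyarrow engine, without timestamp_as_object.
    result = pd.read_parquet(path, engine="pyarrow")

    # With pandas >= 2.0 and a recent pyarrow, the millisecond resolution is
    # expected to survive instead of being coerced to datetime64[ns].
    assert result["ts"].dtype == "datetime64[ms]"
```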