-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TST: Add tests to ensure proper conversion of datetime64/timedelta64/timestamp/duration to and from pyarrow #54356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST: Add tests to ensure proper conversion of datetime64/timedelta64/timestamp/duration to and from pyarrow #54356
Conversation
As per this issue from the pyarrow repository there was error in convert_dtypes and pa.table()/pd.to_pandas() in which pyarrow would always return [ns] as unit regardless of associated metadeta. I have added tests to ensure the conversion always preserves associated unit. Pyarrow has since fixed and closed the issue in their current main branch (dev version 13.0), and so when it is publicly available the if conditionals mentioned in #54191 can be considered for removal with these tests to ensure functionality remains the same after their removal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you only add 1 test for the example in #54191? The test should go in test_convert_dtypes.py
…pyarrow-conversion-tests bring local up to date
…ian048/pandas into add-pyarrow-conversion-tests bring branch up to date with local
I've fixed the test to match the issue and put it in the proper location, but I'm still having a failing CI "minimum versions" and I can't pinpoint why if anyone could help with that. |
Might help to merge main again |
def test_convert_dtypes_timestamp_and_duration(self): | ||
# GH 54191 | ||
pytest.importorskip("pyarrow") | ||
timestamp_series = pd.Series( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you mirror the exact code snippet in #54191 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what you're referencing right?
s = pd.Series(pd.date_range("2020-01-01", "2020-01-02", freq="1min"))
s = s.astype("timestamp[ms][pyarrow]")
t = s.convert_dtypes(dtype_backend="pyarrow")
pd.testing.assert_series_equal(s, t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct
…ian048/pandas into add-pyarrow-conversion-tests Bring branch up to date with main
…yarrow-conversion-tests
…ian048/pandas into add-pyarrow-conversion-tests Bring up to date with main
I apologize for all the commits, git is new to me |
Thanks @Julian048 |
…timestamp/duration to and from pyarrow (pandas-dev#54356) * add pyarrow conversion pytests * make test match GH issue and move test to proper location * fix naming of expected and result variables for assertion * optimize test * make test mirror GH pandas-dev#54191 * fix test logic
convert_dtypes(dtype_backend="pyarrow")
converts"timestamp[ms][pyarrow]"
to"timestamp[ns][pyarrow]"
#54191