-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: during interchanging from non-pandas tz-aware data #54287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: during interchanging from non-pandas tz-aware data #54287
Conversation
|
||
arr = pa.array([datetime(2020, 1, 1), None, datetime(2020, 1, 2)]) | ||
arr = pc.assume_timezone(arr, "Asia/Kathmandu") | ||
result = pa.table({"arr": arr}).to_pandas() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait we need to go via the interchange protocol - could you use result = from_dataframe(pa.table(...))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, my mistake. I'll do that via the interchange protocol.
|
||
arr = pa.array([datetime(2020, 1, 1), None, datetime(2020, 1, 2)]) | ||
arr = pc.assume_timezone(arr, "Asia/Kathmandu") | ||
result = from_dataframe(pa.table({"arr": arr}).to_pandas()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the interchange protocol means we can avoid to_pandas
calls - so long as __dataframe__
is available (which it is for pyarrow tables) we can just do:
result = from_dataframe(pa.table({"arr": arr}).to_pandas()) | |
result = from_dataframe(pa.table({"arr": arr})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strange, it doesn't work for me. If I chage the line 307 to result = from_dataframe(pa.table({"arr": arr}))
, I get an error
pandas.errors.SettingWithCopyError: modifications to a method of a datetimelike object are not supported and are discarded. Change values on the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks - that looks like a warning which should be suppressed somewhere
could you please:
- in this test, just add
with tm.assert_produces_warning
to silence it and get the test to pass - open a new issue about how the snippet above produces a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems I can’t silence it, because it’s an error. Looks like it comes from the method _delegate_method
. I corrected the test, now it checks this error.
|
||
def test_interchange_from_non_pandas_tz_aware(): | ||
# GH 54239 | ||
import pyarrow as pa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import pyarrow as pa | |
pa = pytest.importorskip("pyarrow") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I tried it, but still have the same error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will help CI builds that don't have pyarrow installed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, that is the reason why many checks failed
exchange_df = table.__dataframe__() | ||
|
||
msg = "modifications to a method of a datetimelike object are not supported." | ||
with pytest.raises(SettingWithCopyError, match=msg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be raising an error though
In
try:
data[null_pos] = None
except TypeError:
# TypeError happens if the `data` dtype appears to be non-nullable
# in numpy notation (bool, int, uint). If this happens,
# cast the `data` to nullable float dtype.
data = data.astype(float)
data[null_pos] = None
, could we add an extra except
, which catches SettingWithCopyError
, and in that case makes a copy of data
(data = data.copy()
) and then sets the nulls (data[null_pos] = None
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I added the extra except
, it works very well. I updated my test and added a line to whatsnew/v2.1.0.rst as well.
|
||
def test_interchange_from_non_pandas_tz_aware(): | ||
# GH 54239, 54287 | ||
pa = pytest.importorskip("pyarrow") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may need to set a minimum version here
pa = pytest.importorskip("pyarrow") | |
pa = pytest.importorskip("pyarrow", "11.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I added the minimum version here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments, but the rest looks good to me!
doc/source/whatsnew/v2.1.0.rst
Outdated
@@ -682,6 +682,7 @@ Other | |||
- Bug in :class:`DataFrame` and :class:`Series` raising for data of complex dtype when ``NaN`` values are present (:issue:`53627`) | |||
- Bug in :class:`DatetimeIndex` where ``repr`` of index passed with time does not print time is midnight and non-day based freq(:issue:`53470`) | |||
- Bug in :class:`FloatingArray.__contains__` with ``NaN`` item incorrectly returning ``False`` when ``NaN`` values are present (:issue:`52840`) | |||
- Bug in :func:`api.interchange.from_dataframe` was raising during interchanging from non-pandas tz-aware data containing ``NaN`` values (:issue:`54287`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we write
non-pandas tz-aware data containing null values
?
technically nan
is different, and specifically refers to invalid mathematical operations like 0/0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, my mistake. I corrected that.
# SettingWithCopyError happens if the `data` appears | ||
# to have 'NaT'. If this happens, copy the `data`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"appears to" sounds a bit mysterious :) how about
`SettingWithCopyError` may happen for datetime-like with missing values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I changed the wording as you suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work! looks good to me on green
Thanks @natmokval |
xref #54246
Added a test for interchanging from non-pandas tz-aware.
EDIT: fixed bug which was raising during interchanging from non-pandas tz-aware data.