-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Closes #36541 (BUG: ValueError: cannot convert float NaN to integer when resetting MultiIndex with NaT values) #36563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/source/whatsnew/v1.2.0.rst
Outdated
@@ -310,6 +310,7 @@ MultiIndex | |||
^^^^^^^^^^ | |||
|
|||
- Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises ``TypeError`` with message `Expected label or tuple of labels` (:issue:`35301`) | |||
- Bug in :func:`construct_1d_arraylike_from_scalar` when using :meth:`DataFrame.reset_index` with ``NaT`` values in index raises ``ValueError`` with message `cannot convert float NaN to integer` (:issue:`36541`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the first part - these are always user facing notes
pandas/core/dtypes/cast.py
Outdated
@@ -1559,6 +1559,11 @@ def construct_1d_arraylike_from_scalar( | |||
dtype = np.dtype("object") | |||
if not isna(value): | |||
value = ensure_str(value) | |||
elif isinstance(dtype, np.dtype) and dtype.kind == "M" and value is NaT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use is_nat_for_dtype (is not exactly this same but similar)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_valid_nat_for_dtype is the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't just use is_valid_nat_for_dtype
as it also returned true for is_valid_nat_for_dtype(nan, float64)
(which was a problem for the test_modulo2
test case for instance (I didn't check others which also failed)).
result = result.reset_index() | ||
|
||
expected = pd.DataFrame({"a": [pd.NaT, pd.NaT], "b": [1, 2], "x": [11, 12]}) | ||
pd.testing.assert_frame_equal(result, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use tm.assert...
* simplified whatsnew to reflect intended users * use `is_valid_nat_for_dtype` * use `tm.assert_...`
* Still require kind == "M" check
I might need some help regarding the build error (Arm64, Xenial, Py3.7). Seems like there are some files missing, but I’m not sure what I could do to fix that...? |
# Conflicts: # doc/source/whatsnew/v1.2.0.rst
|
||
def test_nat_multi_index(): | ||
# GH36541: that reset_index() does not raise ValueError | ||
ix = pd.MultiIndex.from_tuples([(pd.NaT, 1), (pd.NaT, 2)], names=["a", "b"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try this with one element a Timestamp (and one a NaT)
also add a case where you have a Timedelta and a NaT (and I think we might need to change the code slightly)
can parameterize the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added test cases. please have another look and advise.
* Added NaT/Timestamp test case * Added NaT/Timedelta test case
pandas/core/dtypes/cast.py
Outdated
@@ -1559,6 +1559,11 @@ def construct_1d_arraylike_from_scalar( | |||
dtype = np.dtype("object") | |||
if not isna(value): | |||
value = ensure_str(value) | |||
elif dtype.kind == "M" and is_valid_nat_for_dtype(value, dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be dtype.kind in ['M', 'm']
(do the tests fail w/o this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, they didn't before with "M" only (except for the ARM tests which I believe have an unrelated problem).
Trying with timedelta "m" now...
pls merge master as well |
ready to merge @jreback |
thanks @ssche very nice! |
… integer when resetting MultiIndex with NaT values) (pandas-dev#36563)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff