Fixed unexpected np.nan value with reindex on pd.series with pd.Inter… #54549


raj-thapa
Contributor

@raj-thapa raj-thapa commented Aug 15, 2023

The test added in this PR fails on 32-bit Linux. The failure is caused by a separate issue discussed in #23440, and the error is also described in the comments below. This PR marks the test as xfail on 32-bit systems.
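For readers unfamiliar with the pattern, here is a minimal sketch of a platform-conditional xfail. The names `IS_64BIT` and `test_reindex_placeholder` are hypothetical and only illustrate the idea; the marker actually used in the merged test may be written differently.

```python
import sys

import pytest

# Hypothetical helper: True on 64-bit Python builds, False on 32-bit builds.
IS_64BIT = sys.maxsize > 2**32


# The test is expected to fail only when running on a 32-bit build.
@pytest.mark.xfail(not IS_64BIT, reason="GH 23440: int64 -> int32 cast on 32-bit")
def test_reindex_placeholder():
    # Placeholder body; the real test reindexes a Series with an IntervalIndex.
    assert True
```

With a conditional xfail, the test still runs everywhere, and a failure is only tolerated on the platforms where the condition is true.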

@raj-thapa raj-thapa requested a review from WillAyd as a code owner August 15, 2023 00:38
Member

@mroeschke mroeschke left a comment

looks like the 32 bit tests are failing

@mroeschke mroeschke added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Interval Interval data type labels Aug 15, 2023
@raj-thapa
Contributor Author

raj-thapa commented Aug 15, 2023

I have tried to enforce the 64-bit integers on the index arrays.

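import numpy as np

from pandas import IntervalIndex, Series
import pandas._testing as tm


# base and expected_result are supplied by a pytest parametrization (not shown here)
def test_interval_index_reindex_behavior(base, expected_result):
    # GH 51826
    # Build the interval bounds explicitly as 64-bit integers
    left = np.arange(base, dtype=np.int64)
    right = np.arange(1, base + 1, dtype=np.int64)
    d = Series(
        range(base),
        index=IntervalIndex.from_arrays(left, right)
    )
    result = d.reindex(index=[np.nan, 1.0])
    tm.assert_series_equal(result, expected_result)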

However, the 32-bit tests still seem to fail. The specific error is:

FAILED pandas/tests/indexing/interval/test_interval.py::TestIntervalIndexInsideMultiIndex::test_interval_index_reindex_behavior[101-expected_result2] - TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'
FAILED pandas/tests/indexing/interval/test_interval.py::TestIntervalIndexInsideMultiIndex::test_interval_index_reindex_behavior[1010-expected_result3] - TypeError: Cannot cast array data from dtype('int64') to dtype('int32') according to the rule 'safe'

The traceback also points to line 196 of pandas/_libs/intervaltree.pxi, which is the return statement in this function:

cdef take(ndarray source, ndarray indices):
    """Take the given positions from a 1D ndarray
    """
    return PyArray_Take(source, indices, 0)

Does anyone have ideas on solving this unexpected casting issue?

Update: This error is already discussed in #23440.
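The cast named in the error message can be illustrated with plain NumPy. This is only a sketch under assumptions: it presumes that the indices reaching `PyArray_Take` must be castable to the platform index type `np.intp` (int32 on 32-bit builds), and it shows an explicit conversion as one possible workaround. It is not the fix adopted in this PR, which marks the test as xfail instead.

```python
import numpy as np

source = np.array([10, 20, 30, 40])
indices = np.array([0, 2, 3], dtype=np.int64)

# int64 -> int32 is never a "safe" cast, which matches the TypeError above.
int64_to_int32_is_safe = np.can_cast(np.int64, np.int32, casting="safe")

# np.intp is int64 on 64-bit builds and int32 on 32-bit builds,
# so this check is platform dependent.
int64_to_intp_is_safe = np.can_cast(np.int64, np.intp, casting="safe")

# Converting the indices to the platform integer type up front avoids
# relying on an implicit safe cast inside take.
platform_indices = indices.astype(np.intp, copy=False)
taken = np.take(source, platform_indices, axis=0)
```

On a 64-bit build the `astype` call is a no-op, so the conversion only changes behavior on 32-bit platforms.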

@raj-thapa raj-thapa requested a review from mroeschke August 18, 2023 01:19
@mroeschke mroeschke added this to the 2.2 milestone Aug 18, 2023
@mroeschke mroeschke merged commit 67b19eb into pandas-dev:main Aug 18, 2023
@mroeschke
Member

Thanks @raj-thapa

mroeschke pushed a commit to mroeschke/pandas that referenced this pull request Aug 18, 2023
pandas-dev#54549)

* Fixed unexpected np.nan value with reindex on pd.series with pd.IntervalIndex

* Moving expected result to test body

* Fixed issues with Linux-32-bit test