Skip to content

Remove some dead code in core/arrays/interval.py #46128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

Dr-Irv
Copy link
Contributor

@Dr-Irv Dr-Irv commented Feb 23, 2022

The typing work both @twoertwein and I did in #46080 and #44922 found some code that I conjecture is dead code as it doesn't make sense from a typing perspective.

So this PR removes the dead code, which will then allow either of those PR's to get rid of some #type : ignore lines.

Not clear to me how to add tests for dead code removal. In theory, all tests should pass when dead code is removed.

@Dr-Irv Dr-Irv added Clean Interval Interval data type labels Feb 23, 2022
@Dr-Irv Dr-Irv requested a review from jbrockmendel February 23, 2022 16:10
new_left = nc[:, 0].view(dtype)
new_right = nc[:, 1].view(dtype)
new_left = nc[:, 0].view(dtype)
new_right = nc[:, 1].view(dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this will break if self._left is a DTA/TDA?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly not sure what type _left and _right has (in #44922 I assumed np.ndarray but added a note that this is probably not correct - needs some mypy ignore comments). In your attempt at interval.pyi, you marked it as on of np.ndarray, "DatetimeArray", "TimedeltaArray".
https://github.com/pandas-dev/pandas/pull/41059/files#diff-fb052ef16d456d32858ec68d7e9449d4749166ea97c9a79b22d0ccf233dbb3a4R205

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this will break if self._left is a DTA/TDA?

Well, that's why I did the test. I don't think that self._left can be a DTA or TDA. I guess there is other logic that causes that never to be the case, and the tests seem to confirm that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dti = pd.date_range("2016-01-01", periods=3)
ii = pd.IntervalIndex.from_breaks(dti)
ser = pd.Series(ii)

ser.unique()  # <- gets here with DTA for self._left

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the dead code doesn't go there. The old code is:

        if needs_i8_conversion(dtype):
            new_left = type(self._left)._from_sequence(nc[:, 0], dtype=dtype)
            new_right = type(self._right)._from_sequence(nc[:, 1], dtype=dtype)
        else:
            new_left = nc[:, 0].view(dtype)
            new_right = nc[:, 1].view(dtype)

The new code is just:

        new_left = nc[:, 0].view(dtype)
        new_right = nc[:, 1].view(dtype)

In your example, the test needs_i8_conversion(dtype) is False

Here is what happens with 1.4.0 and this PR:

>>> import pandas as pd
>>> pd.__version__
'1.4.0'
>>> dti = pd.date_range("2016-01-01", periods=3)
>>> ii = pd.IntervalIndex.from_breaks(dti)
>>> ser = pd.Series(ii)
>>> ser.unique()
<IntervalArray>
[(2016-01-01, 2016-01-02], (2016-01-02, 2016-01-03]]
Length: 2, dtype: interval[datetime64[ns], right]

With the PR:

>>> import pandas as pd
>>> pd.__version__
'1.5.0.dev0+401.g84b119faea'
>>> dti = pd.date_range("2016-01-01", periods=3)
>>> ii = pd.IntervalIndex.from_breaks(dti)
>>> ser = pd.Series(ii)
>>> ser.unique()
<IntervalArray>
[(2016-01-01, 2016-01-02], (2016-01-02, 2016-01-03]]
Length: 2, dtype: interval[datetime64[ns], right]

That's why we can remove the if test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your example, the test needs_i8_conversion(dtype) is False

This doesn't make sense to me. dtype should be np.dtype("M8[ns]"), for which needs_i8_conversion(dtype) is True.

For a starker example, if you pass a tz to the date_range, the ser.unique raises in this branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, the tz example shows that what I thought was dead code can't be removed. Will close this PR then.

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Feb 25, 2022

closing per comment above.

@Dr-Irv Dr-Irv closed this Feb 25, 2022
@Dr-Irv Dr-Irv deleted the deadinterval branch February 25, 2022 18:27
@jbrockmendel
Copy link
Member

Could use a test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants