-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix Categorical.astype for dtype=np.int32 argument #39615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine. can you add a note in fixed regressions for 1.2.2 & re-run appropriate asv's to make sure not much change.
@arw2019 when do you think you can do this? 1.2.2 release scheduled for today. we have a couple of blockers though. |
@simonjayhawkins this is ready to go assuming nothing comes up in benchmarks. Running them now, will post once they're done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok to merge ,a followup comment (which can be here or in a followon for 1.3)
pandas/tests/series/test_dtypes.py
Outdated
@@ -94,6 +94,14 @@ def cmp(a, b): | |||
result = ser.astype("object").astype(CategoricalDtype()) | |||
tm.assert_series_equal(result, roundtrip_expected) | |||
|
|||
def test_categorical_int_to_int32(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be named test_categorical_astype_to_int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pandas/tests/series/test_dtypes.py
Outdated
@@ -94,6 +94,14 @@ def cmp(a, b): | |||
result = ser.astype("object").astype(CategoricalDtype()) | |||
tm.assert_series_equal(result, roundtrip_expected) | |||
|
|||
def test_categorical_int_to_int32(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you parameterize on the any_int_dtype, ideally this also works on any_int_nullable_dtype (but not sure that will just work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done - created a new fixture any_int_or_nullable_int_dtype
@jreback addressed test comments here - does work already for all int dtypes, including nullable |
Doesn't seem to affect basic Categorical benchmarks (but will run some others that I think hit this)
|
@@ -1242,6 +1242,32 @@ def any_nullable_int_dtype(request): | |||
return request.param | |||
|
|||
|
|||
@pytest.fixture(params=tm.ALL_INT_DTYPES + tm.ALL_EA_INT_DTYPES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i guess this is a missing one
Some issues in groupby benchmarks (I don't have experience with how noisy these are - can rerun to see)
|
re-running the groupby benchmarks that had been affected:
|
since a different combination seems to be affected on a re-run i'm inclined to think that the perf regressions are (at least largely?) noise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm am concerned about this led issue
let's push this to 1.2.3 until can look at closer
ok! anything i can do to move this forward (is it worth getting reliable numbers for some of these with |
since categorical is used in groupby this slows down everything see if it's possible to bypass / special case the conversion (when it doesn't need to be converted); eg it's already int64 - might fix it |
Benchmarks as of e108f0d (change is to use
|
ok great |
thanks @arw2019 |
@meeseeksdev backport 1.2.x |
This comment has been minimized.
This comment has been minimized.
cc @simonjayhawkins @arw2019 for manual backport |
…p.int32 argument'
…gument' (#39676) Co-authored-by: Andrew Wieteska <[email protected]>
Fixing a bug I introduced in #37355