Skip to content

CLN: Remove MultiIndex._get_grouper_for_level #49597

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

codamuse
Copy link
Contributor

@codamuse codamuse commented Nov 9, 2022

@codamuse codamuse marked this pull request as ready for review November 9, 2022 03:11
@codamuse codamuse changed the title CLN: deleted function CLN: Remove MultiIndex._get_grouper_for_level Nov 9, 2022
@mroeschke mroeschke requested a review from rhshadrach November 9, 2022 19:38
@mroeschke
Copy link
Member

I think according to #49452 (comment), we would like this to return a NotImplementedError

@rhshadrach
Copy link
Member

rhshadrach commented Nov 9, 2022

I think according to #49452 (comment), we would like this to return a NotImplementedError

I think so - the idea being that for a MultiIndex, the current implementation of Index._get_grouper_for_level would likely give odd errors or incorrect results if called on a MultiIndex. We should fail clearly if this is attempted.

@codamuse
Copy link
Contributor Author

codamuse commented Nov 9, 2022

Ok no problem. I'll push again today.

Sorry for some reason I thought I'd read a conversation where the decision was to remove it entirely instead of raise.

@jbrockmendel
Copy link
Member

if its just a 1-liner, id suggest putting it in the base class method with a quick check

if self._is_multi:
    raise NotImplementedError(...)

that way you can add @final to the Index method, which I always find helpful

@codamuse
Copy link
Contributor Author

if its just a 1-liner, id suggest putting it in the base class method with a quick check

if self._is_multi:
    raise NotImplementedError(...)

that way you can add @final to the Index method, which I always find helpful

Okay implemented as suggested in this last commit. But a few more comments/questions:

It seems to me that level is now only being asserted in _get_grouper_for_level but not used for anything else (and dropna not used at all). It looks like either grouper is set to the index or whatever mapping the mapper does on it and the only argument being used in here is mapper. If it's @final too then is there a reason to keep those extra arguments in? I'm sure there's a good reason and I'm just not familiar enough with any of the code base.

@rhshadrach
Copy link
Member

If it's @final too then is there a reason to keep those extra arguments in? I'm sure there's a good reason and I'm just not familiar enough with any of the code base.

Prior to #49373 they were used for MultiIndex, but now they no longer are. These can be removed - and doing so here would be great.

Moisan and others added 15 commits November 11, 2022 22:49
* Categorical.reorder_categories perf

* whatsnew
…#49594)

* API: Index(object_dtype_bool_ndarray) retain object dtype

* GH ref, test
* BUG: Series(index=[]) should have dtype=object

* parametrize tests

* accept dtype

* fix

Co-authored-by: Terji Petersen <[email protected]>
* CLN: collect fastpath in Series.__init__

* small clean

Co-authored-by: Terji Petersen <[email protected]>
…v#49592)

* DEPR: Enforce DataFrame(list_with_categorical) deprecation

* Update doc/source/whatsnew/v2.0.0.rst

Co-authored-by: Matthew Roeschke <[email protected]>

Co-authored-by: Matthew Roeschke <[email protected]>
…ons (pandas-dev#49551)

* WIP

* DEPR: Enforce deprecation of numeric_only=None in DataFrame aggregations

* Partial reverts

* numeric_only in generic/series, fixup

* cleanup

* Remove docs warning

* fixups

* Fixups
* add/timedeltas-seconds-documentation

* Fix line lenght

* fixing whitespace

* fixing whitespace

* fixing line lenght

* Adding suggestions

* Fixing single line error
jbrockmendel and others added 20 commits November 11, 2022 22:49
* DEPR: date_range(closed)

* Disallow partial slicing of missing

* Review + failed test
…andas-dev#49615)

* REGR: Better warning in pivot_table when dropping nuisance columns

* type-hint fixups
* REGR: MultiIndex.join does not work for ea dtypes

* Update base.py
…as-dev#49613)

* BUG: groupby with sort=False still sorts an ordered categorical

* Add versionchanged
… 1.5.0 (pandas-dev#49610)

BUG: Use naive wall time to perform offsets datetime64 conversion
* API: make Timestamp/Timedelta _as_unit public as_unit

* update test

* update test

* update tests

* fix pyi typo

* fixup

* fixup
* DEPR: Enforce Series(float_with_nan, dtype=inty)

* update asv

* troubleshoot asv

* suggested asv edit
…andas-dev#49628)

DEPR: Disallow missing nesed label when indexing MultiIndex level
added the `git fetch upstream` command
…49556)

* for pandas-dev#49508 changing Doc for DataFrame.astype

added the series in input in the doc of DataFrame.astype

* up

* up2

* up3

* up4

* up5
* DEPR: Remove df.reduction(level)

* test_*_consistency

* Fix asv

* Add issue ref
…das-dev#49622)

* DEPR: Enforce default of numeric_only=False

* Remove unused functions

* Add versionchanged

* Add Series.rank to whatsnew

* newline in docs
* STYLE: fix pylint reimported warnings

* fixup! STYLE: fix pylint reimported warnings
@rhshadrach
Copy link
Member

Looks like something went awry with pulling from main. In general, you should only need to do git pull upstream main in order to incorporate changes from main into your branch. At this point, there are a few options, but maybe the easiest would be to just siphon off your changes into a new branch. You can see changes from individual commits or consecutive groups of commits using the UI on github in the "Files changed" tab.

@codamuse
Copy link
Contributor Author

Looks like something went awry with pulling from main. In general, you should only need to do git pull upstream main in order to incorporate changes from main into your branch. At this point, there are a few options, but maybe the easiest would be to just siphon off your changes into a new branch. You can see changes from individual commits or consecutive groups of commits using the UI on github in the "Files changed" tab.

Ya my bad, I rebased my branch. I'm going to just start a new branch as suggested.

While removing the args from the Index class it looks like _get_grouper_for_level is returning extra Nones as well. In a new clean branch I'll make all the changes I think can happen to clean up and you can tell me if I've gone too far? Unfortunately still haven't been able to run the tests locally. I'm using the docker container and panda-dev environment. Maybe a fresh pull etc and this will all be cleared up but to be sure, I should expect the tests to all pass locally right?

Thank you!

@rhshadrach
Copy link
Member

I rebased my branch.

When I rebase, I first make sure all my changes are on origin (in case something goes wrong) and then do:

git merge-base HEAD upstream/main
[prints a commit hash]
git rebase -i [commit hash] -X theirs

Then pick the first commit and squash the rest. Sometimes if there are interleaving changes you get conflicts that need to be manually resolved, but most of the time everything goes through cleanly.

I'll make all the changes I think can happen to clean up and you can tell me if I've gone too far?

Sounds good!

I should expect the tests to all pass locally right?

For the most part, yes. Sometimes there are flakey tests, but otherwise, if the CI is green and your dev environment was created based on what's currently in main then the tests should pass.

@codamuse
Copy link
Contributor Author

Will open a new one shortly

@codamuse codamuse closed this Nov 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLN: Remove MultiIndex._get_grouper_for_level