BUG: DataFrameGroupBy.value_counts() fails if as_index=False and there are duplicate column labels #45160
Is result a Series at this point? ._values is generally preferable to .values, as the latter will cast dt64tz to ndarray[object].
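A minimal sketch of the concern, not code from this PR: ._values is a private pandas attribute, so the example below uses the public .array accessor to show the same idea. The variable names are made up for illustration. Depending on the object and the pandas version, .values on timezone-aware data yields either a naive datetime64 ndarray or an object ndarray; in both cases the tz-aware dtype is lost, while the extension array keeps it.

```python
import numpy as np
import pandas as pd

# A timezone-aware Series (illustrative data, not taken from the PR).
s = pd.Series(pd.date_range("2021-01-01", periods=2, tz="UTC"))

# .values always hands back a NumPy ndarray, which cannot carry the tz-aware dtype.
as_ndarray = s.values

# .array returns the backing extension array and keeps the tz-aware dtype.
as_extension = s.array

print(type(as_ndarray).__name__, as_ndarray.dtype)
print(type(as_extension).__name__, as_extension.dtype)
```

Inside pandas itself, ._values plays the role that .array plays here, which is presumably why it is preferred in internal code.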
Thanks!
I decided to do this differently following a suggestion from a while ago (#44755 (comment))
I really wanted these methods to be public, but it is not happening...
Anyway, this way is more DRY and I shouldn't really be writing my own version of reset_index().
This is almost O(n^2) inefficient.
So what you need to do is
Pls don't conflate this issue with reset_index or allow_duplicates; they are completely orthogonal.
This is not going to move forward; this keeps re-inventing the wheel.
Not O(n^2), but O(n)! And of course I would rather not be iterating through an axis. That is the whole point of pandas!
concat does not work, as I explained many times. The best I could do was:
which fails 6 of my tests due to bool/object problems in MultiIndex. These will probably be fixed by #45061, but I see that has been deferred until 1.5.
Meanwhile, I employed your column renaming suggestion and there is no more looping. It is now all green (apart from the usual CI problems).
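For readers following along, here is a rough sketch of how a renaming step can replace a loop when turning a counts Series into a frame with duplicate column labels. It is not the code from this PR: the helper name reset_index_with_duplicates, the placeholder labels, and the sample data are all hypothetical, and the real implementation may differ.

```python
import pandas as pd


def reset_index_with_duplicates(counts: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: move index levels into columns even when
    the final column labels collide with each other."""
    # Labels the result should end up with (may contain duplicates).
    final_labels = list(counts.index.names) + [counts.name]

    # Temporarily give every level and the values a unique placeholder name
    # so that reset_index() has nothing to collide with.
    placeholders = [f"_level_{i}" for i in range(counts.index.nlevels)]
    tmp = counts.rename("_value").rename_axis(index=placeholders)

    frame = tmp.reset_index()
    # Assign the real labels in one step; no per-column loop is needed.
    frame.columns = final_labels
    return frame


# The Series name "a" clashes with the first index level name,
# so a plain counts.reset_index() would raise a ValueError.
index = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 2)], names=["a", "b"]
)
counts = pd.Series([2, 1, 3], index=index, name="a")

frame = reset_index_with_duplicates(counts)
print(frame)
```

The work stays in whole-object operations (rename, reset_index, one columns assignment), so nothing iterates over an axis in Python.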