-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: Allow column grouping in DataFrame.plot #29944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@NicolasHug is this still active? Looks like a merge conflict and CI failure need to be addressed |
I'm still happy to work on this. I just haven't received initial feedback yet (and don't want to push either, I understand this isn't necessarily a high priority)
|
Understood. I doubt anyone has reviewed since CI has been red, so definitely want to get that green to start |
Will do |
@charlesdong1991 any thoughts on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed this feature request a while ago and personally really like this feature and find it quite potentially useful ^^ @WillAyd but I think the behaviour of this needs to be well defined.
Many thanks for the PR, it's very nice! @NicolasHug some minor comments:
- As mentioned in review, the input for
subplots
needs to be carefully defined, thus only checking if it isIterable
orbool
might not be sufficient. - And it would be very nice if you could add some detailed docstring for
subplots
parameter because this is a behaviour change, i think somewhere inPlotAccessor
in_core.py
. - And some more tests would be appreciated (see comments below)
- Nicer to have a docstring for
_col_idx_to_axis_idx
although i think it is quite straightforward - Nicer to have a whatsnew note in
1.1.0
rst !
Thanks a lot for the review @charlesdong1991 , I addressed your comments and added a bunch of validation / tests. It looks like the code works for most kinds of plots. You can see a few illustrations in this gist. Looks like the subplot titles could be improved in some cases, I'll see what I can do. But otherwise I think this is ready for another round ;) |
Regarding the docstring, I'm a bit confused because it seems that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice! some minor comments
Could you also please make some docs for subplots
? I think you could add it in PlotAcessor
, there are some missing docs for parameters, so very nice this one is found out! ^^
pandas/plotting/_matplotlib/core.py
Outdated
"kde", | ||
"density", | ||
"area", | ||
"pie", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at your impressive examples, maybe we should exclude pie
chart here? doesnt seem to fit as good as other charts here? any thoughts on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it doesn't look good, but it seems to be consistent (i.e. equally bad) as when subplots=True
. I updated the notebook at the end where you can see how it looks. So maybe it can be kept? No strong opinion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have no strong opinion either, my only concern is for pie chart, if they share the same category, they will have the same color in the pie, but different sizes depending on the corresponding values.
but no strong opinion
pandas/plotting/_matplotlib/core.py
Outdated
) | ||
|
||
if any( | ||
not isinstance(group, Iterable) or isinstance(group, str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should only not isinstance(group, Iterable)
be enough here? isinstance(group, str)
also falls into this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing is, we want group
to be an iterable but we don't want group
to be a str, which is also iterable.
Only using if any(not isinstance(group, Iterable)
would allow strings and we don't want that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
Pending CI checks.
for group_idx, group in enumerate(self.subplots): | ||
if col_idx in group: | ||
return group_idx | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose, that once you added return type, mypy
started complaining.
Inside the loop you return only in case if col_idx in group
, this is the reason for the mypy
error.
I suggest that you modify it:
for group_idx, group in enumerate(self.subplots):
if col_idx in group:
return group_idx
raise KeyError(f"Failed to find {col_idx} in {self.subplots}")
Or something like this.
@NicolasHug was looking good. if you want to continue, pls merge master and can relook. |
sorry for the delay here @NicolasHug was looking good. if you want to continue, pls merge master and can relook. |
PR is up to date with |
@NicolasHug can you merge master and we'll see if can get in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sticking with this @NicolasHug. I merged master and did a small refactor to validate the subplot groups in one pass.
I think this looks pretty good to merge. I believe it requires @jreback approval to move forward.
thanks @NicolasHug very nice! |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
This is an attempt at #29688 i.e. group some columns on the same ax when calling
df.plot
See this gist for illustration of different supported plots
Thanks!