Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
ENH: Allow column grouping in DataFrame.plot #29944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Allow column grouping in DataFrame.plot #29944
Changes from 53 commits
469eff8
b96a659
c5b187b
99cc049
06da910
49c728a
b370ba5
d1a2ae0
10ac363
9774d42
572de08
4519f22
c6edb9f
73dd7eb
b804e57
d7ee47c
8fb298d
d84f356
ba0f982
fe98b86
f81d295
8255405
f54459e
f962425
dda80b3
a457b38
8d85ba9
bbaeffa
81f2a7f
fdd7610
fe965b0
4478597
ac51202
722c83b
c601c2d
c99abba
ea6c73d
9184e64
34515a8
ae90509
827c5b5
5aace24
b28d3b3
587cbc1
1517ae3
8c61b83
b75d86c
55928ff
b5f47f2
43070d3
e3c0146
a4c2676
71b7b39
4157618
9b44546
2d89f73
8cd6342
ffd4b18
66342bc
48c9f32
aa128dc
69efd18
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, are the docs rendered correctly with this empty line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need an empty line because we're starting a list.
Is there a CI job that builds the docs? I couldn't find it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, CI runs to build the docs (but I am not sure that it checks for these features).
But you can build the particular page (for
DataFrame.plot
) yourself to ensure it is correct.https://pandas.pydata.org/pandas-docs/dev/development/contributing_documentation.html#building-the-documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't build the docs
fails with stuff like
WARNING: autodoc: failed to import method 'plot.scatter' from module 'DataFrame'; the following exception was raised: No module named 'DataFrame'
(and with some theme issue)
So instead I just added a line so that the docstring for subplots is consistent with the rest. If the rest is fine, then this one should be fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if
subplots
is boolean, then would it be necessary to validateself._kind
?If so, then, I guess, it deserves a separate method to be invoked at the moment of
self.kind
initialization.By the way, this is quite confusing for me (probably regardless of the given PR): inside
MPLPlot
one initializesself.kind
, but it seems that it is not used anywhere, onlyself._kind
does.Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When subplots is boolean we return immediately as we fall back to the current behaviour, as in master. I haven't double checked but I would hope that
self._kind
is validated separately - I mean, it's_kind
, notkind
.The validation happening in
_validate_subplots_kwarg
is only forsubplots
, and its interactions with other parameters (such as_kind
) whensubplots
is an iterable. But its job is not to validate_kind
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really a desirable behavior?
Why not allowing one to plot the same line against different sets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently (in master), only one column per plot is possible.
Allowing more than one column per plot should be possible, but would make the PR more complex. This PR is already 1.5+ years long so I would leave that improvement for a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, agree, could be a follow-up enhancement PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this
else
clause required? I guess, that it may be dropped.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a matter of style. I personally think that this way is less ambiguous when the first return statement is deeply nested with 4 indentation levels: it's easy to miss and to think that the function only returns
col_idx
. Theelse
clause avoids that potential confusion. This choice was deliberate on my part, but I don't mind changing.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose, that once you added return type,
mypy
started complaining.Inside the loop you return only in case
if col_idx in group
, this is the reason for themypy
error.I suggest that you modify it:
Or something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think mypy is wrong here. Because of the validation done earlier, there must be a valid column. Raising an exception might make mypy happy but it would never be hit (because of the validation), and we wouldn't be able to test against it so this would reduce test coverage. I changed the for loop into a
next()
, which seems to trick mypy into being reasonable again.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good parametrization!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really a desirable behavior (see other comment)?