-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Remove integer position args from xy for plotting #20371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: Remove integer position args from xy for plotting #20371
Conversation
# n.b. there appears to be no public method to get the colorbar | ||
# label | ||
assert ax.collections[0].colorbar._label == 'z' | ||
if self.mpl_ge_1_3_1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TomAugspurger we should be able to blow this code away as we > 1.4.3, yes (separate PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a sub-section in the whatsnew about this in api-breaking changes. Note we may want to consider deprecating this and removing in a future version @jorisvandenbossche @TomAugspurger . though I am fine with just changing it.
df = DataFrame({"A": [1, 2], 'B': [3, 4], 'C': [5, 6]}) | ||
with pytest.raises(ValueError): | ||
df.plot(x=x, y=y) | ||
|
||
@pytest.mark.parametrize("x,y,colnames", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test that checks the raise when passing ints and a non-int index
If it worked, I think we should deprecate it. But did it actually work? Because #20056 seems to indicate not? (if it didn't work, of course fine to just remove) |
So it seems that for |
I don't think we should remove this. I know positional plotting gets some use, especially for exploratory analysis when you have long column names. I'll look more at why the original issue was failiing. |
@jorisvandenbossche yes I believe this is due to the structure of the code - the index setting I mentioned in #20056 only happens to
@TomAugspurger why not ask the user to supply |
IME, DataFrame.plot is most often used in exploratory analysis. For this,
speed typing becomes an API consideration.
If you have a long column name, but want to quickly plot it, `df.plot(y=0)`
is nicer than `df.plot(y='percent_return_for_foo_over_bar')`.
instead of us making assumptions
I don't think we ever make assumptions? If there's ever an integer in the
column we don't fall back to positional indexing.
IIUC there was just a bug in how we computed the positions when both `x`
and `y` are specified.
…On Fri, Mar 16, 2018 at 9:26 AM, Mason Gallo ***@***.***> wrote:
So it seems that for line, bar, area it doesn't work, but for scatter, pie
and hexbin it does?
@jorisvandenbossche <https://github.com/jorisvandenbossche> yes I believe
this is due to the structure of the code - the index setting I mentioned in
#20056 <#20056> only happens
to kind not from that set - called _dataframe_kinds and _series_kinds.
I don't think we should remove this. I know positional plotting gets some
use, especially for exploratory analysis when you have long column names.
@TomAugspurger <https://github.com/tomaugspurger> why not ask the user to
supply data.columns[ind] instead of us making assumptions and calculating
it for them? IMO that would make the API less complex.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#20371 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHIpGFXfj-iGl2wL60jY-nACHpf7MNks5te8uQgaJpZM4Ss7iz>
.
|
That's fair - you have much more context than I do on usage.
We make an assumption by allowing mixed types in column names - i.e. we check with Seems like we don't have consensus on whether this should be removed from API. How should we proceed? I can close this and fix the integer indexing bug in #20000 ? |
Perhaps it's just semantics then, but I wouldn't call that assuming :) To me the behavior is clear (but could be better documented). Specifying
If you're comfortable with maybe wasting some effort, I would work around the bug in #20000. We can wait to hear what the other maintainers say before closing or working further on this issue, but I would be sad to see this go. |
Although there might be some "guessing" cases (mixed integer columns), this is also behaviour that used a lot throughout the code base (eg -- I don't have a strong feeling about this. If it is not too difficult to fix this to actually work, I am certainly fine with keeping it. |
FYI: will be closing this in favor of #20000 |
git diff upstream/master -u -- "*.py" | flake8 --diff
This is based on discussion from #20000 where we decided it was confusing to allow
df.plot
to support both labels AND positions. I wasn't sure where to put this in whatsnew, so lmk and I'll update it.