-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix scatter norm keyword #45966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix scatter norm keyword #45966
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will also need an entry in doc/source/whatsnew/v1.5.0.rst
in the bug fix / visualization section
small comment & if you can merge master; ping on green |
ax = df.plot.scatter(x="a", y="b", c="c") | ||
color_min_max = (df.c.min(), df.c.max()) | ||
default_norm = mpl.colors.Normalize(*color_min_max) | ||
assert all(df.c.apply(lambda x: ax.collections[0].norm(x) == default_norm(x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be done without apply? It will be easier to debug if this assertion doesn't go through necessary code paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the better approach? Just a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the old check you had prior was good. The data doesn't have to be random as well if that helps shorten your loop and allows looping over less values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll revert to the previous check. I think looping over the values in the dataframe makes sense; it's replicating what's going on in the actual plot, and there's no real benefit to including values that are out of bounds - that's matplotlib normalize functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes made
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm over to you @mroeschke
Great PR @Andy-Grigg, thanks. |
@@ -421,7 +421,7 @@ Plotting | |||
- Bug in :meth:`DataFrame.plot.box` that prevented labeling the x-axis (:issue:`45463`) | |||
- Bug in :meth:`DataFrame.boxplot` that prevented passing in ``xlabel`` and ``ylabel`` (:issue:`45463`) | |||
- Bug in :meth:`DataFrame.boxplot` that prevented specifying ``vert=False`` (:issue:`36918`) | |||
- | |||
- Bug in :meth:`DataFrame.scatter` that prevented specifying ``norm`` (:issue:`45809`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this PR has now been merged, but I think this is supposed to be `DataFrame.plot.scatter`
. Are the changelogs worth fixing?
sure can do another PR to update |
* TST: add test for norm scatter plot parameter (pandas-dev#45809) * BUG: don't duplicate norm parameter for scatter plots (pandas-dev#45809) * TST: Add github issue numbers (GH45809) * TST: remove dependence on private attributes (pandas-dev#45809) * DOC: add entry to visualization bug fixes (pandas-dev#45809) * TST: reduce number of norm comparisons (pandas-dev#45809) * TST: simplify test (pandas-dev#45809)
norm
keyword fails with 'multiple values for keyword argument' TypeError #45809doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.This PR addresses a bug where the
norm
parameter is used for a scatter plot where thec
color is a non-categorical type (e.g. a numeric type). This combination of inputs would result in thenorm
parameter being specified both as an explicit keyword argument and inself.kwds
.The behavior has changed such that
norm
is only ever specified once, and in the scenario described above the value specified by the user will be honored.I tried to test that the specific
norm
value ended up in the plot, but there doesn't seem to be a good way of comparingnorm._scale
objects without comparing the individual attributes. I'm happy to add this in if required.