Fix scatter norm keyword #45966

Merged
merged 13 commits into from
Mar 6, 2022
Changes from 11 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.5.0.rst
@@ -375,7 +375,7 @@ Plotting
 - Bug in :meth:`DataFrame.plot.box` that prevented labeling the x-axis (:issue:`45463`)
 - Bug in :meth:`DataFrame.boxplot` that prevented passing in ``xlabel`` and ``ylabel`` (:issue:`45463`)
 - Bug in :meth:`DataFrame.boxplot` that prevented specifying ``vert=False`` (:issue:`36918`)
--
+- Bug in :meth:`DataFrame.scatter` that prevented specifying ``norm`` (:issue:`45809`)
@tdy (Contributor), Mar 6, 2022

I see this PR has now been merged, but I think this is supposed to be `DataFrame.plot.scatter`. Are the changelogs worth fixing?


Groupby/resample/rolling
^^^^^^^^^^^^^^^^^^^^^^^^
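The naming point raised in the review comment on the changelog entry can be checked directly. The following is a minimal sketch, assuming only that pandas is installed; it inspects attributes and draws nothing.

```python
import pandas as pd

# DataFrame itself exposes no top-level `scatter` method ...
has_top_level = hasattr(pd.DataFrame, "scatter")

# ... the scatter plot is reached through the `plot` accessor instead.
has_accessor_method = callable(getattr(pd.DataFrame.plot, "scatter", None))

print(has_top_level, has_accessor_method)
```

If the first value is False and the second True, a `:meth:` role pointing at `DataFrame.scatter` would have no target to resolve to, which is the concern behind the comment.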
2 changes: 1 addition & 1 deletion pandas/plotting/_matplotlib/core.py
@@ -1082,7 +1082,7 @@ def _make_plot(self):
     bounds = np.linspace(0, n_cats, n_cats + 1)
     norm = colors.BoundaryNorm(bounds, cmap.N)
 else:
-    norm = None
+    norm = self.kwds.pop("norm", None)
 # plot colorbar if
 # 1. colormap is assigned, and
 # 2.`c` is a column containing only numeric values
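The one-line change above swaps a hard-coded `None` for `self.kwds.pop("norm", None)`. The sketch below illustrates why popping matters in general. It is plain Python, not pandas code: `scatter_stub`, `make_plot_before`, and `make_plot_after` are hypothetical names, and it assumes (the hunk does not show this) that the plotting call receives both an explicit `norm=` and the remaining keyword arguments.

```python
def scatter_stub(x, y, norm=None, **kwargs):
    """Hypothetical stand-in for a plotting call that accepts `norm`."""
    return {"norm": norm, "extra": kwargs}


def make_plot_before(kwds):
    # Old shape: `norm` is fixed to None while a user-supplied "norm"
    # may still be sitting in `kwds`, so it would be passed twice.
    norm = None
    return scatter_stub([1], [2], norm=norm, **kwds)


def make_plot_after(kwds):
    # New shape: remove "norm" from `kwds` so it is passed exactly once.
    norm = kwds.pop("norm", None)
    return scatter_stub([1], [2], norm=norm, **kwds)


try:
    make_plot_before({"norm": "log", "alpha": 0.5})
    before_error = None
except TypeError as exc:
    # Python rejects the duplicate keyword at the call site.
    before_error = exc

after = make_plot_after({"norm": "log", "alpha": 0.5})
print(type(before_error).__name__, after)
```

Under that assumption, the "before" variant fails with a duplicate-keyword `TypeError`, while the "after" variant forwards the user's `norm` and leaves the other keywords untouched.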
19 changes: 19 additions & 0 deletions pandas/tests/plotting/frame/test_frame.py
@@ -787,6 +787,25 @@ def test_plot_scatter_with_s(self):
         ax = df.plot.scatter(x="a", y="b", s="c")
         tm.assert_numpy_array_equal(df["c"].values, right=ax.collections[0].get_sizes())

+    def test_plot_scatter_with_norm(self):
+        # added while fixing GH 45809
+        import matplotlib as mpl
+
+        df = DataFrame(np.random.random((10, 3)) * 100, columns=["a", "b", "c"])
+        norm = mpl.colors.LogNorm()
+        ax = df.plot.scatter(x="a", y="b", c="c", norm=norm)
+        assert ax.collections[0].norm is norm
+
+    def test_plot_scatter_without_norm(self):
+        # added while fixing GH 45809
+        import matplotlib as mpl
+
+        df = DataFrame(np.random.random((10, 3)) * 100, columns=["a", "b", "c"])
+        ax = df.plot.scatter(x="a", y="b", c="c")
+        color_min_max = (df.c.min(), df.c.max())
+        default_norm = mpl.colors.Normalize(*color_min_max)
+        assert all(df.c.apply(lambda x: ax.collections[0].norm(x) == default_norm(x)))
Member

Could this be done without `apply`? It will be easier to debug if this assertion doesn't go through unnecessary code paths.

Contributor Author

What's the better approach? Just a loop?

Member

Yeah, the check you had previously was good. The data doesn't have to be random either, if that helps shorten your loop and lets you loop over fewer values.

Contributor Author

Ok, I'll revert to the previous check. I think looping over the values in the dataframe makes sense; it replicates what's going on in the actual plot, and there's no real benefit to including values that are out of bounds; that's matplotlib's `Normalize` functionality.

Contributor Author

Changes made
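
For reference, the direction discussed in this thread (a plain loop over fixed values instead of `apply` on random data) could look roughly like the sketch below. It is an illustration under stated assumptions, not necessarily the check that was merged: the fixed data, the `Agg` backend choice, and the use of `np.isclose` are all choices made here.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
from matplotlib.colors import Normalize

# Small, fixed data set (an assumption; the test in the diff uses random data).
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [3.0, 1.0, 2.0], "c": [10.0, 20.0, 40.0]}
)
ax = df.plot.scatter(x="a", y="b", c="c")

# Norm attached to the scatter collection when no `norm` is passed.
plot_norm = ax.collections[0].norm
# Default linear norm spanning the range of the colour column.
default_norm = Normalize(df["c"].min(), df["c"].max())

# A plain loop: a failure reports the offending value directly.
for value in df["c"]:
    assert np.isclose(plot_norm(value), default_norm(value)), value
```

Looping only over values present in the frame mirrors what the plot itself normalizes, which is the argument made above for leaving out-of-bounds inputs to matplotlib's own tests.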


     @pytest.mark.slow
     def test_plot_bar(self):
         df = DataFrame(