BUG: Fix error for boxplot when using a pre-grouped DataFrame with more than one grouping #57985

Merged
merged 9 commits into from
Mar 31, 2024
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v3.0.0.rst
@@ -409,7 +409,7 @@ Period

Plotting
^^^^^^^^
--
+- Bug in :meth:`.DataFrameGroupBy.boxplot` passes a ``tuple`` instead of a ``string`` when input ``DataFrame`` is pre-grouped using more than one ``column`` (:issue:`14701`)
Member

This is no longer accurate. I'd recommend something more generic, such as

Bug in :meth:`.DataFrameGroupBy.boxplot` failed when there were multiple groupings (:issue:`14701`)

Contributor Author

Correct. I will update it.

-

Groupby/resample/rolling
8 changes: 5 additions & 3 deletions pandas/plotting/_matplotlib/boxplot.py
@@ -533,14 +533,16 @@ def boxplot_frame_groupby(
         )
         axes = flatten_axes(axes)

-        ret = pd.Series(dtype=object)
-
+        data = {}
         for (key, group), ax in zip(grouped, axes):
             d = group.boxplot(
                 ax=ax, column=column, fontsize=fontsize, rot=rot, grid=grid, **kwds
             )
             ax.set_title(pprint_thing(key))
-            ret.loc[key] = d
+            # GH 14701 refactored to allow the 'key' to be passed as a tuple,
+            # which occurs when there is more than one group
Member

I think this comment should be removed.

+            data[key] = d
+            ret = pd.Series(data)
Member

We don't need to create `ret` on every iteration, only once after the for loop is done. This line can be dedented.

Contributor Author

No problem.

         maybe_adjust_figure(fig, bottom=0.15, top=0.9, left=0.1, right=0.9, wspace=0.2)
     else:
         keys, frames = zip(*grouped)
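
The pattern this hunk moves to, combined with the review suggestion to build the Series once after the loop, can be sketched in isolation as below. This is a minimal illustration, not the pandas implementation: the `results` list and its string values are hypothetical stand-ins for the `(key, group.boxplot(...))` pairs produced inside `boxplot_frame_groupby`.

```python
import pandas as pd

# Hypothetical stand-ins for (group key, per-group boxplot result) pairs.
# With more than one grouping column, each key is a tuple.
results = [
    (("A", "C"), "axes-AC"),
    (("A", "D"), "axes-AD"),
    (("B", "C"), "axes-BC"),
]

# Accumulate into a plain dict inside the loop; tuple keys are fine here.
data = {}
for key, value in results:
    data[key] = value

# Build the Series once, after the loop has finished.
ret = pd.Series(data)

print(type(ret.index).__name__)
print(ret[("A", "D")])
```

Because a dict with tuple keys is turned into a `MultiIndex` by the `Series` constructor, the same code path handles both single and multiple groupings without label-based assignment on an empty Series.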
14 changes: 14 additions & 0 deletions pandas/tests/plotting/test_boxplot_method.py
@@ -740,3 +740,17 @@ def test_boxplot_multiindex_column(self):
         expected_xticklabel = ["(bar, one)", "(bar, two)"]
         result_xticklabel = [x.get_text() for x in axes.get_xticklabels()]
         assert expected_xticklabel == result_xticklabel
+
+    @pytest.mark.parametrize("group", ["X", ["X", "Y"]])
+    def test_boxplot_multi_groupby_groups(self, group):
+        # GH 14701
+        rows = 20
+        df = DataFrame(
+            np.random.default_rng(12).normal(size=(rows, 2)), columns=["Col1", "Col2"]
+        )
+        df["X"] = Series(np.repeat(["A", "B"], int(rows / 2)))
+        df["Y"] = Series(np.tile(["C", "D"], int(rows / 2)))
+        grouped = df.groupby(group)
+        _check_plot_works(df.boxplot, by=group, default_axes=True)
+        _check_plot_works(df.plot.box, by=group, default_axes=True)
+        _check_plot_works(grouped.boxplot, default_axes=True)
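
To see why the test is parametrized over `"X"` and `["X", "Y"]`, it may help to look at the group keys each case yields. The sketch below rebuilds the same frame as the test and only inspects the keys; it does no plotting, and the names `single_keys` and `multi_keys` are illustrative, not part of the PR.

```python
import numpy as np
import pandas as pd

rows = 20
df = pd.DataFrame(
    np.random.default_rng(12).normal(size=(rows, 2)), columns=["Col1", "Col2"]
)
df["X"] = np.repeat(["A", "B"], rows // 2)
df["Y"] = np.tile(["C", "D"], rows // 2)

# Grouping by a single column name yields scalar keys.
single_keys = [key for key, _ in df.groupby("X")]

# Grouping by two columns yields one tuple per (X, Y) combination.
multi_keys = [key for key, _ in df.groupby(["X", "Y"])]

print(single_keys)
print(multi_keys)
```

The tuple keys in the second case are what the boxplot code has to accept as labels for its returned Series.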