Commit 60c1c26

API: Change faceted boxplot return_type
Aligns behavior of `Groupby.boxplot` and DataFrame.boxplot(by=.) to return a Series.
1 parent c19df34 commit 60c1c26
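
The return types described in the commit message can be checked with a short script. This is a minimal sketch, not part of the commit: the sample frame and the variable names (`plain`, `faceted`, `grouped`) are made up for illustration, and it assumes pandas and matplotlib are installed and that the installed pandas follows the behavior this commit introduces.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required

import pandas as pd
from matplotlib.axes import Axes

# Illustrative data only; not taken from the pandas test suite.
df = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 3.0, 5.0, 7.0],
    "gender": ["Male", "Female", "Male", "Female"],
})

# Not faceted, default return_type: a single matplotlib Axes.
plain = df[["height", "weight"]].boxplot()

# Faceted with by=..., explicit return_type: a Series keyed by column name.
faceted = df.boxplot(column=["height", "weight"], by="gender",
                     return_type="axes")

# Groupby.boxplot: a Series keyed by group label.
grouped = df.groupby("gender").boxplot(return_type="axes")

print(isinstance(plain, Axes))
print(type(faceted).__name__, list(faceted.index))
print(type(grouped).__name__, sorted(grouped.index))
```

Because both faceted results are Series, individual axes can be looked up by label, for example `faceted["height"]` or `grouped["Male"]`.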

9 files changed (+73, -1085 lines changed)

doc/source/visualization.rst

+21-16
Original file line number | Diff line number | Diff line change
@@ -458,22 +458,27 @@ columns:
458458

459459
.. warning::
460460

461-
The default changed from ``'dict'`` to ``'axes'`` in version 0.18.0.
462-
463-
Plot functions return scalar or arrays of :class:`matplotlib Axes <matplotlib.axes.Axes>`.
464-
In ``boxplot``, the return type can be controlled by the ``return_type``, keyword. The valid choices are ``{"axes", "dict", "both"}``. If the ``by`` argument is ``None``,
465-
466-
* ``'axes'`` returns a single matplotlib axes.
467-
* ``'dict'`` returns a dict of matplotlib artists, similar to the matplotlib boxplot function.
468-
* ``'both'`` returns a named tuple of axes and dicts.
469-
470-
When ``by`` is not None, you get back an ``OrderedDict`` of whatever ``return_type`` is.
471-
Unless ``return_type`` is just ``None``, in which case you get back an array of axes.
472-
473-
Finally, when calling boxplot on a :class:`Groupby` object, an ``OrderedDict`` of ``return_type``
474-
is returned, where the keys are the same as the Groupby object. The plot has a
475-
facet for each key, with each facet containing a box for each column of the
476-
DataFrame.
461+
The default changed from ``'dict'`` to ``'axes'`` in version 0.19.0.
462+
463+
In ``boxplot``, the return type can be controlled by the ``return_type``, keyword. The valid choices are ``{"axes", "dict", "both", None}``.
464+
Faceting, created by ``DataFrame.boxplot`` with the ``by``
465+
keyword, will affect the output type as well:
466+
467+
================ ======= ==========================
468+
``return_type=`` Faceted Output type
469+
---------------- ------- --------------------------
470+
471+
``None``         No      axes
472+
``None``         Yes     2-D ndarray of axes
473+
``'axes'``       No      axes
474+
``'axes'``       Yes     Series of axes
475+
``'dict'``       No      dict of artists
476+
``'dict'``       Yes     Series of dicts of artists
477+
``'both'``       No      namedtuple
478+
``'both'``       Yes     Series of namedtuples
479+
================ ======= ==========================
480+
481+
``Groupby.boxplot`` always returns a Series of ``return_type``.
477482

478483
.. ipython:: python
479484
:okwarning:

doc/source/whatsnew/v0.18.0.txt

-1
@@ -1168,7 +1168,6 @@ Removal of prior version deprecations/changes
11681168
- Removal of ``rolling_corr_pairwise`` in favor of ``.rolling().corr(pairwise=True)`` (:issue:`4950`)
11691169
- Removal of ``expanding_corr_pairwise`` in favor of ``.expanding().corr(pairwise=True)`` (:issue:`4950`)
11701170
- Removal of ``DataMatrix`` module. This was not imported into the pandas namespace in any event (:issue:`12111`)
1171-
- Changed the default value for the ``return_type`` parameter for ``DataFrame.plot.box`` and ``DataFrame.boxplot`` from ``None`` to ``"axes"``. These methods will now return a matplotlib axes by default instead of a dictionary of artists. See :ref:`here <visualization.box.return>` (:issue:`6581`).
11721171
- Removal of ``cols`` keyword in favor of ``subset`` in ``DataFrame.duplicated()`` and ``DataFrame.drop_duplicates()`` (:issue:`6680`)
11731172
- Removal of the ``read_frame`` and ``frame_query`` (both aliases for ``pd.read_sql``)
11741173
and ``write_frame`` (alias of ``to_sql``) functions in the ``pd.io.sql`` namespace,

doc/source/whatsnew/v0.19.0.txt

+2-1
@@ -494,6 +494,7 @@ API changes
494494
- ``__setitem__`` will no longer apply a callable rhs as a function instead of storing it. Call ``where`` directly to get the previous behavior. (:issue:`13299`)
495495
- Passing ``Period`` with multiple frequencies to normal ``Index`` now returns ``Index`` with ``object`` dtype (:issue:`13664`)
496496
- ``PeriodIndex.fillna`` with ``Period`` has different freq now coerces to ``object`` dtype (:issue:`13664`)
497+
- Faceted boxplots from ``DataFrame.boxplot(by=col)`` now return a ``Series`` when ``return_type`` is not None. Previously these returned an ``OrderedDict``. Note that when ``return_type=None``, the default, these still return a 2-D NumPy array. (:issue:`12216`, :issue:`7096`)
497498
- More informative exceptions are passed through the csv parser. The exception type would now be the original exception type instead of ``CParserError``. (:issue:`13652`)
498499
- ``astype()`` will now accept a dict of column name to data types mapping as the ``dtype`` argument. (:issue:`12086`)
499500
- The ``pd.read_json`` and ``DataFrame.to_json`` has gained support for reading and writing json lines with ``lines`` option see :ref:`Line delimited json <io.jsonl>` (:issue:`9180`)
@@ -1282,9 +1283,9 @@ Removal of prior version deprecations/changes
12821283

12831284
Now legacy time rules raises ``ValueError``. For the list of currently supported offsets, see :ref:`here <timeseries.offset_aliases>`
12841285

1286+
- The default value for the ``return_type`` parameter for ``DataFrame.plot.box`` and ``DataFrame.boxplot`` changed from ``None`` to ``"axes"``. These methods will now return a matplotlib axes by default instead of a dictionary of artists. See :ref:`here <visualization.box.return>` (:issue:`6581`).
12851287
- The ``tquery`` and ``uquery`` functions in the ``pandas.io.sql`` module are removed (:issue:`5950`).
12861288

1287-
12881289
.. _whatsnew_0190.performance:
12891290

12901291
Performance Improvements

pandas/tests/plotting/common.py

+4-3
@@ -5,8 +5,8 @@
55
import os
66
import warnings
77

8-
from pandas import DataFrame
9-
from pandas.compat import zip, iteritems, OrderedDict
8+
from pandas import DataFrame, Series
9+
from pandas.compat import zip, iteritems
1010
from pandas.util.decorators import cache_readonly
1111
from pandas.types.api import is_list_like
1212
import pandas.util.testing as tm
@@ -445,7 +445,8 @@ def _check_box_return_type(self, returned, return_type, expected_keys=None,
445445
self.assertIsInstance(r, Axes)
446446
return
447447

448-
self.assertTrue(isinstance(returned, OrderedDict))
448+
self.assertTrue(isinstance(returned, Series))
449+
449450
self.assertEqual(sorted(returned.keys()), sorted(expected_keys))
450451
for key, value in iteritems(returned):
451452
self.assertTrue(isinstance(value, types[return_type]))

pandas/tests/plotting/test_boxplot_method.py

+15-13
@@ -92,6 +92,12 @@ def test_boxplot_legacy(self):
9292
lines = list(itertools.chain.from_iterable(d.values()))
9393
self.assertEqual(len(ax.get_lines()), len(lines))
9494

95+
@slow
96+
def test_boxplot_return_type_none(self):
97+
# GH 12216; return_type=None & by=None -> axes
98+
result = self.hist_df.boxplot()
99+
self.assertTrue(isinstance(result, self.plt.Axes))
100+
95101
@slow
96102
def test_boxplot_return_type_legacy(self):
97103
# API change in https://github.com/pydata/pandas/pull/7096
@@ -103,10 +109,8 @@ def test_boxplot_return_type_legacy(self):
103109
with tm.assertRaises(ValueError):
104110
df.boxplot(return_type='NOTATYPE')
105111

106-
with tm.assert_produces_warning(FutureWarning):
107-
result = df.boxplot()
108-
# change to Axes in future
109-
self._check_box_return_type(result, 'dict')
112+
result = df.boxplot()
113+
self._check_box_return_type(result, 'axes')
110114

111115
with tm.assert_produces_warning(False):
112116
result = df.boxplot(return_type='dict')
@@ -140,6 +144,7 @@ def _check_ax_limits(col, ax):
140144
p = df.boxplot(['height', 'weight', 'age'], by='category')
141145
height_ax, weight_ax, age_ax = p[0, 0], p[0, 1], p[1, 0]
142146
dummy_ax = p[1, 1]
147+
143148
_check_ax_limits(df['height'], height_ax)
144149
_check_ax_limits(df['weight'], weight_ax)
145150
_check_ax_limits(df['age'], age_ax)
@@ -163,8 +168,7 @@ def test_boxplot_legacy(self):
163168
grouped = self.hist_df.groupby(by='gender')
164169
with tm.assert_produces_warning(UserWarning):
165170
axes = _check_plot_works(grouped.boxplot, return_type='axes')
166-
self._check_axes_shape(list(axes.values()), axes_num=2, layout=(1, 2))
167-
171+
self._check_axes_shape(list(axes.values), axes_num=2, layout=(1, 2))
168172
axes = _check_plot_works(grouped.boxplot, subplots=False,
169173
return_type='axes')
170174
self._check_axes_shape(axes, axes_num=1, layout=(1, 1))
@@ -175,7 +179,7 @@ def test_boxplot_legacy(self):
175179
grouped = df.groupby(level=1)
176180
with tm.assert_produces_warning(UserWarning):
177181
axes = _check_plot_works(grouped.boxplot, return_type='axes')
178-
self._check_axes_shape(list(axes.values()), axes_num=10, layout=(4, 3))
182+
self._check_axes_shape(list(axes.values), axes_num=10, layout=(4, 3))
179183

180184
axes = _check_plot_works(grouped.boxplot, subplots=False,
181185
return_type='axes')
@@ -184,8 +188,7 @@ def test_boxplot_legacy(self):
184188
grouped = df.unstack(level=1).groupby(level=0, axis=1)
185189
with tm.assert_produces_warning(UserWarning):
186190
axes = _check_plot_works(grouped.boxplot, return_type='axes')
187-
self._check_axes_shape(list(axes.values()), axes_num=3, layout=(2, 2))
188-
191+
self._check_axes_shape(list(axes.values), axes_num=3, layout=(2, 2))
189192
axes = _check_plot_works(grouped.boxplot, subplots=False,
190193
return_type='axes')
191194
self._check_axes_shape(axes, axes_num=1, layout=(1, 1))
@@ -226,8 +229,7 @@ def test_grouped_box_return_type(self):
226229
expected_keys=['height', 'weight', 'category'])
227230

228231
# now for groupby
229-
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
230-
result = df.groupby('gender').boxplot()
232+
result = df.groupby('gender').boxplot(return_type='dict')
231233
self._check_box_return_type(
232234
result, 'dict', expected_keys=['Male', 'Female'])
233235

@@ -347,7 +349,7 @@ def test_grouped_box_multiple_axes(self):
347349
with tm.assert_produces_warning(UserWarning):
348350
returned = df.boxplot(column=['height', 'weight', 'category'],
349351
by='gender', return_type='axes', ax=axes[0])
350-
returned = np.array(list(returned.values()))
352+
returned = np.array(list(returned.values))
351353
self._check_axes_shape(returned, axes_num=3, layout=(1, 3))
352354
self.assert_numpy_array_equal(returned, axes[0])
353355
self.assertIs(returned[0].figure, fig)
@@ -357,7 +359,7 @@ def test_grouped_box_multiple_axes(self):
357359
returned = df.groupby('classroom').boxplot(
358360
column=['height', 'weight', 'category'],
359361
return_type='axes', ax=axes[1])
360-
returned = np.array(list(returned.values()))
362+
returned = np.array(list(returned.values))
361363
self._check_axes_shape(returned, axes_num=3, layout=(1, 3))
362364
self.assert_numpy_array_equal(returned, axes[1])
363365
self.assertIs(returned[0].figure, fig)
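
The test changes above replace `returned.values()` with `returned.values`, because an `OrderedDict` exposes `values` as a method while a `Series` exposes it as an attribute. Code that has to run against pandas versions on both sides of this change could normalize the two forms. The following is a minimal sketch under that assumption; `as_value_list` is a hypothetical helper, not part of pandas or of this commit, and the demonstration uses a plain `OrderedDict` with placeholder strings instead of real axes.

```python
from collections import OrderedDict


def as_value_list(returned):
    """Return the plotted objects as a list for either container type.

    Hypothetical helper, not part of pandas: a mapping such as
    OrderedDict has a ``values()`` method, while a Series has a
    non-callable ``values`` attribute holding an array.
    """
    values = returned.values
    if callable(values):
        values = values()  # mapping-style container (old behavior)
    return list(values)


# Stand-in for the old return value; the strings are placeholders for axes.
old_style = OrderedDict([("Male", "ax0"), ("Female", "ax1")])
print(as_value_list(old_style))  # prints ['ax0', 'ax1']
```

The same call works on the new `Series` return value, since the `callable` check is then false and the underlying array is converted directly.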

pandas/tests/plotting/test_frame.py

+1-1
@@ -1233,7 +1233,7 @@ def test_boxplot_subplots_return_type(self):
12331233

12341234
# normal style: return_type=None
12351235
result = df.plot.box(subplots=True)
1236-
self.assertIsInstance(result, np.ndarray)
1236+
self.assertIsInstance(result, Series)
12371237
self._check_box_return_type(result, None, expected_keys=[
12381238
'height', 'weight', 'category'])
12391239
