
Commit 6eda77e

mabelvj authored and jorisvandenbossche committed
DOC: update pandas.DataFrame.boxplot docstring. Fixes #8847 (#20152)
1 parent e71c02a commit 6eda77e

File tree

1 file changed

+144
-30
lines changed


pandas/plotting/_core.py

+144-30
@@ -1995,50 +1995,164 @@ def plot_series(data, kind='line', ax=None, # Series unique
19951995

19961996

19971997
_shared_docs['boxplot'] = """
1998-
Make a box plot from DataFrame column optionally grouped by some columns or
1999-
other inputs
1998+
Make a box plot from DataFrame columns.
1999+
2000+
Make a box-and-whisker plot from DataFrame columns, optionally grouped
2001+
by some other columns. A box plot is a method for graphically depicting
2002+
groups of numerical data through their quartiles.
2003+
The box extends from the Q1 to Q3 quartile values of the data,
2004+
with a line at the median (Q2). The whiskers extend from the edges
2005+
of the box to show the range of the data. The position of the whiskers
2006+
is set by default to `1.5 * IQR` (`IQR = Q3 - Q1`) from the edges of the box.
2007+
Outlier points are those past the end of the whiskers.
2008+
2009+
For further details see
2010+
Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot>`_.
20002011
20012012
Parameters
20022013
----------
2003-
data : the pandas object holding the data
2004-
column : column name or list of names, or vector
2005-
Can be any valid input to groupby
2006-
by : string or sequence
2007-
Column in the DataFrame to group by
2008-
ax : Matplotlib axes object, optional
2009-
fontsize : int or string
2010-
rot : label rotation angle
2014+
column : str or list of str, optional
2015+
Column name or list of names, or vector.
2016+
Can be any valid input to :meth:`pandas.DataFrame.groupby`.
2017+
by : str or array-like, optional
2018+
Column in the DataFrame to group by, as passed to :meth:`pandas.DataFrame.groupby`.
2019+
One box plot will be drawn per value of the columns in `by`.
2020+
ax : object of class matplotlib.axes.Axes, optional
2021+
The matplotlib axes to be used by boxplot.
2022+
fontsize : float or str
2023+
Tick label font size in points or as a string (e.g., `large`).
2024+
rot : int or float, default 0
2025+
The rotation angle of labels (in degrees)
2026+
with respect to the screen coordinate system.
2027+
grid : boolean, default True
2028+
Setting this to True will show the grid.
20112029
figsize : A tuple (width, height) in inches
2012-
grid : Setting this to True will show the grid
2013-
layout : tuple (optional)
2014-
(rows, columns) for the layout of the plot
2015-
return_type : {None, 'axes', 'dict', 'both'}, default None
2016-
The kind of object to return. The default is ``axes``
2017-
'axes' returns the matplotlib axes the boxplot is drawn on;
2018-
'dict' returns a dictionary whose values are the matplotlib
2019-
Lines of the boxplot;
2020-
'both' returns a namedtuple with the axes and dict.
2021-
2022-
When grouping with ``by``, a Series mapping columns to ``return_type``
2023-
is returned, unless ``return_type`` is None, in which case a NumPy
2024-
array of axes is returned with the same shape as ``layout``.
2025-
See the prose documentation for more.
2026-
2027-
`**kwds` : Keyword Arguments
2030+
The size of the figure to create in matplotlib.
2031+
layout : tuple (rows, columns), optional
2032+
For example, (3, 5) will display the subplots
2033+
using 3 rows and 5 columns, starting from the top-left.
2034+
return_type : {'axes', 'dict', 'both'} or None, default 'axes'
2035+
The kind of object to return. The default is ``axes``.
2036+
2037+
* 'axes' returns the matplotlib axes the boxplot is drawn on.
2038+
* 'dict' returns a dictionary whose values are the matplotlib
2039+
Lines of the boxplot.
2040+
* 'both' returns a namedtuple with the axes and dict.
2041+
* when grouping with ``by``, a Series mapping columns to
2042+
``return_type`` is returned.
2043+
2044+
If ``return_type`` is `None`, a NumPy array
2045+
of axes with the same shape as ``layout`` is returned.
2046+
**kwds
20282047
All other plotting keyword arguments to be passed to
2029-
matplotlib's boxplot function
2048+
:func:`matplotlib.pyplot.boxplot`.
20302049
20312050
Returns
20322051
-------
2033-
lines : dict
2034-
ax : matplotlib Axes
2035-
(ax, lines): namedtuple
2052+
result :
2053+
2054+
The return type depends on the `return_type` parameter:
2055+
2056+
* 'axes' : object of class matplotlib.axes.Axes
2057+
* 'dict' : dict of matplotlib.lines.Line2D objects
2058+
* 'both' : a namedtuple with structure (ax, lines)
2059+
2060+
For data grouped with ``by``:
2061+
2062+
* :class:`~pandas.Series`
2063+
* :class:`~numpy.ndarray` (for ``return_type=None``)
2064+
2065+
See Also
2066+
--------
2067+
Series.plot.hist : Make a histogram.
2068+
matplotlib.pyplot.boxplot : Matplotlib equivalent plot.
20362069
20372070
Notes
20382071
-----
20392072
Use ``return_type='dict'`` when you want to tweak the appearance
20402073
of the lines after plotting. In this case a dict containing the Lines
20412074
making up the boxes, caps, fliers, medians, and whiskers is returned.
2075+
2076+
Examples
2077+
--------
2078+
2079+
Boxplots can be created for every column in the dataframe
2080+
by calling ``df.boxplot()``, or for selected columns only:
2081+
2082+
.. plot::
2083+
:context: close-figs
2084+
2085+
>>> np.random.seed(1234)
2086+
>>> df = pd.DataFrame(np.random.randn(10,4),
2087+
... columns=['Col1', 'Col2', 'Col3', 'Col4'])
2088+
>>> boxplot = df.boxplot(column=['Col1', 'Col2', 'Col3'])
2089+
2090+
Boxplots of variable distributions grouped by the values of a third
2091+
variable can be created using the option ``by``. For instance:
2092+
2093+
.. plot::
2094+
:context: close-figs
2095+
2096+
>>> df = pd.DataFrame(np.random.randn(10, 2),
2097+
... columns=['Col1', 'Col2'])
2098+
>>> df['X'] = pd.Series(['A', 'A', 'A', 'A', 'A',
2099+
... 'B', 'B', 'B', 'B', 'B'])
2100+
>>> boxplot = df.boxplot(by='X')
2101+
2102+
A list of strings (e.g. ``['X', 'Y']``) can be passed to boxplot
2103+
in order to group the data by combinations of the variables on the x-axis:
2104+
2105+
.. plot::
2106+
:context: close-figs
2107+
2108+
>>> df = pd.DataFrame(np.random.randn(10,3),
2109+
... columns=['Col1', 'Col2', 'Col3'])
2110+
>>> df['X'] = pd.Series(['A', 'A', 'A', 'A', 'A',
2111+
... 'B', 'B', 'B', 'B', 'B'])
2112+
>>> df['Y'] = pd.Series(['A', 'B', 'A', 'B', 'A',
2113+
... 'B', 'A', 'B', 'A', 'B'])
2114+
>>> boxplot = df.boxplot(column=['Col1', 'Col2'], by=['X', 'Y'])
2115+
2116+
The layout of the boxplot can be adjusted by passing a tuple to ``layout``:
2117+
2118+
.. plot::
2119+
:context: close-figs
2120+
2121+
>>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2122+
... layout=(2, 1))
2123+
2124+
Additional formatting can be done to the boxplot, like suppressing the grid
2125+
(``grid=False``), rotating the labels on the x-axis (e.g. ``rot=45``)
2126+
or changing the fontsize (e.g. ``fontsize=15``):
2127+
2128+
.. plot::
2129+
:context: close-figs
2130+
2131+
>>> boxplot = df.boxplot(grid=False, rot=45, fontsize=15)
2132+
2133+
The parameter ``return_type`` can be used to select the type of element
2134+
returned by `boxplot`. When ``return_type='axes'`` is selected,
2135+
the matplotlib axes on which the boxplot is drawn are returned:
2136+
2137+
>>> boxplot = df.boxplot(column=['Col1','Col2'], return_type='axes')
2138+
>>> type(boxplot)
2139+
<class 'matplotlib.axes._subplots.AxesSubplot'>
2140+
2141+
When grouping with ``by``, a Series mapping columns to ``return_type``
2142+
is returned:
2143+
2144+
>>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2145+
... return_type='axes')
2146+
>>> type(boxplot)
2147+
<class 'pandas.core.series.Series'>
2148+
2149+
If ``return_type`` is `None`, a NumPy array of axes with the same shape
2150+
as ``layout`` is returned:
2151+
2152+
>>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2153+
... return_type=None)
2154+
>>> type(boxplot)
2155+
<class 'numpy.ndarray'>
20422156
"""
20432157

20442158
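
The docstring above places the whiskers at `1.5 * IQR` from the edges of the box. As a rough sketch of that rule, the snippet below computes the quartiles, whisker ends and outliers for a small sample using only the standard library. The helper name `box_stats` and the sample values are illustrative assumptions, not part of pandas or matplotlib. It also assumes linearly interpolated quartiles and whiskers that stop at the most extreme data point inside the `1.5 * IQR` fences, which is how matplotlib's default behaviour is commonly described.

```python
import statistics


def box_stats(values, whis=1.5):
    """Return quartiles, whisker ends and outliers for a 1-D sample.

    Hypothetical helper for illustration only; not a pandas or matplotlib API.
    """
    data = sorted(values)
    # method="inclusive" interpolates linearly between neighbouring data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences sit whis * IQR beyond the edges of the box.
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers end at the most extreme data points that are still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Points past the fences are drawn individually as outliers.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

For this sample the box spans 3.25 to 7.75, the whiskers end at 1 and 9, and 30 is reported as the only outlier.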
