Commit 9db49c9

[WIP] DOC Fixes #8447 created new example and fixed issues
1 parent b1f0756 commit 9db49c9

File tree

1 file changed

+91
-77
lines changed

pandas/plotting/_core.py

+91-77
@@ -1980,58 +1980,87 @@ def plot_series(data, kind='line', ax=None, # Series unique
19801980

19811981

19821982
_shared_docs['boxplot'] = """
1983-
Make a box-and-whisker plot from DataFrame column optionally grouped
1984-
by some columns or other inputs. The box extends from the Q1 to Q3
1985-
quartile values of the data, with a line at the median (Q2).
1986-
The whiskers extend from the edges of box to show the range of the data.
1987-
Flier points (outliers) are those past the end of the whiskers.
1988-
The position of the whiskers is set by default to 1.5 IQR (`whis=1.5``)
1989-
from the edge of the box.
1983+
Make a box plot from DataFrame columns.
1984+
1985+
Make a box-and-whisker plot from DataFrame columns optionally grouped
1986+
by some other columns. A box plot is a method for graphically depicting
1987+
groups of numerical data through their quartiles.
1988+
The box extends from the Q1 to Q3 quartile values of the data,
1989+
with a line at the median (Q2). The whiskers extend from the edges
1990+
of the box to show the range of the data. The position of the whiskers
1991+
is set by default to 1.5*IQR (IQR = Q3 - Q1) from the edges of the box.
1992+
Outlier points are those past the end of the whiskers.
19901993
19911994
For further details see
1992-
Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot/>`_.
1995+
Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot>`_.
19931996
19941997
Parameters
19951998
----------
1996-
column : column name or list of names, or vector
1999+
column : str or list of str, optional
2000+
Column name or list of names, or vector.
19972001
Can be any valid input to groupby.
1998-
by : string or sequence
2002+
by : str or array-like
19992003
Column in the DataFrame to groupby.
2000-
ax : Matplotlib axes object, (default `None`)
2004+
ax : object of class matplotlib.axes.Axes, default `None`
20012005
The matplotlib axes to be used by boxplot.
2002-
fontsize : int or string
2003-
The font-size used by matplotlib.
2004-
rot : label rotation angle
2005-
The rotation angle of labels.
2006-
grid : boolean( default `True`)
2006+
fontsize : float or str
2007+
Tick label font size in points or as a string (e.g., 'large')
2008+
(see `matplotlib.axes.Axes.tick_params
2009+
<https://matplotlib.org/api/_as_gen/
2010+
matplotlib.axes.Axes.tick_params.html>`_).
2011+
rot : int or float, default 0
2012+
The rotation angle of labels (in degrees)
2013+
with respect to the screen coordinate system.
2014+
grid : boolean, default `True`
20072015
Setting this to True will show the grid.
20082016
figsize : A tuple (width, height) in inches
2009-
The size of the figure to create in inches by default.
2010-
layout : tuple (optional)
2011-
Tuple (rows, columns) used for the layout of the plot.
2012-
return_type : {None, 'axes', 'dict', 'both'}, default None
2013-
The kind of object to return. The default is ``axes``
2014-
'axes' returns the matplotlib axes the boxplot is drawn on;
2015-
'dict' returns a dictionary whose values are the matplotlib
2016-
Lines of the boxplot;
2017-
'both' returns a namedtuple with the axes and dict.
2018-
When grouping with ``by``, a Series mapping columns to ``return_type``
2019-
is returned, unless ``return_type`` is None, in which case a NumPy
2020-
array of axes is returned with the same shape as ``layout``.
2021-
See the prose documentation for more.
2022-
kwds : Keyword Arguments (optional)
2017+
The size of the figure to create in matplotlib.
2018+
layout : tuple (rows, columns) (optional)
2019+
For example, (3, 5) will display the subplots
2020+
using 3 rows and 5 columns, starting from the top-left.
2021+
return_type : {None, 'axes', 'dict', 'both'}, default 'axes'
2022+
The kind of object to return. The default is ``axes``.
2023+
2024+
* 'axes' returns the matplotlib axes the boxplot is drawn on.
2025+
* 'dict' returns a dictionary whose values are the matplotlib
2026+
Lines of the boxplot.
2027+
* 'both' returns a namedtuple with the axes and dict.
2028+
* when grouping with ``by``, a Series mapping columns to
2029+
``return_type`` is returned (e.g.
2030+
``df.boxplot(column=['Col1','Col2'], by='var',return_type='axes')``
2031+
may return ``Series([AxesSubplot(..),AxesSubplot(..)],
2032+
index=['Col1','Col2'])``).
2033+
2034+
If ``return_type`` is `None`, a NumPy array
2035+
of axes with the same shape as ``layout`` is returned
2036+
(e.g. ``df.boxplot(column=['Col1','Col2'],
2037+
by='var',return_type=None)`` may return
2038+
``array([<matplotlib.axes._subplots.AxesSubplot object at ..>,
2039+
<matplotlib.axes._subplots.AxesSubplot object at ..>],
2040+
dtype=object)``).
2041+
**kwds : Keyword Arguments (optional)
20232042
All other plotting keyword arguments to be passed to
2024-
matplotlib's function.
2043+
`matplotlib.pyplot.boxplot <https://matplotlib.org/api/_as_gen/
2044+
matplotlib.pyplot.boxplot.html#matplotlib.pyplot.boxplot>`_.
20252045
20262046
Returns
20272047
-------
2028-
lines : dict
2029-
ax : matplotlib Axes
2030-
(ax, lines): namedtuple
2048+
result:
2049+
Options:
2050+
2051+
* ax : object of class
2052+
matplotlib.axes.Axes (for ``return_type='axes'``)
2053+
* lines : dict (for ``return_type='dict'``)
2054+
* (ax, lines): namedtuple (for ``return_type='both'``)
2055+
* :class:`~pandas.Series` (for ``return_type != None``
2056+
and data grouped with ``by``)
2057+
* :class:`~numpy.ndarray` (for ``return_type=None``
2058+
and data grouped with ``by``)
20312059
20322060
See Also
20332061
--------
20342062
matplotlib.pyplot.boxplot: Make a box and whisker plot.
2063+
matplotlib.pyplot.hist: Make a histogram.
20352064
20362065
Notes
20372066
-----
@@ -2041,72 +2070,57 @@ def plot_series(data, kind='line', ax=None, # Series unique
20412070
20422071
Examples
20432072
--------
2073+
2074+
Boxplots can be created for every column in the dataframe
2075+
by ``df.boxplot()`` or indicating the columns to be used:
2076+
20442077
.. plot::
20452078
:context: close-figs
20462079
20472080
>>> np.random.seed(1234)
2081+
>>> df = pd.DataFrame(np.random.rand(10,4),
2082+
... columns=['Col1', 'Col2', 'Col3', 'Col4'])
2083+
>>> boxplot = df.boxplot(column=['Col1', 'Col2', 'Col3'])
20482084
2049-
>>> df = pd.DataFrame({
2050-
... u'stratifying_var': np.random.uniform(0, 100, 20),
2051-
... u'price': np.random.normal(100, 5, 20),
2052-
... u'demand': np.random.normal(100, 10, 20)})
2053-
2054-
>>> df[u'quartiles'] = pd.qcut(
2055-
... df[u'stratifying_var'], 4,
2056-
... labels=[u'0-25%%', u'25-50%%', u'50-75%%', u'75-100%%'])
2057-
2058-
>>> df
2059-
stratifying_var price demand quartiles
2060-
0 19.151945 106.605791 108.416747 0-25%%
2061-
1 62.210877 92.265472 123.909605 50-75%%
2062-
2 43.772774 98.986768 100.761996 25-50%%
2063-
3 78.535858 96.720153 94.335541 75-100%%
2064-
4 77.997581 100.967107 100.361419 50-75%%
2065-
5 27.259261 102.767195 79.250224 0-25%%
2066-
6 27.646426 106.590758 102.477922 0-25%%
2067-
7 80.187218 97.653474 91.028432 75-100%%
2068-
8 95.813935 103.377770 98.632052 75-100%%
2069-
9 87.593263 90.914864 100.182892 75-100%%
2070-
10 35.781727 99.084457 107.554140 0-25%%
2071-
11 50.099513 105.294846 102.152686 25-50%%
2072-
12 68.346294 98.010799 108.410088 50-75%%
2073-
13 71.270203 101.687188 85.541899 50-75%%
2074-
14 37.025075 105.237893 85.980267 25-50%%
2075-
15 56.119619 105.229691 98.990818 25-50%%
2076-
16 50.308317 104.318586 94.517576 25-50%%
2077-
17 1.376845 99.389542 98.553805 0-25%%
2078-
18 77.282662 100.623565 103.540203 50-75%%
2079-
19 88.264119 98.386026 99.644870 75-100%%
2080-
2081-
To plot the boxplot of the ``demand`` just put:
2085+
Boxplots of variable distributions grouped by the values of a third variable
2086+
can be created using the option ``by``. For instance:
20822087
20832088
.. plot::
20842089
:context: close-figs
20852090
2086-
>>> boxplot = df.boxplot(column=u'demand', by=u'quartiles')
2091+
>>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2092+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2093+
>>> boxplot = df.boxplot(by='X')
20872094
2088-
Use ``grid=False`` to hide the grid:
2095+
A list of strings (e.g. ``['X','Y']``) can be passed to boxplot
2096+
in order to group the data by combination of the variables in the x-axis:
20892097
20902098
.. plot::
20912099
:context: close-figs
20922100
2093-
>>> boxplot = df.boxplot(column=u'demand', by=u'quartiles', grid=False)
2101+
>>> df = pd.DataFrame(np.random.rand(10,3),
2102+
... columns=['Col1', 'Col2', 'Col3'])
2103+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2104+
>>> df['Y'] = pd.Series(['A','B','A','B','A','B','A','B','A','B'])
2105+
>>> boxplot = df.boxplot(column=['Col1','Col2'], by=['X','Y'])
20942106
2095-
Optionally, the layout can be changed by setting ``layout=(rows, cols)``:
2107+
The layout of the boxplot can be adjusted by passing a tuple to ``layout``:
20962108
20972109
.. plot::
20982110
:context: close-figs
20992111
2100-
>>> boxplot = df.boxplot(column=[u'price',u'demand'],
2101-
... by=u'quartiles', layout=(1,2),
2102-
... figsize=(8,5))
2112+
>>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2113+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2114+
>>> boxplot = df.boxplot(by='X', layout=(2,1))
2115+
2116+
Additional formatting can be done to the boxplot, like suppressing the grid
2117+
(``grid=False``), rotating the labels in the x-axis (e.g. ``rot=45``)
2118+
or changing the fontsize (e.g. ``fontsize=15``):
21032119
21042120
.. plot::
21052121
:context: close-figs
21062122
2107-
>>> boxplot = df.boxplot(column=[u'price',u'demand'],
2108-
... by=u'quartiles', layout=(2,1),
2109-
... figsize=(5,8))
2123+
>>> boxplot = df.boxplot(grid=False, rot=45, fontsize=15)
21102124
"""
21112125

21122126
@Appender(_shared_docs['boxplot'] % _shared_doc_kwargs)
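
The revised docstring places the whiskers at 1.5*IQR (IQR = Q3 - Q1) from the edges of the box and calls points beyond them outliers. The sketch below works through that arithmetic with the standard library only. The helper name `whisker_limits` and the sample data are hypothetical, and the quartiles use linear interpolation, which is an assumption about how the plotting backend computes them. In matplotlib the drawn whiskers stop at the most extreme data point inside these limits, so the values here are the cut-offs rather than the drawn whisker ends.

```python
from statistics import quantiles


def whisker_limits(values, whis=1.5):
    """Return the (lower, upper) cut-offs at whis * IQR beyond Q1 and Q3.

    Hypothetical helper for illustration; not part of pandas or matplotlib.
    """
    # method="inclusive" interpolates linearly between data points.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
low, high = whisker_limits(data)

# Points outside the cut-offs would be drawn as outlier (flier) markers.
outliers = [v for v in data if v < low or v > high]
print(low, high, outliers)
```

For this sample Q1 = 3.25 and Q3 = 7.75, so the IQR is 4.5 and only the value 50 falls outside the cut-offs.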

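
The ``return_type`` options documented in this diff can be checked with a short script. This is a minimal sketch, assuming pandas, NumPy and matplotlib are installed and that the non-interactive ``Agg`` backend is acceptable; the column names and the grouping column ``X`` mirror the docstring examples, while the random data is arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is opened

import numpy as np
import pandas as pd

rng = np.random.default_rng(1234)
df = pd.DataFrame(rng.random((10, 2)), columns=["Col1", "Col2"])
df["X"] = ["A"] * 5 + ["B"] * 5
cols = ["Col1", "Col2"]

# Without ``by``: one object per call, depending on return_type.
ax = df.boxplot(column=cols, return_type="axes")    # matplotlib Axes
lines = df.boxplot(column=cols, return_type="dict")  # dict of matplotlib artists
both = df.boxplot(column=cols, return_type="both")   # namedtuple with ax and lines

# With ``by``: a Series indexed by column, or an array of axes for None.
grouped = df.boxplot(column=cols, by="X", return_type="axes")
grouped_none = df.boxplot(column=cols, by="X", return_type=None)

print(type(ax).__name__, sorted(lines), type(grouped).__name__,
      type(grouped_none).__name__)
```

Inspecting the types this way is a quick way to confirm the behaviour described in the Returns section against the installed pandas version, since the exact classes printed can differ between releases.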