
Commit 7daedd4

[WIP] DOC Fixes #8447 created new example and fixed issues
1 parent aef44c8 commit 7daedd4

1 file changed

+91
-77
lines changed

pandas/plotting/_core.py

@@ -1935,58 +1935,87 @@ def plot_series(data, kind='line', ax=None, # Series unique
19351935

19361936

19371937
_shared_docs['boxplot'] = """
1938-
Make a box-and-whisker plot from DataFrame column optionally grouped
1939-
by some columns or other inputs. The box extends from the Q1 to Q3
1940-
quartile values of the data, with a line at the median (Q2).
1941-
The whiskers extend from the edges of box to show the range of the data.
1942-
Flier points (outliers) are those past the end of the whiskers.
1943-
The position of the whiskers is set by default to 1.5 IQR (`whis=1.5``)
1944-
from the edge of the box.
1938+
Make a box plot from DataFrame columns.
1939+
1940+
Make a box-and-whisker plot from DataFrame columns optionally grouped
1941+
by some other columns. A box plot is a method for graphically depicting
1942+
groups of numerical data through their quartiles.
1943+
The box extends from the Q1 to Q3 quartile values of the data,
1944+
with a line at the median (Q2). The whiskers extend from the edges
1945+
of the box to show the range of the data. The position of the whiskers
1946+
is set by default to 1.5*IQR (IQR = Q3 - Q1) from the edges of the box.
1947+
Outlier points are those past the end of the whiskers.
19451948
19461949
For further details see
1947-
Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot/>`_.
1950+
Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot>`_.
19481951
19491952
Parameters
19501953
----------
1951-
column : column name or list of names, or vector
1954+
column : str or list of str, optional
1955+
Column name or list of names, or vector.
19521956
Can be any valid input to groupby.
1953-
by : string or sequence
1957+
by : str or array-like
19541958
Column in the DataFrame to groupby.
1955-
ax : Matplotlib axes object, (default `None`)
1959+
ax : object of class matplotlib.axes.Axes, default `None`
19561960
The matplotlib axes to be used by boxplot.
1957-
fontsize : int or string
1958-
The font-size used by matplotlib.
1959-
rot : label rotation angle
1960-
The rotation angle of labels.
1961-
grid : boolean( default `True`)
1961+
fontsize : float or str
1962+
Tick label font size in points or as a string (e.g., 'large')
1963+
(see `matplotlib.axes.Axes.tick_params
1964+
<https://matplotlib.org/api/_as_gen/
1965+
matplotlib.axes.Axes.tick_params.html>`_).
1966+
rot : int or float, default 0
1967+
The rotation angle of labels (in degrees)
1968+
with respect to the screen coordinate system.
1969+
grid : boolean, default `True`
19621970
Setting this to True will show the grid.
19631971
figsize : A tuple (width, height) in inches
1964-
The size of the figure to create in inches by default.
1965-
layout : tuple (optional)
1966-
Tuple (rows, columns) used for the layout of the plot.
1967-
return_type : {None, 'axes', 'dict', 'both'}, default None
1968-
The kind of object to return. The default is ``axes``
1969-
'axes' returns the matplotlib axes the boxplot is drawn on;
1970-
'dict' returns a dictionary whose values are the matplotlib
1971-
Lines of the boxplot;
1972-
'both' returns a namedtuple with the axes and dict.
1973-
When grouping with ``by``, a Series mapping columns to ``return_type``
1974-
is returned, unless ``return_type`` is None, in which case a NumPy
1975-
array of axes is returned with the same shape as ``layout``.
1976-
See the prose documentation for more.
1977-
kwds : Keyword Arguments (optional)
1972+
The size of the figure to create in matplotlib.
1973+
layout : tuple (rows, columns) (optional)
1974+
For example, (3, 5) will display the subplots
1975+
using 3 rows and 5 columns, starting from the top-left.
1976+
return_type : {None, 'axes', 'dict', 'both'}, default 'axes'
1977+
The kind of object to return. The default is ``axes``.
1978+
1979+
* 'axes' returns the matplotlib axes the boxplot is drawn on.
1980+
* 'dict' returns a dictionary whose values are the matplotlib
1981+
Lines of the boxplot.
1982+
* 'both' returns a namedtuple with the axes and dict.
1983+
* when grouping with ``by``, a Series mapping columns to
1984+
``return_type`` is returned (e.g.
1985+
``df.boxplot(column=['Col1','Col2'], by='var',return_type='axes')``
1986+
may return ``Series([AxesSubplot(..),AxesSubplot(..)],
1987+
index=['Col1','Col2'])``).
1988+
1989+
If ``return_type`` is `None`, a NumPy array
1990+
of axes with the same shape as ``layout`` is returned
1991+
(e.g. ``df.boxplot(column=['Col1','Col2'],
1992+
by='var',return_type=None)`` may return
1993+
``array([<matplotlib.axes._subplots.AxesSubplot object at ..>,
1994+
<matplotlib.axes._subplots.AxesSubplot object at ..>],
1995+
dtype=object)``).
1996+
**kwds : Keyword Arguments (optional)
19781997
All other plotting keyword arguments to be passed to
1979-
matplotlib's function.
1998+
`matplotlib.pyplot.boxplot <https://matplotlib.org/api/_as_gen/
1999+
matplotlib.pyplot.boxplot.html#matplotlib.pyplot.boxplot>`_.
19802000
19812001
Returns
19822002
-------
1983-
lines : dict
1984-
ax : matplotlib Axes
1985-
(ax, lines): namedtuple
2003+
result:
2004+
Options:
2005+
2006+
* ax : object of class
2007+
matplotlib.axes.Axes (for ``return_type='axes'``)
2008+
* lines : dict (for ``return_type='dict'``)
2009+
* (ax, lines): namedtuple (for ``return_type='both'``)
2010+
* :class:`~pandas.Series` (for ``return_type != None``
2011+
and data grouped with ``by``)
2012+
* :class:`~numpy.ndarray` (for ``return_type=None``
2013+
and data grouped with ``by``)
19862014
19872015
See Also
19882016
--------
19892017
matplotlib.pyplot.boxplot: Make a box and whisker plot.
2018+
matplotlib.pyplot.hist: Make a histogram.
19902019
19912020
Notes
19922021
-----
@@ -1996,72 +2025,57 @@ def plot_series(data, kind='line', ax=None, # Series unique
19962025
19972026
Examples
19982027
--------
2028+
2029+
Boxplots can be created for every column in the dataframe
2030+
by ``df.boxplot()`` or by indicating the columns to be used:
2031+
19992032
.. plot::
20002033
:context: close-figs
20012034
20022035
>>> np.random.seed(1234)
2036+
>>> df = pd.DataFrame(np.random.rand(10,4),
2037+
... columns=['Col1', 'Col2', 'Col3', 'Col4'])
2038+
>>> boxplot = df.boxplot(column=['Col1', 'Col2', 'Col3'])
20032039
2004-
>>> df = pd.DataFrame({
2005-
... u'stratifying_var': np.random.uniform(0, 100, 20),
2006-
... u'price': np.random.normal(100, 5, 20),
2007-
... u'demand': np.random.normal(100, 10, 20)})
2008-
2009-
>>> df[u'quartiles'] = pd.qcut(
2010-
... df[u'stratifying_var'], 4,
2011-
... labels=[u'0-25%%', u'25-50%%', u'50-75%%', u'75-100%%'])
2012-
2013-
>>> df
2014-
stratifying_var price demand quartiles
2015-
0 19.151945 106.605791 108.416747 0-25%%
2016-
1 62.210877 92.265472 123.909605 50-75%%
2017-
2 43.772774 98.986768 100.761996 25-50%%
2018-
3 78.535858 96.720153 94.335541 75-100%%
2019-
4 77.997581 100.967107 100.361419 50-75%%
2020-
5 27.259261 102.767195 79.250224 0-25%%
2021-
6 27.646426 106.590758 102.477922 0-25%%
2022-
7 80.187218 97.653474 91.028432 75-100%%
2023-
8 95.813935 103.377770 98.632052 75-100%%
2024-
9 87.593263 90.914864 100.182892 75-100%%
2025-
10 35.781727 99.084457 107.554140 0-25%%
2026-
11 50.099513 105.294846 102.152686 25-50%%
2027-
12 68.346294 98.010799 108.410088 50-75%%
2028-
13 71.270203 101.687188 85.541899 50-75%%
2029-
14 37.025075 105.237893 85.980267 25-50%%
2030-
15 56.119619 105.229691 98.990818 25-50%%
2031-
16 50.308317 104.318586 94.517576 25-50%%
2032-
17 1.376845 99.389542 98.553805 0-25%%
2033-
18 77.282662 100.623565 103.540203 50-75%%
2034-
19 88.264119 98.386026 99.644870 75-100%%
2035-
2036-
To plot the boxplot of the ``demand`` just put:
2040+
Boxplots of variable distributions grouped by the values of a third variable
2041+
can be created using the option ``by``. For instance:
20372042
20382043
.. plot::
20392044
:context: close-figs
20402045
2041-
>>> boxplot = df.boxplot(column=u'demand', by=u'quartiles')
2046+
>>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2047+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2048+
>>> boxplot = df.boxplot(by='X')
20422049
2043-
Use ``grid=False`` to hide the grid:
2050+
A list of strings (e.g. ``['X','Y']``) can be passed to boxplot
2051+
in order to group the data by combinations of the variables on the x-axis:
20442052
20452053
.. plot::
20462054
:context: close-figs
20472055
2048-
>>> boxplot = df.boxplot(column=u'demand', by=u'quartiles', grid=False)
2056+
>>> df = pd.DataFrame(np.random.rand(10,3),
2057+
... columns=['Col1', 'Col2', 'Col3'])
2058+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2059+
>>> df['Y'] = pd.Series(['A','B','A','B','A','B','A','B','A','B'])
2060+
>>> boxplot = df.boxplot(column=['Col1','Col2'], by=['X','Y'])
20492061
2050-
Optionally, the layout can be changed by setting ``layout=(rows, cols)``:
2062+
The layout of the boxplot can be adjusted by passing a tuple to ``layout``:
20512063
20522064
.. plot::
20532065
:context: close-figs
20542066
2055-
>>> boxplot = df.boxplot(column=[u'price',u'demand'],
2056-
... by=u'quartiles', layout=(1,2),
2057-
... figsize=(8,5))
2067+
>>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2068+
>>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2069+
>>> boxplot = df.boxplot(by='X', layout=(2,1))
2070+
2071+
Additional formatting can be done to the boxplot, like suppressing the grid
2072+
(``grid=False``), rotating the labels on the x-axis (e.g. ``rot=45``)
2073+
or changing the fontsize (e.g. ``fontsize=15``):
20582074
20592075
.. plot::
20602076
:context: close-figs
20612077
2062-
>>> boxplot = df.boxplot(column=[u'price',u'demand'],
2063-
... by=u'quartiles', layout=(2,1),
2064-
... figsize=(5,8))
2078+
>>> boxplot = df.boxplot(grid=False, rot=45, fontsize=15)
20652079
"""
20662080

20672081
@Appender(_shared_docs['boxplot'] % _shared_doc_kwargs)
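
The whisker rule described in the new docstring (whiskers at 1.5*IQR from the box edges, with outliers beyond them) can be illustrated with a minimal standard-library sketch. The helper names `percentile` and `box_stats` are hypothetical and are not part of pandas or matplotlib. The sketch assumes linear-interpolation quartiles and assumes that each whisker is drawn at the most extreme data point still inside the 1.5*IQR fence; it is not the library's implementation.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def box_stats(values, whis=1.5):
    """Return quartiles, whisker ends and outliers for a box plot (sketch)."""
    data = sorted(values)
    q1 = percentile(data, 25)
    median = percentile(data, 50)
    q3 = percentile(data, 75)
    iqr = q3 - q1
    # Fences: the furthest a whisker may reach from each box edge.
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme points inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this sample the value 30 lies past the upper fence (Q3 + 1.5*IQR), so it is reported as an outlier rather than extending the whisker.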
