@@ -1995,50 +1995,164 @@ def plot_series(data, kind='line', ax=None, # Series unique
1995
1995
1996
1996
1997
1997
_shared_docs['boxplot'] = """
1998
- Make a box plot from DataFrame column optionally grouped by some columns or
1999
- other inputs
1998
+ Make a box plot from DataFrame columns.
1999
+
2000
+ Make a box-and-whisker plot from DataFrame columns, optionally grouped
2001
+ by some other columns. A box plot is a method for graphically depicting
2002
+ groups of numerical data through their quartiles.
2003
+ The box extends from the Q1 to Q3 quartile values of the data,
2004
+ with a line at the median (Q2). The whiskers extend from the edges
2005
+ of the box to show the range of the data. The position of the whiskers
2006
+ is set by default to `1.5 * IQR (IQR = Q3 - Q1)` from the edges of the box.
2007
+ Outlier points are those past the end of the whiskers.
2008
+
2009
+ For further details see
2010
+ Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot>`_.
2000
2011
2001
2012
Parameters
2002
2013
----------
2003
- data : the pandas object holding the data
2004
- column : column name or list of names, or vector
2005
- Can be any valid input to groupby
2006
- by : string or sequence
2007
- Column in the DataFrame to group by
2008
- ax : Matplotlib axes object, optional
2009
- fontsize : int or string
2010
- rot : label rotation angle
2014
+ column : str or list of str, optional
2015
+ Column name or list of names, or vector.
2016
+ Can be any valid input to :meth:`pandas.DataFrame.groupby`.
2017
+ by : str or array-like, optional
2018
+ Column in the DataFrame to :meth:`pandas.DataFrame.groupby`.
2019
+ One box-plot will be done per value of columns in `by`.
2020
+ ax : object of class matplotlib.axes.Axes, optional
2021
+ The matplotlib axes to be used by boxplot.
2022
+ fontsize : float or str
2023
+ Tick label font size in points or as a string (e.g., `large`).
2024
+ rot : int or float, default 0
2025
+ The rotation angle of labels (in degrees)
2026
+ with respect to the screen coordinate system.
2027
+ grid : bool, default True
2028
+ Setting this to True will show the grid.
2011
2029
figsize : A tuple (width, height) in inches
2012
- grid : Setting this to True will show the grid
2013
- layout : tuple (optional)
2014
- (rows, columns) for the layout of the plot
2015
- return_type : {None, 'axes', 'dict', 'both'}, default None
2016
- The kind of object to return. The default is ``axes``
2017
- 'axes' returns the matplotlib axes the boxplot is drawn on;
2018
- 'dict' returns a dictionary whose values are the matplotlib
2019
- Lines of the boxplot;
2020
- 'both' returns a namedtuple with the axes and dict.
2021
-
2022
- When grouping with ``by``, a Series mapping columns to ``return_type``
2023
- is returned, unless ``return_type`` is None, in which case a NumPy
2024
- array of axes is returned with the same shape as ``layout``.
2025
- See the prose documentation for more.
2026
-
2027
- `**kwds` : Keyword Arguments
2030
+ The size of the figure to create in matplotlib.
2031
+ layout : tuple (rows, columns), optional
2032
+ For example, (3, 5) will display the subplots
2033
+ using 3 rows and 5 columns, starting from the top-left.
2034
+ return_type : {'axes', 'dict', 'both'} or None, default 'axes'
2035
+ The kind of object to return. The default is ``axes``.
2036
+
2037
+ * 'axes' returns the matplotlib axes the boxplot is drawn on.
2038
+ * 'dict' returns a dictionary whose values are the matplotlib
2039
+ Lines of the boxplot.
2040
+ * 'both' returns a namedtuple with the axes and dict.
2041
+ * when grouping with ``by``, a Series mapping columns to
2042
+ ``return_type`` is returned.
2043
+
2044
+ If ``return_type`` is `None`, a NumPy array
2045
+ of axes with the same shape as ``layout`` is returned.
2046
+ **kwds
2028
2047
All other plotting keyword arguments to be passed to
2029
- matplotlib's boxplot function
2048
+ :func:`matplotlib.pyplot.boxplot`.
2030
2049
2031
2050
Returns
2032
2051
-------
2033
- lines : dict
2034
- ax : matplotlib Axes
2035
- (ax, lines): namedtuple
2052
+ result :
2053
+
2054
+ The return type depends on the `return_type` parameter:
2055
+
2056
+ * 'axes' : object of class matplotlib.axes.Axes
2057
+ * 'dict' : dict of matplotlib.lines.Line2D objects
2058
+ * 'both' : a namedtuple with structure (ax, lines)
2059
+
2060
+ For data grouped with ``by``:
2061
+
2062
+ * :class:`~pandas.Series`
2063
+ * :class:`~numpy.ndarray` (for ``return_type = None``)
2064
+
2065
+ See Also
2066
+ --------
2067
+ Series.plot.hist : Make a histogram.
2068
+ matplotlib.pyplot.boxplot : Matplotlib equivalent plot.
2036
2069
2037
2070
Notes
2038
2071
-----
2039
2072
Use ``return_type='dict'`` when you want to tweak the appearance
2040
2073
of the lines after plotting. In this case a dict containing the Lines
2041
2074
making up the boxes, caps, fliers, medians, and whiskers is returned.
2075
+
2076
+ Examples
2077
+ --------
2078
+
2079
+ Boxplots can be created for every column in the dataframe
2080
+ by ``df.boxplot()`` or indicating the columns to be used:
2081
+
2082
+ .. plot::
2083
+ :context: close-figs
2084
+
2085
+ >>> np.random.seed(1234)
2086
+ >>> df = pd.DataFrame(np.random.randn(10, 4),
2087
+ ... columns=['Col1', 'Col2', 'Col3', 'Col4'])
2088
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2', 'Col3'])
2089
+
2090
+ Boxplots of variable distributions grouped by the values of a third
2091
+ variable can be created using the option ``by``. For instance:
2092
+
2093
+ .. plot::
2094
+ :context: close-figs
2095
+
2096
+ >>> df = pd.DataFrame(np.random.randn(10, 2),
2097
+ ... columns=['Col1', 'Col2'])
2098
+ >>> df['X'] = pd.Series(['A', 'A', 'A', 'A', 'A',
2099
+ ... 'B', 'B', 'B', 'B', 'B'])
2100
+ >>> boxplot = df.boxplot(by='X')
2101
+
2102
+ A list of strings (e.g. ``['X', 'Y']``) can be passed to boxplot
2103
+ in order to group the data by a combination of the variables on the x-axis:
2104
+
2105
+ .. plot::
2106
+ :context: close-figs
2107
+
2108
+ >>> df = pd.DataFrame(np.random.randn(10, 3),
2109
+ ... columns=['Col1', 'Col2', 'Col3'])
2110
+ >>> df['X'] = pd.Series(['A', 'A', 'A', 'A', 'A',
2111
+ ... 'B', 'B', 'B', 'B', 'B'])
2112
+ >>> df['Y'] = pd.Series(['A', 'B', 'A', 'B', 'A',
2113
+ ... 'B', 'A', 'B', 'A', 'B'])
2114
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2'], by=['X', 'Y'])
2115
+
2116
+ The layout of the boxplot can be adjusted by giving a tuple to ``layout``:
2117
+
2118
+ .. plot::
2119
+ :context: close-figs
2120
+
2121
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2122
+ ... layout=(2, 1))
2123
+
2124
+ Additional formatting can be done to the boxplot, like suppressing the grid
2125
+ (``grid=False``), rotating the labels on the x-axis (e.g. ``rot=45``)
2126
+ or changing the fontsize (e.g. ``fontsize=15``):
2127
+
2128
+ .. plot::
2129
+ :context: close-figs
2130
+
2131
+ >>> boxplot = df.boxplot(grid=False, rot=45, fontsize=15)
2132
+
2133
+ The parameter ``return_type`` can be used to select the type of element
2134
+ returned by `boxplot`. When ``return_type='axes'`` is selected,
2135
+ the matplotlib axes on which the boxplot is drawn are returned:
2136
+
2137
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2'], return_type='axes')
2138
+ >>> type(boxplot)
2139
+ <class 'matplotlib.axes._subplots.AxesSubplot'>
2140
+
2141
+ When grouping with ``by``, a Series mapping columns to ``return_type``
2142
+ is returned:
2143
+
2144
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2145
+ ... return_type='axes')
2146
+ >>> type(boxplot)
2147
+ <class 'pandas.core.series.Series'>
2148
+
2149
+ If ``return_type`` is `None`, a NumPy array of axes with the same shape
2150
+ as ``layout`` is returned:
2151
+
2152
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2'], by='X',
2153
+ ... return_type=None)
2154
+ >>> type(boxplot)
2155
+ <class 'numpy.ndarray'>
2042
2156
"""
2043
2157
2044
2158
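The whisker rule described in the new docstring summary (box from Q1 to Q3, whiskers within `1.5 * IQR` of the box edges, outliers beyond them) can be sketched in plain Python. This is an illustrative sketch only, not the code pandas or matplotlib runs: the helper name `box_stats` is hypothetical, and it assumes linearly interpolated quartiles (`method="inclusive"`), with each whisker drawn at the most extreme data point that still lies inside the fence.

```python
import statistics


def box_stats(values, whis=1.5):
    """Return (q1, median, q3, lower_whisker, upper_whisker, outliers).

    Hypothetical helper for illustration; needs at least two values.
    """
    data = sorted(values)
    # Quartiles with linear interpolation between neighbouring data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers stop at the furthest data points inside the fences,
    # and never retreat inside the box itself.
    lower_whisker = min([q1] + [x for x in data if x >= lower_fence])
    upper_whisker = max([q3] + [x for x in data if x <= upper_fence])
    # Anything past the fences is drawn as an individual outlier point.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return q1, median, q3, lower_whisker, upper_whisker, outliers


print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30]))
```

For this sample the fences sit at -3.5 and 14.5, so the whiskers end at the data points 1 and 9, and 30 is reported as the only outlier.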
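The Notes section recommends ``return_type='dict'`` for tweaking lines after plotting, but the added Examples do not show it. A minimal sketch of that workflow follows, assuming pandas, NumPy, and matplotlib are installed; the `Agg` backend, the seed, and the styling choices are assumptions made here for illustration and are not part of the diff.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so no window is needed

import numpy as np
import pandas as pd

rng = np.random.RandomState(1234)
df = pd.DataFrame(rng.randn(10, 4),
                  columns=['Col1', 'Col2', 'Col3', 'Col4'])

# With return_type='dict', boxplot returns a dict of matplotlib Line2D lists
# keyed by component name ('boxes', 'medians', 'whiskers', 'caps', 'fliers').
lines = df.boxplot(column=['Col1', 'Col2'], return_type='dict')

# Restyle the median lines after the plot has been drawn.
for median in lines['medians']:
    median.set_color('red')
    median.set_linewidth(2)
```

If the docstring gains a similar example, it would probably belong next to the existing ``return_type='axes'`` snippet so that the return types are shown side by side.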