@@ -1995,58 +1995,87 @@ def plot_series(data, kind='line', ax=None, # Series unique
1995
1995
1996
1996
1997
1997
_shared_docs['boxplot'] = """
1998
- Make a box-and-whisker plot from DataFrame column optionally grouped
1999
- by some columns or other inputs. The box extends from the Q1 to Q3
2000
- quartile values of the data, with a line at the median (Q2).
2001
- The whiskers extend from the edges of box to show the range of the data.
2002
- Flier points (outliers) are those past the end of the whiskers.
2003
- The position of the whiskers is set by default to 1.5 IQR (`whis=1.5``)
2004
- from the edge of the box.
1998
+ Make a box plot from DataFrame columns.
1999
+
2000
+ Make a box-and-whisker plot from DataFrame columns optionally grouped
2001
+ by some other columns. A box plot is a method for graphically depicting
2002
+ groups of numerical data through their quartiles.
2003
+ The box extends from the Q1 to Q3 quartile values of the data,
2004
+ with a line at the median (Q2). The whiskers extend from the edges
2005
+ of the box to show the range of the data. The position of the whiskers
2006
+ is set by default to 1.5*IQR (IQR = Q3 - Q1) from the edges of the box.
2007
+ Outlier points are those past the end of the whiskers.
2005
2008
2006
2009
For further details see
2007
- Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot/ >`_.
2010
+ Wikipedia's entry for `boxplot <https://en.wikipedia.org/wiki/Box_plot>`_.
2008
2011
2009
2012
Parameters
2010
2013
----------
2011
- column : column name or list of names, or vector
2014
+ column : str or list of str, optional
2015
+ Column name or list of names, or vector.
2012
2016
Can be any valid input to groupby.
2013
- by : string or sequence
2017
+ by : str or array-like
2014
2018
Column in the DataFrame to groupby.
2015
- ax : Matplotlib axes object, ( default `None`)
2019
+ ax : object of class matplotlib.axes.Axes, default `None`
2016
2020
The matplotlib axes to be used by boxplot.
2017
- fontsize : int or string
2018
- The font-size used by matplotlib.
2019
- rot : label rotation angle
2020
- The rotation angle of labels.
2021
- grid : boolean( default `True`)
2021
+ fontsize : float or str
2022
+ Tick label font size in points or as a string (e.g., 'large')
2023
+ (see `matplotlib.axes.Axes.tick_params
2024
+ <https://matplotlib.org/api/_as_gen/
2025
+ matplotlib.axes.Axes.tick_params.html>`_).
2026
+ rot : int or float, default 0
2027
+ The rotation angle of labels (in degrees)
2028
+ with respect to the screen coordinate system.
2029
+ grid : boolean, default `True`
2022
2030
Setting this to True will show the grid.
2023
2031
figsize : A tuple (width, height) in inches
2024
- The size of the figure to create in inches by default.
2025
- layout : tuple (optional)
2026
- Tuple (rows, columns) used for the layout of the plot.
2027
- return_type : {None, 'axes', 'dict', 'both'}, default None
2028
- The kind of object to return. The default is ``axes``
2029
- 'axes' returns the matplotlib axes the boxplot is drawn on;
2030
- 'dict' returns a dictionary whose values are the matplotlib
2031
- Lines of the boxplot;
2032
- 'both' returns a namedtuple with the axes and dict.
2033
- When grouping with ``by``, a Series mapping columns to ``return_type``
2034
- is returned, unless ``return_type`` is None, in which case a NumPy
2035
- array of axes is returned with the same shape as ``layout``.
2036
- See the prose documentation for more.
2037
- kwds : Keyword Arguments (optional)
2032
+ The size of the figure to create in matplotlib.
2033
+ layout : tuple (rows, columns) (optional)
2034
+ For example, (3, 5) will display the subplots
2035
+ using 3 rows and 5 columns, starting from the top-left.
2036
+ return_type : {None, 'axes', 'dict', 'both'}, default 'axes'
2037
+ The kind of object to return. The default is ``axes``.
2038
+
2039
+ * 'axes' returns the matplotlib axes the boxplot is drawn on.
2040
+ * 'dict' returns a dictionary whose values are the matplotlib
2041
+ Lines of the boxplot.
2042
+ * 'both' returns a namedtuple with the axes and dict.
2043
+ * when grouping with ``by``, a Series mapping columns to
2044
+ ``return_type`` is returned (e.g.
2045
+ ``df.boxplot(column=['Col1','Col2'], by='var',return_type='axes')``
2046
+ may return ``Series([AxesSubplot(..),AxesSubplot(..)],
2047
+ index=['Col1','Col2'])``).
2048
+
2049
+ If ``return_type`` is `None`, a NumPy array
2050
+ of axes with the same shape as ``layout`` is returned
2051
+ (e.g. ``df.boxplot(column=['Col1','Col2'],
2052
+ by='var',return_type=None)`` may return a
2053
+ ``array([<matplotlib.axes._subplots.AxesSubplot object at ..>,
2054
+ <matplotlib.axes._subplots.AxesSubplot object at ..>],
2055
+ dtype=object)``).
2056
+ **kwds : Keyword Arguments (optional)
2038
2057
All other plotting keyword arguments to be passed to
2039
- matplotlib's function.
2058
+ `matplotlib.pyplot.boxplot <https://matplotlib.org/api/_as_gen/
2059
+ matplotlib.pyplot.boxplot.html#matplotlib.pyplot.boxplot>`_.
2040
2060
2041
2061
Returns
2042
2062
-------
2043
- lines : dict
2044
- ax : matplotlib Axes
2045
- (ax, lines): namedtuple
2063
+ result:
2064
+ Options:
2065
+
2066
+ * ax : object of class
2067
+ matplotlib.axes.Axes (for ``return_type='axes'``)
2068
+ * lines : dict (for ``return_type='dict'``)
2069
+ * (ax, lines): namedtuple (for ``return_type='both'``)
2070
+ * :class:`~pandas.Series` (for ``return_type != None``
2071
+ and data grouped with ``by``)
2072
+ * :class:`~numpy.ndarray` (for ``return_type=None``
2073
+ and data grouped with ``by``)
2046
2074
2047
2075
See Also
2048
2076
--------
2049
2077
matplotlib.pyplot.boxplot: Make a box and whisker plot.
2078
+ matplotlib.pyplot.hist: Make a histogram.
2050
2079
2051
2080
Notes
2052
2081
-----
@@ -2056,72 +2085,57 @@ def plot_series(data, kind='line', ax=None, # Series unique
2056
2085
2057
2086
Examples
2058
2087
--------
2088
+
2089
+ Boxplots can be created for every column in the dataframe
2090
+ by ``df.boxplot()`` or by indicating the columns to be used:
2091
+
2059
2092
.. plot::
2060
2093
:context: close-figs
2061
2094
2062
2095
>>> np.random.seed(1234)
2096
+ >>> df = pd.DataFrame(np.random.rand(10,4),
2097
+ ... columns=['Col1', 'Col2', 'Col3', 'Col4'])
2098
+ >>> boxplot = df.boxplot(column=['Col1', 'Col2', 'Col3'])
2063
2099
2064
- >>> df = pd.DataFrame({
2065
- ... u'stratifying_var': np.random.uniform(0, 100, 20),
2066
- ... u'price': np.random.normal(100, 5, 20),
2067
- ... u'demand': np.random.normal(100, 10, 20)})
2068
-
2069
- >>> df[u'quartiles'] = pd.qcut(
2070
- ... df[u'stratifying_var'], 4,
2071
- ... labels=[u'0-25%%', u'25-50%%', u'50-75%%', u'75-100%%'])
2072
-
2073
- >>> df
2074
- stratifying_var price demand quartiles
2075
- 0 19.151945 106.605791 108.416747 0-25%%
2076
- 1 62.210877 92.265472 123.909605 50-75%%
2077
- 2 43.772774 98.986768 100.761996 25-50%%
2078
- 3 78.535858 96.720153 94.335541 75-100%%
2079
- 4 77.997581 100.967107 100.361419 50-75%%
2080
- 5 27.259261 102.767195 79.250224 0-25%%
2081
- 6 27.646426 106.590758 102.477922 0-25%%
2082
- 7 80.187218 97.653474 91.028432 75-100%%
2083
- 8 95.813935 103.377770 98.632052 75-100%%
2084
- 9 87.593263 90.914864 100.182892 75-100%%
2085
- 10 35.781727 99.084457 107.554140 0-25%%
2086
- 11 50.099513 105.294846 102.152686 25-50%%
2087
- 12 68.346294 98.010799 108.410088 50-75%%
2088
- 13 71.270203 101.687188 85.541899 50-75%%
2089
- 14 37.025075 105.237893 85.980267 25-50%%
2090
- 15 56.119619 105.229691 98.990818 25-50%%
2091
- 16 50.308317 104.318586 94.517576 25-50%%
2092
- 17 1.376845 99.389542 98.553805 0-25%%
2093
- 18 77.282662 100.623565 103.540203 50-75%%
2094
- 19 88.264119 98.386026 99.644870 75-100%%
2095
-
2096
- To plot the boxplot of the ``demand`` just put:
2100
+ Boxplots of variable distributions grouped by the values of a third variable
2101
+ can be created using the option ``by``. For instance:
2097
2102
2098
2103
.. plot::
2099
2104
:context: close-figs
2100
2105
2101
- >>> boxplot = df.boxplot(column=u'demand', by=u'quartiles')
2106
+ >>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2107
+ >>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2108
+ >>> boxplot = df.boxplot(by='X')
2102
2109
2103
- Use ``grid=False`` to hide the grid:
2110
+ A list of strings (e.g. ``['X', 'Y']``) can be passed to boxplot
2111
+ in order to group the data by combinations of the variables on the x-axis:
2104
2112
2105
2113
.. plot::
2106
2114
:context: close-figs
2107
2115
2108
- >>> boxplot = df.boxplot(column=u'demand', by=u'quartiles', grid=False)
2116
+ >>> df = pd.DataFrame(np.random.rand(10,3),
2117
+ ... columns=['Col1', 'Col2', 'Col3'])
2118
+ >>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2119
+ >>> df['Y'] = pd.Series(['A','B','A','B','A','B','A','B','A','B'])
2120
+ >>> boxplot = df.boxplot(column=['Col1','Col2'], by=['X','Y'])
2109
2121
2110
- Optionally, the layout can be changed by setting ``layout=(rows, cols) ``:
2122
+ The layout of the boxplot can be adjusted by passing a tuple to ``layout``:
2111
2123
2112
2124
.. plot::
2113
2125
:context: close-figs
2114
2126
2115
- >>> boxplot = df.boxplot(column=[u'price',u'demand'],
2116
- ... by=u'quartiles', layout=(1,2),
2117
- ... figsize=(8,5))
2127
+ >>> df = pd.DataFrame(np.random.rand(10,2), columns=['Col1', 'Col2'])
2128
+ >>> df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
2129
+ >>> boxplot = df.boxplot(by='X', layout=(2,1))
2130
+
2131
+ Additional formatting can be done to the boxplot, like suppressing the grid
2132
+ (``grid=False``), rotating the labels on the x-axis (e.g. ``rot=45``)
2133
+ or changing the fontsize (e.g. ``fontsize=15``):
2118
2134
2119
2135
.. plot::
2120
2136
:context: close-figs
2121
2137
2122
- >>> boxplot = df.boxplot(column=[u'price',u'demand'],
2123
- ... by=u'quartiles', layout=(2,1),
2124
- ... figsize=(5,8))
2138
+ >>> boxplot = df.boxplot(grid=False, rot=45, fontsize=15)
2125
2139
"""
2126
2140
2127
2141
@Appender(_shared_docs['boxplot'] % _shared_doc_kwargs)
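
The new docstring places the whiskers at 1.5*IQR from the edges of the box and calls points beyond them outliers. The sketch below works through that arithmetic in plain Python. The helper name `whisker_summary` is hypothetical and not part of pandas or matplotlib. The `inclusive` quantile method is an assumption, chosen to mirror linear-interpolation percentiles. The whisker ends are drawn at the most extreme data points that still fall inside the 1.5*IQR limits, which is how matplotlib behaves by default as far as I know.

```python
from statistics import quantiles


def whisker_summary(values, whis=1.5):
    """Illustrative helper (hypothetical name): box, whisker and outlier values."""
    # Q1, median (Q2) and Q3 using linear interpolation between data points.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Limits located whis * IQR away from the edges of the box.
    low_limit = q1 - whis * iqr
    high_limit = q3 + whis * iqr
    inside = [v for v in values if low_limit <= v <= high_limit]
    outliers = [v for v in values if v < low_limit or v > high_limit]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme points still inside the limits.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = whisker_summary(data)
print(summary)
```

For this sample the box spans 3.25 to 7.75 (IQR 4.5), the upper limit is 14.5, and the value 30 falls outside it, so it would be drawn as an outlier point.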
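
The ``return_type`` section of the docstring lists several possible return objects. A quick way to check them is a sketch like the one below. It assumes pandas, NumPy and matplotlib are installed and selects the non-interactive ``Agg`` backend so that it can run without a display. The variable names are illustrative only.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run, no GUI backend needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(1234)
df = pd.DataFrame(np.random.rand(10, 2), columns=["Col1", "Col2"])
df["X"] = ["A"] * 5 + ["B"] * 5
cols = ["Col1", "Col2"]

# Ungrouped data: the return value follows return_type directly.
axes_only = df.boxplot(column=cols, return_type="axes")
lines = df.boxplot(column=cols, return_type="dict")
both = df.boxplot(column=cols, return_type="both")

# Grouped with ``by``: a Series indexed by column name ...
grouped = df.boxplot(column=cols, by="X", return_type="axes")
# ... or, with return_type=None, a NumPy array of axes.
grouped_none = df.boxplot(column=cols, by="X", return_type=None)

print(type(axes_only).__name__)
print(sorted(lines))
print(both._fields)
print(list(grouped.index))
print(type(grouped_none).__name__)

plt.close("all")
```

Printing the types rather than the objects keeps the output independent of the matplotlib version, since the repr of an axes object has changed between releases.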