@@ -2613,26 +2613,73 @@ def hist(self, bins=10, **kwds):
     def kde(self, bw_method=None, ind=None, **kwds):
         """
-        Kernel Density Estimate plot
+        Kernel Density Estimate plot using Gaussian kernels.
+
+        In statistics, kernel density estimation (KDE) is a non-parametric way
+        to estimate the probability density function (PDF) of a random
+        variable. This function uses Gaussian kernels and includes automatic
+        bandwidth determination.
 
         Parameters
         ----------
-        bw_method: str, scalar or callable, optional
-            The method used to calculate the estimator bandwidth. This can be
+        bw_method : str, scalar or callable, optional
+            The method used to calculate the estimator bandwidth. This can be
             'scott', 'silverman', a scalar constant or a callable.
             If None (default), 'scott' is used.
             See :class:`scipy.stats.gaussian_kde` for more information.
         ind : NumPy array or integer, optional
-            Evaluation points. If None (default), 1000 equally spaced points
-            are used. If `ind` is a NumPy array, the kde is evaluated at the
-            points passed. If `ind` is an integer, `ind` number of equally
-            spaced points are used.
-        `**kwds` : optional
+            Evaluation points for the estimated PDF. If None (default),
+            1000 equally spaced points are used. If `ind` is a NumPy array, the
+            KDE is evaluated at the points passed. If `ind` is an integer,
+            `ind` number of equally spaced points are used.
+        kwds : optional
             Keyword arguments to pass on to :py:meth:`pandas.Series.plot`.
 
         Returns
         -------
         axes : matplotlib.AxesSubplot or np.array of them
+
+        See Also
+        --------
+        scipy.stats.gaussian_kde : Representation of a kernel-density
+            estimate using Gaussian kernels. This is the class used
+            internally to estimate the PDF.
+
+        Examples
+        --------
+        Given a Series of points randomly sampled from an unknown
+        distribution, estimate its density using KDE with automatic
+        bandwidth determination and plot the result, evaluated at
+        1000 equally spaced points (the default):
+
+        .. plot::
+            :context: close-figs
+
+            >>> s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])
+            >>> ax = s.plot.kde()
+
+        A scalar bandwidth can be specified. Using a bandwidth that is too
+        small can lead to overfitting, while a bandwidth that is too large
+        can result in underfitting:
+
+        .. plot::
+            :context: close-figs
+
+            >>> ax = s.plot.kde(bw_method=0.3)
+
+        .. plot::
+            :context: close-figs
+
+            >>> ax = s.plot.kde(bw_method=3)
+
+        Finally, the `ind` parameter determines the evaluation points for the
+        plot of the estimated PDF:
+
+        .. plot::
+            :context: close-figs
+
+            >>> ax = s.plot.kde(ind=[1, 2, 3, 4, 5])
         """
         return self(kind='kde', bw_method=bw_method, ind=ind, **kwds)
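
The docstring in this diff says that `scipy.stats.gaussian_kde` performs the estimation and that `ind` may be None, an integer, or an array. The sketch below illustrates that flow outside of pandas. It is not the pandas implementation: `resolve_ind` and `estimate_density` are hypothetical helper names, and padding the default grid by half the sample range on each side is an assumption made for illustration only.

```python
import numpy as np
from scipy.stats import gaussian_kde


def resolve_ind(sample, ind=None, default_points=1000):
    """Hypothetical helper: turn `ind` into an array of evaluation points."""
    sample = np.asarray(sample, dtype=float)
    if ind is None or isinstance(ind, (int, np.integer)):
        n_points = default_points if ind is None else int(ind)
        # Assumption: extend the grid half a sample range past each extreme
        # so the tails of the estimated density are visible.
        span = sample.max() - sample.min()
        return np.linspace(sample.min() - 0.5 * span,
                           sample.max() + 0.5 * span,
                           n_points)
    # An explicit array (or list) is used as the evaluation points directly.
    return np.asarray(ind, dtype=float)


def estimate_density(sample, bw_method=None, ind=None):
    """Hypothetical helper: evaluate a Gaussian KDE of `sample` on a grid."""
    grid = resolve_ind(sample, ind)
    # bw_method=None makes gaussian_kde fall back to Scott's rule.
    kde = gaussian_kde(sample, bw_method=bw_method)
    return grid, kde(grid)


sample = [1, 2, 2.5, 3, 3.5, 4, 5]
x, y = estimate_density(sample)                       # default grid, Scott's rule
x5, y5 = estimate_density(sample, ind=[1, 2, 3, 4, 5])  # explicit points
_, narrow = estimate_density(sample, bw_method=0.3)   # small scalar bandwidth
_, wide = estimate_density(sample, bw_method=3)       # large scalar bandwidth
```

Comparing `narrow` and `wide` mirrors the two `bw_method` plots in the docstring examples: the small scalar produces a spikier curve that follows individual points, while the large one flattens the estimate.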