Commit 4359471

keshavramaswamy authored and jorisvandenbossche committed
Fixed KDE Plot to drop the missing values (pandas-dev#14820)
BUG: Fixed KDE plot to ignore missing values

closes pandas-dev#14821

* fixed kde plot to ignore the missing values
* added comment to elaborate the changes made
* added a release note in whatsnew/0.19.2
* added test to check for missing values and cleaned up whatsnew doc
* added comment to refer the issue
* modified to fit lint checks
* replaced ._xorig with .get_xdata()

(cherry picked from commit 033d345)
1 parent 9a6a78f commit 4359471

3 files changed, +11 -4 lines changed

doc/source/whatsnew/v0.19.2.txt

+2
@@ -88,4 +88,6 @@ Bug Fixes
 
 - Explicit check in ``to_stata`` and ``StataWriter`` for out-of-range values when writing doubles (:issue:`14618`)
 
+- Bug in ``.plot(kind='kde')`` which did not drop missing values to generate the KDE Plot, instead generating an empty plot. (:issue:`14821`)
+
 - Bug in ``unstack()`` if called with a list of column(s) as an argument, regardless of the dtypes of all columns, they get coerced to ``object`` (:issue:`11847`)

pandas/tests/plotting/test_series.py

+5 -1
@@ -569,7 +569,11 @@ def test_kde_missing_vals(self):
         _skip_if_no_scipy_gaussian_kde()
         s = Series(np.random.uniform(size=50))
         s[0] = np.nan
-        _check_plot_works(s.plot.kde)
+        axes = _check_plot_works(s.plot.kde)
+        # check if the values have any missing values
+        # GH14821
+        self.assertTrue(any(~np.isnan(axes.lines[0].get_xdata())),
+                        msg='Missing Values not dropped')
 
     @slow
     def test_hist_kwargs(self):
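
Review note on the new assertion: `any(~np.isnan(...))` passes as soon as at least one x value of the plotted line is not NaN. The sketch below compares that check with a stricter variant on small hand-made arrays. It uses NumPy only; the helper names `weak_check` and `strict_check` are hypothetical and are not part of the pandas test suite.

```python
import numpy as np


def weak_check(xdata):
    """Mirror of the committed assertion: True if at least one value is not NaN."""
    return bool(any(~np.isnan(xdata)))


def strict_check(xdata):
    """Hypothetical stricter variant: True only if every value is finite."""
    return bool(np.isfinite(xdata).all())


# Hand-made stand-ins for what get_xdata() could return (illustrative only).
partly_nan = np.array([np.nan, 0.1, 0.2])
all_nan = np.full(3, np.nan)
clean = np.linspace(0.0, 1.0, 3)

for name, xdata in [("partly_nan", partly_nan),
                    ("all_nan", all_nan),
                    ("clean", clean)]:
    print(name, weak_check(xdata), strict_check(xdata))
```

The committed check is enough to catch the regression described in GH14821, where the evaluation grid was entirely NaN. The stricter form would additionally reject a grid that is only partly NaN.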

pandas/tools/plotting.py

+4 -3
@@ -2136,9 +2136,10 @@ def _args_adjust(self):
 
     def _get_ind(self, y):
         if self.ind is None:
-            sample_range = max(y) - min(y)
-            ind = np.linspace(min(y) - 0.5 * sample_range,
-                              max(y) + 0.5 * sample_range, 1000)
+            # np.nanmax() and np.nanmin() ignores the missing values
+            sample_range = np.nanmax(y) - np.nanmin(y)
+            ind = np.linspace(np.nanmin(y) - 0.5 * sample_range,
+                              np.nanmax(y) + 0.5 * sample_range, 1000)
         else:
             ind = self.ind
         return ind
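
Review note on why the old code produced an empty plot: the built-in `max` and `min` compare elements pairwise, and every comparison against NaN is false, so a NaN in the first position is returned as the result. The sample range and the whole evaluation grid then become NaN. The sketch below reproduces this outside pandas. The function name `kde_grid` is hypothetical; it mirrors the patched `_get_ind` logic but is not the pandas implementation.

```python
import numpy as np


def kde_grid(y, num=1000):
    """Build an evaluation grid that extends the data range by 50% on each side.

    NaN entries are ignored via np.nanmin / np.nanmax, as in the patch.
    """
    y = np.asarray(y, dtype=float)
    lo, hi = np.nanmin(y), np.nanmax(y)
    sample_range = hi - lo
    return np.linspace(lo - 0.5 * sample_range, hi + 0.5 * sample_range, num)


# Same shape of input as the test: the first element is missing.
y = np.array([np.nan, 0.2, 0.5, 1.0])

# Pre-patch arithmetic: built-in max/min keep the leading NaN.
naive_range = max(y) - min(y)

# Post-patch arithmetic: NaN is skipped, so the grid is finite.
grid = kde_grid(y)

print(naive_range)        # nan
print(grid[0], grid[-1])  # approximately -0.2 and 1.4
```

The built-in functions are also position-dependent: a NaN that is not the first element may be skipped silently, which is another reason to prefer the NaN-aware NumPy reductions here.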
