Fixed KDE Plot to drop the missing values #14820

Merged: 10 commits, Dec 15, 2016
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.19.2.txt
@@ -79,4 +79,6 @@ Bug Fixes

 - Explicit check in ``to_stata`` and ``StataWriter`` for out-of-range values when writing doubles (:issue:`14618`)

+- Bug in ``.plot(kind='kde')`` which did not drop missing values to generate the KDE Plot, instead generating an empty plot. (:issue:`14821`)
+
 - Bug in ``unstack()`` if called with a list of column(s) as an argument, regardless of the dtypes of all columns, they get coerced to ``object`` (:issue:`11847`)
6 changes: 5 additions & 1 deletion pandas/tests/plotting/test_series.py
@@ -569,7 +569,11 @@ def test_kde_missing_vals(self):
         _skip_if_no_scipy_gaussian_kde()
         s = Series(np.random.uniform(size=50))
         s[0] = np.nan
-        _check_plot_works(s.plot.kde)
+        axes = _check_plot_works(s.plot.kde)
+        # check if the values have any missing values
+        # GH14821
+        self.assertTrue(any(~np.isnan(axes.lines[0]._xorig)),

Review comment (Contributor): Can you replace ._xorig with .get_xdata()? It's better not to use matplotlib's internal APIs.

+                        msg='Missing Values not dropped')

     @slow
     def test_hist_kwargs(self):
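For reference, a minimal sketch of the reviewer's suggestion, independent of the pandas test helpers. It assumes a plain matplotlib line plot and the non-interactive Agg backend; the sample data and the `has_real_values` name are illustrative only and are not taken from the PR.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot([0.0, 1.0, 2.0], [1.0, 4.0, 9.0])

# Public accessor for the x data of the first line on the axes,
# used here instead of the private ``_xorig`` attribute.
xdata = ax.lines[0].get_xdata()

# Same style of check as in the test: at least one x value is not NaN.
has_real_values = bool(np.any(~np.isnan(xdata)))
plt.close(fig)
```

In the test above, the same call would read `axes.lines[0].get_xdata()`.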
7 changes: 4 additions & 3 deletions pandas/tools/plotting.py
@@ -2153,9 +2153,10 @@ def _args_adjust(self):

     def _get_ind(self, y):
         if self.ind is None:
-            sample_range = max(y) - min(y)
-            ind = np.linspace(min(y) - 0.5 * sample_range,
-                              max(y) + 0.5 * sample_range, 1000)
+            # np.nanmax() and np.nanmin() ignores the missing values
+            sample_range = np.nanmax(y) - np.nanmin(y)
+            ind = np.linspace(np.nanmin(y) - 0.5 * sample_range,
+                              np.nanmax(y) + 0.5 * sample_range, 1000)
         else:
             ind = self.ind
         return ind
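A standalone sketch of why the change to `_get_ind` matters, using only NumPy. The helper name `kde_grid` and the sample array are hypothetical and chosen for illustration. With a leading NaN (as in the test, where `s[0] = np.nan`), the built-in `min`/`max` return NaN, so the whole evaluation grid becomes NaN and nothing is drawn. `np.nanmin`/`np.nanmax` skip the missing value.

```python
import numpy as np


def kde_grid(y, num=1000):
    """Evaluation grid over the non-missing data, padded by half the range."""
    lo, hi = np.nanmin(y), np.nanmax(y)
    sample_range = hi - lo
    return np.linspace(lo - 0.5 * sample_range, hi + 0.5 * sample_range, num)


y = np.array([np.nan, 1.0, 2.0, 4.0])

# Old behaviour: comparisons against a leading NaN are always False,
# so built-in min()/max() keep the NaN and the grid is all NaN.
old_range = max(y) - min(y)
old_grid = np.linspace(min(y) - 0.5 * old_range, max(y) + 0.5 * old_range, 5)

# New behaviour: NaN is ignored and the grid spans the real data.
new_grid = kde_grid(y, num=5)
```

Here the data range is 1.0 to 4.0, so `new_grid` runs from -0.5 to 5.5, while every entry of `old_grid` is NaN.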