ENH-19629: Adding numpy nansum/nanmean, etc. to _cython_table #19670

Closed
Changes from 3 commits
20 commits
0d39286
Adding numpy nansum/nanmean, etc etc to _cython_table
AaronCritchley Feb 13, 2018
47936c7
Adding in tests, implementing suggested whatsnew entry
AaronCritchley Feb 13, 2018
d2671e6
Fixing flake8 errors
AaronCritchley Feb 13, 2018
29ccb18
PR comments, support for np.nanprod
AaronCritchley Mar 2, 2018
b702747
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 2, 2018
130f767
skipping if nanprod not implemented
AaronCritchley Mar 2, 2018
0e4657a
Checking np version explicitly
AaronCritchley Mar 2, 2018
64c0d93
Using pytest params
AaronCritchley Mar 2, 2018
fdaeaf9
Updating for np 1.12
AaronCritchley Mar 9, 2018
cabb307
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 9, 2018
0c5a2ae
Fixing bad indentation
AaronCritchley Mar 9, 2018
7157161
Moving compat test functions inline to prevent build time issue
AaronCritchley Mar 9, 2018
93e332d
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 22, 2018
8c2a5dd
Making use of fixtures for series and df generation
AaronCritchley Mar 22, 2018
5326d56
Fixing silly formatting, removing external function to compare
AaronCritchley Mar 22, 2018
bdd2917
Merge branch 'master' into ENH-19629-np-nan-funcs-to-cython-map
AaronCritchley Apr 17, 2018
528e12b
Restoring whatsnew to normal after messy merge
AaronCritchley Apr 17, 2018
e88ac10
More whatsnew cleanup
AaronCritchley Apr 17, 2018
40e4dff
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Apr 25, 2018
c9ddaff
Merge branch 'ENH-19629-np-nan-funcs-to-cython-map' of github.com:Aar…
AaronCritchley Apr 25, 2018
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.23.0.txt
@@ -747,7 +747,7 @@ Numeric
 - Bug in the :class:`DataFrame` constructor in which data containing very large positive or very large negative numbers was causing ``OverflowError`` (:issue:`18584`)
 - Bug in :class:`Index` constructor with ``dtype='uint64'`` where int-like floats were not coerced to :class:`UInt64Index` (:issue:`18400`)
 - Bug in :class:`DataFrame` flex arithmetic (e.g. `df.add(other, fill_value=foo)`) with a `fill_value` other than ``None`` failed to raise ``NotImplementedError`` in corner cases where either the frame or ``other`` has length zero (:issue:`19522`)
-
+- :meth:`~DataFrame.agg` now correctly handles numpy NaN-aware methods like :meth:`numpy.nansum` (:issue:`19629`)

 Indexing
 ^^^^^^^^
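The behaviour described by the new whatsnew entry can be checked with a short script. This is only a sketch: the sample data is made up for illustration, and newer pandas releases may emit a FutureWarning when a NumPy callable is passed to `agg`, which does not affect the values compared here.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not taken from the PR.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# Column-wise reductions through the plain and the NaN-aware NumPy functions.
by_sum = df.agg(np.sum)
by_nansum = df.agg(np.nansum)

# Both calls should reduce each column to the same total.
assert by_sum.tolist() == by_nansum.tolist()

# With a missing value present, the NaN-aware function skips it.
s = pd.Series([1.0, np.nan, 3.0])
nan_total = s.agg(np.nansum)
print(by_sum.tolist(), by_nansum.tolist(), nan_total)
```

Comparing the two results against each other, rather than against a hard-coded value alone, mirrors the approach the PR's own tests take.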
12 changes: 11 additions & 1 deletion pandas/core/base.py
@@ -187,15 +187,25 @@ class SelectionMixin(object):
         builtins.max: 'max',
         builtins.min: 'min',
         np.sum: 'sum',
+        np.nansum: 'sum',
         np.mean: 'mean',
+        np.nanmean: 'mean',
         np.prod: 'prod',
+        np.nanprod: 'prod',
         np.std: 'std',
+        np.nanstd: 'std',
         np.var: 'var',
+        np.nanvar: 'var',
         np.median: 'median',
+        np.nanmedian: 'median',
         np.max: 'max',
+        np.nanmax: 'max',
         np.min: 'min',
+        np.nanmin: 'min',
         np.cumprod: 'cumprod',
-        np.cumsum: 'cumsum'
+        np.nancumprod: 'cumprod',
+        np.cumsum: 'cumsum',
+        np.nancumsum: 'cumsum'
     }

     @property
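The effect of registering the NaN-aware functions in the table can be illustrated with a plain dictionary lookup. This is a simplified sketch, not the pandas implementation: `ALIAS_TABLE`, `resolve_alias` and `custom_spread` are hypothetical names, and the table holds only a few of the entries from the diff.

```python
import numpy as np

# Hypothetical, trimmed-down stand-in for the mapping changed in the diff:
# each callable is keyed to the name of the method that should handle it.
ALIAS_TABLE = {
    np.sum: "sum",
    np.nansum: "sum",
    np.mean: "mean",
    np.nanmean: "mean",
}


def resolve_alias(func):
    """Return the registered method name for func, or func itself if unregistered."""
    return ALIAS_TABLE.get(func, func)


def custom_spread(values):
    """An unregistered callable, used to show the fall-through case."""
    return max(values) - min(values)


# NaN-aware and plain variants resolve to the same method name.
nansum_target = resolve_alias(np.nansum)
mean_targets_match = resolve_alias(np.nanmean) == resolve_alias(np.mean)

# A callable without an entry is returned unchanged.
fallthrough = resolve_alias(custom_spread)
```

Before the change, the NaN-aware functions would have taken the fall-through path in a lookup like this one, since only the plain variants had entries.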
89 changes: 89 additions & 0 deletions pandas/tests/test_nanops.py
@@ -1004,6 +1004,95 @@ def prng(self):
         return np.random.RandomState(1234)


+class TestNumpyNaNFunctions(object):
Contributor

this can be a single function that is parameterized over all of the methods
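One way the suggestion above could look is sketched below. It is not the code that was eventually committed: the test name, the `PAIRS` list and the inline sample series are assumptions made for illustration, and only reductions whose plain and NaN-aware forms agree on NaN-free data are included.

```python
import numpy as np
import pandas as pd
import pytest

# Each tuple pairs a plain NumPy reduction with its NaN-aware counterpart.
PAIRS = [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
    (np.max, np.nanmax),
    (np.min, np.nanmin),
]


@pytest.mark.parametrize("standard, nan_aware", PAIRS)
def test_nan_aware_matches_standard(standard, nan_aware):
    # Data without NaN, so both variants should give the same answer.
    series = pd.Series([1, 2, 3, 4, 5, 6])
    assert series.agg(standard) == series.agg(nan_aware)
```

With this shape, adding another function pair is a one-line change to `PAIRS` instead of a new test method.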

Contributor Author

Sure, will do this soon - could you help with why the build is failing?
From looking at CircleCI it seems like it's not recognizing that np.nanprod is valid, do I need to remove the nanprod case for compat or something? Sorry if I'm being dense here.

Contributor

You might need to skip for older versions of numpy; I'm not sure when certain ones were added.


+    # xref GH 19629
+
+    def setup_method(self, method):
+        self.test_series = pd.Series([1, 2, 3, 4, 5, 6])
+        self.test_df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
+
+    def test_np_sum(self):
+        tm.assert_almost_equal(self.test_series.agg(np.sum),
+                               self.test_series.agg(np.nansum),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.sum),
+                               self.test_df.agg(np.nansum),
+                               check_exact=True)
+
+    def test_np_mean(self):
+        tm.assert_almost_equal(self.test_series.agg(np.mean),
+                               self.test_series.agg(np.nanmean),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.mean),
+                               self.test_df.agg(np.nanmean),
+                               check_exact=True)
+
+    def test_np_prod(self):
+        tm.assert_almost_equal(self.test_series.agg(np.prod),
+                               self.test_series.agg(np.nanprod),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.prod),
+                               self.test_df.agg(np.nanprod),
+                               check_exact=True)
+
+    def test_np_std(self):
+        tm.assert_almost_equal(self.test_series.agg(np.std),
+                               self.test_series.agg(np.nanstd),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.std),
+                               self.test_df.agg(np.nanstd),
+                               check_exact=True)
+
+    def test_np_var(self):
+        tm.assert_almost_equal(self.test_series.agg(np.var),
+                               self.test_series.agg(np.nanvar),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.var),
+                               self.test_df.agg(np.nanvar),
+                               check_exact=True)
+
+    def test_np_median(self):
+        tm.assert_almost_equal(self.test_series.agg(np.median),
+                               self.test_series.agg(np.nanmedian),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.median),
+                               self.test_df.agg(np.nanmedian),
+                               check_exact=True)
+
+    def test_np_max(self):
+        tm.assert_almost_equal(self.test_series.agg(np.max),
+                               self.test_series.agg(np.nanmax),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.max),
+                               self.test_df.agg(np.nanmax),
+                               check_exact=True)
+
+    def test_np_min(self):
+        tm.assert_almost_equal(self.test_series.agg(np.min),
+                               self.test_series.agg(np.nanmin),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.min),
+                               self.test_df.agg(np.nanmin),
+                               check_exact=True)
+
+    def test_np_cumprod(self):
+        tm.assert_almost_equal(self.test_series.agg(np.cumprod),
+                               self.test_series.agg(np.nancumprod),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.cumprod),
+                               self.test_df.agg(np.nancumprod),
+                               check_exact=True)
+
+    def test_np_cumsum(self):
+        tm.assert_almost_equal(self.test_series.agg(np.cumsum),
+                               self.test_series.agg(np.nancumsum),
+                               check_exact=True)
+        tm.assert_almost_equal(self.test_df.agg(np.cumsum),
+                               self.test_df.agg(np.nancumsum),
+                               check_exact=True)
+
+
 def test_use_bottleneck():

     if nanops._BOTTLENECK_INSTALLED: