ENH-19629: Adding numpy nansum/nanmean, etc. to _cython_table #19670

Closed
Changes from 5 commits
Commits
20 commits
0d39286
Adding numpy nansum/nanmean, etc etc to _cython_table
AaronCritchley Feb 13, 2018
47936c7
Adding in tests, implementing suggested whatsnew entry
AaronCritchley Feb 13, 2018
d2671e6
Fixing flake8 errors
AaronCritchley Feb 13, 2018
29ccb18
PR comments, support for np.nanprod
AaronCritchley Mar 2, 2018
b702747
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 2, 2018
130f767
skipping if nanprod not implemented
AaronCritchley Mar 2, 2018
0e4657a
Checking np version explicitly
AaronCritchley Mar 2, 2018
64c0d93
Using pytest params
AaronCritchley Mar 2, 2018
fdaeaf9
Updating for np 1.12
AaronCritchley Mar 9, 2018
cabb307
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 9, 2018
0c5a2ae
Fixing bad indentation
AaronCritchley Mar 9, 2018
7157161
Moving compat test functions inline to prevent build time issue
AaronCritchley Mar 9, 2018
93e332d
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Mar 22, 2018
8c2a5dd
Making use of fixtures for series and df generation
AaronCritchley Mar 22, 2018
5326d56
Fixing silly formatting, removing external function to compare
AaronCritchley Mar 22, 2018
bdd2917
Merge branch 'master' into ENH-19629-np-nan-funcs-to-cython-map
AaronCritchley Apr 17, 2018
528e12b
Restoring whatsnew to normal after messy merge
AaronCritchley Apr 17, 2018
e88ac10
More whatsnew cleanup
AaronCritchley Apr 17, 2018
40e4dff
Merge branch 'master' of git://github.com/pandas-dev/pandas into ENH-…
AaronCritchley Apr 25, 2018
c9ddaff
Merge branch 'ENH-19629-np-nan-funcs-to-cython-map' of github.com:Aar…
AaronCritchley Apr 25, 2018
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.23.0.txt
@@ -856,7 +856,7 @@ Numeric
- Bug in :class:`DataFrame` flex arithmetic (e.g. ``df.add(other, fill_value=foo)``) with a ``fill_value`` other than ``None`` failed to raise ``NotImplementedError`` in corner cases where either the frame or ``other`` has length zero (:issue:`19522`)
- Multiplication and division of numeric-dtyped :class:`Index` objects with timedelta-like scalars returns ``TimedeltaIndex`` instead of raising ``TypeError`` (:issue:`19333`)
- Bug where ``NaN`` was returned instead of 0 by :func:`Series.pct_change` and :func:`DataFrame.pct_change` when ``fill_method`` is not ``None`` (provided) (:issue:`19873`)

- :meth:`~DataFrame.agg` now correctly handles NumPy NaN-aware functions such as :func:`numpy.nansum` (:issue:`19629`)

Indexing
^^^^^^^^
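As a rough illustration of the whatsnew entry above, the following sketch compares `DataFrame.agg` called with a standard NumPy reduction and with its NaN-aware counterpart. It assumes a pandas version that includes this change; the frame and column names are made up for the example, and some pandas releases may emit a `FutureWarning` when NumPy callables are passed to `agg`.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the column names are not taken from the PR.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Aggregate each column with the standard and the NaN-aware function.
plain = df.agg(np.sum)
nan_aware = df.agg(np.nansum)

# With no missing values present, both calls should agree column by column.
same = plain.tolist() == nan_aware.tolist()
print(same)
```

The data deliberately contains no NaN values, so the comparison only checks that both spellings take an equivalent code path, which is also what the test added in this PR does.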
17 changes: 16 additions & 1 deletion pandas/core/base.py
@@ -188,17 +188,32 @@ class SelectionMixin(object):
builtins.max: 'max',
builtins.min: 'min',
np.sum: 'sum',
np.nansum: 'sum',
np.mean: 'mean',
np.nanmean: 'mean',
np.prod: 'prod',
np.std: 'std',
np.nanstd: 'std',
np.var: 'var',
np.nanvar: 'var',
np.median: 'median',
np.nanmedian: 'median',
np.max: 'max',
np.nanmax: 'max',
np.min: 'min',
np.nanmin: 'min',
np.cumprod: 'cumprod',
np.nancumprod: 'cumprod',
np.cumsum: 'cumsum',
np.nancumsum: 'cumsum'
}
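To make the purpose of the table above concrete, here is a minimal sketch of how a callable-to-method-name mapping like this can be used during aggregation. It is not the pandas implementation: `alias_table`, `resolve_alias`, and the `Column` class are hypothetical names introduced only for illustration.

```python
import numpy as np

# Hypothetical, trimmed-down version of the mapping shown in the diff.
alias_table = {
    np.sum: "sum",
    np.nansum: "sum",
    np.mean: "mean",
    np.nanmean: "mean",
}


def resolve_alias(func):
    """Return the registered method name for func, or None if unregistered."""
    return alias_table.get(func)


class Column:
    """Hypothetical container whose own reductions skip NaN values."""

    def __init__(self, values):
        self.values = list(values)

    def _valid(self):
        # NaN is the only float that is not equal to itself.
        return [v for v in self.values if v == v]

    def sum(self):
        return sum(self._valid())

    def mean(self):
        valid = self._valid()
        return sum(valid) / len(valid)

    def agg(self, func):
        # Prefer the container's own method when func is a known alias.
        name = resolve_alias(func)
        if name is not None:
            return getattr(self, name)()
        # Otherwise fall back to calling func on the raw values.
        return func(self.values)


col = Column([1.0, float("nan"), 3.0])
print(col.agg(np.sum), col.agg(np.nansum), col.agg(len))
```

Registering both `np.sum` and `np.nansum` under the same name means either spelling is routed to the same internal method, which is the effect the added entries aim for.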

# np.nanprod was added in NumPy 1.10.0; we currently support >= 1.9
try:
Contributor Author

If you have a preferred implementation for this, let me know and I'll happily change it; explicitly checking the version seemed ugly 😄

Contributor

You can just check _np_version_under1p10 and add it conditionally.

Contributor Author

Awesome, didn't know this was a thing, thank you

    _cython_table[np.nanprod] = 'prod'
except AttributeError:
    pass
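The reviewer's suggestion in the thread above, checking a version flag instead of catching `AttributeError`, could be sketched roughly as follows. The helper `numpy_older_than` and the module-level `cython_table` are hypothetical stand-ins; pandas' own `_np_version_under1p10` flag is not reproduced here.

```python
import re

import numpy as np


def numpy_older_than(major, minor, version=np.__version__):
    """Hypothetical helper: True if version is older than major.minor."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        # Unrecognised version string: assume a recent NumPy.
        return False
    return (int(match.group(1)), int(match.group(2))) < (major, minor)


# Stand-in for the flag mentioned in the review comment.
np_version_under1p10 = numpy_older_than(1, 10)

# Trimmed-down table; the real one lives on SelectionMixin.
cython_table = {np.sum: "sum", np.nansum: "sum", np.prod: "prod"}

# Register np.nanprod only when the installed NumPy provides it.
if not np_version_under1p10:
    cython_table[np.nanprod] = "prod"
```

Compared with the `try`/`except AttributeError` form, the explicit flag documents which NumPy release introduced the function, at the cost of depending on version parsing.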

@property
def _selection_name(self):
"""
29 changes: 29 additions & 0 deletions pandas/tests/test_nanops.py
@@ -1005,6 +1005,35 @@ def prng(self):
return np.random.RandomState(1234)


class TestNumpyNaNFunctions(object):
Contributor

this can be a single function that is parameterized over all of the methods

Contributor Author

Sure, will do this soon - could you help with why the build is failing?
From looking at CircleCI it seems like it's not recognizing that np.nanprod is valid, do I need to remove the nanprod case for compat or something? Sorry if I'm being dense here.

Contributor

You might need to skip these for older versions of NumPy; I'm not sure when certain ones were added.


# xref GH 19629
methods_to_test = [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
    (np.prod, np.nanprod),
    (np.std, np.nanstd),
    (np.var, np.nanvar),
    (np.median, np.nanmedian),
    (np.max, np.nanmax),
    (np.min, np.nanmin),
    (np.cumprod, np.nancumprod),
    (np.cumsum, np.nancumsum)
]

def test_np_nan_functions(self):
Contributor

parametrize these

Contributor Author

My bad, didn't realise you meant pytest.mark.parametrize, implemented your suggestion now 😄

    test_series = pd.Series([1, 2, 3, 4, 5, 6])
    test_df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

    for standard, nan_method in self.methods_to_test:
        tm.assert_almost_equal(test_series.agg(standard),
                               test_series.agg(nan_method),
                               check_exact=True)
        tm.assert_almost_equal(test_df.agg(standard),
                               test_df.agg(nan_method),
                               check_exact=True)
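The review request to parametrize the loop above might look something like the following sketch using `pytest.mark.parametrize`. It is not the version that was eventually committed: `REDUCTION_PAIRS` is a hypothetical name, only a subset of the pairs from the diff is covered, and `np.isclose`/`np.allclose` stand in for the `tm.assert_almost_equal` helper.

```python
import numpy as np
import pandas as pd
import pytest

# Subset of the (standard, NaN-aware) pairs listed in the diff.
REDUCTION_PAIRS = [
    (np.sum, np.nansum),
    (np.mean, np.nanmean),
    (np.prod, np.nanprod),
    (np.max, np.nanmax),
    (np.min, np.nanmin),
]


@pytest.mark.parametrize("standard, nan_method", REDUCTION_PAIRS)
def test_np_nan_functions(standard, nan_method):
    # Same inputs as the original test; they contain no missing values.
    series = pd.Series([1, 2, 3, 4, 5, 6])
    frame = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

    # Each pair should aggregate a Series to the same scalar.
    assert np.isclose(series.agg(standard), series.agg(nan_method))

    # Each pair should aggregate a DataFrame to the same per-column values.
    assert np.allclose(
        frame.agg(standard).to_numpy(), frame.agg(nan_method).to_numpy()
    )
```

With this shape, pytest reports each function pair as a separate test case, so a failure for one pair (for example on an older NumPy) does not hide the results for the others.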


def test_use_bottleneck():

if nanops._BOTTLENECK_INSTALLED: