NaT is the datetime equivalent of NaN and is represented by the lowest possible 64-bit integer, -(2**63). Previously, we could not support this value in any `groupby.mean()` calculations, which led to pandas-dev#43132.
At a high level, we slightly modify `group_mean` so that it does not count NaT values. To do so, we introduce the `is_datetimelike` parameter to the function call (already present in other functions, e.g., `group_cumsum`) and refactor and extend `_treat_as_na` to work with float64.
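To make the idea concrete, here is a minimal pure-Python sketch of a grouped mean that skips the NaT sentinel when `is_datetimelike` is set. It is not the Cython `group_mean` kernel: the function name `grouped_mean_sketch`, its dictionary return type, and the choice to return the sentinel for groups that contain only NaT are assumptions made for illustration.

```python
NAT = -(2**63)  # int64 sentinel used for NaT, as described above


def grouped_mean_sketch(values, labels, is_datetimelike=False):
    """Illustrative grouped mean; hypothetical helper, not the pandas kernel."""
    sums = {}
    counts = {}
    for value, label in zip(values, labels):
        # Guard clause: a negative label means the row belongs to no group.
        if label < 0:
            continue
        sums.setdefault(label, 0)
        counts.setdefault(label, 0)
        # Guard clause: NaT is treated as missing and is not counted.
        if is_datetimelike and value == NAT:
            continue
        sums[label] += value
        counts[label] += 1
    # Assumption: a group with no valid observations yields the sentinel again.
    return {
        label: (sums[label] / counts[label] if counts[label] else NAT)
        for label in sums
    }


values = [10, NAT, 30, NAT]
labels = [0, 0, 1, 2]
skipping = grouped_mean_sketch(values, labels, is_datetimelike=True)
counting = grouped_mean_sketch(values, labels, is_datetimelike=False)
```

With `is_datetimelike=True`, group 0 averages only the valid value. Without the flag, the sentinel is summed like any other integer and drags the result far below zero, which is the kind of failure the issue describes.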
## Tests
This PR adds one integration test and two unit tests for the new functionality. In contrast to other tests in the test classes, I've tried to keep each individual test's scope as small as possible.
Additionally, I've taken the liberty to:
* Add a docstring for the group_mean algorithm.
* Change the algorithm to use guard clauses instead of if/else.
* Add a comment that we're using the Kahan summation (the compensation part initially confused me, and I only stumbled upon Kahan when browsing the file).
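For readers who, like me, have not met Kahan summation before, the compensation step can be sketched in plain Python as follows. This is a generic textbook version, not the code in `group_mean`; the names `kahan_sum` and `naive_sum` are hypothetical, and `math.fsum` serves only as an accurate reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for value in values:
        total += value
    return total


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        y = value - compensation        # re-apply the previously lost bits
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


data = [0.1] * 100_000
reference = math.fsum(data)  # accurately rounded sum, used for comparison
naive_error = abs(naive_sum(data) - reference)
kahan_error = abs(kahan_sum(data) - reference)
```

The `compensation` variable is the part that looks odd at first glance: algebraically it is always zero, but in floating-point arithmetic it captures the rounding error of each addition so that the error can be fed back into the next one.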
- [x] closes pandas-dev#43132
- [x] tests added / passed
- [x] Ensure all linting tests pass, see [here](https://pandas.pydata.org/pandas-docs/dev/development/contributing.html#code-standards) for how to run them
- [x] whatsnew entry

doc/source/whatsnew/v1.3.3.rst (+2 -1)

    @@ -46,7 +46,8 @@ Performance improvements
     Bug fixes
     ~~~~~~~~~
     - Fixed bug in :meth:`.DataFrameGroupBy.agg` and :meth:`.DataFrameGroupBy.transform` with ``engine="numba"`` where ``index`` data was not being correctly passed into ``func`` (:issue:`43133`)
    -
    +- :meth:`.GroupBy.mean` now supports ``NaT`` values (:issue:`43132`)