DEPR: Enforce deprecation of numeric_only=None in DataFrame aggregations #49551
looks like some frame reductions tests are failing
…/pandas into enforce_df_reductions
Conflicts:
pandas/core/frame.py
pandas/tests/apply/test_frame_apply.py
pandas/tests/frame/methods/test_quantile.py
pandas/tests/frame/test_reductions.py
pandas/tests/groupby/test_apply.py
pandas/tests/groupby/test_categorical.py
Thanks @jbrockmendel - this got a bit tricky. When the input is numeric but has object dtype, axis=1, and numeric_only=False, in 1.5.x the result is object dtype. When numeric_only=None, however, the result is float. I think this behavior originated from #676. It seems odd, but I think it is a very rare case. I opted to go with the 1.5.x default (numeric_only=None) behavior here and put a line in the whatsnew. If that looks good, I can open an issue on this for followup.
Also, when working on these fixes, I realized I missed generic.py / series.py for this deprecation. Went unnoticed because …
doc/source/whatsnew/v2.0.0.rst
@@ -441,7 +441,7 @@ Removal of prior version deprecations/changes
- Changed behavior of comparison of a :class:`Timestamp` with a ``datetime.date`` object; these now compare as un-equal and raise on inequality comparisons, matching the ``datetime.datetime`` behavior (:issue:`36131`)
- Enforced deprecation of silently dropping columns that raised a ``TypeError`` in :class:`Series.transform` and :class:`DataFrame.transform` when used with a list or dictionary (:issue:`43740`)
- Change behavior of :meth:`DataFrame.apply` with list-like so that any partial failure will raise an error (:issue:`43740`)
-
- Enforced deprecation of silently dropping columns that raised in DataFrame reductions (:issue:`41480`)
specific to numeric_only=None? or "when numeric_only is not specified"?
@@ -10541,25 +10539,22 @@ def _get_data() -> DataFrame:
data = self._get_bool_data()
return data

numeric_only_bool = com.resolve_numeric_only(numeric_only)
if numeric_only is not None or axis == 0:
if numeric_only or axis == 0:
why can't we go through this path unconditionally?
Ah, I don't think what I wrote was all that clear, but I tried to explain this in #49551 (comment). I think we should take this path unconditionally, but there would be a behavior change for numeric_only=False, axis=1, and object dtype. I plan to open an issue and do a followup here.
Here is an example to make this more explicit (on main):
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}, dtype=object)
print(df.sum(axis=1, numeric_only=None))
print(df.sum(axis=1, numeric_only=False))
0 5.0
1 7.0
2 9.0
dtype: float64
0 5
1 7
2 9
dtype: object
Taking this path unconditionally would mean always getting object dtype. I think that's the right thing to do, but it would be a change in the default (numeric_only=None) behavior, so I plan to handle it in a follow-up.
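For reference, a minimal sketch in plain Python (not pandas internals) of which argument combinations switch branches between the two `if` conditions in the frame.py hunk above. It assumes the `is not None` line is the removed one and the shorter line is the added one; the function names are made up for illustration.

```python
from itertools import product


def old_condition(numeric_only, axis):
    # First `if` line in the hunk (assumed to be the removed line).
    return numeric_only is not None or axis == 0


def new_condition(numeric_only, axis):
    # Second `if` line in the hunk (assumed to be the added line).
    # bool() only normalizes None/False to False for comparison.
    return bool(numeric_only or axis == 0)


# Collect every (numeric_only, axis) pair where the two conditions disagree.
changed = [
    (numeric_only, axis)
    for numeric_only, axis in product([True, False, None], [0, 1])
    if old_condition(numeric_only, axis) != new_condition(numeric_only, axis)
]

print(changed)  # prints [(False, 1)]
```

Only numeric_only=False with axis=1 takes a different branch, which is the object-dtype case discussed in this thread.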
makes sense, thanks
I opened #49603
…/pandas into enforce_df_reductions
…rce_df_reductions
Conflicts:
doc/source/whatsnew/v2.0.0.rst
needs rebase, otherwise LGTM
Thanks @rhshadrach
nice!
…ons (pandas-dev#49551)
* WIP
* DEPR: Enforce deprecation of numeric_only=None in DataFrame aggregations
* Partial reverts
* numeric_only in generic/series, fixup
* cleanup
* Remove docs warning
* fixups
* Fixups