TST/CoW: expand test for chained inplace methods #56402

Merged
69 changes: 66 additions & 3 deletions pandas/tests/copy_view/test_chained_assignment_deprecation.py
@@ -1,6 +1,7 @@
import numpy as np
import pytest

from pandas.compat import PY311
from pandas.errors import (
    ChainedAssignmentError,
    SettingWithCopyWarning,
@@ -42,7 +43,9 @@ def test_methods_iloc_warn(using_copy_on_write):
        ("ffill", ()),
    ],
)
def test_methods_iloc_getitem_item_cache(func, args, using_copy_on_write):
def test_methods_iloc_getitem_item_cache(
    func, args, using_copy_on_write, warn_copy_on_write
):
    # ensure we don't incorrectly raise chained assignment warning because
    # of the item cache / iloc not setting the item cache
    df_orig = DataFrame({"a": [1, 2, 3], "b": 1})
@@ -66,14 +69,74 @@ def test_methods_iloc_getitem_item_cache(func, args, using_copy_on_write):
    ser = df["a"]
    getattr(ser, func)(*args, inplace=True)

    df = df_orig.copy()
    df["a"]  # populate the item_cache
    # TODO(CoW-warn) because of the usage of *args, this doesn't warn on Py3.11+
    if using_copy_on_write:
        with tm.raises_chained_assignment_error(not PY311):
            getattr(df["a"], func)(*args, inplace=True)
Member Author

I updated the first test to also use getattr(..)(*args) for this part of the test. That means that, for now, we assert that the warning is (incorrectly) not raised on Python 3.11+, because the ref count is different when the method is called with *args.
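
To illustrate why the call style matters for a check based on reference counts, here is a minimal, pandas-free sketch. The `Probe` class and its `refs` method are hypothetical stand-ins, not pandas code; they only report `sys.getrefcount(self)` as seen from inside a method. The absolute numbers depend on the CPython version and on how the call is dispatched, so the sketch prints them and relies only on the relative difference between a temporary and a named object.

```python
import sys


class Probe:
    """Hypothetical stand-in for an object whose method inspects its own refcount."""

    def refs(self, *args):
        # Number of references to `self` as seen from inside the method.
        return sys.getrefcount(self)


held = Probe()

# Called on a temporary: no variable refers to the object.
temp_direct = Probe().refs()
# Called through a variable: `held` contributes one more reference.
named_direct = held.refs()

# The same two situations, dispatched dynamically with *args,
# as in the parametrized test.
args = ()
temp_dynamic = getattr(Probe(), "refs")(*args)
named_dynamic = getattr(held, "refs")(*args)

# Absolute values vary between Python versions and call styles.
print("direct :", temp_direct, named_direct)
print("dynamic:", temp_dynamic, named_dynamic)
```

A check that compares the observed count against a fixed threshold tuned for the direct call can therefore misfire when the same method is reached through `getattr(...)(*args)`.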

    else:
        with tm.assert_cow_warning(not PY311, match="A value"):
            getattr(df["a"], func)(*args, inplace=True)

    df = df_orig.copy()
    ser = df["a"]  # populate the item_cache and keep ref
    if using_copy_on_write:
        with tm.raises_chained_assignment_error(not PY311):
            getattr(df["a"], func)(*args, inplace=True)
    else:
        # ideally also warns on the default mode, but the ser's _cacher
        # messes up the refcount + even in warning mode this doesn't trigger
        # the warning on Py3.11+ (see above)
        with tm.assert_cow_warning(warn_copy_on_write and not PY311, match="A value"):
            getattr(df["a"], func)(*args, inplace=True)


def test_methods_iloc_getitem_item_cache_fillna(
Member Author

I added a second version of this test in which all fillna() calls are hardcoded, so that we also have one version that does not go through *args. (In theory we could do this for each of the methods, but given that the code is reused in each of them, I think having just one is fine.)

    using_copy_on_write, warn_copy_on_write
):
    # ensure we don't incorrectly raise chained assignment warning because
    # of the item cache / iloc not setting the item cache
    df_orig = DataFrame({"a": [1, 2, 3], "b": 1})

    df = df_orig.copy()
    ser = df.iloc[:, 0]
    ser.fillna(1, inplace=True)

    # parent that holds item_cache is dead, so don't increase ref count
    df = df_orig.copy()
    ser = df.copy()["a"]
    ser.fillna(1, inplace=True)

    df = df_orig.copy()
    df["a"]  # populate the item_cache
    ser = df.iloc[:, 0]  # iloc creates a new object
    ser.fillna(1, inplace=True)

    df = df_orig.copy()
    df["a"]  # populate the item_cache
    ser = df["a"]
    ser.fillna(1, inplace=True)

    df = df_orig.copy()
    df["a"]  # populate the item_cache
    if using_copy_on_write:
        with tm.raises_chained_assignment_error():
            df["a"].fillna(0, inplace=True)
            df["a"].fillna(1, inplace=True)
    else:
        with tm.assert_cow_warning(match="A value"):
            df["a"].fillna(0, inplace=True)
            df["a"].fillna(1, inplace=True)

    df = df_orig.copy()
    ser = df["a"]  # populate the item_cache and keep ref
    if using_copy_on_write:
        with tm.raises_chained_assignment_error():
            df["a"].fillna(1, inplace=True)
    else:
        # TODO(CoW-warn) ideally also warns on the default mode, but the ser's _cacher
        # messes up the refcount
        with tm.assert_cow_warning(warn_copy_on_write, match="A value"):
            df["a"].fillna(1, inplace=True)
Comment on lines +136 to +139
Member Author

I added one more test case to both tests where we populate the item cache but also keep that Series object alive. Because the df['a'] result is cached, keeping it alive increases the ref count, so in the default mode (neither CoW nor warning mode) this doesn't give a warning.

Not sure we can do anything about this, though, given that we cache the getitem return value and reuse it.
But the good part is that it does warn correctly in the warning mode. So users who do the migration in steps should see the warning at that point.
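
The effect of the cache on the ref count can be sketched without pandas. The `Frame` and `Column` classes below are hypothetical toys, not the pandas implementation. They assume only that `__getitem__` caches and reuses the object it returns, as described above, and that the method reports `sys.getrefcount(self)`.

```python
import sys


class Column:
    """Hypothetical stand-in for the object returned by a getitem call."""

    def refs(self):
        # Number of references to `self` as seen from inside the method.
        return sys.getrefcount(self)


class Frame:
    """Hypothetical container that caches what __getitem__ returns."""

    def __init__(self):
        self._item_cache = {}

    def __getitem__(self, key):
        # Reuse the cached object so repeated lookups return the same instance.
        if key not in self._item_cache:
            self._item_cache[key] = Column()
        return self._item_cache[key]


frame = Frame()
frame["a"]  # populate the cache, discard the result
cache_only = frame["a"].refs()

ser = frame["a"]  # populate the cache and keep a reference
cache_and_ser = frame["a"].refs()

# The kept variable and the cached object are the same instance,
# so `ser` adds a reference to the object that the chained call sees.
same_object = ser is frame["a"]
print(cache_only, cache_and_ser, same_object)
```

In this toy, the chained call sees a higher count once `ser` is kept alive, which is the situation in which a threshold-based check would stay silent.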



# TODO(CoW-warn) expand the cases