Skip to content

CLN: enforce deprecation of the method keyword on df.fillna #57760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

natmokval
Copy link
Contributor

@natmokval natmokval commented Mar 7, 2024

xref #53496

enforced deprecation of the method keyword on df.fillna/ser.fillna

@natmokval natmokval added Clean Transformations e.g. cumsum, diff, rank labels Mar 7, 2024
@natmokval natmokval changed the title CLN: enforce deprecation of param method in df.fillna CLN: enforce deprecation of the method keyword on df.fillna Mar 8, 2024
@natmokval natmokval marked this pull request as ready for review March 8, 2024 00:31
@natmokval
Copy link
Contributor Author

I enforced deprecation of the method keyword on df.fillna/ser.fillna. We deprecated the method keyword on EA.fillna as well. Is it OK if I enforce this deprecation in separate PR?

@jbrockmendel, could you please take a look at this PR? I think CI failures are unrelated to my changes.

@natmokval natmokval requested a review from jbrockmendel March 8, 2024 16:59
@@ -909,14 +893,14 @@ def test_bfill(self):
[0.0, 1.0, 2.0, 3.0, 4.0], index=date_range("2020-01-01", periods=5)
)
ts.iloc[2] = np.nan
tm.assert_series_equal(ts.bfill(), ts.fillna(method="bfill"))
tm.assert_series_equal(ts.bfill(), ts.bfill())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this assertion looks no-longer-useful

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same on L883

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing this PR. I removed these useless tests.

@@ -809,7 +797,7 @@ def test_fillna_f32_upcast_with_dict(self):

def test_fillna_invalid_method(self, datetime_series):
try:
datetime_series.fillna(method="ffil")
datetime_series.ffill()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is missing the point of the test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I removed the test

tm.assert_frame_equal(result, tsframe.fillna(method="bfill"))
# TODO: what is this even testing?
result = tsframe.bfill()
tm.assert_frame_equal(result, tsframe.bfill())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a useful test (per the comment), can just be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"instead.",
FutureWarning,
stacklevel=find_stack_level(),
if isinstance(value, (list, tuple)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if we have a test where value is a np.array or pd.Series? Given the error message, I would expect this check to be something like if not (is_scalar(value) or isinstance(value, dict))

Copy link
Contributor Author

@natmokval natmokval Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have test_fillna_series in pandas/tests/extension/base/missing.py where we fill with a series

def test_fillna_series(self, data_missing):
fill_value = data_missing[1]
ser = pd.Series(data_missing)
result = ser.fillna(fill_value)
expected = pd.Series(
data_missing._from_sequence(
[fill_value, fill_value], dtype=data_missing.dtype
)
)
tm.assert_series_equal(result, expected)
# Fill with a series
result = ser.fillna(expected)
tm.assert_series_equal(result, expected)
# Fill with a series not affecting the missing values
result = ser.fillna(ser)
tm.assert_series_equal(result, ser)

Copy link
Contributor Author

@natmokval natmokval Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in pandas/tests/frame/methods/test_fillna.py I found test_fillna_invalid_value.

def test_fillna_invalid_value(self, float_frame):
# list
msg = '"value" parameter must be a scalar or dict, but you passed a "{}"'
with pytest.raises(TypeError, match=msg.format("list")):
float_frame.fillna([1, 2])
# tuple
with pytest.raises(TypeError, match=msg.format("tuple")):
float_frame.fillna((1, 2))
# frame with series
msg = (
'"value" parameter must be a scalar, dict or Series, but you '
'passed a "DataFrame"'
)
with pytest.raises(TypeError, match=msg):
float_frame.iloc[:, 0].fillna(float_frame)

In this test we check if for value: list, tuple or DataFrame fillna raises TypeError.
We don't validate values: np.array and Series in this tests though. I checked, for examples below

df = DataFrame({"a": [1, None, 3]})
df.iloc[:, 0].fillna(value=np.array([1, 0]))

we get the TypeError: TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "ndarray"

Should I add a check for np.array and maybe correct the condition in if statement?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking! Since we have a test with np.array and check that we're raising the error correctly so no need to add the check

@mroeschke mroeschke added this to the 3.0 milestone Mar 14, 2024
@mroeschke mroeschke merged commit d2bf501 into pandas-dev:main Mar 14, 2024
47 checks passed
@mroeschke
Copy link
Member

Thanks @natmokval

@natmokval
Copy link
Contributor Author

Thanks @mroeschke for reviewing this PR.

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…as-dev#57760)

* enforce deprecation of param method in df.fillna: correct def fillna, fix tests

* correct def fillna, fix tests

* correct an example in v0.10.0, fix test

* add a note to v3.0.0

* remove an entry from ignored_doctest_warnings

* fix pylint error in test_sparse.py

* correct fillna docstring

* correct fillna docstring II

* correct tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Transformations e.g. cumsum, diff, rank
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants