-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DEPR: Series.first() and DataFrame.first() #53419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice one! just got some minor comments
also, we'll need a whatsnew note for 2.1.0
pandas/core/generic.py
Outdated
@@ -9162,6 +9162,12 @@ def first(self, offset) -> Self: | |||
3 days observed in the dataset, and therefore data for 2018-04-13 was | |||
not returned. | |||
""" | |||
# GH45908 & GH#52487 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need the issue numbers here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
pandas/core/generic.py
Outdated
@@ -9162,6 +9162,12 @@ def first(self, offset) -> Self: | |||
3 days observed in the dataset, and therefore data for 2018-04-13 was | |||
not returned. | |||
""" | |||
# GH45908 & GH#52487 | |||
warnings.warn( | |||
"first is deprecated and will be removed in a future version", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we advise users what to do instead? something like "please create a mask and filter using .loc
instead"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
|
||
class TestFirst: | ||
def test_first_subset(self, frame_or_series): | ||
ts = tm.makeTimeDataFrame(freq="12h") | ||
ts = tm.get_obj(ts, frame_or_series) | ||
result = ts.first("10d") | ||
assert len(result) == 20 | ||
with tm.assert_produces_warning(FutureWarning, match=deprecated_msg): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we perhaps just do something like
ts = tm.makeTimeDataFrame(freq="12h")
ts = tm.get_obj(ts, frame_or_series)
with tm.assert_produces_warning(FutureWarning, match=deprecated_msg):
result = ts.first("10d")
ts = tm.makeTimeDataFrame(freq="D")
ts = tm.get_obj(ts, frame_or_series)
with tm.assert_produces_warning(FutureWarning, match=deprecated_msg):
result = ts.first("10d")
so that within each assert_produces_warning
, there is just a single statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure 🫣
with tm.assert_produces_warning(FutureWarning, match=deprecated_msg): | ||
with pytest.raises(TypeError, match=msg): # index is not a DatetimeIndex | ||
obj.first("1D") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine, though an alternative would be
with (
tm.assert_produces_warning(FutureWarning, match=deprecated_msg),
pytest.raises(TypeError, match=msg)
):
obj.first('1D')
There's a ruff check for this https://beta.ruff.rs/docs/rules/multiple-with-statements/ , in case you wanted to make a separate issue about it and we can enable it globally in a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a new issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo this looks very ugly as soon as one of both statements gets longer. I would prefer keeping them separate for tests at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phofl Separate like this you mean?:
with tm.assert_produces_warning(FutureWarning, match=deprecated_msg):
with pytest.raises(TypeError, match=msg): # index is not a DatetimeIndex
obj.first("1D")
expected = frame_or_series( | ||
[1] * periods, index=bdate_range(start, periods=periods) | ||
) | ||
tm.assert_equal(result, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's take this out of the with
context
# GH#51032 | ||
df = DataFrame(index=pd.DatetimeIndex([])) | ||
result = getattr(df, func)(offset=1) | ||
result = getattr(df, "last")(offset=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're doing getattr
with a literal, we may as well just write df.last
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks
ts = tm.makeTimeDataFrame(freq="D") | ||
ts = tm.get_obj(ts, frame_or_series) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't look like this was in the original, could we keep it the same please (except for the with
context)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - copied/pasted more lines than needed. I wasn't adding more tests, but still an error.. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, thanks @DeaMariaLeon ! leaving open a bit in case anyone has a comment, then will merge
EDIT: sorry there's some failing unit tests - I should've waited for all jobs to finish before approving
@MarcoGorelli I think the last failing test is not related to these changes. I separated the test for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is close! we just have a test failure in the docstring validation CI job
================================== FAILURES ===================================
_________________ [doctest] pandas.core.generic.NDFrame.first __________________
9147 >>> ts
9148 A
9149 2018-04-09 1
9150 2018-04-11 2
9151 2018-04-13 3
9152 2018-04-15 4
9153
9154 Get the rows for the first 3 days:
9155
9156 >>> ts.first('3D')
UNEXPECTED EXCEPTION: FutureWarning('first is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead')
Traceback (most recent call last):
File "/home/runner/micromamba/envs/test/lib/python3.10/doctest.py", line 1350, in __run
exec(compile(example.source, filename, "single",
File "<doctest pandas.core.generic.NDFrame.first[3]>", line 1, in <module>
File "/home/runner/micromamba/envs/test/lib/python3.10/site-packages/pandas/core/generic.py", line 9165, in first
warnings.warn(
FutureWarning: first is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead
https://github.com/pandas-dev/pandas/actions/runs/5119510437/jobs/9204861758?pr=53419
Could you check what was done in the past to deal with docstrings of deprecated methods please? Then we can do something similar here
pd.DataFrame({"A": [1, 1, 1, 1]}, pd.date_range("2000", periods=4)), | ||
], | ||
) | ||
def test_finalize_first(data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! good call to split this into a separate test
Friendly ping again @MarcoGorelli - It looks green |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice one, thanks @DeaMariaLeon !
merging then, do you want to do a similar one for |
Yes.. Thank you @MarcoGorelli. |
* Deprecating first() * Added requested changes * removed repeated ts * Update doc/source/whatsnew/v2.1.0.rst * Update pandas/core/generic.py * pre-commit * Separated test for first() on test_finalize.py * Avoiding docstring on CI --------- Co-authored-by: Marco Edward Gorelli <[email protected]> Co-authored-by: MarcoGorelli <[email protected]>
* Deprecating first() * Added requested changes * removed repeated ts * Update doc/source/whatsnew/v2.1.0.rst * Update pandas/core/generic.py * pre-commit * Separated test for first() on test_finalize.py * Avoiding docstring on CI --------- Co-authored-by: Marco Edward Gorelli <[email protected]> Co-authored-by: MarcoGorelli <[email protected]>
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.I don't know which "whatsnew" I need to edit (which version)