-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Remove redundant tests for .duplicated and .drop_duplicates in tests/base #32487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Remove redundant tests for .duplicated and .drop_duplicates in tests/base #32487
Conversation
Same comment as #32484 are you sure this doesn't lose any coverage? |
…on a Series without duplicated values
idx = original.index[list(range(len(original))) + [5, 3]] | ||
values = original._values[list(range(len(original))) + [5, 3]] | ||
s = Series(values, index=idx, name="a") | ||
|
||
expected = Series( | ||
[False] * len(original) + [True, True], index=idx, name="a" | ||
) | ||
tm.assert_series_equal(s.duplicated(), expected) | ||
tm.assert_series_equal(s.drop_duplicates(), original) | ||
|
||
base = [False] * len(idx) | ||
base[3] = True | ||
base[5] = True | ||
expected = Series(base, index=idx, name="a") | ||
|
||
tm.assert_series_equal(s.duplicated(keep="last"), expected) | ||
tm.assert_series_equal( | ||
s.drop_duplicates(keep="last"), s[~np.array(base)] | ||
) | ||
|
||
base = [False] * len(original) + [True, True] | ||
base[3] = True | ||
base[5] = True | ||
expected = Series(base, index=idx, name="a") | ||
|
||
tm.assert_series_equal(s.duplicated(keep=False), expected) | ||
tm.assert_series_equal( | ||
s.drop_duplicates(keep=False), s[~np.array(base)] | ||
) | ||
|
||
s.drop_duplicates(inplace=True) | ||
tm.assert_series_equal(s, original) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected = Series( | ||
[False] * len(original), index=original.index, name="a" | ||
) | ||
tm.assert_series_equal(original.duplicated(), expected) | ||
result = original.drop_duplicates() | ||
tm.assert_series_equal(result, original) | ||
assert result is not original |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now tested in test_drop_duplicates_no_duplicates
in tests/series/methods/test_drop_duplicates.py
, see below
with pytest.raises( | ||
TypeError, | ||
match=r"drop_duplicates\(\) got an unexpected keyword argument", | ||
): | ||
idx.drop_duplicates(inplace=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now tested in test_drop_duplicates_inplace
in tests/indexes/test_common.py
, see below
# has_duplicates | ||
assert not original.has_duplicates | ||
|
||
# create repeated values, 3rd and 5th values are duplicated | ||
idx = original[list(range(len(original))) + [5, 3]] | ||
expected = np.array([False] * len(original) + [True, True], dtype=bool) | ||
duplicated = idx.duplicated() | ||
tm.assert_numpy_array_equal(duplicated, expected) | ||
assert duplicated.dtype == bool | ||
tm.assert_index_equal(idx.drop_duplicates(), original) | ||
|
||
base = [False] * len(idx) | ||
base[3] = True | ||
base[5] = True | ||
expected = np.array(base) | ||
|
||
duplicated = idx.duplicated(keep="last") | ||
tm.assert_numpy_array_equal(duplicated, expected) | ||
assert duplicated.dtype == bool | ||
result = idx.drop_duplicates(keep="last") | ||
tm.assert_index_equal(result, idx[~expected]) | ||
|
||
base = [False] * len(original) + [True, True] | ||
base[3] = True | ||
base[5] = True | ||
expected = np.array(base) | ||
|
||
duplicated = idx.duplicated(keep=False) | ||
tm.assert_numpy_array_equal(duplicated, expected) | ||
assert duplicated.dtype == bool | ||
result = idx.drop_duplicates(keep=False) | ||
tm.assert_index_equal(result, idx[~expected]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested in test_drop_duplicates
in tests/indexes/test_common.py
which was extended and refactored, see below
# original doesn't have duplicates | ||
expected = np.array([False] * len(original), dtype=bool) | ||
duplicated = original.duplicated() | ||
tm.assert_numpy_array_equal(duplicated, expected) | ||
assert duplicated.dtype == bool | ||
result = original.drop_duplicates() | ||
tm.assert_index_equal(result, original) | ||
assert result is not original |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now tested in test_drop_duplicates_no_duplicates
in tests/indexes/test_common.py
, see below
@WillAyd I added comments highlighting and linking the redundant tests. Where needed I extended the existing tests in |
|
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@jreback CI is green |
thanks @SaturnFromTitan there might be other opportunities to use the |
part of #23877
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
duplicated and drop_duplicates are already thoroughly tested in
tests/indexes
andtests/series
. I added comments to highlight the redundancy and extended the existing test cases where needed.