-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: Fix numpy boolean subtraction error in Series.diff #28251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 56 commits
8be97f8
7c982d2
45f9c60
83907a2
6c3875e
ecf1c04
dd8ff98
e123486
269d148
3993206
b610dd9
92075fb
1b90657
182cbc2
3ba0958
47feb30
d425cdf
050b8cb
e028aef
9d947db
b115c9e
865e913
6f8986b
081c2c8
96bb1b4
91ebae5
74a474f
40df511
18a2e35
716a70e
e879dd7
850e315
8d2547a
e6f1890
3231d8f
3e0dbc7
5f101fe
f98cbf2
2127651
a174239
b83135d
b14f7c6
d92d6c7
b96bde8
6a72f2c
030d305
649042f
d1c1755
1e4a7ae
951556e
d1c6b65
8323e71
bbd2550
21defe6
6227564
88cea85
acc8f87
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,6 +20,7 @@ | |
from pandas.api.types import is_scalar | ||
from pandas.core.index import MultiIndex | ||
from pandas.core.indexes.datetimes import Timestamp | ||
from pandas.core.indexes.timedeltas import TimedeltaIndex | ||
import pandas.util.testing as tm | ||
from pandas.util.testing import ( | ||
assert_almost_equal, | ||
|
@@ -228,7 +229,7 @@ def test_cummax_timedelta64(self): | |
result = s.cummax(skipna=False) | ||
tm.assert_series_equal(expected, result) | ||
|
||
def test_npdiff(self): | ||
def test_np_diff(self): | ||
pytest.skip("skipping due to Series no longer being an ndarray") | ||
|
||
# no longer works as the return type of np.diff is now nd.array | ||
|
@@ -237,6 +238,67 @@ def test_npdiff(self): | |
r = np.diff(s) | ||
assert_series_equal(Series([nan, 0, 0, 0, nan]), r) | ||
|
||
def test_int_diff(self): | ||
# int dtype | ||
a = 10000000000000000 | ||
b = a + 1 | ||
s = Series([a, b]) | ||
|
||
result = s.diff() | ||
assert result[1] == 1 | ||
|
||
def test_tz_diff(self): | ||
# Combined datetime diff, normal diff and boolean diff test | ||
ts = tm.makeTimeSeries(name="ts") | ||
ts.diff() | ||
|
||
# neg n | ||
result = ts.diff(-1) | ||
expected = ts - ts.shift(-1) | ||
assert_series_equal(result, expected) | ||
|
||
# 0 | ||
result = ts.diff(0) | ||
expected = ts - ts | ||
assert_series_equal(result, expected) | ||
|
||
# datetime diff (GH3100) | ||
s = Series(date_range("20130102", periods=5)) | ||
result = s.diff() | ||
expected = s - s.shift(1) | ||
assert_series_equal(result, expected) | ||
|
||
# timedelta diff | ||
result = result - result.shift(1) # previous result | ||
expected = expected.diff() # previously expected | ||
assert_series_equal(result, expected) | ||
|
||
# with tz | ||
s = Series( | ||
date_range("2000-01-01 09:00:00", periods=5, tz="US/Eastern"), name="foo" | ||
) | ||
result = s.diff() | ||
expected = Series(TimedeltaIndex(["NaT"] + ["1 days"] * 4), name="foo") | ||
assert_series_equal(result, expected) | ||
|
||
@pytest.mark.parametrize( | ||
"input,output,diff", | ||
[([False, True, True, False, False], [nan, True, False, True, False], 1)], | ||
) | ||
def test_bool_diff(self, input, output, diff): | ||
# boolean series (test for fixing #17294) | ||
s = Series(input) | ||
result = s.diff() | ||
expected = Series(output) | ||
assert_series_equal(result, expected) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is x - x.shift() also true here (as it is for obj_diff)? can you add that assert as well (do another result= and expected=), otherwise this PR lgtm. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, it is not and will not with the current implementation. We all agreed on this, please re-read the thread if you have the time. @TomAugspurger specifically wanted a boolean Series instead of a list of integers, as shown in the quote below:
Edit: This is also the reason the diff on a boolean object series is different from a diff on a series with only booleans. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok fair enough, do we have a followup issue on this? @TomAugspurger ? (which likely will block on a real BoolType EA) |
||
|
||
def test_obj_diff(self): | ||
# object series | ||
s = Series([False, True, 5.0, nan, True, False]) | ||
result = s.diff() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this routine is way too long can you parameterize There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback Shall I rewrite all other tests in pandas as well? I did not write the tests the way they are, I only added a few lines... I would rather do this in a refactor. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a particular reason why you moved this test? I think would be easier just to create another test specific to bool here and leave the time series stuff where it is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@ WillAyd Done. |
||
expected = s - s.shift(1) | ||
assert_series_equal(result, expected) | ||
|
||
def _check_accum_op(self, name, datetime_series_, check_dtype=True): | ||
func = getattr(np, name) | ||
tm.assert_numpy_array_equal( | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use double backticks around
TypeError
, I would remove the(the - operator is deprecated)
, not obvious what this is in reference to a casual reader