Skip to content

Fixed IntegerArray division by 0 #30183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 11, 2019

Conversation

TomAugspurger
Copy link
Contributor

@TomAugspurger TomAugspurger commented Dec 10, 2019

When dividing by 0, the result should be inf, not NaN.

closes #27398
closes #27829

When dividing by 0, the result should be `inf`, not `NaN`.

Closes pandas-dev#27398
# may need to fill infs
# and mask wraparound
if is_float_dtype(result):
mask |= (result == np.inf) | (result == -np.inf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this line was at the root of #27829, so this might fix that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're correct.

@@ -389,6 +396,10 @@ def test_compare_array(self, data, all_compare_operators):
other = pd.Series([0] * len(data))
self._compare_other(data, op_name, other)

def test_no_shared_mask(self, data):
result = data + 1
assert np.shares_memory(result._mask, data._mask) is False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didnt know about shares_memory, neat!

@jschendel
Copy link
Member

As an aside, floor division by zero looks a little suspicious to me. I'm not 100% sure what the expected behavior for it should be, but off the top of my head I think your test case should return the same values? Certainly doesn't need to be part of this PR though.

In [7]: a // 0
Out[7]: 
<IntegerArray>
[0, 0, 0, NaN]
Length: 4, dtype: Int64

In [8]: a // -0
Out[8]: 
<IntegerArray>
[0, 0, 0, NaN]
Length: 4, dtype: Int64

In [9]: a // 0.0
Out[9]: array([nan, nan, nan, nan])

In [10]: a // -0.0
Out[10]: array([nan, nan, nan, nan])

values = [np.nan, -np.inf, np.inf, np.nan]
else:
values = [np.nan, np.inf, -np.inf, np.nan]
expected = np.array(values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nitpick but maybe a little more concise to do:

expected = np.array([np.nan, np.inf, -np.inf, np.nan])
if negative:
    expected *= -1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that CI is passing, I think I'm OK with how things are now :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see tests.arithmetic.test_numeric.adjust_negative_zero

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

This can go in first before the pd.NA IntegerArray PR?

@@ -858,6 +858,7 @@ Other
- Bug in :meth:`DataFrame.append` that raised ``IndexError`` when appending with empty list (:issue:`28769`)
- Fix :class:`AbstractHolidayCalendar` to return correct results for
years after 2030 (now goes up to 2200) (:issue:`27790`)
- Fixed :class:`IntegerArray` returning ``NA`` rather than ``inf`` for operations dividing by 0 (:issue:`27398`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusingly worded I think, as I would interpret this that it now returns NaN and no longer inf, while it is the other way around?

Also NA -> NaN, as the result is about floats with NaNs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for non-trivially to understand changes we usually have a sub-section, but up to you.

@simonjayhawkins simonjayhawkins added ExtensionArray Extending pandas with custom dtypes or arrays. Numeric Operations Arithmetic, Comparison, and Logical operations labels Dec 11, 2019
@simonjayhawkins simonjayhawkins added this to the 1.0 milestone Dec 11, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. if you want to expand whatsnew note cool.

@@ -858,6 +858,7 @@ Other
- Bug in :meth:`DataFrame.append` that raised ``IndexError`` when appending with empty list (:issue:`28769`)
- Fix :class:`AbstractHolidayCalendar` to return correct results for
years after 2030 (now goes up to 2200) (:issue:`27790`)
- Fixed :class:`IntegerArray` returning ``NA`` rather than ``inf`` for operations dividing by 0 (:issue:`27398`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for non-trivially to understand changes we usually have a sub-section, but up to you.

@@ -339,6 +339,19 @@ def test_error(self, data, all_arithmetic_operators):
with pytest.raises(NotImplementedError):
opa(np.arange(len(s)).reshape(-1, len(s)))

@pytest.mark.parametrize("zero, negative", [(0, False), (0.0, False), (-0.0, True)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be good to make a fixture for zeros as we do a lot of these types of tests for series ops, cc @jbrockmendel

@TomAugspurger
Copy link
Contributor Author

I'd prefer to just merge this as is. I have a followup PR fixing IntergerArray.__pow__ which depends on this and I can clarify the whatsnew there.

@TomAugspurger
Copy link
Contributor Author

This can go in first before the pd.NA IntegerArray PR?

Yes, this and my next PR fixing IntegerArray.__pow__ are splitting off tangentially-related components of that large PR.

@jorisvandenbossche
Copy link
Member

I'd prefer to just merge this as is. I have a followup PR fixing IntergerArray.pow which depends on this and I can clarify the whatsnew there.

Sounds good

@jorisvandenbossche jorisvandenbossche merged commit 8841969 into pandas-dev:master Dec 11, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
When dividing by 0, the result should be `inf`, not `NaN`.
Closes pandas-dev#27398
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
When dividing by 0, the result should be `inf`, not `NaN`.
Closes pandas-dev#27398
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Numeric Operations Arithmetic, Comparison, and Logical operations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: IntNA division can alter array inplace IntegerArray divide by zero returns NaN instead of inf
6 participants