Skip to content

GH 9016: Bitwise operation weirdness #9338

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 5, 2015
Merged

GH 9016: Bitwise operation weirdness #9338

merged 1 commit into from
Feb 5, 2015

Conversation

tvyomkesh
Copy link
Contributor

Series now supports bitwise op for integral types.

I have made the changes in wrapper() itself instead of na_op() because it looked to me like wrapper is controlling the input and output fill and dtype. Once that is taken care of, na_op() seems to be doing the right thing by itself and so I did not have to change anything to get back the expected behavior. Happy to change if things need to be altered toward better design etc.

@tvyomkesh
Copy link
Contributor Author

This PR intends to resolve #9016

other = other.reindex_like(self).fillna(False).astype(bool)
return self._constructor(na_op(self.values, other.values),
if other.dtype.kind not in ['u', 'i']:
other = other.reindex_like(self).fillna(False).astype(bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the other.reindex_like(self) bit is repeated twice here -- can you consolidate it?

@jreback
Copy link
Contributor

jreback commented Jan 23, 2015

can u move the tests near the other boolean tests (right below) in test_series

also need some test for ops with NaN and series containing nan and ops with other dtypes
(with other dtypes should prob raise TypeError)

@tvyomkesh
Copy link
Contributor Author

@shoyer, @jreback thanks, I am working on the suggested changes. In the meantime I noticed this. (using master without this patch).

In [4]: Series([0,1,2,3]) & Series([True, True, True, True])
Out[4]:
0    False
1     True
2    False
3     True
dtype: bool

I was expecting output to be Series([False, True, True, True]) as RHS has dtype bool.
Figured Numpy converts bool to 0/1 and executes bitwise op. For e.g.,

In [5]: import numpy as np

In [6]: np.array([0,1,2,3]) & np.array([True, True, True, True])
Out[6]: array([0, 1, 0, 1])

In [7]: np.array([0,1,2,3]) | np.array([True, True, True, True])
Out[7]: array([1, 1, 3, 3])

In the Series example, should the behavior be preserved? In that case should the output dtype be bool or int64? Otherwise it looks confusing.

@tvyomkesh
Copy link
Contributor Author

Looks like if changed, test_logical_with_nas() in test_frame.py would break.

@tvyomkesh
Copy link
Contributor Author

made the following changes:

  • Code refactored to remove some duplication
  • Moved test function near other comparison tests
  • Added more tests with NaN and other dtypes
  • Fixed code so it works with list,tuple and np.ndarray types correctly.

@jreback jreback added API Design Dtype Conversions Unexpected or buggy dtype conversions Numeric Operations Arithmetic, Comparison, and Logical operations labels Jan 25, 2015
@jreback jreback added this to the 0.16.0 milestone Jan 25, 2015
@@ -107,6 +107,8 @@ Enhancements

- ``Timedelta`` will now accept nanoseconds keyword in constructor (:issue:`9273`)

- ``Series`` now supports bitwise operation for integral types (:issue:`9016`)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a mini-example here

@jreback
Copy link
Contributor

jreback commented Jan 25, 2015

@tvyomkesh this seems like a lot of branching for just slightly different dtype handling. see what you can do about that.

@tvyomkesh
Copy link
Contributor Author

@jreback thanks for pointing to core.common.is_interger_dtype(). Made the changes. Also cut down on branching a bit. Updated release note with a small example.

I still think this is confusing.

In [3]: pd.Series([2]) & pd.Series([True])
Out[3]:
0    False
dtype: bool

Either both the inputs should be converted to dtype boolean in which case output value should be True or if both the inputs are converted to dtype int64 implictly and bitwise & is carried out per current behavior, then the output dtype should remain int64 and not be coerced to bool. The current patch can be modified to fix this. Thoughts?

@tvyomkesh
Copy link
Contributor Author

PR review changes are done. Good to merge?

is_other_int_dtype = com.is_integer_dtype(other.dtype)

if is_other_int_dtype:
other = other.fillna(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can collapse a lot of this if you define functions like this:

fill_bool = lambda x: x.fillna(False).astype(bool)
fill_int = lambda x: x.fillna(0)

filler = fill_int if is_self_int_dtype and is_other_int_dtype else fill_bool

other = filler(other)
res = filler(self._constructor(.......))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@tvyomkesh
Copy link
Contributor Author

@jreback thanks. I have made the changes.


- ``Series`` now supports bitwise operation for integral types (:issue:`9016`)
.. code-block:: python
In [2]: pd.Series([0,1,2,3], list('abcd')) | pd.Series([4,4,4,4], list('abcd'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent 2 spaces here (the code under the directive), otherwise won't render correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh didn't notice this one. thanks.

@tvyomkesh
Copy link
Contributor Author

@jreback and @shoyer I have pushed the changes. Thanks.

@@ -117,6 +117,21 @@ Enhancements
- Added ``StringMethods.ljust()`` and ``rjust()`` which behave as the same as standard ``str`` (:issue:`9352`)
- ``StringMethods.pad()`` and ``center()`` now accept `fillchar` option to specify filling character (:issue:`9352`)



Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok move this to the api section snd show s code block of the existing behavior (and then put this after)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Series now supports bitwise op for integral types
jreback added a commit that referenced this pull request Feb 5, 2015
@jreback jreback merged commit add16e5 into pandas-dev:master Feb 5, 2015
@jreback
Copy link
Contributor

jreback commented Feb 5, 2015

thanks @tvyomkesh

@tvyomkesh tvyomkesh deleted the vyom/gh9016_bit_op_weird branch February 5, 2015 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Dtype Conversions Unexpected or buggy dtype conversions Numeric Operations Arithmetic, Comparison, and Logical operations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants