
BUG: Cannot astype from IntegerArray to BooleanArray with missing values #31102


Closed
jorisvandenbossche opened this issue Jan 17, 2020 · 2 comments · Fixed by #31187
Labels
Bug ExtensionArray Extending pandas with custom dtypes or arrays.

@jorisvandenbossche
Member

I added a special path for the boolean -> integer array conversion. But the reverse, calling astype on an integer array to convert it to boolean, currently fails when the array contains missing values:

In [23]: a = pd.array([1, 0, pd.NA])  

In [24]: a  
Out[24]: 
<IntegerArray>
[1, 0, <NA>]
Length: 3, dtype: Int64

In [25]: a.astype("boolean")  
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-41973ed53ee3> in <module>
----> 1 a.astype("boolean")

~/scipy/pandas/pandas/core/arrays/integer.py in astype(self, dtype, copy)
    454             kwargs = {}
    455 
--> 456         data = self.to_numpy(dtype=dtype, **kwargs)
    457         return astype_nansafe(data, dtype, copy=False)
    458 

~/scipy/pandas/pandas/core/arrays/masked.py in to_numpy(self, dtype, copy, na_value)
    124             ):
    125                 raise ValueError(
--> 126                     f"cannot convert to '{dtype}'-dtype NumPy array "
    127                     "with missing values. Specify an appropriate 'na_value' "
    128                     "for this dtype."

ValueError: cannot convert to 'boolean'-dtype NumPy array with missing values. Specify an appropriate 'na_value' for this dtype.

In [26]: a.astype(pd.BooleanDtype()) 
...
ValueError: cannot convert to 'boolean'-dtype NumPy array with missing values. Specify an appropriate 'na_value' for this dtype.

A conversion to another nullable dtype like this should be possible, though, even with missing values.
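To illustrate what a mask-preserving cast could look like from user code, here is a rough sketch. It is not necessarily what the fix in #31187 does; the helper name integer_to_boolean is hypothetical, and the sketch assumes that non-zero values map to True. It fills the missing slots with a placeholder before going through NumPy, then reattaches the mask:

```python
import numpy as np
import pandas as pd


def integer_to_boolean(arr):
    """Hypothetical helper: cast a nullable integer array to BooleanArray, keeping NA."""
    # Remember which positions are missing.
    mask = np.asarray(pd.isna(arr), dtype=bool)
    # Fill the missing slots with 0 so the NumPy conversion cannot raise.
    values = arr.to_numpy(dtype="int64", na_value=0)
    # Assumption: any non-zero integer becomes True, as in NumPy.
    return pd.arrays.BooleanArray(values.astype(bool), mask)


a = pd.array([1, 0, pd.NA], dtype="Int64")
result = integer_to_boolean(a)
print(result)
```

The placeholder value never shows up in the result, because the mask marks those positions as NA.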

@jorisvandenbossche jorisvandenbossche added ExtensionArray Extending pandas with custom dtypes or arrays. Bug labels Jan 17, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.0.0 milestone Jan 17, 2020
@TomAugspurger
Contributor

Will we treat all non-zero values (other than NA) as True? We should match the behavior in the constructor.

@jorisvandenbossche
Member Author

Good question. I suppose yes? That's what numpy also does. In the constructor, we are actually a bit more conservative, and only allow 0/1/NA-like:

In [30]: pd.array([0, 1], dtype="boolean")
Out[30]: 
<BooleanArray>
[False, True]
Length: 2, dtype: boolean

In [31]: pd.array([0, 1, 2], dtype="boolean")
...
TypeError: Need to pass bool-like values
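For comparison, a small sketch of the two behaviours discussed here: NumPy's integer-to-bool cast treats every non-zero value as True, while the constructor rejects anything other than 0/1. The variable names are only illustrative, and the TypeError is the one shown in the session above:

```python
import numpy as np
import pandas as pd

# NumPy: any non-zero integer becomes True when cast to bool.
ints = np.array([0, 1, 2, -3])
as_bool = ints.astype(bool)
print(as_bool)

# The pandas constructor is stricter and rejects values other than 0/1.
try:
    pd.array([0, 1, 2], dtype="boolean")
    constructor_rejected = False
except TypeError:
    constructor_rejected = True
print(constructor_rejected)
```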
