API: combine casting behavior with extension types #42198

Closed
mzeitlin11 opened this issue Jun 23, 2021 · 1 comment
Labels
API - Consistency (Internal Consistency of API/Behavior); Closing Candidate (May be closeable, needs more eyeballs); Dtype Conversions (Unexpected or buggy dtype conversions); ExtensionArray (Extending pandas with custom dtypes or arrays.); NA - MaskedArrays (Related to pd.NA and nullable extension arrays)

Comments

@mzeitlin11
Member

Series.combine currently tries to preserve extension dtypes by casting the result back to the original dtype, when possible, through maybe_cast_pointwise_result. This makes the result dtype depend on the values, as in the following (contrived) example:

In [3]: ser = pd.Series([True, True], dtype="boolean")

In [4]: ser.combine(2, lambda x, y: int(x) * y)
Out[4]:
0    2
1    2
dtype: int64

In [5]: ser = pd.Series([False, False], dtype="boolean")

In [6]: ser.combine(2, lambda x, y: int(x) * y)
Out[6]:
0    False
1    False
dtype: boolean
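
To make the value dependence concrete, here is a minimal pure-Python sketch of a "cast back if possible" step. It is a toy model, not the pandas implementation: strict_bool_from_sequence and maybe_cast_back are hypothetical names, and plain lists stand in for Series and extension arrays. The only assumption is that the cast back succeeds when every result is bool-like (0 or 1) and is skipped otherwise.

```python
# Toy model only: these helpers are hypothetical and are NOT pandas internals.

def strict_bool_from_sequence(values):
    """Build a list of bools, rejecting anything that is not bool-like (0/1)."""
    out = []
    for v in values:
        if isinstance(v, bool) or v in (0, 1):
            out.append(bool(v))
        else:
            raise TypeError(f"Need to pass bool-like values, got {v!r}")
    return out


def maybe_cast_back(values, from_sequence):
    """Try to cast back to the original type; fall back to the raw values."""
    try:
        return from_sequence(values), "cast back"
    except (TypeError, ValueError):
        return list(values), "left as-is"


def func(x, y):
    # Same function as in the example above.
    return int(x) * y


all_true = [func(x, 2) for x in [True, True]]     # raw results: [2, 2]
all_false = [func(x, 2) for x in [False, False]]  # raw results: [0, 0]

print(maybe_cast_back(all_true, strict_bool_from_sequence))   # ([2, 2], 'left as-is')
print(maybe_cast_back(all_false, strict_bool_from_sequence))  # ([False, False], 'cast back')
```

The same function and the same input type give two different result types, purely because 0 happens to be castable to a boolean and 2 is not.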

I'm not sure whether that is desired behavior on its own, but it presents an issue for fixing #42137. On current master, object-dtype boolean data raises in IntegerArray._from_sequence, so combine fails to cast back and the result stays object dtype:

In [2]: ser = pd.Series([1, pd.NA], dtype="Int64")

In [3]: ser.combine(2, lambda x, y: x < y)
Out[3]:
0    True
1    <NA>
dtype: object

However, fixing #42137 will allow the cast back to succeed, changing this behavior to:

In [5]: ser = pd.Series([1, pd.NA], dtype="Int64")

In [6]: ser.combine(2, lambda x, y: x < y)
Out[6]:
0       1
1    <NA>
dtype: Int64

which looks unexpected (and fails the pandas/tests/extension/test_integer.py::TestMethods::test_combine_le test).
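
The before/after difference can be sketched in pure Python as well. Again this is only a toy model under stated assumptions: int_from_sequence and combine_then_cast_back are hypothetical helpers, None stands in for pd.NA, and the allow_bool flag is an invented stand-in for whether the integer constructor accepts boolean objects (rejecting them on current master, accepting them after a fix along the lines of #42137).

```python
# Toy model only: hypothetical helpers, not the pandas implementation.
# None stands in for pd.NA.

def int_from_sequence(values, allow_bool):
    """Build a list of ints (None = missing); optionally reject bool objects."""
    out = []
    for v in values:
        if v is None:
            out.append(None)
        elif isinstance(v, bool) and not allow_bool:
            raise TypeError("bool values are not accepted as integers")
        else:
            out.append(int(v))
    return out


def combine_then_cast_back(values, other, func, allow_bool):
    """Apply func pointwise, then try to cast the results back to integers."""
    # Simplification: missing values are propagated without calling func.
    raw = [None if v is None else func(v, other) for v in values]
    try:
        return int_from_sequence(raw, allow_bool)
    except TypeError:
        return raw  # cast back failed, keep the raw (object-like) results


before_fix = combine_then_cast_back([1, None], 2, lambda x, y: x < y, allow_bool=False)
after_fix = combine_then_cast_back([1, None], 2, lambda x, y: x < y, allow_bool=True)

print(before_fix)  # [True, None]
print(after_fix)   # [1, None]
```

In this model, making the integer constructor more permissive silently turns a comparison result into integers, which mirrors the change from object to Int64 shown above.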

@mzeitlin11 mzeitlin11 added API - Consistency Internal Consistency of API/Behavior ExtensionArray Extending pandas with custom dtypes or arrays. Dtype Conversions Unexpected or buggy dtype conversions labels Jun 23, 2021
@jbrockmendel jbrockmendel added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label Jun 24, 2021
@jbrockmendel
Member

I think the answer is "yes, this is expected".

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Jul 28, 2023