Convert ArrowExtensionArray to proper NumPy dtype #56290
pandas/core/arrays/_utils.py
Outdated
from pandas.core.dtypes.common import is_numeric_dtype


def _to_numpy_dtype_inference(arr, dtype, na_value, hasna):
Could you type this method?
done
@@ -58,6 +58,7 @@
"_iLocIndexer",
# TODO(3.0): GH#55043 - remove upon removal of ArrayManager
"_get_option",
"_to_numpy_dtype_inference",
Since the original method is in core, you can probably remove the leading _
done
@@ -1501,8 +1511,8 @@ def test_to_numpy_int_with_na():
data = [1, None]
arr = pd.array(data, dtype="int64[pyarrow]")
result = arr.to_numpy()
expected = np.array([1, pd.NA], dtype=object)
assert isinstance(result[0], int)
expected = np.array([1, np.nan])
Does this mean that ArrowExtensionArray.to_numpy is now lossy for integer types if there are NAs, since it will now be converted to float?
Yes
I'm a little surprised it doesn't break more tests as I think there are still a number of places that try to round-trip through numpy
yeah same
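To make the lossiness in this thread concrete, here is a minimal sketch in plain Python. It is not taken from the PR; it assumes only that Python's built-in float is an IEEE 754 double, the same format as NumPy's float64, so it can stand in for a value after the integer array has been converted to float. The variable names are illustrative.

```python
# Assumption: Python's float is an IEEE 754 double (same format as float64),
# so float(x) mimics what happens to an int64 value once the array holding it
# is converted to float64 to make room for NaN.

big = 2**53 + 1            # fits comfortably in int64
as_float = float(big)      # value after conversion to a 64-bit float
round_tripped = int(as_float)

print(big)                   # 9007199254740993
print(round_tripped)         # 9007199254740992
print(big == round_tripped)  # False: the low bit was lost

# Small integers survive the same round trip unchanged.
small = 1
small_round_tripped = int(float(small))
print(small == small_round_tripped)  # True
```

So the conversion is exact for integers of magnitude up to 2**53 and can silently change larger ones, which is the case any code round-tripping through NumPy would need to watch for.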
# Conflicts: # doc/source/whatsnew/v2.2.0.rst
# Conflicts: # doc/source/whatsnew/v2.2.0.rst # pandas/core/arrays/arrow/array.py # pandas/tests/extension/test_arrow.py
cc @mroeschke this should be ready as well
# Conflicts: # pandas/core/arrays/arrow/array.py
np.log fails with missing pyarrow values #56285
As discussed on the other PR