BUG: fix parquet roundtrip with unsigned integer dtypes #31918

Merged: 2 commits into pandas-dev:master on Feb 12, 2020

jorisvandenbossche
Member

Closes #31896

jorisvandenbossche added the Bug, IO Parquet (parquet, feather), and NA - MaskedArrays (Related to pd.NA and nullable extension arrays) labels Feb 12, 2020
jorisvandenbossche added this to the 1.0.2 milestone Feb 12, 2020
Contributor

TomAugspurger left a comment

Codecov failure can be ignored I think.

jreback merged commit 9767da6 into pandas-dev:master Feb 12, 2020
Contributor

jreback commented Feb 12, 2020

thanks @jorisvandenbossche

@jorisvandenbossche
Member Author

@meeseeksdev backport to 1.0.x

@@ -103,6 +103,10 @@ def __from_arrow__(
        import pyarrow  # noqa: F811
        from pandas.core.arrays._arrow_utils import pyarrow_array_to_numpy_and_mask

        pyarrow_type = pyarrow.from_numpy_dtype(self.type)
        if not array.type.equals(pyarrow_type):
            array = array.cast(pyarrow_type)
Member

This is the cause of a bunch of xfails in test_from_arrow_type_error. Does this need to be tightened somehow, or is the test behavior not something we care about?

Member Author

It could be tightened, for example by checking that the array.type is integral

Development

Successfully merging this pull request may close these issues.

BUG: read_parquet unexpected output with pyarrow 0.16.0 and nullable unsigned int dtype
4 participants