In [514]: result = pd.read_parquet('example_pa.parquet', engine='pyarrow', columns=['a', 'b'])
...
IndexError: Table column index 6 is out of range
In [515]: result = pd.read_parquet('example_fp.parquet', engine='fastparquet', columns=['a', 'b'])
In [516]: result.dtypes
Out[516]:
a object
b int64
dtype: object
This is due to a bug in how pyarrow deals with the pandas metadata when not all columns are present (which I am reporting over there). In the meantime, we should also fix our docs so that they do not show this buggy example.
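Until a fixed pyarrow is available, one possible workaround is to fall back to reading the whole file and selecting the columns afterwards. The following is only a minimal sketch: `read_columns` is a hypothetical helper (not part of pandas or pyarrow), and it assumes the failure surfaces as the `IndexError` shown above.

```python
def read_columns(read_func, path, columns, **kwargs):
    """Read only `columns` from `path`, falling back to a full read.

    `read_func` is expected to behave like `pandas.read_parquet`, i.e. to
    accept a `columns` keyword. This helper is hypothetical and only
    illustrates the workaround.
    """
    try:
        # Preferred path: let the engine read just the requested columns.
        return read_func(path, columns=columns, **kwargs)
    except IndexError:
        # Assumed failure mode (the error reported above): read every
        # column instead and subset the result afterwards.
        full = read_func(path, **kwargs)
        return full[columns]
```

With pandas imported as `pd`, this could be called as `read_columns(pd.read_parquet, 'example_pa.parquet', ['a', 'b'], engine='pyarrow')`. The fallback reads every column into memory, so it gives up the I/O savings of column selection and is only meant as a stopgap.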
The docs have been updated to no longer include this, so I am removing the 0.21.1 milestone. Let's keep this open to make sure we revert the PR once pyarrow 0.8.0 is released.
In the dev docs, the example that subsets the columns to read with read_parquet is broken for the pyarrow engine: http://pandas-docs.github.io/pandas-docs-travis/io.html#io-parquet