DOC/BUG: broken example in read_parquet with selecting columns #18628

jorisvandenbossche · 2017-12-04T12:55:07Z

In the dev docs the example that subsets the columns to read with read_parquet is broken for the pyarrow engine: http://pandas-docs.github.io/pandas-docs-travis/io.html#io-parquet

In [514]: result = pd.read_parquet('example_pa.parquet', engine='pyarrow', columns=['a', 'b'])
...
IndexError: Table column index 6 is out of range

In [515]: result = pd.read_parquet('example_fp.parquet', engine='fastparquet', columns=['a', 'b'])

In [516]: result.dtypes
Out[516]: 
a    object
b     int64
dtype: object

This is due to a bug in pyarrow in how it handles the pandas metadata when not all columns are read (I am reporting it over there), but in the meantime we should also fix our docs so they no longer show this broken example.


jorisvandenbossche · 2017-12-04T13:14:00Z

Issue is here: https://issues.apache.org/jira/browse/ARROW-1883 and PR here: apache/arrow#1386

jorisvandenbossche · 2017-12-04T13:57:57Z

We should maybe also add a test for this case.

jorisvandenbossche · 2017-12-07T15:36:22Z

The docs are updated to not include this, so I am removing the 0.21.1 milestone, but let's keep this open to make sure we revert the PR once pyarrow 0.8.0 is released.

jorisvandenbossche added the Compat, Docs, and IO Parquet labels Dec 4, 2017

jorisvandenbossche added this to the 0.21.1 milestone Dec 4, 2017

This was referenced Dec 6, 2017

DOC: temporary remove pyarrow example of reading subset columns #18661

Merged

TST/DOC: test pyarrow tz data + doc / enable cross compat tests for pyarrow/fastparquet #18662

Merged

jorisvandenbossche modified the milestones: 0.21.1, 0.22.0 Dec 7, 2017

jreback modified the milestones: 0.23.0, Next Major Release Apr 14, 2018

jorisvandenbossche mentioned this issue Apr 23, 2019

Revert "DOC: temporary remove pyarrow example of reading subset columns (#18661) #26194

Merged

jreback modified the milestones: Contributions Welcome, 0.25.0 Apr 23, 2019

jreback closed this as completed in #26194 Apr 23, 2019
