Enable to restrict columns for pandas.read_parquet #18154

hoffmann · 2017-11-07T20:08:35Z

Problem description

In pandas 0.21 the top-level function read_parquet() was introduced. Both available engines, fastparquet and pyarrow, support specifying which columns to read. If you are only interested in certain columns of a dataframe, this reduces the I/O.

It should also be possible to specify the columns in pandas.read_parquet().

gfyoung · 2017-11-07T20:14:00Z

Sounds good to me!

jreback · 2017-11-07T20:54:07Z

This is actually quite trivial: we just need to pass kwargs through on the read. Then you can specify columns=, which we could document as a formal kwarg.

PRs welcome!

jorisvandenbossche · 2017-11-08T21:39:02Z

More generally, should we pass through **kwargs to the actual engine call?
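
The pass-through idea discussed in these comments can be sketched roughly as follows. This is not the pandas implementation: `FakeEngine`, its `read` method, and the `use_threads` option are hypothetical stand-ins used only to show how `columns` could be a documented keyword while any remaining keyword arguments are forwarded untouched to the engine.

```python
class FakeEngine:
    """Hypothetical stand-in for a Parquet engine wrapper."""

    def read(self, path, columns=None, **kwargs):
        # A real engine would open the file here; this one only records
        # the arguments it received so the forwarding can be inspected.
        return {"path": path, "columns": columns, "extra": kwargs}


def read_parquet(path, engine=None, columns=None, **kwargs):
    """Forward `columns` and all other keyword arguments to the engine."""
    impl = engine if engine is not None else FakeEngine()
    return impl.read(path, columns=columns, **kwargs)


# `use_threads` is an arbitrary, engine-specific option chosen for the example.
result = read_parquet("data.parquet", columns=["a"], use_threads=True)
print(result)
```

Keeping `columns` as a named parameter lets it be documented once for all engines, while `**kwargs` leaves room for options that only one engine understands.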

hoffmann mentioned this issue Nov 7, 2017

restrict columns to read for pandas.read_parquet #18155

Merged

1 task

gfyoung added Enhancement IO Parquet parquet, feather labels Nov 7, 2017

jreback added Difficulty Novice labels Nov 7, 2017

jreback modified the milestones: Next Major Release, 0.21.1 Nov 7, 2017

jreback closed this as completed in #18155 Nov 8, 2017

criemen mentioned this issue Nov 10, 2017

Pass kwargs from read_parquet() to the underlying engines. #18216

Merged

4 tasks
