https://issues.apache.org/jira/browse/ARROW-5436
I suppose `parquet.read_table` dispatched to `FileSystem.read_parquet` for historical reasons (that function was added before `ParquetDataset` existed), but calling `ParquetDataset` directly is cleaner than going through `FileSystem.read_parquet`, so I changed that as well.
In addition, I made sure the `memory_map` keyword is actually passed through; I think this was an oversight in #2954.
(Those two changes should be useful regardless of whether the `filters` keyword is added.)
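For readers unfamiliar with the `filters` format, here is a minimal pure-Python sketch of how such a value is interpreted: a flat list of `(column, op, value)` tuples is a conjunction, and a list of lists is a disjunction of conjunctions. The `matches` helper and the sample partition keys are hypothetical and only illustrate the semantics; this is not pyarrow's implementation.

```python
import operator

# Hypothetical helper (not part of pyarrow): checks whether the partition
# keys of one file satisfy a `filters` value.
_OPS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, options: value in options,
    "not in": lambda value, options: value not in options,
}


def matches(partition_keys, filters):
    # A flat list of tuples is a single AND group; wrap it so that both
    # accepted shapes become a list of groups.
    if filters and isinstance(filters[0], tuple):
        filters = [filters]
    # Outer list: OR across groups. Inner list: AND across predicates.
    return any(
        all(_OPS[op](partition_keys[col], val) for col, op, val in group)
        for group in filters
    )


# Made-up partition keys, one dict per file of a partitioned dataset.
files = [
    {"year": 2018, "month": 12},
    {"year": 2019, "month": 1},
    {"year": 2019, "month": 6},
]

# Keep only the files with year == 2019 AND month < 3.
kept = [f for f in files if matches(f, [("year", "=", 2019), ("month", "<", 3)])]
```

With this change, a value of the same shape should be accepted by `parquet.read_table` via the `filters` keyword, alongside `memory_map`.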
Author: Joris Van den Bossche <[email protected]>
Closes #4409 from jorisvandenbossche/ARROW-5436-parquet-read_table and squashes the following commits:
85e5b0e <Joris Van den Bossche> lint
0ae1488 <Joris Van den Bossche> add test with nested list
9baf420 <Joris Van den Bossche> add filters to read_pandas
0df8c88 <Joris Van den Bossche> Merge remote-tracking branch 'upstream/master' into ARROW-5436-parquet-read_table
4ea7b77 <Joris Van den Bossche> fix test
4eb2ea7 <Joris Van den Bossche> add filters keyword
9c10f70 <Joris Van den Bossche> fix passing of memory_map (leftover from ARROW-2807)
896abb2 <Joris Van den Bossche> simplify read_table (use ParquetDataset directly)