ENH: Add global option `io.nullable_type="pandas"|"pyarrow"` to control IO reader `use_nullable_dtype` #48957

mroeschke · 2022-10-05T17:25:35Z

From the above issues and read_parquet, it appears that a use_nullable_dtypes: bool option will generally be added to read_* functions so that users can opt into pandas nullable types.

Additionally, as of 1.5 with ArrowDtype, the nullable type returned could be backed by pyarrow instead of pandas' own implementation. This could be advantageous for readers that support read_*(engine="pyarrow"), where an option like io.nullable_type="pyarrow" would simply preserve the pyarrow object returned by the pyarrow parsing function instead of converting it to a numpy object.

The proposal is to add a new global option io.nullable_type="pandas"|"pyarrow" (default "pandas") such that, when pd.read_*(..., use_nullable_dtypes=True) is called, the backing type of the nullable dtypes is dictated by the global io.nullable_type setting.
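To make the intended behavior concrete, here is a minimal sketch of the dispatch described above, written with the standard library only. It is not pandas code: the `_options` registry, `set_option`, `option_context`, and `resolve_dtype` are hypothetical stand-ins for illustration, and only the option name `io.nullable_type` and its two values come from the proposal. The returned strings are the dtype aliases I would expect each backend to use.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global option registry; only the option
# name and its allowed values are taken from the proposal.
_options = {"io.nullable_type": "pandas"}

# Assumed mapping from a numpy dtype name to the nullable dtype alias
# for each backend (masked pandas dtypes vs. pyarrow-backed dtypes).
_NULLABLE_DTYPES = {
    "pandas": {"int64": "Int64", "float64": "Float64", "bool": "boolean"},
    "pyarrow": {
        "int64": "int64[pyarrow]",
        "float64": "double[pyarrow]",
        "bool": "bool[pyarrow]",
    },
}


def set_option(key, value):
    """Set the hypothetical global option after validating it."""
    if key not in _options:
        raise KeyError(f"unknown option: {key}")
    if value not in _NULLABLE_DTYPES:
        raise ValueError(f"{key} must be 'pandas' or 'pyarrow', got {value!r}")
    _options[key] = value


@contextmanager
def option_context(key, value):
    """Temporarily set the option and restore the old value on exit."""
    old = _options[key]
    set_option(key, value)
    try:
        yield
    finally:
        _options[key] = old


def resolve_dtype(numpy_dtype, use_nullable_dtypes=False):
    """Pick the result dtype a reader would use for one column."""
    if not use_nullable_dtypes:
        # Without the opt-in, the global option has no effect.
        return numpy_dtype
    backend = _options["io.nullable_type"]
    return _NULLABLE_DTYPES[backend][numpy_dtype]
```

With the default setting, `resolve_dtype("int64", use_nullable_dtypes=True)` selects the pandas-backed alias; inside `option_context("io.nullable_type", "pyarrow")` the same call selects the pyarrow-backed one. The point of the design is that the reader keyword stays a plain boolean while the choice of backend lives in one global setting.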


mroeschke · 2023-02-10T19:32:12Z

I think this is largely done. Closing

mroeschke added the Enhancement, IO Data (IO issues that don't fit into a more specific label), and Arrow (pyarrow functionality) labels Oct 5, 2022

mroeschke mentioned this issue Oct 5, 2022

ENH: Add engine keyword to read_json to enable reading from pyarrow #48893

Closed

lithomas1 added the NA - MaskedArrays (Related to pd.NA and nullable extension arrays) label Oct 5, 2022

mroeschke mentioned this issue Oct 11, 2022

ENH: Implement io.nullable_backend config for read_parquet #49039

Merged

5 tasks

mroeschke mentioned this issue Oct 28, 2022

ENH: Implement io.nullable_backend config for read_csv(engine="pyarrow") #49366

Merged

5 tasks

mroeschke mentioned this issue Nov 22, 2022

ENH: Add use_nullable_dtypes and nullable_backend global option to read_orc #49827

Merged

5 tasks

This was referenced Nov 30, 2022

ENH: Add io.nullable_backend=pyarrow support to read_excel #49965

Merged

read_json engine keyword and pyarrow integration #49249

Merged

Support dtype_backend="pandas|pyarrow" configuration dask/dask#9719

Merged

mroeschke closed this as completed Feb 10, 2023
