Remove use_nullable_dtypes and add dtype_backend keyword #51853

Merged · 12 commits · Mar 13, 2023
Changes from 9 commits
11 changes: 5 additions & 6 deletions doc/source/user_guide/io.rst
@@ -170,10 +170,9 @@ dtype : Type name or dict of column -> type, default ``None``
the default determines the dtype of the columns which are not explicitly
listed.

-use_nullable_dtypes : bool = False
-    Whether or not to use nullable dtypes as default when reading data. If
-    set to True, nullable dtypes are used for all dtypes that have a nullable
-    implementation, even if no nulls are present.
+dtype_backend : {"numpy_nullable", "pyarrow"}, defaults to NumPy backed DataFrames.
+    Which dtype backend to use. If
+    set to True, nullable dtypes or pyarrow dtypes are used for all dtypes.
Review thread on the new ``dtype_backend`` description:

> Member: Is True also an option here?

> Member: Good catch, this shouldn't accept True

> Member (Author): Thx, updated

.. versionadded:: 2.0

@@ -475,7 +474,7 @@ worth trying.

os.remove("foo.csv")

-Setting ``use_nullable_dtypes=True`` will result in nullable dtypes for every column.
+Setting ``dtype_backend="numpy_nullable"`` will result in nullable dtypes for every column.

.. ipython:: python

@@ -484,7 +483,7 @@ Setting ``use_nullable_dtypes=True`` will result in nullable dtypes for every co
3,4.5,False,b,6,7.5,True,a,12-31-2019,
"""

-df = pd.read_csv(StringIO(data), use_nullable_dtypes=True, parse_dates=["i"])
+df = pd.read_csv(StringIO(data), dtype_backend="numpy_nullable", parse_dates=["i"])
df
df.dtypes
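Following up the review thread above: after the update, ``dtype_backend`` is a plain string enum, so a boolean is rejected up front. A minimal sketch (assuming pandas >= 2.0; the exact error message may differ by version):

```python
import io
import pandas as pd

# dtype_backend is validated against {"numpy_nullable", "pyarrow"};
# anything else, including True, is rejected before parsing starts.
try:
    pd.read_csv(io.StringIO("a\n1"), dtype_backend=True)
except ValueError as err:
    print(err)
```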

18 changes: 4 additions & 14 deletions doc/source/user_guide/pyarrow.rst
@@ -145,8 +145,8 @@ functions provide an ``engine`` keyword that can dispatch to PyArrow to accelera
df

By default, these functions and all other IO reader functions return NumPy-backed data. These readers can return
-PyArrow-backed data by specifying the parameter ``use_nullable_dtypes=True`` **and** the global configuration option ``"mode.dtype_backend"``
-set to ``"pyarrow"``. A reader does not need to set ``engine="pyarrow"`` to necessarily return PyArrow-backed data.
+PyArrow-backed data by specifying the parameter ``dtype_backend="pyarrow"``. A reader does not need to set
+``engine="pyarrow"`` to necessarily return PyArrow-backed data.

.. ipython:: python

@@ -155,20 +155,10 @@ set to ``"pyarrow"``. A reader does not need to set ``engine="pyarrow"`` to nece
1,2.5,True,a,,,,,
3,4.5,False,b,6,7.5,True,a,
""")
-with pd.option_context("mode.dtype_backend", "pyarrow"):
-    df_pyarrow = pd.read_csv(data, use_nullable_dtypes=True)
+df_pyarrow = pd.read_csv(data, dtype_backend="pyarrow")
df_pyarrow.dtypes

-To simplify specifying ``use_nullable_dtypes=True`` in several functions, you can set a global option ``nullable_dtypes``
-to ``True``. You will still need to set the global configuration option ``"mode.dtype_backend"`` to ``pyarrow``.
-
-.. code-block:: ipython
-
-   In [1]: pd.set_option("mode.dtype_backend", "pyarrow")
-
-   In [2]: pd.options.mode.nullable_dtypes = True
-
-Several non-IO reader functions can also use the ``"mode.dtype_backend"`` option to return PyArrow-backed data including:
+Several non-IO reader functions can also use the ``dtype_backend`` argument to return PyArrow-backed data including:

* :func:`to_numeric`
* :meth:`DataFrame.convert_dtypes`
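To illustrate one of the non-IO entry points in this list, a minimal sketch (assuming pandas >= 2.0 with pyarrow installed):

```python
import pandas as pd

# to_numeric can now hand back pyarrow-backed values directly,
# no global option needed anymore.
parsed = pd.to_numeric(pd.Series(["1", "2", None]), dtype_backend="pyarrow")
print(parsed.dtype)  # int64[pyarrow]
```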
38 changes: 12 additions & 26 deletions doc/source/whatsnew/v2.0.0.rst
@@ -103,12 +103,12 @@ Below is a possibly non-exhaustive list of changes:
pd.Index([1, 2, 3], dtype=np.float16)


-.. _whatsnew_200.enhancements.io_use_nullable_dtypes_and_dtype_backend:
+.. _whatsnew_200.enhancements.io_dtype_backend:

-Configuration option, ``mode.dtype_backend``, to return pyarrow-backed dtypes
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Argument ``dtype_backend``, to return pyarrow-backed or numpy-backed nullable dtypes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The ``use_nullable_dtypes`` keyword argument has been expanded to the following functions to enable automatic conversion to nullable dtypes (:issue:`36712`)
+The following functions gained a new keyword ``dtype_backend`` (:issue:`36712`)

* :func:`read_csv`
* :func:`read_clipboard`
@@ -124,19 +124,13 @@ The ``use_nullable_dtypes`` keyword argument has been expanded to the following
* :func:`read_feather`
* :func:`read_spss`
* :func:`to_numeric`
+* :meth:`DataFrame.convert_dtypes`
+* :meth:`Series.convert_dtypes`

-To simplify opting-in to nullable dtypes for these functions, a new option ``nullable_dtypes`` was added that allows setting
-the keyword argument globally to ``True`` if not specified directly. The option can be enabled
-through:
-
-.. ipython:: python
-
-    pd.options.mode.nullable_dtypes = True
-
-The option will only work for functions with the keyword ``use_nullable_dtypes``.
+When this option is set to ``"numpy_nullable"`` it will return a :class:`DataFrame` that is
+backed by nullable dtypes.

-Additionally a new global configuration, ``mode.dtype_backend`` can now be used in conjunction with the parameter ``use_nullable_dtypes=True`` in the following functions
-to select the nullable dtypes implementation.
+When this keyword is set to ``"pyarrow"``, then these functions will return pyarrow-backed nullable :class:`ArrowDtype` DataFrames (:issue:`48957`, :issue:`49997`):

* :func:`read_csv`
* :func:`read_clipboard`
@@ -153,30 +153,21 @@ to select the nullable dtypes implementation.
* :func:`read_feather`
* :func:`read_spss`
* :func:`to_numeric`


-And the following methods will also utilize the ``mode.dtype_backend`` option.
-
-* :meth:`DataFrame.convert_dtypes`
-* :meth:`Series.convert_dtypes`
-
-By default, ``mode.dtype_backend`` is set to ``"pandas"`` to return existing, numpy-backed nullable dtypes, but it can also
-be set to ``"pyarrow"`` to return pyarrow-backed, nullable :class:`ArrowDtype` (:issue:`48957`, :issue:`49997`).

.. ipython:: python

import io
data = io.StringIO("""a,b,c,d,e,f,g,h,i
1,2.5,True,a,,,,,
3,4.5,False,b,6,7.5,True,a,
""")
-with pd.option_context("mode.dtype_backend", "pandas"):
-    df = pd.read_csv(data, use_nullable_dtypes=True)
+df = pd.read_csv(data, dtype_backend="pyarrow")
df.dtypes

data.seek(0)
-with pd.option_context("mode.dtype_backend", "pyarrow"):
-    df_pyarrow = pd.read_csv(data, use_nullable_dtypes=True, engine="pyarrow")
+df_pyarrow = pd.read_csv(data, dtype_backend="pyarrow", engine="pyarrow")
df_pyarrow.dtypes

Copy-on-Write improvements
@@ -810,6 +795,7 @@ Deprecations
- Deprecated :meth:`Grouper.obj`, use :meth:`Groupby.obj` instead (:issue:`51206`)
- Deprecated :meth:`Grouper.indexer`, use :meth:`Resampler.indexer` instead (:issue:`51206`)
- Deprecated :meth:`Grouper.ax`, use :meth:`Resampler.ax` instead (:issue:`51206`)
+- Deprecated keyword ``use_nullable_dtypes`` in :func:`read_parquet`, use ``dtype_backend`` instead (:issue:`51853`)
- Deprecated :meth:`Series.pad` in favor of :meth:`Series.ffill` (:issue:`33396`)
- Deprecated :meth:`Series.backfill` in favor of :meth:`Series.bfill` (:issue:`33396`)
- Deprecated :meth:`DataFrame.pad` in favor of :meth:`DataFrame.ffill` (:issue:`33396`)
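For the ``read_parquet`` deprecation added above, a migration sketch (assuming pandas >= 2.0 and a parquet engine such as pyarrow installed):

```python
import pandas as pd

pd.DataFrame({"a": [1.0, None]}).to_parquet("tmp.parquet")

# Before: pd.read_parquet("tmp.parquet", use_nullable_dtypes=True)  # now warns
# After:
df = pd.read_parquet("tmp.parquet", dtype_backend="numpy_nullable")
print(df.dtypes)  # a    Float64
```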
2 changes: 1 addition & 1 deletion pandas/_libs/parsers.pyi
@@ -72,5 +72,5 @@ class TextReader:
na_values: dict

def _maybe_upcast(
-arr, use_nullable_dtypes: bool = ..., dtype_backend: str = ...
+arr, use_dtype_backend: bool = ..., dtype_backend: str = ...
) -> np.ndarray: ...
28 changes: 12 additions & 16 deletions pandas/_libs/parsers.pyx
@@ -339,7 +339,6 @@ cdef class TextReader:
object index_col
object skiprows
object dtype
-bint use_nullable_dtypes
object usecols
set unnamed_cols # set[str]
str dtype_backend
@@ -379,8 +378,7 @@
float_precision=None,
bint skip_blank_lines=True,
encoding_errors=b"strict",
-use_nullable_dtypes=False,
-dtype_backend="pandas"):
+dtype_backend="numpy"):

# set encoding for native Python and C library
if isinstance(encoding_errors, str):
@@ -501,7 +499,6 @@
# - DtypeObj
# - dict[Any, DtypeObj]
self.dtype = dtype
-self.use_nullable_dtypes = use_nullable_dtypes
self.dtype_backend = dtype_backend

self.noconvert = set()
@@ -928,7 +925,6 @@
bint na_filter = 0
int64_t num_cols
dict results
-bint use_nullable_dtypes

start = self.parser_start

@@ -1049,12 +1045,12 @@
# don't try to upcast EAs
if (
na_count > 0 and not is_extension_array_dtype(col_dtype)
-or self.use_nullable_dtypes
+or self.dtype_backend != "numpy"
):
-use_nullable_dtypes = self.use_nullable_dtypes and col_dtype is None
+use_dtype_backend = self.dtype_backend != "numpy" and col_dtype is None
col_res = _maybe_upcast(
col_res,
-use_nullable_dtypes=use_nullable_dtypes,
+use_dtype_backend=use_dtype_backend,
dtype_backend=self.dtype_backend,
)

@@ -1389,11 +1385,11 @@ _NA_VALUES = _ensure_encoded(list(STR_NA_VALUES))


def _maybe_upcast(
-arr, use_nullable_dtypes: bool = False, dtype_backend: str = "pandas"
+arr, use_dtype_backend: bool = False, dtype_backend: str = "numpy"
):
"""Sets nullable dtypes or upcasts if nans are present.

-Upcast, if use_nullable_dtypes is false and nans are present so that the
+Upcast, if use_dtype_backend is false and nans are present so that the
current dtype can not hold the na value. We use nullable dtypes if the
flag is true for every array.

@@ -1402,7 +1398,7 @@ def _maybe_upcast(
arr: ndarray
Numpy array that is potentially being upcast.

-use_nullable_dtypes: bool, default False
+use_dtype_backend: bool, default False
If true, we cast to the associated nullable dtypes.

Returns
@@ -1419,7 +1415,7 @@
if issubclass(arr.dtype.type, np.integer):
mask = arr == na_value

-if use_nullable_dtypes:
+if use_dtype_backend:
arr = IntegerArray(arr, mask)
else:
arr = arr.astype(float)
@@ -1428,22 +1424,22 @@
elif arr.dtype == np.bool_:
mask = arr.view(np.uint8) == na_value

-if use_nullable_dtypes:
+if use_dtype_backend:
arr = BooleanArray(arr, mask)
else:
arr = arr.astype(object)
np.putmask(arr, mask, np.nan)

elif issubclass(arr.dtype.type, float) or arr.dtype.type == np.float32:
-if use_nullable_dtypes:
+if use_dtype_backend:
mask = np.isnan(arr)
arr = FloatingArray(arr, mask)

elif arr.dtype == np.object_:
-if use_nullable_dtypes:
+if use_dtype_backend:
arr = StringDtype().construct_array_type()._from_sequence(arr)

-if use_nullable_dtypes and dtype_backend == "pyarrow":
+if use_dtype_backend and dtype_backend == "pyarrow":
import pyarrow as pa
if isinstance(arr, IntegerArray) and arr.isna().all():
# use null instead of int64 in pyarrow
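A sketch of the upcast behavior ``_maybe_upcast`` implements, using the public constructors rather than the cython internals:

```python
import numpy as np
from pandas.arrays import IntegerArray

# With use_dtype_backend set, integer columns containing NAs become masked
# IntegerArrays; without it, they would be upcast to float64 with NaN.
values = np.array([1, 2, 0], dtype=np.int64)   # 0 fills the NA slot
mask = np.array([False, False, True])
print(IntegerArray(values, mask))  # [1, 2, <NA>], dtype: Int64
```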
1 change: 1 addition & 0 deletions pandas/_typing.py
@@ -377,3 +377,4 @@ def closed(self) -> bool:
Literal["pearson", "kendall", "spearman"], Callable[[np.ndarray, np.ndarray], float]
]
AlignJoin = Literal["outer", "inner", "left", "right"]
+DtypeBackend = Literal["pyarrow", "numpy_nullable"]
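A sketch of how the new alias is meant to be consumed in signatures (``read_something`` is hypothetical):

```python
from typing import Literal

DtypeBackend = Literal["pyarrow", "numpy_nullable"]

# Annotating the keyword with the shared alias lets type checkers reject
# anything outside the two supported strings.
def read_something(path: str, dtype_backend: DtypeBackend = "numpy_nullable") -> None:
    ...
```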
2 changes: 1 addition & 1 deletion pandas/conftest.py
@@ -1274,7 +1274,7 @@ def string_storage(request):

@pytest.fixture(
params=[
"pandas",
"numpy_nullable",
pytest.param("pyarrow", marks=td.skip_if_no("pyarrow")),
]
)
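For context, a sketch of how a test would consume this fixture (the fixture name sits below the fold here; assuming it is ``dtype_backend``, and the test itself is hypothetical):

```python
import pandas as pd

# pytest runs this once per param: "numpy_nullable", then "pyarrow"
# (the latter skipped when pyarrow is not installed).
def test_convert_dtypes_backend(dtype_backend):
    ser = pd.Series([1, None]).convert_dtypes(dtype_backend=dtype_backend)
    assert ser.isna().sum() == 1
```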
2 changes: 1 addition & 1 deletion pandas/core/arrays/numeric.py
@@ -285,7 +285,7 @@ def _from_sequence_of_strings(
) -> T:
from pandas.core.tools.numeric import to_numeric

-scalars = to_numeric(strings, errors="raise", use_nullable_dtypes=True)
+scalars = to_numeric(strings, errors="raise", dtype_backend="numpy_nullable")
return cls._from_sequence(scalars, dtype=dtype, copy=copy)

_HANDLED_TYPES = (np.ndarray, numbers.Number)
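Sketched with the public API, the changed call now returns a masked array (assuming pandas >= 2.0):

```python
import numpy as np
import pandas as pd

strings = np.array(["1.5", "2", None], dtype=object)
# dtype_backend="numpy_nullable" yields a FloatingArray that keeps <NA>,
# which _from_sequence can wrap without an object/float round trip.
print(pd.to_numeric(strings, errors="raise", dtype_backend="numpy_nullable"))
```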
28 changes: 0 additions & 28 deletions pandas/core/config_init.py
@@ -487,41 +487,13 @@ def use_inf_as_na_cb(key) -> None:
The default storage for StringDtype.
"""

dtype_backend_doc = """
: string
The nullable dtype implementation to return. Only applicable to certain
operations where documented. Available options: 'pandas', 'pyarrow',
the default is 'pandas'.
"""

with cf.config_prefix("mode"):
cf.register_option(
"string_storage",
"python",
string_storage_doc,
validator=is_one_of_factory(["python", "pyarrow"]),
)
-cf.register_option(
-    "dtype_backend",
-    "pandas",
-    dtype_backend_doc,
-    validator=is_one_of_factory(["pandas", "pyarrow"]),
-)


-nullable_dtypes_doc = """
-: bool
-    If nullable dtypes should be returned. This is only applicable to functions
-    where the ``use_nullable_dtypes`` keyword is implemented.
-"""
-
-with cf.config_prefix("mode"):
-    cf.register_option(
-        "nullable_dtypes",
-        False,
-        nullable_dtypes_doc,
-        validator=is_bool,
-    )


# Set up the io.excel specific reader configuration.
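Since both options are deleted outright here, looking them up should now fail; a quick check (assuming a build that includes this change):

```python
import pandas as pd
from pandas.errors import OptionError

for opt in ("mode.dtype_backend", "mode.nullable_dtypes"):
    try:
        pd.get_option(opt)
    except OptionError:
        print(f"{opt} no longer exists")
```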
6 changes: 3 additions & 3 deletions pandas/core/dtypes/cast.py
@@ -1007,7 +1007,7 @@ def convert_dtypes(
convert_boolean: bool = True,
convert_floating: bool = True,
infer_objects: bool = False,
dtype_backend: Literal["pandas", "pyarrow"] = "pandas",
dtype_backend: Literal["numpy_nullable", "pyarrow"] = "numpy_nullable",
) -> DtypeObj:
"""
Convert objects to best possible type, and optionally,
@@ -1029,10 +1029,10 @@
infer_objects : bool, defaults False
Whether to also infer objects to float/int if possible. Is only hit if the
object array contains pd.NA.
dtype_backend : str, default "pandas"
dtype_backend : str, default "numpy_nullable"
Nullable dtype implementation to use.

* "pandas" returns numpy-backed nullable types
* "numpy_nullable" returns numpy-backed nullable types
* "pyarrow" returns pyarrow-backed nullable types using ``ArrowDtype``

Returns
9 changes: 9 additions & 0 deletions pandas/core/generic.py
@@ -52,6 +52,7 @@
CompressionOptions,
Dtype,
DtypeArg,
+DtypeBackend,
DtypeObj,
FilePath,
FillnaOptions,
@@ -6547,6 +6548,7 @@ def convert_dtypes(
convert_integer: bool_t = True,
convert_boolean: bool_t = True,
convert_floating: bool_t = True,
+dtype_backend: DtypeBackend = "numpy_nullable",
) -> NDFrameT:
"""
Convert columns to the best possible dtypes using dtypes supporting ``pd.NA``.
Expand All @@ -6567,6 +6569,11 @@ def convert_dtypes(
dtypes if the floats can be faithfully casted to integers.

.. versionadded:: 1.2.0
dtype_backend : {"numpy_nullable", "pyarrow"}, default "numpy_nullable"
Which dtype_backend to use, e.g. whether a DataFrame should have nullable
extension dtypes or pyarrow dtypes.

.. versionadded:: 2.0

Returns
-------
@@ -6686,6 +6693,7 @@ def convert_dtypes(
convert_integer,
convert_boolean,
convert_floating,
+dtype_backend=dtype_backend,
)
else:
results = [
@@ -6695,6 +6703,7 @@
convert_integer,
convert_boolean,
convert_floating,
+dtype_backend=dtype_backend,
)
for col_name, col in self.items()
]
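End to end, the public method this wires up behaves like the following sketch (assuming pandas >= 2.0; the second call additionally needs pyarrow installed, and exact dtype strings may vary by version):

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, None], "y": ["a", "b", None]})
print(df.convert_dtypes().dtypes)                         # Int64, string (default backend)
print(df.convert_dtypes(dtype_backend="pyarrow").dtypes)  # ArrowDtype columns
```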