From 61de055fe32ca850975070eaa9a3d5efebbe2e52 Mon Sep 17 00:00:00 2001
From: Georgios Malandrakis
Date: Fri, 5 Apr 2024 18:05:21 +0000
Subject: [PATCH 1/5] fix 49201

---
 doc/source/user_guide/boolean.rst    | 12 ++++++++++++
 doc/source/user_guide/integer_na.rst | 12 ++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/doc/source/user_guide/boolean.rst b/doc/source/user_guide/boolean.rst
index 3c361d4de17e5..90cc8d637779d 100644
--- a/doc/source/user_guide/boolean.rst
+++ b/doc/source/user_guide/boolean.rst
@@ -37,6 +37,18 @@ If you would prefer to keep the ``NA`` values you can manually fill them with ``
 
     s[mask.fillna(True)]
 
+If you create a column of ``NA`` values (for example to fill them later)
+with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
+new column. The performance on this column will be worse than with
+the appropriate type. It's better to use
+``df['new_col'] = pd.Series(pd.NA, dtype=Int64)``
+(or another ``dtype`` as you desire).
+
+.. ipython:: python
+    df = pd.DataFrame()
+    df['objects'] = pd.NA
+    df.dtypes
+
 .. _boolean.kleene:
 
 Kleene logical operations
diff --git a/doc/source/user_guide/integer_na.rst b/doc/source/user_guide/integer_na.rst
index 1a727cd78af09..7c72af587b37f 100644
--- a/doc/source/user_guide/integer_na.rst
+++ b/doc/source/user_guide/integer_na.rst
@@ -84,6 +84,18 @@ with the dtype.
 In the future, we may provide an option for :class:`Series` to infer a
 nullable-integer dtype.
 
+If you create a column of ``NA`` values (for example to fill them later)
+with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
+new column. The performance on this column will be worse than with
+the appropriate type. It's better to use
+``df['new_col'] = pd.Series(pd.NA, dtype=Int64)``
+(or another ``dtype`` as you desire).
+
+.. ipython:: python
+    df = pd.DataFrame()
+    df['objects'] = pd.NA
+    df.dtypes
+
 Operations
 ----------
 

From 103f4e4239aad2cb3aee6764078a6a1bb78038f6 Mon Sep 17 00:00:00 2001
From: Georgios Malandrakis <93475472+giormala@users.noreply.github.com>
Date: Mon, 8 Apr 2024 17:45:20 +0300
Subject: [PATCH 2/5] Update boolean.rst

---
 doc/source/user_guide/boolean.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/source/user_guide/boolean.rst b/doc/source/user_guide/boolean.rst
index 90cc8d637779d..65f609d6fa365 100644
--- a/doc/source/user_guide/boolean.rst
+++ b/doc/source/user_guide/boolean.rst
@@ -41,8 +41,8 @@ If you create a column of ``NA`` values (for example to fill them later)
 with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
 new column. The performance on this column will be worse than with
 the appropriate type. It's better to use
-``df['new_col'] = pd.Series(pd.NA, dtype=Int64)``
-(or another ``dtype`` as you desire).
+``df['new_col'] = pd.Series(pd.NA, dtype='Int64')``
+(or another ``dtype`` that supports ``NA``).
 
 .. ipython:: python
     df = pd.DataFrame()

From 4de348d44cf976f9f248286db53ececc755af245 Mon Sep 17 00:00:00 2001
From: Georgios Malandrakis <93475472+giormala@users.noreply.github.com>
Date: Mon, 8 Apr 2024 17:46:38 +0300
Subject: [PATCH 3/5] Update integer_na.rst

---
 doc/source/user_guide/integer_na.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/source/user_guide/integer_na.rst b/doc/source/user_guide/integer_na.rst
index 7c72af587b37f..98700154daacb 100644
--- a/doc/source/user_guide/integer_na.rst
+++ b/doc/source/user_guide/integer_na.rst
@@ -88,8 +88,8 @@ If you create a column of ``NA`` values (for example to fill them later)
 with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
 new column. The performance on this column will be worse than with
 the appropriate type. It's better to use
-``df['new_col'] = pd.Series(pd.NA, dtype=Int64)``
-(or another ``dtype`` as you desire).
+``df['new_col'] = pd.Series(pd.NA, dtype='Int64')``
+(or another ``dtype`` that supports ``NA``).
 
 .. ipython:: python
     df = pd.DataFrame()

From 248c8504c087440eb2a0c3167be0411c99422a46 Mon Sep 17 00:00:00 2001
From: Georgios Malandrakis <93475472+giormala@users.noreply.github.com>
Date: Thu, 9 May 2024 18:59:46 +0300
Subject: [PATCH 4/5] Update boolean.rst

---
 doc/source/user_guide/boolean.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/source/user_guide/boolean.rst b/doc/source/user_guide/boolean.rst
index 65f609d6fa365..7de0430123fd2 100644
--- a/doc/source/user_guide/boolean.rst
+++ b/doc/source/user_guide/boolean.rst
@@ -41,10 +41,11 @@ If you create a column of ``NA`` values (for example to fill them later)
 with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
 new column. The performance on this column will be worse than with
 the appropriate type. It's better to use
-``df['new_col'] = pd.Series(pd.NA, dtype='Int64')``
+``df['new_col'] = pd.Series(pd.NA, dtype="boolean")``
 (or another ``dtype`` that supports ``NA``).
 
 .. ipython:: python
+
     df = pd.DataFrame()
     df['objects'] = pd.NA
     df.dtypes

From 1c3636ce2b9bab8791e260c2e47509d5d2a691e7 Mon Sep 17 00:00:00 2001
From: Georgios Malandrakis <93475472+giormala@users.noreply.github.com>
Date: Thu, 9 May 2024 19:00:25 +0300
Subject: [PATCH 5/5] Update integer_na.rst

---
 doc/source/user_guide/integer_na.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/source/user_guide/integer_na.rst b/doc/source/user_guide/integer_na.rst
index 98700154daacb..76a2f22b7987d 100644
--- a/doc/source/user_guide/integer_na.rst
+++ b/doc/source/user_guide/integer_na.rst
@@ -88,10 +88,11 @@ If you create a column of ``NA`` values (for example to fill them later)
 with ``df['new_col'] = pd.NA``, the ``dtype`` would be set to ``object`` in the
 new column. The performance on this column will be worse than with
 the appropriate type. It's better to use
-``df['new_col'] = pd.Series(pd.NA, dtype='Int64')``
+``df['new_col'] = pd.Series(pd.NA, dtype="Int64")``
 (or another ``dtype`` that supports ``NA``).
 
 .. ipython:: python
+
     df = pd.DataFrame()
     df['objects'] = pd.NA
     df.dtypes
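
Reviewer note: the added paragraph recommends an explicit nullable dtype, but the accompanying ``ipython`` block only demonstrates the ``object`` case. A minimal sketch of the typed variant follows, as a possible extension of the example rather than part of this patch. The frame and the column names (``a``, ``as_object``, ``as_int``, ``as_bool``) are hypothetical, and it uses ``pd.array`` with an explicit length instead of the ``pd.Series(pd.NA, dtype=...)`` spelling from the patch, so that the result does not depend on index alignment.

```python
import pandas as pd

# Hypothetical frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3]})

# Bare scalar assignment: the patch text says this column ends up as ``object``.
df["as_object"] = pd.NA

# Explicit nullable dtypes: one NA per row, typed from the start.
df["as_int"] = pd.array([pd.NA] * len(df), dtype="Int64")
df["as_bool"] = pd.array([pd.NA] * len(df), dtype="boolean")

print(df.dtypes)
```

Both typed columns contain only ``NA`` but keep their nullable dtype, so values filled in later do not need a cast away from ``object``.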