From 5334e6ab378bc7830f43da872a8320ab53d27a39 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?=
Date: Mon, 20 Jun 2022 17:26:12 +0200
Subject: [PATCH 1/4] DOC make inplace operation section in whats_new clearer

---
 doc/source/whatsnew/v1.5.0.rst | 39 +++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
index 76f6e864a174f..08da54429554f 100644
--- a/doc/source/whatsnew/v1.5.0.rst
+++ b/doc/source/whatsnew/v1.5.0.rst
@@ -565,11 +565,12 @@ retained by specifying ``group_keys=False``.
 .. _whatsnew_150.notable_bug_fixes.setitem_column_try_inplace:
    _ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace
 
-Try operating inplace when setting values with ``loc`` and ``iloc``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Towards consistently trying to operate inplace when setting values with ``loc`` and ``iloc``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Most of the time setting values with ``frame.iloc`` attempts to set values
-in-place, only falling back to inserting a new array if necessary. In the past,
-setting entire columns has been an exception to this rule:
+inplace, only falling back to inserting a new array if necessary. There are
+some edge cases where values are not set inplace, for example when setting an
+entire column from an array with different dtype:
 
 .. ipython:: python
 
@@ -577,25 +578,37 @@ setting entire columns has been an exception to this rule:
     df = pd.DataFrame(values)
     ser = df[0]
 
-*Old behavior*:
+*Old behavior with pandas < 1.5*:
 
 .. code-block:: ipython
 
-    In [3]: df.iloc[:, 0] = np.array([10, 11])
+    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
     In [4]: ser
     Out[4]:
     0    0
     1    2
     Name: 0, dtype: int64
 
-This behavior is deprecated. In a future version, setting an entire column with
-iloc will attempt to operate inplace.
+*Behavior with pandas 1.5* is the same but you get a ``FutureWarning``:
+
+.. code-block:: ipython
+
+    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
+    FutureWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
+    In [4]: ser
+    Out[4]:
+    0    0
+    1    2
+    Name: 0, dtype: int64
 
 *Future behavior*:
 
+In a future version, setting an entire column with ``iloc`` will attempt to
+operate inplace.
+
 .. code-block:: ipython
 
-    In [3]: df.iloc[:, 0] = np.array([10, 11])
+    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
     In [4]: ser
     Out[4]:
     0    10
@@ -604,11 +617,9 @@ iloc will attempt to operate inplace.
 
 To get the old behavior, use :meth:`DataFrame.__setitem__` directly:
 
-*Future behavior*:
-
 .. code-block:: ipython
 
-    In [5]: df[0] = np.array([21, 31])
+    In [5]: df[0] = np.array([21, 31], dtype=np.int32)
     In [4]: ser
     Out[4]:
     0    10
@@ -617,12 +628,10 @@ To get the old behavior, use :meth:`DataFrame.__setitem__` directly:
 
 In the case where ``df.columns`` is not unique, use :meth:`DataFrame.isetitem`:
 
-*Future behavior*:
-
 .. code-block:: ipython
 
     In [5]: df.columns = ["A", "A"]
-    In [5]: df.isetitem(0, np.array([21, 31]))
+    In [5]: df.isetitem(0, np.array([21, 31]), dtype=np.int32)
     In [4]: ser
     Out[4]:
     0    10

From fae74acf4cc6a61ab29a9a6f0335f396dc9aa178 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?=
Date: Tue, 21 Jun 2022 12:06:20 +0200
Subject: [PATCH 2/4] tweak

---
 doc/source/whatsnew/v1.5.0.rst | 35 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 22 deletions(-)

diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
index 08da54429554f..dd45657dbe2b2 100644
--- a/doc/source/whatsnew/v1.5.0.rst
+++ b/doc/source/whatsnew/v1.5.0.rst
@@ -562,15 +562,15 @@ As ``group_keys=True`` is the default value of :meth:`DataFrame.groupby` and
 raise a ``FutureWarning``. This can be silenced and the previous behavior
 retained by specifying ``group_keys=False``.
 
-.. _whatsnew_150.notable_bug_fixes.setitem_column_try_inplace:
+.. _whatsnew_150.deprecations.setitem_column_try_inplace:
    _ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace
 
-Towards consistently trying to operate inplace when setting values with ``loc`` and ``iloc``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Inplace operation when setting values with ``iloc``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Most of the time setting values with ``frame.iloc`` attempts to set values
 inplace, only falling back to inserting a new array if necessary. There are
-some edge cases where values are not set inplace, for example when setting an
-entire column from an array with different dtype:
+some cases where this rule is not followed, for example when setting an entire
+column from an array with different dtype:
 
 .. ipython:: python
 
@@ -578,7 +578,7 @@ entire column from an array with different dtype:
     df = pd.DataFrame(values)
     ser = df[0]
 
-*Old behavior with pandas < 1.5*:
+*Old behavior*:
 
 .. code-block:: ipython
 
@@ -589,23 +589,11 @@ entire column from an array with different dtype:
     1    2
     Name: 0, dtype: int64
 
-*Behavior with pandas 1.5* is the same but you get a ``FutureWarning``:
-
-.. code-block:: ipython
-
-    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
-    FutureWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
-    In [4]: ser
-    Out[4]:
-    0    0
-    1    2
-    Name: 0, dtype: int64
+This behavior is deprecated. In a future version, setting an entire column with
+iloc will attempt to operate inplace.
 
 *Future behavior*:
 
-In a future version, setting an entire column with ``iloc`` will attempt to
-operate inplace.
-
 .. code-block:: ipython
 
     In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
@@ -626,12 +614,15 @@ To get the old behavior, use :meth:`DataFrame.__setitem__` directly:
     1    11
     Name: 0, dtype: int64
 
-In the case where ``df.columns`` is not unique, use :meth:`DataFrame.isetitem`:
+In the case where ``df.columns`` is not unique, :meth:`DataFrame.isetitem` has
+been added in pandas 1.5:
+
+*New behavior*
 
 .. code-block:: ipython
 
     In [5]: df.columns = ["A", "A"]
-    In [5]: df.isetitem(0, np.array([21, 31]), dtype=np.int32)
+    In [5]: df.isetitem(0, np.array([21, 31], dtype=np.int32))
     In [4]: ser
     Out[4]:
     0    10

From 23cca416fa804bcd906850ebf538ec670e7f2988 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?=
Date: Wed, 22 Jun 2022 11:20:40 +0200
Subject: [PATCH 3/4] Can happen with both loc and iloc when setting entire column

---
 doc/source/whatsnew/v1.5.0.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
index dd45657dbe2b2..6276a8fd5b2cb 100644
--- a/doc/source/whatsnew/v1.5.0.rst
+++ b/doc/source/whatsnew/v1.5.0.rst
@@ -565,8 +565,8 @@ retained by specifying ``group_keys=False``.
 .. _whatsnew_150.deprecations.setitem_column_try_inplace:
    _ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace
 
-Inplace operation when setting values with ``iloc``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Inplace operation when setting values with ``loc`` and ``iloc``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Most of the time setting values with ``frame.iloc`` attempts to set values
 inplace, only falling back to inserting a new array if necessary. There are
 some cases where this rule is not followed, for example when setting an entire

From d1893954ceb95ae153f6193f924571d4abc22737 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Lo=C3=AFc=20Est=C3=A8ve?=
Date: Wed, 22 Jun 2022 17:38:41 +0200
Subject: [PATCH 4/4] Use more meaningful data

---
 doc/source/whatsnew/v1.5.0.rst | 79 +++++++++++++++++++++-------------
 1 file changed, 49 insertions(+), 30 deletions(-)

diff --git a/doc/source/whatsnew/v1.5.0.rst b/doc/source/whatsnew/v1.5.0.rst
index 6276a8fd5b2cb..1af8d0d08cf31 100644
--- a/doc/source/whatsnew/v1.5.0.rst
+++ b/doc/source/whatsnew/v1.5.0.rst
@@ -574,20 +574,25 @@ column from an array with different dtype:
 
 .. ipython:: python
 
-    values = np.arange(4).reshape(2, 2)
-    df = pd.DataFrame(values)
-    ser = df[0]
+    df = pd.DataFrame({'price': [11.1, 12.2]}, index=['book1', 'book2'])
+    original_prices = df['price']
+    new_prices = np.array([98, 99])
 
 *Old behavior*:
 
 .. code-block:: ipython
 
-    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
-    In [4]: ser
+    In [3]: df.iloc[:, 0] = new_prices
+    In [4]: df.iloc[:, 0]
     Out[4]:
-    0    0
-    1    2
-    Name: 0, dtype: int64
+    book1    98
+    book2    99
+    Name: price, dtype: int64
+    In [5]: original_prices
+    Out[5]:
+    book1    11.1
+    book2    12.2
+    Name: price, dtype: float64
 
 This behavior is deprecated. In a future version, setting an entire column with
 iloc will attempt to operate inplace.
@@ -596,38 +601,52 @@ iloc will attempt to operate inplace.
 
 .. code-block:: ipython
 
-    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
-    In [4]: ser
+    In [3]: df.iloc[:, 0] = new_prices
+    In [4]: df.iloc[:, 0]
     Out[4]:
-    0    10
-    1    11
-    Name: 0, dtype: int64
+    book1    98.0
+    book2    99.0
+    Name: price, dtype: float64
+    In [5]: original_prices
+    Out[5]:
+    book1    98.0
+    book2    99.0
+    Name: price, dtype: float64
 
 To get the old behavior, use :meth:`DataFrame.__setitem__` directly:
 
 .. code-block:: ipython
 
-    In [5]: df[0] = np.array([21, 31], dtype=np.int32)
-    In [4]: ser
-    Out[4]:
-    0    10
-    1    11
-    Name: 0, dtype: int64
-
-In the case where ``df.columns`` is not unique, :meth:`DataFrame.isetitem` has
-been added in pandas 1.5:
-
-*New behavior*
+    In [3]: df[df.columns[0]] = new_prices
+    In [4]: df.iloc[:, 0]
+    Out[4]:
+    book1    98
+    book2    99
+    Name: price, dtype: int64
+    In [5]: original_prices
+    Out[5]:
+    book1    11.1
+    book2    12.2
+    Name: price, dtype: float64
+
+To get the old behavior when ``df.columns`` is not unique and you want to
+change a single column by index, you can use :meth:`DataFrame.isetitem`, which
+has been added in pandas 1.5:
 
 .. code-block:: ipython
 
-    In [5]: df.columns = ["A", "A"]
-    In [5]: df.isetitem(0, np.array([21, 31], dtype=np.int32))
-    In [4]: ser
+    In [3]: df_with_duplicated_cols = pd.concat([df, df], axis='columns')
+    In [3]: df_with_duplicated_cols.isetitem(0, new_prices)
+    In [4]: df_with_duplicated_cols.iloc[:, 0]
     Out[4]:
-    0    10
-    1    11
-    Name: 0, dtype: int64
+    book1    98
+    book2    99
+    Name: price, dtype: int64
+    In [5]: original_prices
+    Out[5]:
+    book1    11.1
+    book2    12.2
+    Name: price, dtype: float64
 
 .. _whatsnew_150.deprecations.numeric_only_default:
 
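
The two workarounds documented in the final patch can be checked with a short script. This is only a sketch and not part of the patch series: it assumes pandas 1.5 or later (where `DataFrame.isetitem` exists), and the name `dup` and the replacement values `[1, 2]` are made up for illustration. It deliberately avoids asserting what `df.iloc[:, 0] = ...` does, since that is the behavior the deprecation is changing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [11.1, 12.2]}, index=["book1", "book2"])
original_prices = df["price"]  # Series extracted before any assignment
new_prices = np.array([98, 99])

# __setitem__ swaps in a new column rather than writing into the old one,
# so the previously extracted Series keeps its original values.
df[df.columns[0]] = new_prices
print(original_prices.tolist())
print(df["price"].tolist())

# isetitem does the same by position, which still works when column
# labels are duplicated and label-based assignment would be ambiguous.
dup = pd.concat([df, df], axis="columns")  # hypothetical frame with two "price" columns
dup.isetitem(0, np.array([1, 2]))
print(dup.iloc[:, 0].tolist())
print(dup.iloc[:, 1].tolist())
```

Only the column at position 0 of `dup` is replaced; the second `price` column and `original_prices` should be left as they were.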