DOC clarify inplace operation section in 1.5 whats_new #47433

Merged
merged 5 commits into from
Jun 30, 2022
39 changes: 24 additions & 15 deletions doc/source/whatsnew/v1.5.0.rst
@@ -565,37 +565,50 @@ retained by specifying ``group_keys=False``.
.. _whatsnew_150.notable_bug_fixes.setitem_column_try_inplace:
_ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace
@lesteve (Contributor Author), Jun 21, 2022:

side-comment: the ``_ see also ...`` line is not rendered in the HTML. I am not sure whether it was supposed to be a link to https://pandas.pydata.org/docs/dev/whatsnew/v1.3.0.html#try-operating-inplace-when-setting-values-with-loc-and-iloc

Member:

Yeah, it looks like it's supposed to reference that section. Mind correcting it? It can go in the body of the paragraph too, like: Please reference `this section <whatsnew_130.notable_bug_fixes.setitem_column_try_inplace>` of the 1.3 whatsnew file.


-Try operating inplace when setting values with ``loc`` and ``iloc``
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Towards consistently trying to operate inplace when setting values with ``loc`` and ``iloc``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Most of the time setting values with ``frame.iloc`` attempts to set values
-in-place, only falling back to inserting a new array if necessary. In the past,
-setting entire columns has been an exception to this rule:
+inplace, only falling back to inserting a new array if necessary. There are
+some edge cases where values are not set inplace, for example when setting an
+entire column from an array with different dtype:

.. ipython:: python

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]

-*Old behavior*:
+*Old behavior with pandas < 1.5*:

.. code-block:: ipython

-In [3]: df.iloc[:, 0] = np.array([10, 11])
+In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)

Contributor:

why did you change this?

@lesteve (Contributor Author), Jun 21, 2022:

When the dtypes match, the old (pandas < 1.5) behaviour is already to update the underlying array in place.

import numpy as np
import pandas as pd

print(f"{pd.__version__=}")
values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]
print(f"before setting column\n{ser}")
df.iloc[:, 0] = np.array([10, 11])
print(f"after setting column\n{ser}")

Output:

pd.__version__='1.4.2'
before setting column
0    0
1    2
Name: 0, dtype: int64
after setting column
0    10
1    11
Name: 0, dtype: int64

The original whatsnew snippet was incorrect in claiming that ``ser`` was not updated in place.

I used a different dtype for the right-hand side of the assignment so that the example falls in the case where:

  • the old (pandas < 1.5) behaviour is not to update in place
  • the new (pandas 1.5) behaviour is the same with an additional warning
  • the future behaviour will be to update in place
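
To check which of the three cases above a given installation falls in, a small probe along these lines may help. This is only a sketch: `probe_iloc_column_set` is a hypothetical helper name (not part of pandas), the explicit int64 starting dtype is an assumption made for reproducibility, and the printed result depends on the installed pandas version and its copy semantics.

```python
import warnings

import numpy as np
import pandas as pd


def probe_iloc_column_set(rhs_dtype):
    """Hypothetical helper: set column 0 via iloc and report what happened."""
    df = pd.DataFrame(np.arange(4, dtype=np.int64).reshape(2, 2))
    ser = df[0]  # taken before the assignment, as in the whatsnew example
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, including repeats
        df.iloc[:, 0] = np.array([10, 11], dtype=rhs_dtype)
    return {
        # True if pandas emitted a FutureWarning during the assignment
        "future_warning": any(issubclass(w.category, FutureWarning) for w in caught),
        # True if the previously extracted Series now shows the new values
        "ser_updated": ser.tolist() == [10, 11],
        # The frame itself always ends up holding the new values
        "frame_column": df[0].tolist(),
    }


result = probe_iloc_column_set(np.int32)
print(pd.__version__, result)
```

Only `frame_column` is expected to be the same everywhere. The other two fields are what distinguish the old, 1.5, and future behaviours described in the list.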

In [4]: ser
Out[4]:
0 0
1 2
Name: 0, dtype: int64

-This behavior is deprecated. In a future version, setting an entire column with
-iloc will attempt to operate inplace.
+*Behavior with pandas 1.5* is the same but you get a ``FutureWarning``:

Contributor:

Leave the original note; this is not in line with our styling.


+.. code-block:: ipython

+In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
+FutureWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
+In [4]: ser
+Out[4]:
+0 0
+1 2
+Name: 0, dtype: int64

*Future behavior*:

Member:

Maybe worth adding ...an entire column with different type with iloc... to the comment above?

+In a future version, setting an entire column with ``iloc`` will attempt to

Contributor:

this is redundant

+operate inplace.

.. code-block:: ipython

-In [3]: df.iloc[:, 0] = np.array([10, 11])
+In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
In [4]: ser
Out[4]:
0 10
@@ -604,11 +617,9 @@ iloc will attempt to operate inplace.

To get the old behavior, use :meth:`DataFrame.__setitem__` directly:

-*Future behavior*:

.. code-block:: ipython

-In [5]: df[0] = np.array([21, 31])
+In [5]: df[0] = np.array([21, 31], dtype=np.int32)
In [4]: ser
Out[4]:
0 10
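
As a minimal standalone sketch of the ``__setitem__`` route described above: assigning with ``df[0] = ...`` places a new array in the frame, so a Series extracted beforehand keeps its old values. The fresh frame and the explicit int64 starting dtype are assumptions made here for reproducibility. They are not part of the whatsnew text.

```python
import numpy as np
import pandas as pd

# Fresh frame, independent of the running whatsnew example.
df = pd.DataFrame(np.arange(4, dtype=np.int64).reshape(2, 2))
ser = df[0]  # extracted before the assignment

# Column assignment through __setitem__ inserts a new array.
df[0] = np.array([21, 31], dtype=np.int32)

print(ser.tolist())    # values held by the previously extracted Series
print(df[0].tolist())  # values now stored in the frame
```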
@@ -617,12 +628,10 @@ To get the old behavior, use :meth:`DataFrame.__setitem__` directly:

In the case where ``df.columns`` is not unique, use :meth:`DataFrame.isetitem`:

-*Future behavior*:

.. code-block:: ipython

In [5]: df.columns = ["A", "A"]
-In [5]: df.isetitem(0, np.array([21, 31]))
+In [5]: df.isetitem(0, np.array([21, 31], dtype=np.int32))
In [4]: ser
Out[4]:
0 10
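
For the non-unique column case, a short standalone sketch may make the ``isetitem`` call easier to follow. It assumes pandas 1.5 or later (where :meth:`DataFrame.isetitem` is available) and builds its own small frame rather than continuing the example above.

```python
import numpy as np
import pandas as pd

# Two columns sharing the label "A", so df["A"] = ... would be ambiguous.
df = pd.DataFrame(
    np.arange(4, dtype=np.int64).reshape(2, 2), columns=["A", "A"]
)
ser = df.iloc[:, 0]  # first column, extracted before the assignment

# Positional column assignment: the dtype belongs to np.array, not to isetitem.
df.isetitem(0, np.array([21, 31], dtype=np.int32))

print(ser.tolist())             # previously extracted first column
print(df.iloc[:, 0].tolist())   # first column after isetitem
print(df.iloc[:, 1].tolist())   # second column, untouched
```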