DOC: Add Copy on write whatsnew #50470

Merged: 4 commits, Jan 3, 2023

46 changes: 46 additions & 0 deletions doc/source/whatsnew/v1.5.0.rst
@@ -290,6 +290,52 @@ and attributes without holding entire tree in memory (:issue:`45442`).
.. _`lxml's iterparse`: https://lxml.de/3.2/parsing.html#iterparse-and-iterwalk
.. _`etree's iterparse`: https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse

.. _whatsnew_150.enhancements.copy_on_write:

Copy on Write
^^^^^^^^^^^^^

A new option, ``mode.copy_on_write``, was added (:issue:`46958`). With copy on write
enabled, any :class:`DataFrame` or :class:`Series` derived from another object always
behaves as a copy. As a consequence, a method can only modify the object it is
applied to, never another object that shares data with it.

Copy on write can be enabled through:

.. code-block:: python

    pd.set_option("mode.copy_on_write", True)
    # or, equivalently
    pd.options.mode.copy_on_write = True
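
As a rough sketch (not part of the original release note), the global setting can be
read back with ``pd.get_option`` and restored afterwards. The variable names
``previous`` and ``enabled`` are illustrative only:

```python
import pandas as pd

# Remember the current setting so it can be restored later.
previous = pd.get_option("mode.copy_on_write")

# Enable copy on write globally and read the option back.
pd.set_option("mode.copy_on_write", True)
enabled = pd.get_option("mode.copy_on_write")

# Restore the earlier setting so the rest of the session is unaffected.
pd.set_option("mode.copy_on_write", previous)
```

Restoring the previous value keeps the experiment from leaking into code that runs later in the same session.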

Alternatively, copy on write can be enabled locally through:

.. code-block:: python

    with pd.option_context("mode.copy_on_write", True):
Comment on lines +310 to +314

Member:

I am not sure we should encourage this usage by explicitly mentioning it. It's a general option, so if you are familiar with the option functions, you can always use it this way through the existing option_context.

This is tricky because it only works if the data you are using is created within that block. Once you only put this around an operation, with data created before the block, things might go sideways.

(That's probably a limitation of setting the option that we should document anyway.)

Member Author:

I'd keep it for now, since support on the 1.5.x branch is limited. It's more or less just to try it out. But agree we should maybe document this more clearly.

        ...
Member:

Could you add a small example that shows the effect of copy on write?

Member Author:

Done


Without copy on write, modifying a child :class:`DataFrame` or :class:`Series` that was
derived from a parent :class:`DataFrame` also updates the parent.

.. ipython:: python

    df = pd.DataFrame({"foo": [1, 2, 3], "bar": 1})
    view = df["foo"]
    view.iloc[0] = 100  # modify the child; the value 100 is illustrative
    df

With copy on write enabled, ``df`` is no longer updated:

.. ipython:: python

    with pd.option_context("mode.copy_on_write", True):
        df = pd.DataFrame({"foo": [1, 2, 3], "bar": 1})
        view = df["foo"]
        view.iloc[0] = 100  # modify the child; the value 100 is illustrative
        df
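
The effect can also be checked in a standalone script. This is a minimal sketch rather
than part of the original release note. The names ``child``, ``parent_value`` and
``child_value`` and the value ``100`` are illustrative, and the data is deliberately
created inside the ``option_context`` block:

```python
import pandas as pd

with pd.option_context("mode.copy_on_write", True):
    # Create the data inside the block so the option applies from the start.
    df = pd.DataFrame({"foo": [1, 2, 3], "bar": 1})
    child = df["foo"]    # object derived from df
    child.iloc[0] = 100  # write to the child only

    # Read both values back while the option is still active.
    parent_value = int(df.loc[0, "foo"])
    child_value = int(child.iloc[0])

print(parent_value, child_value)
```

With copy on write active, the write reaches only ``child``, so the parent keeps its original value.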

A more detailed explanation can be found `here <https://phofl.github.io/cow-introduction.html>`_.

.. _whatsnew_150.enhancements.other:

Other enhancements