DOC: Start migration guide for Copy-on-Write #56298

Merged
merged 11 commits into from
Dec 14, 2023

Member

@phofl phofl commented Dec 2, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Member

@jorisvandenbossche jorisvandenbossche left a comment

Nice start! Some quick comments

Would it be worth also explicitly calling out the df["col"].fillna(.., inplace=True) style of chained method call, which will no longer work? (Or adding it as a second example under "Only one pandas object is updated at once".)
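A minimal sketch of the chained in-place pattern mentioned above, assuming pandas 2.x with Copy-on-Write switched on through the `mode.copy_on_write` option. The column values and the helper variable names are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: pandas 2.x, where Copy-on-Write is opt-in via this option.
pd.options.mode.copy_on_write = True

df = pd.DataFrame({"col": [1.0, np.nan, 3.0]})

# Chained in-place call: df["col"] is a temporary Series, so the fill is
# applied to that temporary object and never reaches df. Recent pandas
# versions also emit a warning on this line.
df["col"].fillna(0.0, inplace=True)
n_missing_after_chained = int(df["col"].isna().sum())

# One possible rewrite: operate on the DataFrame itself and reassign.
df = df.fillna({"col": 0.0})
n_missing_after_fix = int(df["col"].isna().sum())

print(n_missing_after_chained, n_missing_after_fix)
```

Assigning the result back with `df["col"] = df["col"].fillna(0.0)` is another way to express the same update.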

Comment on lines 52 to +56
The following sections will explain what this means and how it impacts existing
applications.

Migrating to Copy-on-Write
--------------------------
Member

I wonder if we should move the migration guide further down in the file: right now this section assumes some knowledge of what CoW is, but that is only explained in the "Description" section, which now comes afterwards.

Member Author

I had similar thoughts, but I wanted to place it as prominently as possible.

Creating a copy of this array allows modification. You can also make the array
writeable again if you don't care about the pandas object anymore.

**Only one pandas object is updated at once**
Member

The code example you use is already a practical example of it, but I wonder if we should explicitly call out "modifying a column as a Series no longer works"? I think this will be one of the main use cases where the user currently intends the modification to propagate (with DataFrame row slices, I think that is much less often the case).

Member Author

Yep added
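For readers following this thread, a small sketch of what "modifying a column as a Series no longer works" can look like, assuming pandas 2.x with Copy-on-Write enabled. The frame contents and variable names are hypothetical, and this is not necessarily the example that was added to the guide.

```python
import pandas as pd

# Assumption: pandas 2.x, where Copy-on-Write is opt-in via this option.
pd.options.mode.copy_on_write = True

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Selecting a column returns a Series that shares data with df lazily.
ser = df["a"]

# Writing to the Series triggers a copy, so only ser is updated.
ser.iloc[0] = 100
parent_value = int(df.loc[0, "a"])  # df keeps its original value
series_value = int(ser.iloc[0])

# To update the DataFrame, modify it directly in a single step.
df.loc[0, "a"] = 100
updated_value = int(df.loc[0, "a"])

print(parent_value, series_value, updated_value)
```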

Comment on lines 120 to 121
See the section about :ref:`read-only NumPy arrays <copy_on_write_read_only_na>`
for more details.
Member

This "see also" probably belongs in the section above "Accessing the underlying array of a pandas object will return a read-only view"

Member Author

yeah good point
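The read-only behaviour discussed in this thread can be sketched roughly as follows, assuming pandas 2.x with Copy-on-Write enabled. The Series contents and variable names are hypothetical.

```python
import pandas as pd

# Assumption: pandas 2.x, where Copy-on-Write is opt-in via this option.
pd.options.mode.copy_on_write = True

ser = pd.Series([1, 2, 3])

# Under Copy-on-Write, the exposed array is a read-only view of ser's data.
arr = ser.to_numpy()
is_writeable = bool(arr.flags.writeable)

try:
    arr[0] = 100
    write_failed = False
except ValueError:
    # NumPy refuses to assign into a read-only array.
    write_failed = True

# Creating a copy of the array allows modification without touching ser.
arr_copy = arr.copy()
arr_copy[0] = 100

print(is_writeable, write_failed, arr_copy.tolist(), ser.tolist())
```

Setting `arr.flags.writeable = True` is the other route the quoted text mentions; it writes through to the data shared with the pandas object, so it only fits when that object is no longer needed.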

@phofl phofl added this to the 2.2 milestone Dec 8, 2023
Member Author

phofl commented Dec 8, 2023

I think we should link from the FutureWarning to this section as well, but that can be a follow-up.

@phofl phofl merged commit f1fae79 into pandas-dev:main Dec 14, 2023
@phofl phofl deleted the cow_migration_guide branch December 14, 2023 22:54