ENH / CoW: Add lazy copy to eval #53746

Merged 6 commits on Jun 29, 2023
1 change: 1 addition & 0 deletions doc/source/user_guide/copy_on_write.rst
@@ -211,6 +211,7 @@ following methods:
 - :meth:`DataFrame.astype` / :meth:`Series.astype`
 - :meth:`DataFrame.convert_dtypes` / :meth:`Series.convert_dtypes`
 - :meth:`DataFrame.join`
+- :meth:`DataFrame.eval`
 - :func:`concat`
 - :func:`merge`

1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -25,6 +25,7 @@ Copy-on-Write improvements
 - The :class:`DataFrame` constructor, when constructing a DataFrame from a dictionary
   of Index objects and specifying ``copy=False``, will now use a lazy copy
   of those Index objects for the columns of the DataFrame (:issue:`52947`)
+- Add lazy copy mechanism to :meth:`DataFrame.eval` (:issue:`53746`)

.. _whatsnew_210.enhancements.enhancement2:

8 changes: 7 additions & 1 deletion pandas/core/computation/eval.py
@@ -7,6 +7,8 @@
 from typing import TYPE_CHECKING
 import warnings

+from pandas._config import using_copy_on_write
+
 from pandas.util._exceptions import find_stack_level
 from pandas.util._validators import validate_bool_kwarg

@@ -373,7 +375,11 @@ def eval(
             # if returning a copy, copy only on the first assignment
             if not inplace and first_expr:
                 try:
-                    target = env.target.copy()
+                    target = env.target
+                    if isinstance(target, NDFrame):
+                        target = target.copy(deep=not using_copy_on_write())
Member commented:

You could also do `target.copy(deep=None)` here, which will have the same effect (without having to import `using_copy_on_write` here). I think this is how we did it in most places where a method called `self.copy()`.

Member Author replied:

Forgot about that, thx

+                    else:
+                        target = target.copy()
                 except AttributeError as err:
                     raise ValueError("Cannot return a copy of the target") from err
             else:
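For readers who want to see the user-facing effect of the change to `eval.py`, here is a minimal sketch. It assumes pandas and NumPy are installed; the helper name `eval_and_check_sharing` is hypothetical and not part of pandas. Whether the result shares memory with the original depends on the pandas version and on whether Copy-on-Write is enabled, so the sketch only reports that flag. The property that holds either way is that writing to the result leaves the original frame untouched.

```python
import numpy as np
import pandas as pd


def eval_and_check_sharing(df: pd.DataFrame, expr: str, column: str):
    """Hypothetical helper: run DataFrame.eval and report buffer sharing.

    Returns the new frame plus a flag telling whether `column` in the
    result still shares memory with `column` in the original frame.
    """
    result = df.eval(expr)  # inplace=False by default, so a new frame is returned
    shared = np.shares_memory(df[column].to_numpy(), result[column].to_numpy())
    return result, shared


df = pd.DataFrame({"a": [1, 2, 3], "b": 1})
df_orig = df.copy()

result, shared = eval_and_check_sharing(df, "c = a + b", "a")
# Expected to be True only when the lazy copy (Copy-on-Write) path is active.
print("column 'a' shares memory with the original:", shared)

# Writing to the result must not leak into the original, lazy copy or not.
result.iloc[0, 0] = 100
print("original unchanged:", df.equals(df_orig))
```

With a lazy copy, the actual copy of the shared data is deferred until the write on `result`, which is what makes `eval` cheaper when the result is never modified.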
27 changes: 27 additions & 0 deletions pandas/tests/copy_view/test_methods.py
@@ -1850,3 +1850,30 @@ def test_insert_series(using_copy_on_write):

     df.iloc[0, 1] = 100
     tm.assert_series_equal(ser, ser_orig)
+
+
+def test_eval(using_copy_on_write):
+    df = DataFrame({"a": [1, 2, 3], "b": 1})
+    df_orig = df.copy()
+
+    result = df.eval("c = a+b")
+    if using_copy_on_write:
+        assert np.shares_memory(get_array(df, "a"), get_array(result, "a"))
+    else:
+        assert not np.shares_memory(get_array(df, "a"), get_array(result, "a"))
+
+    result.iloc[0, 0] = 100
+    tm.assert_frame_equal(df, df_orig)
+
+
+def test_eval_inplace(using_copy_on_write):
+    df = DataFrame({"a": [1, 2, 3], "b": 1})
+    df_orig = df.copy()
+    df_view = df[:]
+
+    df.eval("c = a+b", inplace=True)
+    assert np.shares_memory(get_array(df, "a"), get_array(df_view, "a"))
+
+    df.iloc[0, 0] = 100
+    if using_copy_on_write:
+        tm.assert_frame_equal(df_view, df_orig)
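The pattern these tests assert (share memory after the call, then diverge on the first write) can be illustrated with a toy model in plain Python. This is only a sketch of the general copy-on-write idea, not how pandas implements it; the class names `SharedBuffer` and `LazyColumn` are hypothetical.

```python
class SharedBuffer:
    """Toy storage: the values plus a count of owners referencing them."""

    def __init__(self, values):
        self.values = list(values)
        self.refs = 1


class LazyColumn:
    """Toy column whose copies share a buffer until one of them is written."""

    def __init__(self, values=(), _buffer=None):
        self._buffer = _buffer if _buffer is not None else SharedBuffer(values)

    def lazy_copy(self):
        # No data is copied here; only the owner count is bumped.
        self._buffer.refs += 1
        return LazyColumn(_buffer=self._buffer)

    def shares_memory(self, other):
        return self._buffer is other._buffer

    def __setitem__(self, index, value):
        if self._buffer.refs > 1:
            # Another owner exists: detach onto a private copy before writing.
            self._buffer.refs -= 1
            self._buffer = SharedBuffer(self._buffer.values)
        self._buffer.values[index] = value

    def to_list(self):
        return list(self._buffer.values)


original = LazyColumn([1, 2, 3])
copied = original.lazy_copy()
shared_before_write = original.shares_memory(copied)

copied[0] = 100  # the first write triggers the real copy
shared_after_write = original.shares_memory(copied)
```

In this model `shared_before_write` plays the role of the `np.shares_memory` assertion in `test_eval`, and the unchanged `original` after the write mirrors the `tm.assert_frame_equal(df, df_orig)` check.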