ENH / CoW: Add lazy copy to eval #53746

Merged 6 commits on Jun 29, 2023
1 change: 1 addition & 0 deletions doc/source/user_guide/copy_on_write.rst
@@ -211,6 +211,7 @@ following methods:
 - :meth:`DataFrame.astype` / :meth:`Series.astype`
 - :meth:`DataFrame.convert_dtypes` / :meth:`Series.convert_dtypes`
 - :meth:`DataFrame.join`
+- :meth:`DataFrame.eval`
 - :func:`concat`
 - :func:`merge`

1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.0.rst
@@ -25,6 +25,7 @@ Copy-on-Write improvements
 - The :class:`DataFrame` constructor, when constructing a DataFrame from a dictionary
   of Index objects and specifying ``copy=False``, will now use a lazy copy
   of those Index objects for the columns of the DataFrame (:issue:`52947`)
+- Add lazy copy mechanism to :meth:`DataFrame.eval` (:issue:`53746`)

 .. _whatsnew_210.enhancements.enhancement2:
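
As a rough illustration of what "lazy copy" means in the entry above, the sketch below models a column whose copy shares its buffer until one side writes to it. This is a toy model, not how pandas implements Copy-on-Write internally; the names `SharedBuffer`, `LazyColumn`, `lazy_copy` and `shares_buffer_with` are hypothetical and exist only for this example.

```python
class SharedBuffer:
    """Holds the values and counts how many columns currently reference them."""

    def __init__(self, values):
        self.values = list(values)
        self.owners = 1


class LazyColumn:
    """Toy column (hypothetical, not a pandas class) with copy-on-write semantics."""

    def __init__(self, values):
        self._buf = SharedBuffer(values)

    def lazy_copy(self):
        # No data is copied here: the new column points at the same buffer.
        other = LazyColumn.__new__(LazyColumn)
        other._buf = self._buf
        self._buf.owners += 1
        return other

    def shares_buffer_with(self, other):
        return self._buf is other._buf

    def __getitem__(self, index):
        return self._buf.values[index]

    def __setitem__(self, index, value):
        # The real copy is deferred until the first write to a shared buffer.
        if self._buf.owners > 1:
            self._buf.owners -= 1
            self._buf = SharedBuffer(self._buf.values)
        self._buf.values[index] = value
```

Reading from either column never copies; only the first write to a buffer that is still shared pays for the copy, and the other column keeps its original values.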

10 changes: 8 additions & 2 deletions pandas/core/computation/eval.py
@@ -373,7 +373,11 @@ def eval(
             # if returning a copy, copy only on the first assignment
             if not inplace and first_expr:
                 try:
-                    target = env.target.copy()
+                    target = env.target
+                    if isinstance(target, NDFrame):
+                        target = target.copy(deep=None)
+                    else:
+                        target = target.copy()
                 except AttributeError as err:
                     raise ValueError("Cannot return a copy of the target") from err
             else:
@@ -389,7 +393,9 @@ def eval(
                     if inplace and isinstance(target, NDFrame):
                         target.loc[:, assigner] = ret
                     else:
-                        target[assigner] = ret
+                        target[  # pyright: ignore[reportGeneralTypeIssues]
+                            assigner
+                        ] = ret
             except (TypeError, IndexError) as err:
                 raise ValueError("Cannot assign expression output to target") from err
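
From the user's side, the change to the copy step above could be observed roughly as follows. This is a minimal sketch, assuming pandas 2.1 or later with Copy-on-Write switched on through the `mode.copy_on_write` option; the helper name `eval_shares_memory` is hypothetical and not part of this PR.

```python
import numpy as np
import pandas as pd


def eval_shares_memory():
    """Run a non-inplace eval and report what happened to the original frame."""
    # Assumption: the option exists in the installed pandas version.
    with pd.option_context("mode.copy_on_write", True):
        df = pd.DataFrame({"a": [1, 2, 3], "b": 1})
        df_orig = df.copy()

        # eval returns a new frame with the extra column "c".
        result = df.eval("c = a + b")

        # With the lazy copy, column "a" is expected to share memory with
        # the original until one of the two frames is written to.
        shares_before_write = bool(
            np.shares_memory(df["a"].to_numpy(), result["a"].to_numpy())
        )

        # Writing to the result must never leak into the original frame.
        result.iloc[0, 0] = 100
        original_unchanged = df.equals(df_orig)

    return shares_before_write, original_unchanged, result["c"].tolist()
```

The first returned value depends on the pandas version and on whether Copy-on-Write is active, so it is best treated as a diagnostic; the second should hold in every mode, since a non-inplace `eval` never modifies its input.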

27 changes: 27 additions & 0 deletions pandas/tests/copy_view/test_methods.py
@@ -1850,3 +1850,30 @@ def test_insert_series(using_copy_on_write):

     df.iloc[0, 1] = 100
     tm.assert_series_equal(ser, ser_orig)
+
+
+def test_eval(using_copy_on_write):
+    df = DataFrame({"a": [1, 2, 3], "b": 1})
+    df_orig = df.copy()
+
+    result = df.eval("c = a+b")
+    if using_copy_on_write:
+        assert np.shares_memory(get_array(df, "a"), get_array(result, "a"))
+    else:
+        assert not np.shares_memory(get_array(df, "a"), get_array(result, "a"))
+
+    result.iloc[0, 0] = 100
+    tm.assert_frame_equal(df, df_orig)
+
+
+def test_eval_inplace(using_copy_on_write):
+    df = DataFrame({"a": [1, 2, 3], "b": 1})
+    df_orig = df.copy()
+    df_view = df[:]
+
+    df.eval("c = a+b", inplace=True)
+    assert np.shares_memory(get_array(df, "a"), get_array(df_view, "a"))
+
+    df.iloc[0, 0] = 100
+    if using_copy_on_write:
+        tm.assert_frame_equal(df_view, df_orig)
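
The in-place case exercised by `test_eval_inplace` can be reproduced outside the test suite along these lines. This is a sketch under the same assumptions as the test (pandas 2.1 or later, Copy-on-Write enabled through the `mode.copy_on_write` option); the helper name `eval_inplace_then_write` is hypothetical.

```python
import pandas as pd


def eval_inplace_then_write():
    """Add a column in place, then write to the frame while a view exists."""
    # Assumption: the option exists in the installed pandas version.
    with pd.option_context("mode.copy_on_write", True):
        df = pd.DataFrame({"a": [1, 2, 3], "b": 1})
        df_orig = df.copy()
        df_view = df[:]  # a view taken before the in-place eval

        # inplace=True adds column "c" to df itself; the view keeps "a" and "b".
        df.eval("c = a + b", inplace=True)

        # Under Copy-on-Write this write triggers a copy, so the view
        # still matches the original data afterwards.
        df.iloc[0, 0] = 100
        view_unchanged = df_view.equals(df_orig)

    return view_unchanged, list(df.columns)
```

Without Copy-on-Write the view would be expected to observe the write, which is why the test only checks `df_view` against `df_orig` when `using_copy_on_write` is true.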