Skip to content

Add __copy__ and __deepcopy__ to NDFrame #15444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -3161,6 +3161,14 @@ def copy(self, deep=True):
data = self._data.copy(deep=deep)
return self._constructor(data).__finalize__(self)

def __copy__(self, deep=True):
return self.copy(deep=deep)

def __deepcopy__(self, memo=None):
if memo is None:
memo = {}
return self.copy(deep=True)

def _convert(self, datetime=False, numeric=False, timedelta=False,
coerce=False, copy=True):
"""
Expand Down
13 changes: 7 additions & 6 deletions pandas/indexes/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -724,7 +724,13 @@ def copy(self, name=None, deep=False, dtype=None, **kwargs):
new_index = new_index.astype(dtype)
return new_index

__copy__ = copy
def __copy__(self, **kwargs):
return self.copy(**kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add (and test) __deepcopy__ here as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__deepcopy__ already exists in this class, see line 1483. Did you want me to move it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, then just move it next to this one


def __deepcopy__(self, memo=None):
if memo is None:
memo = {}
return self.copy(deep=True)

def _validate_names(self, name=None, names=None, deep=False):
"""
Expand Down Expand Up @@ -1480,11 +1486,6 @@ def __setstate__(self, state):

_unpickle_compat = __setstate__

def __deepcopy__(self, memo=None):
if memo is None:
memo = {}
return self.copy(deep=True)

def __nonzero__(self):
raise ValueError("The truth value of a {0} is ambiguous. "
"Use a.empty, a.bool(), a.item(), a.any() or a.all()."
Expand Down
23 changes: 23 additions & 0 deletions pandas/tests/test_generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
# pylint: disable-msg=E1101,W0612

from operator import methodcaller
from copy import copy, deepcopy
import pytest
import numpy as np
from numpy import nan
Expand Down Expand Up @@ -675,6 +676,17 @@ def test_validate_bool_args(self):
with self.assertRaises(ValueError):
super(DataFrame, df).mask(cond=df.a > 2, inplace=value)

def test_copy_and_deepcopy(self):
for shape in [0, 1, 2]:
obj = self._construct(shape)
for func in [copy,
deepcopy,
lambda x: x.copy(deep=False),
lambda x: x.copy(deep=True)]:
obj_copy = func(obj)
self.assertIsNot(obj_copy, obj)
self._compare(obj_copy, obj)


class TestSeries(tm.TestCase, Generic):
_typ = Series
Expand Down Expand Up @@ -1539,6 +1551,14 @@ def test_to_xarray(self):
expected,
check_index_type=False)

def test_deepcopy_empty(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look at how the other tests are done. We need systematic testing on all structures. So move to Generic and use _construct and _compare

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see my comment below on how test should be for this. you need to tests .copy(), copy(...), .deepcopy(), deepcopy()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I don't understand your comment here. Can you point me to a specific example to base the tests on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think I might not have made the motivation for this test sufficiently clear.

It's not intended to be a full test of deepcopy, for which _construct might be useful. Rather, it's designed specifically to trigger the bug I mentioned in #15370 where the data/index are empty but not the columns. I don't see how _construct would be appropriate for this, but maybe I'm doing it wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmcfee well you are changing fundamental things, which need full testing.

_construct is run with Series, DataFrame,Panel, to simply make a new one. Then you call a method/function and test if its a copy or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback sure, but that would be a separate test from the one you said to move. That issue never affected series; as far as I can tell, there's no way that it could have since it depends on having columns.

I have no problem adding generic tests for copy/deepcopy, but I want to make sure it's the right test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bmcfee you can add the test you wanted. but you are making fundamental changes. These HAVE to be tested on all classes that are affected. (and you ARE affecting everything)

# This test covers empty frame copying with non-empty column sets
# as reported in issue GH15370
empty_frame = DataFrame(data=[], index=[], columns=['A'])
empty_frame_copy = deepcopy(empty_frame)

self._compare(empty_frame_copy, empty_frame)


class TestPanel(tm.TestCase, Generic):
_typ = Panel
Expand Down Expand Up @@ -1569,6 +1589,9 @@ class TestPanel4D(tm.TestCase, Generic):
def test_sample(self):
pytest.skip("sample on Panel4D")

def test_copy_and_deepcopy(self):
pytest.skip("copy_and_deepcopy on Panel4D")

def test_to_xarray(self):

tm._skip_if_no_xarray()
Expand Down