Skip to content

DOC: Update pandas.Series.copy docstring #20261

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 23, 2018
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
106 changes: 105 additions & 1 deletion pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -4506,7 +4506,15 @@ def astype(self, dtype, copy=True, errors='raise', **kwargs):

def copy(self, deep=True):
"""
Make a copy of this objects data.
Make a copy of this objects indices and data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of explicitness can you change "this objects" to "the calling object's" or "the caller's"


Deep copy (default) will create a new object with a copy of the objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given these start to dive into the parameters and implementations I think they are better served in a dedicated Notes section

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

object's

data and indices. Modifications to the data or indices of the copy
will not be reflected in the original object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indices are immutable, mention this

Shallow copy (``deep=False``) will create a new object without copying
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single back ticks

the data or indices. Any changes to the index or data of the original
will be reflected in the shallow copy (and vice versa).

Parameters
----------
Expand All @@ -4522,6 +4530,102 @@ def copy(self, deep=True):
Returns
-------
copy : type of caller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move "type of caller" down as the description and say something like "Object type matches caller."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to clean up the line that says copy : type of caller. I would explicitly list the objects that could be a caller of this function, which I assume would make this Series or DataFrame but if there's anything else be sure to add that.

Otherwise nice job on all the edits - changes I requested all lgtm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would make this "Series or DataFrame"


Examples
--------
>>> s = pd.Series([1,2], dtype="int32", index=["a", "b"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after comma

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the dtype argument provided because you want it to be clear that type is copied over and not automatically upcast to int64? I think that's what you are going for but it wasn't immediately obvious to me, so I'd try and make it more explicit if that is your intention

>>> s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simple enough that I don't think you need to print

a 1
b 2
dtype: int32
>>> s_copy = s.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank line between cases (if they are separate / distinct)

>>> s_copy
a 1
b 2
dtype: int32

Shallow copy versus default (deep) copy:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a "section header" can you get it to render in bold (i.e. between asterisks)?


In a shallow copy the index and data is a reference to the original
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just "the index and data are shared with the original object"

objects'.

>>> s = pd.Series([1,2], dtype="int32", index=["a", "b"])
>>> deep_copy = s.copy()
>>> shallow_copy = s.copy(deep=False)
>>> id(s) == id(shallow_copy)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples are certainly comprehensive but perhaps too verbose. Can you refactor to make the index / values comparisons fit on one line? Similar comment with the value assignment - any consolidation will make it more readable

False
>>> id(s.index) == id(shallow_copy.index)
True
>>> id(s.values) == id(shallow_copy.values)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just again wondering if these can be more concise. Perhaps choosing a different variable name will allow you to do id(s.values) == id(shlw.values) and id(s.index) == id(shlw.values)

True
>>> id(s) == id(deep_copy)
False
>>> id(s.index) == id(deep_copy.index)
False
>>> id(s.values) == id(deep_copy.values)
False
>>> s[0] = 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break this up with some text describing what you are showing

>>> s
a 3
b 2
dtype: int32
>>> shallow_copy
a 3
b 2
dtype: int32
>>> deep_copy
a 1
b 2
dtype: int32
>>> shallow_copy[0] = 4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having two different assignments I think you can have s[0] = 3 and then this line both before printing any of the objects. Gets you the same result and covers the same concepts in less lines

>>> s
a 4
b 2
dtype: int32
>>> shallow_copy
a 4
b 2
dtype: int32
>>> deep_copy
a 1
b 2
dtype: int32

When copying object containing python objects, deep copy will
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"an object"

copy the indices and data but will not do so recursively.

>>> class Point(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it required to put a class in this docstring? Again for the sake of being concise wonder if you can just use lists to illustrate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I made this change plus the other requested changes.

... def __init__(self, x, y):
... self.x = x
... self.y = y
... def __repr__(self):
... return "Point(%d,%d)" % (self.x, self.y)
...
>>> p1 = Point(0, 1)
>>> s = pd.Series([p1], dtype="object")
>>> s
0 Point(0,1)
dtype: object
>>> s_copy = s.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show this with a deep copy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure how to address this comment - do you mean to show the behavior of
.copy(deep=True) versus the std python copy.deepcopy()? I was trying to show an example of how a (deep=True) copy isn't necessarily a deep copy in the standard python sense, but maybe I should remove this example entirely since it's already mentioned as a caveat in the Notes section?

>>> s_copy
0 Point(0,1)
dtype: object
>>> id(s[0]) == id(s_copy[0])
True
>>> p1.x = 1
>>> s
0 Point(1,1)
dtype: object
>>> s_copy
0 Point(1,1)
dtype: object
>>>

For deep-copying python objects, the following can be used:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this last example is not necessary


>>> import copy
>>> deep_deep_copy = pd.Series(copy.deepcopy(s.values),
... index=copy.deepcopy(s.index))
"""
data = self._data.copy(deep=deep)
return self._constructor(data).__finalize__(self)
Expand Down