API: Series and DataFrame constructors to return shallow copy (i.e. don't share index) from another Series/DataFrame #50539
@@ -596,6 +596,11 @@ Other API changes
or :attr:`~DataFrame.iloc` (thus, ``df.loc[:, :]`` or ``df.iloc[:, :]``) now returns a
new DataFrame (shallow copy) instead of the original DataFrame, consistent with other
methods to get a full slice (for example ``df.loc[:]`` or ``df[:]``) (:issue:`49469`)
- The :class:`Series` and :class:`DataFrame` constructors will now return a shallow copy
(i.e. share data, but not attributes) when passed a Series and DataFrame, respectively,
and with the default of ``copy=False`` (and if no other triggers a copy). Previously,
the new Series or DataFrame would share the index attribute (e.g. ``df.index = ...``
would also update the index of the parent or child) (:issue:`49523`)

IIUC it is setting df.index.name = ..., not setting df.index itself?

No, actually the index itself, see the code snippet in the issue: #49523

This is because right now the returned Series is actually sharing the same manager, and so setting the index on one also updates the other.

I suppose that also after this change, with a proper shallow copy, they still share the actual Index object, and so mutating an attribute of the index would still propagate.

makes sense, thanks

Just to confirm: as expected, mutating the index's attributes is still propagated (since the Index object is shared, because it is immutable).

We should maybe consider also "shallow copying" the index when creating a shallow copy of a Series/DataFrame, i.e. create a new Index but viewing the same data, to avoid the above. But that's a bigger change / for a separate issue.

+1


- Disallow computing ``cumprod`` for :class:`Timedelta` object; previously this returned incorrect values (:issue:`50246`)
- Loading a JSON file with duplicate columns using ``read_json(orient='split')`` renames columns to avoid duplicates, as :func:`read_csv` and the other readers do (:issue:`50370`)
- :func:`to_datetime` with ``unit`` of either "Y" or "M" will now raise if a sequence contains a non-round ``float`` value, matching the ``Timestamp`` behavior (:issue:`50301`)

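The idea raised in the thread above, creating a new Index that views the same data, can be sketched with the existing `Index.view()` method. This only illustrates the concept and is not what this PR implements; the variable names and values are made up for the example.

```python
import pandas as pd

# Hypothetical example data; the names are illustrative only.
idx = pd.Index([1, 2, 3], name="original")

# Index.view() returns a new Index object over the same underlying values.
idx_view = idx.view()

# Renaming the view does not rename the Index it was created from.
idx_view.name = "renamed"

print(idx_view is idx)          # whether the two are the same Python object
print(idx.name, idx_view.name)  # the two names after renaming the view
print(idx.equals(idx_view))     # whether the elements still compare equal
```

If a shallow copy of a Series/DataFrame held such a view instead of the parent's Index object, attribute changes like the rename here would presumably stop propagating, which is the effect the comment asks for.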
"and if no other triggers a copy" is there a word missing here?
Yes, I suppose I meant "no other keyword triggers a copy", eg by passing a dtype, or by passing index/columns causing a reindex.
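
A minimal sketch of the behavior the whatsnew entry describes, assuming pandas 2.0 or later (where this change is expected to apply). The data and variable names are hypothetical, and whether the `dtype` or `index` keywords actually copy data in a given case is not checked here.

```python
import pandas as pd

# Hypothetical example data.
parent = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Default copy=False and no other keyword: a shallow copy of ``parent``.
child = pd.Series(parent)

# Assumption: pandas >= 2.0. Replacing the index on the child
# should no longer update the index of the parent.
child.index = ["x", "y", "z"]
print(list(parent.index))
print(list(child.index))

# Keywords mentioned in the reply as possible copy triggers:
# a dtype conversion, or index labels that cause a reindex.
converted = pd.Series(parent, dtype="float64")
reindexed = pd.Series(parent, index=["a", "b", "d"])
print(converted.dtype)
print(list(reindexed.index))
```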