Skip to content

BUG-CoW: DataFrame constructed from Series not respecting CoW #52031

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 29, 2023

Conversation

phofl
Copy link
Member

@phofl phofl commented Mar 16, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

right now we back the DataFrame by a read-only array, before the read-only pr this did not respect CoW.

Follow-up will be to do the same for Index.

@phofl phofl added this to the 2.0 milestone Mar 16, 2023
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -291,7 +293,20 @@ def ndarray_to_mgr(
if values.ndim == 1:
values = values.reshape(-1, 1)

elif isinstance(values, (np.ndarray, ExtensionArray, ABCSeries, Index)):
elif isinstance(values, ABCSeries):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this path handle both Series and Index?

Copy link
Member

@jorisvandenbossche jorisvandenbossche Mar 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you indicated to do that in a follow-up (although I would assume it's "just" adding Index to this check?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think so, but would mean adding more tests and increasing the diff. Separate would probably keep it a bit simpler


def test_dataframe_from_series_infer_datetime(using_copy_on_write):
ser = Series([Timestamp("2019-12-31"), Timestamp("2020-12-31")], dtype=object)
df = DataFrame(ser)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should stop inferring this, but that's another issue ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅

# Conflicts:
#	doc/source/whatsnew/v2.0.0.rst
#	pandas/tests/copy_view/test_constructors.py
@phofl
Copy link
Member Author

phofl commented Mar 29, 2023

I'll merge this to get it into 2.0 and to make the follow up

@phofl phofl merged commit a7fec97 into pandas-dev:main Mar 29, 2023
@phofl phofl deleted the cow_df_from_series branch March 29, 2023 14:06
phofl added a commit that referenced this pull request Mar 29, 2023
…om Series not respecting CoW) (#52274)

Backport PR #52031: BUG-CoW: DataFrame constructed from Series not respecting CoW

Co-authored-by: Patrick Hoefler <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants