Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
ENH: Add lazy copy to concat and round #50501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add lazy copy to concat and round #50501
Changes from 13 commits
94c982e
7fbba4a
0db1c39
fc564f8
a7165c2
20a62bb
bca2123
3c4abd2
7734979
8bc3272
79cedd4
6c8a704
de678ab
dddc9f0
d18416d
5ccf23e
7648b23
99da573
a6afa8b
3d0772c
8b49210
48a4f7e
690f072
8ba35fe
2047d5c
e959c53
891e07c
04adc17
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are obviously overwriting parent here when we have more than one object, is this a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, I suppose that is a problem.
parent
is just used to keep a reference but is not otherwise used, so it doesn't really matter what object it is. So I think we can also make it a list of parent objects in this case.(will try to make a test that fails because of this; generally it is for chained operations, which you might have less often with concat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "chained" concat (not very normal usage .., but still should work correctly ;)):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, definitely agree. I'll keep track of a list of parent objects then. Series does not work correctly either, but this should be straightforward to fix I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is annoying, I thought that ci should fail, but makes sense that the import continues to work through there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this caused by the concat changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah strides changed from 8 to 16, this is really weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking into this a little bit, I think the issue is that in
pandas/pandas/io/pytables.py
Lines 3207 to 3212 in a0071f9
where we are creating the DataFrame, with the changes in this PR that
concat
won't copy if CoW is enabled. But it seems that the data from the HDF store always come back as F contiguous, while in pandas we want it as C contiguous for optimal performance.Maybe we should just add a
copy=True
in the concat call in the mentioned snippet. That's already the default for non-CoW so won't change anything, and for CoW enabled also ensures we get the desired memory layout (at the expense of an extra copy while reading in, but to fix that, that can be a follow-up optimization investigating why the HDF store always returns F contiguous arrays)Also commented about that on the related issue (that triggered adding this test): #22073
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this seems to fix it