-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REF: move shift logic from BlockManager to DataFrame #40536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
) | ||
tail = self.iloc[:, -periods:] | ||
# pin a simple Index to avoid costly casting | ||
tail.columns = range(len(tail.columns)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we guaranteed to copy here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh. tail
is not necessarily a copy of the data in self
, but we do make a copy in the concat
call below, so the result
we return is always a copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right that's the problem you are actually modifying the input here (e.g. you are setting columns).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tail
is never the same object as self
, just may be a view on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it ok
) | ||
tail = self.iloc[:, -periods:] | ||
# pin a simple Index to avoid costly casting | ||
tail.columns = range(len(tail.columns)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it ok
I don't know if it's a particularly useful benchmark, but there is a big slowdown potentially related to this PR: https://pandas.pydata.org/speed/pandas/#frame_methods.Shift.time_shift?python=3.8&Cython=0.29.21&p-axis=1&commits=05a0e98a-bc62e768 |
best guess is its bc a consolidate is being done by concat, which is called 2x. easy-ish workaround will be available following #38939 |
I opened an issue to track my above comment -> #42246 |
This reverts commit 9535280.
…dev#40536)" (pandas-dev#42317) This reverts commit 9535280.
Partially-fixes one of the currently-disabled ArrayManager tests.