PERF: vectorized DateOffset with months #11205
to_timedelta(base.days_in_month - 1, unit='D'))
i = base + day_offset + time
shifted = tslib.shift_months(i.asi8, months)
i = i._constructor(shifted)
This will lose the name of the Index - I think you want to use _shallow_copy:
In [57]: df.index._constructor(df.index)
Out[57]:
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19],
dtype='int64')
In [58]: df.index._shallow_copy(df.index)
Out[58]:
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19],
dtype='int64', name='hi')
(there's a broader issue about the constructor keeping the name)
It actually doesn't cause a problem in this case because the values are going to be unboxed/boxed here anyways:
https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py#L716
But I'll change it if it's the more idiomatic way to get the constructor?
yes shallow copy is the idiom
@jreback - pushed changes for your comments and added For your comment about using this routine in the
@@ -4386,6 +4387,73 @@ cpdef normalize_date(object dt):
raise TypeError('Unrecognized type: %s' % type(dt))


cdef inline int _year_add_months(pandas_datetimestruct dts,
int months):
can you add a doc-string to these
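For illustration, here is roughly what documented versions of such helpers might look like. This is a plain-Python sketch, not the Cython from this PR: the names `year_add_months` and `month_add_months` and their signatures are hypothetical, and they take plain integers rather than a `pandas_datetimestruct`.

```python
def year_add_months(year: int, month: int, months: int) -> int:
    """Return the year reached after shifting (year, month) by `months` months.

    `month` is 1-based (1 = January); `months` may be negative.
    """
    # Use a zero-based month so floor division handles the year rollover,
    # including for negative shifts.
    return year + (month - 1 + months) // 12


def month_add_months(month: int, months: int) -> int:
    """Return the 1-based month reached after shifting `month` by `months` months."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative shifts wrap correctly.
    return (month - 1 + months) % 12 + 1
```

Note that Python's `//` and `%` floor toward negative infinity; C-style integer division truncates toward zero, so a Cython version would need to handle negative shifts with care.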
Made those doc changes. Yeah, there is an
@chris-b1 thanks! these pr's are awesome! keep em coming!
This is a follow-up to #10744, which implemented vectorized versions of some offsets, mostly by converting to periods and back.
The case of shifting by years/months (which is actually the most useful to me) required some extra hoops and had poorer performance, so this PR implements a special Cython routine for that case, for about a 10x improvement.
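To make the month-shifting idea concrete, here is a minimal pure-Python sketch. It is not the Cython routine from this PR: the function below is a hypothetical stand-in that loops over `datetime` objects instead of an int64 array, and it assumes the day should be clamped to the last day of the target month when it would otherwise overflow.

```python
import calendar
from datetime import datetime


def shift_months(stamps, months):
    """Shift each datetime by `months` calendar months, clamping the day.

    Illustrative only: a real vectorized routine would operate on the
    underlying integer array rather than on Python objects.
    """
    out = []
    for ts in stamps:
        # Zero-based month arithmetic gives the year carry and the new month.
        years, month0 = divmod(ts.month - 1 + months, 12)
        year, month = ts.year + years, month0 + 1
        # Clamp, e.g. Jan 31 + 1 month -> Feb 28 (or Feb 29 in a leap year).
        last_day = calendar.monthrange(year, month)[1]
        out.append(ts.replace(year=year, month=month, day=min(ts.day, last_day)))
    return out
```

For example, `shift_months([datetime(2015, 1, 31, 9, 30)], 1)` keeps the time of day and lands on the last day of February 2015.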