Implement scalar shift_month mirroring tslib.shift_months #18218

Merged
8 commits merged on Nov 12, 2017
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.22.0.txt
@@ -70,7 +70,7 @@ Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~

- Indexers on Series or DataFrame no longer create a reference cycle (:issue:`17956`)
-
- DateOffset arithmetic performance is improved (:issue:`18218`)
Contributor

use a :class:`DateOffset` (it's now defined

-

.. _whatsnew_0220.docs:
34 changes: 32 additions & 2 deletions pandas/_libs/tslibs/offsets.pyx
@@ -4,7 +4,7 @@
cimport cython

import time
from cpython.datetime cimport timedelta, time as dt_time
from cpython.datetime cimport datetime, timedelta, time as dt_time

from dateutil.relativedelta import relativedelta

@@ -15,7 +15,7 @@ np.import_array()

from util cimport is_string_object

from pandas._libs.tslib import pydt_to_i8
from pandas._libs.tslib import pydt_to_i8, monthrange
Contributor

@jreback jreback Nov 10, 2017

prob should move monthrange and everything in its impl to offsets (and then you can cimport these to tslib.pyx)

Member Author

Eventually. We've got a few more of these left to go.

Contributor

k, add to the list


from frequencies cimport get_freq_code
from conversion cimport tz_convert_single
@@ -375,3 +375,33 @@ class BaseOffset(_BaseOffset):
# i.e. isinstance(other, (ABCDatetimeIndex, ABCSeries))
return other - self
return -self + other


# ----------------------------------------------------------------------
# RelativeDelta Arithmetic


cpdef datetime shift_month(datetime stamp, int months, object day_opt=None):
    cdef:
        int year, month, day
        int dim, dy

    dy = (stamp.month + months) // 12
    month = (stamp.month + months) % 12

    if month == 0:
        month = 12
        dy -= 1
    year = stamp.year + dy

    dim = monthrange(year, month)[1]
    if day_opt is None:
        day = min(stamp.day, dim)
    elif day_opt == 'start':
        day = 1
    elif day_opt == 'end':
        day = dim
    else:
        # assume this is an integer (and a valid day)
Contributor

? why is day_opt anything else?

Contributor

and if it is, then explicitly put it.

Member Author

> ? why is day_opt anything else?

E.g. for semi-month offsets, we may be shifting to a particular day other than the first or last of the month.

> and if it is, then explicitly put it.

You mean assert it? OK. I'll go ahead and write a docstring too.

Contributor

check if it’s an integer

the else clause should raise

        day = min(day_opt, dim)
    return stamp.replace(year=year, month=month, day=day)
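
For readers without a Cython build, the scalar shift above can be approximated in plain Python. This is a minimal sketch, not the pandas implementation: the name `shift_month_py` is hypothetical, `calendar.monthrange` stands in for `tslib.monthrange`, and the `ValueError` in the final branch is an assumption that follows the reviewer's suggestion that the else clause should raise.

```python
import calendar
from datetime import datetime


def shift_month_py(stamp, months, day_opt=None):
    """Pure-Python sketch of a scalar month shift (hypothetical helper)."""
    # Carry whole years into dy; the remainder is the target month.
    dy = (stamp.month + months) // 12
    month = (stamp.month + months) % 12
    if month == 0:
        # A remainder of 0 means December of the preceding year.
        month = 12
        dy -= 1
    year = stamp.year + dy

    # Number of days in the target month.
    dim = calendar.monthrange(year, month)[1]
    if day_opt is None:
        # Keep the original day, clamped to the month length.
        day = min(stamp.day, dim)
    elif day_opt == 'start':
        day = 1
    elif day_opt == 'end':
        day = dim
    elif isinstance(day_opt, int):
        # A specific day of the month, clamped to the month length.
        day = min(day_opt, dim)
    else:
        # Assumption: reject anything else, as suggested in review.
        raise ValueError(day_opt)
    return stamp.replace(year=year, month=month, day=day)
```

Under this sketch, shifting Jan 31, 2017 by one month yields Feb 28, 2017, and an integer `day_opt` of 31 is clamped to the month length in the same way.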
62 changes: 32 additions & 30 deletions pandas/tseries/offsets.py
@@ -22,6 +22,7 @@
_int_to_weekday, _weekday_to_int,
_determine_offset,
apply_index_wraps,
shift_month,
BeginMixin, EndMixin,
BaseOffset)

@@ -252,6 +253,8 @@ def apply_index(self, i):
"applied vectorized".format(kwd=kwd))

def isAnchored(self):
# TODO: Does this make sense for the general case? It would help
# if there were a canonical docstring for what isAnchored means.
Member

@jreback : Thoughts?

Contributor

yeah to be honest I am not sure isAnchored is really necessary, but that's orthogonal

return (self.n == 1)

def _params(self):
@@ -721,6 +724,7 @@ def apply(self, other):

return result
else:
# TODO: Figure out the end of this sente
Member

I presume you're going to figure this out beforehand?

Member Author

You mean what the end of the error message should be? That's orthogonal to this PR, but merits a reminder.

Contributor

yeah not sure, @sinhrks wrote this originally.

raise ApplyTypeError(
'Only know how to combine business hour with ')

@@ -927,10 +931,10 @@ def apply(self, other):
n = self.n
_, days_in_month = tslib.monthrange(other.year, other.month)
if other.day != days_in_month:
other = other + relativedelta(months=-1, day=31)
other = shift_month(other, -1, 'end')
if n <= 0:
n = n + 1
other = other + relativedelta(months=n, day=31)
other = shift_month(other, n, 'end')
return other

@apply_index_wraps
@@ -956,7 +960,7 @@ def apply(self, other):
if other.day > 1 and n <= 0: # then roll forward if n<=0
n += 1

return other + relativedelta(months=n, day=1)
return shift_month(other, n, 'start')

@apply_index_wraps
def apply_index(self, i):
@@ -1002,12 +1006,12 @@ def apply(self, other):
if not self.onOffset(other):
_, days_in_month = tslib.monthrange(other.year, other.month)
if 1 < other.day < self.day_of_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n > 0:
# rollforward so subtract 1
n -= 1
elif self.day_of_month < other.day < days_in_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n < 0:
# rollforward in the negative direction so add 1
n += 1
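
A note on swapping `relativedelta(day=...)` for `datetime.replace(day=...)`: the two only agree when the requested day exists in the month. `relativedelta` clamps an out-of-range day to the last valid day, while `replace` raises. The sketch below uses only the standard library to show the `replace` side of that difference; it assumes, without confirmation from this diff, that `day_of_month` is restricted to values valid in every month, which is what would make the swap safe here.

```python
from datetime import datetime

stamp = datetime(2017, 2, 10)

# replace() sets the field directly; a day that exists in every month is safe.
moved = stamp.replace(day=15)

# A day that does not exist in the month raises instead of clamping.
try:
    stamp.replace(day=31)
    raised = False
except ValueError:
    raised = True
```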
@@ -1084,19 +1088,19 @@ def onOffset(self, dt):
def _apply(self, n, other):
# if other.day is not day_of_month move to day_of_month and update n
if other.day < self.day_of_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n > 0:
n -= 1
elif other.day > self.day_of_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n == 0:
n = 1
else:
n += 1

months = n // 2
day = 31 if n % 2 else self.day_of_month
return other + relativedelta(months=months, day=day)
return shift_month(other, months, day)

def _get_roll(self, i, before_day_of_month, after_day_of_month):
n = self.n
@@ -1141,21 +1145,21 @@ def onOffset(self, dt):
def _apply(self, n, other):
# if other.day is not day_of_month move to day_of_month and update n
if other.day < self.day_of_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n == 0:
n = -1
else:
n -= 1
elif other.day > self.day_of_month:
other += relativedelta(day=self.day_of_month)
other = other.replace(day=self.day_of_month)
if n == 0:
n = 1
elif n < 0:
n += 1

months = n // 2 + n % 2
day = 1 if n % 2 else self.day_of_month
return other + relativedelta(months=months, day=day)
return shift_month(other, months, day)

def _get_roll(self, i, before_day_of_month, after_day_of_month):
n = self.n
@@ -1191,7 +1195,7 @@ def apply(self, other):
n = n - 1
elif n <= 0 and other.day > lastBDay:
n = n + 1
other = other + relativedelta(months=n, day=31)
other = shift_month(other, n, 'end')

if other.weekday() > 4:
other = other - BDay()
@@ -1215,7 +1219,7 @@ def apply(self, other):
other = other + timedelta(days=first - other.day)
n -= 1

other = other + relativedelta(months=n)
other = shift_month(other, n, None)
wkday, _ = tslib.monthrange(other.year, other.month)
first = _get_firstbday(wkday)
result = datetime(other.year, other.month, first,
@@ -1520,8 +1524,7 @@ def apply(self, other):
else:
months = self.n + 1

other = self.getOffsetOfMonth(
other + relativedelta(months=months, day=1))
other = self.getOffsetOfMonth(shift_month(other, months, 'start'))
other = datetime(other.year, other.month, other.day, base.hour,
base.minute, base.second, base.microsecond)
return other
@@ -1612,8 +1615,7 @@ def apply(self, other):
else:
months = self.n + 1

return self.getOffsetOfMonth(
other + relativedelta(months=months, day=1))
return self.getOffsetOfMonth(shift_month(other, months, 'start'))

def getOffsetOfMonth(self, dt):
m = MonthEnd()
@@ -1716,7 +1718,7 @@ def apply(self, other):
elif n <= 0 and other.day > lastBDay and monthsToGo == 0:
n = n + 1

other = other + relativedelta(months=monthsToGo + 3 * n, day=31)
other = shift_month(other, monthsToGo + 3 * n, 'end')
other = tslib._localize_pydatetime(other, base.tzinfo)
if other.weekday() > 4:
other = other - BDay()
@@ -1761,7 +1763,7 @@ def apply(self, other):
n = n - 1

# get the first bday for result
other = other + relativedelta(months=3 * n - monthsSince)
other = shift_month(other, 3 * n - monthsSince, None)
wkday, _ = tslib.monthrange(other.year, other.month)
first = _get_firstbday(wkday)
result = datetime(other.year, other.month, first,
@@ -1795,7 +1797,7 @@ def apply(self, other):
if n > 0 and not (other.day >= days_in_month and monthsToGo == 0):
n = n - 1

other = other + relativedelta(months=monthsToGo + 3 * n, day=31)
other = shift_month(other, monthsToGo + 3 * n, 'end')
return other

@apply_index_wraps
@@ -1830,7 +1832,7 @@ def apply(self, other):
# after start, so come back an extra period as if rolled forward
n = n + 1

other = other + relativedelta(months=3 * n - monthsSince, day=1)
other = shift_month(other, 3 * n - monthsSince, 'start')
return other

@apply_index_wraps
@@ -1889,7 +1891,7 @@ def apply(self, other):
(other.month == self.month and other.day > lastBDay)):
years += 1

other = other + relativedelta(years=years)
other = shift_month(other, 12 * years, None)

_, days_in_month = tslib.monthrange(other.year, self.month)
result = datetime(other.year, self.month, days_in_month,
@@ -1927,7 +1929,7 @@ def apply(self, other):
years += 1

# set first bday for result
other = other + relativedelta(years=years)
other = shift_month(other, years * 12, None)
wkday, days_in_month = tslib.monthrange(other.year, self.month)
first = _get_firstbday(wkday)
return datetime(other.year, self.month, first, other.hour,
@@ -2145,8 +2147,8 @@ def onOffset(self, dt):

if self.variation == "nearest":
# We have to check the year end of "this" cal year AND the previous
return year_end == dt or \
self.get_year_end(dt - relativedelta(months=1)) == dt
return (year_end == dt or
self.get_year_end(shift_month(dt, -1, None)) == dt)
else:
return year_end == dt

@@ -2226,8 +2228,8 @@ def get_year_end(self, dt):
def get_target_month_end(self, dt):
target_month = datetime(
dt.year, self.startingMonth, 1, tzinfo=dt.tzinfo)
next_month_first_of = target_month + relativedelta(months=+1)
return next_month_first_of + relativedelta(days=-1)
next_month_first_of = shift_month(target_month, 1, None)
return next_month_first_of + timedelta(days=-1)
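
The month-end lookup in this hunk relies on a common idiom: step to the first day of the following month, then back up one day. A standard-library sketch of the same idea follows; `month_end` is a hypothetical helper written for illustration and is not part of pandas.

```python
from datetime import datetime, timedelta


def month_end(year, month):
    """Last day of a month via 'first of next month minus one day' (hypothetical helper)."""
    if month == 12:
        # December rolls over into January of the next year.
        next_month_first = datetime(year + 1, 1, 1)
    else:
        next_month_first = datetime(year, month + 1, 1)
    # One day before the first of next month is the last day of this month.
    return next_month_first - timedelta(days=1)
```

Because the subtraction is a fixed one-day step, a plain `timedelta` is enough and no calendar-aware delta is needed.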

def _get_year_end_nearest(self, dt):
target_date = self.get_target_month_end(dt)
@@ -2382,7 +2384,7 @@ def apply(self, other):
qtr_lens = self.get_weeks(other + self._offset)

for weeks in qtr_lens:
start += relativedelta(weeks=weeks)
start += timedelta(weeks=weeks)
if start > other:
other = start
n -= 1
@@ -2399,7 +2401,7 @@ def apply(self, other):
qtr_lens = self.get_weeks(other)

for weeks in reversed(qtr_lens):
end -= relativedelta(weeks=weeks)
end -= timedelta(weeks=weeks)
if end < other:
other = end
n -= 1
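
Replacing `relativedelta(weeks=...)` with `timedelta(weeks=...)` is safe because a week is a fixed span of seven days, so no calendar-aware arithmetic is involved. The sketch below walks a start date through quarter lengths the way the loops above do; the values in `qtr_lens` are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

start = datetime(2017, 1, 1)
qtr_lens = [13, 13, 13, 14]  # hypothetical week counts per quarter

boundaries = []
for weeks in qtr_lens:
    # Each step is exactly weeks * 7 days, independent of month lengths.
    start += timedelta(weeks=weeks)
    boundaries.append(start)
```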
@@ -2442,7 +2444,7 @@ def onOffset(self, dt):

current = next_year_end
for qtr_len in qtr_lens[0:4]:
current += relativedelta(weeks=qtr_len)
current += timedelta(weeks=qtr_len)
if dt == current:
return True
return False