Initial implementation of holiday and holiday calendar. #6719

Closed
wants to merge 9 commits into from
47 changes: 47 additions & 0 deletions doc/source/timeseries.rst
@@ -546,6 +546,17 @@ calendars which account for local holidays and local weekend conventions.
print(dts)
print(Series(dts.weekday, dts).map(Series('Mon Tue Wed Thu Fri Sat Sun'.split())))

As of v0.14 holiday calendars can be used to provide the list of holidays. See the
:ref:`holiday calendar<timeseries.holiday>` section for more information.

.. ipython:: python

   from pandas.tseries.holiday import USFederalHolidayCalendar
   bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
   dt = datetime(2014, 1, 17)  # Friday before MLK Day
   print(dt + bday_us)  # Tuesday after MLK Day
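
The effect of combining a business-day offset with a holiday list can be approximated with the standard library alone. The sketch below is not the pandas implementation: `next_business_day` is a hypothetical helper, and the holiday set is entered by hand rather than produced by `USFederalHolidayCalendar`.

```python
from datetime import date, timedelta

def next_business_day(day, holidays):
    """Step forward one day at a time until reaching a weekday that is not a holiday."""
    candidate = day + timedelta(days=1)
    # weekday() is 0 for Monday through 6 for Sunday, so 5 and 6 are the weekend
    while candidate.weekday() >= 5 or candidate in holidays:
        candidate += timedelta(days=1)
    return candidate

# Assumption: MLK Day 2014 (Monday, January 20) is entered by hand
holidays = {date(2014, 1, 20)}
result = next_business_day(date(2014, 1, 17), holidays)
print(result)
```

Starting from Friday, January 17, the loop skips the weekend and the Monday holiday, which matches the "Tuesday after MLK Day" comment in the example above.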


.. note::

The frequency string 'C' is used to indicate that a CustomBusinessDay
@@ -712,6 +723,42 @@ and business year ends. Please also note the legacy time rule for milliseconds
``ms`` versus the new offset alias for month start ``MS``. This means that
offset alias parsing is case sensitive.

.. _timeseries.holiday:

Holidays / Holiday Calendars
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Holidays and calendars provide a simple way to define holiday rules to be used
with ``CustomBusinessDay`` or in other analysis that requires a predefined
set of holidays. The ``AbstractHolidayCalendar`` class provides all the necessary methods
to return a list of holidays and only the ``_rule_table`` needs to be defined
in a specific holiday calendar class.

Moreover, there are several observance functions that define how a fixed-date
holiday is observed when it falls on a weekend. For example, Christmas may be
observed on Friday if it falls on Saturday and on Monday if it falls on
Sunday (``pandas.tseries.holiday.Nearest``), or it may be moved to Monday only
when it falls on Sunday (``pandas.tseries.holiday.Sunday``). Other rules
can easily be specified.
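
The two observance rules can be illustrated with plain `datetime.date` objects. This is a sketch under assumptions, not the code from this PR: `nearest_workday` and `sunday_to_monday` are hypothetical names that mirror the behaviour described for `Nearest` and `Sunday`.

```python
from datetime import date, timedelta

def nearest_workday(day):
    """Saturday is observed on the preceding Friday, Sunday on the following Monday."""
    if day.isoweekday() == 6:
        return day - timedelta(days=1)
    if day.isoweekday() == 7:
        return day + timedelta(days=1)
    return day

def sunday_to_monday(day):
    """Only a Sunday holiday moves (to Monday); a Saturday holiday stays put."""
    if day.isoweekday() == 7:
        return day + timedelta(days=1)
    return day

christmas_2010 = date(2010, 12, 25)  # falls on a Saturday
christmas_2011 = date(2011, 12, 25)  # falls on a Sunday

print(nearest_workday(christmas_2010), sunday_to_monday(christmas_2010))
print(nearest_workday(christmas_2011), sunday_to_monday(christmas_2011))
```

The two rules differ only for Saturday holidays, which is why a calendar has to state which observance each holiday uses.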

.. ipython:: python

   from pandas.tseries.holiday import Holiday, USMemorialDay,\
       AbstractHolidayCalendar, Nearest, MO
   class ExampleCalendar(AbstractHolidayCalendar):
       _rule_table = [
           USMemorialDay,
           Holiday('July 4th', month=7, day=4, observance=Nearest),
           Holiday('Columbus Day', month=10, day=1,
                   offset=DateOffset(weekday=MO(2))),
Contributor

I think this offset can be Week(weekday='Monday')

Contributor Author

Needs to be Week(weekday=0) or some other integer. I think the example is illustrative enough. I'll put a comment pointing this alternative out.

           ]
   cal = ExampleCalendar()
   cal.holidays(datetime(2012, 1, 1), datetime(2012, 12, 31))  # holiday list
   datetime(2012, 5, 25) + CustomBusinessDay(calendar=cal)  # holiday arithmetic

There are several predefined US holidays in ``pandas.tseries.holiday``, along with
several common holiday calendars.
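
A rule such as ``month=10, day=1`` with ``offset=DateOffset(weekday=MO(2))`` starts from October 1 and moves to the second Monday on or after that date. The standard-library sketch below shows the arithmetic; `nth_weekday_on_or_after` is a hypothetical helper written for illustration and is not part of pandas or dateutil.

```python
from datetime import date, timedelta

def nth_weekday_on_or_after(start, weekday, n):
    """Return the n-th occurrence of `weekday` (0 = Monday) on or after `start`."""
    # Days until the first matching weekday; zero if `start` already matches
    days_ahead = (weekday - start.weekday()) % 7
    return start + timedelta(days=days_ahead + 7 * (n - 1))

# Columbus Day 2012: second Monday on or after October 1
columbus_2012 = nth_weekday_on_or_after(date(2012, 10, 1), weekday=0, n=2)
# MLK Day 2014: third Monday on or after January 1
mlk_2014 = nth_weekday_on_or_after(date(2014, 1, 1), weekday=0, n=3)
print(columbus_2012, mlk_2014)
```

Because the count includes the start date when it already falls on the requested weekday, anchoring the rule on the first day of the month yields the n-th such weekday of that month.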

.. _timeseries.advanced_datetime:

Time series-related instance methods
3 changes: 2 additions & 1 deletion doc/source/v0.14.0.txt
@@ -308,7 +308,7 @@ Asymmetrical error bars are also supported, however raw error values must be pro
Prior Version Deprecations/Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Therse are prior version deprecations that are taking effect as of 0.14.0.
There are prior version deprecations that are taking effect as of 0.14.0.

- Remove ``column`` keyword from ``DataFrame.sort`` (:issue:`4370`)

@@ -376,6 +376,7 @@ Enhancements
file. (:issue:`6545`)
- ``pandas.io.gbq`` now handles reading unicode strings properly. (:issue:`5940`)
- Improve performance of ``CustomBusinessDay`` (:issue:`6584`)
- :ref:`Holidays and holiday calendars<timeseries.holiday>` are now available and can be used with CustomBusinessDay.
Contributor

add a reference to this PR number (also pls add a release note in the Enhancements section of release notes).


Performance
~~~~~~~~~~~
217 changes: 217 additions & 0 deletions pandas/tseries/holiday.py
@@ -0,0 +1,217 @@
from pandas import DateOffset, date_range, DatetimeIndex, Series
from datetime import datetime
from pandas.tseries.offsets import Easter
from dateutil.relativedelta import MO, TU, WE, TH, FR, SA, SU

def Sunday(dt):
    '''
    If the holiday falls on Sunday, make Monday a holiday (nothing
    happens for Saturday).
    '''
    if dt.isoweekday() == 7:
        return dt + DateOffset(+1)
    else:
        return dt

def Nearest(dt):
    '''
    If the holiday falls on a weekend, make it a 3-day weekend by making
    Saturday a Friday holiday and Sunday a Monday holiday.
    '''
    if dt.isoweekday() == 6:
        return dt + DateOffset(-1)
    elif dt.isoweekday() == 7:
        return dt + DateOffset(+1)
    else:
        return dt

#TODO: Need to add an observance function when a holiday
# falls on a Tuesday and get a 4-day weekend
# def Nearest4(dt):
# '''
# If the holiday falls on Tuesday,
# make Monday a holiday as well, otherwise
# follow the rules for Nearest (a
# 3-day weekend).
# '''
# if dt.isoweekday() == 2:
# return dt - DateOffset()
# else:
# return Nearest(dt)

class Holiday(object):
    '''
    Class that defines a holiday with start/end dates and rules
    for observance.
    '''
    def __init__(self, name, year=None, month=None, day=None, offset=None,
                 observance=None, start_date=None, end_date=None):
        self.name = name
        self.year = year
        self.month = month
        self.day = day
        self.offset = offset
        self.start_date = start_date
        self.end_date = end_date
        self.observance = observance

    def __repr__(self):
        #FIXME: This should handle observance rules as well
        return 'Holiday %s (%s, %s, %s)' % (self.name, self.month, self.day,
                                            self.offset)

    def dates(self, start_date, end_date):

        if self.year is not None:
            return datetime(self.year, self.month, self.day)

        if self.start_date is not None:
            start_date = self.start_date

        if self.end_date is not None:
            end_date = self.end_date

        year_offset = DateOffset(years=1)
        baseDate = datetime(start_date.year, self.month, self.day)
        dates = date_range(baseDate, end_date, freq=year_offset)

        return self._apply_rule(dates)

    def dates_with_name(self, start_date, end_date):

        dates = self.dates(start_date, end_date)
        return Series(self.name, index=dates)

    def _apply_rule(self, dates):
        '''
        Apply the given offset/observance to an
        iterable of dates.

        Parameters
        ----------
        dates : array-like
            Dates to apply the given offset/observance rule

        Returns
        -------
        Dates with rules applied
        '''
        if self.observance is not None:
            return map(lambda d: self.observance(d), dates)

        if not isinstance(self.offset, list):
            offsets = [self.offset]
        else:
            offsets = self.offset

        for offset in offsets:
            dates = map(lambda d: d + offset, dates)

        return dates

class AbstractHolidayCalendar(object):
    '''
    Abstract interface to create holidays following certain rules.
    '''
    _rule_table = []

    def __init__(self, rules=None):
        '''
        Initializes holiday object with a given set of rules.  Normally
        classes just have the rules defined within them.

        Parameters
        ----------
        rules : array of Holiday objects
            A set of rules used to create the holidays.
        '''
        super(AbstractHolidayCalendar, self).__init__()
        if rules is not None:
            self._rule_table = rules

    @property
    def holiday_rules(self):
        return self._rule_table

    def holidays(self, start=None, end=None, return_names=False):
        '''
        Returns a curve with holidays between start and end

        Parameters
        ----------
        start : starting date, datetime-like, optional
        end : ending date, datetime-like, optional
        return_names : bool, optional
            If True, return a series that has dates and holiday names.
            False will only return a DatetimeIndex of dates.

        Returns
        -------
        DatetimeIndex of holidays
        '''
        #FIXME: Where should the default limits exist?
        if start is None:
            start = datetime(1970, 1, 1)

        if end is None:
            end = datetime(2030, 12, 31)

        if self.holiday_rules is None:
            # the class name identifies the calendar in the error message
            raise Exception('Holiday Calendar %s does not have any '\
                            'rules specified' % type(self).__name__)

        if return_names:
            holidays = None
        else:
            holidays = []
        for rule in self.holiday_rules:
            if return_names:
                rule_holidays = rule.dates_with_name(start, end)
                if holidays is None:
                    holidays = rule_holidays
                else:
                    holidays = holidays.append(rule_holidays)
            else:
                holidays += rule.dates(start, end)

        if return_names:
            return holidays.sort_index()
        else:
            return DatetimeIndex(holidays).order(False)

# Last Monday in May: the first Monday on or after May 25
USMemorialDay = Holiday('MemorialDay', month=5, day=25,
                        offset=DateOffset(weekday=MO(1)))
USLaborDay = Holiday('Labor Day', month=9, day=1,
                     offset=DateOffset(weekday=MO(1)))
USThanksgivingDay = Holiday('Thanksgiving', month=11, day=1,
                            offset=DateOffset(weekday=TH(4)))
USMartinLutherKingJr = Holiday('Dr. Martin Luther King Jr.', month=1, day=1,
                               offset=DateOffset(weekday=MO(3)))
USPresidentsDay = Holiday("President's Day", month=2, day=1,
                          offset=DateOffset(weekday=MO(3)))

class USFederalHolidayCalendar(AbstractHolidayCalendar):

    _rule_table = [
        Holiday('New Years Day', month=1, day=1, observance=Nearest),
Contributor

Why not factor all holidays out as constants like Memorial Day, etc so that they can be re-used by other HolidayCalendars?

Contributor Author

I did it for the holidays that shouldn't change observances. If you look at my two calendars I put there, the other holidays have different observances. There could be a default set, but I think caution needs to be taken when doing these types of holidays.

        USMartinLutherKingJr,
        USPresidentsDay,
        USMemorialDay,
        Holiday('July 4th', month=7, day=4, observance=Nearest),
        USLaborDay,
        Holiday('Columbus Day', month=10, day=1, offset=DateOffset(weekday=MO(2))),
        Holiday('Veterans Day', month=11, day=11, observance=Nearest),
        USThanksgivingDay,
        Holiday('Christmas', month=12, day=25, observance=Nearest)
        ]

class NERCHolidayCalendar(AbstractHolidayCalendar):

    _rule_table = [
        Holiday('New Years Day', month=1, day=1, observance=Sunday),
        USMemorialDay,
        Holiday('July 4th', month=7, day=4, observance=Sunday),
        USLaborDay,
        USThanksgivingDay,
        Holiday('Christmas', month=12, day=25, observance=Sunday)
        ]
47 changes: 43 additions & 4 deletions pandas/tseries/offsets.py
@@ -7,6 +7,7 @@

# import after tools, dateutil check
from dateutil.relativedelta import relativedelta, weekday
from dateutil.easter import easter
import pandas.tslib as tslib
from pandas.tslib import Timestamp, OutOfBoundsDatetime

@@ -17,7 +18,7 @@
'YearBegin', 'BYearBegin', 'YearEnd', 'BYearEnd',
'QuarterBegin', 'BQuarterBegin', 'QuarterEnd', 'BQuarterEnd',
'LastWeekOfMonth', 'FY5253Quarter', 'FY5253',
'Week', 'WeekOfMonth',
'Week', 'WeekOfMonth', 'Easter',
'Hour', 'Minute', 'Second', 'Milli', 'Micro', 'Nano']

# convert to/from datetime/timestamp to allow invalid Timestamp ranges to pass thru
@@ -447,6 +448,8 @@ class CustomBusinessDay(BusinessDay):
    holidays : list
        list/array of dates to exclude from the set of valid business days,
        passed to ``numpy.busdaycalendar``
    calendar : HolidayCalendar instance
        instance of AbstractHolidayCalendar that provides the list of holidays
    """

_cacheable = False
@@ -458,8 +461,11 @@ def __init__(self, n=1, **kwds):
        self.offset = kwds.get('offset', timedelta(0))
        self.normalize = kwds.get('normalize', False)
        self.weekmask = kwds.get('weekmask', 'Mon Tue Wed Thu Fri')
        holidays = kwds.get('holidays', [])


        if 'calendar' in kwds:
            holidays = kwds['calendar'].holidays()
        else:
            holidays = kwds.get('holidays', [])
        holidays = [self._to_dt64(dt, dtype='datetime64[D]') for dt in
                    holidays]
        self.holidays = tuple(sorted(holidays))
@@ -1677,7 +1683,40 @@ def _from_name(cls, *args):
        return cls(**dict(FY5253._parse_suffix(*args[:-1]),
                          qtr_with_extra_week=int(args[-1])))


class Easter(DateOffset):
Contributor

In addition to Easter, it would be great to have Good Friday (defined in terms of Easter) since this is a market holiday.

Contributor Author

I used good Friday in a NYMEX calendar as just Easter+DateOffset(-2). I think this is probably fine for now.

Contributor Author

Although you have a point, Good Friday would be the more useful for a calendar. Would be simple to add now.
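
The Good Friday rule discussed in this thread (Easter shifted back two days) can be sketched without pandas or dateutil. This is an illustration only: `gregorian_easter` and `good_friday` are hypothetical names, and the Easter computation uses the anonymous Gregorian algorithm as a stand-in for `dateutil.easter.easter`, which is what the PR itself calls.

```python
from datetime import date, timedelta

def gregorian_easter(year):
    """Easter Sunday in the Gregorian calendar (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)

def good_friday(year):
    """Good Friday is the Friday immediately before Easter Sunday."""
    return gregorian_easter(year) - timedelta(days=2)

print(gregorian_easter(2014), good_friday(2014))
```

Defining Good Friday relative to Easter keeps a single source of truth for the movable date, which is the same idea as applying `DateOffset(-2)` to the `Easter` offset.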

    '''
    DateOffset for the Easter holiday using
    logic defined in dateutil.  Right now uses
    the revised method which is valid in years
    1583-4099.
    '''
    def __init__(self, n=1, **kwds):
        super(Easter, self).__init__(n, **kwds)

    def apply(self, other):

        currentEaster = easter(other.year)
        currentEaster = datetime(currentEaster.year, currentEaster.month, currentEaster.day)

        # NOTE: easter returns a datetime.date so we have to convert to type of other
        if other >= currentEaster:
            new = easter(other.year + self.n)
        elif other < currentEaster:
            new = easter(other.year + self.n - 1)
        else:
            new = other

        # FIXME: There has to be a better way to do this, but I don't know what it is
        if isinstance(other, Timestamp):
            return as_timestamp(new)
        elif isinstance(other, datetime):
            return datetime(new.year, new.month, new.day)
        else:
            return new

    @classmethod
    def onOffset(cls, dt):
        return date(dt.year, dt.month, dt.day) == easter(dt.year)
#----------------------------------------------------------------------
# Ticks
