Skip to content

Prevent passing invalid kwds to DateOffset constructors #18226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Nov 25, 2017
Merged
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
b9fe60e
Patch __init__ to prevent passing invalid kwds
jbrockmendel Nov 11, 2017
f10a1c6
cast n to integer, assert equality
jbrockmendel Nov 11, 2017
807d769
whatsnew note
jbrockmendel Nov 11, 2017
687e3b7
parameterize tests, define fixture where it is used
jbrockmendel Nov 11, 2017
d5443ca
fixup
jbrockmendel Nov 11, 2017
5e5e0c0
exclude base classes from testing
jbrockmendel Nov 11, 2017
17f7b5a
exclude base classes from testing
jbrockmendel Nov 11, 2017
3834ef8
dummy commit to force CI
jbrockmendel Nov 12, 2017
1c54e96
edits per reviewer suggestions
jbrockmendel Nov 12, 2017
0e37a24
whatsnew whitespace
jbrockmendel Nov 12, 2017
a572368
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 13, 2017
d0ff381
whitespace fixup
jbrockmendel Nov 13, 2017
409dbd0
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 14, 2017
6a9233e
break up hour test to debug appveyor error (segfault?)
jbrockmendel Nov 15, 2017
4406df8
break down segfaulting test to debug
jbrockmendel Nov 16, 2017
44891cd
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 16, 2017
573abb6
fixturize
jbrockmendel Nov 16, 2017
c6cc8bc
troubleshoot segfault by moving __eq__ to _BaseOffset
jbrockmendel Nov 17, 2017
a68f4a7
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 17, 2017
c8224c1
try sorting rd_kwds to fix segfault, revert other troubleshooting gue…
jbrockmendel Nov 17, 2017
55779d8
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 17, 2017
d5b8302
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 19, 2017
b54e26b
implement _validate_n method
jbrockmendel Nov 19, 2017
fade4a2
test for _validate_n
jbrockmendel Nov 19, 2017
38c2238
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
fe895ff
Raise TypeError, not ValueError
jbrockmendel Nov 22, 2017
11ba1a9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
bc90a19
Catch ValueError in int(n)
jbrockmendel Nov 22, 2017
0b3dca0
fixup extra imports
jbrockmendel Nov 22, 2017
9c841fc
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
adee395
remove unnecessary re-definitions, add tests, improve error msg
jbrockmendel Nov 23, 2017
43dc17e
Add docstring to validate_n
jbrockmendel Nov 23, 2017
4ff8c22
fixup missing format
jbrockmendel Nov 24, 2017
d261be9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 24, 2017
b4b9e15
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 25, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.22.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ Other API Changes
- :class:`CacheableOffset` and :class:`WeekDay` are no longer available in the ``pandas.tseries.offsets`` module (:issue:`17830`)
- `tseries.frequencies.get_freq_group()` and `tseries.frequencies.DAYS` are removed from the public API (:issue:`18034`)
- :func:`Series.truncate` and :func:`DataFrame.truncate` will raise a ``ValueError`` if the index is not sorted instead of an unhelpful ``KeyError`` (:issue:`17935`)

- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`).

.. _whatsnew_0220.deprecations:

Expand Down
11 changes: 10 additions & 1 deletion pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -255,7 +255,7 @@ def _validate_business_time(t_input):
# ---------------------------------------------------------------------
# Constructor Helpers

_rd_kwds = set([
relativedelta_kwds = set([
'years', 'months', 'weeks', 'days',
'year', 'month', 'week', 'day', 'weekday',
'hour', 'minute', 'second', 'microsecond',
Expand Down Expand Up @@ -394,6 +394,15 @@ class _BaseOffset(object):
out = '<%s' % n_str + className + plural + self._repr_attrs() + '>'
return out

def _validate_n(self, n):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some tests to exercise this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Got one test so far, will add a few more after addressing the current test failure (which is a bug that may need to be addressed as part of a Larger Fix).

try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the doc-string that you did for #18210

nint = int(n)
except (ValueError, TypeError):
raise ValueError('`n` argument must be an integer')
if n != nint:
raise ValueError('`n` argument must be an integer')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show the type of what was passed (for 1st)
make this a more informative message (e.g. showing both values) (for 2nd one)

return nint


class BaseOffset(_BaseOffset):
# Here we add __rfoo__ methods that don't play well with cdef classes
Expand Down
13 changes: 13 additions & 0 deletions pandas/tests/tseries/offsets/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,19 @@ def offset_types(request):
return request.param


@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__ if
issubclass(getattr(offsets, o), offsets.MonthOffset)
and o != 'MonthOffset'])
def month_classes(request):
return request.param


@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__ if
issubclass(getattr(offsets, o), offsets.Tick)])
def tick_classes(request):
return request.param


@pytest.fixture(params=[None, 'UTC', 'Asia/Tokyo', 'US/Eastern',
'dateutil/Asia/Tokyo', 'dateutil/US/Pacific'])
def tz(request):
Expand Down
29 changes: 28 additions & 1 deletion pandas/tests/tseries/offsets/test_offsets.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,8 @@
get_offset, get_standard_freq)
from pandas.core.indexes.datetimes import (
_to_m8, DatetimeIndex, _daterange_cache)
from pandas._libs.tslibs.offsets import WeekDay, CacheableOffset
import pandas._libs.tslibs.offsets as liboffsets
from pandas._libs.tslibs.offsets import WeekDay, CacheableOffset, relativedelta_kwds
from pandas.tseries.offsets import (BDay, CDay, BQuarterEnd, BMonthEnd,
BusinessHour, WeekOfMonth, CBMonthEnd,
CustomBusinessHour,
Expand Down Expand Up @@ -4680,3 +4681,29 @@ def test_all_offset_classes(self, tup):
first = Timestamp(test_values[0], tz='US/Eastern') + offset()
second = Timestamp(test_values[1], tz='US/Eastern')
assert first == second


# ---------------------------------------------------------------------

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob move to liboffsets tests?

also _rd_kwds event internally is pretty hard to grok. you can de-privatize and name it more descriptively.


@pytest.mark.parametrize('kwd', sorted(list(liboffsets.relativedelta_kwds)))
def test_valid_month_attributes(kwd, month_classes):
# GH#18226
cls = month_classes
# check that we cannot create e.g. MonthEnd(weeks=3)
with pytest.raises(TypeError):
cls(**{kwd: 3})


@pytest.mark.parametrize('kwd', sorted(list(liboffsets.relativedelta_kwds)))
def test_valid_tick_attributes(kwd, tick_classes):
# GH#18226
cls = tick_classes
# check that we cannot create e.g. Hour(weeks=3)
with pytest.raises(TypeError):
cls(**{kwd: 3})


def test_validate_n_error():
with pytest.raises(ValueError):
DateOffset(n='Doh!')
59 changes: 43 additions & 16 deletions pandas/tseries/offsets.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
# -*- coding: utf-8 -*-
from datetime import date, datetime, timedelta
import functools
import operator

from pandas.compat import range
from pandas import compat
import numpy as np
Expand Down Expand Up @@ -28,8 +31,6 @@
BeginMixin, EndMixin,
BaseOffset)

import functools
import operator

__all__ = ['Day', 'BusinessDay', 'BDay', 'CustomBusinessDay', 'CDay',
'CBMonthEnd', 'CBMonthBegin',
Expand Down Expand Up @@ -166,7 +167,7 @@ def __add__(date):
normalize = False

def __init__(self, n=1, normalize=False, **kwds):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = kwds

Expand Down Expand Up @@ -473,7 +474,7 @@ class BusinessDay(BusinessMixin, SingleConstructorOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {'offset': offset}
self._offset = offset
Expand Down Expand Up @@ -782,7 +783,7 @@ class BusinessHour(BusinessHourMixin, SingleConstructorOffset):

def __init__(self, n=1, normalize=False, start='09:00',
end='17:00', offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
super(BusinessHour, self).__init__(start=start, end=end, offset=offset)

Expand Down Expand Up @@ -819,7 +820,7 @@ class CustomBusinessDay(BusinessDay):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -887,7 +888,7 @@ class CustomBusinessHour(BusinessHourMixin, SingleConstructorOffset):
def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None,
start='09:00', end='17:00', offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
super(CustomBusinessHour, self).__init__(start=start,
end=end, offset=offset)
Expand Down Expand Up @@ -915,6 +916,13 @@ def next_bday(self):
class MonthOffset(SingleConstructorOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = timedelta(1)
self._use_relativedelta = False
self.kwds = {}

@property
def name(self):
if self.isAnchored:
Expand Down Expand Up @@ -989,7 +997,8 @@ def __init__(self, n=1, normalize=False, day_of_month=None):
msg = 'day_of_month must be {min}<=day_of_month<=27, got {day}'
raise ValueError(msg.format(min=self._min_day_of_month,
day=self.day_of_month))
self.n = int(n)

self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {'day_of_month': self.day_of_month}

Expand Down Expand Up @@ -1242,7 +1251,7 @@ class CustomBusinessMonthEnd(BusinessMixin, MonthOffset):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -1313,7 +1322,7 @@ class CustomBusinessMonthBegin(BusinessMixin, MonthOffset):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -1377,7 +1386,7 @@ class Week(EndMixin, DateOffset):
_prefix = 'W'

def __init__(self, n=1, normalize=False, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday

Expand Down Expand Up @@ -1456,7 +1465,7 @@ class WeekOfMonth(DateOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False, week=None, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday
self.week = week
Expand Down Expand Up @@ -1541,7 +1550,7 @@ class LastWeekOfMonth(DateOffset):
_prefix = 'LWOM'

def __init__(self, n=1, normalize=False, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday

Expand Down Expand Up @@ -1604,13 +1613,15 @@ class QuarterOffset(DateOffset):
# point

def __init__(self, n=1, normalize=False, startingMonth=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
if startingMonth is None:
startingMonth = self._default_startingMonth
self.startingMonth = startingMonth

self.kwds = {'startingMonth': startingMonth}
self._offset = timedelta(1)
self._use_relativedelta = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should _use_relative_delta be a class variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and _offset ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_use_relativedelta is defined at the class level, defaults to False. So setting again here is unnecessary.

The bigger issue with _relativedelta&_offset is that they are only actually relevant for DateOffset (i.e. not subclasses). It took me a while to figure this out (and in fact I screwed up several PRs back by changing BusinessFoo.offset --> BusinessFoo._offset "for consistency" without realizing the business version is used differently). My current thought is to separate out the relativedelta-using parts of DateOffset into a subclass that isnt inherited by everything else to avoid this confusion.


def isAnchored(self):
return (self.n == 1 and self.startingMonth is not None)
Expand Down Expand Up @@ -1946,7 +1957,7 @@ class FY5253(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
variation="nearest"):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.startingMonth = startingMonth
self.weekday = weekday
Expand Down Expand Up @@ -2196,7 +2207,7 @@ class FY5253Quarter(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
qtr_with_extra_week=1, variation="nearest"):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize

self.weekday = weekday
Expand Down Expand Up @@ -2323,6 +2334,13 @@ class Easter(DateOffset):
"""
_adjust_dst = True

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same (and for other ones you are adding)

self._offset = timedelta(1)
self._use_relativedelta = False
self.kwds = {}

@apply_wraps
def apply(self, other):
currentEaster = easter(other.year)
Expand Down Expand Up @@ -2367,6 +2385,14 @@ class Tick(SingleConstructorOffset):
_inc = Timedelta(microseconds=1000)
_prefix = 'undefined'

def __init__(self, n=1, normalize=False):
# TODO: do Tick classes with normalize=True make sense?
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = timedelta(1)
self._use_relativedelta = False
self.kwds = {}

__gt__ = _tick_comp(operator.gt)
__ge__ = _tick_comp(operator.ge)
__lt__ = _tick_comp(operator.lt)
Expand Down Expand Up @@ -2425,6 +2451,7 @@ def delta(self):
def nanos(self):
return delta_to_nanoseconds(self.delta)

# TODO: Should Tick have its own apply_index?
def apply(self, other):
# Timestamp can handle tz and nano sec, thus no need to use apply_wraps
if isinstance(other, Timestamp):
Expand Down