Skip to content

Prevent passing invalid kwds to DateOffset constructors #18226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Nov 25, 2017
Merged
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
b9fe60e
Patch __init__ to prevent passing invalid kwds
jbrockmendel Nov 11, 2017
f10a1c6
cast n to integer, assert equality
jbrockmendel Nov 11, 2017
807d769
whatsnew note
jbrockmendel Nov 11, 2017
687e3b7
parameterize tests, define fixture where it is used
jbrockmendel Nov 11, 2017
d5443ca
fixup
jbrockmendel Nov 11, 2017
5e5e0c0
exclude base classes from testing
jbrockmendel Nov 11, 2017
17f7b5a
exclude base classes from testing
jbrockmendel Nov 11, 2017
3834ef8
dummy commit to force CI
jbrockmendel Nov 12, 2017
1c54e96
edits per reviewer suggestions
jbrockmendel Nov 12, 2017
0e37a24
whatsnew whitespace
jbrockmendel Nov 12, 2017
a572368
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 13, 2017
d0ff381
whitespace fixup
jbrockmendel Nov 13, 2017
409dbd0
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 14, 2017
6a9233e
break up hour test to debug appveyor error (segfault?)
jbrockmendel Nov 15, 2017
4406df8
break down segfaulting test to debug
jbrockmendel Nov 16, 2017
44891cd
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 16, 2017
573abb6
fixturize
jbrockmendel Nov 16, 2017
c6cc8bc
troubleshoot segfault by moving __eq__ to _BaseOffset
jbrockmendel Nov 17, 2017
a68f4a7
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 17, 2017
c8224c1
try sorting rd_kwds to fix segfault, revert other troubleshooting gue…
jbrockmendel Nov 17, 2017
55779d8
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 17, 2017
d5b8302
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 19, 2017
b54e26b
implement _validate_n method
jbrockmendel Nov 19, 2017
fade4a2
test for _validate_n
jbrockmendel Nov 19, 2017
38c2238
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
fe895ff
Raise TypeError, not ValueError
jbrockmendel Nov 22, 2017
11ba1a9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
bc90a19
Catch ValueError in int(n)
jbrockmendel Nov 22, 2017
0b3dca0
fixup extra imports
jbrockmendel Nov 22, 2017
9c841fc
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 22, 2017
adee395
remove unnecessary re-definitions, add tests, improve error msg
jbrockmendel Nov 23, 2017
43dc17e
Add docstring to validate_n
jbrockmendel Nov 23, 2017
4ff8c22
fixup missing format
jbrockmendel Nov 24, 2017
d261be9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 24, 2017
b4b9e15
Merge branch 'master' of https://github.com/pandas-dev/pandas into ts…
jbrockmendel Nov 25, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.22.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ Other API Changes
- `tseries.frequencies.get_freq_group()` and `tseries.frequencies.DAYS` are removed from the public API (:issue:`18034`)
- :func:`Series.truncate` and :func:`DataFrame.truncate` will raise a ``ValueError`` if the index is not sorted instead of an unhelpful ``KeyError`` (:issue:`17935`)
- :func:`Dataframe.unstack` will now default to filling with ``np.nan`` for ``object`` columns. (:issue:`12815`)

- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`).

.. _whatsnew_0220.deprecations:

Expand Down
13 changes: 12 additions & 1 deletion pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -261,7 +261,7 @@ def _validate_business_time(t_input):
# ---------------------------------------------------------------------
# Constructor Helpers

_rd_kwds = set([
relativedelta_kwds = set([
'years', 'months', 'weeks', 'days',
'year', 'month', 'week', 'day', 'weekday',
'hour', 'minute', 'second', 'microsecond',
Expand Down Expand Up @@ -406,6 +406,17 @@ class _BaseOffset(object):
# will raise NotImplementedError.
return get_day_of_month(other, self._day_opt)

def _validate_n(self, n):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some tests to exercise this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Got one test so far, will add a few more after addressing the current test failure (which is a bug that may need to be addressed as part of a Larger Fix).

try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the doc-string that you did for #18210

nint = int(n)
except (ValueError, TypeError):
raise TypeError('`n` argument must be an integer, '
'got {ntype}'.format(type(n)))
if n != nint:
raise ValueError('`n` argument must be an integer, '
'got {n}'.format(n=n))
return nint


class BaseOffset(_BaseOffset):
# Here we add __rfoo__ methods that don't play well with cdef classes
Expand Down
13 changes: 13 additions & 0 deletions pandas/tests/tseries/offsets/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,19 @@ def offset_types(request):
return request.param


@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__ if
issubclass(getattr(offsets, o), offsets.MonthOffset)
and o != 'MonthOffset'])
def month_classes(request):
return request.param


@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__ if
issubclass(getattr(offsets, o), offsets.Tick)])
def tick_classes(request):
return request.param


@pytest.fixture(params=[None, 'UTC', 'Asia/Tokyo', 'US/Eastern',
'dateutil/Asia/Tokyo', 'dateutil/US/Pacific'])
def tz(request):
Expand Down
37 changes: 37 additions & 0 deletions pandas/tests/tseries/offsets/test_offsets.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
get_offset, get_standard_freq)
from pandas.core.indexes.datetimes import (
_to_m8, DatetimeIndex, _daterange_cache)
import pandas._libs.tslibs.offsets as liboffsets
from pandas._libs.tslibs.offsets import WeekDay, CacheableOffset
from pandas.tseries.offsets import (BDay, CDay, BQuarterEnd, BMonthEnd,
BusinessHour, WeekOfMonth, CBMonthEnd,
Expand Down Expand Up @@ -4682,9 +4683,45 @@ def test_all_offset_classes(self, tup):
assert first == second


# ---------------------------------------------------------------------
def test_get_offset_day_error():
# subclass of _BaseOffset must override _day_opt attribute, or we should
# get a NotImplementedError

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob move to liboffsets tests?

also _rd_kwds event internally is pretty hard to grok. you can de-privatize and name it more descriptively.

with pytest.raises(NotImplementedError):
DateOffset()._get_offset_day(datetime.now())


@pytest.mark.parametrize('kwd', sorted(list(liboffsets.relativedelta_kwds)))
def test_valid_month_attributes(kwd, month_classes):
# GH#18226
cls = month_classes
# check that we cannot create e.g. MonthEnd(weeks=3)
with pytest.raises(TypeError):
cls(**{kwd: 3})


@pytest.mark.parametrize('kwd', sorted(list(liboffsets.relativedelta_kwds)))
def test_valid_tick_attributes(kwd, tick_classes):
# GH#18226
cls = tick_classes
# check that we cannot create e.g. Hour(weeks=3)
with pytest.raises(TypeError):
cls(**{kwd: 3})


def test_validate_n_error():
with pytest.raises(TypeError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some more tests here, e.g. for passing floats != to the int

DateOffset(n='Doh!')

with pytest.raises(TypeError):
MonthBegin(n=timedelta(1))

with pytest.raises(TypeError):
BDay(n=np.array([1, 2], dtype=np.int64))


def test_require_integers(offset_types):
cls = offset_types
with pytest.raises(ValueError):
cls(n=1.5)
48 changes: 33 additions & 15 deletions pandas/tseries/offsets.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# -*- coding: utf-8 -*-
from datetime import date, datetime, timedelta
import functools
import operator

from datetime import date, datetime, timedelta
from pandas.compat import range
from pandas import compat
import numpy as np
Expand Down Expand Up @@ -167,7 +167,7 @@ def __add__(date):
normalize = False

def __init__(self, n=1, normalize=False, **kwds):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = kwds

Expand Down Expand Up @@ -474,7 +474,7 @@ class BusinessDay(BusinessMixin, SingleConstructorOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {'offset': offset}
self._offset = offset
Expand Down Expand Up @@ -783,7 +783,7 @@ class BusinessHour(BusinessHourMixin, SingleConstructorOffset):

def __init__(self, n=1, normalize=False, start='09:00',
end='17:00', offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
super(BusinessHour, self).__init__(start=start, end=end, offset=offset)

Expand Down Expand Up @@ -820,7 +820,7 @@ class CustomBusinessDay(BusinessDay):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -888,7 +888,7 @@ class CustomBusinessHour(BusinessHourMixin, SingleConstructorOffset):
def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None,
start='09:00', end='17:00', offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
super(CustomBusinessHour, self).__init__(start=start,
end=end, offset=offset)
Expand Down Expand Up @@ -920,6 +920,11 @@ def next_bday(self):
class MonthOffset(SingleConstructorOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {}

@property
def name(self):
if self.isAnchored:
Expand Down Expand Up @@ -995,7 +1000,8 @@ def __init__(self, n=1, normalize=False, day_of_month=None):
msg = 'day_of_month must be {min}<=day_of_month<=27, got {day}'
raise ValueError(msg.format(min=self._min_day_of_month,
day=self.day_of_month))
self.n = int(n)

self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {'day_of_month': self.day_of_month}

Expand Down Expand Up @@ -1206,7 +1212,7 @@ class CustomBusinessMonthEnd(BusinessMixin, MonthOffset):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -1279,7 +1285,7 @@ class CustomBusinessMonthBegin(BusinessMixin, MonthOffset):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = int(n)
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
self.kwds = {}
Expand Down Expand Up @@ -1346,7 +1352,7 @@ class Week(EndMixin, DateOffset):
_prefix = 'W'

def __init__(self, n=1, normalize=False, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday

Expand Down Expand Up @@ -1425,7 +1431,7 @@ class WeekOfMonth(DateOffset):
_adjust_dst = True

def __init__(self, n=1, normalize=False, week=None, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday
self.week = week
Expand Down Expand Up @@ -1510,7 +1516,7 @@ class LastWeekOfMonth(DateOffset):
_prefix = 'LWOM'

def __init__(self, n=1, normalize=False, weekday=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday

Expand Down Expand Up @@ -1576,7 +1582,7 @@ class QuarterOffset(DateOffset):
# point

def __init__(self, n=1, normalize=False, startingMonth=None):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
if startingMonth is None:
startingMonth = self._default_startingMonth
Expand Down Expand Up @@ -1824,7 +1830,7 @@ class FY5253(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
variation="nearest"):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize
self.startingMonth = startingMonth
self.weekday = weekday
Expand Down Expand Up @@ -2074,7 +2080,7 @@ class FY5253Quarter(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
qtr_with_extra_week=1, variation="nearest"):
self.n = n
self.n = self._validate_n(n)
self.normalize = normalize

self.weekday = weekday
Expand Down Expand Up @@ -2201,6 +2207,11 @@ class Easter(DateOffset):
"""
_adjust_dst = True

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same (and for other ones you are adding)

self.kwds = {}

@apply_wraps
def apply(self, other):
currentEaster = easter(other.year)
Expand Down Expand Up @@ -2245,6 +2256,12 @@ class Tick(SingleConstructorOffset):
_inc = Timedelta(microseconds=1000)
_prefix = 'undefined'

def __init__(self, n=1, normalize=False):
# TODO: do Tick classes with normalize=True make sense?
self.n = self._validate_n(n)
self.normalize = normalize
self.kwds = {}

__gt__ = _tick_comp(operator.gt)
__ge__ = _tick_comp(operator.ge)
__lt__ = _tick_comp(operator.lt)
Expand Down Expand Up @@ -2303,6 +2320,7 @@ def delta(self):
def nanos(self):
return delta_to_nanoseconds(self.delta)

# TODO: Should Tick have its own apply_index?
def apply(self, other):
# Timestamp can handle tz and nano sec, thus no need to use apply_wraps
if isinstance(other, Timestamp):
Expand Down