-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Prevent passing invalid kwds to DateOffset constructors #18226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent passing invalid kwds to DateOffset constructors #18226
Conversation
pandas/tests/tseries/test_offsets.py
Outdated
|
||
# --------------------------------------------------------------------- | ||
|
||
offset_classes = [getattr(offsets, x) for x in dir(offsets)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already list the classes somewhere in this module iirc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a fixture defined in conftest, but only used in test_offsets. Instead of checking for the correct types it assume that offsets.__all__
is exactly what it needs. Mind if I a) put it directly test_offsets and b) make it check explicitly to avoid future footgunning?
pandas/tests/tseries/test_offsets.py
Outdated
issubclass(x, DateOffset)] | ||
|
||
|
||
def test_valid_attributes(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameterize this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
add a new whatsnew note indicating which classes are now strict (you did this for 0.21.0 IIRC), same language |
Will do. The 0.21.0 note didn't list subclasses. Sure you want them listed this time? |
Once we get #18224 sorted out, the assert n == int(n) will be done only once (and probably as a ValueError instead of assertion) in _BaseOffset.init |
Codecov Report
@@ Coverage Diff @@
## master #18226 +/- ##
==========================================
- Coverage 91.42% 91.39% -0.04%
==========================================
Files 163 163
Lines 50064 50099 +35
==========================================
+ Hits 45773 45787 +14
- Misses 4291 4312 +21
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18226 +/- ##
==========================================
- Coverage 91.33% 91.32% -0.02%
==========================================
Files 163 163
Lines 49752 49764 +12
==========================================
+ Hits 45443 45446 +3
- Misses 4309 4318 +9
Continue to review full report at Codecov.
|
appveyor failure appears unrelated. will rebase+push |
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -45,6 +45,7 @@ Other API Changes | |||
- :class:`Timestamp` will no longer silently ignore unused or invalid ``tz`` or ``tzinfo`` keyword arguments (:issue:`17690`) | |||
- :class:`Timestamp` will no longer silently ignore invalid ``freq`` arguments (:issue:`5168`) | |||
- :class:`CacheableOffset` and :class:`WeekDay` are no longer available in the ``pandas.tseries.offsets`` module (:issue:`17830`) | |||
- Restricted DateOffset keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`,:issue:`18226`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 space after a period, space after the common between issues
pandas/tests/tseries/conftest.py
Outdated
import pandas.tseries.offsets as offsets | ||
|
||
|
||
@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this serves the entire sub-directory, why are you removing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it isn't moved outside test_offsets and causes a "where the heck is this defined" reaction to a new reader. Will revert.
pandas/tests/tseries/test_offsets.py
Outdated
@@ -41,6 +42,20 @@ | |||
from pandas.tseries.holiday import USFederalHolidayCalendar | |||
|
|||
|
|||
offset_classes = [getattr(offsets, x) for x in dir(offsets)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, pls just use the fixture (you can certainly create more if needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
pandas/tests/tseries/test_offsets.py
Outdated
# --------------------------------------------------------------------- | ||
|
||
month_classes = [x for x in offset_classes if | ||
issubclass(x, offsets.MonthOffset)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just make these fixtures
pandas/tseries/offsets.py
Outdated
@@ -163,6 +164,7 @@ def __add__(date): | |||
normalize = False | |||
|
|||
def __init__(self, n=1, normalize=False, **kwds): | |||
assert n == int(n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh? why are you putting asserts here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for this case I actually do like
self.n = self._validate_n(n)
which would raise on a non-integer convertible passed in (with a helpful message that includes the class name); that's the reason to make this a method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get rid of the asserts here, handle them later.
#18210 looks kind of a duplicate with this. pls only have 1 per thing open at a time. |
Removed the asserts, which should fix this perception. They are orthogonal. |
Though you're right about too many PRs. I actually closed several last night for that reason. |
not a problem. just sometimes we can lose focus. |
abcde3b
to
0e37a24
Compare
appveyor error just says 'Command exited with code 1'. Any idea how to get something more meaningful than that? |
…libs-offsets-inits
hmm the appveyor error is in the same place as befire in test_ticks.test_Hour |
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -394,6 +394,15 @@ class _BaseOffset(object): | |||
out = '<%s' % n_str + className + plural + self._repr_attrs() + '>' | |||
return out | |||
|
|||
def _validate_n(self, n): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some tests to exercise this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Got one test so far, will add a few more after addressing the current test failure (which is a bug that may need to be addressed as part of a Larger Fix).
…libs-offsets-inits
…libs-offsets-inits
…libs-offsets-inits
pandas/_libs/tslibs/offsets.pyx
Outdated
except (ValueError, TypeError): | ||
raise TypeError('`n` argument must be an integer') | ||
if n != nint: | ||
raise ValueError('`n` argument must be an integer') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show the type of what was passed (for 1st)
make this a more informative message (e.g. showing both values) (for 2nd one)
|
||
|
||
def test_validate_n_error(): | ||
with pytest.raises(TypeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some more tests here, e.g. for passing floats != to the int
pandas/tseries/offsets.py
Outdated
self.normalize = normalize | ||
if startingMonth is None: | ||
startingMonth = self._default_startingMonth | ||
self.startingMonth = startingMonth | ||
|
||
self.kwds = {'startingMonth': startingMonth} | ||
self._offset = timedelta(1) | ||
self._use_relativedelta = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should _use_relative_delta
be a class variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and _offset
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_use_relativedelta
is defined at the class level, defaults to False
. So setting again here is unnecessary.
The bigger issue with _relativedelta
&_offset
is that they are only actually relevant for DateOffset
(i.e. not subclasses). It took me a while to figure this out (and in fact I screwed up several PRs back by changing BusinessFoo.offset
--> BusinessFoo._offset
"for consistency" without realizing the business version is used differently). My current thought is to separate out the relativedelta
-using parts of DateOffset
into a subclass that isnt inherited by everything else to avoid this confusion.
@@ -2201,6 +2211,13 @@ class Easter(DateOffset): | |||
""" | |||
_adjust_dst = True | |||
|
|||
def __init__(self, n=1, normalize=False): | |||
self.n = self._validate_n(n) | |||
self.normalize = normalize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same (and for other ones you are adding)
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -406,6 +406,17 @@ class _BaseOffset(object): | |||
# will raise NotImplementedError. | |||
return get_day_of_month(other, self._day_opt) | |||
|
|||
def _validate_n(self, n): | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the doc-string that you did for #18210
lgtm. ping on green. |
Ping |
…libs-offsets-inits
rebase and ping on green. lgtm. |
…libs-offsets-inits
Ping |
thanks @jbrockmendel |
git diff upstream/master -u -- "*.py" | flake8 --diff