Skip to content

Prevent passing invalid kwds to DateOffset constructors #18226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 35 commits into from
Nov 25, 2017

Conversation

jbrockmendel
Copy link
Member

@jbrockmendel jbrockmendel commented Nov 11, 2017

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry


# ---------------------------------------------------------------------

offset_classes = [getattr(offsets, x) for x in dir(offsets)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already list the classes somewhere in this module iirc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a fixture defined in conftest, but only used in test_offsets. Instead of checking for the correct types it assume that offsets.__all__ is exactly what it needs. Mind if I a) put it directly test_offsets and b) make it check explicitly to avoid future footgunning?

issubclass(x, DateOffset)]


def test_valid_attributes():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameterize this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@jreback jreback added the Frequency DateOffsets label Nov 11, 2017
@jreback
Copy link
Contributor

jreback commented Nov 11, 2017

add a new whatsnew note indicating which classes are now strict (you did this for 0.21.0 IIRC), same language

@jbrockmendel
Copy link
Member Author

add a new whatsnew note indicating which classes are now strict (you did this for 0.21.0 IIRC), same language

Will do. The 0.21.0 note didn't list subclasses. Sure you want them listed this time?

@jbrockmendel
Copy link
Member Author

Once we get #18224 sorted out, the assert n == int(n) will be done only once (and probably as a ValueError instead of assertion) in _BaseOffset.init

@codecov
Copy link

codecov bot commented Nov 11, 2017

Codecov Report

Merging #18226 into master will decrease coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18226      +/-   ##
==========================================
- Coverage   91.42%   91.39%   -0.04%     
==========================================
  Files         163      163              
  Lines       50064    50099      +35     
==========================================
+ Hits        45773    45787      +14     
- Misses       4291     4312      +21
Flag Coverage Δ
#multiple 89.2% <100%> (-0.02%) ⬇️
#single 40.38% <63.41%> (-0.04%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 97.17% <100%> (+0.06%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.38% <0%> (-1.82%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3493aba...1666d44. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 11, 2017

Codecov Report

Merging #18226 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18226      +/-   ##
==========================================
- Coverage   91.33%   91.32%   -0.02%     
==========================================
  Files         163      163              
  Lines       49752    49764      +12     
==========================================
+ Hits        45443    45446       +3     
- Misses       4309     4318       +9
Flag Coverage Δ
#multiple 89.12% <100%> (ø) ⬆️
#single 40.72% <79.41%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/tests/tseries/offsets/conftest.py 97.14% <100%> (ø) ⬆️
pandas/tseries/offsets.py 97.03% <100%> (+0.02%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be66ef8...b4b9e15. Read the comment docs.

@jbrockmendel
Copy link
Member Author

appveyor failure appears unrelated. will rebase+push

@@ -45,6 +45,7 @@ Other API Changes
- :class:`Timestamp` will no longer silently ignore unused or invalid ``tz`` or ``tzinfo`` keyword arguments (:issue:`17690`)
- :class:`Timestamp` will no longer silently ignore invalid ``freq`` arguments (:issue:`5168`)
- :class:`CacheableOffset` and :class:`WeekDay` are no longer available in the ``pandas.tseries.offsets`` module (:issue:`17830`)
- Restricted DateOffset keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`,:issue:`18226`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 space after a period, space after the common between issues

import pandas.tseries.offsets as offsets


@pytest.fixture(params=[getattr(offsets, o) for o in offsets.__all__])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this serves the entire sub-directory, why are you removing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it isn't moved outside test_offsets and causes a "where the heck is this defined" reaction to a new reader. Will revert.

@@ -41,6 +42,20 @@
from pandas.tseries.holiday import USFederalHolidayCalendar


offset_classes = [getattr(offsets, x) for x in dir(offsets)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, pls just use the fixture (you can certainly create more if needed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

# ---------------------------------------------------------------------

month_classes = [x for x in offset_classes if
issubclass(x, offsets.MonthOffset)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just make these fixtures

@@ -163,6 +164,7 @@ def __add__(date):
normalize = False

def __init__(self, n=1, normalize=False, **kwds):
assert n == int(n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? why are you putting asserts here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this case I actually do like
self.n = self._validate_n(n)

which would raise on a non-integer convertible passed in (with a helpful message that includes the class name); that's the reason to make this a method)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll get rid of the asserts here, handle them later.

@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

#18210 looks kind of a duplicate with this. pls only have 1 per thing open at a time.

@jbrockmendel
Copy link
Member Author

#18210 looks kind of a duplicate with this

Removed the asserts, which should fix this perception. They are orthogonal.

@jbrockmendel
Copy link
Member Author

Though you're right about too many PRs. I actually closed several last night for that reason.

@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

Though you're right about too many PRs. I actually closed several last night for that reason.

not a problem. just sometimes we can lose focus.

@jbrockmendel
Copy link
Member Author

appveyor error just says 'Command exited with code 1'. Any idea how to get something more meaningful than that?

@jbrockmendel
Copy link
Member Author

hmm the appveyor error is in the same place as befire in test_ticks.test_Hour

@@ -394,6 +394,15 @@ class _BaseOffset(object):
out = '<%s' % n_str + className + plural + self._repr_attrs() + '>'
return out

def _validate_n(self, n):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some tests to exercise this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Got one test so far, will add a few more after addressing the current test failure (which is a bug that may need to be addressed as part of a Larger Fix).

@jbrockmendel jbrockmendel mentioned this pull request Nov 22, 2017
4 tasks
except (ValueError, TypeError):
raise TypeError('`n` argument must be an integer')
if n != nint:
raise ValueError('`n` argument must be an integer')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show the type of what was passed (for 1st)
make this a more informative message (e.g. showing both values) (for 2nd one)



def test_validate_n_error():
with pytest.raises(TypeError):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some more tests here, e.g. for passing floats != to the int

self.normalize = normalize
if startingMonth is None:
startingMonth = self._default_startingMonth
self.startingMonth = startingMonth

self.kwds = {'startingMonth': startingMonth}
self._offset = timedelta(1)
self._use_relativedelta = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should _use_relative_delta be a class variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and _offset ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_use_relativedelta is defined at the class level, defaults to False. So setting again here is unnecessary.

The bigger issue with _relativedelta&_offset is that they are only actually relevant for DateOffset (i.e. not subclasses). It took me a while to figure this out (and in fact I screwed up several PRs back by changing BusinessFoo.offset --> BusinessFoo._offset "for consistency" without realizing the business version is used differently). My current thought is to separate out the relativedelta-using parts of DateOffset into a subclass that isnt inherited by everything else to avoid this confusion.

@@ -2201,6 +2211,13 @@ class Easter(DateOffset):
"""
_adjust_dst = True

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same (and for other ones you are adding)

@@ -406,6 +406,17 @@ class _BaseOffset(object):
# will raise NotImplementedError.
return get_day_of_month(other, self._day_opt)

def _validate_n(self, n):
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the doc-string that you did for #18210

@jorisvandenbossche jorisvandenbossche changed the title Patch __init__ to prevent passing invalid kwds Prevent passing invalid kwds to DateOffset constructors Nov 23, 2017
@jreback jreback added this to the 0.22.0 milestone Nov 23, 2017
@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

lgtm. ping on green.

@jbrockmendel
Copy link
Member Author

Ping

@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

rebase and ping on green. lgtm.

@jbrockmendel
Copy link
Member Author

Ping

@jreback jreback merged commit 06518b2 into pandas-dev:master Nov 25, 2017
@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the tslibs-offsets-inits branch November 25, 2017 21:13
@jbrockmendel jbrockmendel mentioned this pull request Dec 19, 2017
39 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Frequency DateOffsets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants