Skip to content

Tslibs offsets validation #18210

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

jbrockmendel
Copy link
Member

Lots to do in offsets, trying to push breaking changes to the very end. This just implements validation for DateOffset args/kwargs, fixes a poorly-worded error message and the associated tm.assert_raises_regex tests.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18210 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18210      +/-   ##
==========================================
- Coverage    91.4%   91.38%   -0.02%     
==========================================
  Files         163      163              
  Lines       50064    50055       -9     
==========================================
- Hits        45759    45744      -15     
- Misses       4305     4311       +6
Flag Coverage Δ
#multiple 89.19% <100%> (ø) ⬆️
#single 40.36% <12.5%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 97.22% <100%> (+0.11%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/core/indexes/datetimes.py 95.48% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ca737ac...9ef49b4. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 10, 2017

Codecov Report

Merging #18210 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18210      +/-   ##
==========================================
- Coverage   91.36%   91.35%   -0.02%     
==========================================
  Files         164      164              
  Lines       49718    49709       -9     
==========================================
- Hits        45426    45410      -16     
- Misses       4292     4299       +7
Flag Coverage Δ
#multiple 89.15% <100%> (ø) ⬆️
#single 39.61% <50%> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.83% <100%> (+0.12%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1647a72...ba12543. Read the comment docs.

@jbrockmendel
Copy link
Member Author

taskset 7 asv continuous -f 1.1 -E virtualenv master HEAD -b timeseries
[...]
       before           after         ratio
     [ca737aca]       [ada38540]
+      43.2±0.2μs       48.0±0.2μs     1.11  timeseries.SemiMonthOffset.time_begin_apply
+     18.4±0.05μs       20.3±0.1μs     1.10  timeseries.Offsets.time_custom_bday_apply_dt64
+       135±0.2μs          149±1μs     1.10  timeseries.Offsets.time_custom_bmonthend_incr
-     8.50±0.08μs      7.65±0.03μs     0.90  timeseries.Offsets.time_timeseries_year_apply
-     21.2±0.08μs      18.9±0.05μs     0.89  timeseries.Offsets.time_custom_bday_incr
-        92.9±9ms       26.9±0.1ms     0.29  timeseries.AsOfDataFrame.time_asof_nan
-        97.8±4ms       24.1±0.3ms     0.25  timeseries.AsOfDataFrame.time_asof

The time_asof and time_asof_nan are pretty frequently outliers. Given the local entropy field, this is about a wash.

return t_input
else:
raise ValueError("time data must be string or datetime.time")


def _validate_n(n):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring on all of these functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@@ -2192,7 +2192,7 @@ def test_repr(self):
def test_corner(self):
pytest.raises(ValueError, Week, weekday=7)
tm.assert_raises_regex(
ValueError, "Day must be", Week, weekday=-1)
ValueError, "weekday must be", Week, weekday=-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing all of these "Day" strings to "weekday" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the incorrect argument is "weekday". Many of these also take a "day" argument, so there is potential for ambiguity.

@gfyoung gfyoung added Error Reporting Incorrect or improved errors from pandas Internals Related to non-user accessible pandas implementation labels Nov 10, 2017
raise ValueError("N cannot be 0")


def _validate_month(month):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason that these are not instance methods of BaseOffset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My default is to not make something a method if it doesn't require self.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not the current style. agree that these are somewhat of an edge case, but don't change just for the sake of change or its 'your style', prefer to keep the existing code style unless good reason to change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. The fix to the error message in validate_weekday is probably worthwhile. Should I revert the others?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like them in cython, but these make more sense as methods I think as they do use self at least for some)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as they do use self at least for some

None of the ones implemented here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right but the point is you don't need the allow_none arg if you simply make this a method (for the weekday one)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does making it a method change whether allow_none is needed?

@jbrockmendel
Copy link
Member Author

travis error is TestClipboard


Parameters
----------
month : int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess we have this type of method in multiple places (though I agree these should actually be in this module).



def _validate_weekday(weekday, allow_none=False):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is a method, then you don't need allow_none (which is a really odd non-standard kw).

because you are a method, checking self.weekday is obvious.

now you can also have a _validate_weekday method that does everything (but the allow_none) and simply call that if you want.

If you show that you need these in multiple places, then certainly you can call them, from the offsets you need to have a method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't totally understand the reasoning, but it sounds like the upshot here is that to get this done _validate_weekday needs to become a method of _BaseOffset, and without a allow_none kwarg. Is that the correct takeaway?

@jreback
Copy link
Contributor

jreback commented Nov 22, 2017

isn't this competing with #18228 ? thought was closing this one

@jbrockmendel
Copy link
Member Author

isn't this competing with #18228 ? thought was closing this one

#18826 is fundamentally a bugfix, while this is mostly a refactor with a side of unrelated-bugfix. The resemblance is coincidence. #18226 does need rebasing though.

@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

so will leave to after #18228 is merged, you are defining the validate_n again here (but please add the doc-strings to #18228)

@jreback
Copy link
Contributor

jreback commented Nov 25, 2017

needs a rebase (and prob some code removal)

@jbrockmendel
Copy link
Member Author

Low priority, closing for now

@jbrockmendel jbrockmendel deleted the tslibs-offsets-validation branch December 8, 2017 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Internals Related to non-user accessible pandas implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants