-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add/reorganize scalar Timedelta tests #46936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @patrickmckenna! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-05-17 01:55:27 UTC |
it looks like this PR has evolved to involve a lot of stuff. i'd suggest breaking it into tightly-scoped pieces for easier (faster) review |
Yes, it's comically large at the moment, apologies. Fully understand and agree it needs to be smaller to make review not a PITA. (I'd been keeping it open, as a draft, to simplify running the full test suite as I iterated on a few ideas.) I'll push a few more changes and add some high-level comments shortly, to figure out how best to split this up. |
8510414
to
1bc2741
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overwhelming majority of line changes come from edits to 2 test files:
pandas/tests/scalar/timedelta/test_arithmetic.py
pandas/tests/scalar/timedelta/test_timedelta.py
I've tried to keep all existing tests, especially those that map to specific PRs, albeit sometimes simplified, or updated to test a broader range of inputs. I also removed pandas/tests/scalar/timedelta/test_{constructors,formats.py}
, as they contained a lot of duplicate code, and moved the unique stuff to the other 2 files mentioned above.
The other component of this PR is a few updates to Timedelta
, simplifying/deduping the constructor logic and performing overflow checks. I can extract those bits into a standalone PR, though that would still leave a ton of changed lines to review.
@@ -892,6 +882,42 @@ cdef _timedelta_from_value_and_reso(int64_t value, NPY_DATETIMEUNIT reso): | |||
return td_base | |||
|
|||
|
|||
@cython.overflowcheck(True) | |||
cdef object create_timedelta(object value, str in_unit, NPY_DATETIMEUNIT out_reso): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the main functional changes introduced by this PR as it stands. The motivations:
- DRY up
Timedelta
creation logic, which was divided and duplicated over several functions - Do so outside
Timdelta.__new__
, since (AFAICT) there shouldn't yet be a public API for creating instances with non-nano resolution - Add overflow checks, and consistently raise
OutOfBoundsTimedelta
when its detected.
I can move these changes to a separate PR, to keep this one focused on tests alone, if that's desired.
if value is _no_input: | ||
if not len(kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplifications here are of a piece with the introduction of create_timedelta
above. Timedelta.__new__
had duplicated much of the logic in convert_to_timedelta64
, so that's removed. It's now mostly a wrapper around create_timedelta
, which wraps convert_to_timedelta64
and _timedelta_from_value_and_reso
.
+ int(kwargs.get('microseconds', 0) * 1_000) | ||
+ int(kwargs.get('milliseconds', 0) * 1_000_000) | ||
+ seconds | ||
kwargs["weeks"] * 7 * 24 * 3600 * 1_000_000_000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really be extracted into a separate function, but that's for later.
@@ -1,34 +1,159 @@ | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main changes, both here and in test_timedelta.py
:
- Use fixtures to extract, simplify repeated test setup
- Split up tests with many
assert
s, and/or distinct sections of setup code, into separate tests. - Parameterize tests, wherever sensible, over all arithmetic ops of a similar kind, e.g.
add
andradd
- Parameterize tests to run with
Array/Index/Series/DataFrame
s. - Consistent grouping of like tests into classes
- More descriptive test names
@@ -1,394 +0,0 @@ | |||
from datetime import timedelta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file largely duplicated a subset of the tests in test_timedelta.py
.
@@ -1,5 +1,19 @@ | |||
""" test the scalar Timedelta """ | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments are always at risk of becoming outdated or misleading, but hopefully a few brief notes here are helpful.
Timedelta, | ||
TimedeltaIndex, | ||
offsets, | ||
to_timedelta, | ||
) | ||
import pandas._testing as tm | ||
|
||
TD_UNITS = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some, but not all, of these seem to be defined in other modules, though they're not always exposed for testing, e.g.
pandas/pandas/_libs/tslibs/timedeltas.pyx
Line 95 in 1bc2741
cdef dict timedelta_abbrevs = { |
It's a tale of two cities on the perf front. A local run just now:
Looking into the regressions. |
Better now:
|
I've extracted the functional updates, which touch many fewer lines, into #47004, so this PR can focus on de-duping/expanding test coverage. |
Once #46854 and #47004 merge, I can drastically cut the scope of this PR, and focus it on just consolidating/deduping scalar If that's not useful, or isn't worth prioritizing, please LMK. |
honestly im having trouble motivating myself to carefully review this, but i'll give it another go once the other PRs are done. id advise against spending much time here in the interim |
Think it might be easier to just close this for now. If there are elements worth reviewing and merging, they can always be cherry-picked into a much more narrowly-scoped PR. Or this can be reopened. (As mentioned, I left this open in its current form only to simplify checking CI results especially the Windows and 32-bit runs. But it's probably too big now to easily whittle down into something reasonable and useful.) |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.This is a companion of sorts to #46854. It adds more thorough tests, and groups them by type, but doesn't change any behavior.
/cc @jbrockmendel @jreback