Skip to content

Add more overflow tests for timedelta64 operations #46854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 33 commits into from

Conversation

patrickmckenna
Copy link

@patrickmckenna patrickmckenna commented Apr 23, 2022

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This PR adds more specific tests, to verify the current, sometimes inconsistent behavior. Much of it has been reported/discussed in previous threads, and some appears to reflect intentional tradeoffs, e.g. allowing rounding error to avoid overflows. Other behavior seems likely unintentional:

  • Inconsistent usage of ValueError and OverflowError
  • Spurious OverflowErrors for 1-element Series w/ valid values

Since this is my first time poking around pandas/numpy internals, it seemed best to make sure the right tests are in the right places before attempting any fixes.

Open Questions

  • What exception should be raised if Series.sum() would yield an invalid Timedelta: ValueError, OutOfBoundsTimedelta, something else? (I can rewrite the tests to mark those cases that currently raise the "wrong" exception as xfail for now.)
  • Is silently introducing rounding error considered ok, or should a warning be emitted?
  • Is this the right spot for these tests? AFAICT, a lot of this behavior is determined by pandas.core.nanops, particularly special-casing of timedelta64 in nansum():
    if is_float_dtype(dtype):
    dtype_sum = dtype
    elif is_timedelta64_dtype(dtype):
    dtype_sum = np.dtype(np.float64)
    My assumption was that tests against the public API would be preferred, but please LMK if that's not the case.

Prior Art

@pep8speaks
Copy link

pep8speaks commented Apr 23, 2022

Hello @patrickmckenna! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-05-15 19:13:51 UTC

@patrickmckenna patrickmckenna changed the title Add more granular tests of current Series[timedelta64[ns]].sum() behavior Add more overflow tests for timedelta64 operations Apr 25, 2022
Copy link
Author

@patrickmckenna patrickmckenna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds some additional overflow tests for timedeltas. It's goal is to accurately capture the current behavior, not fix existing bugs/inconsistencies—but that's the next step 😄

@jbrockmendel as one who's been working on related issues somewhat recently, I'd appreciate your feedback! (Apologies if there are others I should be pinging, too...) This is my first time poking around pandas/numpy internals, so if there's a better way to write/organize the tests, please LMK.

from pandas.tests.arithmetic.common import (
assert_invalid_addsub_type,
assert_invalid_comparison,
get_upcast_box,
)

timedelta_types = (Timedelta, TimedeltaArray, TimedeltaIndex, Series, DataFrame)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the newly-added overflow tests are parameterized over Timedelta scalar and container types, on the assumption that similar exceptions should be raised for all. (The fact that they're not can be confusing, as mentioned in #43178.) Is that assumption accurate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that assumption is accurate

id prefer to keep the Timedelta scalar tests in tests.scalar.timedelta, then use the box_with_array fixture for the non-scalar tests. part of the virtue of this is that when writing the non-scalar tests, we assume that the scalar behavior is correct and tested.

left = wrap_value(Timedelta.max, add_sub_types.left)
right = wrap_value(positive_td, add_sub_types.right)

if add_sub_types.result is Timedelta:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be raising a OutOfBoundsTimedelta in all cases? Ref: #34448

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably, yes

@patrickmckenna patrickmckenna marked this pull request as ready for review April 25, 2022 23:21
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the tests. these look to be nicely comprehensive and you have a lot of questions about these as we may have tradeoffs in the existing operations. If you can move these to a new file will make it easier to run.

@jreback jreback added Testing pandas testing functions or related to the test suite Timedelta Timedelta data type labels Apr 26, 2022
Some td64 overflow tests remain in other modules:

  - tests/tslibs/test_conversion.py::test_ensure_timedelta64ns_overflows()
  - tests/tslibs/test_timedeltas.py::test_huge_nanoseconds_overflow()
  - tests/scalar/timedelta/test_timedelta.py::test_mul_preserves_reso()
  - tests/scalar/timedelta/test_constructors.py::test_construct_from_td64_with_unit(),test_overflow_on_construction()

Still TBD whether these should remain there or also be migrated.

See: github.com/pandas-dev/pull/46854#discussion_r858131625
@jbrockmendel
Copy link
Member

i'll take a look today

@@ -279,6 +281,28 @@ def box_expected(expected, box_cls, transpose=True):
return expected


def wrap_value(value, cls, transpose=False):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation for this was to simplify writing tests that should produce the same output for both scalar and container types, e.g. here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any other thematically similar functions that could be grouped together in a pandas._testing.foo instead of __init__? we've kind of stagnated, but in principle there's a goal to get stuff out of this file

timestamp = Timestamp("2021-01-01")
result = timestamp + timedelta_range("0s", "1s", periods=31)
expected = DatetimeIndex(
class TestAddSub:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this organization makes sense, I can open a follow up PR to migrate many of the existing tests into this and similar classes (and make them fully parameterized across all container types).

left = wrap_value(Timestamp.min, ts_add_sub_types[1])

ex = (OutOfBoundsDatetime, OverflowError)
msg = "|".join(["Out of bounds nanosecond timestamp", "Overflow in int64 addition"])
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Imagine this would be part of the same future PR mentioned in #46854 (comment)?

Copy link
Author

@patrickmckenna patrickmckenna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel @jreback per #46854 (comment), I've migrated the new tests back to existing modules. I've also removed the hypothesis usage, and rewritten the tests to follow the pattern of accepting box types as fixtures. Please LMK if you see anything else that needs changing.

Once this ships, I can open up a follow-on PR to make the exception raising more consistent.

)
tm.assert_index_equal(result, expected)
def test_sub_raises_if_result_would_overflow(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I tried rewriting this to keep the test function body constant, and mapping each input value to an appropriate xfail marker:

xfail_does_not_raise = pytest.mark.xfail(reason="doesn't raise", strict=True)
xfail_raises_overflow_error = pytest.mark.xfail(
    reason="raises wrong error",
    raises=OverflowError,
    strict=True,
)
# ...
    @pytest.mark.parametrize(
        "rval",
        [
            pytest.param(Timedelta(1), marks=xfail_does_not_raise),
            pytest.param(Timedelta(2), marks=xfail_raises_overflow_error),
            pytest.param(Timedelta.max, marks=xfail_raises_overflow_error),
        ],
    )    
    def test_sub_raises_td64_specific_error_if_result_would_overflow(
        self,
        max_td64: TD64_BOX_TYPE,
        rval: Timedelta,
        td64_type: Type[TD64_TYPE],
    ):
        rvalue = tm.wrap_value(rval, td64_type)
        min_td64 = -1 * max_td64

        with pytest.raises(OutOfBoundsTimedelta, match="too small"):
            min_td64 - rvalue

Unfortunately that slowed execution by a factor of ~40 (locally). Decorating the test with a single, catch-all xfail marker led to the same results. Passing run=False did help, but it was still ~5x slower.

@jbrockmendel
Copy link
Member

will take another look tomorrow

from functools import partial
from typing import (
Type,
Union,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use type and |?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish, that's a 3.10 feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if we do from __future__ import annotations? im sure we use pipes elsewhere

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I'd tried that and switching to pipes, and it was failing for type aliases like TD64_BOX_TYPE = TimedeltaArray | TimedeltaIndex | Series | DataFrame. But TIL: pipes will work for function annotations (with that import).

type does work. I'll switch to that, and replace the aliases with pipes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I'd tried that and switching to pipes, and it was failing for type aliases like TD64_BOX_TYPE = TimedeltaArray | TimedeltaIndex | Series | DataFrame. But TIL: pipes will work for function annotations (with that import).

type does work. I'll switch to that, and replace the aliases with pipes.

"""
dt64 = tm.wrap_value(Timestamp.now(), dt64_type)
td64 = tm.wrap_value(Timedelta(0), td64_type)
return type(dt64 + td64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is going to map Timedelta->Timestamp and TimedeltaArray->DatetimeArray but be an identity mapping for everything else?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding the question, but think the gist of what you're suggesting is right (assuming Array stands for every box type). So:

Timedelta + Timestamp -> Timestamp
Timedelta + DatetimeArray -> DatetimeArray
Timedelta + DatetimeIndex -> DatetimeIndex
Timedelta + Series -> Series
Timedelta + DataFrame -> DataFrame
TimedeltaArray + Timestamp -> DatetimeArray
TimedeltaArray + DatetimeArray -> DatetimeArray
...
TimedeltaIndex + Timestamp -> DatetimeIndex
TimedeltaIndex + DatetimeArray -> DatetimeIndex
...

Happy to update the docstring, or use a different pattern, if that's unclear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference (mentioned #46854 (comment)) is to handle the Timedelta/Timestamp cases in the existing tests/scalar/... files, and the non-scalar tests here. Then this wrapping becomes unnecessary and you can just use box_with_array

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about scalar/box tests, e.g. Timedelta + DatetimeIndex, where should those live?

@jreback jreback added this to the 1.5 milestone May 4, 2022
@jreback
Copy link
Contributor

jreback commented May 4, 2022

@jbrockmendel happy for you to merge this when you are ready

@patrickmckenna
Copy link
Author

happy for you to merge this when you are ready

@jreback appreciate that! Alas, no green button for me—I haven't got write access 🙃

@jreback
Copy link
Contributor

jreback commented May 4, 2022

happy for you to merge this when you are ready

@jreback appreciate that! Alas, no green button for me—I haven't got write access 🙃

@patrickmckenna that was to @jbrockmendel :->

@patrickmckenna
Copy link
Author

Hah, whoops! 🤦‍♂️😅

@patrickmckenna
Copy link
Author

Please LMK if there's more that needs doing here. Unsure what to do about the few red x's, which appear to be (transient?) failures during CI setup.

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jun 15, 2022
@mroeschke
Copy link
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke mroeschke closed this Aug 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Stale Testing pandas testing functions or related to the test suite Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants