[WIP] implement tests using hypothesis #18761


Closed
jbrockmendel wants to merge 2 commits

Conversation

jbrockmendel (Member)

[skipci]

Related: #17978

Exposes a whole bunch of broken cases. Many of them appear to go uncaught at present because the large majority of offset tests use tz-naive pydatetime inputs.

Note this does not fix these bugs. The goal here is a proof of concept for using hypothesis and a starting point for discussing if/how we can make it useful.

It also doesn't add hypothesis to the requirements because I have no idea which file is appropriate for that.

jreback (Contributor) commented Dec 13, 2017

So you could add this to ci/requirements_dev.txt. But also, in this test module you need a pytest.importorskip('hypothesis').
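
A minimal sketch of the suggested guard (where exactly it goes in the PR's test module is an assumption):

# at the top of the test module, before importing hypothesis directly
import pytest

# skips the whole module cleanly when hypothesis is not installed
hypothesis = pytest.importorskip('hypothesis')
st = pytest.importorskip('hypothesis.strategies')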

jreback added the Testing label (pandas testing functions or related to the test suite) on Dec 13, 2017
jreback (Contributor) commented Dec 13, 2017

are these tests deterministic on a single run? e.g. we are not saving state between Travis runs

@st.composite
def gen_random_date_range(draw):
# TODO: Choose the min/max values more systematically
start = st.datetimes(min_value=pd.Timestamp(1900, 1, 1).to_pydatetime(),
Contributor:

Why not just datetime(1900, 1, 1)?

jbrockmendel (Member Author):

I think this was copy/pasted from somewhere else where it was originally pd.Timestamp.min; I had to adjust it to avoid OverflowErrors.

pganssle (Contributor) commented Dec 13, 2017

@jreback I don't think hypothesis tests are deterministic, but in my experience they are "deterministic enough" (they will usually find the most significant edge cases very quickly - they don't sample the space randomly). If you find edge cases, rather than trying to save the state of the database between runs, you can add specific examples with the @hypothesis.example decorator.
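
A minimal sketch of pinning a discovered edge case with @example (the invariant, bounds, and pinned value are illustrative, not from the PR):

from datetime import datetime

from hypothesis import example, given
from hypothesis import strategies as st

import pandas as pd

@given(dt=st.datetimes(min_value=datetime(1900, 1, 1),
                       max_value=datetime(2100, 1, 1)))
@example(dt=datetime(2038, 1, 1))  # a previously-found edge case, pinned permanently
def test_hour_offset_roundtrip(dt):
    # adding and then subtracting a fixed-duration offset should round-trip
    offset = pd.offsets.Hour(1)
    assert dt + offset - offset == dt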

# Allows None
return st.one_of(st.none(), hepytz.timezones())
# TODO: Weighting between naive and timezones?
# TODO: Get dateutil timezones?
pganssle (Contributor) commented Dec 13, 2017:

I have been briefly in contact with @DRMacIver about adding a hypothesis.extra.dateutil extra to do just this. It may require some additional public interface work on the dateutil side.

That said, it seems that all timezones provided by pytz map identically to dateutil zones, for all datetimes less than 2038-01-01. To prove it, here's a hypothesis test I ran:

from hypothesis import given, assume
from hypothesis import strategies as st
from hypothesis.extra import pytz as hepytz

from dateutil import tz
from datetime import datetime

@given(dt=st.datetimes(), tzi=hepytz.timezones())
def test_dateutil_compat(dt, tzi):
    tzi_du = tz.gettz(str(tzi))
    dt_pytz = tzi.localize(dt)
    dt_du = dt.replace(tzinfo=tzi_du)

    assume(dt < datetime(2038, 1, 1))
    assert dt_pytz == dt_du

So you should be able to get dateutil zones from tz.gettz(str(pytz_zone))

See dateutil/dateutil#590 for the reason why the assume(dt < datetime(2038, 1, 1)) is in there.

Contributor:

Also, does this need to be a function? Why not just assign timezone_strategy = st.one_of(st.none(), hepytz.timezones())?

jbrockmendel (Member Author):

Also, does this need to be a function?

Probably not. I'm still getting the hang of @composite, data, and draw.
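
For illustration, a sketch contrasting the two styles under discussion (the strategy body is from the PR; the function names are otherwise arbitrary):

from hypothesis import strategies as st
from hypothesis.extra import pytz as hepytz

# composite style: draw() pulls concrete values from sub-strategies,
# useful when per-example logic is needed
@st.composite
def timezone_strategy_composite(draw):
    return draw(st.one_of(st.none(), hepytz.timezones()))

# plain style: a bare strategy object suffices when there is no such logic
timezone_strategy = st.one_of(st.none(), hepytz.timezones())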

jbrockmendel (Member Author):

So you should be able to get dateutil zones from tz.gettz(str(pytz_zone))

That looks great, thanks. I'll take a look at dateutil/dateutil#590 to see if I can be helpful.



@st.composite
def gen_random_relativedelta_DateOffset(draw):
Contributor:

I think you need to add some assumptions here to make sure that your year, month, week, day, etc. values are valid. You can do this by drawing them from integers with min/max set, by drawing them as you are now and then adding assume statements, or (and this might be a bit out there) by drawing a random datetime() and using the values from that (it won't work for week, but you can calculate the week from the datetime if you want to use it, I imagine). See the sketch below for the first option.
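
A minimal sketch of the bounded-draw approach (the field bounds and the use of pd.DateOffset keywords are assumptions):

from hypothesis import strategies as st

import pandas as pd

@st.composite
def gen_relativedelta_dateoffset(draw):
    # bound each field so every drawn combination is a valid offset
    return pd.DateOffset(
        years=draw(st.integers(min_value=-100, max_value=100)),
        months=draw(st.integers(min_value=-12, max_value=12)),
        weeks=draw(st.integers(min_value=-53, max_value=53)),
        days=draw(st.integers(min_value=-366, max_value=366)),
    )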

jbrockmendel (Member Author):

Oh most definitely. I just threw those in and figured I'd put in the appropriate bounds one-by-one as errors came up.

jbrockmendel (Member Author)

After a couple of days of poking at this, I think it is worth implementing, but it is much too slow to run in the CI.

Running locally has turned up dozens of broken cases. To the extent that we can phrase tests as "for all A such that B, assert C", this can be a useful tool for identifying corner cases. When they are found, they should just be made into regular tests (and, uh, fixed).

pganssle (Contributor)

After a couple of days of poking at this, I think it is worth implementing, but it is much too slow to run in the CI.

You can always tune the hypothesis tests to reduce the amount run on any given CI run. Maybe on each PR you run 50 test cases or something, and then set up a master-branch build once a week that runs 500 or so.
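
A sketch of one way to do this with hypothesis settings profiles (the profile names, example counts, and environment variable are illustrative):

import os

from hypothesis import settings

# register profiles with different example budgets
settings.register_profile("pr", max_examples=50)       # cheap: every PR build
settings.register_profile("weekly", max_examples=500)  # thorough: scheduled build

# select the profile via an environment variable set in the CI config
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "pr"))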

jbrockmendel (Member Author)

I'm still interested in this, but closing because it is not actionable at the moment.
