fix #39556 (infer_freq not working with freq="H" and DST) #39644
- check that the deltas are unique before checking if they are day multiples
- add test with freq="H" that raises the bug
…taking for delta the minimum of deltas and checking delta is not null
Thanks @sdementen for the PR!
we'll need a whatsnew (I expect targeting 1.3)
pandas/tseries/frequencies.py
Outdated
@@ -239,17 +239,18 @@ def get_freq(self) -> Optional[str]:
        if not self.is_monotonic or not self.index._is_unique:
            return None

        delta = self.deltas[0]
        if _is_multiple(delta, _ONE_DAY):
        delta = min(self.deltas)
i think the idea behind using deltas[0] was that deltas should be unique at this point. is that not the case?
It is not unique when the index has a business-day frequency, since you then have deltas of 1 day or 3 days (over the weekend).
The first version of the bugfix I tried checked uniqueness first and then took deltas[0], which fixed the issue with freq="H" and DST, but it broke the test for business days.
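To illustrate the point about business days, here is a minimal stdlib-only sketch (it does not use pandas, and `business_days` is a hypothetical helper written just for this example). It builds two weeks of weekdays and collects the sorted unique gaps between consecutive dates:

```python
from datetime import date, timedelta
from typing import List


def business_days(start: date, count: int) -> List[date]:
    """Hypothetical helper: `count` consecutive weekdays (Mon-Fri) from `start`."""
    days = []
    current = start
    while len(days) < count:
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days.append(current)
        current += timedelta(days=1)
    return days


# 2021-02-01 is a Monday, so ten business days span exactly one weekend.
index = business_days(date(2021, 2, 1), 10)

# Sorted unique gaps between consecutive dates.
deltas = sorted({later - earlier for earlier, later in zip(index, index[1:])})
print(deltas)
```

The gaps come out as one day (within the week) and three days (Friday to Monday), so a check that requires a single unique delta would reject a business-day index.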
After reading through the doc of unique_deltas, I see that self.deltas are already sorted => no need to take the min.
The key is to check that delta != 0, because with freq="H" and a tz with DST, the minimum delta is 0 (in local time).
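The zero delta can be reproduced with a small stdlib-only sketch. This is not the pandas implementation: `local_wall_time` and `is_multiple` are hypothetical helpers, and the CET/CEST offsets are hard-coded as an assumption for the fall-back of 2020-10-25 (01:00 UTC). It shows that an hourly index spanning the transition has a smallest wall-clock delta of 0, and that 0 counts as a multiple of one day unless it is guarded against:

```python
from datetime import datetime, timedelta, timezone

ONE_DAY = timedelta(days=1)
ONE_HOUR = timedelta(hours=1)

# Assumption: on 2020-10-25 at 01:00 UTC, clocks in the CET zone go back
# from 03:00 CEST (UTC+2) to 02:00 CET (UTC+1).
FALL_BACK_UTC = datetime(2020, 10, 25, 1, 0, tzinfo=timezone.utc)


def local_wall_time(utc_instant):
    """Hypothetical helper: naive CET/CEST wall-clock time for a UTC instant."""
    hours = 1 if utc_instant >= FALL_BACK_UTC else 2
    return (utc_instant + timedelta(hours=hours)).replace(tzinfo=None)


def is_multiple(delta, unit):
    """True when `delta` is a whole multiple of `unit` (zero qualifies)."""
    return delta % unit == timedelta(0)


# Six hourly instants in UTC, spanning the transition.
start = datetime(2020, 10, 24, 22, 0, tzinfo=timezone.utc)
utc_index = [start + i * ONE_HOUR for i in range(6)]
wall = [local_wall_time(t) for t in utc_index]

# Sorted unique deltas in local wall-clock time: 02:00 occurs twice.
deltas = sorted({later - earlier for earlier, later in zip(wall, wall[1:])})

smallest = deltas[0]
unguarded = is_multiple(smallest, ONE_DAY)                    # zero passes the check
guarded = bool(smallest) and is_multiple(smallest, ONE_DAY)   # zero is rejected
print(deltas, unguarded, guarded)
```

Because the deltas are already sorted, taking the first element is equivalent to taking the minimum; the non-zero guard is what keeps the hourly index from being treated as a daily one.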
I have added the whatsnew entry
There is some check that fails after having added the what's new entry. Any clue why?
@sdementen The CI failures are due to #39688. They should be gone if you merge master.
cc @mroeschke i think this is an hour-based analogue of the problem of freq=Day vs freq=DayDST. i.e. this fixes one problem but will introduce others. im hesitant to do that, but the long-term fix has been stuck in limbo for a while
Do I need to do this myself? Or will this automatically be solved once someone merges my PR into master?
you wanna merge master yourself yeah
I haven't used
merged with master, all tests pass, ready to merge ;-)
@arw2019 @mroeschke, do I need to do something more re this PR for it to be merged?
can u also add an explicit test that is similar to the OP
The test
@@ -267,7 +267,7 @@ def test_infer_freq_tz(tz_naive_fixture, expected, dates):
    ],
)
@pytest.mark.parametrize( |
this is not exactly the same as the OP (though it may have revealed the issue). this is a naive fixture (IOW the OP worked for naive & UTC), but NOT for other tzs.
so can also make this more comprehensive
I'm not sure I understand your comment yet ...
When I run the test test_infer_freq_tz_transition, it runs for many tzs (None, UTC, US/Eastern, Asia/Tokyo, ...), for date_pairs that cover the DST changes (fall, spring and no change), and for freq = "H" (among others, as it also tests other intra-day frequencies). The test also refers to #8772, which is the issue I rephrased with a simple example in #39556.
My OP was only one case (tz=None, UTC, CET and freq=H) among these cases.
What was misleading in the original test is that the base frequency "H" that triggers the issue was not covered (probably because the author thought that testing with "3H" would cover "H" plus other cases).
I can add a new test, but I do not see how it would differ from the current one (only my date_pairs would cover a full year, which is not really needed for the test, and my tz would be "CET", which is not covered yet, although other tzs with DST are).
The small change in the test (adding freq="H" to the frequencies to test) breaks pandas before the bugfix.
I could change the current comment from "# see gh-8772" to "# see gh-8772 and gh-39556" to make it clearer?
sorry you are right, was misreading the fixture.
cc @mroeschke if any comments, merge when good
Thanks @sdementen
check that the deltas are unique before checking if they are day multiples
add test with freq="H" that raises the bug
closes BUG: 'infer_freq' does not work with tz != "UTC" #39556
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry => not sure where to enter this ...