fix #39556 (infer_freq not working with freq="H" and DST) #39644

Merged
merged 6 commits into from
Feb 23, 2021
Changes from 3 commits
2 changes: 1 addition & 1 deletion pandas/tests/tseries/frequencies/test_inference.py
@@ -267,7 +267,7 @@ def test_infer_freq_tz(tz_naive_fixture, expected, dates):
],
)
@pytest.mark.parametrize(
Contributor

This is not exactly the same as the OP (though it may have revealed the issue). This is a naive fixture (IOW the OP worked for naive & UTC), but NOT for other tzs.

So can you also make this more comprehensive?

Contributor Author

I am not sure I understand your comment yet ...

When I run the test test_infer_freq_tz_transition, it runs for many tzs (None, UTC, US/Eastern, Asia/Tokyo, ...), for date_pairs that cover the DST changes (fall, spring and no change) and for freq = "H" (among others, as it also tests other intraday frequencies). The test also refers to #8772, which is the issue I rephrased with a simple example in #39556.

My OP was only one case (tz=None, UTC, CET and freq=H) among these cases.
What was misleading in the original test is that the base frequency "H", which triggers the issue, was not covered (probably because the author thought that testing with "3H" would cover "H" plus other cases).

I can add a new test, but I do not see how it would differ from the current one (my date_pairs would just cover a full year, which is not really needed for the test, and my tz would be "CET", which is not covered yet, although other tzs with DST are).

The small change in the test (adding freq="H" to the frequencies to test) breaks pandas before the bugfix.

I could change the current comment from "# see gh-8772" to "# see gh-8772 and gh-39556" to make it clearer?

Contributor

Sorry, you are right; I was misreading the fixture.

- "freq", ["3H", "10T", "3601S", "3600001L", "3600000001U", "3600000000001N"]
+ "freq", ["H", "3H", "10T", "3601S", "3600001L", "3600000001U", "3600000000001N"]
)
def test_infer_freq_tz_transition(tz_naive_fixture, date_pair, freq):
# see gh-8772
7 changes: 4 additions & 3 deletions pandas/tseries/frequencies.py
@@ -239,17 +239,18 @@ def get_freq(self) -> Optional[str]:
if not self.is_monotonic or not self.index._is_unique:
return None

- delta = self.deltas[0]
- if _is_multiple(delta, _ONE_DAY):
+ delta = min(self.deltas)
Member

i think the idea behind using deltas[0] was that deltas should be unique at this point. is that not the case?

Contributor Author

@sdementen sdementen Feb 9, 2021

It is not unique when the index has a business day frequency, as you have deltas of 1 day or 3 days (for the weekend).
A first version of the bugfix I tried was to check uniqueness first and then take deltas[0] to fix the issue with freq=H and DST, but it broke the test for business days.
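
The non-uniqueness of business-day deltas can be checked without pandas. A minimal sketch, assuming a plain Monday-to-Friday calendar with no holidays; the helper `business_days` is hypothetical and not part of pandas:

```python
from datetime import date, timedelta


def business_days(start, count):
    """Return `count` consecutive Monday-to-Friday dates starting at `start`."""
    days = []
    current = start
    while len(days) < count:
        if current.weekday() < 5:  # 0..4 are Monday..Friday
            days.append(current)
        current += timedelta(days=1)
    return days


# 2021-02-01 is a Monday, so ten business days span exactly one weekend.
days = business_days(date(2021, 2, 1), 10)

# Sorted unique gaps (in days) between consecutive entries.
day_deltas = sorted({(b - a).days for a, b in zip(days, days[1:])})
print(day_deltas)
```

Two distinct deltas appear (a normal day and a weekend gap), so a uniqueness check placed before the daily rule would reject a valid business-day index.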

Contributor Author

After reading through the doc of unique_deltas, I see that self.deltas are already sorted => no need to take the min.
The key is to check that delta != 0, because with freq="H" and a tz with DST, the minimum delta is 0 (in local time).
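
To illustrate why a zero delta matters here, a minimal sketch in plain Python. It assumes a made-up fall-back transition where the UTC offset drops from +2h to +1h, and `is_multiple` and `sorted_unique_deltas` are hypothetical stand-ins written only from the description in this thread, not the pandas helpers themselves:

```python
ONE_HOUR = 3600            # seconds
ONE_DAY = 24 * ONE_HOUR    # seconds


def is_multiple(value, mult):
    """Stand-in for the multiple check: note that 0 is a multiple of anything."""
    return value % mult == 0


def sorted_unique_deltas(values):
    """Stand-in for sorted unique differences between consecutive values."""
    return sorted({b - a for a, b in zip(values, values[1:])})


# Six hourly timestamps in UTC (seconds since an arbitrary origin).
utc = [h * ONE_HOUR for h in range(6)]

# Assumed transition: offset is +2h for the first three stamps, +1h afterwards,
# so one wall-clock hour is repeated in local time.
local = [t + (2 if i < 3 else 1) * ONE_HOUR for i, t in enumerate(utc)]

utc_deltas = sorted_unique_deltas(utc)
local_deltas = sorted_unique_deltas(local)

# Check without the guard: the zero delta counts as a multiple of one day.
unguarded = is_multiple(local_deltas[0], ONE_DAY)
# Check with the guard: a zero delta is rejected before the multiple test.
guarded = bool(local_deltas[0] and is_multiple(local_deltas[0], ONE_DAY))

print(utc_deltas, local_deltas, unguarded, guarded)
```

In this sketch the local-time deltas contain 0 while the UTC deltas do not, and only the guarded check avoids treating the hourly data as daily.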

+ if delta and _is_multiple(delta, _ONE_DAY):
return self._infer_daily_rule()

# Business hourly, maybe. 17: one day / 65: one weekend
if self.hour_deltas in ([1, 17], [1, 65], [1, 17, 65]):
return "BH"

# Possibly intraday frequency. Here we use the
# original .asi8 values as the modified values
# will not work around DST transitions. See #8772
- elif not self.is_unique_asi8:
+ if not self.is_unique_asi8:
return None

delta = self.deltas_asi8[0]
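
The values 17 and 65 in the "Business hourly" comment of this hunk can be checked with simple arithmetic. A minimal sketch, assuming hourly stamps that run from 09:00 to 16:00 on business days (the dates are arbitrary and chosen only so that 2021-02-05 falls on a Friday):

```python
from datetime import datetime, timedelta

ONE_HOUR = timedelta(hours=1)

# Assumption: the last stamp of a business day is 16:00, the first is 09:00.
thursday_last = datetime(2021, 2, 4, 16)
friday_first = datetime(2021, 2, 5, 9)
friday_last = datetime(2021, 2, 5, 16)
monday_first = datetime(2021, 2, 8, 9)

# Gap across one night and across one weekend, in whole hours.
overnight_hours = (friday_first - thursday_last) // ONE_HOUR
weekend_hours = (monday_first - friday_last) // ONE_HOUR
print(overnight_hours, weekend_hours)
```

Together with the regular 1-hour step inside a day, these gaps give the hour-delta sets the check compares against.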