-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: non-standard frequency Period arithmetic #23878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
we already rejected PeriodDelta as pretty niche and not worth the supports costs that said some of the above might be bugs |
I think that ints are fine to represent period differences except it needs some modifications/restrictions (of course, I'm not aware of any of the original decisions on how this should work, so please let me know if there is any reason these wouldn't work):
I would also consider allowing something like a 2-item tuple of ints, that add would also understand, to allow for out of phase operations. Which would just be (, ). These would be easy to implement, add almost no support cost and also make the Period operations much more reliable. Example Implementation (allowing for out of phase arithmetic): def __sub__(self, other):
ret_val = super().__sub__(other) # not actually super, just placeholder for existing logic
return divmod(ret_val, self.freq_scaling_factor) # not sure how to actually access this right now
def __add__(self, other):
if is_2_item_tuple_of_ints(other):
other_val = other[0] * self. self.freq_scaling_factor + other[1]
return (self.astype(self.base_freq) + other_val).astype(self.freq) # not perfect, astype doesn't always behave how I would want for this to work off the bat
else:
return self + other Example Implementation (for no out of phase operations): def __sub__(self, other):
ret_val = super().__sub__(other)
periods, out_of_phase = divmod(ret_val, self.freq_scaling_factor)
if out_of_phase:
raise ValueError('out of phase')
return periods |
Is this in master? We fixed a very similar bug recently (though that may have been for PeriodIndex and not Period) |
@jbrockmendel - I was on 23.4, but looks like Master is still wrong. So from what I can tell the
This would not be a problem (in fact it behaves the way I proposed in option 2 in the original post), however, it seems when diffing scaled freqs the scaling factor is multiplied by the base hours, this is unnecessary (and wrong) because of what was deomstrated above. So all that would need to be done is remove the multiplication of the resulting offset.
|
@jbrockmendel - I believe the fix would just be to change the following line: pandas/_libs/tslibs/period.pyx:1688
To
|
Great, a PR would be welcome. |
@jbrockmendel - PR created #23915 |
Code Sample
Problem description
In short: (A - B) + B != A
The result of diffing two periods of a scaled frequency is represented in the number of periods of the base frequency, not the scaled frequency. Thus when adding back to the scaled frequency the resulting Period will be N (where N is the scaling factor) times more periods than should be added.
The 2 options that don't involve some sort of custom delta object don't really work well:
Suggestion:
A simple PeriodDelta object that just contains (int, freq), where the int is the current resulting int. When added to a period, if the frequency matches the PeriodDelta freq, then the Period will be cast to it's base frequency the int added and then cast back. Ints will still be valid input, but the difference of Periods could return this object instead.
Output of
pd.show_versions()
v0.23.4
The text was updated successfully, but these errors were encountered: