Upsampling a time-series is missing an option to properly deal with the end #10449
Ah. I misspoke earlier. The two |
I don't think consistency is what you want here. That would mean shortening the Series even more, so that it ends at 2015-01-01 00:45:00 instead of the expected 2015-01-01 01:45:00.
What I would expect a priori from pd.date_range(closed="left") is a representation of the interval, excluding its right endpoint, like
What date_range [or Series?] does is forget not only the endpoint, but the complete last interval
As far as I can see, you can only get it to go one step past the old endpoint if the new frequency is not a multiple of the old one, like resample("45T") gets you
But if the new frequency is a multiple of the old one, it will always end at the last endpoint, no matter which option I tried. |
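For reference, the behaviour described above (the upsampled series ending at the last original timestamp) can be reproduced with a two-point hourly series. This is a minimal sketch, not the snippet from the original comment: the values 1.0 and 2.0 are made up, frequencies are passed as pd.Timedelta objects rather than alias strings, and the right endpoint is dropped by slicing instead of via closed="left", so the sketch does not depend on that keyword (newer pandas versions call it inclusive).

```python
import pandas as pd

start = pd.Timestamp("2015-01-01 00:00")
end = pd.Timestamp("2015-01-01 02:00")

# Hourly, left-closed range [00:00, 02:00): timestamps 00:00 and 01:00.
# Slicing off the last point stands in for closed="left" / inclusive="left".
hourly = pd.date_range(start, end, freq=pd.Timedelta(hours=1))[:-1]
s = pd.Series([1.0, 2.0], index=hourly)  # illustrative values

# Upsample to 15 minutes with forward fill.
upsampled = s.resample(pd.Timedelta(minutes=15)).ffill()

# The result stops at 01:00, the last original timestamp, so the
# quarter hours 01:15, 01:30 and 01:45 are missing.
print(upsampled)
```

If each timestamp is read as the start of a one-hour interval, the last hour is represented by a single point only.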
That is exactly what I mean :) In our context (energy), the timestamps almost always represent the beginning of an interval. In other cases they represent the end, but that fails equally, I think. |
@filmor I am not sure why you think this should go to the 2 hour ever. It's not included in the sample. You can simply do this:
|
Yeah, for that I have to remember the end-points. But by having an hourly time series with values for each point in time I already indicate (and that's what forward-fill does correctly for all points but the last one) that each timestamp represents the start of a one-hour interval. |
"I am not sure why you think this should go to the 2 hour ever." Because I expect(ed) pd.date_range("2015-01-01", "2015-01-01 02:00", freq="1H", closed="left") to be an object that models the range from 00:00 until 02:00 (excluded), meaning that all timestamps until 02:00-\eps are in the range. If you sample with 1H frequency, that happens to stop at 01:00, but that doesn't mean that 01:45 is not in the range I meant to cover. Therefore I expected to get timestamps between 01:00 and 02:00 back if I resample to a higher frequency, or at least to have an option for that in .resample, which I don't seem to have. For how it looks to me, but you usually have your reasons why you go one step longer with an open ending, as opposed to stopping already one step earlier with a closed ending. |
stops at 1, so not sure how that should somehow magically go to 2. A closed/open right hand interval would generally include/exclude a single right hand point (e.g. the 1 in this case). If you want to upsample to 2, then simply reindex it in the first place. Resample is already quite magical; this would add another layer. All that said, if you think that you can find a reasonable API that preserves back-compat, go for it. |
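One possible reading of "reindex it in the first place" is to extend the hourly series by its right endpoint before resampling and to drop that endpoint again afterwards. The following is only a sketch of that reading, not code from the thread; the sample values and the variable names are assumptions.

```python
import pandas as pd

start = pd.Timestamp("2015-01-01 00:00")
end = pd.Timestamp("2015-01-01 02:00")
hour = pd.Timedelta(hours=1)
quarter = pd.Timedelta(minutes=15)

# Hourly series on [00:00, 02:00) with illustrative values.
s = pd.Series([1.0, 2.0], index=pd.date_range(start, end, freq=hour)[:-1])

# Reindex onto the range that includes 02:00 and carry the last value forward.
extended = s.reindex(pd.date_range(start, end, freq=hour)).ffill()

# Upsample, then drop the artificial 02:00 point again.
upsampled = extended.resample(quarter).ffill().iloc[:-1]
print(upsampled)
```

This still requires knowing the intended end of the range, which is the objection raised earlier in the thread.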
It took me hours to land here. My understanding is that |
@decatur
In my understanding, the
Btw: energy data here as well ;-) |
yeah @winklerand soln is the right one here (along with some docs) for doing this. |
Consider the following:
The result looks like this
What I actually want is
Currently it seems that you always have to do the resampling yourself by creating a new index of the new frequency from the same `begin` and `end` values, reindexing, and forward filling.
Actually, I would have expected `closed` to work like this. Any hints on a reasonable parameter so I can try to prepare a PR for this?
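The manual workaround described here (new index from the same begin and end, reindex, forward fill) can be sketched as follows. The series values are made up for illustration, and the left-closed ranges are built by slicing off the last point rather than with the closed keyword, so treat this as an assumption-laden sketch rather than the original snippet.

```python
import pandas as pd

begin = pd.Timestamp("2015-01-01 00:00")
end = pd.Timestamp("2015-01-01 02:00")

# Original hourly series on [begin, end), illustrative values.
hourly = pd.date_range(begin, end, freq=pd.Timedelta(hours=1))[:-1]
s = pd.Series([1.0, 2.0], index=hourly)

# New 15-minute index over the same half-open interval [begin, end).
fine = pd.date_range(begin, end, freq=pd.Timedelta(minutes=15))[:-1]

# Reindex onto the finer grid and forward fill in one step.
upsampled = s.reindex(fine, method="ffill")
print(upsampled)
```

The result covers 00:00 through 01:45, so the last hour gets its four quarter-hour values just like the first one.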