Skip to content

ENH: allow non-Tick offsets in index.round/ceil/floor #27090

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed

ENH: allow non-Tick offsets in index.round/ceil/floor #27090

wants to merge 5 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jun 28, 2019

related #27087 (Add DayBegin, DayEnd offsets)

I ran into #15303

In [4]: import pandas as pd
   ...: index=pd.date_range("2000-01-01","2000-01-02",freq="1H")
   ...: index.round("M")
ValueError: <MonthEnd> is a non-fixed frequency

This comment argues that implementing round() doesn't make sense here, since with non-fixed offsets it might round some midpoints up and some down. That might be what the user actually wants, but it's a subtle point. I put the commented code in anyway, just as an option, and implemented dti.floor/ceil.

If there's intent-to-merge, I'll add the tests etc'.

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little weary of adding rounding for exotic, non-fixed offsets with the potential ambiguity in behavior.

To begin, if you could create a table with a dummy example and the rounding behavior for the offests contained in this section, that'd help flesh out some potential edge cases:

https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects

@mroeschke mroeschke added Frequency DateOffsets Needs Discussion Requires discussion from core team before further action labels Jun 28, 2019
@ghost
Copy link
Author

ghost commented Jun 28, 2019

DatetimeIndex.snap() already does for me what this possible round would, and I don't need the extra knobs. So you're both not keen and I don't want it, why don't we just punt on it? Implementing ceil/floor which appear well-defined would still be a good step forward.

Is that reasonable? is the code ok?

@ghost ghost mentioned this pull request Jun 28, 2019
3 tasks
@mroeschke
Copy link
Member

Okay for ceil and floor, I think implementing in terms of #27087 (comment) would be fine.

@jreback
Copy link
Contributor

jreback commented Jun 28, 2019

see also #27098, implementing .round() would also be ok for non-Tick based offsets using something like the current implementation of .snap()

@ghost
Copy link
Author

ghost commented Jun 28, 2019

Okay for ceil and floor, I think implementing in terms of #27087 (comment) would be fine.

ok, good, i'll get it together.

As for round, I'm unclear. AFAICT, snap does exactly what we were concerned round shouldn't do.

@ghost
Copy link
Author

ghost commented Jun 30, 2019

Okay for ceil and floor, I think implementing in terms of #27087 (comment) would be fine.

To copy here, the suggested idiom is:

dr.to_period('MS').to_timestamp(how='end')                     

I switched to that and added tests.

There are two problems:

  1. PeriodIndex.to_timestamp fails for frequencies which imply start/end like "MS"/"BMS"/"BME".
import pandas as pd
from pandas._libs.tslibs import frequencies as libfrequencies
pi=pd.period_range("2001-01-01",periods=3,freq="MS")
dti=pi.to_timestamp(how='start') 

pandas/core/arrays/period.py in _get_ordinal_range(start, end, periods, freq, mult)
    883 
    884     if freq is not None:
--> 885         _, mult = libfrequencies.get_freq_code(freq)
ValueError: Invalid frequency: MS
  1. M implies MonthEnd. But dr.to_period('M').to_timestamp(how='start') rounds to the start of the month. It doesn't round to the previous MonthEnd, which is what I think it should do.
In [26]: import pandas as pd
    ...: from pandas._libs.tslibs import frequencies as libfrequencies
    ...: dti=pd.DatetimeIndex(["2001-02-01"])
    ...: print(dti)
    ...: dti.to_period('M').to_timestamp(how='start')
DatetimeIndex(['2001-02-01'], dtype='datetime64[ns]', freq=None)
Out[26]: DatetimeIndex(['2001-02-01'], dtype='datetime64[ns]', freq=None)

I think that result is what dti.to_period('MS').to_timestamp(how='start') is expected to do.

@ghost
Copy link
Author

ghost commented Jul 2, 2019

The approach I originally suggested was rejected (and had issues). The alternative you suggested doesn't work out. Some guidance, please?

A User wanted this two years ago and it stalled because they needed feedback and didn't get it. It'd be a shame for that to happen a second time.

@ghost
Copy link
Author

ghost commented Jul 10, 2019

I'd like to complete this, but I'm stuck without an answer. My approach was rejected, and the one suggested hit a wall. I'm wary of investing time in another approach only to have it rejected again.

@ghost
Copy link
Author

ghost commented Jul 17, 2019

ping.

@ghost
Copy link
Author

ghost commented Jul 31, 2019

I really wanted to get this working but it's been ignored enough to make me give up on it.

@ghost ghost closed this Jul 31, 2019
@ghost ghost deleted the support_non_tick_in_round_floor_ceil branch July 31, 2019 14:09
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Frequency DateOffsets Needs Discussion Requires discussion from core team before further action
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't use dt.ceil, dt.floor, dt.round with coarser target frequencies
2 participants