DEPR: QuarterBegin and BQuarterBegin return days that are not quarter beginnings #8435

Open
nimishgautam opened this issue Oct 1, 2014 · 24 comments
Labels
Deprecate Functionality to remove in pandas Frequency DateOffsets

Comments

@nimishgautam

In[43]: datetime(2014,10,10) + BQuarterBegin()
Out[43]: Timestamp('2014-12-01 00:00:00')

In[45]: datetime(2014,10,10) + QuarterBegin()
Out[45]: Timestamp('2014-12-01 00:00:00')

Expected output is 2015-01-01.
(Note QuarterEnd and BQuarterEnd do produce the expected output of 2014-12-31)

@jreback
Contributor

jreback commented Oct 1, 2014

The default startingMonth is 3 (which may be wrong; there is a comment in the code to that effect).

You can get what you expect by using startingMonth=1:

In [66]: datetime.datetime(2014,10,10) + pd.offsets.BQuarterBegin(startingMonth=3)
Out[66]: Timestamp('2014-12-01 00:00:00')

In [67]: datetime.datetime(2014,10,10) + pd.offsets.BQuarterBegin(startingMonth=1)
Out[67]: Timestamp('2015-01-01 00:00:00')
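
The effect of startingMonth can be modeled without pandas. The following is a minimal stdlib sketch, assuming (as the outputs above suggest) that adding the offset moves to the first day of the next month congruent to startingMonth modulo 3. `next_quarter_begin` is a hypothetical helper for illustration, not pandas' implementation.

```python
from datetime import date


def next_quarter_begin(d: date, starting_month: int = 3) -> date:
    """Return the first anchored quarter start strictly after ``d``.

    Hypothetical model: anchor months are those congruent to
    ``starting_month`` modulo 3 (3 -> Mar, Jun, Sep, Dec; 1 -> Jan, Apr, Jul, Oct).
    """
    year, month = d.year, d.month
    while True:
        # Step forward one month at a time, rolling over the year.
        month += 1
        if month > 12:
            year, month = year + 1, 1
        if (month - starting_month) % 3 == 0:
            return date(year, month, 1)


print(next_quarter_begin(date(2014, 10, 10)))                    # 2014-12-01
print(next_quarter_begin(date(2014, 10, 10), starting_month=1))  # 2015-01-01
```

Under this model the default of 3 anchors on March, June, September and December, which is why October 10 lands on December 1.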

If you'd like to investigate why this is, that would be appreciated.

I don't know when or how it came to be done this way.

@jreback jreback added this to the 0.15.1 milestone Oct 1, 2014
@jreback
Contributor

jreback commented Oct 1, 2014

If any of you have a comment about this:

cc @bjonen
cc @cancan101
cc @rockg
cc @MichaelWS

@MichaelWS
Contributor

I think I would prefer the default startingMonth for QuarterBegin to be 1.

@jreback
Contributor

jreback commented Oct 1, 2014

Agreed. Checking whether anyone knows why it is 3 and not 1.

@nimishgautam
Author

Just guessing, but there are a few countries whose fiscal years begin in March (month 3): http://en.wikipedia.org/wiki/Fiscal_year, but even then, the QuarterEnd and BQuarterEnd objects don't default to 3 to match.

@rockg
Contributor

rockg commented Nov 8, 2014

I'm inclined to change this default to the standard quarter definition (quarters begin in months 1, 4, 7, 10 and end in months 3, 6, 9, 12). Any objections?

It is wrong and inconsistent now:

pd.Timestamp('11/2/2012', tz='US/Eastern') + pd.tseries.offsets.QuarterBegin()
Out[29]: Timestamp('2012-12-01 00:00:00-0500', tz='US/Eastern')
pd.Timestamp('11/2/2012', tz='US/Eastern') + pd.tseries.offsets.QuarterEnd()
Out[30]: Timestamp('2012-12-31 00:00:00-0500', tz='US/Eastern')

@rockg
Contributor

rockg commented Nov 8, 2014

Related to #5307 probably.

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@chris-b1
Contributor

@jreback, I'm running into this issue fixing #11370 - I agree with the comments above that the current definition is inconsistent. But I suppose it would need to go through a deprecation cycle where it's breaking? cc @sinhrks

edit: on second thought I guess QS-DEC does sort of imply the quarter starts in December, so maybe it's better to just change the default for QuarterBegin() than the whole set of semantics like I was thinking originally.

# start / end definition for period is symmetrical
In [43]: pd.period_range('2014-1-1', periods=5, freq='Q-DEC').to_timestamp(how='e')
Out[43]: 
DatetimeIndex(['2014-03-31', '2014-06-30', '2014-09-30', '2014-12-31',
               '2015-03-31'],
              dtype='datetime64[ns]', freq='Q-DEC')

In [44]: pd.period_range('2014-1-1', periods=5, freq='Q-DEC').to_timestamp(how='s')
Out[44]: 
DatetimeIndex(['2014-01-01', '2014-04-01', '2014-07-01', '2014-10-01',
               '2015-01-01'],
              dtype='datetime64[ns]', freq='QS-OCT')

# not so for QuarterBegin / QuarterEnd

In [46]: pd.date_range('2014-1-1', periods=5, freq='Q-DEC')
Out[46]: 
DatetimeIndex(['2014-03-31', '2014-06-30', '2014-09-30', '2014-12-31',
               '2015-03-31'],
              dtype='datetime64[ns]', freq='Q-DEC')

In [47]: pd.date_range('2014-1-1', periods=5, freq='QS-DEC')
Out[47]: 
DatetimeIndex(['2014-03-01', '2014-06-01', '2014-09-01', '2014-12-01',
               '2015-03-01'],
              dtype='datetime64[ns]', freq='QS-DEC')

@kawochen
Contributor

Those are the months of IMM dates (March, June, September, December).

@jreback jreback modified the milestones: 0.20.0, Next Major Release Sep 20, 2016
@jreback jreback added the Deprecate Functionality to remove in pandas label Sep 20, 2016
@jreback jreback changed the title QuarterBegin and BQuarterBegin return days that are not quarter beginnings DEPR: QuarterBegin and BQuarterBegin return days that are not quarter beginnings Sep 20, 2016
@jreback
Contributor

jreback commented Sep 20, 2016

As discussed in #14254, this should be changed in 0.20. To make this backward compatible, I will propose that we show a warning if startingMonth is not specified. Of course this will show the warning for everyone who relies on the default, but I don't see a good alternative to avoid subtly changed behavior, e.g.

QuarterBegin() (starts on 3) -> QuarterBegin() (starts on 1) is quite subtle.
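
One common way to warn only when an argument is omitted is a sentinel default. The sketch below is stdlib-only and purely illustrative: `resolve_starting_month`, `_UNSET` and the warning text are hypothetical names chosen for this example, not anything that exists in pandas.

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from an explicit value


def resolve_starting_month(starting_month=_UNSET):
    """Hypothetical helper: return the effective startingMonth.

    Warns only when the caller relied on the default, so callers who
    pass ``starting_month=3`` explicitly see no warning.
    """
    if starting_month is _UNSET:
        warnings.warn(
            "The default startingMonth will change from 3 to 1 in a future "
            "version; pass startingMonth explicitly to keep current behavior.",
            FutureWarning,
            stacklevel=2,
        )
        return 3  # current default, kept for the deprecation period
    return starting_month
```

With this pattern, `resolve_starting_month()` warns and returns 3, while `resolve_starting_month(3)` and `resolve_starting_month(1)` stay silent.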

@TomAugspurger
Contributor

I will propose that we show a warning if startingMonth is not specified

+1

@jreback jreback modified the milestones: 0.20.0, 0.21.0 Mar 23, 2017
@jreback jreback removed this from the 0.20.0 milestone Mar 23, 2017
@jreback
Contributor

jreback commented Sep 23, 2017

any appetite for this one?

@jreback jreback modified the milestones: 0.21.0, 1.0 Oct 2, 2017
@tdpetrou
Contributor

tdpetrou commented Oct 4, 2017

Also, these offsets seem to be poorly documented. I had no idea there was an option for startingMonth.

@jbrockmendel jbrockmendel mentioned this issue Dec 19, 2017
@TomAugspurger TomAugspurger modified the milestones: 1.0, Contributions Welcome Jul 6, 2018
@TomAugspurger
Contributor

Pushing as not a blocker for 1.0

@datapythonista datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018
@mroeschke mroeschke removed the Bug label May 16, 2020
@Darrrian

Darrrian commented Sep 2, 2020

Hi, what is the status of this issue? It produces highly unexpected behavior, especially given the documentation and the behavior of QuarterEnd: with the default startingMonth = 3, QuarterEnd gives a quarter ending on 31 March, while QuarterBegin gives 1 March, and the two make no sense together. I believe both should have a default starting month of 1 (the financial example is surely a minority case); QuarterEnd should by default keep returning the same value, while QuarterBegin should return 1 January.
If there is any logical reason why it should not work like that, please let me know, but as it stands the two functions together return inconsistent values.

@jreback
Contributor

jreback commented Sep 3, 2020

@Darrrian happy to take a patch which deprecates this. If you read the issue and the related ones, there is no objection to the change.

@davs2rt

davs2rt commented Nov 18, 2020

This is not even self-consistent without startingMonth=1:

import pandas as pd

for m in range(1, 13):
    date = pd.Timestamp(2020, m, 1)
    qdate = date - pd.offsets.QuarterBegin()
    print(date, "is in quarter", date.quarter, "which begins", qdate)
2020-01-01 00:00:00 is in quarter 1 which begins 2019-12-01 00:00:00
2020-02-01 00:00:00 is in quarter 1 which begins 2019-12-01 00:00:00
2020-03-01 00:00:00 is in quarter 1 which begins 2019-12-01 00:00:00  <- 03-01 in Q1, but 
2020-04-01 00:00:00 is in quarter 2 which begins 2020-03-01 00:00:00    Q2 begins on 03-01
2020-05-01 00:00:00 is in quarter 2 which begins 2020-03-01 00:00:00
2020-06-01 00:00:00 is in quarter 2 which begins 2020-03-01 00:00:00
2020-07-01 00:00:00 is in quarter 3 which begins 2020-06-01 00:00:00
2020-08-01 00:00:00 is in quarter 3 which begins 2020-06-01 00:00:00
2020-09-01 00:00:00 is in quarter 3 which begins 2020-06-01 00:00:00
2020-10-01 00:00:00 is in quarter 4 which begins 2020-09-01 00:00:00
2020-11-01 00:00:00 is in quarter 4 which begins 2020-09-01 00:00:00
2020-12-01 00:00:00 is in quarter 4 which begins 2020-09-01 00:00:00
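
The mismatch can be checked mechanically. The following stdlib-only sketch models roll-back semantics (a date already on an anchor stays put, unlike the subtraction above, which always moves strictly earlier) and tests whether each anchor falls in the same calendar quarter as the date it was derived from. The helper names are hypothetical, and the modulo-3 anchoring is an assumption inferred from the outputs in this thread, not pandas' code.

```python
from datetime import date


def rollback_to_quarter_begin(d: date, starting_month: int) -> date:
    """Latest anchored quarter start on or before ``d`` (hypothetical model)."""
    year, month = d.year, d.month
    # Walk back month by month until we hit an anchor month.
    while (month - starting_month) % 3 != 0:
        month -= 1
        if month < 1:
            year, month = year - 1, 12
    return date(year, month, 1)


def calendar_quarter(d: date) -> int:
    """Calendar quarter 1..4, computed as (month - 1) // 3 + 1."""
    return (d.month - 1) // 3 + 1


def anchors_match_calendar_quarters(starting_month: int, year: int = 2020) -> bool:
    """True if every month's rolled-back anchor lies in that month's own quarter."""
    for m in range(1, 13):
        d = date(year, m, 1)
        start = rollback_to_quarter_begin(d, starting_month)
        if (start.year, calendar_quarter(start)) != (d.year, calendar_quarter(d)):
            return False
    return True


print(anchors_match_calendar_quarters(3))  # False
print(anchors_match_calendar_quarters(1))  # True
```

In this model only anchors on months 1, 4, 7 and 10 agree with the calendar-quarter numbering.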

Pawel-Kranzberg added a commit to Pawel-Kranzberg/pandas that referenced this issue Feb 18, 2021
@mroeschke mroeschke removed this from the Someday milestone Oct 13, 2022
@bryanwhiting

bryanwhiting commented Jun 16, 2023

Still wrong as of Pandas 1.5.3:

from datetime import date
import pandas as pd

anchor_date = date(2020, 10, 1)
start_date = (anchor_date + pd.offsets.QuarterBegin(1)).to_pydatetime().date()
start_date

image

But this works:

ipdb> anchor_date + pd.offsets.QuarterBegin(1, startingMonth=1)
Timestamp('2021-01-01 00:00:00')

@ThomasA

ThomasA commented Mar 5, 2024

I am amazed to see that this discussion has been going on for almost 10 years and there still does not seem to be any conclusion to it.
I think of quarters as starting in January, April, July, and October. I can make pandas.tseries.offsets.QuarterBegin behave accordingly by using (tested in Pandas 2.2.1):

for month in range(1,13):
    print(pd.Timestamp(month=month, day=1, year=2024) + pd.tseries.offsets.QuarterBegin(normalize=True, startingMonth=1))

2024-04-01 00:00:00
2024-04-01 00:00:00
2024-04-01 00:00:00
2024-07-01 00:00:00
2024-07-01 00:00:00
2024-07-01 00:00:00
2024-10-01 00:00:00
2024-10-01 00:00:00
2024-10-01 00:00:00
2025-01-01 00:00:00
2025-01-01 00:00:00
2025-01-01 00:00:00

But then I expect pd.tseries.offsets.QuarterEnd to return the ends of the same quarters when used with the same arguments:

for month in range(1,13):
    print(pd.Timestamp(month=month, day=1, year=2024) + pd.tseries.offsets.QuarterEnd(normalize=True, startingMonth=1))

2024-01-31 00:00:00
2024-04-30 00:00:00
2024-04-30 00:00:00
2024-04-30 00:00:00
2024-07-31 00:00:00
2024-07-31 00:00:00
2024-07-31 00:00:00
2024-10-31 00:00:00
2024-10-31 00:00:00
2024-10-31 00:00:00
2025-01-31 00:00:00
2025-01-31 00:00:00

Instead, I have to set startingMonth=3 for pd.tseries.offsets.QuarterEnd to return the ends of the same quarters as pd.tseries.offsets.QuarterBegin with startingMonth=1 and that seems inconsistent to me. I expect pd.tseries.offsets.QuarterBegin and pd.tseries.offsets.QuarterEnd to return, respectively, the beginnings and ends of the same quarters when run with the same arguments:

for month in range(1,13):
    print(pd.Timestamp(month=month, day=1, year=2024) + pd.tseries.offsets.QuarterEnd(normalize=True, startingMonth=3))

2024-03-31 00:00:00
2024-03-31 00:00:00
2024-03-31 00:00:00
2024-06-30 00:00:00
2024-06-30 00:00:00
2024-06-30 00:00:00
2024-09-30 00:00:00
2024-09-30 00:00:00
2024-09-30 00:00:00
2024-12-31 00:00:00
2024-12-31 00:00:00
2024-12-31 00:00:00
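
The expectation described here, one parameter that fixes both the beginning and the end of the same quarter, can be sketched with the standard library alone. `quarter_bounds` and its `first_month` parameter are hypothetical names for illustration; this is not how pandas defines startingMonth for QuarterEnd.

```python
import calendar
from datetime import date


def quarter_bounds(d: date, first_month: int = 1) -> tuple[date, date]:
    """Return (first day, last day) of the quarter containing ``d``.

    ``first_month`` is a hypothetical parameter: a month in which a quarter
    begins (1 -> Jan/Apr/Jul/Oct, 3 -> Mar/Jun/Sep/Dec).
    """
    # Months elapsed since the most recent quarter-start month (0, 1 or 2).
    offset = (d.month - first_month) % 3
    # Work with a running month index so year rollover is plain arithmetic.
    start_index = d.year * 12 + (d.month - 1) - offset
    start = date(start_index // 12, start_index % 12 + 1, 1)
    # The quarter ends on the last day of its third month.
    end_index = start_index + 2
    end_year, end_month = end_index // 12, end_index % 12 + 1
    last_day = calendar.monthrange(end_year, end_month)[1]
    return start, date(end_year, end_month, last_day)


print(quarter_bounds(date(2024, 2, 15)))                 # 2024-01-01 .. 2024-03-31
print(quarter_bounds(date(2024, 2, 15), first_month=3))  # 2023-12-01 .. 2024-02-29
```

Deriving both bounds from a single anchor keeps them paired by construction, whatever the chosen first month.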

@MarcoGorelli
Member

MarcoGorelli commented Mar 21, 2024

I expect pd.tseries.offsets.QuarterBegin and pd.tseries.offsets.QuarterEnd to return, respectively, the beginnings and ends of the same quarters when run with the same arguments:

Agree, this is really odd:

In [29]: pandas.tseries.offsets.QuarterBegin().rollback(datetime(2000, 1, 15, 2))
Out[29]: Timestamp('1999-12-01 02:00:00')

In [30]: pandas.tseries.offsets.QuarterEnd().rollforward(datetime(2000, 1, 15, 2))
Out[30]: Timestamp('2000-03-31 02:00:00')

I'd also like to change startingMonth to be 1, but that requires a deprecation cycle to not break everyone's code...

The way forwards may be:

  • pandas 3.0: issue a warning if startingMonth is not specified, warning that the default will change for QuarterBegin
  • pandas 4.0: enforce the change

Though aside from the default, QuarterEnd's definition of "end" seems really off. Maybe that can just be changed as a bug fix in 3.0

I'll bring this up in the next call

@rt87
Copy link

rt87 commented Aug 21, 2024

Okay, whoever and for whatever reasons decided that 3 would be the best default for startingMonth, I honestly do not care, but this has not been fixed for an entire DECADE. Come on...

@MarcoGorelli
Member

MarcoGorelli commented Aug 21, 2024

@rt87 :

Okay, whoever and for whatever reasons decided that 3 would be the best default for startingMonth, I honestly do not care, but this has not been fixed for an entire DECADE. Come on...

People on every historic issue: "this hasn't been fixed in a decade, just change it already!"

People if/when it gets changed: "this broke my code, stop changing things!"

This could be changed as a breaking change in 3.0, but I'm not sure I have the energy to deal with potential backlash

@Darrrian

Darrrian commented Aug 21, 2024

@rt87 So maybe fix it instead of being rude... The problem is not startingMonth = 3 itself; the problem is that the two functions are inconsistent with each other...
@MarcoGorelli Do you consider it a code-breaking change, given that the behavior is clearly wrong now and QuarterBegin and QuarterEnd work incorrectly?
Or maybe it would be possible to "solve" it by creating new, consistent methods, say QuarterBeginNew or something, haha... It is an annoyance, not a real problem...

@rt87

rt87 commented Aug 21, 2024

OF COURSE it may break something. Since when is this a reason not to fix things? Such thoughts always remind me of https://xkcd.com/1172/. By that logic, the XKCD heat problem will remain till the end of time... XD.

Also, did you stop to think about all the apps that may be working incorrectly because of this? Past, present and future? Yes, the responsible devs should have tested their code, but nonetheless... So, call me rude all you want, I am sticking with "needs to be fixed!". Not that anyone seems interested in my opinion, but I strongly feel that procrastinating is not the way to go here. And by the way, I never said "just screw everyone and flip the switch tomorrow". After all, there are means to handle breaking changes: introduce deprecations, etc. If that had been done a decade ago, this issue would now be solved.

And following that train of thought: Even if I do not really like it, libs DO introduce breaking changes in minor versions. Yes, they shouldn't, but sometimes it is warranted, e.g. for fixing bugs...

I'm obviously not in charge of this, so fix it in v2.x, fix it in v3, or just never fix it at all. Decide as you see fit! This is just my two cents regarding bugs, and I will continue to think (and speak) "Come on..." when known bugs have not been addressed for such a long time.
