BUG: CustomBusinessMonthBegin(End) sometimes ignores extra offset #41356

Closed
3 tasks done
jackzyliu opened this issue May 6, 2021 · 0 comments · Fixed by #41488
Labels
Bug Frequency DateOffsets

@jackzyliu
Contributor

jackzyliu commented May 6, 2021

  • I have checked that this issue has not already been reported (although it is somewhat related to CustomBusinessMonthBegin offset parameter #14869).

  • I have confirmed this bug exists on the latest version of pandas. [1.2.4]

  • (optional) I have confirmed this bug exists on the master branch of pandas. [1.3.0.dev0+1544.ga43c42c32d.dirty]


Code Sample, a copy-pastable example

import pandas as pd
from datetime import timedelta
print(pd.__version__)    # 1.3.0.dev0+1544.ga43c42c32d.dirty

offset_1 = pd.offsets.CustomBusinessMonthBegin(n=1)  # 1 month BMS
# 1 month BMS + 5 days
offset_1_plus_5d = pd.offsets.CustomBusinessMonthBegin(n=1, offset=timedelta(days=5))
# 2 month BMS + 5 days
offset_2_plus_5d = pd.offsets.CustomBusinessMonthBegin(n=2, offset=timedelta(days=5))

test_timestamp = pd.Timestamp('2021-03-01')
print(test_timestamp + offset_1)          # Expected: 2021-04-01, Actual: 2021-04-01 [Correct]
print(test_timestamp + offset_1_plus_5d)  # Expected: 2021-04-06, Actual: 2021-04-01 [Incorrect]
print(test_timestamp + offset_2_plus_5d)  # Expected: 2021-05-08, Actual: 2021-05-08 [Correct]

Expected Output

The code snippet uses 2021-03-01 as the example; the listing below shows expected vs. actual outputs for both 2021-03-01 and 2021-04-01.

| Date | Offset | Expected | Actual | Correct |
|---|---|---|---|---|
| 2021-03-01 | offset_1 | 2021-04-01 | 2021-04-01 | ✔️ |
| 2021-03-01 | offset_1_plus_5d | 2021-04-06 | 2021-04-01 | ❌ |
| 2021-03-01 | offset_2_plus_5d | 2021-05-08 | 2021-05-08 | ✔️ |
| 2021-04-01 | offset_1 | 2021-05-03 | 2021-05-03 | ✔️ |
| 2021-04-01 | offset_1_plus_5d | 2021-05-08 | 2021-05-08 | ✔️ |
| 2021-04-01 | offset_2_plus_5d | 2021-06-06 | 2021-06-01 | ❌ |

Problem description

From reading the implementation, I think test_timestamp + offset_1_plus_5d, for example, should be equivalent to test_timestamp + offset_1 + timedelta(days=5), but the actual behavior is inconsistent: the extra offset is applied in some cases and dropped in others.
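The "Expected" values above can be reproduced without pandas. The following is a minimal sketch, assuming the default Monday-to-Friday business week and no holidays; the helper names business_month_start and expected_result are hypothetical and exist only for this illustration. It models the proposed equivalence (business month start, then the extra timedelta), not the pandas implementation.

```python
from datetime import date, timedelta


def business_month_start(year, month):
    """First Monday-to-Friday day of the given month (no holidays assumed)."""
    d = date(year, month, 1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def expected_result(start, n, extra=timedelta(0)):
    """n-th business month start strictly after `start`, plus `extra`.

    Hypothetical reference for "timestamp + offset_n + extra"; only n >= 1
    is handled in this sketch.
    """
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    year, month = start.year, start.month
    found = 0
    while True:
        candidate = business_month_start(year, month)
        if candidate > start:
            found += 1
            if found == n:
                return candidate + extra
        month += 1
        if month > 12:
            year, month = year + 1, 1


five_days = timedelta(days=5)
print(expected_result(date(2021, 3, 1), 1, five_days))  # 2021-04-06
print(expected_result(date(2021, 4, 1), 2, five_days))  # 2021-06-06
```

These two calls correspond to the rows where the actual pandas output (2021-04-01 and 2021-06-01) differs from the expected one.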

The issue lies in the .apply logic of _CustomBusinessMonth, which first rolls with a vanilla MonthBegin or MonthEnd (into the variable new) and then uses CustomBusinessDay to roll new to the corresponding business day plus the extra offset (into the variable result). However, if new is already on offset, no further rolling is applied and the extra offset (e.g. 5 days) is not added either, which leads to the inconsistencies shown here. See the implementations of rollforward and rollback.

As an example, with CustomBusinessMonthBegin(n=1, offset=timedelta(days=5)), 2021-03-01 first gets rolled to 2021-04-01. Since 2021-04-01 is already a business day, it is not rolled to the next business day and the extra 5-day offset is not applied either.
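The early return described above can be modeled in a few lines. This is a simplified stdlib sketch of the described behavior, not the pandas source: it assumes a Monday-to-Friday week with no holidays, and the function names roll_as_described and roll_always_adding_extra are hypothetical. The second function only illustrates the expected semantics; it is not claimed to be what the fix in #41488 does.

```python
from datetime import date, timedelta


def is_business_day(d):
    """Monday-to-Friday check; holidays are ignored in this sketch."""
    return d.weekday() < 5


def roll_as_described(new, extra):
    """Model of the reported behavior: if `new` is already a business day,
    it is returned unchanged and `extra` is silently dropped."""
    if is_business_day(new):
        return new  # early return: the extra offset is never added
    while not is_business_day(new):
        new += timedelta(days=1)
    return new + extra


def roll_always_adding_extra(new, extra):
    """Model of the expected behavior: roll forward if needed, then always
    add `extra`."""
    while not is_business_day(new):
        new += timedelta(days=1)
    return new + extra


five_days = timedelta(days=5)
# 2021-04-01 is a Thursday, so the early return is taken and 5 days are lost.
print(roll_as_described(date(2021, 4, 1), five_days))         # 2021-04-01
# 2021-05-01 is a Saturday, so rolling happens and the 5 days are added.
print(roll_as_described(date(2021, 5, 1), five_days))         # 2021-05-08
print(roll_always_adding_extra(date(2021, 4, 1), five_days))  # 2021-04-06
```

The model reproduces the pattern in the report: the extra offset survives only when the calendar month start falls on a weekend and therefore has to be rolled.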

Since _CustomBusinessMonth is the underlying implementation for both CustomBusinessMonthBegin and CustomBusinessMonthEnd, this issue applies to both classes (although I am only showing CustomBusinessMonthBegin).

Fix: #41488

Output of pd.show_versions()

INSTALLED VERSIONS

commit : a43c42c
python : 3.8.8.final.0
python-bits : 64
OS : Linux
OS-release : 5.8.0-50-generic
Version : #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0.dev0+1544.ga43c42c32d.dirty
numpy : 1.20.1
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 49.6.0.post20210108
Cython : 0.29.22
pytest : 6.2.2
hypothesis : 6.8.1
sphinx : 3.5.3
blosc : None
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.21.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.8.7
fastparquet : 0.5.0
gcsfs : 0.7.2
matplotlib : 3.3.4
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : 0.5.2
scipy : 1.6.1
sqlalchemy : 1.4.2
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.17.0
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.0

@jackzyliu jackzyliu added Bug and Needs Triage labels May 6, 2021
@jackzyliu jackzyliu changed the title BUG: CustomBusinessMonthBegin and CustomBusinessMonthEnd have inconsistent behaviors with offset parameter BUG: CustomBusinessMonthBegin and CustomBusinessMonthEnd sometimes does not apply the extra offset May 15, 2021
@jackzyliu jackzyliu changed the title BUG: CustomBusinessMonthBegin and CustomBusinessMonthEnd sometimes does not apply the extra offset BUG: CustomBusinessMonthBegin and CustomBusinessMonthEnd sometimes do not apply the extra offset May 15, 2021
@jackzyliu jackzyliu changed the title BUG: CustomBusinessMonthBegin and CustomBusinessMonthEnd sometimes do not apply the extra offset BUG: CustomBusinessMonthBegin(End) sometimes ignores extra offset May 15, 2021
@lithomas1 lithomas1 added Frequency DateOffsets and removed Needs Triage labels Jul 2, 2021
@jreback jreback added this to the 1.4 milestone Jul 28, 2021