BUG: Inconsistent date_range output when using CustomBusinessDay as freq #57456
Comments
I am guessing the Python range won't return the last value. Does it relate to this issue?
I can replicate the issue (installed versions given at the bottom). I tried messing around with the inputs a bit, and found some potentially interesting things. For one, this does not appear to be an issue when there is no weekday mask. The following code works as you expect:
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
start = datetime(2024, 2, 8, 23)
end = datetime(2024, 2, 16, 14)
d_range = pd.date_range(start, end)
print(d_range)
#DatetimeIndex(['2024-02-08 23:00:00', '2024-02-09 23:00:00',
# '2024-02-10 23:00:00', '2024-02-11 23:00:00',
# '2024-02-12 23:00:00', '2024-02-13 23:00:00',
# '2024-02-14 23:00:00', '2024-02-15 23:00:00'],
# dtype='datetime64[ns]', freq='D')
start = datetime(2024, 2, 9, 23)
d_range = pd.date_range(start, end)
print(d_range)
#DatetimeIndex(['2024-02-09 23:00:00', '2024-02-10 23:00:00',
# '2024-02-11 23:00:00', '2024-02-12 23:00:00',
# '2024-02-13 23:00:00', '2024-02-14 23:00:00',
# '2024-02-15 23:00:00'],
# dtype='datetime64[ns]', freq='D')
I thought it might have something to do with the fact that the
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
start = datetime(2024, 2, 8, 23)
end = datetime(2024, 2, 15, 14)
offset = CustomBusinessDay(weekmask='Sun Mon Tue Wed')
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-11 23:00:00', '2024-02-12 23:00:00',
# '2024-02-13 23:00:00', '2024-02-14 23:00:00'],
# dtype='datetime64[ns]', freq='C')
start = datetime(2024, 2, 9, 23)
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-11 23:00:00', '2024-02-12 23:00:00',
# '2024-02-13 23:00:00', '2024-02-14 23:00:00'],
# dtype='datetime64[ns]', freq='C')
After that, I thought maybe it had to do with the fact that both the beginning and ending day fell on masked days, so I tried shifting the starting days back by one:
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
start = datetime(2024, 2, 7, 23)
end = datetime(2024, 2, 15, 14)
offset = CustomBusinessDay(weekmask='Sun Mon Tue Wed')
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-07 23:00:00', '2024-02-11 23:00:00',
# '2024-02-12 23:00:00', '2024-02-13 23:00:00'],
# dtype='datetime64[ns]', freq='C')
start = datetime(2024, 2, 8, 23)
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-11 23:00:00', '2024-02-12 23:00:00',
# '2024-02-13 23:00:00', '2024-02-14 23:00:00'],
# dtype='datetime64[ns]', freq='C')
So it seems like the problem does exist in this case, which was also the case of the original example given. After that, I tried to change the
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay
from datetime import datetime
start = datetime(2024, 2, 7, 23)
end = datetime(2024, 2, 14, 14)
offset = CustomBusinessDay(weekmask='Sun Mon Tue Wed')
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-07 23:00:00', '2024-02-11 23:00:00',
# '2024-02-12 23:00:00', '2024-02-13 23:00:00'],
# dtype='datetime64[ns]', freq='C')
start = datetime(2024, 2, 8, 23)
d_range = pd.date_range(start, end, freq=offset)
print(d_range)
#DatetimeIndex(['2024-02-11 23:00:00', '2024-02-12 23:00:00',
# '2024-02-13 23:00:00'],
# dtype='datetime64[ns]', freq='C')
So, overall, seems like it takes effect when the start and end days fall on masked days of the week.
Version details
INSTALLED VERSIONS
------------------
commit : a671b5a
python : 3.10.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22000
machine : AMD64
processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 2.1.4
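For anyone following the examples above, it may help to see which of the start and end dates actually fall inside the weekmask. The snippet below is a small standard-library sketch, not pandas code: `in_weekmask` and `DAY_NAMES` are hypothetical names introduced here, and the weekmask is assumed to be a space-separated string of three-letter day names as in the examples.

```python
from datetime import datetime

# Three-letter day names in datetime.weekday() order (Monday == 0).
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def in_weekmask(dt, weekmask):
    """Return True if dt's weekday is listed in the weekmask string.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    return DAY_NAMES[dt.weekday()] in weekmask.split()


weekmask = "Sun Mon Tue Wed"
# Start and end days used in the examples above.
for day in (7, 8, 9, 14, 15):
    dt = datetime(2024, 2, day)
    print(dt.date(), DAY_NAMES[dt.weekday()], in_weekmask(dt, weekmask))
```

With this weekmask, 2024-02-07 and 2024-02-14 (both Wednesdays) are inside the mask, while 2024-02-08, 2024-02-09, and 2024-02-15 are outside it.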
take
take
TST: Added test for date_range for bug #57456
Co-authored-by: Matthew Roeschke
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
The last date generated from date_range with start = datetime(2024, 2, 8, 23) does not equal the last date generated from date_range when start = datetime(2024, 2, 9, 23), other arguments unchanged.
When the start date falls on a day within the weekmask in CustomBusinessDay, date_range output does not include 2024-02-15 (Thursday). However, when the start date is on a day not included in the weekmask (Friday or Saturday), date_range does include 2024-02-15.
Expected Behavior
With a freq of CustomBusinessDay(weekmask='Sun Mon Tue Wed Thu') and an end of 2024-02-16, I'd expect date_range to always include 2024-02-15, regardless of start.
Installed Versions
INSTALLED VERSIONS
commit : 2e218d1
python : 3.10.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22621
machine : AMD64
processor : Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.5.3
numpy : 1.24.3
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 58.1.0
pip : 24.0
Cython : 0.29.32
pytest : 7.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : 1.1
pymysql : 1.0.2
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.4.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.3
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.1
pandas_gbq : None
pyarrow : 15.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.0
snappy : None
sqlalchemy : 2.0.12
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : 2023.3
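As a rough illustration of the Expected Behavior described above, here is a naive reference written with the standard library only. It is a sketch of the expected result, not how pandas generates the range: `expected_custom_business_range` and `DAY_NAMES` are hypothetical names, and the end time of 14:00 is an assumption borrowed from the first example in the comments, since the issue text only gives the end date.

```python
from datetime import datetime, timedelta

# Three-letter day names in datetime.weekday() order (Monday == 0).
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def expected_custom_business_range(start, end, weekmask):
    """Naive reference for the expected output (not pandas' algorithm).

    Steps one calendar day at a time from start, keeping start's time of
    day, and keeps every timestamp <= end whose weekday is in the weekmask.
    """
    allowed = set(weekmask.split())
    result = []
    current = start
    while current <= end:
        if DAY_NAMES[current.weekday()] in allowed:
            result.append(current)
        current += timedelta(days=1)
    return result


weekmask = "Sun Mon Tue Wed Thu"
end = datetime(2024, 2, 16, 14)  # assumed end time, see the note above

from_thursday = expected_custom_business_range(datetime(2024, 2, 8, 23), end, weekmask)
from_friday = expected_custom_business_range(datetime(2024, 2, 9, 23), end, weekmask)

print(from_thursday[-1])  # 2024-02-15 23:00:00
print(from_friday[-1])    # 2024-02-15 23:00:00
```

Under this model both starts end on 2024-02-15 23:00:00, which is the consistency the report asks for.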