BUG: Unable to use CustomBusinessDays in a MultiIndex #57949

Open
3 tasks done
cokelid opened this issue Mar 21, 2024 · 2 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments


cokelid commented Mar 21, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

bus_day = pd.offsets.CustomBusinessDay(holidays=['2023-07-04'])  # Mon-Fri except July 4th

dates = pd.date_range('2023-07-03', '2023-07-06', freq=bus_day)
df1 = pd.DataFrame({'col1': [300, 500, 600]}, index=dates)
print(df1.index.freq.holidays)  # Can get freq from dataframe index

mi = pd.MultiIndex.from_product([dates, ('foo','bar')])
df2 = pd.DataFrame({'col1': [300, 301, 500, 501, 600, 601]}, index=mi)
print(df2.index.get_level_values(0).freq)  # Gives None. Freq lost from dataframe multi-index

periods = pd.period_range('2023-07-03', '2023-07-06', freq=bus_day)  # Try PeriodIndex?
# Gives: TypeError: CustomBusinessDay cannot be used with Period or PeriodDtype

Issue Description

I can use a CustomBusinessDay frequency in a DatetimeIndex and retrieve the frequency from the .index of a dataframe.

If I use a DatetimeIndex in a MultiIndex, then I lose the freq from the level of the index.

I note the freq does persist in a PeriodIndex within a MultiIndex, but you can't use a CustomBusinessDay in a PeriodIndex as shown.

Expected Behavior

I would like the datetime level of the MultiIndex to keep the original freq:
print(df2.index.get_level_values(0).freq) would give <CustomBusinessDay>

Installed Versions

INSTALLED VERSIONS
------------------
commit : a671b5a
python : 3.10.13.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.145.2-1.cm2
Version : #1 SMP Wed Jan 17 15:39:07 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.1.4
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 65.5.0
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.22.2
pandas_datareader : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

@cokelid cokelid added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 21, 2024
Contributor

dontgoto commented Mar 23, 2024

This loss of information happens with DatetimeIndex but not with PeriodIndex. The reason is that the freq is an attribute of each value in the PeriodIndex. With DatetimeIndex, the freq (and similar attributes) is lost when calling get_level_values(), due to the shallow-copying logic.

DatetimeIndex is the only index type with this edge case.
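
To illustrate the difference described above, here is a minimal sketch that uses a plain daily frequency (chosen here only because CustomBusinessDay cannot be used with a PeriodIndex). The variable names are illustrative and not from the original report. A PeriodIndex carries its freq in its dtype, so it survives get_level_values(), whereas the DatetimeIndex level comes back with freq set to None.

```python
import pandas as pd

# Same date span as the report, but with a plain daily frequency so that
# both a DatetimeIndex and a PeriodIndex can be built from it.
dates = pd.date_range("2023-07-03", "2023-07-06", freq="D")
periods = pd.period_range("2023-07-03", "2023-07-06", freq="D")

mi_dates = pd.MultiIndex.from_product([dates, ("foo", "bar")])
mi_periods = pd.MultiIndex.from_product([periods, ("foo", "bar")])

# Each level value is repeated once per inner label.
dt_level = mi_dates.get_level_values(0)
period_level = mi_periods.get_level_values(0)

print(dates.freq)         # freq of the original DatetimeIndex
print(dt_level.freq)      # freq after the round trip through the MultiIndex
print(period_level.freq)  # PeriodIndex freq comes from its dtype
```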

Author

cokelid commented Mar 25, 2024

Thanks. Is this the shallow copy you're referring to?
return lev._shallow_copy(filled, name=name)
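
For anyone hitting the same problem, one possible workaround (a sketch, not a fix confirmed by the maintainers in this thread) is to keep a reference to the CustomBusinessDay offset and re-attach it after pulling the level out of the MultiIndex. This assumes the level holds every business day in the range with no gaps; otherwise the DatetimeIndex constructor raises a ValueError because the values do not conform to the passed frequency. The name `restored` is illustrative.

```python
import pandas as pd

# Mon-Fri except July 4th, as in the reproducible example.
bus_day = pd.offsets.CustomBusinessDay(holidays=["2023-07-04"])

dates = pd.date_range("2023-07-03", "2023-07-06", freq=bus_day)
mi = pd.MultiIndex.from_product([dates, ("foo", "bar")])
df2 = pd.DataFrame({"col1": [300, 301, 500, 501, 600, 601]}, index=mi)

# The level values contain each date twice, so they cannot carry a freq.
level_values = df2.index.get_level_values(0)

# Drop the repeats, then rebuild the index with the known offset.
# The constructor validates that the dates actually conform to bus_day.
restored = pd.DatetimeIndex(level_values.unique(), freq=bus_day)

print(level_values.freq)
print(restored.freq)
```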
