BUG: df.assign no longer works with multilevel columns #61295

Closed
3 tasks done
Zendaug opened this issue Apr 15, 2025 · 4 comments

Zendaug commented Apr 15, 2025

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

# Creating a DataFrame with multilevel columns
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 4), columns=index)

# Try to create a new column using "assign"
df = df.assign(**{('C', 'one'): [1, 2, 3]})

Issue Description

The final line reports the error: TypeError: keywords must be strings

Expected Behavior

It should create a new column ('C', 'one') containing the values 1, 2, 3.

Installed Versions

INSTALLED VERSIONS

commit : 0691c5c
python : 3.12.3
python-bits : 64
OS : Windows
OS-release : 11
Version : 10.0.26100
machine : AMD64
processor : Intel64 Family 6 Model 154 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en
LOCALE : English_Australia.1252

pandas : 2.2.3
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
pip : 25.0.1
Cython : None
sphinx : 7.3.7
IPython : 8.30.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : 1.4.2
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 5.3.0
matplotlib : 3.10.0
numba : None
numexpr : 2.10.1
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 19.0.0
pyreadstat : 1.2.7
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.2
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
xlsxwriter : 3.1.1
zstandard : None
tzdata : 2023.3
qtpy : 2.4.1
pyqt5 : None

Zendaug added the Bug and Needs Triage labels Apr 15, 2025
@asishm
Contributor

asishm commented Apr 16, 2025

This is a Python limitation. You're using dictionary unpacking (`**`) to pass keyword arguments to a function, and keyword argument names have to be strings. https://stackoverflow.com/questions/65392503/keyword-error-generated-when-passing-a-dictionary-to-a-function-with-tuples-as-t

@arthurlw
Contributor

arthurlw commented Apr 16, 2025

Thanks for raising this! The assign function accepts **kwargs which only allows string keys, so tuple-based MultiIndex labels can’t be passed this way.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.assign.html

Technically speaking this isn't a pandas bug, but maybe pandas could consider supporting tuple keys?
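
For reference, a minimal sketch of one workaround: item assignment accepts a tuple key, so the new column can be set on a copy without going through `**kwargs`. The variable name `result` is only illustrative, and copying first is an assumption made here to keep the non-mutating feel of `assign`.

```python
import numpy as np
import pandas as pd

# Same kind of frame as in the report: two-level column labels.
columns = pd.MultiIndex.from_tuples(
    [("A", "one"), ("A", "two"), ("B", "one"), ("B", "two")],
    names=["first", "second"],
)
df = pd.DataFrame(np.random.randn(3, 4), columns=columns)

# Item assignment takes the tuple directly, so no keyword unpacking is involved.
# Work on a copy so the original frame is left untouched, as with assign.
result = df.copy()
result[("C", "one")] = [1, 2, 3]
```

This gives up method chaining, which may or may not matter for the use case in the report.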

@asishm
Contributor

asishm commented Apr 16, 2025

> pandas could improve the error message

This is not possible with the current signature: the error is raised directly by the Python interpreter.

In [1]: def foo(**kwargs):
    ...:     print(kwargs)
    ...:

In [2]: foo(**{('C', 'one'): 2})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 foo(**{('C', 'one'): 2})

TypeError: keywords must be strings
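
By contrast, a function that takes the dictionary as an ordinary positional argument never hits this check. The sketch below is a hypothetical helper, not part of pandas: the name `assign_columns` and its signature are assumptions made for illustration. It copies the frame and sets each key in turn, calling callables with the frame built so far, which is roughly how `assign` treats its keyword arguments.

```python
import pandas as pd


def assign_columns(df, new_columns):
    """Hypothetical helper (not a pandas API): like assign, but takes a dict."""
    out = df.copy()
    for key, value in new_columns.items():
        # Callables receive the frame built so far, similar to DataFrame.assign.
        out[key] = value(out) if callable(value) else value
    return out


columns = pd.MultiIndex.from_tuples([("A", "one"), ("B", "one")])
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], columns=columns)

# Tuple keys are fine here because the dict is passed as a value, not unpacked.
result = assign_columns(
    df,
    {
        ("C", "one"): [1, 2, 3],
        ("C", "two"): lambda d: d[("A", "one")] + d[("B", "one")],
    },
)
```

Because the helper returns a new frame, it can be used in a chain via `df.pipe(assign_columns, {...})`.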

@rhshadrach
Member

Agreed @asishm - nothing can be done here. Closing.

rhshadrach removed the Needs Triage label Apr 16, 2025