Dataframe loses level name after concat, only if columns types are str #27230

pepicello · 2019-07-04T15:17:49Z

Code Sample

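import pandas as pd

# Case 1: str column labels. After the transpose, b's columns are ['1', '2'].
a = pd.DataFrame([1], index=[1], columns=['1'])
a.columns.names = ['n']
b = pd.DataFrame([1, 1], index=['1', '2'], columns=[1]).T
b.columns.names = ['n']
df1 = pd.concat([a, b])  # on pandas 0.24.2 the level name 'n' is lost

# Case 2: int column labels. After the transpose, b's columns are [1, 2].
a = pd.DataFrame([1], index=[1], columns=[1])
a.columns.names = ['n']
b = pd.DataFrame([1, 1], index=[1, 2], columns=[1]).T
b.columns.names = ['n']
df2 = pd.concat([a, b])  # on pandas 0.24.2 the level name 'n' is kept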

Problem description

When concatenating two DataFrames whose columns share the same level name ('n' in the case above), I would expect the result to keep that name. For df1, where the column labels are str, the name is lost. For df2, where the column labels are not str, the name is kept as expected. If dropping the name is the intended behaviour, the two results should at least agree.

Expected Output

df1.columns.name == df2.columns.name
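
The comparison above can be checked with a short script. This is only a sketch of the report's two cases: `build_frames` and `concat_columns_name` are helper names made up for this example, and the printed values depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def build_frames(labels):
    """Build the two frames from the report for the given column labels.

    Hypothetical helper for illustration; `labels` is a two-element list
    such as ["1", "2"] (str case) or [1, 2] (int case).
    """
    a = pd.DataFrame([1], index=[1], columns=labels[:1])
    a.columns.name = "n"
    # Transposing turns the index labels into column labels.
    b = pd.DataFrame([1, 1], index=labels, columns=[1]).T
    b.columns.name = "n"
    return a, b


def concat_columns_name(labels):
    """Return the columns name of the concatenated frames."""
    a, b = build_frames(labels)
    return pd.concat([a, b]).columns.name


if __name__ == "__main__":
    print("pandas", pd.__version__)
    print("str labels:", repr(concat_columns_name(["1", "2"])))
    print("int labels:", repr(concat_columns_name([1, 2])))
```

If the two printed names differ, the inconsistency described in this report is present in that pandas version.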

Output of `pd.show_versions()`

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.2
pytest: None
pip: 19.1.1
setuptools: 41.0.1
Cython: None
numpy: 1.17.0rc1
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

WillAyd · 2019-07-04T15:18:57Z

I think this is a duplicate of #21629 - can you check?

pepicello · 2019-07-04T15:21:14Z

Yes, it looks like it is. The only addition here is that it does not happen with non-str columns, but this can be closed, as solving that issue should also solve this one. Thanks!
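
Until the linked issue is resolved, one possible workaround is to put the name back after concatenating. The sketch below is an assumption, not something proposed in either issue: `concat_keep_columns_name` is a hypothetical wrapper, and it only covers row-wise concatenation of frames with single-level columns.

```python
import pandas as pd


def concat_keep_columns_name(frames, **kwargs):
    """Concatenate row-wise and restore a columns name shared by all inputs.

    Hypothetical helper for illustration only. The name is restored only
    when every input frame has the same, non-None columns name.
    """
    frames = list(frames)
    result = pd.concat(frames, **kwargs)
    names = {frame.columns.name for frame in frames}
    if len(names) == 1:
        name = names.pop()
        if name is not None:
            # rename_axis returns a new frame with the columns name set.
            result = result.rename_axis(columns=name)
    return result


if __name__ == "__main__":
    a = pd.DataFrame([1], index=[1], columns=["1"])
    a.columns.name = "n"
    b = pd.DataFrame([[1, 1]], index=[1], columns=["1", "2"])
    b.columns.name = "n"
    print(concat_keep_columns_name([a, b]).columns.name)
```

When the inputs carry different names, the wrapper leaves whatever `pd.concat` produced untouched rather than guessing which name to keep.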

pepicello closed this as completed Jul 4, 2019

WillAyd added the Duplicate Report label Jul 4, 2019
