Skip to content

BUG: pd.pivot_table with single column return DataFrame.columns as MultiIndex #23379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Sobes76rus opened this issue Oct 27, 2018 · 3 comments
Closed
Labels
MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@Sobes76rus
Copy link

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.arange(9).reshape(3, 3), columns=['a', 'b', 'c'])
df2 = df1.loc[df1['a'] == 100]
df3 = pd.pivot_table(df2, values='a', index='b', columns='c', aggfunc='sum')
df3.columns

Problem description

df3.columns is MultiIndex

MultiIndex(levels=[['a'], []],
           labels=[[], []],
           names=[None, 'c'])

In sample above I call pivot_table with columns='c' (just one column) so I expect pivot_table to return DataFrame with Index-ed columns, not MultiIndex-ed.

More examples:

I did some research to understand similar cases. Here is a bunch of examples you can find useful.

  1. df2 defined as Empty DataFrame using pd.DataFrame(). df3.columns in this case is Index (makes sense).
In [12]: df2 = pd.DataFrame(columns=['a', 'b', 'c'])
    ...: df3 = pd.pivot_table(df2, values='a', index='b', columns='c', aggfunc='sum')
    ...: df3.columns
Out[12]: Index([], dtype='object')
  1. df2 defined as df1.loc[df1['a'] == 0]. df3.columns in this case is Index (makes sense too).
In [13]: df2 = df1.loc[df1['a'] == 0]
    ...: df3 = pd.pivot_table(df2, values='a', index='b', columns='c', aggfunc='sum')
    ...: df3.columns
Out[13]: Int64Index([2], dtype='int64', name='c')
  1. df2 defined as Empty DataFrame with dtype=Int64. df3.columns in this case is MultiIndex (and this not makes sense at all, what is the difference?).
In [19]: df2 = pd.DataFrame(columns=['a', 'b', 'c'], dtype='int64')
    ...: df3 = pd.pivot_table(df2, values='a', index='b', columns='c', aggfunc='sum')
    ...: df3.columns
Out[19]:
MultiIndex(levels=[['a'], []],
           labels=[[], []],
           names=[None, 'c'])
  1. df2 defined as not empty DataFrame. Call of pivot_table return Index-ed DataFrame (and this makes sense again).
In [24]: df2 = pd.DataFrame([[1, 2, 3]], columns=['a', 'b', 'c'], dtype='int64')
    ...: df3 = pd.pivot_table(df2, values='a', index='b', columns='c', aggfunc='sum')
    ...: df3.columns
Out[24]: Int64Index([3], dtype='int64', name='c')
My assumptions:

For Empty DataFrame with dtype=object pivot_table returns DataFrame with Index-ed columns (as expected)
For non-empty DataFrame pivot_table returns DataFrame with Index-ed columns (as expected)
For Empty DataFrame with numeric dtype pivot_table returns DataFrame with MutiIndex-ed columns (I suppose it’s a bug)

Any insights?

Expected Output

Int64Index([], dtype='int64', name='c')

Output of pd.show_versions()

commit: None python: 3.6.2.final.0 python-bits: 64 OS: Linux OS-release: 4.15.0-38-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8

pandas: 0.23.4
pytest: 3.1.2
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.28.5
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@gfyoung gfyoung added Reshaping Concat, Merge/Join, Stack/Unstack, Explode MultiIndex labels Oct 27, 2018
@gfyoung
Copy link
Member

gfyoung commented Oct 27, 2018

That does look a little odd, but not sure if intentional or not...

cc @jreback @toobaz

@0x26res
Copy link

0x26res commented Oct 7, 2019

#21932 and #13483 are possible duplicates

@mroeschke
Copy link
Member

Yes, appears to be a duplicate of #21932 which just needs a test. Closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

No branches or pull requests

4 participants