Unstacking a MultiIndex with integer names is ambiguous #17123

Open
mroeschke opened this issue Jul 30, 2017 · 4 comments
Labels
Enhancement MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@mroeschke
Member

Code Sample, a copy-pastable example if possible

In [19]: index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'),
    ...:                                    ('two', 'a'), ('two', 'b')])

In [20]: s = pd.Series(np.arange(1.0, 5.0), index=index)

In [21]: s
Out[21]:
one  a    1.0
     b    2.0
two  a    3.0
     b    4.0
dtype: float64

In [22]: s.unstack(0)
Out[22]:
   one  two
a  1.0  3.0
b  2.0  4.0

In [24]: s.index.names = [1, 0]

In [25]: s.unstack(0)
Out[25]:
0      a    b
1
one  1.0  2.0
two  3.0  4.0

Problem description

Related #14969

When calling unstack(level=1) on a MultiIndex with integer names, it is ambiguous whether the argument refers to a level name or a level position, since both are valid level identifiers in unstack(). Currently the level name takes priority, as Out[25] above shows.

While this is probably an uncommon case, a keyword argument in unstack() could let the user state explicitly whether to unstack by level position or by level name.
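
For illustration, here is a minimal sketch of the ambiguity and of one possible way to target a position with the existing API. It assumes the name-first lookup described above, and it assumes the level names are unique and not None. The variable names `by_name` and `by_position` are only for this example.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("one", "a"), ("one", "b"), ("two", "a"), ("two", "b")],
    names=[1, 0],
)
s = pd.Series(np.arange(1.0, 5.0), index=index)

# The name 0 belongs to the level at position 1, and names take priority,
# so this unstacks the 'a'/'b' level rather than the first level.
by_name = s.unstack(0)

# Assumption: names are unique and not None. Looking up the name stored at
# position 0 and passing it on selects the first level unambiguously.
by_position = s.unstack(s.index.names[0])

print(by_name)
print(by_position)
```

This workaround breaks down when a level has no name or when names repeat, which is part of why an explicit keyword would help.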

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: 465c59f964c8d71d8bedd16fcaa00e4328177cb1
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.21.0.dev+313.g465c59f
pytest: 3.0.6
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.2
scipy: None
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.8
patsy: None
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None
@gfyoung gfyoung added MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode API Design labels Jul 30, 2017
@gfyoung
Member

gfyoung commented Jul 30, 2017

Uncommon or not, it makes sense that it should be clarified! 😄

@jbrockmendel
Member

Could/should there be a check for the ambiguous cases? This seems like the sort of thing where it should raise rather than guess.
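
As a rough sketch of what such a check might look like: `is_ambiguous_level` below is a hypothetical helper, not part of pandas. It flags an integer that matches a level name while also being a valid position that points at a different level.

```python
import numpy as np
import pandas as pd


def is_ambiguous_level(index: pd.MultiIndex, level) -> bool:
    """Hypothetical helper: True if `level` names one level but, read as a
    position, would select a different one."""
    # Only integers can be read as positions (bool is excluded on purpose).
    if isinstance(level, bool) or not isinstance(level, (int, np.integer)):
        return False
    names = list(index.names)
    if level not in names:
        return False  # not a name, so it can only be a position
    if not -index.nlevels <= level < index.nlevels:
        return False  # out of range as a position, so it can only be a name
    position = level % index.nlevels  # normalize negative positions
    return names.index(level) != position


tuples = [("one", "a"), ("one", "b"), ("two", "a"), ("two", "b")]
swapped = pd.MultiIndex.from_tuples(tuples, names=[1, 0])
aligned = pd.MultiIndex.from_tuples(tuples, names=[0, 1])
labelled = pd.MultiIndex.from_tuples(tuples, names=["x", "y"])

print(is_ambiguous_level(swapped, 0))
print(is_ambiguous_level(aligned, 0))
print(is_ambiguous_level(labelled, 0))
```

A caller could raise when the helper returns True instead of silently preferring the name.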

@gfyoung
Member

gfyoung commented Nov 22, 2017

I don't see why not. At the same time, we should try to allow the user to clarify their intentions, hence a new parameter as was suggested by @mroeschke

@mroeschke
Member Author

I think a signature of unstack(level=None, name=None, fill_value=None) would be better. It could still default to unstacking the last level, but the user could clearly specify whether to unstack by level position or by level name. Passing both arguments could raise as ambiguous.
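
To make the proposal concrete, here is a minimal sketch written as a standalone function rather than a change to pandas. `unstack_explicit` and its placeholder names are hypothetical, and the sketch assumes a Series or DataFrame with a MultiIndex whose unstacked result is a DataFrame. It temporarily swaps in string level names so that the existing name-first lookup can only hit the intended level.

```python
import numpy as np
import pandas as pd


def unstack_explicit(obj, level=None, name=None, fill_value=None):
    """Hypothetical wrapper: `level` is always a position, `name` always a name."""
    if level is not None and name is not None:
        raise ValueError("pass either level (position) or name, not both")
    names = list(obj.index.names)
    if name is not None:
        if names.count(name) != 1:
            raise KeyError(f"level name {name!r} not found exactly once")
        position = names.index(name)
    elif level is not None:
        # range indexing validates the position and normalizes negatives
        position = range(obj.index.nlevels)[level]
    else:
        position = obj.index.nlevels - 1  # default: last level

    # Temporary string names cannot be mistaken for integer positions.
    placeholders = [f"__level_{i}__" for i in range(len(names))]
    tmp = obj.copy(deep=False)
    tmp.index = obj.index.set_names(placeholders)
    result = tmp.unstack(placeholders[position], fill_value=fill_value)

    # Put the original names back on both axes of the result.
    restore = dict(zip(placeholders, names))
    result.index = result.index.set_names(
        [restore[n] for n in result.index.names]
    )
    result.columns = result.columns.set_names(
        [restore.get(n, n) for n in result.columns.names]
    )
    return result


index = pd.MultiIndex.from_tuples(
    [("one", "a"), ("one", "b"), ("two", "a"), ("two", "b")],
    names=[1, 0],
)
s = pd.Series(np.arange(1.0, 5.0), index=index)

print(unstack_explicit(s, level=0))  # first level, by position
print(unstack_explicit(s, name=0))   # level named 0, i.e. the second level
```

Keeping the two keywords separate means neither call has to guess, and passing both raises instead of choosing one.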
