Unstack throws when run on subset of DataFrame #19351
As a workaround, this yields the expected results:
index_cols = df1.index.names
df1.iloc[1:2].reset_index().set_index(index_cols).unstack('ix2')
PR welcome!
cannot repeat the error
bpython version 0.17.1 on top of Python 3.6.5 /usr/bin/python3
>>> import pandas as pd
>>> pd.__version__
'0.20.1'
>>> df3 = pd.DataFrame([
...     [1, 1, None, None, 30.0],
...     [2, None, None, None, 30.0],
... ], columns=[u'ix1', u'ix2', u'col1', u'col2', u'col3']).set_index([u'ix1', 'ix2'])
>>>
>>> df3.iloc[1:2].unstack('ix2')
     col1 col2  col3
ix2   NaN  NaN   NaN
ix1
2    None None  30.0
@EverydayQA cool, thanks! (But surprising: before closing, I think it's worth trying to git bisect until we understand what fixed the problem)
acarl005 pushed a commit to acarl005/pandas that referenced this issue (Nov 6, 2019)
Unit tests for this issue in PR #29438
jreback pushed a commit that referenced this issue (Nov 6, 2019)
Reksbril pushed a commit to Reksbril/pandas that referenced this issue (Nov 18, 2019)
proost pushed a commit to proost/pandas that referenced this issue (Dec 19, 2019)
Code Sample
Consider the following three cases:
Problem description
When DataFrame().unstack() is run on a subset of the DataFrame (i.e. the Index levels contain values that are not present in the subset), spurious errors are triggered.
We encountered the exception triggered by df1 in production when running groupby(A).apply(B), where the B function performed an unstack. When trying to build a reduced test case, we ran into the issues with df2 and df3 and found them worth reporting.
Note that none of these fail when the DataFrame has three or fewer columns.
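The situation described above can be sketched as follows. This is only an illustration, not the original code sample: it uses the df3 definition quoted in the reproduction attempt in this thread and applies the reset_index/set_index workaround to df3, which is an assumption (the workaround comment applied it to df1). On pandas versions where the problem is fixed, both calls are expected to complete without raising.

```python
import pandas as pd

# df3 as quoted in the reproduction attempt in this thread.
df3 = pd.DataFrame(
    [
        [1, 1, None, None, 30.0],
        [2, None, None, None, 30.0],
    ],
    columns=["ix1", "ix2", "col1", "col2", "col3"],
).set_index(["ix1", "ix2"])

# A one-row subset. Its 'ix2' index level still lists the value 1,
# even though that value no longer occurs in the subset.
subset = df3.iloc[1:2]

# Direct unstack on the subset: the call this issue reports as failing.
direct = subset.unstack("ix2")

# Workaround from the comments: rebuild the index so its levels
# only contain values that are present in the subset.
index_cols = list(df3.index.names)
workaround = subset.reset_index().set_index(index_cols).unstack("ix2")
```

Comparing subset.index.levels with the index rebuilt by the workaround shows the difference between the two inputs: only the former carries the unused level value.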
Expected Output
In all cases, I expect a pivoted DataFrame similar to this:
Actual Output
Output of pd.show_versions()