BUG: unstack(fill_value) does nothing when unstacking multiple columns #13971

Closed
jzwinck opened this issue Aug 11, 2016 · 2 comments · Fixed by #17887
Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Milestone

Comments

@jzwinck
Contributor

jzwinck commented Aug 11, 2016

In #9746 we got a fill_value argument for DataFrame.unstack(). But it has no effect if we unstack multiple levels at once:

import pandas as pd

df = pd.DataFrame({'x': ['a', 'a', 'b'], 'y': ['j', 'k', 'j'], 'z': [0, 1, 2], 'w': [0, 1, 2]})
df.set_index(['x', 'y', 'z']).unstack(['x', 'y'], fill_value=0)

That gives:

     w          
x    a         b
y    j    k    j
z               
0  0.0  NaN  NaN
1  NaN  1.0  NaN
2  NaN  NaN  2.0

It should give zeros instead of NaN.

Pandas 0.18.1.
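
To check whether a given pandas installation still shows this behaviour, the report above can be turned into a small self-check. This is only a sketch built from the example in this issue; the names `indexed`, `result`, and `n_missing` are illustrative, and the printed count depends on the installed pandas version.

```python
import pandas as pd

# Same data as in the report above.
df = pd.DataFrame({'x': ['a', 'a', 'b'], 'y': ['j', 'k', 'j'],
                   'z': [0, 1, 2], 'w': [0, 1, 2]})
indexed = df.set_index(['x', 'y', 'z'])

# Unstack both levels in a single call, asking for missing cells to be 0.
result = indexed.unstack(['x', 'y'], fill_value=0)

# Count the cells that are still missing; 0 means fill_value was honoured.
# isnull() is used because it is available in old and new pandas releases.
n_missing = int(result.isnull().sum().sum())
print(pd.__version__, "missing cells:", n_missing)
```

A non-zero count means the multi-level call ignored fill_value, as described in this report.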

@jreback
Contributor

jreback commented Aug 11, 2016

Multiple unstacks are fraught with issues; see #9023 and #11847. As a workaround, you can unstack one level at a time, as shown here. Of course, a PR to fix this is welcome!

In [19]: df = pd.DataFrame({'x':['a', 'a', 'b'], 'y':['j', 'k', 'j'], 'z':[0, 1, 2], 'w':[0, 1, 2]})
    ...: df.set_index(['x', 'y', 'z']).unstack('x', fill_value=0).unstack('y', fill_value=0)
Out[19]: 
   w         
x  a     b   
y  j  k  j  k
z            
0  0  0  0  0
1  0  1  0  0
2  0  0  2  0

I'll mark it, but it's a rabbit hole.
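
The chained workaround above can be wrapped in a small helper for an arbitrary list of levels. This is a minimal sketch, not part of pandas: `unstack_sequentially` is a hypothetical name, and it assumes the levels are referred to by name so that they stay valid after each step.

```python
import pandas as pd


def unstack_sequentially(frame, levels, fill_value=None):
    """Unstack `levels` one at a time, passing fill_value at every step.

    Hypothetical helper (not a pandas API). Levels should be given by
    name, because positional level numbers shift after each unstack.
    """
    result = frame
    for level in levels:
        result = result.unstack(level, fill_value=fill_value)
    return result


df = pd.DataFrame({'x': ['a', 'a', 'b'], 'y': ['j', 'k', 'j'],
                   'z': [0, 1, 2], 'w': [0, 1, 2]})
wide = unstack_sequentially(df.set_index(['x', 'y', 'z']), ['x', 'y'],
                            fill_value=0)
print(wide)
```

Note that unstacking level by level produces every combination of x and y as a column (including b/k, which has no data), whereas the single multi-level call in the original report only produces the combinations that occur in the data.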

@jreback jreback added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode Difficulty Intermediate labels Aug 11, 2016
@jreback jreback added this to the Next Major Release milestone Aug 11, 2016
@kordek
Contributor

kordek commented Aug 30, 2016

Unstacking is inconsistent because _unstack_frame is used if a single column name is passed to unstack(), while _unstack_multiple is used if a list is passed. Maybe it would be OK to experiment a little and make it more consistent? Or is there some reason (which I am not aware of) for this (almost) complete separation?

@jreback jreback modified the milestones: Next Major Release, 0.21.0 Oct 16, 2017