df.reset_index introduces wrong elements with NaN index values #3727

floux · 2013-05-31T11:52:34Z

When the index contains NaN values, reset_index introduces incorrect values: each NaN is replaced by an existing label from the same level ('f' in the example below). This happens even when reset_index is applied to a different index level than the one containing the NaN values:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({
   ...:         'col1' : [1,2,3,4,5,6,7,8],
   ...:         'col2' : [8,7,6,5,4,3,2,1],
   ...: })

In [4]: arrays = [
   ...:         ['a','a','a','a','b','b','b','b'],
   ...:         ['c',None,'d',None,'e','e',None,'f']
   ...:         ]

In [6]: idx = pd.MultiIndex.from_tuples(list(zip(*arrays)),
   ...:         names=['first', 'second'])

In [7]: df.index = idx

In [8]: df
Out[8]:
              col1  col2
first second
a     c          1     8
      NaN        2     7
      d          3     6
      NaN        4     5
b     e          5     4
      e          6     3
      NaN        7     2
      f          8     1

In [9]: df.reset_index()
Out[9]:
  first second  col1  col2
0     a      c     1     8
1     a      f     2     7
2     a      d     3     6
3     a      f     4     5
4     b      e     5     4
5     b      e     6     3
6     b      f     7     2
7     b      f     8     1

In [10]: df.reset_index('second')
Out[10]:
      second  col1  col2
first
a          c     1     8
a          f     2     7
a          d     3     6
a          f     4     5
b          e     5     4
b          e     6     3
b          f     7     2
b          f     8     1

In [11]: df.reset_index('first')
Out[11]:
       first  col1  col2
second
c          a     1     8
f          a     2     7
d          a     3     6
f          a     4     5
e          b     5     4
e          b     6     3
f          b     7     2
f          b     8     1

jreback · 2013-05-31T13:14:17Z

This is a duplicate of #3586, which was fixed in #3587. The fix is currently in master:

In [25]: df
Out[25]: 
              col1  col2
first second            
a     c          1     8
      NaN        2     7
      d          3     6
      NaN        4     5
b     e          5     4
      e          6     3
      NaN        7     2
      f          8     1

In [26]: df.reset_index().set_index('first')
Out[26]: 
      second  col1  col2
first                   
a          c     1     8
a        NaN     2     7
a          d     3     6
a        NaN     4     5
b          e     5     4
b          e     6     3
b        NaN     7     2
b          f     8     1

In [27]: df.reset_index().set_index('second')
Out[27]: 
       first  col1  col2
second                  
c          a     1     8
NaN        a     2     7
d          a     3     6
NaN        a     4     5
e          b     5     4
e          b     6     3
NaN        b     7     2
f          b     8     1
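
To confirm the behaviour on a given install, the report above can be turned into a small check. This is only a sketch, assuming a pandas version that includes the fix; the names `expected_missing`, `full`, `by_second` and `by_first` are illustrative and not part of pandas. It compares the positions of missing labels before and after each reset_index call.

```python
import pandas as pd

# Same data as in the report above.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 6, 7, 8],
    "col2": [8, 7, 6, 5, 4, 3, 2, 1],
})
arrays = [
    ["a", "a", "a", "a", "b", "b", "b", "b"],
    ["c", None, "d", None, "e", "e", None, "f"],
]
df.index = pd.MultiIndex.from_tuples(
    list(zip(*arrays)), names=["first", "second"]
)

# Positions where the 'second' level has no label in the input.
expected_missing = [value is None for value in arrays[1]]

# Full reset: both levels become ordinary columns.
full = df.reset_index()

# Partial resets: move only one level out of the index.
by_second = df.reset_index("second")  # 'second' becomes a column
by_first = df.reset_index("first")    # 'second' stays as the index

# A missing label must stay missing instead of turning into 'f'.
print(full["second"].isna().tolist() == expected_missing)
print(by_second["second"].isna().tolist() == expected_missing)
print(by_first.index.isna().tolist() == expected_missing)
```

On a version affected by the bug, the comparisons would be expected to print False, because the missing labels come back as 'f'.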

floux · 2013-06-01T14:13:57Z

Okay, thanks.

floux closed this as completed Jun 1, 2013
