With a MultiIndex whose categorical levels contain no missing values, reset_index() works:
In [23]: idx = pd.MultiIndex([pd.CategoricalIndex(['A', 'B']), pd.CategoricalIndex(['a', 'b'])], [[0, 0, 1, 1], [0, 1, 0, 1]])
In [25]: df = pd.DataFrame({'col': range(len(idx))}, index=idx)
In [26]: df
Out[26]:
     col
A a    0
  b    1
B a    2
  b    3
In [28]: df.reset_index()
Out[28]:
  level_0 level_1  col
0       A       a    0
1       A       b    1
2       B       a    2
3       B       b    3
Now with a missing value (note the final -1 in the labels; that is the only difference):
In [29]: idx = pd.MultiIndex([pd.CategoricalIndex(['A', 'B']), pd.CategoricalIndex(['a', 'b'])], [[0, 0, 1, 1], [0, 1, 0, -1]])
In [30]: df = pd.DataFrame({'col': range(len(idx))}, index=idx)
In [31]: df
Out[31]:
       col
A a      0
  b      1
B a      2
  NaN    3
In [32]: df.reset_index()
/home/joris/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py:4091: FutureWarning: Interpreting negative values in 'indexer' as missing values.
In the future, this will change to meaning positional indicies
from the right.
Use 'allow_fill=True' to retain the previous behavior and silence this
warning.
Use 'allow_fill=False' to accept the new behavior.
values = values.take(labels)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in maybe_upcast_putmask(result, mask, other)
249 try:
--> 250 np.place(result, mask, other)
251 except Exception:
~/miniconda3/lib/python3.5/site-packages/numpy/lib/function_base.py in place(arr, mask, vals)
2371 raise TypeError("argument 1 must be numpy.ndarray, "
-> 2372 "not {name}".format(name=type(arr).__name__))
2373
TypeError: argument 1 must be numpy.ndarray, not Categorical
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-32-6983677cc901> in <module>()
----> 1 df.reset_index()
~/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
4136 name = tuple(name_lst)
4137 # to ndarray and maybe infer different dtype
-> 4138 level_values = _maybe_casted_values(lev, lab)
4139 new_obj.insert(0, name, level_values)
4140
~/miniconda3/lib/python3.5/site-packages/pandas/core/frame.py in _maybe_casted_values(index, labels)
4092 if mask.any():
4093 values, changed = maybe_upcast_putmask(
-> 4094 values, mask, np.nan)
4095 return values
4096
~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in maybe_upcast_putmask(result, mask, other)
250 np.place(result, mask, other)
251 except Exception:
--> 252 return changeit()
253
254 return result, False
~/miniconda3/lib/python3.5/site-packages/pandas/core/dtypes/cast.py in changeit()
222 # isn't compatible
223 r, _ = maybe_upcast(result, fill_value=other, copy=True)
--> 224 np.place(r, mask, other)
225
226 return r, True
~/miniconda3/lib/python3.5/site-packages/numpy/lib/function_base.py in place(arr, mask, vals)
2370 if not isinstance(arr, np.ndarray):
2371 raise TypeError("argument 1 must be numpy.ndarray, "
-> 2372 "not {name}".format(name=type(arr).__name__))
2373
2374 return _insert(arr, mask, vals)
TypeError: argument 1 must be numpy.ndarray, not Categorical
I got around this issue by repeatedly calling reset_index(0), once for each level in the index. So df.reset_index(0).reset_index(0) accomplishes without error what df.reset_index() should.
import pandas as pd

def reset_multi_index_safe(df):
    """Pandas has a bug with resetting categorical multi-index if one of the index categories has a missing value. Issue #24206"""
    try:
        df = df.reset_index()
    except TypeError:  # pandas bug
        while type(df.index) is not pd.RangeIndex:
            df = df.reset_index(0)
    return df
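For anyone who wants to try the level-by-level idea on the example from the original report, here is a minimal sketch. It assumes a pandas version where the MultiIndex constructor takes `codes` (the keyword that replaced `labels`). The helper name `reset_index_level_by_level` and the level names `outer` and `inner` are made up for illustration and are not part of pandas.

```python
import pandas as pd


def reset_index_level_by_level(df):
    """Hypothetical helper: move index levels into columns one at a time."""
    # One reset_index(level=0) call per level of the original index.
    for _ in range(df.index.nlevels):
        df = df.reset_index(level=0)
    return df


# Same index as in the report; the trailing -1 code marks a missing value.
idx = pd.MultiIndex(
    levels=[pd.CategoricalIndex(["A", "B"]), pd.CategoricalIndex(["a", "b"])],
    codes=[[0, 0, 1, 1], [0, 1, 0, -1]],
    names=["outer", "inner"],  # names chosen for this example only
)
df = pd.DataFrame({"col": range(len(idx))}, index=idx)

flat = reset_index_level_by_level(df)
print(flat)
```

Each call inserts the level it removes as the first column, so the level columns can come out in a different order than a plain df.reset_index() would produce; reorder them afterwards if that matters.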
With pandas 0.25.3 or pandas 1.0.0 this fails in a slightly different way, with "ValueError: The result input must be a ndarray." This traceback is from pandas 1.0.0:
File "C:\Users\mboling\AppData\Local\Continuum\anaconda3\envs\pandas1test\lib\site-packages\pandas\core\frame.py", line 4600, in reset_index
level_values = _maybe_casted_values(lev, lab)
File "C:\Users\mboling\AppData\Local\Continuum\anaconda3\envs\pandas1test\lib\site-packages\pandas\core\frame.py", line 4551, in _maybe_casted_values
values, _ = maybe_upcast_putmask(values, mask, np.nan)
File "C:\Users\mboling\AppData\Local\Continuum\anaconda3\envs\pandas1test\lib\site-packages\pandas\core\dtypes\cast.py", line 272, in maybe_upcast_putmask
raise ValueError("The result input must be a ndarray.")
ValueError: The result input must be a ndarray.