BUG: set_index with passing key of first level of MI produces invalid result #24683
And here a smaller example (with floats, same problem):
On 1.2 master the OP throws:
In [16]: df = pd.DataFrame(np.random.randn(5, 4), columns=pd.MultiIndex.from_product([['A', 'B'], ['a', 'b']]))
In [17]: df
Out[17]:
          A                   B
          a         b         a         b
0  0.029458  0.639062 -0.405116  1.329762
1 -0.029833  0.670068  0.279081  0.259562
2 -0.003328 -0.585462  2.433622  1.408814
3 -0.620299 -0.255258  0.099439 -0.289729
4  0.691509 -0.801464  0.506687 -0.297512
In [18]: res = df.set_index('A')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-c4a76a0c5158> in <module>
----> 1 res = df.set_index('A')
/workspaces/pandas-arw2019/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
4635 )
4636
-> 4637 index = ensure_index_from_sequences(arrays, names)
4638
4639 if verify_integrity and not index.is_unique:
/workspaces/pandas-arw2019/pandas/core/indexes/base.py in ensure_index_from_sequences(sequences, names)
5595 if names is not None:
5596 names = names[0]
-> 5597 return Index(sequences[0], name=names)
5598 else:
5599 return MultiIndex.from_arrays(sequences, names=names)
/workspaces/pandas-arw2019/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, tupleize_cols, **kwargs)
393 return UInt64Index(data, copy=copy, dtype=dtype, name=name)
394 elif is_float_dtype(data.dtype):
--> 395 return Float64Index(data, copy=copy, dtype=dtype, name=name)
396 elif issubclass(data.dtype.type, bool) or is_bool_dtype(data):
397 subarr = data.astype("object")
/workspaces/pandas-arw2019/pandas/core/indexes/numeric.py in __new__(cls, data, dtype, copy, name)
70 if subarr.ndim > 1:
71 # GH#13601, GH#20285, GH#27125
---> 72 raise ValueError("Index data must be 1-dimensional")
73
74 subarr = np.asarray(subarr)
ValueError: Index data must be 1-dimensional

This seems like the right behavior? xref #25567 for the fix.

Re: tests, I think this is covered here: pandas/pandas/tests/indexing/test_indexing.py, Lines 54 to 90 in f34a56b, so potentially this issue can be closed.
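For anyone who wants to check this on their own install, here is a minimal sketch of the scenario above. It assumes only the public pandas API used in the session; the helper name try_set_index is hypothetical and not part of pandas, and it catches exceptions broadly because the exact exception may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the session above: two-level column MultiIndex.
columns = pd.MultiIndex.from_product([["A", "B"], ["a", "b"]])
df = pd.DataFrame(np.random.randn(5, 4), columns=columns)

# Selecting by a first-level key returns a two-column DataFrame,
# which is why it cannot be used directly as 1-dimensional index data.
selected = df["A"]


def try_set_index(frame, key):
    """Hypothetical helper: report what set_index does on this pandas version."""
    try:
        return "ok", frame.set_index(key)
    except Exception as exc:  # broad on purpose; see the note above
        return "error", f"{type(exc).__name__}: {exc}"


status, outcome = try_set_index(df, "A")
print(status, outcome if status == "error" else type(outcome.index))

# A full column key (one tuple per level) selects a single column,
# so the resulting index is 1-dimensional.
res = df.set_index([("A", "a")])
print(res.index.ndim, res.shape)
```

If the first call reports an error mentioning 1-dimensional index data, that matches the traceback above; an "ok" status with an index whose underlying values are 2D would match the originally reported bug.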
I haven't yet found a small reproducible example, but with the actual (also small) data, I see the following problem:
When doing a set_index with a key of the first level of the index (which I think is not supported), it actually gives a result, but an invalid one, as illustrated by the repr, which raises an error:
The invalid part is that res.index seems to be an Int64Index, but is backed by a 2D array:
Done with up-to-date master (0.24.dev)