ERR: reindexed non-included labels on a multiindex are dropped #7886
cc @immerrr |
FWIW, I agree that it's better to refuse temptation to guess and raise in such cases. |
me three, this is a weird operation |
though by virtue of fixing #7866
even though these are basically the same type of operation, conceptually
so maybe just doc this? |
I think incomplete missing keys must raise in any case: there's not enough information to insert new labels and there's no data to be retrieved with those (well, save for searchsorted-like lookups). For |
As of now, yes, I don't see anything broken there to be fixed. |
Here was the original motivation: http://stackoverflow.com/questions/25006197/multiple-key-cross-sections-in-pandas-dealing-with-misses-and-duplicate-indices/25014301#25014301 do you see any other way to do this? |
I'd go for boolean mask, like you proposed there. To me that sounds closer to the problem definition: find all rows where A is one of the following. In fact, it would probably be nice to have a |
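For reference, a minimal sketch of the boolean-mask approach mentioned above, assuming the goal is "all rows where level A is one of the following" and that missing labels should simply be skipped. The frame, the level names `A`/`B`, and the `wanted` list are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical frame with a two-level row index (names "A" and "B" are illustrative).
index = pd.MultiIndex.from_product([["a", "b", "c"], [1, 2]], names=["A", "B"])
df = pd.DataFrame({"value": range(6)}, index=index)

# Labels to select on level "A"; "z" is deliberately absent from the index.
wanted = ["a", "c", "z"]

# Build a boolean mask from the level values; absent labels just match nothing,
# so there is no need to guess what a "missing" row should contain.
mask = df.index.get_level_values("A").isin(wanted)
result = df[mask]

print(result)
```

The mask never inserts new labels, so the question of what to put in the other levels for a missing key does not come up.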
hmm, I like that idea for |
Or maybe even |
is more consistent (with how we use |
Yup, that last one is probably me getting too carried away with syntax sugar. |
ok, if you'd open a new issue (for have to think about reverting #7866 though (I agree it's a bit of a stretch, but it IS convenient) |
Ok, done |
Speaking of not enough information, I remembered that there's some kind of "variable length" multiindex emulation with empty string keys:
In [70]: df
Out[70]:
    a   b
    1   1
0   0   3
1   6   9
2  12  15
3  18  21
4  24  27
In [71]: df.loc[:, ('c','')] = 100.
In [72]: df
Out[72]:
    a   b    c
    1   1
0   0   3  100
1   6   9  100
2  12  15  100
3  18  21  100
4  24  27  100
In [73]: df['c']
Out[73]:
0    100
1    100
2    100
3    100
4    100
Name: c, dtype: float64
I'm not sure how it works across the library, though, it was so slow that we didn't even consider it. But I suppose it can be made to work nicely (e.g. change empty string to nan/nat to include numeric and datetime indices, optimize here and there) and then incomplete missing keys would be ok. |
I think this is actually a candidate for adding to a MultiIndex (maybe via an attribute or something). Separate issue though. |
related #4088
related #7867
I think this should raise, as it is not clear how this should work (e.g. should you get all the other levels set to nan?). Or, see my comment below, maybe just document?