Option to suppress automatic conversion of tuples to MultiIndex #11799
Comments
You can do this, but using tuples as index is VERY awkward and barely supported. These are much more naturally represented (and performant) as
|
@jreback Thanks for the response. The problem is that
Even though
So there seem to be API inconsistencies here (if not bugs). |
@htzh it's not exposed because it's not recommended in any way to do this. To be honest, we should completely ban tuples as columns; this should never have been allowed IMHO, but we are living with it. MultiIndexes are much, much better supported. (and not sure what you mean by:
not in the main API's |
I accept that tuples are not desirable. My problem is that during the data cleaning phase I need to rename index labels (column or row) in a few places. However the conversion of tuple to
And as
|
#4160 is waiting for you! That's the fastest / best way to get a change, as that is an actual bug. What you want is marginal behavior which mucks with some long-defined semantics, so there is a very low likelihood of change. |
Thanks for the reference. What about the behavior of row index? Is there a reason why row uses tuple index by default while the column uses |
You are creating a list of tuples, not an Index (if you wrapped this in Index it would be the same). |
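To make the point about wrapping in Index concrete, here is a minimal sketch, assuming a reasonably recent pandas version; the labels are made up for illustration. By default `pd.Index` turns a list of tuples into a MultiIndex, while its `tupleize_cols=False` argument keeps each tuple as a single label.

```python
import pandas as pd

# Hypothetical labels, for illustration only.
labels = [("a", 1), ("b", 2)]

# By default, pd.Index converts a list of tuples into a MultiIndex.
converted = pd.Index(labels)

# tupleize_cols=False keeps each tuple as one label in a flat Index.
flat = pd.Index(labels, tupleize_cols=False)

print(type(converted).__name__)
print(type(flat).__name__, flat.nlevels)
```

This only shows the constructor behavior; whether a flat tuple index is well supported elsewhere in the API is exactly what the thread is debating.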
Sorry for bringing this up one more time. I wonder if you think the following behavior is expected:
I know you don't like tuple indexes. I use them as a quick way to specify properties without going through the boilerplate of defining a class. During the data cleaning phase I need to change some names depending on the data. It looks like my alternatives are:
|
@htzh why don't you use a MultiIndex? |
@MaximilianR The problem is that I need to occasionally rename some row names during data cleaning and that needs to be done before MultiIndex is created. Let's say I want to merge two tables. The two tables agree on most row names but it is also possible that they name a particular row differently. They have some overlapping columns so I can infer for which rows the tables differ in row names from the data cells in the shared column. The real problem is more complicated but this is the gist. So in my case the data schema is not completely known a priori and needs some data dependent inference based on logical relationships. Using MultiIndex during the cleaning phase would only make things more complicated (for example I may not know how many levels I will have a priori either). |
@htzh If you need a container-like object, use something like
In [22]: from argparse import Namespace
In [24]: n = Namespace(a=3, b=4)
In [25]: n
Out[25]: Namespace(a=3, b=4)
In [26]: n.a
Out[26]: 3 |
Thanks to all for the help. I think I will try to grok MultiIndex better and incorporate it into the processing pipeline sooner. I also realized that one way to achieve what I wanted is to first convert the row names into a data column and change the names in that column. Pandas offers convenient ways to then use the changed column as an Index or MultiIndex. @jreback I read the #4160 thread but it is not completely clear to me what the proposed API is. Is the proposal to make the following statement work for MultiIndex:
presumably moving the particular column from one part of the hierarchy to another? If that is the proposal that is also what I requested earlier: " I am not yet familiar with pandas source but I will keep that in mind. |
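The workaround described in this comment (keep the names in ordinary data columns, fix them there, and only then build the index) might look like the sketch below. The column names `group`, `item`, and `value` and the data are hypothetical, not taken from the thread.

```python
import pandas as pd

# Hypothetical table: the would-be index levels are still plain columns.
table = pd.DataFrame(
    {
        "group": ["x", "x", "y"],
        "item": ["a", "b", "a"],
        "value": [10, 20, 30],
    }
)

# Data-dependent renaming during cleaning: relabel one row's "item".
needs_fix = (table["group"] == "y") & (table["item"] == "a")
table.loc[needs_fix, "item"] = "c"

# Only after cleaning, promote the columns to a MultiIndex.
cleaned = table.set_index(["group", "item"])
print(cleaned)
```

Because the labels are ordinary column values until `set_index` is called, the number of levels does not have to be known up front.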
@htzh I think what you proposed above should work. There are quite a few cases that need working out though. E.g. what if you only have a partial level rename
where 'foo' is in level=0. I think this should work too. |
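For reference, later pandas versions accept a `level` argument in `DataFrame.rename`, which covers this kind of partial rename. The sketch below is not the statement from the comment above (that snippet is not preserved here); the frame and the new name 'bar' are made up.

```python
import pandas as pd

# Hypothetical frame with two-level columns; 'foo' lives in level 0.
columns = pd.MultiIndex.from_tuples([("foo", "a"), ("foo", "b"), ("baz", "a")])
df = pd.DataFrame([[1, 2, 3]], columns=columns)

# Rename only within level 0, leaving level 1 untouched.
renamed = df.rename(columns={"foo": "bar"}, level=0)
print(renamed.columns.tolist())
```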
Right now we have:
Could we have an option to suppress this behavior? One problem this causes is that the .rename method is not uniform and does not work if a tuple is silently converted:
To see why such behavior is problematic consider the following unintuitive example: