Accept index in addition to list for set_index #10797

Closed
michaelbilow opened this issue Aug 11, 2015 · 6 comments
Comments

@michaelbilow

Running the code below generates an error, since df.columns is an Index and not a list. I'm not sure whether the error that gets raised (set_index ends up asking for an index that is the entire length of the data frame) is actually by design. In this case it was a bit unintuitive to figure out how to catch, since Index objects and Python lists often behave so similarly.

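import string
import pandas as pd

data1 = 'x'*5 + 'y'*5
# string.lowercase exists only on Python 2; ascii_lowercase works on 2 and 3
data2 = string.ascii_lowercase[:5]*2
data3 = list(range(10))
data_dict = {'Cat': list(data1), 'SubCat': list(data2), 'Vals': data3}
df = pd.DataFrame(data_dict)
ordered_df = df[['Cat', 'SubCat', 'Vals']]

# Works: the column labels are passed as a plain list
correct_df = ordered_df.set_index([x for x in ordered_df.columns[:2]])
# Raises: the same labels are passed as an Index
error_df = ordered_df.set_index(ordered_df.columns[:2])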
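
Since the report mentions that the failure was hard to catch, here is a minimal sketch of how it could be observed and worked around on Python 3. The variable names (key_index, failure, worked) are illustrative only, and the exception is caught broadly because the exact exception type and message may differ between pandas versions.

```python
import string

import pandas as pd

df = pd.DataFrame({
    "Cat": list("x" * 5 + "y" * 5),
    "SubCat": list(string.ascii_lowercase[:5] * 2),
    "Vals": list(range(10)),
})

key_index = df.columns[:2]  # an Index holding two column labels, not a list

# Passing the Index directly: set_index treats it as one array of row labels.
try:
    df.set_index(key_index)
    failure = None
except Exception as exc:  # broad on purpose: the type may vary by version
    failure = type(exc).__name__
print("passing an Index:", failure)

# Converting to a list first makes set_index read the entries as column labels.
worked = df.set_index(list(key_index))
print(worked.index.names)
```

The list(...) conversion is the only change needed on the caller's side.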
@jreback
Contributor

jreback commented Aug 11, 2015

So you want this?

In [23]: ordered_df.set_index(list(ordered_df.columns[:2]))
Out[23]: 
            Vals
Cat SubCat      
x   a          0
    b          1
    c          2
    d          3
    e          4
y   a          5
    b          6
    c          7
    d          8
    e          9

Yeah, it's a pretty trivial change. Want to take a go at it? It appears innocuous.

@jreback jreback added API Design Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex labels Aug 11, 2015
@jreback jreback added this to the Next Major Release milestone Aug 11, 2015
@jreback
Contributor

jreback commented Aug 11, 2015

change here to not is_list_like(...)
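
A rough sketch of what a check based on is_list_like could look like. The helper name normalize_keys is hypothetical and is not part of pandas; only pandas.api.types.is_list_like is an existing public function. This is an illustration of the idea, not the change that was made in frame.py.

```python
import pandas as pd
from pandas.api.types import is_list_like


def normalize_keys(keys):
    """Hypothetical helper (not pandas API): always return a list of keys.

    Scalars such as a single column label are wrapped in a list, while any
    list-like (list, tuple, Index, ...) is converted to a plain list.
    """
    if not is_list_like(keys):
        return [keys]
    return list(keys)


print(normalize_keys("Cat"))
print(normalize_keys(pd.Index(["Cat", "SubCat"])))
print(normalize_keys(("Cat", "SubCat")))
```

One caveat with this sketch: a full-length array of row labels is also list-like, so this check alone cannot tell "a list of column labels" apart from "one array to use as the index".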

@michaelbilow
Author

Yep, exactly.

@StephenKappel
Contributor

It's valid to pass an Index of row labels to set_index, with the expected behavior that this Index becomes the new index. So, to achieve the desired behavior, I am opening a PR that treats an Index like a list of column labels only when it is not the same length as the DataFrame. This should avoid breaking the existing behavior while making ordered_df.set_index(ordered_df.columns[:2]) possible in most cases.
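
The length-based rule described in this comment could be sketched as a wrapper around set_index. The function name set_index_lenient and the sample data are assumptions made for illustration; this is not the code from the PR.

```python
import pandas as pd


def set_index_lenient(df, keys):
    """Hypothetical wrapper (not pandas API) around DataFrame.set_index.

    An Index whose length differs from the frame's is read as a list of
    column labels; an Index of matching length is passed through unchanged
    and becomes the new index.
    """
    if isinstance(keys, pd.Index) and len(keys) != len(df):
        keys = list(keys)
    return df.set_index(keys)


df = pd.DataFrame({
    "Cat": list("xxxxxyyyyy"),
    "SubCat": list("abcdeabcde"),
    "Vals": list(range(10)),
})

# Two labels against ten rows: interpreted as column labels.
by_columns = set_index_lenient(df, df.columns[:2])

# Ten labels against ten rows: interpreted as the row index itself.
by_rows = set_index_lenient(df, pd.Index(range(100, 110), name="row_id"))

print(by_columns.index.names)
print(by_rows.index.name)
```

The rule stays ambiguous when the number of selected column labels happens to equal the number of rows, which is why it only covers most cases.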

@hundredrab

Hi, can I take a go at this?

@jbrockmendel jbrockmendel removed the Indexing Related to indexing on series/frames, not to indexes themselves label Apr 14, 2020
@simonjayhawkins simonjayhawkins added Reshaping Concat, Merge/Join, Stack/Unstack, Explode and removed good first issue labels Apr 24, 2020
@mroeschke mroeschke added Enhancement and removed API Design Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Apr 18, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@szabi

szabi commented Jan 27, 2023

I came across what is essentially the same issue in 2023: I wanted to pass an instance of typing.Collection (specifically a tuple), fully expecting it to work, going by the documentation:

keys : label or array-like or list of labels/arrays
    This parameter can be either a single column key, a single array of
    the same length as the calling DataFrame, or a list containing an
    arbitrary combination of column keys and arrays. Here, "array"
    encompasses :class:`Series`, :class:`Index`, ``np.ndarray``, and
    instances of :class:`~collections.abc.Iterator`.

The check in

pandas/pandas/core/frame.py

Lines 5787 to 5788 in 01693d6

if not isinstance(keys, list):
    keys = [keys]
is too restrictive.

It seems to be a simple one-liner. However, the 2015 suggestion was to use not is_list_like(...), and that does not seem to be part of common.py any more. Would the correct course of action in 2023 be to use if not isinstance(keys, Collection) from typing? Or is is_list_like the way to go (and if so, imported from where)?

Pinging @jreback
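
To make the two candidate checks concrete, here is a small comparison, offered as a sketch rather than a recommendation for frame.py. It uses collections.abc.Collection (the class that typing.Collection aliases) and the public pandas.api.types.is_list_like; the sample frame and variable names are illustrative.

```python
from collections.abc import Collection

import pandas as pd
from pandas.api.types import is_list_like

single_label = "Cat"
tuple_keys = ("Cat", "SubCat")

# A plain string is itself a Collection, so an isinstance(keys, Collection)
# check would not wrap a single column label. is_list_like excludes strings.
print(isinstance(single_label, Collection), is_list_like(single_label))
print(isinstance(tuple_keys, Collection), is_list_like(tuple_keys))

df = pd.DataFrame({
    "Cat": list("xxyy"),
    "SubCat": list("abab"),
    "Vals": list(range(4)),
})

# Caller-side workaround: convert the tuple to a list before set_index.
indexed = df.set_index(list(tuple_keys))
print(indexed.index.names)
```

Whichever check is used, tuples need care, because a tuple can also be a single valid column label when the columns are a MultiIndex.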
