ENH: allow index of col names in set_index GH10797 #11944
You don't need to open a new PR when changing things; just force-push to the old one.
@@ -108,6 +108,7 @@ Other enhancements
- A simple version of ``Panel.round()`` is now implemented (:issue:`11763`)
- For Python 3.x, ``round(DataFrame)``, ``round(Series)``, ``round(Panel)`` will work (:issue:`11763`)
- ``Dataframe`` has gained a ``_repr_latex_`` method in order to allow for automatic conversion to latex in a ipython/jupyter notebook using nbconvert. Options ``display.latex.escape`` and ``display.latex.longtable`` have been added to the configuration and are used automatically by the ``to_latex`` method.(:issue:`11778`)
- ``set_index`` now accepts indexes of column labels in the keys parameter (:issue:`10797`)
``.set_index`` now accepts list-likes .....
dd1cb29 to 6cac509
6cac509 to 557c2ad
@jreback - not sure if this approach is better or worse, but I gave it a shot. Instead of checking for specific values in the index, I've written it to check that the array underlying the I feel like this will give the most intuitive behavior, but we do have the case where two What do you think?
if isinstance(keys, Index):
    # if the index is a slice of the column index, treat it like
    # a list of column labels; otherwise, treat it like a new index
    keys_base = keys.base
This is not a reliable approach, nor should you be using the private `.base`.
I am not convinced we can do this in a reliable way.
I think that we could allow a passed actual Further I think it makes sense to accept (only an
@StephenKappel can you update?
Yes; I will update. Just been tied up for the last week with some other stuff...
@StephenKappel np thanks
I don't feel like any solution here is a net improvement. Currently, when passed an
My initial attempts were kludgy hacks that attempted to guess the user's intention based on the characteristics of the passed
vs.
Thoughts?
Re-reading, I think this would be extra confusing w/o a different keyword. OK, let's instead repurpose this to a doc-example in the set_index section here and put an example for If it's more clear then use a separate sub-section (or maybe put it all in a note). Further, can you enhance the doc-strings with examples like this? Thanks
pls rebase/update.
Sorry for the delay; I'll update the documentation this week.
@StephenKappel I am a bit concerned that this is going to be confusing. Yes, it would be convenient to do:
Yes - my understanding was that the current code shouldn't change, but I was going to update the docs to include examples of how the desired column subsetting could be done in the current code (i.e. converting the index to a list). I'll remove my code changes and add the doc changes.
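For reference, a minimal sketch of that workaround, assuming a small made-up frame (the names `df`, `subset`, and the column labels are illustrative and not taken from the issue): the slice of the column index is converted to a plain list before it is passed to `set_index`, so it is read as column labels.

```python
import pandas as pd

# Hypothetical frame, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# A slice of the column index is itself an Index ...
subset = df.columns[:2]

# ... so convert it to a list to have it treated as column labels.
result = df.set_index(list(subset))

print(result.index.names)    # the two selected columns now form the index
print(list(result.columns))  # only the remaining column is left as data
```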
@StephenKappel the problem with documenting this is that we have very different behavior for a list and an Index.
I think this needs a re-think on the API.
See #10797
The desired result is to allow slices of the column index to be used in set_index. Like this:
However, passing an Index of values to set_index currently behaves much like passing an array of values: the Index is used as the values of the new index. So as not to break this behavior, this PR only treats the Index like a list of column labels when its length differs from the length of the DataFrame.
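The existing "Index as values" behavior described above can be sketched as follows. This is an illustration on a made-up frame (`df` and `labels` are hypothetical names), using an Index with exactly one entry per row; it does not show the length-based rule this PR proposed.

```python
import pandas as pd

# Hypothetical frame, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# An Index with one entry per row is used as the new index values,
# not looked up as column labels.
labels = pd.Index(["x", "y", "z"])
by_values = df.set_index(labels)

print(list(by_values.index))    # the entries of `labels`
print(list(by_values.columns))  # no column was consumed
```

Here the frame has three rows and three columns, which is exactly the situation where an Index of column labels and an Index of new values cannot be told apart by length alone.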