set_axis with callable #29145
Comments
Can you provide code samples for the use cases, with expected outputs?
Usage examples:
df.set_axis(lambda x: x.iloc[0], axis=1)
df.set_axis(lambda x: map('_'.join, x.columns), axis=1)
df.set_axis(lambda x: x.columns.str.split('_', expand=True), axis=1)
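For comparison, here is a minimal sketch of how the third usage can be emulated with the existing API. The helper name `set_axis_with` is hypothetical and not part of pandas; it only evaluates the callable on the frame and passes the result to the existing `set_axis`. The column names are made up for illustration.

```python
import pandas as pd


def set_axis_with(df, func, axis=1):
    # Hypothetical helper (not a pandas API): evaluate `func` on the frame,
    # then hand the resulting labels to the existing DataFrame.set_axis.
    return df.set_axis(func(df), axis=axis)


# Illustrative data with "_"-separated column names.
df = pd.DataFrame({"a_x": [1, 2], "b_y": [3, 4]})

# Split "a_x" / "b_y" into a two-level column index inside a chain.
out = df.pipe(set_axis_with, lambda d: d.columns.str.split("_", expand=True))
print(out.columns.tolist())  # [('a', 'x'), ('b', 'y')]
```

Because the helper takes the frame as its first argument, it slots into `pipe` without a temporary variable.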
Couple things to note:
Number 1 is already possible, though it requires you to drop the row after setting. So maybe we need a keyword argument for that (it's called
>>> df = pd.DataFrame(np.arange(10, 16).reshape((-1, 2)))
>>> df.set_axis(df.iloc[0], axis=1).drop(0)
0  10  11
1  12  13
2  14  15
For number two there is already a multi index
>>> df = pd.DataFrame(np.arange(10, 16).reshape((-1, 2)), columns=pd.MultiIndex.from_product((("a",), ("b", "c"))))
>>> df.set_axis(df.columns.to_flat_index(), axis=1)
   (a, b)  (a, c)
0      10      11
1      12      13
2      14      15
For number three you can also use
>>> df = pd.DataFrame(np.arange(10, 16).reshape((-1, 2)), columns=[("a", "b"), ("a", "c")])
>>> df.set_axis(pd.MultiIndex.from_tuples(df.columns), axis=1)
    a
    b   c
0  10  11
1  12  13
2  14  15
So out of these I think the most actionable thing may be to add a
Note this is something tied into the conversations of #24046, but I think this is an actionable item.
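As a rough sketch of the first case written as a single chainable step (an illustration built only on existing pandas calls, not a proposed API), the promote-then-drop pattern can be wrapped in `pipe`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10, 16).reshape((-1, 2)))

promoted = (
    df
    # Use the first row as column labels, then drop that row.
    .pipe(lambda d: d.set_axis(d.iloc[0], axis=1).drop(index=0))
    # Renumber the remaining rows from zero.
    .reset_index(drop=True)
)
print(promoted.columns.tolist())  # [10, 11]
print(promoted.values.tolist())   # [[12, 13], [14, 15]]
```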
I understand all of these are possible, but they are difficult to work with in method chaining. You have to create
df = (pd.read_file(file)
      .set_axis(lambda x:x...)
      .do_a()
      .do_b())
If you don't know yet, have a look at Pandas isn't designed for method chaining. (Although I think it is a good idea to have more methods like this that do tie in more with the method chaining pattern, but for now you can ease the pain with
@hwaling I do use pipe as the last resort and wonder if it is possible to bypass pipe when it comes to header assignment.
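To make the pipe workaround concrete, here is a small sketch of header assignment inside a chain. The data and the `total` column are made up for illustration; only existing pandas methods are used.

```python
import pandas as pd

df = pd.DataFrame(
    [[10, 11], [12, 13]],
    columns=pd.MultiIndex.from_product([["a"], ["b", "c"]]),
)

flat = (
    df
    # Flatten the two-level header into "a_b", "a_c" without leaving the chain.
    .pipe(lambda d: d.set_axis(["_".join(c) for c in d.columns], axis=1))
    # Later steps can refer to the new names right away.
    .assign(total=lambda d: d["a_b"] + d["a_c"])
)
print(flat.columns.tolist())  # ['a_b', 'a_c', 'total']
```

The lambda wrapper around `set_axis` is the extra step that a callable-accepting `set_axis` would remove.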
Also, there are many options where you can both use
df = do_stuff().pipe(lambda x: x.assign(new_col=x.old_col + 1))
# or
df = do_stuff().assign(new_col=lambda x: x.old_col + 1)
Same with
I wonder what the preferred method is when both options are available.
Hi,
I've been using method chaining to write most of my data wrangling processes, and one of the things that bothers me a bit is modifying column names as part of the chain.
With a static list of column names, df.set_axis can do the work. But in the following cases, I have to use df.columns=xxx to modify the names. I wonder if these could be achieved by allowing df.set_axis to take callables, something similar to df.assign.