API/ENH: Add mutate like method to DataFrames #9229

Closed · TomAugspurger opened this issue Jan 12, 2015 · 23 comments · Fixed by #9239

@TomAugspurger
Contributor

In my notebook comparing dplyr and pandas, I gained a new level of appreciation for the ability to chain strings of operations together. In my own code, the biggest impediment to this is adding additional columns that are calculations on existing columns. For example

# R / dplyr
mutate(flights,
   gain = arr_delay - dep_delay,
   speed = distance / air_time * 60)

# ... calculation involving these

vs.

flights['gain'] = flights.arr_delay - flights.dep_delay
flights['speed'] = flights.distance / flights.air_time * 60

# ... calculation involving these later

just doesn't flow as nicely, especially if this mutate is in the middle of a chain.

I'd propose a new method (perhaps stealing mutate) that's similar to dplyr's.
The function signature could be kwarg only, where the keywords are the new column names. e.g.

flights.mutate(gain=flights.arr_delay - flights.dep_delay)

This would return a DataFrame with the new column gain in addition to the original columns.
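
For concreteness, a minimal sketch of the proposed behaviour as a standalone helper (the function name and implementation here are hypothetical, purely to illustrate returning a new frame instead of mutating in place):

import pandas as pd

def mutate_copy(df, **kwargs):
    # Hypothetical sketch: add one column per keyword to a *copy* of df,
    # leaving the original DataFrame untouched, and return the copy.
    out = df.copy()
    for name, values in kwargs.items():
        out[name] = values
    return out

flights = pd.DataFrame({'arr_delay': [10, 5], 'dep_delay': [3, 8],
                        'distance': [1400, 1600], 'air_time': [200, 230]})
result = mutate_copy(flights,
                     gain=flights.arr_delay - flights.dep_delay,
                     speed=flights.distance / flights.air_time * 60)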

Worked out example

import pandas as pd
import seaborn as sns

iris = sns.load_dataset('iris')

(iris.query('sepal_length > 4.5')
     .mutate(ratio=iris.sepal_length / iris.sepal_width)  # new part
     .groupby(pd.cut(iris.ratio)).mean()
)

Thoughts?

@TomAugspurger TomAugspurger added this to the 0.16.0 milestone Jan 12, 2015
@jreback
Contributor

jreback commented Jan 12, 2015

If this syntax were supported (and it's not that difficult), IOW a multi-assignment, then .mutate would just be sugar.

In [1]: df = DataFrame(np.random.randn(5,3),columns=list('abc'))

In [2]: df
Out[2]: 
          a         b         c
0 -1.818219 -0.550046  0.705300
1  0.865112  0.130718 -0.117105
2 -0.914553  0.575054 -0.881258
3 -1.417902  1.925067 -2.121546
4  0.604826 -0.461150  0.994271

In [3]: df[['d','e']] = df['a'] + 1, df['b'] + 1
KeyError: "['d' 'e'] not in index"
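
For reference, the same multi-column addition can already be spelled with the existing API, e.g. by building the new columns separately and concatenating along axis=1; a small sketch of what the sugar would wrap:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 3), columns=list('abc'))

# Build the new columns separately, then glue them on along the column axis.
new_cols = pd.DataFrame({'d': df['a'] + 1, 'e': df['b'] + 1})
df = pd.concat([df, new_cols], axis=1)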

@jreback
Contributor

jreback commented Jan 12, 2015

@TomAugspurger can you add a mini example (of split-apply-combine), showing how current (and potentially new) syntax would work?

@TomAugspurger
Contributor Author

Added an example to the end of the original post.

@TomAugspurger
Contributor Author

Allowing multiple new columns to be added shouldn't be too hard. dplyr does allow calculations to refer to new columns within the same mutate, e.g.

mutate(flights,
   gain = arr_delay - dep_delay,
   gain_per_hour = gain / (air_time / 60)
)

The way I'm seeing the function signature right now is df.mutate(**kwargs), which may make that example a bit tricky since dicts aren't ordered. We might be able to work around that, but for now it could be split into two mutates.

@shoyer
Member

shoyer commented Jan 12, 2015

I really like this idea. mutate is much better than dataframes with actual mutable state (as encouraged by Python's syntax).

@TomAugspurger I'm not sure what to make of this example:

(iris.query('sepal_length > 4.5')
     .mutate(ratio=iris.sepal_length / iris.sepal_width)  # new part
     .groupby(pd.cut(iris.ratio)).mean()
)

In particular, are you suggesting that iris.sepal_length on the second line should somehow refer to the queried version of the iris dataset?

Note that @mrocklin added similar dplyr like syntax to blaze: blaze/blaze#484

This is definitely a place where Python's syntax (and limited magic, which is usually a good thing) makes things a little trickier than in R.

@mrocklin
Contributor

FWIW, the semantics of plyr's transform, mutate's predecessor, might be a better fit.

Transform does not allow one new column to depend on another new column defined in the same call. It also doesn't have the word mutate in it, which is good if you want the operation to return a new DataFrame.

@jorisvandenbossche
Member

I had the same question as @shoyer:
In your example, iris in mutate refers to the original iris, so the result of iris.sepal_length / iris.sepal_width cannot be added just like that to the frame. But this is also what iris['ratio'] = iris.sepal_length / iris.sepal_width would do -> automatic label alignment, and only adding the values of the right hand side at the labels of the left-hand side.
But if you do this after a query (which is what the chaining allows), you will of course calculate more than necessary.

Further, I was also thinking this looks very much like append (if you would have an axis=1 argument; right now append is just for rows). With a small hack to have it add columns:

In [41]: (iris.query('sepal_length > 4.5')
    ...:      .T.append(pd.Series(iris.sepal_length / iris.sepal_width, name='ratio')).T
    ...:      .head())
    ...: 
Out[41]: 
  sepal_length sepal_width petal_length petal_width species     ratio
0          5.1         3.5          1.4         0.2  setosa  1.457143
1          4.9           3          1.4         0.2  setosa  1.633333
2          4.7         3.2          1.3         0.2  setosa   1.46875
3          4.6         3.1          1.5         0.2  setosa  1.483871
4            5         3.6          1.4         0.2  setosa  1.388889

There are some things to work out, but I also certainly think such a feature would be a nice addition! I am not fully sure about the mutate name (but maybe that is because I am not a native speaker).
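
For comparison, a similar result could be had without the double transpose (which also upcasts the numeric columns to object) by left-joining the computed Series onto the queried frame; a sketch using only existing API:

import seaborn as sns

iris = sns.load_dataset('iris')
queried = iris.query('sepal_length > 4.5')
ratio = (iris.sepal_length / iris.sepal_width).rename('ratio')
result = queried.join(ratio)  # left join on the index keeps only the queried rows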

@TomAugspurger
Contributor Author

Sorry about the tricky example, but I guess it's a good one since it exposes a difficulty.

In my head I was thinking that it should refer to just the ones that meet the query of > 4.5, though obviously that's not how I wrote it. Maybe we have to go down the path of evaling strings and have

iris.query('sepal_length > 4.5').mutate('ratio=sepal_length / sepal_width')

but I was wanting to avoid that.

@mrocklin
Contributor

The following is something like what Blaze does. It's less dplyr-ish and not as well chained, but there are things that one just can't do in Python without macros.

df = iris[iris.sepal_length > 4.5]
df.transform(sepal_ratio = df.sepal_length / df.sepal_width, 
             petal_ratio = df.petal_length / df.petal_width)

@TomAugspurger
Contributor Author

A bit of a summary,

  • For now I think we should ignore the query bit of my original example.
  • Joris' point about this being similar to append is good. Should we just add axis and name arguments to append (I don't think so)?
  • transform seems to be favored over mutate. We do already have a transform on GroupBy objects, and GroupBy.transform's defining feature is that the returned DataFrame will be like-indexed. Whatever we have here will not be like-indexed; is that potentially confusing? I'd be OK with transform.

I think @mrocklin's example in his last post strikes a good balance. I'll put together a PR.

@shoyer
Member

shoyer commented Jan 13, 2015

I was just talking about this with my co-worker and came up with another idea to try to replace R's macros.

What about automatically injecting local variables when the argument is a function with no arguments? Basically, we can have macros as long as we write lambda: first.

For example: iris.mutate(ratio = lambda: sepal_length / sepal_width)

With some black magic, we could even extend this to other methods to make something very dplyr-like:

(iris[lambda: sepal_length > 4.5]
     .mutate(ratio = lambda: sepal_length / sepal_width)
     .groupby(lambda: pd.cut(ratio))
     .apply(lambda: ratio - ratio.mean()))

I'm not entirely sure this is a good idea! But it does make me less jealous of R users :).
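
As a rough illustration of how that injection could work (hypothetical helper, nothing like this exists in pandas): rebuild the zero-argument lambda with the frame's columns spliced into its global namespace before calling it.

import types
import pandas as pd

def eval_with_columns(df, func):
    # Re-create the zero-argument function with the DataFrame's columns
    # injected into its globals; name lookup inside the lambda is deferred
    # until call time, so sepal_length etc. resolve to the columns.
    namespace = dict(func.__globals__)
    namespace.update({col: df[col] for col in df.columns})
    return types.FunctionType(func.__code__, namespace)()

iris = pd.DataFrame({'sepal_length': [5.1, 4.9], 'sepal_width': [3.5, 3.0]})
ratio = eval_with_columns(iris, lambda: sepal_length / sepal_width)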

Personally, I don't find the groupby transform method very useful (see #9235) and wouldn't mind deprecating it -- I always use apply. That said, new names are less confusing, especially since we might even want to allow mutate to act on groupby objects.

@shoyer
Member

shoyer commented Jan 13, 2015

@jhorowitz-coursera wrote the very similar ply_select in pandas-ply. He uses a nice trick with X to automatically generate functions.
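
For anyone curious, the X trick boils down to overloading attribute access and operators so that an expression builds up a function of the frame instead of evaluating immediately; a minimal illustration (not pandas-ply's actual implementation, and only division is wired up here):

class Expr:
    # Each operation returns a new Expr wrapping "how to compute me from a
    # DataFrame", so X.sepal_length / X.sepal_width is a callable of the frame.
    def __init__(self, fn=lambda df: df):
        self._fn = fn

    def __getattr__(self, name):
        return Expr(lambda df, f=self._fn: getattr(f(df), name))

    def __truediv__(self, other):
        rhs = other._fn if isinstance(other, Expr) else (lambda df: other)
        return Expr(lambda df, f=self._fn: f(df) / rhs(df))

    def __call__(self, df):
        return self._fn(df)

X = Expr()
ratio = X.sepal_length / X.sepal_width  # a function: DataFrame -> Series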

@TomAugspurger
Contributor Author

@shoyer that's kinda-awesome / evil.

mailing list discussion about pandas-ply.

@TomAugspurger
Contributor Author

I'm settling on a relatively simple implementation.

signature: DataFrame.transform(**kwargs)

  • the keyword is the name of the new column (existing columns are overwritten if there's a name conflict, as in dplyr)
  • the value is either
    • called on self if it's callable; the callable should be a function of one argument, the DataFrame it's being called on
    • inserted as-is otherwise

In [7]: df.head()
Out[7]: 
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

In [8]: (df.query('species == "virginica"')
           .transform(sepal_ratio=lambda x: x.sepal_length / x.sepal_width)
           .head())
Out[8]: 
     sepal_length  sepal_width  petal_length  petal_width    species  \
100           6.3          3.3           6.0          2.5  virginica   
101           5.8          2.7           5.1          1.9  virginica   
102           7.1          3.0           5.9          2.1  virginica   
103           6.3          2.9           5.6          1.8  virginica   
104           6.5          3.0           5.8          2.2  virginica   

     sepal_ratio  
100     1.909091  
101     2.148148  
102     2.366667  
103     2.172414  
104     2.166667  

This way we can handle the query case, where you don't have a reference to the DataFrame being passed in, and it's not too magical (and it's easier to implement).
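
A minimal standalone sketch of those semantics (hypothetical free function; the actual method landed via the PR referenced below):

import pandas as pd

def transform_like(df, **kwargs):
    # Sketch of the proposed behaviour: return a copy with one new (or
    # overwritten) column per keyword; callables are called with the
    # DataFrame itself, anything else is inserted as-is.
    out = df.copy()
    for name, value in kwargs.items():
        out[name] = value(out) if callable(value) else value
    return out

df = pd.DataFrame({'sepal_length': [6.3, 5.8], 'sepal_width': [3.3, 2.7]})
result = transform_like(df, sepal_ratio=lambda x: x.sepal_length / x.sepal_width)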

@TomAugspurger
Contributor Author

Submitted a PR if you want to move the discussion there.

@shoyer
Member

shoyer commented Jan 14, 2015

Just FYI, I think I'm going to put my "thunk"-based proposal (with argument-less lambdas) into a separate package -- I already have a working proof of concept.

The most annoying thing is ensuring that new DataFrames remain macro-friendly... that requires writing a lot of wrappers (probably unavoidable).

Also, I haven't been able to come up with a way to write hygienic macros. Probably not possible in Python.

@dalejung
Contributor

Do you think a heavy-handed approach to DSLs would gain traction? I've been experimenting with taking over evaluation completely via import hooks and IPython input transformers.

https://github.com/dalejung/naginpy

It's kind of all over the map, as I have multiple use cases in mind and I'm still figuring out how to support all of them cleanly.

I had tried to get what I wanted via just AST transforms and fun stuff like context managers that introduce temporary scopes, but they were always lacking the syntax I wanted.

@shoyer
Member

shoyer commented Jan 14, 2015

@dalejung What does it look like to enable your DSLs in a script? My gut is that full DSLs are probably too painful for me to integrate into my workflows.

@dalejung
Contributor

@shoyer Depends, I suppose. The only hard requirement is that the import machinery is installed. I generally run scripts via %run, which works with my IPython profile. I don't think a straight python script.py will ever work; it would have to be python -m naginpy script.py.

Haven't decided how I want to signal modules on/off. I use a sentinel value for datamodule to tell it whether to use the custom loader. Might do the same with naginpy.

@datnamer

datnamer commented Feb 3, 2015

@TomAugspurger @jreback @shoyer Have you guys seen this library? It has an interesting take on the situation: https://github.com/coursera/pandas-ply

@shoyer
Member

shoyer commented Feb 3, 2015

@datnamer Yes, we have, see my comment above :).

@datnamer

datnamer commented Feb 3, 2015

@shoyer oops missed that, sorry for the noise

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue Mar 1, 2015
Creates a new method for DataFrame, based off dplyr's mutate.
Closes pandas-dev#9229