Commit fb784ca

aeltanawy authored and jschendel committed
DOC: Updated the DataFrame.assign docstring (#21917)
1 parent bdb7a16 commit fb784ca

2 files changed: +29 -43 lines changed

ci/doctests.sh

+1 -1
@@ -21,7 +21,7 @@ if [ "$DOCTEST" ]; then
 
     # DataFrame / Series docstrings
     pytest --doctest-modules -v pandas/core/frame.py \
-        -k"-assign -axes -combine -isin -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_stata"
+        -k"-axes -combine -isin -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_stata"
 
     if [ $? -ne "0" ]; then
         RET=1
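
Dropping `-assign` from the `-k` expression appears to mean that the `DataFrame.assign` docstring is no longer excluded from the doctest run, so its examples must now produce exactly the output they show. The sketch below illustrates that kind of check with the standard-library `doctest` module rather than pytest. The helper names `to_fahrenheit` and `run_doctests` are hypothetical and are not part of pandas or of this commit; the `NORMALIZE_WHITESPACE` flag is likewise an assumption made here to keep the comparison insensitive to column padding.

```python
import doctest

import pandas as pd


def to_fahrenheit(df):
    """Hypothetical wrapper whose docstring mirrors the new assign example.

    >>> df = pd.DataFrame({'temp_c': [17.0, 25.0]},
    ...                   index=['Portland', 'Berkeley'])
    >>> to_fahrenheit(df)
              temp_c  temp_f
    Portland    17.0    62.6
    Berkeley    25.0    77.0
    """
    return df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)


def run_doctests(obj, globs):
    """Run the docstring examples of obj and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    # NORMALIZE_WHITESPACE treats runs of spaces as equal, so only the
    # printed values have to match, not the exact column padding.
    runner = doctest.DocTestRunner(optionflags=doctest.NORMALIZE_WHITESPACE)
    for test in finder.find(obj, name=obj.__name__, globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(to_fahrenheit, {"pd": pd, "to_fahrenheit": to_fahrenheit})
print(results)
```

A docstring built on `np.random.randn`, like the one this commit removes, could not pass such a check, because its printed values change on every run.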

pandas/core/frame.py

+28 -42
@@ -3280,7 +3280,7 @@ def assign(self, **kwargs):
 
         Parameters
         ----------
-        kwargs : keyword, value pairs
+        **kwargs : dict of {str: callable or Series}
             The column names are keywords. If the values are
             callable, they are computed on the DataFrame and
             assigned to the new columns. The callable must not
@@ -3290,7 +3290,7 @@ def assign(self, **kwargs):
 
         Returns
         -------
-        df : DataFrame
+        DataFrame
             A new DataFrame with the new columns in addition to
             all the existing columns.
 
@@ -3310,48 +3310,34 @@ def assign(self, **kwargs):
 
         Examples
         --------
-        >>> df = pd.DataFrame({'A': range(1, 11), 'B': np.random.randn(10)})
+        >>> df = pd.DataFrame({'temp_c': [17.0, 25.0]},
+        ...                   index=['Portland', 'Berkeley'])
+        >>> df
+                  temp_c
+        Portland    17.0
+        Berkeley    25.0
 
         Where the value is a callable, evaluated on `df`:
-
-        >>> df.assign(ln_A = lambda x: np.log(x.A))
-            A         B      ln_A
-        0   1  0.426905  0.000000
-        1   2 -0.780949  0.693147
-        2   3 -0.418711  1.098612
-        3   4 -0.269708  1.386294
-        4   5 -0.274002  1.609438
-        5   6 -0.500792  1.791759
-        6   7  1.649697  1.945910
-        7   8 -1.495604  2.079442
-        8   9  0.549296  2.197225
-        9  10 -0.758542  2.302585
-
-        Where the value already exists and is inserted:
-
-        >>> newcol = np.log(df['A'])
-        >>> df.assign(ln_A=newcol)
-            A         B      ln_A
-        0   1  0.426905  0.000000
-        1   2 -0.780949  0.693147
-        2   3 -0.418711  1.098612
-        3   4 -0.269708  1.386294
-        4   5 -0.274002  1.609438
-        5   6 -0.500792  1.791759
-        6   7  1.649697  1.945910
-        7   8 -1.495604  2.079442
-        8   9  0.549296  2.197225
-        9  10 -0.758542  2.302585
-
-        Where the keyword arguments depend on each other
-
-        >>> df = pd.DataFrame({'A': [1, 2, 3]})
-
-        >>> df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
-           A  B  C
-        0  1  1  2
-        1  2  2  4
-        2  3  3  6
+        >>> df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)
+                  temp_c  temp_f
+        Portland    17.0    62.6
+        Berkeley    25.0    77.0
+
+        Alternatively, the same behavior can be achieved by directly
+        referencing an existing Series or sequence:
+        >>> df.assign(temp_f=df['temp_c'] * 9 / 5 + 32)
+                  temp_c  temp_f
+        Portland    17.0    62.6
+        Berkeley    25.0    77.0
+
+        In Python 3.6+, you can create multiple columns within the same assign
+        where one of the columns depends on another one defined within the same
+        assign:
+        >>> df.assign(temp_f=lambda x: x['temp_c'] * 9 / 5 + 32,
+        ...           temp_k=lambda x: (x['temp_f'] + 459.67) * 5 / 9)
+                  temp_c  temp_f  temp_k
+        Portland    17.0    62.6  290.15
+        Berkeley    25.0    77.0  298.15
         """
         data = self.copy()
 
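
The three usages documented in the updated docstring can be exercised together in one short script. This is a minimal sketch that assumes a pandas version and Python version (3.6+, per the docstring) in which later keyword arguments can see columns created earlier in the same `assign` call. The variable names `via_callable`, `via_series`, and `chained` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"temp_c": [17.0, 25.0]}, index=["Portland", "Berkeley"])

# 1. Callable value: evaluated on the DataFrame that assign is building.
via_callable = df.assign(temp_f=lambda x: x.temp_c * 9 / 5 + 32)

# 2. Precomputed Series: aligned on the index and inserted directly.
via_series = df.assign(temp_f=df["temp_c"] * 9 / 5 + 32)

# Both forms run the same arithmetic, so the frames should be identical.
assert via_callable.equals(via_series)

# 3. Dependent columns: temp_k reads temp_f, which is created by the
#    preceding keyword argument of the same call.
chained = df.assign(
    temp_f=lambda x: x["temp_c"] * 9 / 5 + 32,
    temp_k=lambda x: (x["temp_f"] + 459.67) * 5 / 9,
)

# assign returns a new DataFrame; the original keeps its single column.
assert list(df.columns) == ["temp_c"]

print(chained)
```

The callable form is usually the better fit inside a method chain, where the intermediate frame has no name to reference; the Series form is enough when the source frame is already bound to a variable.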
