Commit 75403c8

author
datajanko
committed
populates dsintro and frame.py with examples and warning
- adds example to frame.py
- reworked warning in dsintro
- reworked Notes in frame.py
- additional fixups
1 parent 390a8d2 commit 75403c8

3 files changed, +76 -32 lines changed

doc/source/dsintro.rst

+34-2
@@ -507,9 +507,41 @@ of one argument to be called on the ``DataFrame``. A *copy* of the original
507507
DataFrame is returned, with the new values inserted.
508508

509509
.. warning::
510+
Starting from Python 3.6 ``**kwargs`` is an ordered dictionary and ``assign``
511+
respects the order of the keyword arguments. It is allowed to write
510512

511-
Since the function signature of ``assign`` is ``**kwargs``, a dictionary,
512-
the order of the new columns in the resulting DataFrame cannot be guaranteed
513+
.. ipython::
514+
:verbatim:
515+
516+
In [1]: # Allowed for Python 3.6 and later
517+
df.assign(C = lambda x: x['A'] + x['B'],
518+
D = lambda x: x['A'] + x['C'])
519+
520+
This may subtly change the behavior of your code when you're
521+
using ``.assign()`` to update an existing column. Prior to Python 3.6,
522+
callables referring to other variables being updated would get the "old" values
523+
524+
Previous Behaviour:
525+
526+
.. code-block:: ipython
527+
528+
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
529+
530+
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
531+
Out[3]:
532+
A C
533+
0 2 -1
534+
1 3 -2
535+
2 4 -3
536+
537+
New Behaviour:
538+
539+
.. ipython:: python
540+
541+
df.assign(A=df.A+1, C= lambda df: df.A* -1)
542+
543+
For Python 3.5 and earlier the function signature of ``assign`` is ``**kwargs``,
544+
a dictionary, the order of the new columns in the resulting DataFrame cannot be guaranteed
513545
to match the order you pass in. To make things predictable, items are inserted
514546
alphabetically (by key) at the end of the DataFrame.
515547

doc/source/whatsnew/v0.23.0.txt

+18-18
@@ -181,39 +181,39 @@ Please note that the string `index` is not supported with the round trip format,
181181
``.assign()`` accepts dependent arguments
182182
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
183183

184-
The :func:`DataFrame.assign()` now accepts dependent kwargs for python version later than 3.6 (see also `PEP 468
185-
<https://www.python.org/dev/peps/pep-0468/>`_). Now the keyword-value pairs passed to `.assign()` may depend on their predecessors if the values are callables. (:issue:`14207`)
184+
The :func:`DataFrame.assign` now accepts dependent keyword arguments for python version later than 3.6 (see also `PEP 468
185+
<https://www.python.org/dev/peps/pep-0468/>`_). Later keyword arguments may now refer to earlier ones if the argument is a callable. (:issue:`14207`)
186186

187187
.. ipython:: python
188188

189189
df = pd.DataFrame({'A': [1, 2, 3]})
190190
df
191-
df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
191+
df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
192192

193193
.. warning::
194194

195-
This may subtly change the behavior of your code when you're
196-
using ``.assign()`` to update an existing column. Previously, callables
197-
referring to other variables being updated would get the "old" values
195+
This may subtly change the behavior of your code when you're
196+
using ``.assign()`` to update an existing column. Previously, callables
197+
referring to other variables being updated would get the "old" values
198198

199-
Previous Behaviour:
199+
Previous Behaviour:
200200

201-
.. code-block:: ipython
201+
.. code-block:: ipython
202202

203-
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
203+
In [2]: df = pd.DataFrame({"A": [1, 2, 3]})
204204

205-
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
206-
Out[3]:
207-
A C
208-
0 2 -1
209-
1 3 -2
210-
2 4 -3
205+
In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
206+
Out[3]:
207+
A C
208+
0 2 -1
209+
1 3 -2
210+
2 4 -3
211211

212-
New Behaviour:
212+
New Behaviour:
213213

214-
.. ipython:: python
214+
.. ipython:: python
215215

216-
df.assign(A=df.A+1, C= lambda df: df.A* -1)
216+
df.assign(A=df.A+1, C= lambda df: df.A* -1)
217217

218218
.. _whatsnew_0230.enhancements.other:
219219

pandas/core/frame.py

+24-12
@@ -2671,15 +2671,17 @@ def assign(self, **kwargs):
26712671
26722672
Notes
26732673
-----
2674-
For python 3.6 and above, the columns are inserted in the order of
2675-
\*\*kwargs. For python 3.5 and earlier, since \*\*kwargs is unordered,
2676-
the columns are inserted in alphabetical order at the end of your
2677-
DataFrame. Assigning multiple columns within the same ``assign``
2678-
is possible, but for python 3.5 and earlier, you cannot reference
2679-
other columns created within the same ``assign`` call.
2680-
For python 3.6 and above it is possible to reference columns created
2681-
in an assignment. To this end you have to respect the order of kwargs
2682-
and use callables referencing the assigned columns.
2674+
Assigning multiple columns within the same ``assign`` is possible.
2675+
For Python 3.6 and above, later items in '\*\*kwargs' may refer to
2676+
newly created or modified columns in 'df'; items are computed and
2677+
assigned into 'df' in order. For Python 3.5 and below, the order of
2678+
keyword arguments is not specified, you cannot refer to newly created
2679+
or modified columns. All items are computed first, and then assigned
2680+
in alphabetical order.
2681+
2682+
.. versionmodified :: 0.23.0
2683+
2684+
Keyword argument order is maintained for Python 3.6 and later.
26832685
26842686
Examples
26852687
--------
@@ -2715,20 +2717,30 @@ def assign(self, **kwargs):
27152717
7 8 -1.495604 2.079442
27162718
8 9 0.549296 2.197225
27172719
9 10 -0.758542 2.302585
2720+
2721+
Where the keyword arguments depend on each other
2722+
2723+
>>> df = pd.DataFrame({'A': [1, 2, 3]})
2724+
2725+
>>> df.assign(B=df.A, C=lambda x:x['A']+ x['B'])
2726+
A B C
2727+
0 1 1 2
2728+
1 2 2 4
2729+
2 3 3 6
27182730
"""
27192731
data = self.copy()
27202732

2721-
# for 3.6 preserve order of kwargs
2733+
# >= 3.6 preserve order of kwargs
27222734
if PY36:
27232735
for k, v in kwargs.items():
27242736
data[k] = com._apply_if_callable(v, data)
27252737
else:
2726-
# for 3.5 or earlier: do all calculations first...
2738+
# <= 3.5: do all calculations first...
27272739
results = OrderedDict()
27282740
for k, v in kwargs.items():
27292741
results[k] = com._apply_if_callable(v, data)
27302742

2731-
# sort by key for 3.5 and earlier
2743+
# <= 3.5 and earlier
27322744
results = sorted(results.items())
27332745
# ... and then assign
27342746
for k, v in results:
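
As a rough illustration of the two branches in the hunk above, the sketch below mimics them with plain dictionaries of lists instead of a real DataFrame. The helper names (`apply_if_callable`, `assign_ordered`, `assign_sorted`) are hypothetical stand-ins chosen for this example and are not pandas API; the point is only to show why a callable sees the updated column on the ordered path and the original column on the compute-first path.

```python
def apply_if_callable(value, data):
    """Call ``value`` with ``data`` if it is callable, else return it unchanged."""
    return value(data) if callable(value) else value


def assign_ordered(data, **kwargs):
    """Sketch of the Python >= 3.6 branch: compute and assign in keyword order."""
    out = dict(data)  # work on a shallow copy, leaving the input untouched
    for key, value in kwargs.items():
        # Each callable sees columns assigned earlier in the same call.
        out[key] = apply_if_callable(value, out)
    return out


def assign_sorted(data, **kwargs):
    """Sketch of the Python <= 3.5 branch: compute everything first, then assign."""
    out = dict(data)
    # All callables are evaluated against the unmodified copy ...
    results = {key: apply_if_callable(value, out) for key, value in kwargs.items()}
    # ... and only then inserted, sorted alphabetically by key.
    for key, value in sorted(results.items()):
        out[key] = value
    return out


frame = {"A": [1, 2, 3]}


def increment(d):
    return [a + 1 for a in d["A"]]


def negate(d):
    return [-a for a in d["A"]]


new = assign_ordered(frame, A=increment, C=negate)
old = assign_sorted(frame, A=increment, C=negate)

print(new)  # C is derived from the updated A
print(old)  # C is derived from the original A
```

With these inputs, the compute-first path yields the same `C` values as the "Previous Behaviour" output in the warning, while the ordered path negates the already incremented column.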
