@@ -95,7 +95,7 @@ constructed from the sorted keys of the dict, if possible.
NaN (not a number) is the standard missing data marker used in pandas.

- **From scalar value**
+ **From scalar value**

If ``data`` is a scalar value, an index must be
provided. The value will be repeated to match the length of **index**.
@@ -154,7 +154,7 @@ See also the :ref:`section on attribute access<indexing.attribute_access>`.
Vectorized operations and label alignment with Series
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- When working with raw NumPy arrays, looping through value-by-value is usually
+ When working with raw NumPy arrays, looping through value-by-value is usually
not necessary. The same is true when working with Series in pandas.
Series can also be passed into most NumPy methods expecting an ndarray.
@@ -324,7 +324,7 @@ From a list of dicts
From a dict of tuples
~~~~~~~~~~~~~~~~~~~~~

- You can automatically create a multi-indexed frame by passing a tuples
+ You can automatically create a multi-indexed frame by passing a tuples
dictionary.

.. ipython:: python
@@ -347,7 +347,7 @@ column name provided).
**Missing Data**

Much more will be said on this topic in the :ref:`Missing data <missing_data>`
- section. To construct a DataFrame with missing data, we use ``np.nan`` to
+ section. To construct a DataFrame with missing data, we use ``np.nan`` to
represent missing values. Alternatively, you may pass a ``numpy.MaskedArray``
as the data argument to the DataFrame constructor, and its masked entries will
be considered missing.
@@ -370,7 +370,7 @@ set to ``'index'`` in order to use the dict keys as row labels.
``DataFrame.from_records`` takes a list of tuples or an ndarray with structured
dtype. It works analogously to the normal ``DataFrame`` constructor, except that
- the resulting DataFrame index may be a specific field of the structured
+ the resulting DataFrame index may be a specific field of the structured
dtype. For example:

.. ipython:: python
@@ -506,25 +506,36 @@ to be inserted (for example, a ``Series`` or NumPy array), or a function
of one argument to be called on the ``DataFrame``. A *copy* of the original
DataFrame is returned, with the new values inserted.

+ Starting from Python 3.6 ``**kwargs`` is an ordered dictionary and :func:`DataFrame.assign`
+ respects the order of the keyword arguments. You can use assign in the following way:
+
+ .. ipython:: python
+
+    dfa = pd.DataFrame({"A": [1, 2, 3],
+                        "B": [4, 5, 6]})
+    dfa.assign(C=lambda x: x['A'] + x['B'],
+               D=lambda x: x['A'] + x['C'])
+
.. warning::

-    Since the function signature of ``assign`` is ``**kwargs``, a dictionary,
-    the order of the new columns in the resulting DataFrame cannot be guaranteed
-    to match the order you pass in. To make things predictable, items are inserted
-    alphabetically (by key) at the end of the DataFrame.
+    Prior to Python 3.6, this may subtly change the behavior of your code when you are
+    using :func:`DataFrame.assign` to update an existing column.

-    All expressions are computed first, and then assigned. So you can't refer
-    to another column being assigned in the same call to ``assign``. For example:
+    Since the function signature of ``assign`` is ``**kwargs``, a dictionary,
+    the order of the new columns in the resulting DataFrame cannot be guaranteed
+    to match the order you pass in. To make things predictable, items are inserted
+    alphabetically (by key) at the end of the DataFrame.
.. ipython::
-    :verbatim:
+    :verbatim:
+
+    In [1]: # Don't do this, bad reference to `C`
+            df.assign(C = lambda x: x['A'] + x['B'],
+                      D = lambda x: x['A'] + x['C'])
+    In [2]: # Instead, break it into two assigns
+            (df.assign(C = lambda x: x['A'] + x['B'])
+               .assign(D = lambda x: x['A'] + x['C']))

-    In [1]: # Don't do this, bad reference to `C`
-            df.assign(C = lambda x: x['A'] + x['B'],
-                      D = lambda x: x['A'] + x['C'])
-    In [2]: # Instead, break it into two assigns
-            (df.assign(C = lambda x: x['A'] + x['B'])
-               .assign(D = lambda x: x['A'] + x['C']))

Indexing / Selection
~~~~~~~~~~~~~~~~~~~~
@@ -914,7 +925,7 @@ For example, using the earlier example data, we could do:
Squeezing
~~~~~~~~~

- Another way to change the dimensionality of an object is to ``squeeze`` a 1-len
+ Another way to change the dimensionality of an object is to ``squeeze`` a 1-len
object, similar to ``wp['Item1']``.

.. ipython:: python
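The ``assign`` hunk in this diff relies on keyword arguments keeping their call order in Python 3.6 and later. The following is a minimal pure-Python sketch of that idea, not pandas' actual implementation. ``kwarg_order`` and ``toy_assign`` are hypothetical helpers invented for illustration, and plain dicts of lists stand in for a DataFrame.

```python
# Toy illustration only: this is NOT how pandas implements DataFrame.assign.
# `kwarg_order` and `toy_assign` are hypothetical names used for this sketch.


def kwarg_order(**kwargs):
    """Return keyword names in the order the caller wrote them (Python 3.6+)."""
    return list(kwargs)


def toy_assign(columns, **kwargs):
    """Add columns one keyword at a time, so later callables see earlier results."""
    result = dict(columns)  # shallow copy; the input mapping is left untouched
    for name, value in kwargs.items():
        # A callable receives the columns built so far, including new ones.
        result[name] = value(result) if callable(value) else value
    return result


cols = {"A": [1, 2, 3], "B": [4, 5, 6]}
out = toy_assign(
    cols,
    C=lambda x: [a + b for a, b in zip(x["A"], x["B"])],
    D=lambda x: [a + c for a, c in zip(x["A"], x["C"])],  # refers to the new "C"
)

print(kwarg_order(C=1, D=2))
print(out["C"], out["D"])
```

Because each keyword is applied in call order, the callable for ``D`` can read the freshly created ``C``. Under the older behavior described in the removed warning text, the same call would have needed two chained ``assign`` calls.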