@@ -31,10 +31,10 @@ operations.
 Concatenating objects
 ---------------------

-The :func:`~pandas.concat` function (in the main pandas namespace) does all of
-the heavy lifting of performing concatenation operations along an axis while
-performing optional set logic (union or intersection) of the indexes (if any) on
-the other axes. Note that I say "if any" because there is only a single possible
+The :func:`~pandas.concat` function (in the main pandas namespace) does all of
+the heavy lifting of performing concatenation operations along an axis while
+performing optional set logic (union or intersection) of the indexes (if any) on
+the other axes. Note that I say "if any" because there is only a single possible
 axis of concatenation for Series.

 Before diving into all of the details of ``concat`` and what it can do, here is
@@ -109,9 +109,9 @@ some configurable handling of "what to do with the other axes":
   to the actual data concatenation.
 - ``copy`` : boolean, default True. If False, do not copy data unnecessarily.

-Without a little bit of context many of these arguments don't make much sense.
-Let's revisit the above example. Suppose we wanted to associate specific keys
-with each of the pieces of the chopped up DataFrame. We can do this using the
+Without a little bit of context many of these arguments don't make much sense.
+Let's revisit the above example. Suppose we wanted to associate specific keys
+with each of the pieces of the chopped up DataFrame. We can do this using the
 ``keys`` argument:

 .. ipython:: python
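
Note on the hunk above: the paragraph describes attaching a key to each piece of a chopped-up DataFrame. The following is a minimal sketch of that idea, not the example from the file itself; the frame `df`, the list `pieces`, and the key labels are hypothetical names chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the "chopped up" frame discussed in the text.
df = pd.DataFrame({"A": range(6), "B": list("abcdef")})
pieces = [df.iloc[:2], df.iloc[2:4], df.iloc[4:]]

# keys= adds an outer index level that labels each piece.
keyed = pd.concat(pieces, keys=["x", "y", "z"])

# A piece can then be selected again by its key.
print(keyed.loc["y"])
```

The result carries a two-level index, so each original piece stays addressable after concatenation.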
@@ -138,9 +138,9 @@ It's not a stretch to see how this can be very useful. More detail on this
 functionality below.

 .. note::
-   It is worth noting that :func:`~pandas.concat` (and therefore
-   :func:`~pandas.append`) makes a full copy of the data, and that constantly
-   reusing this function can create a significant performance hit. If you need
+   It is worth noting that :func:`~pandas.concat` (and therefore
+   :func:`~pandas.append`) makes a full copy of the data, and that constantly
+   reusing this function can create a significant performance hit. If you need
    to use the operation over several datasets, use a list comprehension.

 ::
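
Note on the hunk above: the advice to use a list comprehension can be sketched as follows. This is an illustration under stated assumptions, not the snippet from the file: `load_chunk` is a hypothetical helper standing in for whatever produces each dataset.

```python
import pandas as pd


def load_chunk(i):
    # Hypothetical stand-in for reading or computing one dataset.
    return pd.DataFrame({"chunk": [i, i], "value": [i * 10, i * 10 + 1]})


# Build every piece first, then concatenate once, rather than
# re-copying a growing frame on each loop iteration.
frames = [load_chunk(i) for i in range(3)]
combined = pd.concat(frames, ignore_index=True)

print(combined)
```

Concatenating once means the data is copied a single time instead of once per piece.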
@@ -153,7 +153,7 @@ Set logic on the other axes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

 When gluing together multiple DataFrames, you have a choice of how to handle
-the other axes (other than the one being concatenated). This can be done in
+the other axes (other than the one being concatenated). This can be done in
 the following three ways:

 - Take the (sorted) union of them all, ``join='outer'``. This is the default
@@ -216,8 +216,8 @@ DataFrame:
 Concatenating using ``append``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-A useful shortcut to :func:`~pandas.concat` are the :meth:`~DataFrame.append`
-instance methods on ``Series`` and ``DataFrame``. These methods actually predated
+A useful shortcut to :func:`~pandas.concat` are the :meth:`~DataFrame.append`
+instance methods on ``Series`` and ``DataFrame``. These methods actually predated
 ``concat``. They concatenate along ``axis=0``, namely the index:

 .. ipython:: python
@@ -263,8 +263,8 @@ need to be:

 .. note::

-   Unlike the :py:meth:`~list.append` method, which appends to the original list
-   and returns ``None``, :meth:`~DataFrame.append` here **does not** modify
+   Unlike the :py:meth:`~list.append` method, which appends to the original list
+   and returns ``None``, :meth:`~DataFrame.append` here **does not** modify
    ``df1`` and returns its copy with ``df2`` appended.

 .. _merging.ignore_index:
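
Note on the hunk above: `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the sketch below uses `pd.concat` along `axis=0`, the general form the text says `append` is a shortcut for. It illustrates the same point as the note: the inputs are left unmodified and a new object is returned. The frames are made-up sample data.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4]})

# Concatenating along the index returns a new object;
# df1 and df2 themselves are not changed.
result = pd.concat([df1, df2])

print(len(df1), len(result))
```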
@@ -362,9 +362,9 @@ Passing ``ignore_index=True`` will drop all name references.
 More concatenating with group keys
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-A fairly common use of the ``keys`` argument is to override the column names
+A fairly common use of the ``keys`` argument is to override the column names
 when creating a new ``DataFrame`` based on existing ``Series``.
-Notice how the default behaviour consists on letting the resulting ``DataFrame``
+Notice how the default behaviour consists on letting the resulting ``DataFrame``
 inherit the parent ``Series``' name, when these existed.

 .. ipython:: python
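
Note on the hunk above: a small sketch of the two behaviours the paragraph describes, using made-up Series. The names `s1`, `s2`, `"left"`, and `"right"` are illustrative and do not come from the file.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], name="foo")
s2 = pd.Series([4, 5, 6])  # deliberately unnamed

# Default: a named Series passes its name on as the column label.
inherited = pd.concat([s1, s2], axis=1)

# keys= overrides the column labels regardless of the Series names.
renamed = pd.concat([s1, s2], axis=1, keys=["left", "right"])

print(inherited.columns.tolist())
print(renamed.columns.tolist())
```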
@@ -460,7 +460,7 @@ Appending rows to a DataFrame
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 While not especially efficient (since a new object must be created), you can
-append a single row to a ``DataFrame`` by passing a ``Series`` or dict to
+append a single row to a ``DataFrame`` by passing a ``Series`` or dict to
 ``append``, which returns a new ``DataFrame`` as above.

 .. ipython:: python
@@ -505,15 +505,15 @@ pandas has full-featured, **high performance** in-memory join operations
 idiomatically very similar to relational databases like SQL. These methods
 perform significantly better (in some cases well over an order of magnitude
 better) than other open source implementations (like ``base::merge.data.frame``
-in R). The reason for this is careful algorithmic design and the internal layout
+in R). The reason for this is careful algorithmic design and the internal layout
 of the data in ``DataFrame``.

 See the :ref:`cookbook<cookbook.merge>` for some advanced strategies.

 Users who are familiar with SQL but new to pandas might be interested in a
 :ref:`comparison with SQL<compare_with_sql.join>`.

-pandas provides a single function, :func:`~pandas.merge`, as the entry point for
+pandas provides a single function, :func:`~pandas.merge`, as the entry point for
 all standard database join operations between ``DataFrame`` objects:

 ::
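
Note on the hunk above: a minimal sketch of `pd.merge` as the single entry point for joins, using made-up frames. Only the `on` and `how` parameters are exercised here; the column names are illustrative.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": [1, 2, 3]})
right = pd.DataFrame({"key": ["K1", "K2", "K3"], "B": [10, 20, 30]})

# The default how="inner" keeps only keys present in both frames.
inner = pd.merge(left, right, on="key")

# how="outer" keeps the union of keys and fills the gaps with NaN.
outer = pd.merge(left, right, on="key", how="outer")

print(inner)
print(outer)
```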
@@ -582,7 +582,7 @@ and ``right`` is a subclass of DataFrame, the return type will still be
 ``DataFrame``.

 ``merge`` is a function in the pandas namespace, and it is also available as a
-``DataFrame`` instance method :meth:`~DataFrame.merge`, with the calling
+``DataFrame`` instance method :meth:`~DataFrame.merge`, with the calling
 ``DataFrame`` being implicitly considered the left object in the join.

 The related :meth:`~DataFrame.join` method, uses ``merge`` internally for the
@@ -594,7 +594,7 @@ Brief primer on merge methods (relational algebra)

 Experienced users of relational databases like SQL will be familiar with the
 terminology used to describe join operations between two SQL-table like
-structures (``DataFrame`` objects). There are several cases to consider which
+structures (``DataFrame`` objects). There are several cases to consider which
 are very important to understand:

 - **one-to-one** joins: for example when joining two ``DataFrame`` objects on
@@ -634,8 +634,8 @@ key combination:
           labels=['left', 'right'], vertical=False);
    plt.close('all');

-Here is a more complicated example with multiple join keys. Only the keys
-appearing in ``left`` and ``right`` are present (the intersection), since
+Here is a more complicated example with multiple join keys. Only the keys
+appearing in ``left`` and ``right`` are present (the intersection), since
 ``how='inner'`` by default.

 .. ipython:: python
@@ -751,13 +751,13 @@ Checking for duplicate keys

 .. versionadded:: 0.21.0

-Users can use the ``validate`` argument to automatically check whether there
-are unexpected duplicates in their merge keys. Key uniqueness is checked before
-merge operations and so should protect against memory overflows. Checking key
-uniqueness is also a good way to ensure user data structures are as expected.
+Users can use the ``validate`` argument to automatically check whether there
+are unexpected duplicates in their merge keys. Key uniqueness is checked before
+merge operations and so should protect against memory overflows. Checking key
+uniqueness is also a good way to ensure user data structures are as expected.

-In the following example, there are duplicate values of ``B`` in the right
-``DataFrame``. As this is not a one-to-one merge -- as specified in the
+In the following example, there are duplicate values of ``B`` in the right
+``DataFrame``. As this is not a one-to-one merge -- as specified in the
 ``validate`` argument -- an exception will be raised.


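
Note on the hunk above: the `validate` check can be sketched as below. The sample frames are illustrative; the right frame repeats the value 2 in column `B`, so a one-to-one validation should fail while a one-to-many validation should pass. `MergeError` is imported from `pandas.errors`.

```python
import pandas as pd
from pandas.errors import MergeError

left = pd.DataFrame({"A": [1, 2], "B": [1, 2]})
right = pd.DataFrame({"A": [4, 5, 6], "B": [2, 2, 2]})

# Duplicate keys on the right violate a one-to-one merge,
# so the check fails before the merge is carried out.
try:
    pd.merge(left, right, on="B", how="outer", validate="one_to_one")
    raised = False
except MergeError:
    raised = True

# one_to_many only requires unique keys on the left, so this succeeds.
ok = pd.merge(left, right, on="B", how="outer", validate="one_to_many")

print(raised)
print(ok)
```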
@@ -770,11 +770,11 @@ In the following example, there are duplicate values of ``B`` in the right

   In [53]: result = pd.merge(left, right, on='B', how='outer', validate="one_to_one")
   ...
-  MergeError: Merge keys are not unique in right dataset; not a one-to-one merge
+  MergeError: Merge keys are not unique in right dataset; not a one-to-one merge

-If the user is aware of the duplicates in the right ``DataFrame`` but wants to
-ensure there are no duplicates in the left DataFrame, one can use the
-``validate='one_to_many'`` argument instead, which will not raise an exception.
+If the user is aware of the duplicates in the right ``DataFrame`` but wants to
+ensure there are no duplicates in the left DataFrame, one can use the
+``validate='one_to_many'`` argument instead, which will not raise an exception.

 .. ipython:: python

@@ -786,8 +786,8 @@ ensure there are no duplicates in the left DataFrame, one can use the
 The merge indicator
 ~~~~~~~~~~~~~~~~~~~

-:func:`~pandas.merge` accepts the argument ``indicator``. If ``True``, a
-Categorical-type column called ``_merge`` will be added to the output object
+:func:`~pandas.merge` accepts the argument ``indicator``. If ``True``, a
+Categorical-type column called ``_merge`` will be added to the output object
 that takes on values:

 =================================== ================
@@ -895,7 +895,7 @@ Joining on index
 ~~~~~~~~~~~~~~~~

 :meth:`DataFrame.join` is a convenient method for combining the columns of two
-potentially differently-indexed ``DataFrames`` into a single result
+potentially differently-indexed ``DataFrames`` into a single result
 ``DataFrame``. Here is a very basic example:

 .. ipython:: python
@@ -975,9 +975,9 @@ indexes:
 Joining key columns on an index
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-:meth:`~DataFrame.join` takes an optional ``on`` argument which may be a column
+:meth:`~DataFrame.join` takes an optional ``on`` argument which may be a column
 or multiple column names, which specifies that the passed ``DataFrame`` is to be
-aligned on that column in the ``DataFrame``. These two function calls are
+aligned on that column in the ``DataFrame``. These two function calls are
 completely equivalent:

 ::
@@ -987,7 +987,7 @@ completely equivalent:
           how='left', sort=False)

 Obviously you can choose whichever form you find more convenient. For
-many-to-one joins (where one of the ``DataFrame``'s is already indexed by the
+many-to-one joins (where one of the ``DataFrame``'s is already indexed by the
 join key), using ``join`` may be more convenient. Here is a simple example:

 .. ipython:: python
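
Note on the hunk above: the equivalence between `join(..., on=...)` and `merge(..., left_on=..., right_index=True, how='left', sort=False)` can be checked on a small many-to-one case. The frames below are made up for illustration; the right frame is already indexed by the join key.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K0"], "A": [1, 2, 3]})
right = pd.DataFrame({"B": [10, 20]}, index=["K0", "K1"])

# Align the "key" column of left against the index of right.
via_join = left.join(right, on="key")

# The spelled-out merge call described in the text.
via_merge = pd.merge(left, right, left_on="key", right_index=True,
                     how="left", sort=False)

print(via_join)
print(via_merge)
```

On this sample both calls attach the same `B` values to each row of `left`.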
@@ -1266,7 +1266,7 @@ similarly.
 Joining multiple DataFrame or Panel objects
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-A list or tuple of ``DataFrames`` can also be passed to :meth:`~DataFrame.join`
+A list or tuple of ``DataFrames`` can also be passed to :meth:`~DataFrame.join`
 to join them together on their indexes.

 .. ipython:: python
@@ -1288,7 +1288,7 @@ Merging together values within Series or DataFrame columns
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Another fairly common situation is to have two like-indexed (or similarly
-indexed) ``Series`` or ``DataFrame`` objects and wanting to "patch" values in
+indexed) ``Series`` or ``DataFrame`` objects and wanting to "patch" values in
 one object from values for matching indices in the other. Here is an example:

 .. ipython:: python
@@ -1313,7 +1313,7 @@ For this, use the :meth:`~DataFrame.combine_first` method:
    plt.close('all');

 Note that this method only takes values from the right ``DataFrame`` if they are
-missing in the left ``DataFrame``. A related method, :meth:`~DataFrame.update`,
+missing in the left ``DataFrame``. A related method, :meth:`~DataFrame.update`,
 alters non-NA values inplace:

 .. ipython:: python
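
Note on the hunk above: the difference between `combine_first` and `update` can be sketched on two small made-up frames. `combine_first` fills only the missing entries of the caller, while `update` overwrites the caller in place with every non-NA value from the other frame.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [np.nan, 2.0, np.nan]})
df2 = pd.DataFrame({"x": [10.0, 20.0, 30.0]})

# Only the holes in df1 are patched from df2; the 2.0 is kept.
patched = df1.combine_first(df2)

# update() works in place, so operate on a copy to keep df1 intact.
updated = df1.copy()
updated.update(df2)

print(patched)
print(updated)
```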
@@ -1365,15 +1365,15 @@ Merging AsOf

 .. versionadded:: 0.19.0

-A :func:`merge_asof` is similar to an ordered left-join except that we match on
-nearest key rather than equal keys. For each row in the ``left`` ``DataFrame``,
-we select the last row in the ``right`` ``DataFrame`` whose ``on`` key is less
+A :func:`merge_asof` is similar to an ordered left-join except that we match on
+nearest key rather than equal keys. For each row in the ``left`` ``DataFrame``,
+we select the last row in the ``right`` ``DataFrame`` whose ``on`` key is less
 than the left's key. Both DataFrames must be sorted by the key.

-Optionally an asof merge can perform a group-wise merge. This matches the
+Optionally an asof merge can perform a group-wise merge. This matches the
 ``by`` key equally, in addition to the nearest match on the ``on`` key.

-For example; we might have ``trades`` and ``quotes`` and we want to ``asof``
+For example; we might have ``trades`` and ``quotes`` and we want to ``asof``
 merge them.

 .. ipython:: python
@@ -1432,8 +1432,8 @@ We only asof within ``2ms`` between the quote time and the trade time.
                  by='ticker',
                  tolerance=pd.Timedelta('2ms'))

-We only asof within ``10ms`` between the quote time and the trade time and we
-exclude exact matches on time. Note that though we exclude the exact matches
+We only asof within ``10ms`` between the quote time and the trade time and we
+exclude exact matches on time. Note that though we exclude the exact matches
 (of the quotes), prior quotes **do** propagate to that point in time.

 .. ipython:: python
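
Note on the hunk above: a compact sketch of `merge_asof` with `by`, `tolerance`, and `allow_exact_matches`. The `trades` and `quotes` frames here are small made-up samples, not the data from the file. With the default `allow_exact_matches=True`, a quote whose time equals the trade time is eligible to match.

```python
import pandas as pd

trades = pd.DataFrame({
    "time": pd.to_datetime(["2016-05-25 13:30:00.023",
                            "2016-05-25 13:30:00.048"]),
    "ticker": ["MSFT", "MSFT"],
    "price": [51.95, 51.96],
})
quotes = pd.DataFrame({
    "time": pd.to_datetime(["2016-05-25 13:30:00.023",
                            "2016-05-25 13:30:00.030",
                            "2016-05-25 13:30:00.045"]),
    "ticker": ["MSFT", "MSFT", "MSFT"],
    "bid": [51.90, 51.91, 51.93],
})

# Most recent quote per ticker at or before each trade time.
basic = pd.merge_asof(trades, quotes, on="time", by="ticker")

# Same, but a quote older than 2ms is treated as no match.
tight = pd.merge_asof(trades, quotes, on="time", by="ticker",
                      tolerance=pd.Timedelta("2ms"))

# Within 10ms, and a quote at exactly the trade time is excluded.
strict = pd.merge_asof(trades, quotes, on="time", by="ticker",
                       tolerance=pd.Timedelta("10ms"),
                       allow_exact_matches=False)

print(basic[["time", "bid"]])
print(tight[["time", "bid"]])
print(strict[["time", "bid"]])
```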