@@ -4351,18 +4351,20 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
Series is passed, its name attribute must be set, and that will be
used as the column name in the resulting joined DataFrame
on : column name, tuple/list of column names, or array-like
- Column(s) to use for joining, otherwise join on index. If multiples
+ Column(s) in the caller to join on the index in other,
+ otherwise joins index-on-index. If multiples
columns given, the passed DataFrame must have a MultiIndex. Can
pass an array as the join key if not already contained in the
calling DataFrame. Like an Excel VLOOKUP operation
- how : {'left', 'right', 'outer', 'inner'}
- How to handle indexes of the two objects. Default: 'left'
- for joining on index, None otherwise
-
- * left: use calling frame's index
- * right: use input frame's index
- * outer: form union of indexes
- * inner: use intersection of indexes
+ how : {'left', 'right', 'outer', 'inner'}, default: 'left'
+ How to handle the operation of the two objects.
+
+ * left: use calling frame's index (or column if on is specified)
+ * right: use other frame's index
+ * outer: form union of calling frame's index (or column if on is
+ specified) with other frame's index
+ * inner: form intersection of calling frame's index (or column if
+ on is specified) with other frame's index
lsuffix : string
Suffix to use from left frame's overlapping columns
rsuffix : string
@@ -4376,6 +4378,77 @@ def join(self, other, on=None, how='left', lsuffix='', rsuffix='',
on, lsuffix, and rsuffix options are not supported when passing a list
of DataFrame objects

+ Examples
+ --------
+ >>> caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
+ ...                        'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
+
+ >>> caller
+     A key
+ 0  A0  K0
+ 1  A1  K1
+ 2  A2  K2
+ 3  A3  K3
+ 4  A4  K4
+ 5  A5  K5
+
+ >>> other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
+ ...                       'B': ['B0', 'B1', 'B2']})
+
+ >>> other
+     B key
+ 0  B0  K0
+ 1  B1  K1
+ 2  B2  K2
+
+ Join DataFrames using their indexes.
+
+ >>> caller.join(other, lsuffix='_caller', rsuffix='_other')
+     A key_caller    B key_other
+ 0  A0         K0   B0        K0
+ 1  A1         K1   B1        K1
+ 2  A2         K2   B2        K2
+ 3  A3         K3  NaN       NaN
+ 4  A4         K4  NaN       NaN
+ 5  A5         K5  NaN       NaN
+
+
+ If we want to join using the key columns, we need to set key to be
+ the index in both caller and other. The joined DataFrame will have
+ key as its index.
+
+ >>> caller.set_index('key').join(other.set_index('key'))
+       A    B
+ key
+ K0   A0   B0
+ K1   A1   B1
+ K2   A2   B2
+ K3   A3  NaN
+ K4   A4  NaN
+ K5   A5  NaN
+
+ Another option to join using the key columns is to use the on
+ parameter. DataFrame.join always uses other's index, but we can use any
+ column in the caller. This method preserves the original caller's
+ index in the result.
+
+ >>> caller.join(other.set_index('key'), on='key')
+     A key    B
+ 0  A0  K0   B0
+ 1  A1  K1   B1
+ 2  A2  K2   B2
+ 3  A3  K3  NaN
+ 4  A4  K4  NaN
+ 5  A5  K5  NaN
+
+
+ See Also
+ --------
+ DataFrame.merge : For column(s)-on-column(s) operations
+
Returns
-------
joined : DataFrame
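
The docstring examples above only exercise the default how='left'. As a minimal sketch (not part of this commit), the snippet below reuses the same toy frames to contrast how='left' with how='inner' when on is given, and shows the DataFrame.merge call that the See also entry points to. The variable names other_by_key, left, inner, and merged are illustrative assumptions.

```python
import pandas as pd

# Same toy frames as in the docstring examples.
caller = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                       'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# join always matches against other's index, so move 'key' there first.
other_by_key = other.set_index('key')

# how='left' (the default): every caller row is kept; rows whose key is
# missing from other's index get NaN in column B.
left = caller.join(other_by_key, on='key')

# how='inner': only rows whose caller 'key' value appears in other's index
# are kept.
inner = caller.join(other_by_key, on='key', how='inner')

# Column-on-column alternative: merge needs no set_index on either frame.
merged = caller.merge(other, on='key', how='inner')

print(left)
print(inner)
print(merged)
```

With these inputs, left keeps all six caller rows while inner and merged keep only the three rows for K0, K1, and K2.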