DOC: Improve the docstring of pandas.DataFrame.append() #20267
Changes from 6 commits
4377e47
01af9d1
1ffcece
6c3946a
083afcc
389d7ee
cb78862
df9e2a8
023ed83
cb9ee77
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -140,9 +140,11 @@ | |
columns, the index will be passed on. | ||
|
||
Parameters | ||
----------%s | ||
right : DataFrame | ||
---------- | ||
right : DataFrame or Series/dict-like object | ||
how : {'left', 'right', 'outer', 'inner'}, default 'inner' | ||
How to handle the operation of the two objects. | ||
|
||
* left: use only keys from left frame, similar to a SQL left outer join; | ||
preserve key order | ||
* right: use only keys from right frame, similar to a SQL right outer join; | ||
|
@@ -166,18 +168,18 @@ | |
left_index : boolean, default False | ||
Use the index from the left DataFrame as the join key(s). If it is a | ||
MultiIndex, the number of keys in the other DataFrame (either the index | ||
or a number of columns) must match the number of levels | ||
or a number of columns) must match the number of levels. | ||
right_index : boolean, default False | ||
Use the index from the right DataFrame as the join key. Same caveats as | ||
left_index | ||
left_index. | ||
sort : boolean, default False | ||
Sort the join keys lexicographically in the result DataFrame. If False, | ||
the order of the join keys depends on the join type (how keyword) | ||
the order of the join keys depends on the join type (how keyword). | ||
suffixes : 2-length sequence (tuple, list, ...) | ||
Suffix to apply to overlapping column names in the left and right | ||
side, respectively | ||
side, respectively. | ||
copy : boolean, default True | ||
If False, do not copy data unnecessarily | ||
If False, do not copy data unnecessarily. | ||
Review comment: Generally put built-ins in backticks, so `False` wherever applicable. |
||
indicator : boolean or string, default False | ||
If True, adds a column to output DataFrame called "_merge" with | ||
information on the source of each row. | ||
|
@@ -199,7 +201,7 @@ | |
dataset. | ||
* "many_to_many" or "m:m": allowed, but does not result in checks. | ||
|
||
.. versionadded:: 0.21.0 | ||
.. versionadded:: 0.21.0. | ||
Review comment: Don't put a period at the end of the |
||
|
||
Notes | ||
----- | ||
|
@@ -209,21 +211,31 @@ | |
Examples | ||
-------- | ||
|
||
>>> A >>> B | ||
lkey value rkey value | ||
0 foo 1 0 foo 5 | ||
1 bar 2 1 bar 6 | ||
2 baz 3 2 qux 7 | ||
3 foo 4 3 bar 8 | ||
>>> A = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'], | ||
... 'value': [1, 2, 3, 5]}) | ||
>>> B = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'], | ||
... 'value': [5, 6, 7, 8]}) | ||
>>> A | ||
lkey value | ||
0 foo 1 | ||
1 bar 2 | ||
2 baz 3 | ||
3 foo 4 | ||
>>> B | ||
rkey value | ||
0 foo 5 | ||
1 bar 6 | ||
2 baz 7 | ||
3 foo 8 | ||
|
||
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer') | ||
lkey value_x rkey value_y | ||
0 foo 1 foo 5 | ||
1 foo 4 foo 5 | ||
2 bar 2 bar 6 | ||
3 bar 2 bar 8 | ||
4 baz 3 NaN NaN | ||
5 NaN NaN qux 7 | ||
lkey value_x rkey value_y | ||
0 foo 1 foo 5 | ||
1 foo 1 foo 8 | ||
2 foo 5 foo 5 | ||
3 foo 5 foo 8 | ||
4 bar 2 bar 6 | ||
5 baz 3 baz 7 | ||
|
||
Returns | ||
------- | ||
|
@@ -237,11 +249,9 @@ | |
merge_asof | ||
DataFrame.join | ||
""" | ||
|
||
# ----------------------------------------------------------------------- | ||
# DataFrame class | ||
|
||
|
||
Review comment: @Triple0 Can you undo those unrelated white-space changes? |
||
class DataFrame(NDFrame): | ||
""" Two-dimensional size-mutable, potentially heterogeneous tabular data | ||
structure with labeled axes (rows and columns). Arithmetic operations | ||
|
@@ -2689,14 +2699,15 @@ def insert(self, loc, column, value, allow_duplicates=False): | |
allow_duplicates=allow_duplicates) | ||
|
||
def assign(self, **kwargs): | ||
r""" | ||
Review comment: There is a reason for this 'r' (the |
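The comment above is cut off, so the specific reason it gives is not recoverable here. As general background only, and not as a claim about this particular docstring: an `r` prefix keeps backslashes literal, which matters whenever a docstring contains sequences such as `\n`. A minimal sketch, where `plain` and `raw` are hypothetical functions invented for illustration:

```python
def plain():
    """Newline is written as \n in this docstring."""


def raw():
    r"""Newline is written as \n in this docstring."""


# Without the prefix, \n is interpreted as a real newline character.
print("\n" in plain.__doc__)

# With the prefix, the backslash and the letter n are both preserved,
# and no real newline appears in the text.
print("\\n" in raw.__doc__)
print("\n" in raw.__doc__)
```

Dropping the prefix from a docstring that relies on literal backslashes therefore changes the rendered documentation without raising an error.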
||
Assign new columns to a DataFrame, returning a new object | ||
(a copy) with all the original columns in addition to the new ones. | ||
""" | ||
Assign new columns to a DataFrame. | ||
|
||
Returns a new object with all original columns in addition to new ones. | ||
|
||
Parameters | ||
---------- | ||
kwargs : keyword, value pairs | ||
keywords are the column names. If the values are | ||
The column names are keywords. If the values are | ||
callable, they are computed on the DataFrame and | ||
assigned to the new columns. The callable must not | ||
change input DataFrame (though pandas doesn't check it). | ||
|
@@ -2719,9 +2730,13 @@ def assign(self, **kwargs): | |
or modified columns. All items are computed first, and then assigned | ||
in alphabetical order. | ||
|
||
.. versionchanged :: 0.23.0 | ||
.. versionchanged :: 0.23.0. | ||
|
||
Keyword argument order is maintained for Python 3.6 and later. | ||
Keyword argument order is maintained for Python 3.6 and later. | ||
Review comment: The indentation is needed so that this line stays part of the versionchanged directive. |
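To illustrate the point about indentation: in reStructuredText, the lines indented beneath a directive form its body, and a line back at the base indentation is ordinary text. The sketch below uses a hypothetical function `documented` and writes the directive in the usual `.. versionchanged::` spelling (an assumption; the diff spells it with a space before `::`). It is a simple indentation check, not a real reStructuredText parser.

```python
import inspect


def documented():
    """
    Notes
    -----
    .. versionchanged:: 0.23.0

       Keyword argument order is maintained for Python 3.6 and later.

    This paragraph is at the base indentation, so it is ordinary text.
    """


# cleandoc strips the common leading indentation of the docstring.
lines = inspect.cleandoc(documented.__doc__).splitlines()
directive = next(
    i for i, line in enumerate(lines) if line.startswith(".. versionchanged::")
)

# Lines that are still indented after the directive belong to its body.
body = [line for line in lines[directive + 1:] if line.startswith(" ")]
print(body)
```

If the indented sentence were moved to the base indentation, `body` would be empty and the sentence would render as a regular paragraph rather than as part of the version note.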
||
|
||
See Also | ||
-------- | ||
DataFrame.assign: For column(s)-on-DataFrame operations | ||
Review comment: This is the docstring of DataFrame.assign, I think? If so, there is no need for it to refer to itself. |
||
|
||
Examples | ||
-------- | ||
|
@@ -5053,8 +5068,9 @@ def infer(x): | |
|
||
def append(self, other, ignore_index=False, verify_integrity=False): | ||
""" | ||
Append rows of `other` to the end of this frame, returning a new | ||
object. Columns not in this frame are added as new columns. | ||
Append rows of `other` to the end of `caller`, returning a new object. | ||
Review comment: 'caller' doesn't need to be quoted (it's not an argument). Also, given that this docstring is specific to DataFrame (not shared with Series), I think we can be more specific. So maybe "calling DataFrame"? (Or is that too long?) |
||
|
||
Columns not in other are added as new columns. | ||
Review comment: It's not the columns in |
||
|
||
Parameters | ||
---------- | ||
|
@@ -5136,7 +5152,6 @@ def append(self, other, ignore_index=False, verify_integrity=False): | |
2 2 | ||
3 3 | ||
4 4 | ||
|
||
""" | ||
if isinstance(other, (Series, dict)): | ||
if isinstance(other, dict): | ||
|
@@ -5402,7 +5417,6 @@ def round(self, decimals=0, *args, **kwargs): | |
-------- | ||
numpy.around | ||
Series.round | ||
|
||
""" | ||
from pandas.core.reshape.concat import concat | ||
|
||
|
Review comment: Since `Series` and `dict` are distinct types, I'd separate these out as `DataFrame`, `Series` or `dict`. Update for append as well.
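For reference, the rewritten `merge` example in the diff can be run as a short script. Note that the `A` constructor in the added lines ends its `value` list with 5, while the printed frame shows 4 in the last row; the sketch below follows the constructor, which matches the merged output shown in the diff. Row order of an outer merge may differ between pandas versions, so the rows are compared after sorting. The names `merged` and `rows` are introduced here for illustration.

```python
import pandas as pd

# Frames as constructed in the added docstring lines.
A = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
                  'value': [1, 2, 3, 5]})
B = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
                  'value': [5, 6, 7, 8]})

# Both frames have a 'value' column, so the default suffixes
# '_x' and '_y' are applied to tell them apart.
merged = A.merge(B, left_on='lkey', right_on='rkey', how='outer')
print(list(merged.columns))

# Collect (key, left value, right value) triples in a stable order.
rows = sorted(
    (key, int(left), int(right))
    for key, left, right in zip(merged['lkey'],
                                merged['value_x'],
                                merged['value_y'])
)
for row in rows:
    print(row)
```

Each `foo` row on the left pairs with each `foo` row on the right, which is why the two `foo` keys produce four rows in the result.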