DOC: Improve the docstring of pandas.DataFrame.append() #20267

Merged
merged 10 commits into from
Jul 8, 2018
80 changes: 47 additions & 33 deletions pandas/core/frame.py
@@ -140,9 +140,11 @@
columns, the index will be passed on.

Parameters
----------%s
right : DataFrame
----------
right : DataFrame or Series/dict-like object
Member

Since Series and dict are distinct types, I'd separate these out as `DataFrame, Series or dict`. Please update this for append as well.

how : {'left', 'right', 'outer', 'inner'}, default 'inner'
How to handle the operation of the two objects.

* left: use only keys from left frame, similar to a SQL left outer join;
preserve key order
* right: use only keys from right frame, similar to a SQL right outer join;
@@ -166,18 +168,18 @@
left_index : boolean, default False
Use the index from the left DataFrame as the join key(s). If it is a
MultiIndex, the number of keys in the other DataFrame (either the index
or a number of columns) must match the number of levels
or a number of columns) must match the number of levels.
right_index : boolean, default False
Use the index from the right DataFrame as the join key. Same caveats as
left_index
left_index.
sort : boolean, default False
Sort the join keys lexicographically in the result DataFrame. If False,
the order of the join keys depends on the join type (how keyword)
the order of the join keys depends on the join type (how keyword).
suffixes : 2-length sequence (tuple, list, ...)
Suffix to apply to overlapping column names in the left and right
side, respectively
side, respectively.
copy : boolean, default True
If False, do not copy data unnecessarily
If False, do not copy data unnecessarily.
Member

Generally, put built-ins in backticks, so `False` wherever applicable.

indicator : boolean or string, default False
If True, adds a column to output DataFrame called "_merge" with
information on the source of each row.
@@ -199,7 +201,7 @@
dataset.
* "many_to_many" or "m:m": allowed, but does not result in checks.

.. versionadded:: 0.21.0
.. versionadded:: 0.21.0.
Member

Don't put a period at the end of the versionadded directive
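
For reference, the `suffixes` and `indicator` parameters documented in this hunk can be illustrated with a small sketch. The frames `left` and `right` and their contents are made up for illustration and are not taken from the PR.

```python
import pandas as pd

# Hypothetical frames: both have a 'value' column, so suffixes apply.
left = pd.DataFrame({"key": ["foo", "bar"], "value": [1, 2]})
right = pd.DataFrame({"key": ["foo", "qux"], "value": [5, 7]})

# indicator=True adds a "_merge" column recording where each row came from.
merged = left.merge(right, on="key", how="outer", indicator=True)

# Overlapping column names get the default suffixes "_x" and "_y".
print(merged.columns.tolist())

# Map each key to its source, independent of row order.
source = dict(zip(merged["key"], merged["_merge"].astype(str)))
print(source)
```

The row order of an outer merge is deliberately not asserted here, since it depends on the key-sorting behaviour of the join.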


Notes
-----
@@ -209,21 +211,31 @@
Examples
--------

>>> A >>> B
lkey value rkey value
0 foo 1 0 foo 5
1 bar 2 1 bar 6
2 baz 3 2 qux 7
3 foo 4 3 bar 8
>>> A = pd.DataFrame({'lkey': ['foo', 'bar', 'baz', 'foo'],
... 'value': [1, 2, 3, 5]})
>>> B = pd.DataFrame({'rkey': ['foo', 'bar', 'baz', 'foo'],
... 'value': [5, 6, 7, 8]})
>>> A
lkey value
0 foo 1
1 bar 2
2 baz 3
3 foo 5
>>> B
rkey value
0 foo 5
1 bar 6
2 baz 7
3 foo 8

>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
lkey value_x rkey value_y
0 foo 1 foo 5
1 foo 4 foo 5
2 bar 2 bar 6
3 bar 2 bar 8
4 baz 3 NaN NaN
5 NaN NaN qux 7
lkey value_x rkey value_y
0 foo 1 foo 5
1 foo 1 foo 8
2 foo 5 foo 5
3 foo 5 foo 8
4 bar 2 bar 6
5 baz 3 baz 7

Returns
-------
@@ -237,11 +249,9 @@
merge_asof
DataFrame.join
"""

# -----------------------------------------------------------------------
# DataFrame class


Member

@Triple0 Can you undo those unrelated white-space changes?

class DataFrame(NDFrame):
""" Two-dimensional size-mutable, potentially heterogeneous tabular data
structure with labeled axes (rows and columns). Arithmetic operations
@@ -2689,14 +2699,15 @@ def insert(self, loc, column, value, allow_duplicates=False):
allow_duplicates=allow_duplicates)

def assign(self, **kwargs):
r"""
Member

There is a reason for this `r` prefix: the `\` in the docstring should not be interpreted as an escape character.
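
A minimal sketch of the point about the `r` prefix, using two hypothetical functions (`plain` and `raw`) that are not part of pandas:

```python
def plain():
    """Without the prefix, \n becomes a real newline character."""


def raw():
    r"""With the prefix, \n stays as a backslash followed by the letter n."""


# The non-raw docstring contains an actual newline character.
print("\n" in plain.__doc__)

# The raw docstring keeps the two literal characters '\' and 'n'.
print("\\n" in raw.__doc__, "\n" in raw.__doc__)
```

Dropping the prefix from a docstring that contains backslashes therefore changes the text that Sphinx and `help()` see.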

Assign new columns to a DataFrame, returning a new object
(a copy) with all the original columns in addition to the new ones.
"""
Assign new columns to a DataFrame.

Returns a new object with all original columns in addition to new ones.

Parameters
----------
kwargs : keyword, value pairs
keywords are the column names. If the values are
The column names are keywords. If the values are
callable, they are computed on the DataFrame and
assigned to the new columns. The callable must not
change input DataFrame (though pandas doesn't check it).
@@ -2719,9 +2730,13 @@ def assign(self, **kwargs):
or modified columns. All items are computed first, and then assigned
in alphabetical order.

.. versionchanged :: 0.23.0
.. versionchanged :: 0.23.0.

Keyword argument order is maintained for Python 3.6 and later.
Keyword argument order is maintained for Python 3.6 and later.
Member

The indentation is needed to make this line part of the versionchanged directive.


See Also
--------
DataFrame.assign: For column(s)-on-DataFrame operations
Member

I think this is the docstring of DataFrame.assign itself, so it doesn't need to refer to itself under See Also.
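
For context, the `assign` behaviour this docstring describes (callables evaluated on the DataFrame, keyword order preserved) can be sketched as follows. It assumes pandas 0.23 or later on Python 3.6 or later; the frame and column names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"temp_c": [17.0, 25.0]})

# Callables receive the DataFrame being built; because keyword order is
# preserved, temp_k can refer to temp_f created earlier in the same call.
result = df.assign(
    temp_f=lambda d: d["temp_c"] * 9 / 5 + 32,
    temp_k=lambda d: (d["temp_f"] + 459.67) * 5 / 9,
)

print(result.columns.tolist())

# assign returns a new object; the original frame is left unchanged.
print(df.columns.tolist())
```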


Examples
--------
@@ -5053,8 +5068,9 @@ def infer(x):

def append(self, other, ignore_index=False, verify_integrity=False):
"""
Append rows of `other` to the end of this frame, returning a new
object. Columns not in this frame are added as new columns.
Append rows of `other` to the end of `caller`, returning a new object.
Member

'caller' does not need backticks (it's not an argument).

Also, given that this docstring is specific to DataFrame (not shared with Series), I think we can be more specific. So maybe "calling DataFrame"? (Or is that too long?)


Columns not in other are added as new columns.
Member

@jorisvandenbossche Mar 11, 2018

It's not the columns in other, but those in the "calling DataFrame". Or, more explicitly: "Columns in `other` that are not in the calling DataFrame are added as new columns."
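
The column behaviour described in this comment can be sketched with two small frames. Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so this sketch uses `pd.concat`, which handles columns the same way; the names `caller` and `other` and their contents are illustrative.

```python
import pandas as pd

caller = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
other = pd.DataFrame({"B": [5], "C": [6]})

# On pandas < 2.0 this would be caller.append(other, ignore_index=True).
result = pd.concat([caller, other], ignore_index=True)

# 'C' exists only in `other`, so it is added as a new column ...
print(result.columns.tolist())

# ... and is missing (NaN) for the rows that came from the calling DataFrame.
print(result["C"].isna().tolist())
```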


Parameters
----------
@@ -5136,7 +5152,6 @@ def append(self, other, ignore_index=False, verify_integrity=False):
2 2
3 3
4 4

"""
if isinstance(other, (Series, dict)):
if isinstance(other, dict):
@@ -5402,7 +5417,6 @@ def round(self, decimals=0, *args, **kwargs):
--------
numpy.around
Series.round

"""
from pandas.core.reshape.concat import concat
