-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: Updating operators docstrings #20415
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
acab08e
1818aeb
86cfd56
b68b61f
13fed5f
8bdbc14
4668c5f
e6eb9b9
db143c4
33ff1e4
e138d92
50e9d98
aa016fd
c2cc037
644273b
240a502
bbcdcbe
a33f003
70950c0
6bcb9b9
e7da1e9
20cbec1
722ae81
4580f7a
49c7b82
1e4e450
ec71a04
25129ff
d344688
eaaee0d
e777e87
6879e89
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -397,6 +397,169 @@ def _get_op_name(op, special): | |
e NaN -2.0 | ||
""" | ||
|
||
_eq_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't want to create controversy with other languages, or give the idea that Python is competing with them. I know it's a bit exagerated, but let's better avoid this example. I like how it makes easier to understand the point, than with the random data. But I find slightly misleading that the languages are not the labels, and that the dataframes are compared with the rows sorted in a different way. I know it's very hard to find great examples (I know by experience, still couldn't find good ones for some docstrings I'm working on). But let's see if we can find something better. It just came to my mind that we could have two dataframes, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Understood. Will add a new example. Just thought in the spirit of open source! |
||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.eq(df2) | ||
tool score | ||
0 True False | ||
1 True True | ||
2 True False | ||
""" | ||
|
||
_ne_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.ne(df2) | ||
tool score | ||
0 False True | ||
1 False False | ||
2 False True | ||
""" | ||
|
||
_lt_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.lt(df2) | ||
tool score | ||
0 False True | ||
1 False False | ||
2 False False | ||
""" | ||
|
||
_le_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.le(df2) | ||
tool score | ||
0 True True | ||
1 True True | ||
2 True False | ||
""" | ||
|
||
_gt_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.gt(df2) | ||
tool score | ||
0 False False | ||
1 False False | ||
2 False True | ||
""" | ||
|
||
_ge_example_FRAME = """ | ||
>>> df1 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [100, 200, 300]}, | ||
... columns=['tool', 'score']) | ||
>>> df1 | ||
tool score | ||
0 python 100 | ||
1 r 200 | ||
2 julia 300 | ||
>>> df2 = pd.DataFrame({'tool': ['python', 'r', 'julia'], | ||
... 'score': [300, 200, 100]}, | ||
... columns=['tool', 'score']) | ||
>>> df2 | ||
tool score | ||
0 python 300 | ||
1 r 200 | ||
2 julia 100 | ||
>>> df1.ge(df2) | ||
tool score | ||
0 True False | ||
1 True True | ||
2 True True | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Personally I think that more than having a simple example for each docstring, we could also reuse the examples. So, we create the dataframes once, and we show the methods (not necessarily all, I'm sure the users will be able to extrapolate). And what it would be nice, is to show in examples what the parameters do, and how they change the results. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All examples uses the same df1 and df2. But I will work on showing the use of rows and columns axis args. |
||
""" | ||
|
||
_comp_others = """ | ||
DataFrame.eq : Return DataFrame of boolean values equal to the | ||
elementwise rows or columns of one DataFrame to another | ||
DataFrame.ne : Return DataFrame of boolean values not equal to | ||
the elementwise rows or columns of one DataFrame to another | ||
DataFrame.le : Return DataFrame of boolean values less than or | ||
equal the elementwise rows or columns of one DataFrame | ||
to another | ||
DataFrame.lt : Return DataFrame of boolean values strictly less | ||
than the elementwise rows or columns of one DataFrame to | ||
another | ||
DataFrame.ge : Return DataFrame of boolean values greater than or | ||
equal to the elementwise rows or columns of one DataFrame to | ||
another | ||
DataFrame.gt : Return DataFrame of boolean values strictly greater | ||
than to the elementwise rows or columns of one DataFrame to | ||
another | ||
""" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure if I'm missing something, but is this always the same for all methods? It could go directly to the doctring and not in a separate variable if that's the case. Besides good docstrings, it's good if we keep the code as simple as possible. So we usually leave the same Also, do you think we can find something less verbose for the descriptions. Something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Though much of the words are the same, they differ slightly in the middle. Are you advising removing descriptions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, what I meant is that if it's possible, it'd be good that each of these descriptions is shorter. Just making them more concise if you find the way. |
||
|
||
_op_descriptions = { | ||
# Arithmetic Operators | ||
'add': {'op': '+', | ||
|
@@ -435,28 +598,34 @@ def _get_op_name(op, special): | |
# Comparison Operators | ||
'eq': {'op': '==', | ||
'desc': 'Equal to', | ||
'reverse': None, | ||
'df_examples': None}, | ||
'reverse': 'ne', | ||
'df_examples': _eq_example_FRAME, | ||
'others': _comp_others}, | ||
'ne': {'op': '!=', | ||
'desc': 'Not equal to', | ||
'reverse': None, | ||
'df_examples': None}, | ||
'reverse': 'eq', | ||
'df_examples': _ne_example_FRAME, | ||
'others': _comp_others}, | ||
'lt': {'op': '<', | ||
'desc': 'Less than', | ||
'reverse': None, | ||
'df_examples': None}, | ||
'reverse': 'ge', | ||
'df_examples': _lt_example_FRAME, | ||
'others': _comp_others}, | ||
'le': {'op': '<=', | ||
'desc': 'Less than or equal to', | ||
'reverse': None, | ||
'df_examples': None}, | ||
'reverse': 'gt', | ||
'df_examples': _le_example_FRAME, | ||
'others': _comp_others}, | ||
'gt': {'op': '>', | ||
'desc': 'Greater than', | ||
'reverse': None, | ||
'df_examples': None}, | ||
'reverse': 'le', | ||
'df_examples': _gt_example_FRAME, | ||
'others': _comp_others}, | ||
'ge': {'op': '>=', | ||
'desc': 'Greater than or equal to', | ||
'reverse': None, | ||
'df_examples': None}} | ||
'reverse': 'lt', | ||
'df_examples': _ge_example_FRAME, | ||
'others': _comp_others}} | ||
|
||
_op_names = list(_op_descriptions.keys()) | ||
for key in _op_names: | ||
|
@@ -582,6 +751,35 @@ def _get_op_name(op, special): | |
DataFrame.{reverse} | ||
""" | ||
|
||
_flex_comp_doc_FRAME = """ | ||
Flexible wrappers to comparison operators (specifically ``{name}``). | ||
|
||
Wrappers (``eq``, ``ne``, ``le``, ``lt``, ``ge``, ``gt``) are equivalent to | ||
operators (``==``, ``=!``, ``<=``, ``<``, ``>=``, ``>``) with support to choose | ||
axis (rows or columns) for comparison. | ||
|
||
Parameters | ||
---------- | ||
other : DataFrame | ||
axis : {{0, 1, 'columns', 'rows'}} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here you can check an example on how we usually document the axis values. Also, it would be nice to have description for both There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually I pulled these two args from There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep, still plenty of work in many docstrings, but we're getting closer. :) |
||
level : int or name | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will do. |
||
Broadcast across a level, matching Index values on the | ||
passed MultiIndex level | ||
|
||
Returns | ||
------- | ||
result : DataFrame | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can leave just the type, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. |
||
Consisting of boolean values | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's a bit redundant, but a bit more explanation, like "Result of the comparison.". Note the period at the end. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Even better. |
||
|
||
Examples | ||
-------- | ||
{df_examples} | ||
|
||
See also | ||
-------- | ||
{reverse} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can you run There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are right. And strangely errors did not mention the order of See Also (possibly capitalization of A was the issue?). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We still need to add several validations to that script. The order of the sections is one of them. |
||
""" | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we need to explain what's going on here. And I'd probably set The blank line after the docstring is not required. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Has this been resolved? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you also add this explanation? So users can understand in an easier way what we are showing here. |
||
_flex_doc_PANEL = """ | ||
{desc} of series and other, element-wise (binary operator `{op_name}`). | ||
Equivalent to ``{equiv}``. | ||
|
@@ -1546,8 +1744,15 @@ def na_op(x, y): | |
result = mask_cmp_op(x, y, op, (np.ndarray, ABCSeries)) | ||
return result | ||
|
||
@Appender('Wrapper for flexible comparison methods {name}' | ||
.format(name=op_name)) | ||
if op_name in _op_descriptions: | ||
op_desc = _op_descriptions[op_name] | ||
|
||
base_doc = _flex_comp_doc_FRAME | ||
doc = base_doc.format(name=op_name, | ||
df_examples=op_desc['df_examples'].strip(), | ||
reverse=op_desc['others'].strip()) | ||
|
||
@Appender(doc) | ||
datapythonista marked this conversation as resolved.
Show resolved
Hide resolved
|
||
def f(self, other, axis=default_axis, level=None): | ||
|
||
other = _align_method_FRAME(self, other, axis) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to specify columns, they are already taken from the dictionary keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Columns specify order here as score will precede tool due to alphabetical order. But I will adjust with new examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, in that case I'd use a list of tuples, with the
columns
argument. But just a personal opinion, whatever you find clearer.