-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: fixed doc-string for combine & combine_first in pandas/core/series.py #22971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @brute4s99! Thanks for updating the PR.
Comment last updated on November 20, 2018 at 22:39 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #22971 +/- ##
==========================================
+ Coverage 92.28% 92.29% +<.01%
==========================================
Files 161 161
Lines 51457 51493 +36
==========================================
+ Hits 47489 47524 +35
- Misses 3968 3969 +1
Continue to review full report at Codecov.
|
pandas/core/series.py
Outdated
@@ -2266,30 +2266,41 @@ def _binop(self, other, func, level=None, fill_value=None): | |||
|
|||
def combine(self, other, func, fill_value=None): | |||
""" | |||
Combine the Series with a `Series` or `Scalar` according to `func`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this sentence is necessary given the explanation below.
pandas/core/series.py
Outdated
func : function | ||
Function that takes two scalars as inputs and return a scalar | ||
Function that takes two scalars as inputs and return a scalar. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returns a scalar
, I'd say returns a boolean
pandas/core/series.py
Outdated
|
||
Returns | ||
------- | ||
result : Series | ||
result : the combined `Series` object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of all this put Series
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, @mroeschke
pandas/core/series.py
Outdated
|
||
Examples | ||
-------- | ||
>>> s1 = pd.Series([1, 2]) | ||
>>> s2 = pd.Series([0, 3]) | ||
>>> s2 = pd.Series([0, 3, 4]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep this first example as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, @mroeschke
pandas/core/series.py
Outdated
>>> s1.combine(s2, lambda x1, x2: x1 if x1 < x2 else x2) | ||
0 0 | ||
1 2 | ||
2 4 | ||
dtype: int64 | ||
>>> s1.combine(s2, lambda x1, x2: x1 if x1 > x2 else x2,fill_value=787) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this example add a new s2
with some commentary about how the result make sense with the given fill_value
.
Also make sure this command complies with pep8.
@mroeschke please review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several formatting issues
pandas/core/series.py
Outdated
@@ -2266,36 +2266,61 @@ def _binop(self, other, func, level=None, fill_value=None): | |||
|
|||
def combine(self, other, func, fill_value=None): | |||
""" | |||
Combine the Series with a Series or Scalar according to `func`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lowercase scalar.
pandas/core/series.py
Outdated
with optional fill value when an index is missing from one Series or | ||
the other | ||
with optional `fill_value` when an index is missing from the Series or | ||
the other value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading the whole summary I still don't understand what this method does. Can you try to explain it more clearly (I know the description comes from the original docstring and not your changes).
pandas/core/series.py
Outdated
The value(s) to be combined with the `Series`. | ||
func : Function | ||
`function` that takes two Scalars as inputs and returns a `bool`. | ||
fill_value : Scalar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Series
is capitalized because that's the name of the class. function
and scalar
show be lower case.
@brute4s99 can you update based on the previous review? |
Sure, just a minute
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, added few more comments.
@brute4s99 will you have time to address @datapythonista's comments? |
I am so sorry, @datapythonista
I totally forgot about this PR.
Will push requested changes in an hour.
|
Can you take a closer look at the document I sent. I think |
oh, darn it! Sorry, @datapythonista |
but @datapythonista , The order says See Also -> Notes -> Examples |
Not sure about that everywhere. The link is the standard we follow, if there is any docstring that doesn't follow that order, it'll be eventually changed. And we'll soon also fail the CI if the order is wrong (see #23245). |
Got it. I'll follow the link too then. |
599c421
to
f6099da
Compare
thinking of an example atm |
Please review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some issues.
pandas/core/series.py
Outdated
>>> arms | ||
dog 2 | ||
cat 2 | ||
mouse 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say all those animals have 4 legs and 0 arms. Use correct data, and different values (spider, monkey...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this example look fine?
>>> arms = pd.Series({'starfish': 5, 'kangaroo': 2})
>>> legs = pd.Series({'dog': 4,'kangaroo': 3})
>>> arms.combine(legs, lambda x, y: x+y, fill_value=0)
dog 4
kangaroo 5
starfish 5
dtype: int64
>>> arms.combine(legs, lambda x, y: x+y, fill_value=10)
dog 14
kangaroo 5
starfish 15
dtype: int64
>>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find any alternative to defining a lambda
, since the next alternative was to use +
itself, which cannot be done as a function param afaik
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the example not following pep8 again, you may not be able to use a sum without a lambda
, but as I said, you can use the builtin max
function.
In the first example, remove completely the fill_value
param, and in the second use a example that makes sense.
Also, add a short description before each case, explaining what is the goal of what you're showing. pandas does not contain functions or methods that are not useful for real-world examples. And in the documentation examples we should show those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of combine() is to merge two Series into one.
So, I need 2 Series with some conflicting values[that use func
to resolve conflict], and some NOT CONFLICTING ones[that could make use of fill_value
to assume a value from the lacking Series]
I will use max
for an example I just thought!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks much better, yes. I'd name the Series s1
and s2
following our standards, and also have the animal names in all lower case.
And I don't quite understand the fill_value
part. I guess I'm misunderstanding something, but if:
>>> max(float('NaN'), 25)
nan
I would expect that in the first example duck
has a value of nan
. And in that case I'd use fill_value=0
to get the known value. I don't quite lack making up an average speed. But I guess the function doesn't work as I think.
Can you do a bit of research and move the examples to the PR, so I can add new comments in a review.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datapythonista look! something funny!
>>> max(30,float('nan'))
30
>>> max(float('nan'),30)
nan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datapythonista fill_value
is used as the default value for the missing key in the corresponding series before the comparison is made.
So in the example above, since data_B
doesn't have Duck
, the comparison is made as if data_B['Duck'] = fill_value = 40
so max(data_B['Duck'], data_A['Duck']) = max(40, 30) = 40
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect that in the first example
duck
has a value ofnan
.
This answer on StackOverflow clears this doubt for me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@datapythonista please read this !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice example @brute4s99, added some comments, but it's looking much better.
@mroeschke the problem was what @brute4s99 mentioned, the order in the max
with a NaN
is relevant (I didn't know that tricky case in Python).
pandas/core/series.py
Outdated
See Also | ||
-------- | ||
Series.combine_first : Combine Series values, choosing the calling | ||
Series' values first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you fix this please?
@datapythonista if
|
@mroeschke I agree on that. But the problem I had is that when
Which is weird, and seems to me like a bug in Python. |
Gotcha, sorry I assume the premise was if fill_value was not FWIW,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of things. Did you run validate_docstrings.py
and git diff upstream/master -u -- "*.py" | flake8 --diff
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks @brute4s99
thanks @brute4s99 |
Yay \o/ |
git diff upstream/master -u -- "*.py" | flake8 --diff