-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: update the Series.str.join docstring #20463
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #20463 +/- ##
==========================================
+ Coverage 91.8% 91.85% +0.04%
==========================================
Files 152 152
Lines 49223 49231 +8
==========================================
+ Hits 45191 45220 +29
+ Misses 4032 4011 -21
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just a comment about naming conventions, and an idea to make examples more compact and more illustrative.
pandas/core/strings.py
Outdated
Examples | ||
-------- | ||
|
||
>>> df = pd.Series({1: ['dog', 'cat', 'fish'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you rename df
to s
. df
is used for DataFrame
, and makes it a bit confusing.
pandas/core/strings.py
Outdated
|
||
Example with a list that contains non-string elements. | ||
|
||
>>> df = pd.Series({1: [1.1, 2.2, 3.3], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before.
pandas/core/strings.py
Outdated
1 NaN | ||
2 lion-elephant-zebra | ||
3 cow-pig-chicken | ||
dtype: object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this second example, the first one is actually a bit redundant. As this exemplifies both cases, strings and not strings.
And I think we could even show in one of the rows the floats, another the strings (both as you did), and use the third one to illustrate a list with a NaN
, which I assume it returns a NaN
, but it may be not obvious for all users.
Thanks for the feedback @datapythonista have updated the PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, added couple of ideas.
pandas/core/strings.py
Outdated
|
||
>>> s = pd.Series({1: ['lion', 'elephant', 'zebra'], | ||
... 2: [1.1, 2.2, 3.3], | ||
... 3: [np.nan, np.nan, np.nan]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment, but I think it'd be more useful for users to see that joining ['cat', np.nan, 'dog']
becomes NaN
, than that joining some NaN
s become NaN
. That could also apply in the number example ['cat', 23, 'dog']
. Feel free to disagree if you think it's more clear with unique types.
Also, it's something very subtle, but I find slightly distracting using a specific index in the example, that is not used. It surely doesn't make a big difference, but I'd construct the Series
with a list of lists, and use the default index, so nobody thinks the index has an impact on joining the elements.
pandas/core/strings.py
Outdated
3 [nan, nan, nan] | ||
dtype: object | ||
|
||
Join all lists using an '-', the list of floats will become a NaN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment needs to be updated after adding the NaN
?
pandas/core/strings.py
Outdated
|
||
Returns | ||
------- | ||
joined : Series/Index of objects | ||
Series/Index of objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use object
instead of objects
, as this is more a type definition than an explanations (may be one day we can use these types as annotations?)
good points, thanks a lot 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks @fdroessler ! |
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>
Please include the output of the validation script below between the "```" ticks:
If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.