-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
DOC: update the Series.str.join docstring #22174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: update the Series.str.join docstring #22174
Conversation
Hello @feiphoon! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on August 07, 2018 at 17:24 Hours UTC |
Looks great, minor problem, the line (Line 1108:80) is too long so may be break it up in 2 lines |
Codecov Report
@@ Coverage Diff @@
## master #22174 +/- ##
==========================================
+ Coverage 92.06% 92.15% +0.09%
==========================================
Files 169 169
Lines 50694 51312 +618
==========================================
+ Hits 46671 47288 +617
- Misses 4023 4024 +1
Continue to review full report at Codecov.
|
Tagging @sebryan as pairing partner |
pandas/core/strings.py
Outdated
will be `NaN`. | ||
|
||
If the Series does not contain string objects, performing a join will raise | ||
an AttributeError. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's not very clear. Do you mean that if the Series elements are not lists or NaN
the join operation will raise an AttributeError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I agree. How about:
"If the supplied Series does not contain string objects, performing a join will raise an AttributeError.
s = pd.Series([1.1, np.nan]) # applying str.join raises an AttributeError
s = pd.Series([np.nan, np.nan]) # ditto
s = pd.Series([1.1, np.nan, ['swan', 'fish']]) # applying str.join produces something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be accurate, I'd say something like If the supplied Series does not contain strings or lists, performing a join will raise an AttributeError
.
>>> import pandas
>>> pandas.Series(['foo', 'bar']).str.join('')
0 foo
1 bar
dtype: object
And if you think it's useful, you can add an example with a case when it raises the exception, in the Examples
section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌 We asked the question of mentioning exceptions last night - of the two examples below, which is the preferred method here?
- Notes & Examples: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.select_dtypes.html
- Raises section: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.update.html
Appreciate all your guidance on this, I'm learning a lot! Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't pay so much attention to the Raises
section, to not add extra complexity, but it's surely a nice to have. So feel free to add it.
Notes
is mainly used for implementation details. In this cases Raises
can be more appropriate. I'd use Notes
for example to explain that adding a NaN
to a int
column will convert it to float
, and this kind of stuff.
Examples
are usually the best way to illustrate things. But we don't want to get crazy with them either. Specially in docstrings, that being mixed with the code, can be annoying if too long. In this case, feel free to choose whether you think it's useful to show with an example a case where the AttributeError
is raised. Or to omit it if you don't think it add much value. Up to you.
pandas/core/strings.py
Outdated
... ['duck', ['swan', 'fish'], 'guppy']]) | ||
... [1.1, 2.2, 3.3], | ||
... ['cat', np.nan, 'dog'], | ||
... ['cow', 4.5, 'goat'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch of the typo, but I think the indentaion was correct before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh! Thank you.
pandas/core/strings.py
Outdated
>>> s | ||
0 [lion, elephant, zebra] | ||
1 [1.1, 2.2, 3.3] | ||
2 [cat, nan, dog] | ||
3 [cow, 4.5, goat] | ||
4 [duck, [swan, fish], guppy] | ||
dtype: object | ||
|
||
Join all lists using an '-', the lists containing object(s) of types other | ||
than str will become a NaN. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment was useful, any particular reason to remove it?_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We removed it after misdiagnosing an issue, I will reinstate it. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks for the contribution @feiphoon
@datapythonista thank you for your help! The Travis CI build failed on one job - ci/lint.sh 'Not Linting'. Is there a way to fix or rerun this? |
thanks @feiphoon |
will merge on green |
Thank you @jreback @datapythonista ! |
Checklist for the pandas documentation sprint (ignore this if you are doing an unrelated PR):