Skip to content

DOC: update the Series.str.join docstring #22174

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 8, 2018

Conversation

feiphoon
Copy link
Contributor

@feiphoon feiphoon commented Aug 2, 2018

Checklist for the pandas documentation sprint (ignore this if you are doing an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single
  • It has been proofread on language by another sprint participant

@pep8speaks
Copy link

pep8speaks commented Aug 2, 2018

Hello @feiphoon! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 07, 2018 at 17:24 Hours UTC

@Cheukting
Copy link
Contributor

Looks great, minor problem, the line (Line 1108:80) is too long so may be break it up in 2 lines

@codecov
Copy link

codecov bot commented Aug 2, 2018

Codecov Report

Merging #22174 into master will increase coverage by 0.09%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22174      +/-   ##
==========================================
+ Coverage   92.06%   92.15%   +0.09%     
==========================================
  Files         169      169              
  Lines       50694    51312     +618     
==========================================
+ Hits        46671    47288     +617     
- Misses       4023     4024       +1
Flag Coverage Δ
#multiple 90.58% <ø> (+0.11%) ⬆️
#single 42.37% <ø> (+0.05%) ⬆️
Impacted Files Coverage Δ
pandas/core/strings.py 98.63% <ø> (ø) ⬆️
pandas/tseries/offsets.py 97.15% <0%> (ø) ⬆️
pandas/io/common.py 70.65% <0%> (ø) ⬆️
pandas/core/internals/blocks.py 94.45% <0%> (ø) ⬆️
pandas/core/missing.py 91.7% <0%> (+0.04%) ⬆️
pandas/core/indexes/base.py 96.43% <0%> (+0.05%) ⬆️
pandas/util/testing.py 85.9% <0%> (+0.2%) ⬆️
pandas/core/frame.py 97.51% <0%> (+0.24%) ⬆️
pandas/core/arrays/period.py 91.98% <0%> (+0.76%) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6e1e1e4...e03704d. Read the comment docs.

@feiphoon
Copy link
Contributor Author

feiphoon commented Aug 2, 2018

Tagging @sebryan as pairing partner

will be `NaN`.

If the Series does not contain string objects, performing a join will raise
an AttributeError.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's not very clear. Do you mean that if the Series elements are not lists or NaN the join operation will raise an AttributeError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I agree. How about:
"If the supplied Series does not contain string objects, performing a join will raise an AttributeError.

s = pd.Series([1.1, np.nan]) # applying str.join raises an AttributeError
s = pd.Series([np.nan, np.nan]) # ditto
s = pd.Series([1.1, np.nan, ['swan', 'fish']]) # applying str.join produces something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be accurate, I'd say something like If the supplied Series does not contain strings or lists, performing a join will raise an AttributeError.

>>> import pandas
>>> pandas.Series(['foo', 'bar']).str.join('')
0    foo
1    bar
dtype: object

And if you think it's useful, you can add an example with a case when it raises the exception, in the Examples section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌 We asked the question of mentioning exceptions last night - of the two examples below, which is the preferred method here?

Appreciate all your guidance on this, I'm learning a lot! Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't pay so much attention to the Raises section, to not add extra complexity, but it's surely a nice to have. So feel free to add it.

Notes is mainly used for implementation details. In this cases Raises can be more appropriate. I'd use Notes for example to explain that adding a NaN to a int column will convert it to float, and this kind of stuff.

Examples are usually the best way to illustrate things. But we don't want to get crazy with them either. Specially in docstrings, that being mixed with the code, can be annoying if too long. In this case, feel free to choose whether you think it's useful to show with an example a case where the AttributeError is raised. Or to omit it if you don't think it add much value. Up to you.

... ['duck', ['swan', 'fish'], 'guppy']])
... [1.1, 2.2, 3.3],
... ['cat', np.nan, 'dog'],
... ['cow', 4.5, 'goat'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch of the typo, but I think the indentaion was correct before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh! Thank you.

>>> s
0 [lion, elephant, zebra]
1 [1.1, 2.2, 3.3]
2 [cat, nan, dog]
3 [cow, 4.5, goat]
4 [duck, [swan, fish], guppy]
dtype: object

Join all lists using an '-', the lists containing object(s) of types other
than str will become a NaN.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment was useful, any particular reason to remove it?_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We removed it after misdiagnosing an issue, I will reinstate it. Thank you!

@datapythonista datapythonista added Docs Strings String extension data type and string data labels Aug 2, 2018
@datapythonista datapythonista self-assigned this Aug 2, 2018
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for the contribution @feiphoon

@feiphoon
Copy link
Contributor Author

feiphoon commented Aug 8, 2018

@datapythonista thank you for your help!

The Travis CI build failed on one job - ci/lint.sh 'Not Linting'. Is there a way to fix or rerun this?

@jreback jreback added this to the 0.24.0 milestone Aug 8, 2018
@jreback
Copy link
Contributor

jreback commented Aug 8, 2018

thanks @feiphoon

@jreback
Copy link
Contributor

jreback commented Aug 8, 2018

will merge on green

@datapythonista datapythonista merged commit d7b408e into pandas-dev:master Aug 8, 2018
@feiphoon
Copy link
Contributor Author

feiphoon commented Aug 8, 2018

Thank you @jreback @datapythonista !

@feiphoon feiphoon deleted the docs-str-join-return branch August 8, 2018 14:39
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants