Skip to content

DOC: Update the Series.str.len docstring #22187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Aug 9, 2018

Conversation

IsvenC
Copy link
Contributor

@IsvenC IsvenC commented Aug 3, 2018

[X] PR title is "DOC: update the docstring"
[X] The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
[X] The html version looks good: python doc/make.py --single

@datapythonista datapythonista added Docs Strings String extension data type and string data labels Aug 3, 2018
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @IsvenC, looks great. Just added some minor comments.


Returns
-------
lengths : Series/Index of integer values
Series or Index of integer values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use Python types, Series or Index of int would be more consistent with the rest of the docstrings.


See Also
--------
str.len : Python built-in function returning the length of an object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding Series.size here? I think some users could come by mistake to this page looking how to check the length of the Series. And it'd be nice to help them find it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Included in new commit.


Examples
--------

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this blank line here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - I have taken it out

--------

Returning a series of integer values as floats when `NaN` is returned as a
result.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a useful comment, but not sure if very relevant in this context (the user checking the examples of the str.len). As this is a general pandas "problem", I'd simply omit it here. But if you think it's important in this page, the Notes section is probably a better place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough! A user can still see that the values are integers even if they happen to be floats.

Returning the length (number of entries) of dictionaries, lists or
tuples as integer values.

>>> s = pd.Series([{'foo':'bar'}, [2,3,5,7], ('one', 'two', 'three')])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing spaces after colon and commas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes - thanks for spotting! Have changed in new commit.

0 1
1 4
2 3
dtype: int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just create a single Series s with the elements of both examples. I think that would make the example more concise, and it'd still illustrate the same cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Changed in new commit.

Returning a series of integer values as floats when `NaN` is returned as a
result.

>>> s = pd.Series(['dog', 5, 'bird', np.nan])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor thing, but feels like if we're trying to confuse the user by not placing the two string values one after the other. And may be I'd show the empty string '' instead of one of the animals, that they are illustrating the same case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have changed the structure of the consolidated example to incorporate an empty string.

@codecov
Copy link

codecov bot commented Aug 3, 2018

Codecov Report

Merging #22187 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #22187   +/-   ##
=======================================
  Coverage   92.06%   92.06%           
=======================================
  Files         169      169           
  Lines       50694    50694           
=======================================
  Hits        46671    46671           
  Misses       4023     4023
Flag Coverage Δ
#multiple 90.47% <100%> (ø) ⬆️
#single 42.32% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/strings.py 98.63% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35dd15b...f395314. Read the comment docs.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, great docstring. Thanks for it @IsvenC

@jreback jreback added this to the 0.24.0 milestone Aug 9, 2018
@jreback jreback merged commit ec58e4e into pandas-dev:master Aug 9, 2018
@jreback
Copy link
Contributor

jreback commented Aug 9, 2018

thanks @IsvenC

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants