DOC: update the pandas.Series.str.startswith docstring #20458

Merged
merged 5 commits into from
Mar 25, 2018
Changes from 4 commits
43 changes: 36 additions & 7 deletions pandas/core/strings.py
@@ -328,19 +328,48 @@ def str_contains(arr, pat, case=True, flags=0, na=np.nan, regex=True):

def str_startswith(arr, pat, na=np.nan):
"""
Return boolean Series/``array`` indicating whether each string in the
Series/Index starts with passed pattern. Equivalent to
:meth:`str.startswith`.
Test if the start of each string element matches a pattern.

Equivalent to :meth:`str.startswith`.

Parameters
----------
pat : string
Character sequence
na : bool, default NaN
pat : str
Character sequence. Regular expressions are not accepted.
na : object, default NaN
Object shown if element tested is not a string.

Returns
-------
startswith : Series/array of boolean values
Series or Index of bool
A Series of booleans indicating whether the given pattern matches
the start of each string element.

See Also
--------
str.startswith : Python standard library string method.
Member

I think there is some bug preventing the links from being generated when you build the HTML with the --single option. If you build the whole documentation with doc/make.py html, it will take something like 5 minutes to complete, but I think you should have the links working.

Replacing str_endswith with Series.str.endswith is the right way.

Also, I'd personally add Series.str.contains, which also looks for the pattern, but in any position. So, I think it could be useful for some users visiting the page to know about it too.
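
To illustrate the distinction drawn in this comment, here is a minimal sketch comparing the three accessor methods on the sample data used in the docstring. It assumes pandas and NumPy are installed; the variable names are illustrative only, and `na=False` and `regex=False` are passed so the results do not depend on how missing values or regular expressions are handled by default.

```python
import numpy as np
import pandas as pd

# Same sample data as the docstring under review.
s = pd.Series(['bat', 'Bear', 'cat', np.nan])

# Pattern must appear at the start of each element.
at_start = s.str.startswith('b', na=False)

# Pattern must appear at the end of each element.
at_end = s.str.endswith('t', na=False)

# Pattern may appear at any position; regex=False treats 'a' literally.
anywhere = s.str.contains('a', regex=False, na=False)

print(at_start.tolist())
print(at_end.tolist())
print(anywhere.tolist())
```

Only `contains` matches 'Bear' here, because the 'a' sits in the middle of the string.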

Contributor Author

Yep, thanks.
The full build shows the links properly.
Added contains.

Series.str.endswith : Same as startswith, but tests the end of string.
Series.str.contains : Tests if string element contains a pattern.

Examples
--------
>>> s = pd.Series(['bat', 'Bear', 'cat', np.nan])
>>> s.str.startswith('b')
Member

Sorry, I think I wasn't clear in part of my last comment. The explanation you added looks great, that part is perfect.

But what I meant with adding >>> s is:

>>> s = pd.Series(['bat', 'Bear', 'cat', np.nan])
>>> s
(the user can see the series here)

So, you'd have a first block (ended with a blank line to make it a different box), where the user can see the data you'll be using in both examples.

Then you show the basic usage (line 357 currently), then the explanation of the second example you added. And for the second example you don't need to create the series again, as you had before.

So it'd be simply adding >>> s, its output and a blank line after L356. And removing L366. That would be the standard way used in most examples, and IMO the clearest.

Sorry my previous comment was confusing.
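
The ordering proposed in this comment can be sketched as a plain script, with each step corresponding to one box in the rendered docs. This is only an approximation of the docstring layout, assuming pandas and NumPy are installed; the variable names are illustrative. The docstring in this PR shows NaN for the missing element in the default case, and that may be reported differently under other pandas versions or string dtypes, so no output is asserted for it here.

```python
import numpy as np
import pandas as pd

# Block 1: define the Series once and display it, so the reader sees the data.
s = pd.Series(['bat', 'Bear', 'cat', np.nan])
print(s)

# Block 2: basic usage with the default handling of missing values.
default_result = s.str.startswith('b')
print(default_result)

# Block 3: a short explanation, then the same Series reused with na=False.
filled_result = s.str.startswith('b', na=False)
print(filled_result)
```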

0 True
1 False
2 False
3 NaN
dtype: object
Member

I would reuse the same Series with the NaN for both examples, with the na by default, and with a value. I think the example would be a bit more realistic with na=False.

Also, I think it could help some users if one of the animals starts with a capital B, which is not matched, so they see that this is case-sensitive.
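
As a small illustration of the case-sensitivity point, the sketch below shows that 'Bear' is not matched by 'b', and one possible workaround: lowercasing the elements first with `Series.str.lower`. The workaround is not part of this PR, only an assumption about what a reader might want next; pandas and NumPy are assumed to be installed.

```python
import numpy as np
import pandas as pd

s = pd.Series(['bat', 'Bear', 'cat', np.nan])

# Case-sensitive: 'Bear' starts with 'B', not 'b', so it is not matched.
case_sensitive = s.str.startswith('b', na=False)

# Normalise case before testing to match regardless of capitalisation.
case_insensitive = s.str.lower().str.startswith('b', na=False)

print(case_sensitive.tolist())
print(case_insensitive.tolist())
```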


Specifying `na` to be `False` instead of `NaN`.

>>> s = pd.Series(['bat', 'Bear', 'cat', np.nan])
>>> s.str.startswith('b', na=False)
0 True
1 False
2 False
3 False
dtype: bool
Member

I think in more cases we show >>> s after defining it, and we leave a blank line between the different examples, so they are in different boxes in the html. Also, for the last example a short explanation of what you are doing could be useful for users.

You can see an example of what I mean here: https://github.com/dcreekp/pandas/blob/39f76413374109d8c34021a6b61d121d3d05c9a0/pandas/core/strings.py#L1347 probably we don't need many explanations in this case, as the example is quite obvious, but a short sentence for the last case could help users see what's going on faster.

"""
f = lambda x: x.startswith(pat)
return _na_map(f, arr, na, dtype=bool)
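
For readers unfamiliar with the helper used in the function body, the following pure-Python sketch approximates the idea: apply the predicate to each string element and substitute the `na` value otherwise. `na_map_sketch` is a hypothetical stand-in written for illustration. It is not the private `_na_map` helper, whose actual implementation is not shown in this diff and may differ, for example in how it detects missing values.

```python
import math


def na_map_sketch(f, values, na_result=math.nan):
    """Apply f to string elements; return na_result for anything else.

    Hypothetical approximation for illustration, not the pandas helper.
    """
    return [f(x) if isinstance(x, str) else na_result for x in values]


pat = 'b'
starts_with_pat = lambda x: x.startswith(pat)  # same predicate as in the diff

data = ['bat', 'Bear', 'cat', math.nan]

# Default: the missing element stays NaN.
default_out = na_map_sketch(starts_with_pat, data)

# With na_result=False the missing element becomes False.
filled_out = na_map_sketch(starts_with_pat, data, na_result=False)

print(default_out)
print(filled_out)
```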