-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
DOC: update the pandas.Series.str.strip docstring #20863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Can you post the results of the doc string validator?
pandas/core/strings.py
Outdated
|
||
Parameters | ||
---------- | ||
to_strip : str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_strip : str, optional
pandas/core/strings.py
Outdated
Parameters | ||
---------- | ||
to_strip : str | ||
Specifying the set of characters to be removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just simplify this and say "The set of characters to be removed". It matches the standard Python string method so no need to be too wordy
pandas/core/strings.py
Outdated
|
||
>>> s = pd.Series([' ant', 'bee ', ' cat ']) | ||
|
||
>>> s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simple enough that you don't need to print the Series as is
pandas/core/strings.py
Outdated
|
||
Examples | ||
-------- | ||
Striping whitespaces for Series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Striping" is a typo here and below, though these examples are simple enough that I don't think you need to explain them
pandas/core/strings.py
Outdated
|
||
Striping a set of characters for Index | ||
|
||
>>> df = pd.DataFrame(index=['1.ant ','2._bee__','3. cat_']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just create the index directly instead of creating a DataFrame
index = pd.Index(['1.ant ']) ...
pandas/core/strings.py
Outdated
|
||
>>> df = pd.DataFrame(index=['1.ant ','2._bee__','3. cat_']) | ||
|
||
>>> df.index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again simple enough you don't need to print
pandas/core/strings.py
Outdated
2 cat | ||
dtype: object | ||
|
||
Striping a set of characters for Index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo here, but as mentioned above these are basic enough that I personally don't think you need to explain in detail
Hello @Cheukting! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 07, 2018 at 20:05 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #20863 +/- ##
=========================================
Coverage ? 91.8%
=========================================
Files ? 153
Lines ? 49411
Branches ? 0
=========================================
Hits ? 45361
Misses ? 4050
Partials ? 0
Continue to review full report at Codecov.
|
############################################################################### ################################## Validation ################################# ############################################################################### Docstring for "pandas.Series.str.strip" correct. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments.
Also, #20628 was working on this too. I think your changes look better, but you can take a look at that PR and see if there is any idea that you can copy here.
pandas/core/strings.py
Outdated
|
||
See Also | ||
-------- | ||
str.slice : Slice substrings from each element in the Series/Index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slice
doesn't seem very related to strip.
Not sure if there is anything else related, but we should have here Series.str.strip
, Series.str.lstrip
and Series.str.rstrip
, so they link each other.
pandas/core/strings.py
Outdated
to_strip : str, optional | ||
Specifying the set of characters to be removed. | ||
All combinations of this set of characters will be stripped. | ||
Default value is None, which means whaitspaces will be removed. | ||
|
||
Returns | ||
------- | ||
stripped : Series/Index of objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you get rid of the name stripped
, just leave the type in that line, and in the next add a description of what is being returned?
pandas/core/strings.py
Outdated
|
||
Parameters | ||
---------- | ||
to_strip : str, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not optional, the default is Ǹone
which means whitespaces are stripped You can add to_strip : str or None, default None (whitespaces are removed)
. And take it out from the description.
pandas/core/strings.py
Outdated
Stripping whitespaces for Series | ||
|
||
>>> s = pd.Series([' ant', 'bee ', ' cat ']) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this blank line is unnecessary.
pandas/core/strings.py
Outdated
Index(['1.ant ', '2._bee__', '3. cat_'], dtype='object') | ||
|
||
>>> idx.str.strip('123._ ') | ||
Index(['ant', 'bee', 'cat'], dtype='object') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a single example, with a single Series
with multiple values that illustrate:
- The method without parameters
s.str.strip()
and how it strips not only spaces but new lines and tabs - The characters to strip in the left, the right and both sides
- The use of the 3 methods being documented
lstrip
,rstrip
,rstrip
- The use of another character to strip (something that looks like a real-world example)
- Stripping more than a single character (also something a bit more real-world if possible would be great.
pandas/core/strings.py
Outdated
Stripping a set of characters for Index | ||
|
||
>>> idx = pd.Index(['1.ant ','2._bee__','3. cat_']) | ||
>>> s.str.strip('123.!? \n\t') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I find this example overcomplicated. I think something more realistic and simple could be:
- Illustrate the default
to_strip
parameter (whitespaces) withrstrip
in a value like ` text \n' lstrip(to_strip='0')
for a value like0010010
)- And then you can show
strip(to_strip=' \n0')
with both values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still wants to keep the 'Animal' theme in the example but I will simplify the lstrip
and rstrip
example
Hi @Cheukting, great work! As @datapythonista pointed out in #20897, it seems that our wires crossed during the sprint and we ended up working on the same thing. Seeing as this PR got in earlier, please feel free to grab anything from our PR that might help address Marc's notes, and we'll close ours. If you don't plan to refine this PR, please let us know and we'll continue on #20897. |
pandas/core/strings.py
Outdated
|
||
Parameters | ||
---------- | ||
to_strip : str or None, default None (whitespaces are removed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the coment about whitespaces are removed should be on the next line with a case, e.g.
If None then whitespaces are removed
pandas/core/strings.py
Outdated
|
||
Examples | ||
-------- | ||
>>> s = pd.Series(['1. Ant. ', '2. Bee!\n', '3. Cat?\t']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a np.nan in the example (in place of an entry or in addition is ok)
Looks great! Will merge on green. |
@mroeschke merge away |
Thanks @Cheukting! |
* DOC: update the pandas.Series.str.strip docstring * DOC: update the pandas.Series.str.strip docstring * typo fixing and improve Index example * Fix PEP8 * Improve parameter explaination * Improve documentation, adding lstrip and rstrip example * simplify lstrip and rstrip * added np.nan example
git diff upstream/master -u -- "*.py" | flake8 --diff