DOC: Improve the docstring of String.str.zfill() #20864

Merged
merged 8 commits into from
Jul 7, 2018
44 changes: 40 additions & 4 deletions pandas/core/strings.py
@@ -2114,19 +2114,55 @@ def rjust(self, width, fillchar=' '):

def zfill(self, width):
"""
Filling left side of strings in the Series/Index with 0.
Equivalent to :meth:`str.zfill`.
Pad strings in the Series/Index by prepending '0' characters.

Strings in the Series/Index are padded with prepending '0' characeters
Member

typo in characters

(i.e. on the left of the string) to reach a total string length
of `width`. in the Series/Index with length greater than `width`
are unchanged.
Member

I think something went wrong with that last sentence: there are two spaces, it doesn't start with a capital letter, and it isn't that clear. I guess a word or something is missing.


Note: Differs from :meth:`str.zfill` which has special handling
for '+'/'-' in the string.
Member

There is a Notes section in the numpy docstring convention that we can use for this:

Notes
-----

https://python-sprints.github.io/pandas/guide/pandas_docstring.html#section-6-notes
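
To make the difference described in this note concrete, here is a minimal sketch in plain Python (no pandas required). It assumes that `str.rjust(width, '0')` is a fair stand-in for the `str_pad(..., side='left', fillchar='0')` call in this diff, and contrasts it with the built-in `str.zfill`. The variable names are illustrative only.

```python
# Built-in str.zfill is sign-aware; plain left-padding with '0' is not.
values = ["-2", "+5", "10", "127", "423523"]
width = 5

# str.zfill keeps a leading '+'/'-' in front and inserts zeros after it.
sign_aware = [v.zfill(width) for v in values]

# Plain left-padding treats the sign like any other character
# (assumed equivalent to padding with side='left', fillchar='0').
plain_left_pad = [v.rjust(width, "0") for v in values]

print(sign_aware)      # ['-0002', '+0005', '00010', '00127', '423523']
print(plain_left_pad)  # ['000-2', '000+5', '00010', '00127', '423523']
```

The second list matches the output shown in the Examples section of this docstring. Strings already longer than `width` are returned unchanged by both approaches.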


Parameters
----------
width : int
Minimum width of resulting string; additional characters will be
filled with 0
Minimum length of resulting string; strings with length less
than `width` will be prepended with '0' characters.

Returns
-------
filled : Series/Index of objects
Member

You can get rid of `filled` and just leave the type (it used to be usual to name the return value, but it was decided not to name them anymore, as it doesn't add much value).


See Also
--------
Series.str.rjust: Fills the left side of strings with an arbitrary
character.
Series.str.ljust: Fills the right side of strings with an arbitrary
character.
Series.str.pad: Fills the specified sides of strings with an arbitrary
character.
Series.str.center: Fills both sides of strings with an arbitrary
character.

Examples
--------
>>> s = pd.Series(['-2', '+5', '10', '127', '423523'])
>>> s.str.zfill(5)
Member

Looks great now. Just a couple of ideas, feel free to ignore them if you prefer the way it is now.

I think it could be useful to add a short paragraph before this line explaining that the non-string values (127 and NaN) become NaN, and maybe reemphasizing what you already have in the body of the docstring: that the values with a sign (-2 and +5) get the zeros to the left of the sign, and that 423523 stays unchanged as it's longer than width.

I also think that it would make things easier/faster to understand if the values look a bit less arbitrary. For example ['-1', '1', 10, np.nan, '1000'] (with a width of 3 in this example).
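
As a rough illustration of the behavior this comment describes, the sketch below emulates it in plain Python for the suggested values and a width of 3. `zfill_like` is a hypothetical helper, not a pandas API; it assumes that string items are left-padded with '0' and every non-string item is replaced by NaN.

```python
import math


def zfill_like(values, width):
    """Left-pad string items with '0'; map non-string items to NaN.

    Hypothetical helper that mimics the behavior described in this
    docstring; it is not part of pandas.
    """
    return [
        v.rjust(width, "0") if isinstance(v, str) else math.nan
        for v in values
    ]


values = ["-1", "1", 10, math.nan, "1000"]
padded = zfill_like(values, 3)
print(padded)  # ['0-1', '001', nan, nan, '1000']
```

The integer `10` and the NaN both come back as NaN, the signed string gets its zero to the left of the sign, and `'1000'` is untouched because it is already longer than the requested width.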

0 000-2
1 000+5
2 00010
3 00127
4 423523
dtype: object

>>> s = pd.Series([-2, 5], dtype=str)
>>> s.str.zfill(5)
0 000-2
1 00005
dtype: object
Member

The example looks cool, but I think we can make it a bit more compact and clear by:

  • Just creating a single Series (I'd get rid of one of 10 or 127, as they show the same use case, and I'd add a NaN and a number, as a number, not as a string).
  • After creating the Series, I'd show it without modification (i.e. >>> s). This makes the result easier to compare with the original data.
  • We can get rid of the second example if we add the numbers in the first.

"""

Member

no need to add this blank line here

Member

No need to add this blank line.

result = str_pad(self._data, width, side='left', fillchar='0')
return self._wrap_result(result)