DOC: update the Series.str.join docstring #20463

fdroessler · 2018-03-22T21:57:36Z

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

PR title is "DOC: update the docstring"
The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
The html version looks good: python doc/make.py --single <your-function-or-method>
It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

################################################################################
###################### Docstring (pandas.Series.str.join) ######################
################################################################################

Join lists contained as elements in the Series/Index with passed delimiter.

If the elements of a Series are lists themselves, join the content of these
lists using the delimiter passed to the function.
This function is equivalent to :meth:`str.join`.

Parameters
----------
sep : str
    Delimiter to use between list entries.

Returns
-------
Series/Index of objects

Notes
-----
If any element of a list is not a string object, the result of the join
for that list will be `NaN`.

See Also
--------
str.join : Standard library version of this method.
Series.str.split : Split strings around given separator/delimiter.

Examples
--------

Example with a list that contains non-string elements.

>>> s = pd.Series([['lion', 'elephant', 'zebra'],
...                [1.1, 2.2, 3.3],
...                ["cat", np.nan, "dog"],
...                ["cow", 4.5, "goat"]])
>>> s
0    [lion, elephant, zebra]
1            [1.1, 2.2, 3.3]
2            [cat, nan, dog]
3           [cow, 4.5, goat]
dtype: object

Join all lists using a '-'. Any list that contains a non-string element will become a NaN.

>>> s.str.join('-')
0    lion-elephant-zebra
1                    NaN
2                    NaN
3                    NaN
dtype: object

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.Series.str.join" correct. :)

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.
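
The row-wise behaviour described in the docstring above can be modelled in plain Python. This is only a minimal sketch under stated assumptions, not the pandas implementation: `join_or_nan` is a hypothetical helper introduced here for illustration, and it relies solely on the fact that the built-in `str.join` raises `TypeError` when an item is not a string.

```python
import math


def join_or_nan(items, sep):
    """Hypothetical helper (not part of pandas): join `items` with `sep`,
    returning NaN when any item is not a string."""
    try:
        return sep.join(items)
    except TypeError:
        # The built-in str.join rejects non-string items with a TypeError.
        return float("nan")


# The same four rows used in the docstring example.
rows = [
    ["lion", "elephant", "zebra"],
    [1.1, 2.2, 3.3],
    ["cat", float("nan"), "dog"],
    ["cow", 4.5, "goat"],
]

results = [join_or_nan(row, "-") for row in rows]
print(results)  # ['lion-elephant-zebra', nan, nan, nan]
```

Only the first row joins cleanly; the other three each contain at least one float, which matches the `NaN` rows in the docstring output.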

codecov · 2018-03-23T04:14:56Z

Codecov Report

Merging #20463 into master will increase coverage by 0.04%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #20463      +/-   ##
==========================================
+ Coverage    91.8%   91.85%   +0.04%     
==========================================
  Files         152      152              
  Lines       49223    49231       +8     
==========================================
+ Hits        45191    45220      +29     
+ Misses       4032     4011      -21

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `90.23% <ø> (+0.04%)` | ⬆️ |
| #single | `41.83% <ø> (-0.01%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/core/strings.py | `98.32% <ø> (ø)` | ⬆️ |
| pandas/core/arrays/categorical.py | `96.2% <0%> (-0.02%)` | ⬇️ |
| pandas/core/frame.py | `97.18% <0%> (ø)` | ⬆️ |
| pandas/core/generic.py | `95.85% <0%> (ø)` | ⬆️ |
| pandas/plotting/_core.py | `82.5% <0%> (ø)` | ⬆️ |
| pandas/io/formats/csvs.py | `98.13% <0%> (+0.08%)` | ⬆️ |
| pandas/io/parsers.py | `95.45% <0%> (+0.12%)` | ⬆️ |
| pandas/core/groupby.py | `92.55% <0%> (+0.41%)` | ⬆️ |
| pandas/util/testing.py | `84.73% <0%> (+0.61%)` | ⬆️ |
| pandas/io/common.py | `70.04% <0%> (+1.26%)` | ⬆️ |
| ... and 2 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 02477da...323d072.

datapythonista

Looks great, just a comment about naming conventions, and an idea to make examples more compact and more illustrative.

datapythonista · 2018-03-23T11:29:09Z

pandas/core/strings.py

+    Examples
+    --------
+
+    >>> df = pd.Series({1: ['dog', 'cat', 'fish'],


Can you rename df to s? df is conventionally used for a DataFrame, so using it for a Series is a bit confusing.

datapythonista · 2018-03-23T11:30:22Z

pandas/core/strings.py

+
+    Example with a list that contains non-string elements.
+
+    >>> df = pd.Series({1: [1.1, 2.2, 3.3],


Same as before.

datapythonista · 2018-03-23T11:33:21Z

pandas/core/strings.py

+    1                    NaN
+    2    lion-elephant-zebra
+    3        cow-pig-chicken
+    dtype: object


I think that with this second example, the first one is actually a bit redundant, as this one exemplifies both cases: strings and non-strings.

And I think we could even show the floats in one of the rows, the strings in another (both as you did), and use the third one to illustrate a list with a NaN. I assume it returns a NaN, but that may not be obvious to all users.

fdroessler · 2018-03-23T11:58:02Z

Thanks for the feedback, @datapythonista. I have updated the PR.

datapythonista

Looks great, added a couple of ideas.

datapythonista · 2018-03-23T12:12:05Z

pandas/core/strings.py

+
+    >>> s = pd.Series({1: ['lion', 'elephant', 'zebra'],
+    ...                2: [1.1, 2.2, 3.3],
+    ...                3: [np.nan, np.nan, np.nan]})


Minor comment, but I think it'd be more useful for users to see that joining ['cat', np.nan, 'dog'] becomes NaN than that joining some NaNs becomes NaN. The same could apply to the number example: ['cat', 23, 'dog']. Feel free to disagree if you think it's clearer with uniform types.

Also, it's something very subtle, but I find it slightly distracting to use a specific index in the example when that index is not used. It surely doesn't make a big difference, but I'd construct the Series with a list of lists and use the default index, so nobody thinks the index has an impact on joining the elements.
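
As background for the mixed-list case raised above: NaN is a float, so the built-in `str.join` rejects a list such as ['cat', nan, 'dog'] outright. The short sketch below uses only plain Python (the variable names are illustrative) and shows the error that the built-in raises; how pandas turns this into a NaN result internally is not shown here.

```python
# A list of strings with a NaN in the middle; NaN is a float, not a str.
mixed = ["cat", float("nan"), "dog"]

try:
    "-".join(mixed)
    error_message = None
except TypeError as exc:
    # The built-in reports the position and type of the offending item.
    error_message = str(exc)

print(error_message)  # sequence item 1: expected str instance, float found
```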

datapythonista · 2018-03-23T12:12:47Z

pandas/core/strings.py

+    3            [nan, nan, nan]
+    dtype: object
+
+    Join all lists using an '-', the list of floats will become a NaN.


I think this comment needs to be updated after adding the NaN?

datapythonista · 2018-03-23T12:13:44Z

pandas/core/strings.py


    Returns
    -------
-    joined : Series/Index of objects
+    Series/Index of objects


I'd use object instead of objects, as this is more a type definition than an explanation (maybe one day we can use these types as annotations?).

fdroessler · 2018-03-23T22:51:05Z

good points, thanks a lot 👍

datapythonista

lgtm

TomAugspurger · 2018-03-27T12:10:43Z

Thanks @fdroessler !

privat added 3 commits March 22, 2018 20:17

added example to the str.join function

dd24bc8

fixed pep8 issues and reformated examples

60549cd

changed to animal names

a169347

datapythonista reviewed Mar 23, 2018

applied comments on PR, df -> s and compact example

333193e

datapythonista reviewed Mar 23, 2018

added mixed type examples

323d072

datapythonista approved these changes Mar 24, 2018

jreback added the Docs and Strings (String extension data type and string data) labels Mar 26, 2018

TomAugspurger merged commit 89444ad into pandas-dev:master Mar 27, 2018

TomAugspurger added this to the 0.23.0 milestone Mar 27, 2018

javadnoorb pushed a commit to javadnoorb/pandas that referenced this pull request Mar 29, 2018

DOC: update the Series.str.join docstring (pandas-dev#20463)

15b7138

dworvos pushed a commit to dworvos/pandas that referenced this pull request Apr 2, 2018

DOC: update the Series.str.join docstring (pandas-dev#20463)

1072ca7

kornilova203 pushed a commit to kornilova203/pandas that referenced this pull request Apr 23, 2018

DOC: update the Series.str.join docstring (pandas-dev#20463)

680a17f
