Skip to content

Commit 8f24748

Browse files
mananpal1997jorisvandenbossche
authored andcommitted
DOC: update the pandas.Series.str.split docstring (#20282)
1 parent 74e6c78 commit 8f24748

File tree

1 file changed

+72
-8
lines changed

1 file changed

+72
-8
lines changed

pandas/core/strings.py

+72-8
Original file line numberDiff line numberDiff line change
@@ -1095,24 +1095,88 @@ def str_pad(arr, width, side='left', fillchar=' '):
10951095

10961096
def str_split(arr, pat=None, n=None):
10971097
"""
1098-
Split each string (a la re.split) in the Series/Index by given
1099-
pattern, propagating NA values. Equivalent to :meth:`str.split`.
1098+
Split strings around given separator/delimiter.
1099+
1100+
Split each string in the caller's values by given
1101+
pattern, propagating NaN values. Equivalent to :meth:`str.split`.
11001102
11011103
Parameters
11021104
----------
1103-
pat : string, default None
1104-
String or regular expression to split on. If None, splits on whitespace
1105+
pat : str, optional
1106+
String or regular expression to split on.
1107+
If not specified, split on whitespace.
11051108
n : int, default -1 (all)
1106-
None, 0 and -1 will be interpreted as return all splits
1109+
Limit number of splits in output.
1110+
``None``, 0 and -1 will be interpreted as return all splits.
11071111
expand : bool, default False
1108-
* If True, return DataFrame/MultiIndex expanding dimensionality.
1109-
* If False, return Series/Index.
1112+
Expand the splitted strings into separate columns.
11101113
1111-
return_type : deprecated, use `expand`
1114+
* If ``True``, return DataFrame/MultiIndex expanding dimensionality.
1115+
* If ``False``, return Series/Index, containing lists of strings.
11121116
11131117
Returns
11141118
-------
11151119
split : Series/Index or DataFrame/MultiIndex of objects
1120+
Type matches caller unless ``expand=True`` (return type is DataFrame or
1121+
MultiIndex)
1122+
1123+
Notes
1124+
-----
1125+
The handling of the `n` keyword depends on the number of found splits:
1126+
1127+
- If found splits > `n`, make first `n` splits only
1128+
- If found splits <= `n`, make all splits
1129+
- If for a certain row the number of found splits < `n`,
1130+
append `None` for padding up to `n` if ``expand=True``
1131+
1132+
Examples
1133+
--------
1134+
>>> s = pd.Series(["this is good text", "but this is even better"])
1135+
1136+
By default, split will return an object of the same size
1137+
having lists containing the split elements
1138+
1139+
>>> s.str.split()
1140+
0 [this, is, good, text]
1141+
1 [but, this, is, even, better]
1142+
dtype: object
1143+
>>> s.str.split("random")
1144+
0 [this is good text]
1145+
1 [but this is even better]
1146+
dtype: object
1147+
1148+
When using ``expand=True``, the split elements will
1149+
expand out into separate columns.
1150+
1151+
>>> s.str.split(expand=True)
1152+
0 1 2 3 4
1153+
0 this is good text None
1154+
1 but this is even better
1155+
>>> s.str.split(" is ", expand=True)
1156+
0 1
1157+
0 this good text
1158+
1 but this even better
1159+
1160+
Parameter `n` can be used to limit the number of splits in the output.
1161+
1162+
>>> s.str.split("is", n=1)
1163+
0 [th, is good text]
1164+
1 [but th, is even better]
1165+
dtype: object
1166+
>>> s.str.split("is", n=1, expand=True)
1167+
0 1
1168+
0 th is good text
1169+
1 but th is even better
1170+
1171+
If NaN is present, it is propagated throughout the columns
1172+
during the split.
1173+
1174+
>>> s = pd.Series(["this is good text", "but this is even better", np.nan])
1175+
>>> s.str.split(n=3, expand=True)
1176+
0 1 2 3
1177+
0 this is good text
1178+
1 but this is even better
1179+
2 NaN NaN NaN NaN
11161180
"""
11171181
if pat is None:
11181182
if n is None or n == 0:

0 commit comments

Comments
 (0)