Commit f851699

h-vetinari authored and TomAugspurger committed
API: str.cat will align on Series (pandas-dev#20347)
1 parent 3471b98 commit f851699

4 files changed, +729 -86 lines changed

doc/source/text.rst

+70-10
@@ -247,27 +247,87 @@ Missing values on either side will result in missing values in the result as wel
247247
s.str.cat(t)
248248
s.str.cat(t, na_rep='-')
249249
250-
Series are *not* aligned on their index before concatenation:
250+
Concatenating a Series and something array-like into a Series
251+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
252+
253+
.. versionadded:: 0.23.0
254+
255+
The parameter ``others`` can also be two-dimensional. In this case, the number of rows must match the length of the calling ``Series`` (or ``Index``).
251256

252257
.. ipython:: python
253258
254-
u = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
255-
# without alignment
259+
d = pd.concat([t, s], axis=1)
260+
s
261+
d
262+
s.str.cat(d, na_rep='-')
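The two-dimensional case can be sketched on its own as follows. This is a minimal illustration, not part of the commit: the frame ``d`` here is built from hypothetical values, and ``join='left'`` is passed explicitly so that the call behaves the same regardless of the default for ``join``.

```python
import pandas as pd

# Calling Series with four rows (values chosen for illustration).
s = pd.Series(['a', 'b', 'c', 'd'])

# Two-dimensional `others`: one row per element of `s`, two columns.
# The None in the first column is a missing value.
d = pd.DataFrame({0: ['w', 'x', None, 'z'], 1: ['1', '2', '3', '4']})

# Each row of `d` is appended column by column to the matching element
# of `s`; missing values are replaced by na_rep before concatenation.
result = s.str.cat(d, join='left', na_rep='-')
print(result.tolist())  # ['aw1', 'bx2', 'c-3', 'dz4']
```

Without ``na_rep``, the third element would be missing in the result, because a missing value on either side propagates.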
263+
264+
Concatenating a Series and an indexed object into a Series, with alignment
265+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
266+
267+
.. versionadded:: 0.23.0
268+
269+
For concatenation with a ``Series`` or ``DataFrame``, the indexes can be aligned before concatenation by setting
270+
the ``join`` keyword.
271+
272+
.. ipython:: python
273+
274+
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
275+
s
276+
u
256277
s.str.cat(u)
257-
# with separate alignment
258-
v, w = s.align(u)
259-
v.str.cat(w, na_rep='-')
278+
s.str.cat(u, join='left')
279+
280+
.. warning::
281+
282+
If the ``join`` keyword is not passed, the method :meth:`~Series.str.cat` currently falls back to the behavior before version 0.23.0 (i.e. no alignment),
283+
but a ``FutureWarning`` is issued if any of the involved indexes differ, since this default will change to ``join='left'`` in a future version.
284+
285+
The usual options are available for ``join`` (one of ``'left', 'outer', 'inner', 'right'``).
286+
In particular, alignment also means that the caller and ``others`` no longer need to have the same length.
287+
288+
.. ipython:: python
289+
290+
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])
291+
s
292+
v
293+
s.str.cat(v, join='left', na_rep='-')
294+
s.str.cat(v, join='outer', na_rep='-')
295+
296+
The same alignment can be used when ``others`` is a ``DataFrame``:
297+
298+
.. ipython:: python
299+
300+
f = d.loc[[3, 2, 1, 0], :]
301+
s
302+
f
303+
s.str.cat(f, join='left', na_rep='-')
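The effect of the different ``join`` options can be checked with a small standalone sketch. It assumes that ``s`` holds ``['a', 'b', 'c', 'd']`` with a default integer index (``s`` is defined outside this hunk, so that is an assumption) and reuses ``v`` as defined above.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of `s`
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# join='left' keeps the index of the caller: label 2 has no partner in
# `v`, so na_rep fills the gap; labels -1 and 4 of `v` are dropped.
left = s.str.cat(v, join='left', na_rep='-')
print(left.to_dict())

# join='outer' uses the union of both indexes, so the result has more
# rows than `s`; gaps on either side are filled with na_rep.
outer = s.str.cat(v, join='outer', na_rep='-')
print(outer.to_dict())
```

Under these assumptions the left join yields ``aa, bb, c-, dd`` for labels 0 to 3, while the outer join additionally yields ``-z`` for label -1 and ``-e`` for label 4.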
260304
261305
Concatenating a Series and many objects into a Series
262306
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
263307

264-
List-likes (excluding iterators, ``dict``-views, etc.) can be arbitrarily combined in a list.
265-
All elements of the list must match in length to the calling ``Series`` (resp. ``Index``):
308+
All one-dimensional list-likes can be arbitrarily combined in a list-like container (including iterators, ``dict``-views, etc.):
309+
310+
.. ipython:: python
311+
312+
s
313+
u
314+
s.str.cat([u, pd.Index(u.values), ['A', 'B', 'C', 'D'], map(int, u.index)], na_rep='-')
315+
316+
All elements must match the calling ``Series`` (or ``Index``) in length, except those that have an index when ``join`` is not ``None``:
317+
318+
.. ipython:: python
319+
320+
v
321+
s.str.cat([u, v, ['A', 'B', 'C', 'D']], join='outer', na_rep='-')
322+
323+
If using ``join='right'`` on a list of ``others`` that contains different indexes,
324+
the union of these indexes will be used as the basis for the final concatenation:
266325

267326
.. ipython:: python
268327
269-
x = pd.Series([1, 2, 3, 4], index=['A', 'B', 'C', 'D'])
270-
s.str.cat([['A', 'B', 'C', 'D'], s, s.values, x.index])
328+
u.loc[[3]]
329+
v.loc[[-1, 0]]
330+
s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
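The following sketch combines several ``others`` in one call. It again assumes ``s = pd.Series(['a', 'b', 'c', 'd'])``, and it uses a NumPy array in place of a nested plain list for the unindexed element; that substitution is a choice made for this illustration, not something taken from the commit.

```python
import numpy as np
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])  # assumed definition of `s`
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# The unindexed array must have the same length as `s`; the indexed
# objects `u` and `v` are aligned and may have any length.
labels = np.array(['A', 'B', 'C', 'D'])
outer = s.str.cat([u, v, labels], join='outer', na_rep='-')
print(outer.to_dict())

# With join='right', the result is indexed by the union of the indexes
# of `others` (here 3, -1 and 0), not by the index of `s`.
right = s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
print(right.to_dict())
```

In the right join, label -1 does not exist in ``s``, so the caller itself contributes ``na_rep`` for that row.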
271331
272332
Indexing with ``.str``
273333
----------------------

doc/source/whatsnew/v0.23.0.txt

+18
@@ -308,6 +308,24 @@ The :func:`DataFrame.assign` now accepts dependent keyword arguments for python
308308

309309
df.assign(A=df.A+1, C= lambda df: df.A* -1)
310310

311+
.. _whatsnew_0230.enhancements.str_cat_align:
312+
313+
``Series.str.cat`` has gained the ``join`` kwarg
314+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
315+
316+
Previously, :meth:`Series.str.cat` did not -- in contrast to most of ``pandas`` -- align :class:`Series` on their index before concatenation (see :issue:`18657`).
317+
The method has now gained a keyword ``join`` to control the manner of alignment; see the examples below and :ref:`here <text.concatenate>`.
318+
319+
In v0.23, ``join`` defaults to ``None`` (meaning no alignment), but this default will change to ``'left'`` in a future version of pandas.
320+
321+
.. ipython:: python
322+
323+
s = pd.Series(['a', 'b', 'c', 'd'])
324+
t = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
325+
s.str.cat(t)
326+
s.str.cat(t, join='left', na_rep='-')
327+
328+
Furthermore, :meth:`Series.str.cat` now works for ``CategoricalIndex`` as well (previously raised a ``ValueError``; see :issue:`20842`).
311329

312330
.. _whatsnew_0230.enhancements.astype_category:
313331
