@@ -196,9 +196,9 @@ DataFrame with one column per group.
 Elements that do not match return a row filled with ``NaN``. Thus, a
 Series of messy strings can be "converted" into a like-indexed Series
 or DataFrame of cleaned-up or more useful strings, without
-necessitating ``get()`` to access tuples or ``re.match`` objects. The
-results dtype always is object, even if no match is found and the
-result only contains ``NaN``.
+necessitating ``get()`` to access tuples or ``re.match`` objects. The
+dtype of the result is always object, even if no match is found and
+the result only contains ``NaN``.
 
 Named groups like
 
@@ -275,15 +275,16 @@ Extract all matches in each subject (extractall)
 
 .. _text.extractall:
 
+.. versionadded:: 0.18.0
+
 Unlike ``extract`` (which returns only the first match),
 
 .. ipython:: python
 
    s = pd.Series(["a1a2", "b1", "c1"], ["A", "B", "C"])
    s
-   s.str.extract("[ab](?P<digit>\d)", expand=False)
-
-.. versionadded:: 0.18.0
+   two_groups = '(?P<letter>[a-z])(?P<digit>[0-9])'
+   s.str.extract(two_groups, expand=True)
 
 the ``extractall`` method returns every match. The result of
 ``extractall`` is always a ``DataFrame`` with a ``MultiIndex`` on its
@@ -292,30 +293,29 @@ indicates the order in the subject.
 
 .. ipython:: python
 
-   s.str.extractall("[ab](?P<digit>\d)")
+   s.str.extractall(two_groups)
 
 When each subject string in the Series has exactly one match,
 
 .. ipython:: python
 
    s = pd.Series(['a3', 'b3', 'c2'])
    s
-   two_groups = '(?P<letter>[a-z])(?P<digit>[0-9])'
 
 then ``extractall(pat).xs(0, level='match')`` gives the same result as
 ``extract(pat)``.
 
 .. ipython:: python
 
-   extract_result = s.str.extract(two_groups, expand=False)
+   extract_result = s.str.extract(two_groups, expand=True)
    extract_result
    extractall_result = s.str.extractall(two_groups)
    extractall_result
    extractall_result.xs(0, level="match")
 
 
 Testing for Strings that Match or Contain a Pattern
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------------------------------
 
 You can check whether elements contain a pattern:
 
@@ -355,7 +355,7 @@ Methods like ``match``, ``contains``, ``startswith``, and ``endswith`` take
    s4.str.contains('A', na=False)
 
 Creating Indicator Variables
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+----------------------------
 
 You can extract dummy variables from string columns.
 For example if they are separated by a ``'|'``:
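
The equivalence that the revised example documents can be checked outside the docs build. The following is a minimal sketch, assuming pandas 0.18.0 or later is importable as ``pd``; the names ``multi``, ``all_matches``, and ``first_match`` are illustrative and do not appear in the diff.

```python
import pandas as pd

# Same pattern as in the diff: two named groups, a lowercase letter then a digit.
two_groups = r"(?P<letter>[a-z])(?P<digit>[0-9])"

# A subject with more than one match ("a1a2") shows how extractall differs
# from extract: it returns one row per match rather than one row per subject.
multi = pd.Series(["a1a2", "b1", "c1"], index=["A", "B", "C"])
all_matches = multi.str.extractall(two_groups)
print(all_matches)

# When every subject has exactly one match, selecting match number 0
# should reproduce what extract returns with expand=True.
s = pd.Series(["a3", "b3", "c2"])
extract_result = s.str.extract(two_groups, expand=True)
extractall_result = s.str.extractall(two_groups)
first_match = extractall_result.xs(0, level="match")

# Compare the extracted values cell by cell.
same_values = first_match.to_numpy().tolist() == extract_result.to_numpy().tolist()
print(same_values)
```

Passing ``expand=True`` explicitly, as the updated lines do, keeps the ``extract`` result a ``DataFrame`` so it can be compared directly with the ``extractall`` selection.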