@@ -141,7 +141,7 @@ an axis and broadcasting over the same axis:
major_mean
wp.sub(major_mean, axis='major')

- And similarly for axis="items" and axis="minor".
+ And similarly for ``axis="items"`` and ``axis="minor"``.

.. note::

@@ -369,14 +369,14 @@ index labels with the minimum and maximum corresponding values:
df1.idxmin(axis=0)
df1.idxmax(axis=1)

- When there are multiple rows (or columns) matching the minimum or maximum
+ When there are multiple rows (or columns) matching the minimum or maximum
value, ``idxmin`` and ``idxmax`` return the first matching index:

.. ipython:: python

- df = DataFrame([2, 1, 1, 3, np.nan], columns=['A'], index=list('edcba'))
- df
- df['A'].idxmin()
+ df3 = DataFrame([2, 1, 1, 3, np.nan], columns=['A'], index=list('edcba'))
+ df3
+ df3['A'].idxmin()

Value counts (histogramming)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -835,6 +835,74 @@ For instance,

for r in df2.itertuples(): print r

+ .. _basics.string_methods:
+
+ Vectorized string methods
+ -------------------------
+
+ Series is equipped (as of pandas 0.8.1) with a set of string processing methods
+ that make it easy to operate on each element of the array. Perhaps most
+ importantly, these methods exclude missing/NA values automatically. These are
+ accessed via the Series's ``str`` attribute and generally have names matching
+ the equivalent (scalar) built-in string methods:
+
+ .. ipython:: python
+
+ s = Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])
+ s.str.lower()
+ s.str.upper()
+ s.str.len()
+
+ Methods like ``split`` return a Series of lists:
+
+ .. ipython:: python
+
+ s2 = Series(['a_b_c', 'c_d_e', np.nan, 'f_g_h'])
+ s2.str.split('_')
+
+ Elements in the split lists can be accessed using ``get`` or ``[]`` notation:
+
+ .. ipython:: python
+
+ s2.str.split('_').str.get(1)
+ s2.str.split('_').str[1]
+
+ Methods like ``replace`` and ``findall`` take regular expressions, too:
+
+ .. ipython:: python
+
+ s3 = Series(['A', 'B', 'C', 'Aaba', 'Baca',
+ '', np.nan, 'CABA', 'dog', 'cat'])
+ s3
+ s3.str.replace('^.a|dog', 'XX-XX ', case=False)
+
+ .. csv-table::
+ :header: "Method", "Description"
+ :widths: 20, 80
+
+ ``cat``,Concatenate strings
+ ``split``,Split strings on delimiter
+ ``get``,Index into each element (retrieve i-th element)
+ ``join``,Join strings in each element of the Series with passed separator
+ ``contains``,Return boolean array indicating whether each string contains pattern/regex
+ ``replace``,Replace occurrences of pattern/regex with some other string
+ ``repeat``,Duplicate values (``s.str.repeat(3)`` is equivalent to ``x * 3`` for each element ``x``)
+ ``pad``,"Add whitespace to left, right, or both sides of strings"
+ ``center``,Equivalent to ``pad(side='both')``
+ ``slice``,Slice each string in the Series
+ ``slice_replace``,Replace slice in each string with passed value
+ ``count``,Count occurrences of pattern
+ ``startswith``,Equivalent to ``str.startswith(pat)`` for each element
+ ``endswith``,Equivalent to ``str.endswith(pat)`` for each element
+ ``findall``,Compute list of all occurrences of pattern/regex for each string
+ ``match``,"Call ``re.match`` on each element, returning matched groups as list"
+ ``len``,Compute string lengths
+ ``strip``,Equivalent to ``str.strip``
+ ``rstrip``,Equivalent to ``str.rstrip``
+ ``lstrip``,Equivalent to ``str.lstrip``
+ ``lower``,Equivalent to ``str.lower``
+ ``upper``,Equivalent to ``str.upper``
+
.. _basics.sorting:

Sorting by index and value
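
The NA-skipping behaviour that the added "Vectorized string methods" section describes can be sketched in plain Python, without pandas. This is only an illustration of the idea, not how pandas implements it: the helper names ``is_na`` and ``str_map`` are hypothetical, and treating ``None`` and float NaN as the only missing values is an assumption of the sketch.

```python
import math


def is_na(value):
    """Return True for None or a float NaN (this sketch's notion of missing)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def str_map(values, func):
    """Apply func to each non-missing element and pass missing ones through."""
    return [v if is_na(v) else func(v) for v in values]


nan = float("nan")

# Element-wise equivalents of s.str.lower() and s.str.len()
s = ["A", "B", "C", "Aaba", "Baca", nan, "CABA", "dog", "cat"]
lowered = str_map(s, str.lower)
lengths = str_map(s, len)

# Element-wise equivalent of s2.str.split('_').str.get(1)
s2 = ["a_b_c", "c_d_e", nan, "f_g_h"]
parts = str_map(s2, lambda text: text.split("_"))
second = str_map(parts, lambda pieces: pieces[1])
```

In this sketch the missing entry stays missing at every step, so chained operations such as splitting and then indexing never raise on the NaN element.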