@@ -35,6 +35,23 @@ which is similar to a NumPy array. To construct these from the main pandas data
     df = pd.DataFrame([[1, 2], [3, 4]], dtype="uint64[pyarrow]")
     df

+.. note::
+
+    The string alias ``"string[pyarrow]"`` maps to ``pd.StringDtype("pyarrow")``, which is not equivalent to
+    specifying ``dtype=pd.ArrowDtype(pa.string())``. Generally, operations on the data will behave similarly,
+    except that ``pd.StringDtype("pyarrow")`` can return NumPy-backed nullable types while
+    ``pd.ArrowDtype(pa.string())`` will return :class:`ArrowDtype`.
+
+    .. ipython:: python
+
+        import pyarrow as pa
+        data = list("abc")
+        ser_sd = pd.Series(data, dtype="string[pyarrow]")
+        ser_ad = pd.Series(data, dtype=pd.ArrowDtype(pa.string()))
+        ser_ad.dtype == ser_sd.dtype
+        ser_sd.str.contains("a")
+        ser_ad.str.contains("a")
+
 For PyArrow types that accept parameters, you can pass in a PyArrow type with those parameters
 into :class:`ArrowDtype` to use in the ``dtype`` parameter.

@@ -106,6 +123,7 @@ The following are just some examples of operations that are accelerated by nativ

 .. ipython:: python

+    import pyarrow as pa
     ser = pd.Series([-1.545, 0.211, None], dtype="float32[pyarrow]")
     ser.mean()
     ser + ser
@@ -115,7 +133,7 @@ The following are just some examples of operations that are accelerated by nativ
     ser.isna()
     ser.fillna(0)

-    ser_str = pd.Series(["a", "b", None], dtype="string[pyarrow]")
+    ser_str = pd.Series(["a", "b", None], dtype=pd.ArrowDtype(pa.string()))
     ser_str.str.startswith("a")

     from datetime import datetime