Commit 346f043

add docs
1 parent 3e2f096 commit 346f043

3 files changed

+50
-4
lines changed

doc/source/user_guide/integer_na.rst

+8-1
Original file line numberDiff line numberDiff line change
@@ -126,13 +126,20 @@ These dtypes can be merged, reshaped & casted.
126126
pd.concat([df[["A"]], df[["B", "C"]]], axis=1).dtypes
127127
df["A"].astype(float)
128128
129-
Reduction and groupby operations such as 'sum' work as well.
129+
Reduction and groupby operations such as :meth:`~DataFrame.sum` work as well.
130130

131131
.. ipython:: python
132132
133+
df.sum(numeric_only=True)
133134
df.sum()
134135
df.groupby("B").A.sum()
135136
137+
.. versionchanged:: 2.1.0
138+
139+
When doing reduction operations (:meth:`~DataFrame.sum` etc.) on numeric-only
140+
DataFrames, the integer array dtype is maintained. Previously, the dtype of the
141+
reduction result would have been a numpy numeric dtype.
142+
136143
Scalar NA Value
137144
---------------
138145

doc/source/user_guide/pyarrow.rst

+6
Original file line numberDiff line numberDiff line change
@@ -152,6 +152,12 @@ The following are just some examples of operations that are accelerated by nativ
152152
ser_dt = pd.Series([datetime(2022, 1, 1), None], dtype=pa_type)
153153
ser_dt.dt.strftime("%Y-%m")
154154
155+
.. versionchanged:: 2.1.0
156+
157+
When doing :class:`DataFrame` reduction operations (:meth:`~DataFrame.sum` etc.) on
158+
pyarrow data, the dtype is now maintained when possible. Previously, the dtype
159+
of the reduction result would have been a numpy numeric dtype.
160+
155161
I/O Reading
156162
-----------
157163

doc/source/whatsnew/v2.1.0.rst

+36-3
Original file line numberDiff line numberDiff line change
@@ -14,10 +14,43 @@ including other versions of pandas.
1414
Enhancements
1515
~~~~~~~~~~~~
1616

17-
.. _whatsnew_210.enhancements.enhancement1:
17+
.. _whatsnew_210.enhancements.reduction_extension_dtypes:
1818

19-
enhancement1
20-
^^^^^^^^^^^^
19+
Reductions maintain extension dtypes
20+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21+
22+
In previous versions of pandas, the results of DataFrame reductions
23+
(:meth:`DataFrame.sum`, :meth:`DataFrame.mean`, etc.) had numpy dtypes, even when the DataFrames
24+
were of extension dtypes. Pandas can now keep the dtypes when doing reductions over DataFrame
25+
columns with a common dtype (:issue:`52788`).
26+
27+
*Old Behavior*
28+
29+
.. code-block:: ipython
30+
31+
In [1]: df = pd.DataFrame({"a": [1, 1, 2, 1], "b": [np.nan, 2.0, 3.0, 4.0]}, dtype="Int64")
32+
In [2]: df.sum()
33+
Out[2]:
34+
a 5
35+
b 9
36+
dtype: int64
37+
In [3]: df = df.astype("int64[pyarrow]")
38+
In [4]: df.sum()
39+
Out[4]:
40+
a 5
41+
b 9
42+
dtype: int64
43+
44+
*New Behavior*
45+
46+
.. ipython:: python
47+
48+
df = pd.DataFrame({"a": [1, 1, 2, 1], "b": [np.nan, 2.0, 3.0, 4.0]}, dtype="Int64")
49+
df.sum()
50+
df = df.astype("int64[pyarrow]")
51+
df.sum()
52+
53+
Notice that the dtype is now a masked dtype and a pyarrow dtype, respectively, while previously it was a numpy integer dtype.
2154

2255
.. _whatsnew_210.enhancements.enhancement2:
2356

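The behavior described in the whatsnew entry can be checked locally with a short script. This is a minimal sketch, not part of the commit: the helper name ``reduction_dtypes`` is hypothetical, and it only covers the masked ``Int64`` case so that pyarrow is not required. According to the note above, the reported dtype should be ``Int64`` on pandas 2.1.0 or later and ``int64`` on earlier versions.

```python
import numpy as np
import pandas as pd


def reduction_dtypes():
    """Sum an all-Int64 DataFrame and report the values and result dtype.

    Hypothetical helper for illustration; it reuses the frame from the
    whatsnew example in this commit.
    """
    df = pd.DataFrame(
        {"a": [1, 1, 2, 1], "b": [np.nan, 2.0, 3.0, 4.0]}, dtype="Int64"
    )
    total = df.sum()  # missing values are skipped by default
    return int(total["a"]), int(total["b"]), str(total.dtype)


a_sum, b_sum, dtype_name = reduction_dtypes()
# The dtype name depends on the installed pandas version (see lead-in).
print(pd.__version__, a_sum, b_sum, dtype_name)
```

Comparing ``dtype_name`` across pandas versions is a quick way to confirm whether the extension dtype is preserved in a given environment.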
0 commit comments

Comments
 (0)