Commit a1b1067

DOC: Integer NA
Closes pandas-dev#22003
1 parent 06f8568 commit a1b1067

3 files changed, +37 -14 lines changed

doc/source/integer_na.rst

+18 -13
@@ -8,27 +8,33 @@
 
 .. _integer_na:
 
-********************************
-Integer Data with Missing Values
-********************************
+**************************
+Nullable Integer Data Type
+**************************
 
 .. versionadded:: 0.24.0
 
-In :ref:`missing_data`, we say that pandas primarily uses ``NaN`` to represent
+In :ref:`missing_data`, we saw that pandas primarily uses ``NaN`` to represent
 missing data. Because ``NaN`` is a float, this forces an array of integers with
 any missing values to become floating point. In some cases, this may not matter
 much. But if your integer column is, say, and identifier, casting to float can
-lead to bad outcomes.
+be problematic.
 
 Pandas can represent integer data with missing values with the
 :class:`arrays.IntegerArray` array. This is an :ref:`extension types <extending.extension-types>`
-implemented within pandas. It is not the default dtype and will not be inferred,
-you must explicitly create an :class:`api.extensions.IntegerArray` using :func:`integer_array`.
+implemented within pandas. It is not the default dtype for integers, and will not be inferred;
+you must explicitly pass the dtype into the :meth:`array` or :class:`Series` method:
 
 .. ipython:: python
 
-   arr = integer_array([1, 2, np.nan])
-   arr
+   pd.array([1, 2, np.nan], dtype=pd.Int64Dtype())
+
+Or the string alias "Int64" (note the capital ``"I"``, to differentiate from
+NumPy's ``'int64'`` dtype:
+
+.. ipython:: python
+
+   pd.array([1, 2, np.nan], dtype="Int64")
 
 This array can be stored in a :class:`DataFrame` or :class:`Series` like any
 NumPy array.
@@ -37,22 +43,21 @@ NumPy array.
 
    pd.Series(arr)
 
-Alternatively, you can instruct pandas to treat an array-like as an
-:class:`api.extensions.IntegerArray` by specifying a dtype with a capital "I".
+You can also pass the list-like object to the :class:`Series` constructor
+with the dtype.
 
 .. ipython:: python
 
    s = pd.Series([1, 2, np.nan], dtype="Int64")
    s
 
-Note that by default (if you don't specify `dtype`), NumPy is used, and you'll end
+By default (if you don't specify ``dtype``), NumPy is used, and you'll end
 up with a ``float64`` dtype Series:
 
 .. ipython:: python
 
    pd.Series([1, 2, np.nan])
 
-
 Operations involving an integer array will behave similar to NumPy arrays.
 Missing values will be propagated, and and the data will be coerced to another
 dtype if needed.
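
For readers who want to try the behavior the revised page describes, here is a minimal sketch. It assumes a pandas version that provides ``pd.array`` and the ``Int64`` dtype (0.24.0 or later). The variable names ``floats``, ``nullable``, ``arr`` and ``total`` are illustrative and do not appear in the commit.

```python
import numpy as np
import pandas as pd

# Default inference: the missing value forces a float64 result.
floats = pd.Series([1, 2, np.nan])

# Explicitly requesting the nullable dtype keeps the values as integers.
nullable = pd.Series([1, 2, np.nan], dtype="Int64")

# The same data built with pd.array and the dtype object instead of the alias.
arr = pd.array([1, 2, np.nan], dtype=pd.Int64Dtype())

print(floats.dtype)
print(nullable.dtype)
print(arr.dtype)

# Arithmetic propagates the missing value and keeps the nullable dtype.
total = nullable + 1
print(total.dtype)
print(total.isna().tolist())
```

The string alias and the dtype object should give the same dtype; the lowercase ``"int64"`` alias refers to NumPy's dtype, which cannot hold missing values.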

doc/source/missing_data.rst

+16
@@ -760,3 +760,19 @@ However, these can be filled in using :meth:`~DataFrame.fillna` and it will work
 
    reindexed[crit.fillna(False)]
    reindexed[crit.fillna(True)]
+
+Pandas provides a nullable integer dtype, but you must explicitly request it
+when creating the series or column. Notice that we use a capital "I" in
+the ``dtype="Int64"``.
+
+.. ipython:: python
+
+   s = pd.Series(np.random.randn(5), index=[0, 2, 4, 6, 7],
+                 dtype="Int64")
+   s > 0
+   (s > 0).dtype
+   crit = (s > 0).reindex(list(range(8)))
+   crit
+   crit.dtype
+
+See :ref:`integer_na` for more.
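
The mask-and-reindex pattern in this hunk can be sketched as follows. This version uses integer-valued input with one missing entry, on the assumption that non-integer floats such as the output of ``np.random.randn`` are not expected to cast safely to ``Int64``. The names ``mask``, ``data`` and ``selected`` are illustrative, and the dtype of the intermediate ``crit`` may differ between pandas versions, so the sketch does not depend on it.

```python
import numpy as np
import pandas as pd

# Nullable integer Series with one missing value (integer-valued input).
s = pd.Series([0, 1, np.nan, 3, 4], index=[0, 2, 4, 6, 7], dtype="Int64")

# Comparison yields a mask; reindexing to labels 0..7 adds labels with no value.
crit = (s > 0).reindex(list(range(8)))

# Resolve missing entries before using the mask for selection.
mask = crit.fillna(False).astype(bool)

# Hypothetical data to select from, indexed 0..7 like the mask.
data = pd.Series(range(8))
selected = data[mask]

print(mask.tolist())
print(selected.tolist())
```

Filling with ``False`` drops the rows whose mask value is unknown; filling with ``True`` would keep them instead.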

doc/source/whatsnew/v0.24.0.txt

+3 -1
@@ -99,7 +99,9 @@ Reduction and groupby operations such as 'sum' work.
 
 .. warning::
 
-   The Integer NA support currently uses the captilized dtype version, e.g. ``Int8`` as compared to the traditional ``int8``. This may be changed at a future date.
+   The Integer NA support currently uses the capitalized dtype version, e.g. ``Int8`` as compared to the traditional ``int8``. This may be changed at a future date.
+
+   See :ref:`integer_na` for more.
 
 .. _whatsnew_0240.enhancements.read_html:
 