@@ -53,6 +53,32 @@ need to implement certain operations expected by pandas users (for example
the algorithm used in ``Series.str.upper``). That work may be done outside of
pandas.
+ Consistent missing value handling
+ ---------------------------------
+
+ Currently, pandas handles missing data differently for different data types. We
+ use different types to indicate that a value is missing (``np.nan`` for
+ floating-point data, ``np.nan`` or ``None`` for object-dtype data -- typically
+ strings or booleans -- with missing values, and ``pd.NaT`` for datetimelike
+ data). Integer data cannot store missing values and is cast to float. In addition,
+ pandas 1.0 introduced a new missing value sentinel, ``pd.NA``, which is being
+ used for the experimental nullable integer, boolean, and string data types.
+
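The sentinels listed above can be observed directly. The following is a minimal sketch, assuming pandas 1.0 or later; the variable names are illustrative and the object-dtype example passes ``dtype=object`` explicitly so that ``None`` is kept as-is.

```python
import numpy as np
import pandas as pd

# Floating-point data marks missing values with np.nan.
floats = pd.Series([1.5, np.nan])

# Object-dtype data can hold None (or np.nan) as the missing marker.
objects = pd.Series(["a", None], dtype=object)

# Datetimelike data uses pd.NaT.
datetimes = pd.Series(pd.to_datetime(["2020-01-01", None]))

# Default integer data cannot hold a missing value, so it is cast to float64.
ints = pd.Series([1, None])

# The experimental nullable integer dtype keeps integers and stores pd.NA.
nullable = pd.Series([1, None], dtype="Int64")

for name, series in [
    ("floats", floats),
    ("objects", objects),
    ("datetimes", datetimes),
    ("ints", ints),
    ("nullable", nullable),
]:
    # Show the dtype and the value used to represent the missing entry.
    print(name, series.dtype, repr(series.iloc[1]))
```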
+ These different missing values have different behaviors in user-facing
+ operations. Specifically, we introduced different semantics for the nullable
+ data types for certain operations (e.g. propagating in comparison operations
+ instead of comparing as False).
+
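To illustrate the difference in comparison semantics, here is a small sketch, again assuming pandas 1.0 or later with illustrative variable names. It compares a float Series containing ``np.nan`` with a nullable integer Series containing ``pd.NA``.

```python
import numpy as np
import pandas as pd

float_data = pd.Series([1.0, np.nan])
nullable_data = pd.Series([1, None], dtype="Int64")

# With np.nan, the comparison yields False for the missing entry
# and the result has the plain NumPy bool dtype.
float_result = float_data == 1

# With pd.NA, the missing value propagates and the result uses
# the nullable boolean dtype.
nullable_result = nullable_data == 1

print(float_result.dtype, float_result.tolist())
print(nullable_result.dtype, nullable_result.tolist())

# The same difference shows up for scalars.
scalar_nan = np.nan == 1
scalar_na = pd.NA == 1
print(scalar_nan, scalar_na)
```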
+ Long term, we want to introduce consistent missing data handling for all data
+ types. This includes consistent behavior in all operations (indexing, arithmetic
+ operations, comparisons, etc.). We want to eventually make the new semantics the
+ default.
+
+ This has been discussed at
+ `github #28095 <https://github.com/pandas-dev/pandas/issues/28095>`__ (and
+ linked issues), and described in more detail in this
+ `design doc <https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB>`__.
+
Apache Arrow interoperability
-----------------------------