Commit 3e5fe8e

BUG: Pickle NA objects (#32104)
According to https://docs.python.org/3/library/pickle.html#object.__reduce__,

> If a string is returned, the string should be interpreted as the name
> of a global variable. It should be the object’s local name relative to
> its module; the pickle module searches the module namespace to determine
> the object’s module. This behaviour is typically useful for singletons.

Closes #31847
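The behaviour quoted above can be sketched without pandas. The example below is a minimal illustration under stated assumptions, not the pandas implementation: `PlainSentinel`, `NamedSentinel`, and the throwaway module `na_demo` are hypothetical names, and the module is registered in `sys.modules` only so that pickle has a namespace to search wherever the snippet is run.

```python
import pickle
import sys
import types

# Hypothetical throwaway module; pickle looks globals up by module name.
demo = types.ModuleType("na_demo")
sys.modules["na_demo"] = demo


class PlainSentinel:
    """No __reduce__ override: unpickling builds a fresh instance."""

    __module__ = "na_demo"


class NamedSentinel:
    """String __reduce__: pickle stores only a reference to a global."""

    __module__ = "na_demo"

    def __reduce__(self):
        # Name of the module-level variable that holds this instance.
        return "NAMED"


# Publish the classes and their single instances in the module namespace.
demo.PlainSentinel = PlainSentinel
demo.NamedSentinel = NamedSentinel
demo.PLAIN = PlainSentinel()
demo.NAMED = NamedSentinel()

plain_copy = pickle.loads(pickle.dumps(demo.PLAIN))
named_copy = pickle.loads(pickle.dumps(demo.NAMED))

print(plain_copy is demo.PLAIN)  # False: a new object came back
print(named_copy is demo.NAMED)  # True: the same global was looked up
```

The string returned from `__reduce__` has to match the name under which the instance is bound in its module, because pickle verifies at dump time that the looked-up global is the object being pickled.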
1 parent 78c1a74 commit 3e5fe8e

3 files changed: +29 -0 lines changed


doc/source/whatsnew/v1.0.2.rst

+1
@@ -74,6 +74,7 @@ Bug fixes
 **I/O**
 
 - Using ``pd.NA`` with :meth:`DataFrame.to_json` now correctly outputs a null value instead of an empty object (:issue:`31615`)
+- Fixed pickling of ``pandas.NA``. Previously a new object was returned, which broke computations relying on ``NA`` being a singleton (:issue:`31847`)
 - Fixed bug in parquet roundtrip with nullable unsigned integer dtypes (:issue:`31896`).
 
 **Experimental dtypes**

pandas/_libs/missing.pyx

+3
@@ -364,6 +364,9 @@ class NAType(C_NAType):
         exponent = 31 if is_32bit else 61
         return 2 ** exponent - 1
 
+    def __reduce__(self):
+        return "NA"
+
     # Binary arithmetic and comparison ops -> propagate
 
     __add__ = _create_binary_propagating_op("__add__")

pandas/tests/scalar/test_na_scalar.py

+25
@@ -1,3 +1,5 @@
+import pickle
+
 import numpy as np
 import pytest
 
@@ -267,3 +269,26 @@ def test_integer_hash_collision_set():
     assert len(result) == 2
     assert NA in result
     assert hash(NA) in result
+
+
+def test_pickle_roundtrip():
+    # https://github.com/pandas-dev/pandas/issues/31847
+    result = pickle.loads(pickle.dumps(pd.NA))
+    assert result is pd.NA
+
+
+def test_pickle_roundtrip_pandas():
+    result = tm.round_trip_pickle(pd.NA)
+    assert result is pd.NA
+
+
+@pytest.mark.parametrize(
+    "values, dtype", [([1, 2, pd.NA], "Int64"), (["A", "B", pd.NA], "string")]
+)
+@pytest.mark.parametrize("as_frame", [True, False])
+def test_pickle_roundtrip_containers(as_frame, values, dtype):
+    s = pd.Series(pd.array(values, dtype=dtype))
+    if as_frame:
+        s = s.to_frame(name="A")
+    result = tm.round_trip_pickle(s)
+    tm.assert_equal(result, s)
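The container tests above compare whole objects with `tm.assert_equal`. As a rough, pandas-free sketch of what the singleton guarantee means for values nested inside a container, the example below pickles a plain dict of lists and counts missing entries by identity. `MissingType`, `MISSING`, `count_missing`, and the module name `na_container_demo` are all hypothetical and exist only for illustration.

```python
import pickle
import sys
import types

# Hypothetical throwaway module so pickle can resolve the global by name.
mod = types.ModuleType("na_container_demo")
sys.modules[mod.__name__] = mod


class MissingType:
    """Hypothetical sentinel type; not the pandas implementation."""

    __module__ = "na_container_demo"

    def __reduce__(self):
        # Pickle a reference to the module-level name, not the instance.
        return "MISSING"


mod.MissingType = MissingType
mod.MISSING = MISSING = MissingType()


def count_missing(values):
    # Identity comparison: the pattern that breaks if unpickling
    # hands back a copy instead of the singleton.
    return sum(1 for v in values if v is MISSING)


record = {"ints": [1, 2, MISSING], "strs": ["A", "B", MISSING]}
restored = pickle.loads(pickle.dumps(record))

print(count_missing(restored["ints"]), count_missing(restored["strs"]))
```

Because every occurrence is stored as a reference to the same global, each restored element is the original sentinel, so `is`-based checks keep working after the round trip.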
