BUG: to_dict not converting masked dtype to native python types #50510

Closed
wants to merge 1 commit into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.0.0.rst
@@ -913,6 +913,7 @@ I/O
 - Bug in displaying ``string`` dtypes not showing storage option (:issue:`50099`)
 - Bug in :meth:`DataFrame.to_string` with ``header=False`` that printed the index name on the same line as the first row of the data (:issue:`49230`)
 - Bug in :meth:`DataFrame.to_string` ignoring float formatter for extension arrays (:issue:`39336`)
+- Bug in :meth:`DataFrame.to_dict` not converting elements to native Python types for masked arrays (:issue:`34665`)
 - Fixed memory leak which stemmed from the initialization of the internal JSON module (:issue:`49222`)
 - Fixed issue where :func:`json_normalize` would incorrectly remove leading characters from column names that matched the ``sep`` argument (:issue:`49861`)
 - Bug in :meth:`DataFrame.to_json` where it would segfault when failing to encode a string (:issue:`50307`)
8 changes: 4 additions & 4 deletions pandas/core/arrays/masked.py
@@ -250,15 +250,15 @@ def __setitem__(self, key, value) -> None:
     def __iter__(self) -> Iterator:
         if self.ndim == 1:
             if not self._hasna:
-                for val in self._data:
-                    yield val
+                for i, val in enumerate(self._data):
+                    yield self._data.item(i)
             else:
                 na_value = self.dtype.na_value
-                for isna_, val in zip(self._mask, self._data):
+                for i, (isna_, val) in enumerate(zip(self._mask, self._data)):
                     if isna_:
                         yield na_value
                     else:
-                        yield val
+                        yield self._data.item(i)
         else:
             for i in range(len(self)):
                 yield self[i]
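The change to `__iter__` relies on the difference between iterating a NumPy array, which yields NumPy scalars, and calling `ndarray.item(i)`, which returns a native Python scalar. The following is a minimal sketch of that difference using plain NumPy only. The `iter_masked` helper and the use of `None` as the missing-value placeholder are assumptions for illustration and are not part of pandas.

```python
import numpy as np

data = np.array([1, 2, 3], dtype="int64")
mask = np.array([False, True, False])

# Iterating the array directly yields NumPy scalar objects.
iterated = list(data)

# ndarray.item(i) converts the element to a native Python scalar.
native = [data.item(i) for i in range(len(data))]


def iter_masked(data, mask, na_value=None):
    """Hypothetical stand-in for the masked iteration shown in the diff.

    Yields na_value where the mask is True, otherwise the native
    Python value of the element.
    """
    for i, isna in enumerate(mask):
        if isna:
            yield na_value
        else:
            yield data.item(i)


values = list(iter_masked(data, mask))

print(type(iterated[0]).__name__, type(native[0]).__name__)
print(values)
```

In this sketch the elements of `native` and the unmasked entries of `values` are plain `int` objects, which is the property the new test checks for `DataFrame.to_dict`.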
11 changes: 11 additions & 0 deletions pandas/tests/frame/methods/test_to_dict.py
@@ -9,6 +9,7 @@
 import pytz
 
 from pandas import (
+    NA,
     DataFrame,
     Index,
     MultiIndex,
@@ -458,3 +459,13 @@ def test_to_dict_index_false(self, orient, expected):
         df = DataFrame({"col1": [1, 2], "col2": [3, 4]}, index=["row1", "row2"])
         result = df.to_dict(orient=orient, index=False)
         tm.assert_dict_equal(result, expected)
+
+    def test_to_dict_masked_native_python(self):
+        # GH#34665
+        df = DataFrame({"a": Series([1, 2], dtype="Int64"), "B": 1})
+        result = df.to_dict(orient="records")
+        assert type(result[0]["a"]) is int
+
+        df = DataFrame({"a": Series([1, NA], dtype="Int64"), "B": 1})
+        result = df.to_dict(orient="records")
+        assert type(result[0]["a"]) is int