BUG: Series.to_dict does not return native Python types #37648

Merged
merged 57 commits into from
Feb 19, 2021
Changes from 9 commits
a2e3e55
TST: add GH25969 OP
arw2019 Nov 5, 2020
41f0a4b
ENH: add maybe_box_native
arw2019 Nov 5, 2020
5e4edbe
ENH: use maybe_box_native in Series.to_dict
arw2019 Nov 5, 2020
b6967b7
BUG: add scalar check to maybe_box_native
arw2019 Nov 5, 2020
55919e0
BUG: suppress int conversion ValueError in maybe_box_native
arw2019 Nov 5, 2020
759e091
TST: rewrite existing to_dict tests
arw2019 Nov 6, 2020
6f966f2
CLN: use maybe_box_datetimelike -> maybe_box_native in DataFrame.to_dict
arw2019 Nov 6, 2020
1e058ab
Merge remote-tracking branch 'upstream/master' into maybe_box_native
arw2019 Nov 6, 2020
2ebd673
TYP: maybe_box_native
arw2019 Nov 6, 2020
1dc5935
DOC: add docstring to maybe_box_native
arw2019 Nov 6, 2020
1e5e459
DOC: whatsnew
arw2019 Nov 6, 2020
3c6bd7e
TYP: fix input type hint in maybe_box_native
arw2019 Nov 6, 2020
e3cc18f
TST (feedback): add uint testcases
arw2019 Nov 8, 2020
22819b7
TST (feedback): add uint testcases
arw2019 Nov 8, 2020
fb782df
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 9, 2020
cb389d9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 9, 2020
249968e
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 10, 2020
1f5d442
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 10, 2020
9686035
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 11, 2020
732fb84
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Nov 23, 2020
ec6cbfc
merge master + move to 1.3
arw2019 Dec 27, 2020
4bb1916
fixups
arw2019 Dec 27, 2020
587e592
fix merge error
arw2019 Dec 27, 2020
9d81f54
whatsnew
arw2019 Dec 27, 2020
673da4e
fix typing
arw2019 Dec 27, 2020
ef639c9
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Dec 27, 2020
4036b63
tests
arw2019 Dec 28, 2020
89a841d
add NumericArray path in to_numeric
arw2019 Dec 28, 2020
68420ea
review: tests
arw2019 Jan 4, 2021
8b83e24
add bool check
arw2019 Jan 4, 2021
86b0e04
review: maybe_box_native takes Scalar arg only
arw2019 Jan 4, 2021
efc95b8
merge master
arw2019 Jan 4, 2021
c3b723a
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Jan 5, 2021
d5a9476
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Jan 31, 2021
3e2ea12
review comments: add unit test
arw2019 Jan 31, 2021
a455fcc
CI failures
arw2019 Jan 31, 2021
5541a35
merge master
arw2019 Jan 31, 2021
fcbf705
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 7, 2021
9282ca3
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 11, 2021
a444ef5
silence NumPy deprecation warning (np.int -> int)
arw2019 Feb 11, 2021
2411a70
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 11, 2021
99d7c55
silence NumPy deprecation warning (np.float -> float)
arw2019 Feb 12, 2021
e009a9e
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 12, 2021
eaeb409
whatsnew
arw2019 Feb 13, 2021
a642a6b
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 13, 2021
0467e1b
review: use is_foo instead of is_foo_dtype
arw2019 Feb 16, 2021
5605bbd
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 16, 2021
39ca508
review: remove suppress in int clause to check if anything fails
arw2019 Feb 16, 2021
7fa7503
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 16, 2021
08567b4
pre-commit failure
arw2019 Feb 16, 2021
7c47df2
merge master
arw2019 Feb 16, 2021
618f8ef
merge master
arw2019 Feb 16, 2021
da620f2
review: more examples in unit test
arw2019 Feb 16, 2021
761b728
skip json test with numpy_dev
arw2019 Feb 17, 2021
49acd25
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 17, 2021
86c6aa7
revert changes to JSON test
arw2019 Feb 18, 2021
4650131
Merge branch 'master' of https://github.com/pandas-dev/pandas into ma…
arw2019 Feb 18, 2021
14 changes: 14 additions & 0 deletions pandas/core/dtypes/cast.py
@@ -159,6 +159,20 @@ def maybe_box_datetimelike(value: Scalar, dtype: Optional[Dtype] = None) -> Scal
     return value


+# TODO: this should be a TypeVar
+def maybe_box_native(value: Union[Series, Scalar]) -> Union[ABCSeries, Scalar]:
+    if not is_scalar(value):
+        pass
+    elif is_datetime_or_timedelta_dtype(value):
+        value = maybe_box_datetimelike(value)
+    elif is_float_dtype(value):
+        value = float(value)
+    elif is_integer_dtype(value):
+        with suppress(ValueError):
+            value = int(value)
+    return value
+
+
 def maybe_downcast_to_dtype(result, dtype: Union[str, np.dtype]):
     """
     try to cast to the specified dtype (e.g. convert back to bool/int
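
A minimal sketch of the idea behind the new helper, assuming only NumPy scalars as input. `box_native_sketch` is a hypothetical name used for illustration; it is not the pandas implementation, which dispatches through pandas' own dtype checks and boxes datetimelike values via `maybe_box_datetimelike`.

```python
import numpy as np


def box_native_sketch(value):
    """Return a built-in Python scalar for common NumPy scalar types.

    Hypothetical stand-in for illustration only, not pandas' maybe_box_native.
    """
    if isinstance(value, np.bool_):
        return bool(value)
    if isinstance(value, np.integer):  # covers signed and unsigned integers
        return int(value)
    if isinstance(value, np.floating):
        return float(value)
    # Anything else (str, datetime-like, containers) passes through unchanged.
    return value
```

Because `np.float64` subclasses `float`, an `isinstance` check cannot distinguish the two, so an exact `type(...) is float` comparison is the stricter test of whether a value was converted.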
9 changes: 4 additions & 5 deletions pandas/core/frame.py
@@ -84,7 +84,7 @@
     find_common_type,
     infer_dtype_from_scalar,
     invalidate_string_dtypes,
-    maybe_box_datetimelike,
+    maybe_box_native,
     maybe_cast_to_datetime,
     maybe_casted_values,
     maybe_convert_platform,
@@ -1539,15 +1539,15 @@ def to_dict(self, orient="dict", into=dict):
                     (
                         "data",
                         [
-                            list(map(maybe_box_datetimelike, t))
+                            list(map(maybe_box_native, t))
                             for t in self.itertuples(index=False, name=None)
                         ],
                     ),
                 )
             )

         elif orient == "series":
-            return into_c((k, maybe_box_datetimelike(v)) for k, v in self.items())
+            return into_c((k, maybe_box_native(v)) for k, v in self.items())

         elif orient == "records":
             columns = self.columns.tolist()
@@ -1556,8 +1556,7 @@ def to_dict(self, orient="dict", into=dict):
                 for row in self.itertuples(index=False, name=None)
             )
             return [
-                into_c((k, maybe_box_datetimelike(v)) for k, v in row.items())
-                for row in rows
+                into_c((k, maybe_box_native(v)) for k, v in row.items()) for row in rows
             ]

         elif orient == "index":
3 changes: 2 additions & 1 deletion pandas/core/series.py
@@ -42,6 +42,7 @@

 from pandas.core.dtypes.cast import (
     convert_dtypes,
+    maybe_box_native,
     maybe_cast_to_extension_array,
     validate_numeric_casting,
 )
@@ -1600,7 +1601,7 @@ def to_dict(self, into=dict):
         """
         # GH16122
         into_c = com.standardize_mapping(into)
-        return into_c(self.items())
+        return into_c((k, maybe_box_native(v)) for k, v in self.items())

     def to_frame(self, name=None) -> "DataFrame":
         """
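
The changed `return` line boxes each value while the mapping is being built, rather than passing the raw `(key, value)` pairs straight to the mapping constructor. The sketch below reproduces that pattern with a plain dict standing in for `Series.items()`. `to_native` and `items_to_mapping` are hypothetical helpers, not pandas functions, and `to_native` relies on NumPy's `.item()` rather than pandas' dtype checks. JSON serialization is shown as one common situation where native types matter.

```python
import json
from collections import OrderedDict

import numpy as np


def to_native(value):
    # Hypothetical helper (not pandas' maybe_box_native): NumPy scalars expose
    # .item(), which returns the corresponding built-in Python object.
    return value.item() if isinstance(value, np.generic) else value


def items_to_mapping(items, into=dict):
    # Same shape as the changed line: convert each value while constructing
    # the target mapping from (key, value) pairs.
    return into((k, to_native(v)) for k, v in items)


raw = {"a": np.int64(64), "b": 10, "c": "ABC"}
boxed = items_to_mapping(raw.items(), into=OrderedDict)

try:
    json.dumps(raw)
except TypeError as exc:
    # The standard json encoder does not know how to encode np.int64.
    print("raw mapping is not JSON serializable:", exc)

print(json.dumps(boxed))
```

Passing a generator to the mapping constructor keeps the `into` argument working for any mapping type that accepts an iterable of pairs.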
55 changes: 33 additions & 22 deletions pandas/tests/frame/methods/test_to_dict.py
@@ -256,31 +256,42 @@ def test_to_dict_wide(self):
         expected = {f"A_{i:d}": i for i in range(256)}
         assert result == expected

-    def test_to_dict_orient_dtype(self):
-        # GH22620 & GH21256
-
-        df = DataFrame(
-            {
-                "bool": [True, True, False],
-                "datetime": [
+    @pytest.mark.parametrize(
+        "data,dtype",
+        (
+            ([True, True, False], bool),
+            [
+                [
                     datetime(2018, 1, 1),
                     datetime(2019, 2, 2),
                     datetime(2020, 3, 3),
                 ],
-                "float": [1.0, 2.0, 3.0],
-                "int": [1, 2, 3],
-                "str": ["X", "Y", "Z"],
-            }
-        )
+                Timestamp,
+            ],
+            [[1.0, 2.0, 3.0], float],
+            [[1, 2, 3], int],
+            [["X", "Y", "Z"], str],
+        ),
+    )
+    def test_to_dict_orient_dtype(self, data, dtype):
+        # GH22620 & GH21256

-        expected = {
-            "int": int,
-            "float": float,
-            "str": str,
-            "datetime": Timestamp,
-            "bool": bool,
-        }
+        df = DataFrame({"a": data})
+        d = df.to_dict(orient="records")
+        assert all(type(record["a"]) is dtype for record in d)
+
+    @pytest.mark.parametrize(
+        "data,dtype",
+        (
+            [np.int64(9), int],
Contributor: add unsigned int as well

Member Author: Done

+            [np.float64(1.1), float],
+            [np.bool_(True), bool],
+            [np.datetime64("2005-02-25"), Timestamp],
+        ),
+    )
+    def test_to_dict_scalar_constructor_orient_dtype(self, data, dtype):
+        # GH22620 & GH21256

-        for df_dict in df.to_dict("records"):
-            result = {col: type(df_dict[col]) for col in list(df.columns)}
-            assert result == expected
+        df = DataFrame({"a": data}, index=[0])
+        d = df.to_dict(orient="records")
+        assert type(d[0]["a"]) is dtype
Contributor: can you break this out a bit
result = type(d[0]['a'])

Member Author: Done
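
The unsigned-integer cases requested in the review can be checked at the NumPy level without pandas. This is a sketch under the assumption that the helper ultimately calls `int()` on integer scalars, as the diff at this stage of the PR does; the specific values are illustrative and do not come from the PR's test suite.

```python
import numpy as np

# Unsigned and signed NumPy integer scalars, including the largest uint64,
# which does not fit in a signed 64-bit integer.
cases = [
    np.uint8(255),
    np.uint16(65535),
    np.uint32(2**32 - 1),
    np.uint64(2**64 - 1),
    np.int64(-9),
]

# Python ints have arbitrary precision, so every value converts exactly.
converted = [int(v) for v in cases]
types = {type(v) for v in converted}
```

Here `types` contains only `int`, and the `uint64` maximum survives the conversion unchanged because Python integers are not limited to 64 bits.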

12 changes: 12 additions & 0 deletions pandas/tests/series/methods/test_to_dict.py
@@ -1,5 +1,6 @@
 import collections

+import numpy as np
 import pytest

 from pandas import Series
@@ -20,3 +21,14 @@ def test_to_dict(self, mapping, datetime_series):
         from_method = Series(datetime_series.to_dict(collections.Counter))
         from_constructor = Series(collections.Counter(datetime_series.items()))
         tm.assert_series_equal(from_method, from_constructor)
+
+    @pytest.mark.parametrize(
+        "input",
+        ({"a": np.int64(64), "b": 10}, {"a": np.int64(64), "b": 10, "c": "ABC"}),
+    )
+    def test_to_dict_return_types(self, input):
+        # GH25969
+
+        d = Series(input).to_dict()
+        assert isinstance(d["a"], int)
+        assert isinstance(d["b"], int)
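
The behaviour this test pins down can also be tried interactively. The sketch below assumes a pandas release that includes this change (the commit log mentions a move to 1.3) and reuses the mixed-type input from the parametrization, which produces an object-dtype Series. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Mixed int/str values force object dtype, so the NumPy scalar may be
# stored in the Series as-is rather than being converted on construction.
s = pd.Series({"a": np.int64(64), "b": 10, "c": "ABC"})

d = s.to_dict()

# With the fix applied, every integer value should be a built-in int.
value_types = {key: type(value) for key, value in d.items()}
```

Comparing with `type(...) is int` rather than `isinstance` makes the check exact for each value, at the cost of rejecting legitimate `int` subclasses.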