REGR: Series[dt64/td64].astype(string) #41958

Merged: 4 commits, Jun 15, 2021
8 changes: 8 additions & 0 deletions pandas/_libs/lib.pyx
@@ -716,6 +716,14 @@ cpdef ndarray[object] ensure_string_array(
         Py_ssize_t i = 0, n = len(arr)

     if hasattr(arr, "to_numpy"):

Contributor: we don't have access to the ExtensionTypes here, right? so this is just as easy

Member Author: right

+
+        if hasattr(arr, "dtype") and arr.dtype.kind in ["m", "M"]:

Contributor: agreed, yeah, the .to_numpy() on an EA is appropriate here.

+            # dtype check to exclude DataFrame
+            # GH#41409 TODO: not a great place for this
+            out = arr.astype(str).astype(object)
+            out[arr.isna()] = na_value
+            return out
+
         arr = arr.to_numpy()
     elif not isinstance(arr, np.ndarray):
         arr = np.array(arr, dtype="object")
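
The added branch formats the values with the array's own `astype(str)` and then writes `na_value` back over the missing positions. A minimal stdlib-only sketch of that format-then-mask pattern follows. It is an illustration, not the pandas implementation: `to_string_array` is a hypothetical helper, `None` stands in for `NaT`, and `datetime.timedelta` formatting differs from pandas' `Timedelta` formatting.

```python
from datetime import timedelta


def to_string_array(values, na_value=None):
    """Hypothetical helper: stringify each element, then restore missing slots."""
    # Step 1: format every element; missing entries become the string "None".
    out = [str(v) for v in values]
    # Step 2: compute the missing mask from the original values, not the strings.
    mask = [v is None for v in values]
    # Step 3: overwrite the placeholder strings with na_value where the input was missing.
    return [na_value if missing else s for s, missing in zip(out, mask)]


deltas = [timedelta(days=1), None, timedelta(days=3)]
result = to_string_array(deltas)
print(result)
```

The mask has to come from the original values because, after step 1, a missing entry is indistinguishable from a genuine string that happens to read "None".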
17 changes: 11 additions & 6 deletions pandas/tests/frame/methods/test_astype.py
@@ -632,13 +632,9 @@ def test_astype_tz_object_conversion(self, tz):
         result = result.astype({"tz": "datetime64[ns, Europe/London]"})
         tm.assert_frame_equal(result, expected)

-    def test_astype_dt64_to_string(self, frame_or_series, tz_naive_fixture, request):
+    def test_astype_dt64_to_string(self, frame_or_series, tz_naive_fixture):
+        # GH#41409
         tz = tz_naive_fixture
-        if tz is None:
-            mark = pytest.mark.xfail(
-                reason="GH#36153 uses ndarray formatting instead of DTA formatting"
-            )
-            request.node.add_marker(mark)

         dti = date_range("2016-01-01", periods=3, tz=tz)
         dta = dti._data
@@ -660,6 +656,15 @@ def test_astype_dt64_to_string(self, frame_or_series, tz_naive_fixture, request)
         alt = obj.astype(str)
         assert np.all(alt.iloc[1:] == result.iloc[1:])

+    def test_astype_td64_to_string(self, frame_or_series):

Contributor: can you parametrize over td, dt, dt w/tz & period?

Member Author: not nicely; the dt64 one uses tz_naive_fixture

Contributor: ok, can you add that as another test then? just to make sure we're covering the bases.

Member Author: add what as another case? we have the td64 case here and the dt64 case right above here

Contributor: ahh ok.

is period covered? if not, please add (follow-up ok)

+        # GH#41409
+        tdi = pd.timedelta_range("1 Day", periods=3)
+        obj = frame_or_series(tdi)
+
+        expected = frame_or_series(["1 days", "2 days", "3 days"], dtype="string")
+        result = obj.astype("string")
+        tm.assert_equal(result, expected)
+
     def test_astype_bytes(self):
         # GH#39474
         result = DataFrame(["foo", "bar", "baz"]).astype(bytes)