DEPR: undo deprecation of astype(int64) for datetimelike values #45449

Merged: 2 commits merged on Jan 20, 2022
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.3.0.rst
@@ -811,7 +811,7 @@ Other Deprecations
- Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
- Deprecated constructing :class:`CategoricalIndex` without passing list-like data (:issue:`38944`)
- Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`)
- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`). This deprecation was later reverted in pandas 1.4.0.
- Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth`, use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
- Deprecated keyword ``try_cast`` in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
- Deprecated comparison of :class:`Timestamp` objects with ``datetime.date`` objects. Instead of e.g. ``ts <= mydate`` use ``ts <= pd.Timestamp(mydate)`` or ``ts.date() <= mydate`` (:issue:`36131`)
8 changes: 0 additions & 8 deletions pandas/core/arrays/datetimelike.py
@@ -430,14 +430,6 @@ def astype(self, dtype, copy: bool = True):
elif is_integer_dtype(dtype):
# we deliberately ignore int32 vs. int64 here.
# See https://github.com/pandas-dev/pandas/issues/24381 for more.
Member

I'm OK with un-deprecating for the time being for i8. Any objection to maintaining the deprecation for integer dtypes other than i8?

Member Author

Other integer dtypes should IMO certainly use astype instead of view? (E.g. view(int32) doesn't make much sense, since the original dtype is 64-bit.)

Member

I'm saying if the user does arr.astype(np.uint32) it doesn't make sense to give them arr.view(np.int64). i.e. we should keep the deprecation for dtypes other than np.int64.

Member Author

And what I say is that it makes even less sense to ask them to do .view(np.int32) instead of giving them .view(np.int64) ;) (which will actually raise).

I understand that we are ignoring the bit-width here, and that is something we should fix. But I think that is unrelated to the "use view instead" deprecation (I also don't know if that is something we should deprecate first, or simply change).

Member Author

Actually, we might be ignoring bit-width here, but that gets caught elsewhere: astyping to something other than int64/uint64 already raises an error at the moment (see also the table in #45034 (comment)).

At least, that gets caught at the Series level. As you mentioned in that issue, the cast is allowed (bit width is ignored) at the array level.

Member

> And what I say is that it makes even less sense to ask them to do .view(np.int32) instead of giving them .view(np.int64) ;) (which will actually raise).

Huh? If a user does dta.astype(np.uint32), I'm not advocating returning dta.view(np.int32); I'm advocating raising (or at least deprecating now and raising in the future).

> Actually, we might be ignoring bit-width here, but that gets caught elsewhere: astyping to something other than int64/uint64 already raises an error at the moment (see also the table in #45034 (comment)).

Yeah, we should try to handle these all symmetrically.

Member Author

> Huh? If a user does dta.astype(np.uint32), I'm not advocating returning dta.view(np.int32); I'm advocating raising (or at least deprecating now and raising in the future).

I know that you are not advocating for that, but it is what the current deprecation warning implies the user should do (as it says to use view instead of astype).
You asked to "maintain the deprecation" for those cases, and what I am trying to say is that we should indeed fix those cases (in some way, with a different deprecation or a breaking change, depending on the future behaviour, to be discussed), but IMO not by maintaining the current deprecation.
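
The bit-width point in this thread can be illustrated at the NumPy level. The following is a minimal sketch, not pandas code: it assumes a plain `numpy` array of `datetime64[ns]` values (the two dates are made up for illustration). A same-width `view("int64")` keeps one integer per datetime, while `view("int32")` splits every 64-bit value into two halves. pandas objects may behave differently; the thread reports that the pandas equivalent raises.

```python
import numpy as np

# Two nanosecond-resolution datetimes; each is stored as a 64-bit integer.
values = np.array(["2000-01-01", "2001-01-01"], dtype="datetime64[ns]")

# Same bit width: one int64 per datetime (nanoseconds since the Unix epoch).
as_int64 = values.view("int64")

# Half the bit width: each 8-byte value is reinterpreted as two int32 halves,
# so the result has twice as many elements and no longer maps 1:1 to datetimes.
as_int32 = values.view("int32")

print(as_int64.shape, as_int32.shape)
```

This is why "use `.view(...)` instead" is only a sensible suggestion for 64-bit targets: a view reinterprets bytes and never converts values.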

warnings.warn(
f"casting {self.dtype} values to int64 with .astype(...) is "
"deprecated and will raise in a future version. "
"Use .view(...) instead.",
FutureWarning,
stacklevel=find_stack_level(),
)

values = self.asi8

if is_unsigned_integer_dtype(dtype):
14 changes: 0 additions & 14 deletions pandas/core/dtypes/astype.py
@@ -112,13 +112,6 @@ def astype_nansafe(

elif is_datetime64_dtype(arr.dtype):
if dtype == np.int64:
warnings.warn(
f"casting {arr.dtype} values to int64 with .astype(...) "
"is deprecated and will raise in a future version. "
"Use .view(...) instead.",
FutureWarning,
stacklevel=find_stack_level(),
)
if isna(arr).any():
raise ValueError("Cannot convert NaT values to integer")
return arr.view(dtype)
@@ -131,13 +124,6 @@ def astype_nansafe(

elif is_timedelta64_dtype(arr.dtype):
if dtype == np.int64:
warnings.warn(
f"casting {arr.dtype} values to int64 with .astype(...) "
"is deprecated and will raise in a future version. "
"Use .view(...) instead.",
FutureWarning,
stacklevel=find_stack_level(),
)
if isna(arr).any():
raise ValueError("Cannot convert NaT values to integer")
return arr.view(dtype)
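
What remains in both branches after the warning is removed is a NaT check followed by a reinterpreting view. A rough standalone sketch of that logic, assuming plain NumPy input, could look like this. The helper name `datetimelike_to_int64` is hypothetical and `np.isnat` stands in for pandas' `isna`; this is not the pandas implementation.

```python
import numpy as np


def datetimelike_to_int64(arr: np.ndarray) -> np.ndarray:
    """Hypothetical helper: reinterpret datetime64/timedelta64 data as int64."""
    if arr.dtype.kind not in "Mm":
        raise TypeError(f"expected datetime64 or timedelta64 data, got {arr.dtype}")
    # NaT has no meaningful integer value, so refuse to convert it.
    if np.isnat(arr).any():
        raise ValueError("Cannot convert NaT values to integer")
    # Both dtypes are 8 bytes wide, so this is a zero-copy reinterpretation.
    return arr.view("int64")


one_second = np.array([1], dtype="timedelta64[s]").astype("timedelta64[ns]")
result = datetimelike_to_int64(one_second)

try:
    datetimelike_to_int64(np.array(["NaT"], dtype="datetime64[ns]"))
    nat_error = None
except ValueError as exc:
    nat_error = str(exc)
```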
16 changes: 4 additions & 12 deletions pandas/tests/arrays/period/test_astype.py
@@ -13,36 +13,28 @@ def test_astype(dtype):
# We choose to ignore the sign and size of integers for
# Period/Datetime/Timedelta astype
arr = period_array(["2000", "2001", None], freq="D")
with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
result = arr.astype(dtype)
result = arr.astype(dtype)

if np.dtype(dtype).kind == "u":
expected_dtype = np.dtype("uint64")
else:
expected_dtype = np.dtype("int64")

with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
expected = arr.astype(expected_dtype)
expected = arr.astype(expected_dtype)

assert result.dtype == expected_dtype
tm.assert_numpy_array_equal(result, expected)


def test_astype_copies():
arr = period_array(["2000", "2001", None], freq="D")
with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
result = arr.astype(np.int64, copy=False)
result = arr.astype(np.int64, copy=False)

# Add the `.base`, since we now use `.asi8` which returns a view.
# We could maybe override it in PeriodArray to return ._data directly.
assert result.base is arr._data

with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
result = arr.astype(np.int64, copy=True)
result = arr.astype(np.int64, copy=True)
assert result is not arr._data
tm.assert_numpy_array_equal(result, arr._data.view("i8"))
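
The test changes in this PR all follow one pattern: a call that used to be wrapped in `tm.assert_produces_warning(FutureWarning)` is now made directly. As a rough illustration of what that difference means, here is a sketch using only the standard library. The functions `legacy_cast`, `current_cast`, and `emitted_categories` are hypothetical stand-ins and not pandas APIs; `warnings.catch_warnings(record=True)` is used as a simplified substitute for the pandas testing helper.

```python
import warnings


def legacy_cast(value):
    """Hypothetical function that still emits the deprecation warning."""
    warnings.warn(
        "casting with .astype(...) is deprecated. Use .view(...) instead.",
        FutureWarning,
        stacklevel=2,
    )
    return int(value)


def current_cast(value):
    """Hypothetical function after the deprecation has been reverted."""
    return int(value)


def emitted_categories(func, *args):
    """Call func and return the categories of all warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        func(*args)
    return [w.category for w in caught]


legacy = emitted_categories(legacy_cast, 1)    # one FutureWarning recorded
current = emitted_categories(current_cast, 1)  # nothing recorded
```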

9 changes: 2 additions & 7 deletions pandas/tests/arrays/test_datetimes.py
@@ -77,18 +77,13 @@ def test_astype_copies(self, dtype, other):
@pytest.mark.parametrize("dtype", [int, np.int32, np.int64, "uint32", "uint64"])
def test_astype_int(self, dtype):
arr = DatetimeArray._from_sequence([pd.Timestamp("2000"), pd.Timestamp("2001")])
with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
result = arr.astype(dtype)
result = arr.astype(dtype)

if np.dtype(dtype).kind == "u":
expected_dtype = np.dtype("uint64")
else:
expected_dtype = np.dtype("int64")

with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
expected = arr.astype(expected_dtype)
expected = arr.astype(expected_dtype)

assert result.dtype == expected_dtype
tm.assert_numpy_array_equal(result, expected)
9 changes: 2 additions & 7 deletions pandas/tests/arrays/test_timedeltas.py
@@ -11,18 +11,13 @@ class TestTimedeltaArray:
@pytest.mark.parametrize("dtype", [int, np.int32, np.int64, "uint32", "uint64"])
def test_astype_int(self, dtype):
arr = TimedeltaArray._from_sequence([Timedelta("1H"), Timedelta("2H")])
with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
result = arr.astype(dtype)
result = arr.astype(dtype)

if np.dtype(dtype).kind == "u":
expected_dtype = np.dtype("uint64")
else:
expected_dtype = np.dtype("int64")

with tm.assert_produces_warning(FutureWarning):
# astype(int..) deprecated
expected = arr.astype(expected_dtype)
expected = arr.astype(expected_dtype)

assert result.dtype == expected_dtype
tm.assert_numpy_array_equal(result, expected)
4 changes: 1 addition & 3 deletions pandas/tests/dtypes/test_common.py
@@ -743,9 +743,7 @@ def test_astype_nansafe(val, typ):

msg = "Cannot convert NaT values to integer"
with pytest.raises(ValueError, match=msg):
with tm.assert_produces_warning(FutureWarning):
# datetimelike astype(int64) deprecated
astype_nansafe(arr, dtype=typ)
astype_nansafe(arr, dtype=typ)


def test_astype_nansafe_copy_false(any_int_numpy_dtype):
11 changes: 4 additions & 7 deletions pandas/tests/indexes/datetimes/methods/test_astype.py
@@ -32,8 +32,7 @@ def test_astype(self):
)
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = idx.astype(int)
result = idx.astype(int)
expected = Int64Index(
[1463356800000000000] + [-9223372036854775808] * 3,
dtype=np.int64,
@@ -42,8 +41,7 @@ def test_astype(self):
tm.assert_index_equal(result, expected)

rng = date_range("1/1/2000", periods=10, name="idx")
with tm.assert_produces_warning(FutureWarning):
result = rng.astype("i8")
result = rng.astype("i8")
tm.assert_index_equal(result, Index(rng.asi8, name="idx"))
tm.assert_numpy_array_equal(result.values, rng.asi8)

@@ -53,9 +51,8 @@ def test_astype_uint(self):
np.array([946684800000000000, 946771200000000000], dtype="uint64"),
name="idx",
)
with tm.assert_produces_warning(FutureWarning):
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)

def test_astype_with_tz(self):
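
The integer literals in the tests above are nanoseconds since the Unix epoch, and `-9223372036854775808` is the smallest int64, which is the bit pattern used for `NaT`. The sketch below reproduces them with plain NumPy. The input dates are an assumption chosen to match the expected values in `test_astype_uint`; the test's actual input is outside the visible hunk.

```python
import numpy as np

# Assumed inputs: two consecutive days plus a missing value.
stamps = np.array(["2000-01-01", "2000-01-02", "NaT"], dtype="datetime64[ns]")

# Reinterpret the 64-bit storage as integers (nanoseconds since 1970-01-01).
as_ints = stamps.view("int64").tolist()

# NaT is stored as the minimum int64 value.
nat_sentinel = int(np.iinfo(np.int64).min)

print(as_ints)
print(nat_sentinel)
```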

11 changes: 4 additions & 7 deletions pandas/tests/indexes/interval/test_astype.py
@@ -205,13 +205,10 @@ def index(self, request):
@pytest.mark.parametrize("subtype", ["int64", "uint64"])
def test_subtype_integer(self, index, subtype):
dtype = IntervalDtype(subtype, "right")
with tm.assert_produces_warning(FutureWarning):
result = index.astype(dtype)
expected = IntervalIndex.from_arrays(
index.left.astype(subtype),
index.right.astype(subtype),
closed=index.closed,
)
result = index.astype(dtype)
expected = IntervalIndex.from_arrays(
index.left.astype(subtype), index.right.astype(subtype), closed=index.closed
)
tm.assert_index_equal(result, expected)

def test_subtype_float(self, index):
12 changes: 2 additions & 10 deletions pandas/tests/indexes/interval/test_constructors.py
@@ -74,21 +74,13 @@ def test_constructor(self, constructor, breaks, closed, name):
)
def test_constructor_dtype(self, constructor, breaks, subtype):
# GH 19262: conversion via dtype parameter
warn = None
if subtype == "int64" and breaks.dtype.kind in ["M", "m"]:
# astype(int64) deprecated
warn = FutureWarning

with tm.assert_produces_warning(warn):
expected_kwargs = self.get_kwargs_from_breaks(breaks.astype(subtype))
expected_kwargs = self.get_kwargs_from_breaks(breaks.astype(subtype))
expected = constructor(**expected_kwargs)

result_kwargs = self.get_kwargs_from_breaks(breaks)
iv_dtype = IntervalDtype(subtype, "right")
for dtype in (iv_dtype, str(iv_dtype)):
with tm.assert_produces_warning(warn):

result = constructor(dtype=dtype, **result_kwargs)
result = constructor(dtype=dtype, **result_kwargs)
tm.assert_index_equal(result, expected)

@pytest.mark.parametrize(
11 changes: 4 additions & 7 deletions pandas/tests/indexes/period/methods/test_astype.py
@@ -39,8 +39,7 @@ def test_astype_conversion(self):
)
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = idx.astype(np.int64)
result = idx.astype(np.int64)
expected = Int64Index(
[16937] + [-9223372036854775808] * 3, dtype=np.int64, name="idx"
)
@@ -51,17 +50,15 @@ def test_astype_conversion(self):
tm.assert_index_equal(result, expected)

idx = period_range("1990", "2009", freq="A", name="idx")
with tm.assert_produces_warning(FutureWarning):
result = idx.astype("i8")
result = idx.astype("i8")
tm.assert_index_equal(result, Index(idx.asi8, name="idx"))
tm.assert_numpy_array_equal(result.values, idx.asi8)

def test_astype_uint(self):
arr = period_range("2000", periods=2, name="idx")
expected = UInt64Index(np.array([10957, 10958], dtype="uint64"), name="idx")
with tm.assert_produces_warning(FutureWarning):
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)

def test_astype_object(self):
idx = PeriodIndex([], freq="M")
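
The expected values `10957` and `10958` in `test_astype_uint` are consistent with period ordinals counted in days since the Unix epoch. A small standard-library check, assuming `period_range("2000", periods=2)` yields daily periods starting at 2000-01-01 (the frequency is not spelled out in the test):

```python
from datetime import date

epoch = date(1970, 1, 1)

# Days elapsed since the epoch for the two assumed period start dates.
ordinals = [
    (date(2000, 1, 1) - epoch).days,
    (date(2000, 1, 2) - epoch).days,
]

print(ordinals)
```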
10 changes: 2 additions & 8 deletions pandas/tests/indexes/test_common.py
@@ -10,10 +10,7 @@

from pandas.compat import IS64

from pandas.core.dtypes.common import (
is_integer_dtype,
needs_i8_conversion,
)
from pandas.core.dtypes.common import is_integer_dtype

import pandas as pd
from pandas import (
@@ -383,10 +380,7 @@ def test_astype_preserves_name(self, index, dtype):
index.name = "idx"

warn = None
if dtype in ["int64", "uint64"]:
if needs_i8_conversion(index.dtype):
warn = FutureWarning
elif (
if (
isinstance(index, DatetimeIndex)
and index.tz is not None
and dtype == "datetime64[ns]"
11 changes: 4 additions & 7 deletions pandas/tests/indexes/timedeltas/methods/test_astype.py
@@ -58,8 +58,7 @@ def test_astype(self):
)
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = idx.astype(int)
result = idx.astype(int)
expected = Int64Index(
[100000000000000] + [-9223372036854775808] * 3, dtype=np.int64, name="idx"
)
@@ -70,8 +69,7 @@ def test_astype(self):
tm.assert_index_equal(result, expected)

rng = timedelta_range("1 days", periods=10)
with tm.assert_produces_warning(FutureWarning):
result = rng.astype("i8")
result = rng.astype("i8")
tm.assert_index_equal(result, Index(rng.asi8))
tm.assert_numpy_array_equal(rng.asi8, result.values)

@@ -80,9 +78,8 @@ def test_astype_uint(self):
expected = UInt64Index(
np.array([3600000000000, 90000000000000], dtype="uint64")
)
with tm.assert_produces_warning(FutureWarning):
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)
tm.assert_index_equal(arr.astype("uint64"), expected)
tm.assert_index_equal(arr.astype("uint32"), expected)

def test_astype_timedelta64(self):
# GH 13149, GH 13209
6 changes: 1 addition & 5 deletions pandas/tests/internals/test_internals.py
@@ -540,9 +540,6 @@ def test_astype(self, t):
# coerce all
mgr = create_mgr("c: f4; d: f2; e: f8")

warn = FutureWarning if t == "int64" else None
# datetimelike.astype(int64) deprecated

t = np.dtype(t)
tmgr = mgr.astype(t)
assert tmgr.iget(0).dtype.type == t
@@ -553,8 +550,7 @@ def test_astype(self, t):
mgr = create_mgr("a,b: object; c: bool; d: datetime; e: f4; f: f2; g: f8")

t = np.dtype(t)
with tm.assert_produces_warning(warn):
tmgr = mgr.astype(t, errors="ignore")
tmgr = mgr.astype(t, errors="ignore")
assert tmgr.iget(2).dtype.type == t
assert tmgr.iget(4).dtype.type == t
assert tmgr.iget(5).dtype.type == t
20 changes: 6 additions & 14 deletions pandas/tests/series/test_constructors.py
@@ -968,9 +968,7 @@ def test_constructor_dtype_datetime64_10(self):
dts = Series(dates, dtype="datetime64[ns]")

# valid astype
with tm.assert_produces_warning(FutureWarning):
# astype(np.int64) deprecated
dts.astype("int64")
dts.astype("int64")

# invalid casting
msg = r"cannot astype a datetimelike from \[datetime64\[ns\]\] to \[int32\]"
@@ -980,10 +978,8 @@ def test_constructor_dtype_datetime64_10(self):
# ints are ok
# we test with np.int64 to get similar results on
# windows / 32-bit platforms
with tm.assert_produces_warning(FutureWarning):
# astype(np.int64) deprecated
result = Series(dts, dtype=np.int64)
expected = Series(dts.astype(np.int64))
result = Series(dts, dtype=np.int64)
expected = Series(dts.astype(np.int64))
tm.assert_series_equal(result, expected)

def test_constructor_dtype_datetime64_9(self):
@@ -1494,9 +1490,7 @@ def test_constructor_dtype_timedelta64(self):
assert td.dtype == "timedelta64[ns]"

# valid astype
with tm.assert_produces_warning(FutureWarning):
# astype(int64) deprecated
td.astype("int64")
td.astype("int64")

# invalid casting
msg = r"cannot astype a datetimelike from \[timedelta64\[ns\]\] to \[int32\]"
@@ -1622,10 +1616,8 @@ def test_constructor_cant_cast_datetimelike(self, index):
# ints are ok
# we test with np.int64 to get similar results on
# windows / 32-bit platforms
with tm.assert_produces_warning(FutureWarning):
# asype(np.int64) deprecated, use .view(np.int64) instead
result = Series(index, dtype=np.int64)
expected = Series(index.astype(np.int64))
result = Series(index, dtype=np.int64)
expected = Series(index.astype(np.int64))
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize(