
REGR: resampling DataFrame with DateTimeIndex with empty groups and uint8, uint16 or uint32 columns incorrectly raising RuntimeError #44828

Merged: 1 commit, Dec 10, 2021
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.3.5.rst
@@ -16,6 +16,7 @@ Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed regression in :meth:`Series.equals` when comparing floats with dtype object to None (:issue:`44190`)
 - Fixed regression in :func:`merge_asof` raising error when array was supplied as join key (:issue:`42844`)
+- Fixed regression when resampling :class:`DataFrame` with :class:`DateTimeIndex` with empty groups and ``uint8``, ``uint16`` or ``uint32`` columns incorrectly raising ``RuntimeError`` (:issue:`43329`)
 - Fixed regression in creating a :class:`DataFrame` from a timezone-aware :class:`Timestamp` scalar near a Daylight Savings Time transition (:issue:`42505`)
 - Fixed performance regression in :func:`read_csv` (:issue:`44106`)
 - Fixed regression in :meth:`Series.duplicated` and :meth:`Series.drop_duplicates` when Series has :class:`Categorical` dtype with boolean categories (:issue:`44351`)
7 changes: 4 additions & 3 deletions pandas/core/groupby/ops.py
@@ -500,9 +500,10 @@ def _call_cython_op(
         elif is_bool_dtype(dtype):
             values = values.astype("int64")
         elif is_integer_dtype(dtype):
-            # e.g. uint8 -> uint64, int16 -> int64
-            dtype_str = dtype.kind + "8"
-            values = values.astype(dtype_str, copy=False)
+            # GH#43329 If the dtype is explicitly of type uint64 the type is not
+            # changed to prevent overflow.
+            if dtype != np.uint64:
+                values = values.astype(np.int64, copy=False)
         elif is_numeric:
             if not is_complex_dtype(dtype):
                 values = ensure_float64(values)
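The effect of the casting change in `ops.py` can be sketched with NumPy alone. The helper names `widen_old` and `widen_new` below are hypothetical and exist only for illustration; they mirror the removed and added lines of the hunk, not the surrounding pandas code.

```python
import numpy as np


def widen_old(values):
    # Removed rule: keep the dtype kind and widen to 8 bytes,
    # so "u" + "8" = "u8" (uint64) and "i" + "8" = "i8" (int64).
    return values.astype(values.dtype.kind + "8", copy=False)


def widen_new(values):
    # Added rule: every integer dtype except uint64 is cast to int64;
    # uint64 is left alone because its largest values do not fit in int64.
    if values.dtype != np.uint64:
        return values.astype(np.int64, copy=False)
    return values


small = np.array([0, 255], dtype=np.uint8)
big = np.array([2**64 - 1], dtype=np.uint64)

print(widen_old(small).dtype)  # uint64
print(widen_new(small).dtype)  # int64
print(widen_new(big).dtype)    # uint64

# Forcing uint64 into int64 would wrap large values around on typical
# platforms, which is the overflow the GH#43329 comment refers to.
print(big.astype(np.int64))
```

Under the old rule a `uint8` column reached the aggregation as `uint64`; under the new rule it arrives as `int64`, and only columns that are already `uint64` keep that dtype.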
24 changes: 24 additions & 0 deletions pandas/tests/resample/test_datetime_index.py
@@ -1828,3 +1828,27 @@ def test_resample_aggregate_functions_min_count(func):
         index=DatetimeIndex(["2020-03-31"], dtype="datetime64[ns]", freq="Q-DEC"),
     )
     tm.assert_series_equal(result, expected)
+
+
+def test_resample_unsigned_int(any_unsigned_int_numpy_dtype):
+    # gh-43329
+    df = DataFrame(
+        index=date_range(start="2000-01-01", end="2000-01-03 23", freq="12H"),
+        columns=["x"],
+        data=[0, 1, 0] * 2,
+        dtype=any_unsigned_int_numpy_dtype,
+    )
+    df = df.loc[(df.index < "2000-01-02") | (df.index > "2000-01-03"), :]
+
+    if any_unsigned_int_numpy_dtype == "uint64":
+        with pytest.raises(RuntimeError, match="empty group with uint64_t"):
+            result = df.resample("D").max()
+    else:
+        result = df.resample("D").max()
+
+        expected = DataFrame(
+            [1, np.nan, 0],
+            columns=["x"],
+            index=date_range(start="2000-01-01", end="2000-01-03 23", freq="D"),
+        )
+        tm.assert_frame_equal(result, expected)
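For readers who want to try the scenario outside the test suite, the following is a minimal sketch, assuming a pandas release that includes this fix. The index is built from explicit timestamps rather than `date_range` with a frequency string; that is an illustrative choice and not how the test above constructs its data.

```python
import numpy as np
import pandas as pd

# Three observations with no data at all on 2000-01-02, so daily
# resampling produces one empty group in the middle.
index = pd.to_datetime(
    ["2000-01-01 00:00", "2000-01-01 12:00", "2000-01-03 12:00"]
)
df = pd.DataFrame({"x": np.array([0, 1, 0], dtype="uint8")}, index=index)

# On releases affected by the regression this call raised RuntimeError
# for uint8, uint16 and uint32 columns.
result = df.resample("D").max()
print(result)
```

With the fix applied, the empty day should show up as a missing value instead of an error, which is what the `expected` frame in the test encodes for the non-`uint64` dtypes.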