BUG: Fix bug in quantile() for resample and groupby with Timedelta #37145

Merged: 8 commits, Oct 31, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -462,6 +462,7 @@ Groupby/resample/rolling
- Bug in :meth:`RollingGroupby.count` where a ``ValueError`` was raised when specifying the ``closed`` parameter (:issue:`35869`)
- Bug in :meth:`DataFrame.groupby.rolling` returning wrong values with partial centered window (:issue:`36040`).
- Bug in :meth:`DataFrameGroupBy.rolling` returned wrong values with timeaware window containing ``NaN``. Raises ``ValueError`` because windows are not monotonic now (:issue:`34617`)
- Bug in :meth:`DataFrameGroupBy.quantile` and :meth:`DataFrame.resample().quantile()` raised ``TypeError`` when the values to calculate the quantile over were ``Timedelta`` (:issue:`29485`)

Reshaping
^^^^^^^^^
4 changes: 4 additions & 0 deletions pandas/core/groupby/groupby.py
@@ -60,6 +60,7 @@ class providing the base-class of operations.
    is_numeric_dtype,
    is_object_dtype,
    is_scalar,
    is_timedelta64_dtype,
)
from pandas.core.dtypes.missing import isna, notna

@@ -2151,6 +2152,9 @@ def pre_processor(vals: np.ndarray) -> Tuple[np.ndarray, Optional[Type]]:
            elif is_datetime64_dtype(vals.dtype):
                inference = "datetime64[ns]"
                vals = np.asarray(vals).astype(float)
            elif is_timedelta64_dtype(vals.dtype):
                inference = "timedelta64[ns]"
                vals = np.asarray(vals).astype(float)

            return vals, inference
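
The new branch mirrors the existing datetime64 one: the values are cast to float for the quantile computation and the recorded `inference` dtype is used afterwards to cast the result back. The idea can be sketched outside pandas with NumPy alone. This is only an illustration: `quantile_via_float` is a hypothetical helper, and `np.quantile` stands in for the grouped quantile routine pandas actually uses.

```python
import numpy as np


def quantile_via_float(vals: np.ndarray, q: float) -> np.ndarray:
    """Hypothetical helper: quantile of a timedelta64 array via a float round trip."""
    inference = None
    if np.issubdtype(vals.dtype, np.timedelta64):
        # Remember the original dtype so the result can be cast back later.
        inference = vals.dtype
        # timedelta64[ns] -> float64 holding the nanosecond counts.
        vals = vals.astype(float)

    # Stand-in for the real computation; it only needs to accept floats.
    result = np.asarray(np.quantile(vals, q))

    if inference is not None:
        # float64 nanoseconds -> timedelta64[ns]
        result = result.astype(inference)
    return result


deltas = np.array([0, 1, 2, 3], dtype="timedelta64[s]").astype("timedelta64[ns]")
median = quantile_via_float(deltas, 0.5)
```

Here `median` is a zero-dimensional `timedelta64[ns]` array equal to 1.5 seconds. One consequence of routing through float64 is that nanosecond counts above 2**53 (roughly 104 days) can no longer be represented exactly.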

18 changes: 18 additions & 0 deletions pandas/tests/groupby/test_quantile.py
@@ -240,3 +240,21 @@ def test_groupby_quantile_skips_invalid_dtype(q):
    result = df.groupby("a").quantile(q)
    expected = df.groupby("a")[["b"]].quantile(q)
    tm.assert_frame_equal(result, expected)


def test_groupby_timedelta_quantile():
    # GH: 29485
    df = pd.DataFrame(
        {"value": pd.to_timedelta(np.arange(4), unit="s"), "group": [1, 1, 2, 2]}
    )
    result = df.groupby("group").quantile(0.99)
    expected = pd.DataFrame(
        {
            "value": [
                pd.Timedelta("0 days 00:00:00.990000"),
                pd.Timedelta("0 days 00:00:02.990000"),
            ]
        },
        index=pd.Index([1, 2], name="group"),
    )
    tm.assert_frame_equal(result, expected)
19 changes: 19 additions & 0 deletions pandas/tests/resample/test_timedelta.py
@@ -165,3 +165,22 @@ def test_resample_with_timedelta_yields_no_empty_groups():
        index=pd.timedelta_range(start="1s", periods=13, freq="3s"),
    )
    tm.assert_frame_equal(result, expected)


def test_resample_quantile_timedelta():
    # GH: 29485
    df = pd.DataFrame(
        {"value": pd.to_timedelta(np.arange(4), unit="s")},
        index=pd.date_range("20200101", periods=4, tz="UTC"),
    )
    result = df.resample("2D").quantile(0.99)
    expected = pd.DataFrame(
        {
            "value": [
                pd.Timedelta("0 days 00:00:00.990000"),
                pd.Timedelta("0 days 00:00:02.990000"),
            ]
        },
        index=pd.date_range("20200101", periods=2, tz="UTC", freq="2D"),
    )
    tm.assert_frame_equal(result, expected)
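
For reference, the user-facing behaviour covered by the two tests can be reproduced with a short standalone script. This is a sketch that assumes a pandas version containing this fix (1.2.0 or later); the values and the 0.5 quantile are illustrative choices, picked so the results are whole seconds, and are not taken from the PR.

```python
import pandas as pd

# Four Timedelta values split into two groups of two.
df = pd.DataFrame(
    {
        "value": pd.to_timedelta([0, 2, 4, 6], unit="s"),
        "group": [1, 1, 2, 2],
    }
)
# Raised TypeError before the fix; now returns a Timedelta per group.
grouped = df.groupby("group").quantile(0.5)

# The same values on a daily DatetimeIndex, resampled into 2-day bins.
ts = pd.DataFrame(
    {"value": pd.to_timedelta([0, 2, 4, 6], unit="s")},
    index=pd.date_range("20200101", periods=4, tz="UTC"),
)
resampled = ts.resample("2D").quantile(0.5)
```

Both `grouped["value"]` and `resampled["value"]` hold the medians of the pairs (0 s, 2 s) and (4 s, 6 s), that is, 1 second and 5 seconds.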