Commit 5fa85e9

jbandlow authored and harisbal committed
BUG: Fix ts precision issue with groupby and NaT (pandas-dev#19526)
closes pandas-dev#19526

Author: Jason Bandlow <[email protected]>

Closes pandas-dev#19530 from jbandlow/timestamp_float_conversion and squashes the following commits:

2fb23d6 [Jason Bandlow] merge
af37225 [Jason Bandlow] BUG: Fix ts precision issue with groupby and NaT (pandas-dev#19526)
1 parent 25c2f08 commit 5fa85e9

3 files changed, +20 -2 lines changed

doc/source/whatsnew/v0.23.0.txt

+1
@@ -644,6 +644,7 @@ Groupby/Resample/Rolling
 - Fixed regression in :func:`DataFrame.groupby` which would not emit an error when called with a tuple key not in the index (:issue:`18798`)
 - Bug in :func:`DataFrame.resample` which silently ignored unsupported (or mistyped) options for ``label``, ``closed`` and ``convention`` (:issue:`19303`)
 - Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
+- Bug in :func:`DataFrame.groupby` where aggregation by ``first``/``last``/``min``/``max`` was causing timestamps to lose precision (:issue:`19526`)
 - Bug in :func:`DataFrame.transform` where particular aggregation functions were being incorrectly cast to match the dtype(s) of the grouped data (:issue:`19200`)
 - Bug in :func:`DataFrame.groupby` passing the `on=` kwarg, and subsequently using ``.apply()`` (:issue:`17813`)

pandas/core/groupby.py

+1 -1
@@ -2336,7 +2336,7 @@ def _cython_operation(self, kind, values, how, axis, min_count=-1):
             result = self._transform(
                 result, values, labels, func, is_numeric, is_datetimelike)
 
-        if is_integer_dtype(result):
+        if is_integer_dtype(result) and not is_datetimelike:
             mask = result == iNaT
             if mask.any():
                 result = result.astype('float64')
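
Why the extra `not is_datetimelike` check matters can be illustrated without pandas. The sketch below is a rough stand-in for the pre-fix branch, not the library's implementation: `to_epoch_ns` and `cast_if_missing` are hypothetical helpers, and `INAT` is assumed to be the int64 minimum that pandas uses as its NaT sentinel. It shows that a nanosecond timestamp stored as an integer does not survive a trip through float64 once a NaT in the same result triggers the cast.

```python
from datetime import datetime

# Assumption: the int64 minimum, used here as the NaT sentinel.
INAT = -2**63


def to_epoch_ns(dt):
    """Return whole nanoseconds since 1970-01-01 for a naive datetime."""
    delta = dt - datetime(1970, 1, 1)
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 10**9 + delta.microseconds * 1_000


def cast_if_missing(values):
    """Hypothetical stand-in for the pre-fix branch: when any value equals
    the sentinel, convert everything to float and mark the sentinel as NaN."""
    if any(v == INAT for v in values):
        return [float("nan") if v == INAT else float(v) for v in values]
    return values


# The timestamp from the new test case, as integer nanoseconds.
ts_ns = to_epoch_ns(datetime(2016, 10, 14, 21, 0, 44, 557000))

# One real timestamp and one missing value, as in the test frame.
converted = cast_if_missing([ts_ns, INAT])
round_trip = int(converted[0])

# float64 has a 53-bit significand, so an integer of this size cannot be
# represented exactly and the round trip no longer matches the input.
print(ts_ns, round_trip, round_trip - ts_ns)
```

Skipping the float cast for datetime-like results, as the one-line change does, keeps the values as integers so that the NaT sentinel and the full nanosecond precision are both preserved.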

pandas/tests/groupby/aggregate/test_cython.py

+18 -1
@@ -12,7 +12,8 @@
 from numpy import nan
 import pandas as pd
 
-from pandas import bdate_range, DataFrame, Index, Series
+from pandas import (bdate_range, DataFrame, Index, Series, Timestamp,
+                    Timedelta, NaT)
 from pandas.core.groupby import DataError
 import pandas.util.testing as tm

@@ -187,3 +188,19 @@ def test_cython_agg_empty_buckets_nanops():
         {"a": [1, 1, 1716, 1]},
         index=pd.CategoricalIndex(intervals, name='a', ordered=True))
     tm.assert_frame_equal(result, expected)
+
+
+@pytest.mark.parametrize('op', ['first', 'last', 'max', 'min'])
+@pytest.mark.parametrize('data', [
+    Timestamp('2016-10-14 21:00:44.557'),
+    Timedelta('17088 days 21:00:44.557'), ])
+def test_cython_with_timestamp_and_nat(op, data):
+    # https://github.com/pandas-dev/pandas/issues/19526
+    df = DataFrame({'a': [0, 1], 'b': [data, NaT]})
+    index = Index([0, 1], name='a')
+
+    # We will group by a and test the cython aggregations
+    expected = DataFrame({'b': [data, NaT]}, index=index)
+
+    result = df.groupby('a').aggregate(op)
+    tm.assert_frame_equal(expected, result)
