Commit 0d36edf

Fix accidental loss-of-precision for to_datetime(str, unit=...)
In pandas 1.5.3, the `float(val)` cast was inline in the `cast_from_unit` call in `array_with_unit_to_datetime`, so the intermediate (unnamed) value was a Python float, which is 64-bit.

pandas-dev#50301 added a temporary variable to avoid multiple casts, but declared it with the explicit type `cdef float`, which defines a _Cython_ float. That type is 32-bit, so the value lost precision, causing a parsing regression relative to 1.5.3.

Widen the explicit type of the temporary `fval` variable to the 64-bit `double`, which restores the precision of the 1.5.3 behavior.

Fixes pandas-dev#57051
1 parent 4ed67ac commit 0d36edf
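
To make the precision loss described in the commit message concrete, here is a minimal sketch that uses only the Python standard library. It round-trips the epoch value from the regression test through a 32-bit IEEE 754 float, which is assumed here to behave like the old `float fval` declaration. The helper name `round_to_float32` is hypothetical and not part of pandas.

```python
import struct


def round_to_float32(value: float) -> float:
    """Round-trip a 64-bit Python float through a 32-bit IEEE 754 float.

    Hypothetical helper for illustration; it stands in for what a
    Cython `float` variable would store.
    """
    return struct.unpack("<f", struct.pack("<f", value))[0]


epoch_str = "1704660000"  # same value as in the regression test

as_double = float(epoch_str)             # analogous to `double fval`
as_single = round_to_float32(as_double)  # analogous to `float fval`

print(as_double)              # 1704660000.0
print(as_single)              # 1704659968.0
print(as_double - as_single)  # 32.0 (seconds of drift)
```

Near 1.7e9 a 32-bit float can only represent multiples of 128, so the timestamp is rounded to the nearest such multiple and ends up 32 seconds early. A 64-bit double represents every integer in this range exactly.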

3 files changed: +10 -1 lines changed


doc/source/whatsnew/v2.2.1.rst

+1
@@ -50,6 +50,7 @@ Fixed regressions
 - Fixed regression in :meth:`Series.pct_change` raising a ``ValueError`` for an empty :class:`Series` (:issue:`57056`)
 - Fixed regression in :meth:`Series.to_numpy` when dtype is given as float and the data contains NaNs (:issue:`57121`)
 - Fixed regression in addition or subtraction of :class:`DateOffset` objects with millisecond components to ``datetime64`` :class:`Index`, :class:`Series`, or :class:`DataFrame` (:issue:`57529`)
+- Fixed regression in precision of :func:`to_datetime` with string and ``unit`` input (:issue:`57051`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_221.bug_fixes:

pandas/_libs/tslib.pyx

+1 -1
@@ -275,7 +275,7 @@ def array_with_unit_to_datetime(
         bint is_raise = errors == "raise"
         ndarray[int64_t] iresult
         tzinfo tz = None
-        float fval
+        double fval
 
     assert is_coerce or is_raise
 

pandas/tests/tools/test_to_datetime.py

+8
@@ -1735,6 +1735,14 @@ def test_unit(self, cache):
         with pytest.raises(ValueError, match=msg):
             to_datetime([1], unit="D", format="%Y%m%d", cache=cache)
 
+    def test_unit_str(self, cache):
+        # GH 57051
+        # Test that strs aren't dropping precision to 32-bit accidentally.
+        with tm.assert_produces_warning(FutureWarning):
+            res = pd.to_datetime(["1704660000"], unit="s", origin="unix")
+        expected = pd.to_datetime([1704660000], unit="s", origin="unix")
+        tm.assert_index_equal(res, expected)
+
     def test_unit_array_mixed_nans(self, cache):
         values = [11111111111111111, 1, 1.0, iNaT, NaT, np.nan, "NaT", ""]
 
