REGR: to_timedelta precision issues with floating data #25651
@@ -918,13 +918,9 @@ def sequence_to_td64ns(data, copy=False, unit="ns", errors="raise"):
         copy = copy and not copy_made
 
     elif is_float_dtype(data.dtype):
-        # treat as multiples of the given unit. If after converting to nanos,
-        # there are fractional components left, these are truncated
-        # (i.e. NOT rounded)
-        mask = np.isnan(data)
-        coeff = np.timedelta64(1, unit) / np.timedelta64(1, 'ns')
-        data = (coeff * data).astype(np.int64).view('timedelta64[ns]')
-        data[mask] = iNaT
+        # object_to_td64ns has custom logic for float -> int conversion
+        # to avoid precision issues
+        data = objects_to_td64ns(data, unit=unit, errors=errors)
         copy = False
 
     elif is_timedelta64_dtype(data.dtype):

Review comments on the added `# object_to_td64ns has custom logic ...` lines:

Does this have the same perf?

No, it is slower. But it is more correct (it is written specifically to handle this case), and it is what was used before 0.24.0 anyway. I assume we might be able to port similar logic here to be more performant (not working element by element, since we know we only have floats), but I would personally leave that for 0.25.0.

Well, the prior fix was for performance, and this is a very narrow, minor case, so you are causing a rather large perf regression by changing this.

Pls show asv before / after.

Converting floats to timedelta is a narrow use case anyway. If you care about performance but not about precision, you can convert to integers yourself.
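To make the precision concern concrete, here is a minimal plain-Python sketch of the arithmetic in the removed float branch: multiply by the unit coefficient, then truncate. It is an illustration under stated assumptions, not pandas code. The helper names are hypothetical, `int()` stands in for `.astype(np.int64)` (both truncate toward zero), and the rounding variant is just one way a caller might "convert to integers yourself"; it is not the logic used by `objects_to_td64ns`.

```python
# Illustrative only: plain-Python stand-in for the removed float branch.
# 1000.0 is the value of np.timedelta64(1, "us") / np.timedelta64(1, "ns").
US_TO_NS = 1000.0


def float_to_ns_truncating(value, coeff):
    """Mimic the removed branch: scale to nanoseconds, truncate toward zero."""
    return int(value * coeff)


def float_to_ns_rounding(value, coeff):
    """Hypothetical alternative: scale, then round to the nearest nanosecond."""
    return round(value * coeff)


value_us = 1.001  # intended meaning: 1001 ns

# 1.001 is not exactly representable as a double, and the product
# 1.001 * 1000.0 lands just below 1001, so truncation drops a nanosecond.
print(float_to_ns_truncating(value_us, US_TO_NS))  # 1000
print(float_to_ns_rounding(value_us, US_TO_NS))    # 1001
```

Rounding before the integer cast avoids this particular off-by-one, but it changes the documented truncation semantics, so it is only appropriate when the input is known to be a whole number of nanoseconds up to floating-point error.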
loosing -> losing