-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix Timestamp rounding #21507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Timestamp rounding #21507
Changes from 8 commits
ee8ba61
50986c6
d1c6e6f
03a42b8
a302074
33335b4
0363a72
7353c2f
82c7db1
559bb50
bfe0d16
4249fd7
2abd8e2
3546c00
342fd40
a1111b3
4423dc9
0550ef7
57c8222
8a51d07
7f2f340
56477a0
ea922c6
a37c3d2
9f9b636
c5a11d6
bcc6b8c
402905d
22d65e1
2908346
0e05de4
159756e
70a3d6d
7d664d5
7463576
f504300
07381bb
e2fb27a
f57e0eb
0921273
d2de507
aa5e1f1
676ae59
8ca172d
28780c2
a231fb2
5f44af0
1427b69
2741967
34b77d1
e788e47
fb16555
84533b1
8cbcafd
c53e001
34d74ce
a10216d
4e6a11f
ca9ce8d
6289c76
b19219d
3814d0c
ea205c0
fd8d6bc
f3a89f3
eb47287
c815e0b
91c9ec3
c019f6d
c03ed38
4668dba
07e161d
15b040c
b21efcc
a7d5bb7
8944d35
8280084
5f956ce
19b78ca
303a279
a4798c3
f901431
8794ef2
6eb0341
289f3ad
a65ea90
13a3d5a
afdcac1
9f6c154
c19017b
5e4882e
b3443da
ebeccfc
db233f3
79d982a
367ce07
8db9303
e8f5ede
58a1a08
c6660f6
7555378
45cfa62
d746bee
476717c
001dc78
59286da
dad1252
b3b047e
242ccbc
8cbfcbf
d07e61b
0829063
da3b903
2d0fa8b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,43 +59,58 @@ cdef inline object create_timestamp_from_ts(int64_t value, | |
|
||
|
||
def round_ns(values, rounder, freq): | ||
|
||
""" | ||
Applies rounding function at given frequency | ||
|
||
Parameters | ||
---------- | ||
values : int, :obj:`ndarray` | ||
rounder : function | ||
values : :obj:`ndarray` | ||
rounder : function, eg. 'Ceil', 'Floor', 'round' | ||
freq : str, obj | ||
|
||
Returns | ||
------- | ||
int or :obj:`ndarray` | ||
:obj:`ndarray` | ||
""" | ||
def _round_non_int_multiple(value, unit): | ||
|
||
if unit < 1000: | ||
# for nano rounding, work with the last 6 digits separately | ||
# due to float precision | ||
buff = 1000000 | ||
r = (buff * (value // buff) + unit * | ||
(rounder((value % buff) * (1 / float(unit)))).astype('i8')) | ||
else: | ||
if unit % 1000 != 0: | ||
msg = 'Precision will be lost using frequency: {}' | ||
warnings.warn(msg.format(freq)) | ||
|
||
# GH19206 | ||
# to deal with round-off when unit is large | ||
if unit >= 1e9: | ||
divisor = 10 ** int(np.log10(unit / 1e7)) | ||
else: | ||
divisor = 10 | ||
|
||
r = (unit * rounder((value * (divisor / float(unit))) / divisor) | ||
.astype('i8')) | ||
|
||
return r | ||
|
||
from pandas.tseries.frequencies import to_offset | ||
unit = to_offset(freq).nanos | ||
if unit < 1000: | ||
# for nano rounding, work with the last 6 digits separately | ||
# due to float precision | ||
buff = 1000000 | ||
r = (buff * (values // buff) + unit * | ||
(rounder((values % buff) * (1 / float(unit)))).astype('i8')) | ||
else: | ||
if unit % 1000 != 0: | ||
msg = 'Precision will be lost using frequency: {}' | ||
warnings.warn(msg.format(freq)) | ||
|
||
# GH19206 | ||
# to deal with round-off when unit is large | ||
if unit >= 1e9: | ||
divisor = 10 ** int(np.log10(unit / 1e7)) | ||
else: | ||
divisor = 10 | ||
|
||
r = (unit * rounder((values * (divisor / float(unit))) / divisor) | ||
.astype('i8')) | ||
values = np.copy(values) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is a copy needed here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is to handle the case where 'NaT' exists in the DateTimeIndex. Removing it will cause I found that datetimelike.py There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. cc @jreback. Thanks for the investigation. That sounds reasonable; but I am not too familiar with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you are mutating in place, so you DO need to copy. that's fine. though pls use |
||
|
||
return r | ||
# GH21262 If the Timestamp is multiple of the freq str | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i think it simpler to put this condition inside ththe above function then you don’t even need a function ! iow i think you just needed to add that case above (in the original code) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right yep I see what you are saying, just to confirm before I commit, something like this ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yep! |
||
# don't apply any rounding | ||
mask = values % unit == 0 | ||
if mask.all(): | ||
return values | ||
values[~mask] = _round_non_int_multiple(values[~mask], unit) | ||
|
||
return values | ||
|
||
|
||
# This is PITA. Because we inherit from datetime, which has very specific | ||
|
@@ -649,7 +664,10 @@ class Timestamp(_Timestamp): | |
else: | ||
value = self.value | ||
|
||
r = round_ns(value, rounder, freq) | ||
value = np.array([value], dtype=np.int64) | ||
|
||
# Will only ever contain 1 element for timestamp | ||
r = round_ns(value, rounder, freq)[0] | ||
result = Timestamp(r, unit='ns') | ||
if self.tz is not None: | ||
result = result.tz_localize(self.tz) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Could you lowercase ceil and floor