Skip to content

FIX BUG: Timestamp __add__/__sub__ DateOffset with nanoseconds lost. #43968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 38 commits into from
Dec 17, 2021
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
2208507
fix Timestamp + Dateoffset bugs.
tushushu Oct 11, 2021
ee19a4d
Add doc string and fix UT.
tushushu Oct 16, 2021
3e38ab2
fix UT
tushushu Oct 16, 2021
5718e4f
Trim Trailing Whitespace
tushushu Oct 16, 2021
e1ae149
remove is_dateoffset_obj func
tushushu Oct 17, 2021
326790b
explicitly check the Offset type
tushushu Oct 17, 2021
83453da
fix bugs.
tushushu Oct 17, 2021
ae4b133
fix bugs
tushushu Oct 17, 2021
a43030d
try to fix __sub__ method.
tushushu Oct 17, 2021
22cadbf
fix flake8 check.
tushushu Oct 17, 2021
1c89c03
add tests for dateoffset add method.
tushushu Oct 17, 2021
a2ad274
fix offset init.
tushushu Oct 17, 2021
9bc6e97
fix black code style check.
tushushu Oct 18, 2021
c9ddf14
test sub method.
tushushu Oct 18, 2021
4780316
add an comment.
tushushu Oct 19, 2021
1fecfa6
move the nano handling codes to Offsets.
tushushu Oct 23, 2021
e92c1a9
remove unnecessary codes.
tushushu Oct 23, 2021
006c073
fix bugs
tushushu Oct 24, 2021
ab064b4
more test cases for dateoffset add/sub
tushushu Oct 24, 2021
8e42c64
test radd.
tushushu Oct 24, 2021
5ca3472
code style.
tushushu Oct 25, 2021
37d19b5
remove unused line.
tushushu Oct 30, 2021
5acad37
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 2, 2021
9cda54a
what's new
tushushu Dec 2, 2021
0ee9a6f
remove unnecessary brackets
tushushu Dec 2, 2021
860a62b
what's new.
tushushu Dec 3, 2021
4600a64
makes the comment more detailed.
tushushu Dec 3, 2021
8cf5746
test case as jbrockmendel demand
tushushu Dec 4, 2021
a210baf
trigger the CI without a commit
tushushu Dec 5, 2021
98646cd
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 5, 2021
e899fbe
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 6, 2021
906f546
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 7, 2021
24fe768
split dateoffset test func
tushushu Dec 7, 2021
209ba9e
Merge branch 'fix-bug-dateoffset-nanoseconds' of github.com:tushushu/…
tushushu Dec 7, 2021
f56f52b
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 7, 2021
3f49876
fix typo
tushushu Dec 8, 2021
b7297ef
Merge branch 'fix-bug-dateoffset-nanoseconds' of github.com:tushushu/…
tushushu Dec 8, 2021
5cbae96
Merge branch 'pandas-dev:master' into fix-bug-dateoffset-nanoseconds
tushushu Dec 17, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions pandas/_libs/tslibs/offsets.pxd
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ from numpy cimport int64_t


cpdef to_offset(object obj)
cdef bint is_dateoffset_object(object obj)
cdef bint is_offset_object(object obj)
cdef bint is_tick_object(object obj)

Expand Down
6 changes: 6 additions & 0 deletions pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,12 @@ from .timestamps import Timestamp
# ---------------------------------------------------------------------
# Misc Helpers

cdef bint is_dateoffset_object(object obj):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a docstring, specifically make it clear when to use this vs is_offset_object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment you have here is not useful, add a doc-string not dependent on a particular patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this function is not that necessary. It could be enough by using is_offset_object(obj) and hasattr(obj, "nanoseconds") to determine the need to deal with "nanoseconds".

Copy link
Contributor Author

@tushushu tushushu Oct 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When calling _TimeStamp.__add__(self, other) with other is an Offset object, it can not meet any condition of the if-else chain. And it will call _TimeStamp.__add__(self, other._offset) instead. I cannot find where this logic is implemented, but I guess we could explicitly check the Offset type in the if-else chain.

# Note: This is to fix Timestamp __add__/__sub__ DateOffset with
# nanoseconds lost. See PR #43968
return type(obj) is DateOffset


cdef bint is_offset_object(object obj):
return isinstance(obj, BaseOffset)

Expand Down
9 changes: 6 additions & 3 deletions pandas/_libs/tslibs/timestamps.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ from pandas._libs.tslibs.np_datetime cimport (
from pandas._libs.tslibs.np_datetime import OutOfBoundsDatetime

from pandas._libs.tslibs.offsets cimport (
is_dateoffset_object,
is_offset_object,
to_offset,
)
Expand Down Expand Up @@ -282,9 +283,12 @@ cdef class _Timestamp(ABCTimestamp):
def __add__(self, other):
cdef:
int64_t nanos = 0

if is_dateoffset_object(other):
if hasattr(other, "nanoseconds"):
nanos += other.nanoseconds
other = other._offset
if is_any_td_scalar(other):
nanos = delta_to_nanoseconds(other)
nanos += delta_to_nanoseconds(other)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any cases that go through the hasattr(other, "nanoseconds") and also through here?

Copy link
Contributor Author

@tushushu tushushu Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a case go through the hasattr(other, "nanoseconds"), it must go through here. For example, Timestampe(0) + DateOffset(nanoseconds=1) will go through both branches. Because line 289 other = other._offset converts other to a td_scalar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it must go through here [...] Because line 289 other = other._offset converts other to a td_scalar.

Doesn't it sometimes convert to a relativedelta object (which isnt a td_scalar)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel According to the codes below, we can see self._offset is a relativedelta object only if there is no nanosecond / nanoseconds keyword parameter.

    kwds_no_nanos = dict(
        (k, v) for k, v in kwds.items()
        if k not in ('nanosecond', 'nanoseconds')
    )
    # TODO: Are nanosecond and nanoseconds allowed somewhere?

    _kwds_use_relativedelta = ('years', 'months', 'weeks', 'days',
                               'year', 'month', 'week', 'day', 'weekday',
                               'hour', 'minute', 'second', 'microsecond')

    use_relativedelta = False
    if len(kwds_no_nanos) > 0:
        if any(k in _kwds_use_relativedelta for k in kwds_no_nanos):
            offset = relativedelta(**kwds_no_nanos)
            use_relativedelta = True

result = type(self)(self.value + nanos, tz=self.tzinfo)
if result is not NaT:
result._set_freq(self._freq) # avoid warning in constructor
Expand All @@ -307,7 +311,6 @@ cdef class _Timestamp(ABCTimestamp):
elif not isinstance(self, _Timestamp):
# cython semantics, args have been switched and this is __radd__
return other.__add__(self)

return NotImplemented

def __sub__(self, other):
Expand Down