Skip to content

CLN: remove pydt_to_i8 #30854

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 24, 2020
Merged

CLN: remove pydt_to_i8 #30854

merged 8 commits into from
Jan 24, 2020

Conversation

jbrockmendel
Copy link
Member

It's only used in one non-test place, and all supported python versions now support datetime.timestamp, so we can use that directly. Its a little bit less performant, but not used often.

@@ -233,7 +233,7 @@ def _to_dt64(dt, dtype='datetime64'):
# numpy.datetime64('2013-05-01T02:00:00.000000+0200')
# Thus astype is needed to cast datetime to datetime64[D]
if getattr(dt, 'tzinfo', None) is not None:
i8 = pydt_to_i8(dt)
i8 = dt.timestamp() * 10**9
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any chance of losing precision doing it this way? I think can be somewhere around the 2 ** 60 range for floats

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, pydt obects only have 6 digits after the decimal, so we should be OK. Besides, later in this function we case to datetime64[D]

@gfyoung gfyoung added Clean Timedelta Timedelta data type labels Jan 9, 2020
@WillAyd
Copy link
Member

WillAyd commented Jan 16, 2020

@jbrockmendel can you merge master?

@jbrockmendel
Copy link
Member Author

rebased+green

@@ -233,7 +233,7 @@ def _to_dt64D(dt):
# numpy.datetime64('2013-05-01T02:00:00.000000+0200')
# Thus astype is needed to cast datetime to datetime64[D]
if getattr(dt, 'tzinfo', None) is not None:
i8 = pydt_to_i8(dt)
i8 = np.int64(dt.timestamp() * 10**9)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'm still a little hung up on this. Just to illustrate:

dt1 = datetime.datetime(2020, 1, 17, 16, 1, 9, 136686)
>>> dt2 = pd.Timestamp(np.int64(dt1.timestamp() * 10**9))
>>> dt2.nanosecond
80

I suppose not totally equivalent given the later cast to datetime[D] but this does lose some precision right? If so could lead to some hard to find bugs, but could also misunderstand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right to be hung up on this, but for a different reason. I had forgotten that the stdlib datetime.timestamp behaves different from pd.Timestamp.timestamp for tz-naive datetimes. I'll give this another look.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i take it back, i was right the first time. this usage is only in the tzaware case, so we're all good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't do what convet_datetime_to_tsobject does? this doesn't appear to have the same timezone behavior

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't appear to have the same timezone behavior

On the previous line we are checking that we are only doing this in the tzaware case, in which datetime.timestamp and pd.Timestamp.timestamp behave the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as i said i think you should re-use convert_datetime_to_tsobject rather than doing this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this, but it gets appreciably more verbose because we need special handling for datetime vs Timestamp. The version in the PR ATM is correct for both datetime and Timestmap.

@@ -233,7 +233,7 @@ def _to_dt64D(dt):
# numpy.datetime64('2013-05-01T02:00:00.000000+0200')
# Thus astype is needed to cast datetime to datetime64[D]
if getattr(dt, 'tzinfo', None) is not None:
i8 = pydt_to_i8(dt)
i8 = np.int64(dt.timestamp() * 10**9)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't do what convet_datetime_to_tsobject does? this doesn't appear to have the same timezone behavior

@jreback jreback added this to the 1.1 milestone Jan 18, 2020
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jreback jreback merged commit f6ef10a into pandas-dev:master Jan 24, 2020
@jreback
Copy link
Contributor

jreback commented Jan 24, 2020

thanks

@jbrockmendel jbrockmendel deleted the cln-pydt_to_i8 branch January 24, 2020 04:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants