-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: args offset in Timestamp.__new__ causes bugs #52344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: args offset in Timestamp.__new__ causes bugs #52344
Conversation
WIP. The same tests as in #52221, different solution. |
@@ -48,6 +48,9 @@ def time_from_datetime_aware(self): | |||
def time_from_pd_timestamp(self): | |||
Timestamp(self.ts) | |||
|
|||
def time_from_positional(self): | |||
Timestamp(2020, 1, 1, 0, 0, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional benchmark for future regressions might be helpful.
@mroeschke @jbrockmendel @jreback Had to rewrite most of the constructor, because naively switching to Please review. Any suggestions on improving the performance of the positional arguments case would be appreciated. |
+1 in this (or allow only fully qualified positionals - eg ymdhms required plus kwarg for example) |
@jbrockmendel @jreback The way to disallow passing args as kwargs would be to use the Python>3.8 args-only syntax to make everything up to tzinfo positional-only. A couple of issues I see:
Because of these points, I didn't think this option was on the table. Let's come up with an approach that you will be able to accept if it's decently done, and I will implement it. But I don't want to redo this from scratch a fourth time (I'll be happy to go back to one of the previous options though if we don't like the third solution). I believe that the current arg offset in Timestamp is a buggy PITA that we took advantage of to maintain performance and keep fragile code readable. The fix will either be a kludge, be harder to maintain, or break a bunch of code. I don't see a solution that does neither of those but would be happy to be wrong. |
Unfortunately, I don't think there is a way of fixing this outside of using |
I'm back from a work trip. I will test out the args-only approach and report on the performance benchmarks here. Any thoughts on any of the points I raised above? |
@AlexKirko you make goods points, thanks for thinking through the idea that i clearly didn't. I'll give this a close look. Thanks for being patient. |
unit=None, | ||
fold=None, | ||
): | ||
def __new__(cls, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this get rendered in the docs somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexKirko any idea about this?
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
A bit sick at the moment, will come back to it in a couple days once I'm better. |
Back among the living. The illness and the job change have cost me some time. |
@@ -368,6 +368,8 @@ Datetimelike | |||
- Bug in :meth:`Timestamp.round` with values close to the implementation bounds returning incorrect results instead of raising ``OutOfBoundsDatetime`` (:issue:`51494`) | |||
- Bug in :meth:`arrays.DatetimeArray.map` and :meth:`DatetimeIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`) | |||
- Bug in constructing a :class:`Series` or :class:`DataFrame` from a datetime or timedelta scalar always inferring nanosecond resolution instead of inferring from the input (:issue:`52212`) | |||
- Bug in :class:`Timestamp` raising an error when passing fold when constructing from positional arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put "fold" in quotes/ticks
Is there any viable path to splitting this into smaller pieces? This seems like a very difficult problem and is easy for me and other reviewers to be lazy about reviewing |
The bigger problem, to be honest, is me having been either swamped with work or sick this past month. In my opinion, splitting this into multiple PRs that make sense is possible, only if we stop caring about the performance of the initial implementation, and put the necessary performance improvements into separate PRs. It would make this easier to review but lead to an initial PR that is very slow in some use cases. |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
I have merged in main, and I will try another approach next week, so I would like to reopen the request for now. The problem here is that it's a "I will hate it when I see it" situation. There are three solutions that come to mind, and I've tried and submitted for review two. The third is terrible, in my opinion.
No one has suggested another solution, and I'm out of ideas. I personally think that option 2 (the current one) is fine. Yes, it's ugly, but it doesn't slow things down too much or break old code and datetime compatibilty. I'm well aware that all the approaches have significant downsides, but either someone needs to suggest an idea I haven't considered, or we need to choose. I'll try out the positional-only solution next week for illustration purposes, but I think it's the worst. Breaking API compatibility with current code to fix a bug is just wrong. We can also leave the bug in if fixing it is worse than the current state of things, but that's also not great, to put it in mildly. If I had the authority to take the responsibility for this decision, I'd just pick the current solution and move on. I'd be happy to hear your thoughts, @mroeschke . You have more experience, so maybe you see a way out here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
option 2 (current approach) looks like a good soln
the code complexity is not too bad - commented pretty well already
Okay, then I'll focus on cleaning up the current solution instead. Will try to make the changes as painless as possible to support. |
I apologize for being glacially slow on this. Finally had an opportunity to review. The MR is in decent shape except for the docs issue. We change the signature to |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
Added type annotations to new arguments/methods/functions.doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.This is the promised refactor for the Timestamp constructor. Detailed comments are below.
ASV benchmark results for Timestamp construction:
Replaces #52221