PERF: to_datetime fastpath for %Y%m%d is slower #17410
Comments
Did something change recently with this? The path now raises an exception every time (inside the _attemptYYMMDD). This should be much faster. |
I suspect, for |
One change, though not especially recent, is that the ISO 8601 path will now handle |
I looked into this briefly in https://github.com/mroeschke/pandas/tree/non_object_parsing and was able to get it down ~2x from master by getting rid of casting to object, but it's still slower than not providing a format:
Given that this branch can cause a slowdown, it might be worth removing this path for now. |
this used to be way faster than naive parsing |
This patch makes the speed almost the same. But since it's just duplicating a lot of paths, maybe it's easier to simplify this (e.g. if format is %Y%m%d then just remove the format and parse). I think this got way slower because floating point math was introduced (e.g.
|
of course for this example, |
This path is buggy anyway, I'd suggest just removing it #50054 |
Sending it down the same path that it would go down if
In [5]: to_datetime('199934', format='%Y%m%d')
Out[5]: Timestamp('1999-03-04 00:00:00')
In [6]: to_datetime('199934')
---------------------------------------------------------------------------
ParserError: month must be in 1..12: 199934 present at position 0
I still think we should remove this path though, even if it means going down the |
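For comparison, a small sketch using only Python's standard library (not the pandas code path discussed above) suggests why a non-padded string is ambiguous under %Y%m%d: datetime.strptime also accepts single-digit months and days, so the split between month and day depends on how the pattern happens to match. The sample strings are illustrative and the second one is not from the thread.

```python
from datetime import datetime

# Standard-library strptime, shown only as an analogy to the format path.
# %m and %d both accept one- or two-digit values, so non-padded input parses.
lenient = datetime.strptime("199934", "%Y%m%d")
print(lenient)  # 1999-03-04 00:00:00

# Illustrative extra case (an assumption, not from the issue): "1999114"
# could be read as Jan 14 or Nov 4; the matcher takes the two-digit month.
ambiguous = datetime.strptime("1999114", "%Y%m%d")
print(ambiguous)  # 1999-11-04 00:00:00
```

Whether pandas should accept such input at all is the question raised in the comment above; this sketch only shows that the lenient reading is not unique to pandas.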
We have a check for whether format == '%Y%m%d', but this actually seems to be slower:
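A minimal sketch for reproducing this kind of comparison, assuming a Series of zero-padded date strings. The input size, repeat counts, and helper names (with_format, without_format) are assumptions for illustration, and timings will vary by pandas version and machine.

```python
import timeit

import pandas as pd

# Hypothetical input: 10,000 identical zero-padded strings (size is an assumption).
values = pd.Series(["20000101"] * 10_000)


def with_format():
    # Explicit format: the case reported as slower in this issue.
    return pd.to_datetime(values, format="%Y%m%d")


def without_format():
    # No format: pandas chooses the parsing path itself.
    return pd.to_datetime(values)


# Both calls should agree on the parsed values before timing is meaningful.
assert with_format().equals(without_format())

# Take the best of a few repeats to reduce noise.
t_fmt = min(timeit.repeat(with_format, number=5, repeat=3))
t_nofmt = min(timeit.repeat(without_format, number=5, repeat=3))
print(f"format='%Y%m%d': {t_fmt:.4f}s   no format: {t_nofmt:.4f}s")
```

No expected timings are given here, since the relative speed of the two paths is exactly what changed between releases.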