PERF: _return_parsed_timezone_results #50168

Merged

@lukemanley lukemanley commented Dec 10, 2022

@MarcoGorelli - thanks for the ping. I think this gets most of the bottleneck you highlighted:

import pandas as pd

dates = pd.date_range('1900', '2000').tz_localize('+01:00').strftime('%Y-%d-%m %H:%M:%S%z').tolist()
dates.append('2020-01-01 00:00:00+02:00')

%timeit pd.to_datetime(dates, format='%Y-%d-%m %H:%M:%S%z')

# 529 ms ± 24.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)   <- main
# 174 ms ± 977 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)  <- PR
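For anyone reproducing this outside IPython, where `%timeit` is not available, here is a minimal sketch of the same benchmark using the standard-library `timeit` module. The `parse` helper and the `FMT` constant are names introduced here for illustration. One deliberate difference from the snippet above: this version passes `utc=True`, on the assumption that it may be run on a newer pandas release, where parsing mixed offsets without `utc=True` is deprecated. Timings from it are therefore not directly comparable to the numbers quoted above.

```python
import timeit

import pandas as pd

FMT = "%Y-%d-%m %H:%M:%S%z"  # same format string as the benchmark above

# Build the same input: many strings at +01:00, then one at +02:00
# so that the parsed result contains more than one UTC offset.
dates = (
    pd.date_range("1900", "2000")
    .tz_localize("+01:00")
    .strftime(FMT)
    .tolist()
)
dates.append("2020-01-01 00:00:00+02:00")


def parse():
    # utc=True is an assumption added for compatibility with newer pandas;
    # the original benchmark does not pass it.
    return pd.to_datetime(dates, format=FMT, utc=True)


# Take the best of a few single runs, roughly what %timeit reports.
best = min(timeit.repeat(parse, number=1, repeat=3))
print(f"best of 3: {best * 1e3:.0f} ms")
```

Absolute numbers will vary with hardware and pandas version, so compare runs only on the same machine.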

@lukemanley added the Datetime (Datetime data dtype) and Performance (Memory or execution speed performance) labels Dec 10, 2022

@MarcoGorelli MarcoGorelli left a comment

Well done, thanks! Looks good to me pending green CI, and it's nice to see that this could be solved without having to Cythonize.
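The diff itself is not shown in this conversation, so the following is only a sketch of one general way this kind of per-element timezone handling can be sped up in pure Python: loop over the unique timezones rather than over every element, and localize each group in a single vectorized call. The function name `localize_by_unique_tz` and its signature are hypothetical and are not taken from pandas.

```python
import numpy as np
import pandas as pd


def localize_by_unique_tz(naive_values, timezones):
    """Hypothetical helper: attach a per-element timezone, one group at a time."""
    naive_values = np.asarray(naive_values, dtype="datetime64[ns]")
    timezones = np.asarray(timezones, dtype=object)
    out = np.empty(len(naive_values), dtype=object)

    # One pass per distinct timezone instead of one per element.
    for zone in pd.unique(timezones):
        mask = timezones == zone
        localized = pd.DatetimeIndex(naive_values[mask]).tz_localize(zone)
        # Store tz-aware Timestamp objects back in their original positions.
        out[mask] = localized.to_numpy(dtype=object)

    # Mixed offsets cannot share one datetime64 dtype, so keep object dtype.
    return pd.Index(out, dtype=object)


naive = ["2020-01-01", "2020-01-02", "2020-01-03"]
zones = ["+01:00", "+02:00", "+01:00"]
result = localize_by_unique_tz(naive, zones)
print(result)
```

When the input has only a handful of distinct offsets, as in the benchmark above, the Python-level loop runs a few times rather than tens of thousands of times, which is where an approach like this would be expected to save time.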

@MarcoGorelli added this to the 2.0 milestone Dec 10, 2022
@MarcoGorelli merged commit eb23512 into pandas-dev:main Dec 10, 2022
@lukemanley deleted the perf-return-parsed-tz-results branch December 20, 2022 00:46
Successfully merging this pull request may close these issues.

PERF:cythonize _return_parsed_timezone_results?