Performance drop when using timezone-aware DateTimeIndex #10192

Closed
adrien-pain-01 opened this issue May 22, 2015 · 3 comments
Labels
Performance (memory or execution speed performance), Timezones (timezone data dtype)

Comments

@adrien-pain-01

It seems that pandas.DataFrame operations on an Index with timezone-aware dates are orders of magnitude slower than on regular datetimes.

For 500k datetimes created with pandas.date_range, using DataFrame.shift() to compute deltas between dates, timings go from 17 ms for standard datetimes to 16 seconds for timezone-aware datetimes.
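A minimal sketch of how the comparison described above might be reproduced. The column name `date`, the helper `time_shift_delta`, the one-minute frequency, and the `UTC` timezone are illustrative assumptions, not taken from the original report; absolute timings will differ by pandas version and machine.

```python
import time

import pandas as pd

N = 500_000  # number of timestamps, matching the size mentioned in the report


def time_shift_delta(index):
    """Time the computation of row-to-row deltas using shift().

    Returns (elapsed_seconds, delta_series). Hypothetical helper for illustration.
    """
    df = pd.DataFrame({"date": index})
    start = time.perf_counter()
    delta = df["date"] - df["date"].shift(1)
    elapsed = time.perf_counter() - start
    return elapsed, delta


# Same range of timestamps, once timezone-naive and once timezone-aware.
naive_index = pd.date_range("2015-01-01", periods=N, freq="min")
aware_index = pd.date_range("2015-01-01", periods=N, freq="min", tz="UTC")

naive_seconds, naive_delta = time_shift_delta(naive_index)
aware_seconds, aware_delta = time_shift_delta(aware_index)

print(f"pandas {pd.__version__}")
print(f"naive: {naive_seconds:.4f} s")
print(f"aware: {aware_seconds:.4f} s")
```

Both runs compute the same deltas, so any large gap between the two timings comes from how the timezone-aware column is stored and processed.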

I don't understand why it is so slow with timezone-aware objects.

I already posted a complete message about this behavior on Stack Overflow yesterday:
http://stackoverflow.com/questions/30385481/performance-of-timezone-aware-pandas-datetimeindex

I'm using the latest pandas (0.16.1) from Anaconda, and the latest numpy (1.9.2).

@jreback added the Performance and Timezones labels May 22, 2015
@jreback added this to the Next Major Release milestone May 22, 2015
@jreback
Contributor

jreback commented May 22, 2015

See #8260, as this is a known issue.

Datetimes with tz's are represented as object dtype, rather than datetime64[ns] as for naive datetimes. The fix is to implement #8260, which is not difficult but is a bit in-depth, as it requires learning a bit about the internals.
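One way to check which representation a given pandas installation uses is to inspect the dtype of a Series built from each kind of index. This is only a sketch: the helper name `is_object_backed` is hypothetical, and `UTC` is an arbitrary choice of timezone. Per the comment above, the timezone-aware Series would be expected to report object dtype on the affected version; versions that include the fix should report a datetime dtype instead.

```python
import pandas as pd


def is_object_backed(series):
    """Return True if the Series stores its values as generic Python objects."""
    return series.dtype == object


naive = pd.Series(pd.date_range("2015-01-01", periods=3, freq="D"))
aware = pd.Series(pd.date_range("2015-01-01", periods=3, freq="D", tz="UTC"))

print(f"pandas {pd.__version__}")
print("naive dtype:", naive.dtype, "| object-backed:", is_object_backed(naive))
print("aware dtype:", aware.dtype, "| object-backed:", is_object_backed(aware))
```

An object-backed column holds one Python object per element, so operations on it generally cannot use the vectorized datetime64 code paths, which is consistent with the slowdown reported in this issue.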

@adrien-pain-01
Author

@jreback:
Thanks for your explanation and quick reply.
I will wait for the next major release to see the fix, then :-)

And really good work on pandas, awesome library!

@shoyer modified the milestones: Someday, Next Major Release Jun 22, 2015
@jreback
Contributor

jreback commented Oct 23, 2015

Closed by #10477.

@jreback closed this as completed Oct 23, 2015