Skip to content

PERF: DatetimeIndex.get_indexer with mismatched tz #39332

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 25, 2021

Conversation

jbrockmendel
Copy link
Member

Avoid casting to object dtype.

In [2]: dti = pd.date_range("2016-01-01", periods=10000, tz="US/Pacific")
In [3]: dti2 = dti.tz_convert("UTC")

In [4]: %timeit dti.get_indexer(dti2)
127 ms ± 4.69 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  # <-- master
352 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)  # <-- PR

@jreback jreback added Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype labels Jan 24, 2021
@jreback jreback added this to the 1.3 milestone Jan 24, 2021
@@ -5178,7 +5178,16 @@ def _maybe_promote(self, other: Index):
if we can upcast the object-dtype one to improve performance.
"""

if self.inferred_type == "date" and isinstance(other, ABCDatetimeIndex):
if isinstance(self, ABCDatetimeIndex) and isinstance(other, ABCDatetimeIndex):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why should this not raise?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is relevant in three places: get_indexer, get_indexer_non_unique, and join. In the first two, we're talking about comparisons, which always allow mixed tz. In the third, that is just The Way It Is, xref #39328

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i see, yeah generally i find this weird that we allow indexing on mixed tz's but i guess that boat has sailed.

@jreback
Copy link
Contributor

jreback commented Jan 24, 2021

do we have any asv's on user facing, e.g. loc with mixed tz's?

@jbrockmendel
Copy link
Member Author

do we have any asv's on user facing, e.g. loc with mixed tz's?

i dont think so, this implements one that directly measures get_indexer, with a comment that it reached via Series.__getitem__

1 similar comment
@jbrockmendel
Copy link
Member Author

do we have any asv's on user facing, e.g. loc with mixed tz's?

i dont think so, this implements one that directly measures get_indexer, with a comment that it reached via Series.__getitem__

@jreback jreback merged commit 1dafea5 into pandas-dev:master Jan 25, 2021
@jreback
Copy link
Contributor

jreback commented Jan 25, 2021

thanks

@jbrockmendel jbrockmendel deleted the ref-maybe_promote-get_indexer branch January 25, 2021 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants