-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: 3x speedup in Series of dicts with datetime keys by not having error message scale with input #24743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/dtypes/dtypes.py
Outdated
# TODO(py3): Change this pass to `raise TypeError(msg) from e` | ||
pass | ||
raise TypeError(msg.format(string)) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don’t need an else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this caught? should be checking the error messge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's primarily caught by pandas.core.dtypes.base._DtypeOpsMixin.is_dtype()
, which doesn't care about error message at all:
try:
return cls.construct_from_string(dtype) is not None
except TypeError:
return False
I'll check for direct calls that might be sensitive to the error message contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing any calls reliant on the specific error message, so should be safe to change.
Codecov Report
@@ Coverage Diff @@
## master #24743 +/- ##
==========================================
+ Coverage 92.38% 92.38% +<.01%
==========================================
Files 166 166
Lines 52358 52360 +2
==========================================
+ Hits 48373 48375 +2
Misses 3985 3985
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #24743 +/- ##
==========================================
+ Coverage 92.38% 92.38% +<.01%
==========================================
Files 166 166
Lines 52358 52360 +2
==========================================
+ Hits 48373 48375 +2
Misses 3985 3985
Continue to review full report at Codecov.
|
5c82eab
to
65b2f13
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u add some tests that hit this (and the original)
65b2f13
to
b0b5006
Compare
Hello @qwhelan! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on January 13, 2019 at 00:56 Hours UTC |
…error message scale with input
b0b5006
to
e98fb51
Compare
@jreback Done - original already had a test. |
thanks @qwhelan |
…error message scale with input (pandas-dev#24743)
…error message scale with input (pandas-dev#24743)
Surprisingly, the majority of the time spent in constructing a
Series
from a dict withdatetime
-like keys is spent formatting the keys into strings for an error message that gets suppressed.As there's a test ensuring the string case makes it into the error message, we preserve the behavior there. In non-string cases, we mirror how the other
Dtypes
handle this case and do not include the input.git diff upstream/master -u -- "*.py" | flake8 --diff