PERF: lazify type-check import #28342
pandas/io/formats/format.py
Outdated
@@ -1553,7 +1554,7 @@ def _is_dates_only(

 def _format_datetime64(
     x: Union[NaTType, Timestamp],
-    tz: Optional[Union[tzfile, tzutc]] = None,
+    tz: Optional[Union["tzfile", "tzutc"]] = None,
It's not obvious to me why these only care about dateutil tzinfos and not, e.g., stdlib or pytz versions. @simonjayhawkins any idea?
These were the types seen by MonkeyType. I think they are resolving to Any.
I've since abandoned using MonkeyType for a couple of reasons:
1. MonkeyType uses nominal types, and in many cases these resolve to Any due to unfollowed imports.
2. MonkeyType only adds nominal types, whereas we'd probably prefer structural types.
Feel free to change or remove. The order of typing priority from high to low should probably match the order used in isort, so I consider these to be low-priority type hints.
OK. I would be shocked if these weren't supposed to be tzinfo, will look more closely and update.
yes, but if tzinfo is an unfollowed import, it'll just resolve to Any.
what determines whether something is an unfollowed import? tzinfo is stdlib, seems like mypy should know what it is
> These were the types seen by MonkeyType. I think they are resolving to Any.
Revealed type for tz is Union[Any, dateutil.tz.tz.tzutc, None], so it is only the dateutil.zoneinfo.tzfile that is unfollowed.
> its not obvious to me why these only care about dateutil tzinfos and not e.g. stdlib or pytz versions.
pytz.tzinfo.DstTzInfo is also unknown to mypy.
> tzinfo is stdlib, seems like mypy should know what it is
Since all three (dateutil.zoneinfo.tzfile, dateutil.tz.tz.tzutc and pytz.tzinfo.DstTzInfo) inherit from datetime.tzinfo, and datetime.tzinfo is known to mypy through the stdlib, it probably does make sense to use datetime.tzinfo here.
However, within the function, tz is only used in Timestamp(x).tz_convert(tz) and Timestamp(x).tz_localize(tz), and Timestamp resolves to Any due to an unfollowed import (needs stub #28195), so no actual type checking is being performed.
So it's probably best just to delete the type hint for now.
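A minimal sketch of the point about datetime.tzinfo as a common base class, using only the stdlib. FixedOffset and describe_tz are hypothetical names: the first stands in for a third-party timezone class, the second for a function annotated the way _format_datetime64 could be. That the dateutil and pytz classes subclass datetime.tzinfo is taken from the comment above and not re-checked here.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


class FixedOffset(tzinfo):
    """Hypothetical third-party-style timezone, for illustration only."""

    def __init__(self, minutes: int) -> None:
        self._offset = timedelta(minutes=minutes)

    def utcoffset(self, dt: Optional[datetime]) -> timedelta:
        return self._offset

    def dst(self, dt: Optional[datetime]) -> timedelta:
        return timedelta(0)

    def tzname(self, dt: Optional[datetime]) -> str:
        return "fixed"


def describe_tz(tz: Optional[tzinfo] = None) -> str:
    """Return the class name of tz, or 'naive' when tz is None."""
    return "naive" if tz is None else type(tz).__name__


# Any tzinfo subclass satisfies the Optional[tzinfo] annotation.
print(describe_tz())                  # naive
print(describe_tz(timezone.utc))      # timezone
print(describe_tz(FixedOffset(330)))  # FixedOffset
```

Annotating with the stdlib base class keeps the hint meaningful to mypy even when the concrete third-party classes are unfollowed imports.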
-    x: Union[NaTType, Timestamp],
-    tz: Optional[Union["tzfile", "tzutc"]] = None,
-    nat_rep: str = "NaT",
+    x: Union[NaTType, Timestamp], tz: Optional[tzinfo] = None, nat_rep: str = "NaT"
OK, so you're determined not to remove!
In timestamps.pyx:
tz_convert: tz : str, pytz.timezone, dateutil.tz.tzfile or None
tz_localize: tz : str, pytz.timezone, dateutil.tz.tzfile or None
In tz_convert, tz is only used in the Timestamp constructor:
return Timestamp(self.value, tz=tz, freq=self.freq)
Timestamp: tz : str, pytz.timezone, dateutil.tz.tzfile or None
In tz_localize, tz is used in the Timestamp constructor and also in:
maybe_get_tz(tz): (Maybe) Construct a timezone object from a string. If tz is a string, use it to construct a timezone object. Otherwise, just return tz.
It'll be a lot easier once the libs are annotated; mypy will do the checks for you.
I looked at removing, but it seemed weird to have the function have everything but that one thing annotated. I'll defer to you if you think removing is better.
Because MonkeyType only saw "tzfile" and "tzutc" in testing, tzinfo is probably fine. It depends on how the lib gets annotated; otherwise mypy will raise errors here if/when we add the stub and add the types.
Strictly speaking, this function could also take a string, but as used internally, MonkeyType didn't see this function being called with that type.
I'm OK with tzinfo, as it helps document the function for now.
Thanks @jbrockmendel
These imports in io.formats.format take about 1.6ms, out of a total of about 470ms (and 7.8ms total for formats.format). So it's not massive, but it is easy to avoid, and we are running out of lower-hanging fruit.
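The general pattern for avoiding an import that is needed only for annotations can be sketched as below. This is an illustration under stated assumptions, not the PR's actual diff: fractions.Fraction stands in for the dateutil classes so the snippet runs with the stdlib alone, and describe is a hypothetical helper.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Seen by static type checkers only; never executed at runtime,
    # so the import cost is not paid when the module is loaded.
    from fractions import Fraction  # stand-in for an annotation-only import


def describe(value: Optional["Fraction"] = None) -> str:
    """Hypothetical helper: the quoted name defers annotation lookup."""
    return "missing" if value is None else str(value)


print(TYPE_CHECKING)  # False at runtime
print(describe())     # missing
```

Because the annotation is a string, the name does not have to exist at runtime, which is why the diff quotes "tzfile" and "tzutc".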