-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYP: stubs for tslibs #40433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: stubs for tslibs #40433
Conversation
jbrockmendel
commented
Mar 14, 2021
- closes #xxxx
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
@@ -0,0 +1,13 @@ | |||
|
|||
DAYS: list[str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does list[str]
work in Python < 3.9? (asking because I don't know)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i decided to try this out bc i think ive seen @simonjayhawkins using this pattern using it recently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does
list[str]
work in Python < 3.9? (asking because I don't know)
pyi files don't go through the Python interpreter. so have the advantage of being able to use the latest typing features understood by mypy/pyre/pyright/pytype. list[str]
is PEP 585 compliant.
side note: we are in the process of attempting to inline the microsoft pyright stubs. inline types not only need to satisfy the python interpreter supported (i.e. python 3.7 without typing extensions) but the code in the annotated functions also needs to be consistent with the in-line function annotations.
so (hopefully without sounding too negative) I imagine that not only will that take a while but may also not even be possible in some cases.
For type checking user code against the public api, I expect using stubs (eg. bundled with pyright or 3rd party such as https://github.com/predictive-analytics-lab/data-science-types (development ceased 16 Feb 2021)) will be the better option for the foreseeable future.
However, internal consistency/robustness is achieved though adding inline types and checking with mypy (which has excellent features that support the gradual typing process.)
we do yet have a good way of generating type stubs from cython files, so in order to have types available for the compiled code, we probably need to manually curate these. So being able to use the pyright stubs here would be beneficial.
@jbrockmendel Are these stubs manually curated or forked from pyright? I believe @Dr-Irv is coordinating between pandas and pyright and may have some input on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these stubs manually curated or forked from pyright?
Manually curated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think doing this is OK, as adding stubs may reveal problems in the python code that may need to be fixed. So doing this before pulling in the pyright stubs could be advantageous and make the migration smoother.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel Are these stubs manually curated or forked from pyright? I believe @Dr-Irv is coordinating between pandas and pyright and may have some input on this.
I wouldn't say I'm "coordinating", but the work that has been done by the Microsoft team provides a good starting point (and possible reference) for pandas type stubs. See https://github.com/microsoft/python-type-stubs/tree/main/pandas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jbrockmendel generally lgtm.
|
||
def array_strptime( | ||
values: np.ndarray, # np.ndarray[object] | ||
fmt: Optional[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str | None
and then don't need the import.
def array_strptime( | ||
values: np.ndarray, # np.ndarray[object] | ||
fmt: Optional[str], | ||
exact: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't add the defaults. just ... less to keep in sync
values: np.ndarray, # np.ndarray[object] | ||
fmt: Optional[str], | ||
exact: bool = True, | ||
errors: str = "raise" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
Optional, | ||
Union, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could lose these imports using the newer syntax
import numpy as np | ||
|
||
# imported from dateutil.tz | ||
dateutil_gettz: Callable[[str], tzinfo] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could import from dateutil.tz directly in the stub.
@@ -80,6 +81,8 @@ cpdef inline object get_timezone(tzinfo tz): | |||
the tz name. It needs to be a string so that we can serialize it with | |||
UJSON/pytables. maybe_get_tz (below) is the inverse of this process. | |||
""" | |||
if tz is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you've defined def get_timezone(tz: tzinfo) -> Union[tzinfo, str]: ...
. tz shouldn't be None? (from python code anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with cython semantics tzinfo tz
doesn't exclude None, so we have to do it ourselves
@@ -364,6 +367,8 @@ cpdef bint tz_compare(tzinfo start, tzinfo end): | |||
elif is_utc(end): | |||
# Ensure we don't treat tzlocal as equal to UTC when running in UTC | |||
return False | |||
elif start is None or end is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably same answer as above. can start or end be None?
def tz_localize_to_utc( | ||
vals: np.ndarray, # np.ndarray[np.int64] | ||
tz: Optional[tzinfo], | ||
ambiguous: Optional[Union[str, bool, Iterable[bool]]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... for default
vals: np.ndarray, # np.ndarray[np.int64] | ||
tz: Optional[tzinfo], | ||
ambiguous: Optional[Union[str, bool, Iterable[bool]]] = None, | ||
nonexistent: Optional[Union[str, timedelta, np.timedelta64]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
Optional, | ||
Union, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mind if i apply these comments in the next pass? already have a branch on deck
in setup.cfg we have we this PR have 393 errors, 162 of these are from pandas._libs on master we have 407 errors (176 pandas._libs) so a good start. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mind if i apply these comments in the next pass? already have a branch on deck
sure. another comment for followup and an observation.
tzinfo, | ||
) | ||
from typing import ( | ||
Iterable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#stub-file-coding-style
in Python 3 stubs, import collections (Mapping, Iterable, etc.) from collections.abc instead of typing;
|
||
def maybe_get_tz(tz: Optional[Union[str, int, np.int64, tzinfo]]) -> Optional[tzinfo]: ... | ||
|
||
def get_timezone(tz: tzinfo) -> Union[tzinfo, str]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again from https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#stub-file-coding-style
avoid Union return types: python/mypy#1693
although not sure this is possible without a refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yah id like to change this behavior too, but should be separate from typing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - great to see these types coming in
Thanks @jbrockmendel |