Skip to content

TYP: stubs for tslibs #40433

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 16, 2021
Merged

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx
  • tests added / passed
  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry

@@ -0,0 +1,13 @@

DAYS: list[str]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does list[str] work in Python < 3.9? (asking because I don't know)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i decided to try this out bc i think ive seen @simonjayhawkins using this pattern using it recently

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does list[str] work in Python < 3.9? (asking because I don't know)

pyi files don't go through the Python interpreter. so have the advantage of being able to use the latest typing features understood by mypy/pyre/pyright/pytype. list[str] is PEP 585 compliant.


side note: we are in the process of attempting to inline the microsoft pyright stubs. inline types not only need to satisfy the python interpreter supported (i.e. python 3.7 without typing extensions) but the code in the annotated functions also needs to be consistent with the in-line function annotations.

so (hopefully without sounding too negative) I imagine that not only will that take a while but may also not even be possible in some cases.

For type checking user code against the public api, I expect using stubs (eg. bundled with pyright or 3rd party such as https://github.com/predictive-analytics-lab/data-science-types (development ceased 16 Feb 2021)) will be the better option for the foreseeable future.

However, internal consistency/robustness is achieved though adding inline types and checking with mypy (which has excellent features that support the gradual typing process.)


we do yet have a good way of generating type stubs from cython files, so in order to have types available for the compiled code, we probably need to manually curate these. So being able to use the pyright stubs here would be beneficial.

@jbrockmendel Are these stubs manually curated or forked from pyright? I believe @Dr-Irv is coordinating between pandas and pyright and may have some input on this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these stubs manually curated or forked from pyright?

Manually curated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doing this is OK, as adding stubs may reveal problems in the python code that may need to be fixed. So doing this before pulling in the pyright stubs could be advantageous and make the migration smoother.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel Are these stubs manually curated or forked from pyright? I believe @Dr-Irv is coordinating between pandas and pyright and may have some input on this.

I wouldn't say I'm "coordinating", but the work that has been done by the Microsoft team provides a good starting point (and possible reference) for pandas type stubs. See https://github.com/microsoft/python-type-stubs/tree/main/pandas

@simonjayhawkins simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Mar 15, 2021
Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jbrockmendel generally lgtm.


def array_strptime(
values: np.ndarray, # np.ndarray[object]
fmt: Optional[str],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str | None and then don't need the import.

def array_strptime(
values: np.ndarray, # np.ndarray[object]
fmt: Optional[str],
exact: bool = True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't add the defaults. just ... less to keep in sync

values: np.ndarray, # np.ndarray[object]
fmt: Optional[str],
exact: bool = True,
errors: str = "raise"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Comment on lines +7 to +8
Optional,
Union,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could lose these imports using the newer syntax

import numpy as np

# imported from dateutil.tz
dateutil_gettz: Callable[[str], tzinfo]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could import from dateutil.tz directly in the stub.

@@ -80,6 +81,8 @@ cpdef inline object get_timezone(tzinfo tz):
the tz name. It needs to be a string so that we can serialize it with
UJSON/pytables. maybe_get_tz (below) is the inverse of this process.
"""
if tz is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you've defined def get_timezone(tz: tzinfo) -> Union[tzinfo, str]: .... tz shouldn't be None? (from python code anyway)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with cython semantics tzinfo tz doesn't exclude None, so we have to do it ourselves

@@ -364,6 +367,8 @@ cpdef bint tz_compare(tzinfo start, tzinfo end):
elif is_utc(end):
# Ensure we don't treat tzlocal as equal to UTC when running in UTC
return False
elif start is None or end is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably same answer as above. can start or end be None?

def tz_localize_to_utc(
vals: np.ndarray, # np.ndarray[np.int64]
tz: Optional[tzinfo],
ambiguous: Optional[Union[str, bool, Iterable[bool]]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... for default

vals: np.ndarray, # np.ndarray[np.int64]
tz: Optional[tzinfo],
ambiguous: Optional[Union[str, bool, Iterable[bool]]] = None,
nonexistent: Optional[Union[str, timedelta, np.timedelta64]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Comment on lines +7 to +8
Optional,
Union,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as before

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind if i apply these comments in the next pass? already have a branch on deck

@simonjayhawkins
Copy link
Member

in setup.cfg we have ignore_missing_imports = True. This is probably the next strictness flag we should consider changing.

we this PR have 393 errors, 162 of these are from pandas._libs

on master we have 407 errors (176 pandas._libs)

so a good start.

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind if i apply these comments in the next pass? already have a branch on deck

sure. another comment for followup and an observation.

tzinfo,
)
from typing import (
Iterable,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#stub-file-coding-style

in Python 3 stubs, import collections (Mapping, Iterable, etc.) from collections.abc instead of typing;


def maybe_get_tz(tz: Optional[Union[str, int, np.int64, tzinfo]]) -> Optional[tzinfo]: ...

def get_timezone(tz: tzinfo) -> Union[tzinfo, str]: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again from https://github.com/python/typeshed/blob/master/CONTRIBUTING.md#stub-file-coding-style

avoid Union return types: python/mypy#1693

although not sure this is possible without a refactor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah id like to change this behavior too, but should be separate from typing

@simonjayhawkins simonjayhawkins added this to the 1.3 milestone Mar 15, 2021
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - great to see these types coming in

@simonjayhawkins simonjayhawkins merged commit 940695b into pandas-dev:master Mar 16, 2021
@simonjayhawkins
Copy link
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the typ-libs-2 branch March 16, 2021 14:26
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants