Commit 97f9bbf

Contributing Guide for Type Hints (#27050)
1 parent 09ab18f commit 97f9bbf

1 file changed: +130 -0 lines changed

doc/source/development/contributing.rst

@@ -699,6 +699,136 @@ You'll also need to
See :ref:`contributing.warnings` for more.

.. _contributing.type_hints:

Type Hints
----------

*pandas* strongly encourages the use of :pep:`484` style type hints. New development should contain type hints, and pull requests that annotate existing code are welcome as well!

Style Guidelines
~~~~~~~~~~~~~~~~

Type imports should follow the ``from typing import ...`` convention. So rather than

.. code-block:: python

    import typing

    primes = []  # type: typing.List[int]

you should write

.. code-block:: python

    from typing import List, Optional, Union

    primes = []  # type: List[int]

``Optional`` should be used where applicable, so instead of

.. code-block:: python

    maybe_primes = []  # type: List[Union[int, None]]

you should write

.. code-block:: python

    maybe_primes = []  # type: List[Optional[int]]

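The same preference carries over to function signatures. As a minimal sketch (``first_prime`` is a hypothetical helper, not part of *pandas*), a function that may return nothing can be annotated with ``Optional`` rather than ``Union[..., None]``:

.. code-block:: python

    from typing import List, Optional


    def first_prime(primes: List[int]) -> Optional[int]:
        """Return the first element of ``primes``, or None if it is empty."""
        if primes:
            return primes[0]
        return None

``Optional[int]`` and ``Union[int, None]`` describe the same type, so the choice is purely about readability.
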
In some cases classes in the code base may define class variables that shadow builtins. This causes an issue as described in `Mypy 1775 <https://github.com/python/mypy/issues/1775#issuecomment-310969854>`_. The defensive solution here is to create an unambiguous alias of the builtin and use that within your annotation. For example, if you come across a definition like

.. code-block:: python

    class SomeClass1:
        str = None

the appropriate way to annotate this would be as follows

.. code-block:: python

    str_type = str

    class SomeClass2:
        str = None  # type: str_type

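Inside such a class body the bare name ``str`` refers to the class attribute rather than the builtin, which is what makes an unqualified annotation ambiguous. The sketch below uses a hypothetical class ``Label`` (not taken from *pandas*) to show the alias being used in both a type comment and a method signature:

.. code-block:: python

    str_type = str  # alias taken before any class shadows the builtin


    class Label:  # hypothetical class, for illustration only
        str = None  # type: str_type

        # In this class body the bare name ``str`` is the attribute above,
        # so the signature uses the alias to refer to the builtin.
        def shout(self, text: str_type) -> str_type:
            return text.upper()

Because ``str_type`` is defined at module level, it stays unambiguous no matter what names a class defines.
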
In some cases you may be tempted to use ``cast`` from the typing module when you know better than the analyzer. This occurs particularly when using custom inference functions. For example

.. code-block:: python

    from typing import Union, cast

    from pandas.core.dtypes.common import is_number

    def cannot_infer_bad(obj: Union[str, int, float]):

        if is_number(obj):
            ...
        else:  # Reasonably only str objects would reach this but...
            obj = cast(str, obj)  # Mypy complains without this!
            return obj.upper()

The limitation here is that while a human can reasonably understand that ``is_number`` would catch the ``int`` and ``float`` types, mypy cannot make that same inference just yet (see `mypy #5206 <https://github.com/python/mypy/issues/5206>`_). While the above works, the use of ``cast`` is **strongly discouraged**. Where applicable, a refactor of the code to appease static analysis is preferable

.. code-block:: python

    def cannot_infer_good(obj: Union[str, int, float]):

        if isinstance(obj, str):
            return obj.upper()
        else:
            ...

With custom types and inference this is not always possible, so exceptions are made, but every effort should be exhausted to avoid ``cast`` before going down such paths.

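One reason for the caution is that ``typing.cast`` performs no conversion or check at runtime; it simply returns its argument. A mistaken ``cast`` therefore silences mypy while leaving the bug in place. A small sketch, where ``shout`` is a hypothetical function used only for illustration:

.. code-block:: python

    from typing import Union, cast


    def shout(obj: Union[str, int]) -> str:
        # The cast tells mypy to treat ``obj`` as ``str`` without verifying it.
        text = cast(str, obj)
        return text.upper()


    shouted = shout("abc")  # works because the argument really is a str

    try:
        shout(3)  # allowed by the signature, so mypy raises no complaint
    except AttributeError:
        failed = True  # int has no ``upper`` method, so this fails at runtime
    else:
        failed = False

An ``isinstance`` check, by contrast, is verified both by mypy and at runtime.
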
Syntax Requirements
~~~~~~~~~~~~~~~~~~~

Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas:

.. code-block:: python

    primes = []  # type: List[int]

Whereas this is **NOT** allowed:

.. code-block:: python

    primes: List[int] = []  # not supported in Python 3.5!

Note that function signatures can always be annotated per :pep:`3107`:

.. code-block:: python

    def sum_of_primes(primes: List[int] = []) -> int:
        ...

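Putting both rules together, a function would use regular annotations in its signature and type comments for variables in its body. A minimal sketch, with ``count_by_parity`` as a hypothetical function that does not exist in *pandas*:

.. code-block:: python

    from typing import Dict, List


    def count_by_parity(primes: List[int]) -> Dict[str, int]:
        # Local variables are annotated with a type comment, not PEP 526 syntax.
        counts = {"even": 0, "odd": 0}  # type: Dict[str, int]
        for prime in primes:
            key = "even" if prime % 2 == 0 else "odd"
            counts[key] += 1
        return counts
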
Pandas-specific Types
~~~~~~~~~~~~~~~~~~~~~

Commonly used types specific to *pandas* will appear in `pandas._typing <https://github.com/pandas-dev/pandas/blob/master/pandas/_typing.py>`_ and you should use these where applicable. This module is private for now but ultimately this should be exposed to third-party libraries that want to implement type checking against pandas.

For example, quite a few functions in *pandas* accept a ``dtype`` argument. This can be expressed as a string like ``"object"``, a ``numpy.dtype`` like ``np.int64`` or even a pandas ``ExtensionDtype`` like ``pd.CategoricalDtype``. Rather than burden the user with having to constantly annotate all of those options, this can simply be imported and reused from the ``pandas._typing`` module

.. code-block:: python

    from pandas._typing import Dtype

    def as_type(dtype: Dtype) -> ...:
        ...

This module will ultimately house types for repeatedly used concepts like "path-like", "array-like", "numeric", etc. and can also hold aliases for commonly appearing parameters like ``axis``. Development of this module is active, so be sure to refer to the source for the most up-to-date list of available types.

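For readers unfamiliar with type aliases, the idea behind such a module can be sketched with the standard library alone. The ``FilePath`` alias and ``describe_path`` function below are hypothetical and are not claimed to exist in ``pandas._typing``; they only illustrate defining a ``Union`` once and reusing it in signatures:

.. code-block:: python

    from pathlib import Path
    from typing import Union

    # Hypothetical alias: the accepted ways of naming a file in this sketch.
    FilePath = Union[str, Path]


    def describe_path(path: FilePath) -> str:
        """Return the path as text, whichever form it was given in."""
        return str(path)

Callers can then annotate with ``FilePath`` instead of repeating the full ``Union`` in every signature.
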
Validating Type Hints
~~~~~~~~~~~~~~~~~~~~~

*pandas* uses `mypy <http://mypy-lang.org>`_ to statically analyze the code base and type hints. After making any change you can ensure your type hints are correct by running

.. code-block:: shell

    mypy pandas


.. _contributing.ci:
