-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYP: enable strict_equality to prohibit comparisons of non-overlappin… #30619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -59,6 +59,7 @@ class BooleanDtype(ExtensionDtype): | |||
>>> pd.BooleanDtype() | |||
BooleanDtype | |||
""" | |||
name = "boolean" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed for consistency with IntervalDType and CatergoricalDType. ExtensionDtype subclasses with class attribute for name do not need to override construct_from_string
on master:
>>> import pandas as pd; pd.__version__
'0.26.0.dev0+1554.g0913ed04d'
>>>
>>> pd.core.dtypes.dtypes.CategoricalDtype.name
'category'
>>>
>>> pd.core.dtypes.dtypes.IntervalDtype.name
'interval'
>>>
>>> pd.core.arrays.boolean.BooleanDtype.name
<property object at 0x0000026B6A278E58>
>>>
>>> pd.core.arrays.string_.StringDtype.name
<property object at 0x0000026B6A550908>
>>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do this the other way? I think property is a little nicer than class attributes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
construct_from_string
is a class method, so I think it makes more sense to make name
a class attribute.
(should probably consider making PeriodDType.name and DatetimeTZDtype.name class attributes for consistency)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only downside is non-dynamic names? I don't think we can make PeriodDtype.name
a class attribute, since it depends on the freq.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the static ones this seems fine
@@ -638,7 +638,7 @@ def _sub_period(self, other): | |||
return new_data | |||
|
|||
def _addsub_int_array( | |||
self, other: np.ndarray, op: Callable[[Any], Any], | |||
self, other: np.ndarray, op: Callable[[Any, Any], Any], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
binop needs two arguments
@@ -156,7 +156,7 @@ def _get_cell_value(self, cell, convert_float: bool) -> Scalar: | |||
# GH5394 | |||
cell_value = float(cell.attributes.get((OFFICENS, "value"))) | |||
|
|||
if cell_value == 0.0 and str(cell) != cell_value: # NA handling | |||
if cell_value == 0.0: # NA handling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove code always True
@@ -231,7 +231,7 @@ def __init__( | |||
self, | |||
series: "Series", | |||
buf: Optional[IO[str]] = None, | |||
length: bool = True, | |||
length: Union[bool, str] = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length takes 'truncate'. used by options but not part of public api
@@ -450,7 +450,7 @@ def _get_adjustment() -> TextAdjustment: | |||
|
|||
class TableFormatter: | |||
|
|||
show_dimensions: bool | |||
show_dimensions: Union[bool, str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show_dimensions can also accept "truncate"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool looks good. Some comments
@@ -59,6 +59,7 @@ class BooleanDtype(ExtensionDtype): | |||
>>> pd.BooleanDtype() | |||
BooleanDtype | |||
""" | |||
name = "boolean" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do this the other way? I think property is a little nicer than class attributes
@@ -236,6 +236,10 @@ def construct_from_string(cls, string: str): | |||
""" | |||
if not isinstance(string, str): | |||
raise TypeError(f"Expects a string, got {type(string).__name__}") | |||
|
|||
# error: Non-overlapping equality check (left operand type: "str", right | |||
# operand type: "Callable[[ExtensionDtype], str]") [comparison-overlap] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this use the self._ensure_type ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think that's applicable here.
the assert is to avoid an ignore on the following line (the comparison).
the comparison would still fail strict_equality even if all subclasses override this method/override name attribute. This is because the type checker is checking this method using the name attribute in the base class. i.e. checking the base class isolation.
I think this is related to "Python language's lack of notation for abstract class attributes" see python/mypy#7760
so without the assert an ignore would be required here.
However, the assert prevents creating subclasses that call this method when not using a class attribute for name.
This could instead be enforced by mypy instead of the assert and i believe the implementation of name in the base class should be...
diff --git a/pandas/core/dtypes/base.py b/pandas/core/dtypes/base.py
index 1b4e7062b..5235aac91 100644
--- a/pandas/core/dtypes/base.py
+++ b/pandas/core/dtypes/base.py
@@ -1,6 +1,6 @@
"""Extend pandas with custom array types"""
from typing import Any, List, Optional, Tuple, Type
-
+import abc
import numpy as np
from pandas.errors import AbstractMethodError
@@ -8,7 +8,7 @@ from pandas.errors import AbstractMethodError
from pandas.core.dtypes.generic import ABCDataFrame, ABCIndexClass, ABCSeries
-class ExtensionDtype:
+class ExtensionDtype(metaclass=abc.ABCMeta):
"""
A custom data type, to be paired with an ExtensionArray.
@@ -162,13 +162,15 @@ class ExtensionDtype:
return "O"
@property
- def name(self) -> str:
+ @classmethod
+ @abc.abstractmethod
+ def name(cls) -> str:
"""
A string identifying the data type.
Will be used for display in, e.g. ``Series.dtype``
"""
- raise AbstractMethodError(self)
+ pass
@property
def names(self) -> Optional[List[str]]:
However, mypy would give the following errors...
pandas\core\dtypes\base.py:241: error: Non-overlapping equality check (left operand type: "str", right operand type: "Callable[[], str]")
pandas\core\dtypes\dtypes.py:754: error: Signature of "name" incompatible with supertype "ExtensionDtype"
pandas\core\dtypes\dtypes.py:895: error: Signature of "name" incompatible with supertype "ExtensionDtype"
pandas\tests\extension\test_common.py:44: error: Cannot instantiate abstract class 'DummyDtype' with abstract attribute 'name'
the first is due to python/mypy#7760, the next two are because PeriodDType.name and DatetimeTZDtype.name are not implemented using class attributes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm ex comments of others. generally would be open to stricter settings as you come across them @simonjayhawkins
thanks @simonjayhawkins |
Master seems to be failing now with |
…g types
By default, mypy allows always-false comparisons like 42 == 'no'. Use this flag to prohibit such comparisons of non-overlapping types, and similar identity and container checks