Skip to content

CLN: core.computation, mostly typing #29436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions pandas/core/computation/align.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ def _zip_axes_from_type(typ, new_axes):
return axes


def _any_pandas_objects(terms):
def _any_pandas_objects(terms) -> bool:
"""Check a sequence of terms for instances of PandasObject."""
return any(isinstance(term.value, PandasObject) for term in terms)

Expand Down Expand Up @@ -144,7 +144,8 @@ def _reconstruct_object(typ, obj, axes, dtype):
obj : object
The value to use in the type constructor
axes : dict
The axes to use to construct the resulting pandas object
The axes to use to construct the resulting pandas object.
dtype : numpy dtype

Returns
-------
Expand Down
24 changes: 11 additions & 13 deletions pandas/core/computation/engines.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ def _check_ne_builtin_clash(expr):

Parameters
----------
terms : Term
expr : Term
Terms can contain
"""
names = expr.names
Expand All @@ -46,8 +46,9 @@ def __init__(self, expr):
self.aligned_axes = None
self.result_type = None

def convert(self):
"""Convert an expression for evaluation.
def convert(self) -> str:
"""
Convert an expression for evaluation.

Defaults to return the expression as a string.
"""
Expand Down Expand Up @@ -75,10 +76,9 @@ def evaluate(self):
)

@property
def _is_aligned(self):
def _is_aligned(self) -> bool:
return self.aligned_axes is not None and self.result_type is not None

@abc.abstractmethod
def _evaluate(self):
"""
Return an evaluated expression.
Expand All @@ -93,18 +93,19 @@ def _evaluate(self):
-----
Must be implemented by subclasses.
"""
pass
# mypy complains if we use @abc.abstractmethod, so we do use
# AbstractMethodError instead
from pandas.errors import AbstractMethodError

raise AbstractMethodError(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy can afford additional checking since it has some understanding of abstract classes. so I'm not 100% happy with this, if it sets a precedent for dealing with Cannot instantiate abstract class 'AbstractEngine' with abstract attribute '_evaluate' style mypy errors.

in this case,

_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]] = {
    "numexpr": NumExprEngine,
    "python": PythonEngine,
}

also passes mypy without changes to the code and only type annotations added.

It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.

wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also means we retain the modern Python functionality for dealing with abstract methods instead of homegrown.

The homegrown doesn't really bother me since we use it in a lot of places. But annotations is an area where I'm going to defer to you pretty consistently.

_engines: Dict[str, Type[Union[NumExprEngine, PythonEngine]]]

Is this supported in py35? I rebel at this pattern because I always think "mypy should be able to figure this out" (which is only true if this were some kind of frozen_dict, but still).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supported in py35?

#29212 seems to have gone stale. so will need to use the py3.5 syntax

I rebel at this pattern because I always think "mypy should be able to figure this out"

It is required in a lot of places and in this case on my branch I currently have

_engines: Dict[str, Type[AbstractEngine]] = {
    "numexpr": NumExprEngine,
    "python": PythonEngine,
}

but can't remember why it was needed, but it looks like it will need to be updated to account for the abstract method if not done in this PR.



class NumExprEngine(AbstractEngine):
"""NumExpr engine class"""

has_neg_frac = True

def __init__(self, expr):
super().__init__(expr)

def convert(self):
def convert(self) -> str:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could also remove this method since super().convert returns a str. could this be a remnant of py2 compat?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll double-check. I tried removing all of the encoding kwargs and found that io.pytables still relies on them.

return str(super().convert())

def _evaluate(self):
Expand Down Expand Up @@ -137,9 +138,6 @@ class PythonEngine(AbstractEngine):

has_neg_frac = False

def __init__(self, expr):
super().__init__(expr)

def evaluate(self):
return self.expr()

Expand Down
23 changes: 11 additions & 12 deletions pandas/core/computation/eval.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@
from pandas.io.formats.printing import pprint_thing


def _check_engine(engine):
def _check_engine(engine) -> str:
"""
Make sure a valid engine is passed.

Expand Down Expand Up @@ -64,7 +64,7 @@ def _check_engine(engine):
return engine


def _check_parser(parser):
def _check_parser(parser: str):
"""
Make sure a valid parser is passed.

Expand Down Expand Up @@ -97,14 +97,13 @@ def _check_resolvers(resolvers):
)


def _check_expression(expr):
def _check_expression(expr: str):
"""
Make sure an expression is not an empty string

Parameters
----------
expr : object
An object that can be converted to a string
expr : str

Raises
------
Expand All @@ -115,7 +114,7 @@ def _check_expression(expr):
raise ValueError("expr cannot be an empty string")


def _convert_expression(expr):
def _convert_expression(expr) -> str:
"""
Convert an object to an expression.

Expand Down Expand Up @@ -144,7 +143,7 @@ def _convert_expression(expr):
return s


def _check_for_locals(expr, stack_level, parser):
def _check_for_locals(expr, stack_level: int, parser):
from pandas.core.computation.expr import tokenize_string

at_top_of_stack = stack_level == 0
Expand All @@ -168,15 +167,15 @@ def _check_for_locals(expr, stack_level, parser):

def eval(
expr,
parser="pandas",
parser: str = "pandas",
engine=None,
truediv=True,
local_dict=None,
global_dict=None,
resolvers=(),
level=0,
level: int = 0,
target=None,
inplace=False,
inplace: bool = False,
):
"""
Evaluate a Python expression as a string using various backends.
Expand All @@ -192,7 +191,7 @@ def eval(

Parameters
----------
expr : str or unicode
expr : str
The expression to evaluate. This string cannot contain any Python
`statements
<https://docs.python.org/3/reference/simple_stmts.html#simple-statements>`__,
Expand Down Expand Up @@ -232,7 +231,7 @@ def eval(
``DataFrame.index`` and ``DataFrame.columns``
variables that refer to their respective :class:`~pandas.DataFrame`
instance attributes.
level : int, optional
level : int, default 0
The number of prior stack frames to traverse and add to the current
scope. Most users will **not** need to change this parameter.
target : object, optional, default None
Expand Down
43 changes: 29 additions & 14 deletions pandas/core/computation/expr.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
import itertools as it
import operator
import tokenize
from typing import Type
from typing import Optional, Type

import numpy as np

Expand Down Expand Up @@ -40,7 +40,7 @@
import pandas.io.formats.printing as printing


def tokenize_string(source):
def tokenize_string(source: str):
"""
Tokenize a Python source code string.

Expand Down Expand Up @@ -68,7 +68,8 @@ def tokenize_string(source):


def _rewrite_assign(tok):
"""Rewrite the assignment operator for PyTables expressions that use ``=``
"""
Rewrite the assignment operator for PyTables expressions that use ``=``
as a substitute for ``==``.

Parameters
Expand All @@ -86,7 +87,8 @@ def _rewrite_assign(tok):


def _replace_booleans(tok):
"""Replace ``&`` with ``and`` and ``|`` with ``or`` so that bitwise
"""
Replace ``&`` with ``and`` and ``|`` with ``or`` so that bitwise
precedence is changed to boolean precedence.

Parameters
Expand All @@ -110,7 +112,8 @@ def _replace_booleans(tok):


def _replace_locals(tok):
"""Replace local variables with a syntactically valid name.
"""
Replace local variables with a syntactically valid name.

Parameters
----------
Expand All @@ -135,7 +138,8 @@ def _replace_locals(tok):


def _clean_spaces_backtick_quoted_names(tok):
"""Clean up a column name if surrounded by backticks.
"""
Clean up a column name if surrounded by backticks.

Backtick quoted string are indicated by a certain tokval value. If a string
is a backtick quoted token it will processed by
Expand Down Expand Up @@ -303,7 +307,8 @@ def f(self, *args, **kwargs):


def disallow(nodes):
"""Decorator to disallow certain nodes from parsing. Raises a
"""
Decorator to disallow certain nodes from parsing. Raises a
NotImplementedError instead.

Returns
Expand All @@ -324,16 +329,17 @@ def disallowed(cls):


def _op_maker(op_class, op_symbol):
"""Return a function to create an op class with its symbol already passed.
"""
Return a function to create an op class with its symbol already passed.

Returns
-------
f : callable
"""

def f(self, node, *args, **kwargs):
"""Return a partial function with an Op subclass with an operator
already passed.
"""
Return a partial function with an Op subclass with an operator already passed.

Returns
-------
Expand Down Expand Up @@ -813,18 +819,27 @@ class Expr:
parser : str, optional, default 'pandas'
env : Scope, optional, default None
truediv : bool, optional, default True
level : int, optional, default 2
level : int, optional, default 0
"""

def __init__(
self, expr, engine="numexpr", parser="pandas", env=None, truediv=True, level=0
self,
expr,
engine: str = "numexpr",
parser: str = "pandas",
env=None,
truediv: bool = True,
level: int = 0,
):
self.expr = expr
self.env = env or Scope(level=level + 1)
self.engine = engine
self.parser = parser
self.env.scope["truediv"] = truediv
self._visitor = _parsers[parser](self.env, self.engine, self.parser)
self._visitor = _parsers[parser](
self.env, self.engine, self.parser
) # type: Optional[BaseExprVisitor]
assert isinstance(self._visitor, BaseExprVisitor), type(self._visitor)
Comment on lines +841 to +842
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the type should be Optional or that the type annotation should be added here.

adding type annotations directly to _parsers...

_parsers: Dict[str, Type[BaseExprVisitor]] = {
    "python": PythonExprVisitor,
    "pandas": PandasExprVisitor,
}

negates the need for the assert, but does mean that the type annotations added later in this PR in pandas/core/computation/pytables.py need to be revisted.

self.terms = self.parse()

@property
Expand All @@ -837,7 +852,7 @@ def __call__(self):
def __repr__(self) -> str:
return printing.pprint_thing(self.terms)

def __len__(self):
def __len__(self) -> int:
return len(self.expr)

def parse(self):
Expand Down
2 changes: 1 addition & 1 deletion pandas/core/computation/expressions.py
Original file line number Diff line number Diff line change
Expand Up @@ -157,7 +157,7 @@ def _where_numexpr(cond, a, b):
set_use_numexpr(get_option("compute.use_numexpr"))


def _has_bool_dtype(x):
def _has_bool_dtype(x) -> bool:
if isinstance(x, ABCDataFrame):
return "bool" in x.dtypes
try:
Expand Down
Loading