Skip to content

[TYPING] pandas.io.formats.format._binify annotation is inconsistent with DataFrameFormatter.__init__ #28843

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zero323 opened this issue Oct 8, 2019 · 13 comments · Fixed by #28908
Labels
Typing type annotations, mypy/pyright type checking
Milestone

Comments

@zero323
Copy link

zero323 commented Oct 8, 2019

Problem description

pd.io.formats.format._binify defines line_width as Union[np.int32, int]:

def _binify(cols: List[np.int32], line_width: Union[np.int32, int]) -> List[int]:

however it is inconsistent with DataFrameFormatter.__init__

class DataFrameFormatter(TableFormatter):

and DataFrameFormatter._join_multiline

col_bins = _binify(col_widths, lwidth)

which take and pass as-is Optional[int].

This leads to Mypy error:

pandas/io/formats/format.py:871: error: Argument 2 to "_binify" has incompatible type "Optional[int]"; expected "Union[int32, int]"

Providing default in _join_multiline, i.e.

lwidth = self.line_width or DEFAULT_LINE_WIDTH

could be reasonable fix. One could also ignore the call:

col_bins = _binify(col_widths, lwidth)  # type: ignore

but that leaves possible correctness problem.

Expected Output

Passing Mypy type check

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : bee17d5f4c99e6d1e451cf9a7b6a1780aa25988d
python           : 3.7.3.final.0
...

Additionally

mypy==0.730

CC @WillAyd

@WillAyd
Copy link
Member

WillAyd commented Oct 8, 2019

Cool thanks for taking a look! @simonjayhawkins as well

Does the function actually fail if line_width is None?

@simonjayhawkins
Copy link
Member

Does the function actually fail if line_width is None?

if i recall these types were from using MonkeyType. so if line_width is typed as Union[np.int32, int], then we have no tests that pass None.

@zero323
Copy link
Author

zero323 commented Oct 8, 2019

Does the function actually fail if line_width is None?

Haven't tested it, but I think it is safe to safe that it does fail on >

wrap = curr_width + 1 > line_width and i > 0

and

wrap = curr_width + 2 > line_width and i > 0

@WillAyd
Copy link
Member

WillAyd commented Oct 8, 2019

Does np.int32 actually resolve to anything? The following yield Any for me:

import numpy as np
a_np_int = np.int32(32)
reveal_type(a_np_int)

Wonder if we shouldn't just defer dealing with numpy types until they are actually annotated. Could just simplify this signature to be int for now

@WillAyd WillAyd added the Typing type annotations, mypy/pyright type checking label Oct 8, 2019
@simonjayhawkins
Copy link
Member

@WillAyd that's probably unrelated, but yes, all np.int to be replaced with python int is subsequent pass.

@simonjayhawkins
Copy link
Member

i'm using 0.730, but not getting that particular mypy error show up yet on my branch.

since there are many mypy errors, probably best to determine if it is an actual bug in the code, since we have mypy on the ci, and for the current options and version we have a passing build.

typing errors will be resolved as more types are added and more strictness flags are enabled.

@zero323
Copy link
Author

zero323 commented Oct 8, 2019

i'm using 0.730, but not getting that particular mypy error show up yet on my branch.

Interesting... I downgraded Mypy to 0.730 and still get this one. Do you use any specific flags?

@simonjayhawkins
Copy link
Member

Do you use any specific flags?

see #28339

@simonjayhawkins
Copy link
Member

changing to def _binify(cols: List[int], line_width: int) -> List[int]:

and I get pandas\io\formats\format.py:873:40: error: Argument 2 to "_binify" has incompatible type "Optional[int]"; expected "int" [arg-type]

so the Any is swallowing the None.

@zero323 do you have numpy stubs in your environment?

@simonjayhawkins
Copy link
Member

@WillAyd that's probably unrelated, but yes, all np.int to be replaced with python int is subsequent pass.

i was wrong.

@zero323
Copy link
Author

zero323 commented Oct 8, 2019

@zero323 do you have numpy stubs in your environment?

Yes, I do.

Edit:

And once removed this error disappears hidden by some missing import.

@simonjayhawkins
Copy link
Member

so I guess that adding stubs for all the missing imports becomes more important with PEP 561 compatibility since other users could have stubs for the other packages.

@zero323
Copy link
Author

zero323 commented Oct 8, 2019

From that point of view the problem with PEP 561 is that:

Stubs or Python source manually put in the beginning of the path. Type checkers SHOULD provide this to allow the user complete control of which stubs to use, and to patch broken stubs/inline types from packages. In mypy the $MYPYPATH environment variable can be used for this.

So effective outcome is rather hard to predict.

Not to mention that all signs show that mature stubs for scientific stack are nowhere near.

One possible workaround is to keep dynamic mock stubs that cover the part of the 3rd party API that you're interested in, and use these for internal testing. I had some luck with such approach in pyspark-stubs (mocking Pandas), but exposure there is relatively small.

One way or another ignoring missing imports and type: ignore are nuclear options and it is really hard to evaluate their real impact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants