Skip to content

TYP: Improve clip() return and argument typing. #40860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 11, 2021

Conversation

DriesSchaumont
Copy link
Member

@DriesSchaumont DriesSchaumont commented Apr 9, 2021

@DriesSchaumont DriesSchaumont marked this pull request as draft April 9, 2021 20:58
@DriesSchaumont DriesSchaumont marked this pull request as ready for review April 9, 2021 22:22
@MarcoGorelli MarcoGorelli self-requested a review April 10, 2021 09:04
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DriesSchaumont

This is off to a good start!

You have overloads for:

  • inplace: Literal[False] = ...
  • inplace: bool = ...
  • inplace: Literal[True]:
    • when lower, upper, and axis are passed
    • when lower and upper, but not axis, are passed

You're missing:

  • inplace: Literal[True]:
    • when lower and axis, but not upper, are passed
    • when upper and axis, but not lower, are passed
    • when lower, but not axis or upper, is passed
    • when upper, but not axis or lower, is passed
    • when axis, but not lower or upper, is passed
    • when none of axis, lower, or upper are passed

@DriesSchaumont
Copy link
Member Author

DriesSchaumont commented Apr 10, 2021

Hi @DriesSchaumont

This is off to a good start!

You have overloads for:

  • white_check_mark inplace: Literal[False] = ...

  • white_check_mark inplace: bool = ...

  • white_check_mark inplace: Literal[True]:

    • when lower, upper, and axis are passed
    • when lower and upper, but not axis, are passed

You're missing:

  • inplace: Literal[True]:

    • when lower and axis, but not upper, are passed
    • when upper and axis, but not lower, are passed
    • when lower, but not axis or upper, is passed
    • when upper, but not axis or lower, is passed
    • when axis, but not lower or upper, is passed
    • when none of axis, lower, or upper are passed

Thank you for this nice overview! I will try to add these cases. I am a bit confused on how to handle the syntax for: "when upper, but not axis or lower, is passed." Is this:

def clip(self: FrameOrSeries, lower=None, upper=..., *, inplace: Literal[True], **kwargs) -> None: correct? How do I indicate that lower is not specified?

@MarcoGorelli
Copy link
Member

can you try

    @overload
    def clip(
        self: FrameOrSeries,
        *
        upper,
        inplace: Literal[True],
        *args,
        **kwargs,
    ) -> None:
        ...

?

@DriesSchaumont
Copy link
Member Author

can you try

    @overload
    def clip(
        self: FrameOrSeries,
        *
        upper,
        inplace: Literal[True],
        *args,
        **kwargs,
    ) -> None:
        ...

?

Thanks! I think I got them all now. Could you check if I did everything right? I am quite new to type overloading.

@MarcoGorelli
Copy link
Member

Thanks! I think I got them all now. Could you check if I did everything right? I am quite new to type overloading.

Congrats for having braved this not-so-simple issue then!

To check if you've done it correctly, could you make a new file t.py (which you won't stage/commit, it'll just be there locally to check) in which you do something like what I did in #40737 (comment) ? You'll need to replace reset_index with clip, and you'll need to try with the various options of arguments which precede inplace being preset or absent, and then you should check that running mypy t.py produces the types which you were expecting

If you could do that, and paste here the content of t.py as the output from running mypy t.py, then if it all looks right then this'll be good to go

@MarcoGorelli MarcoGorelli added the Typing type annotations, mypy/pyright type checking label Apr 10, 2021
@MarcoGorelli MarcoGorelli added this to the 1.3 milestone Apr 10, 2021
@MarcoGorelli
Copy link
Member

I think the last one are correct - if inplace is just bool, then we don't know if the return type is DataFrame or None, so their union seems correct

If you repeat the first block but put inplace=False, then I'd expect you'll get all DataFrame only

@DriesSchaumont
Copy link
Member Author

Yes, just as I left my last comment (which I then deleted), I realized my mistake. Here is the file and the output!

import pandas as pd

inplace: bool

reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=4, axis=1, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(axis=1, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, axis=1, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=1, inplace=True))


reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=4, axis=1, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(axis=1, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, axis=1, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=0, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=False))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=1, inplace=False))

reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=4, axis=1, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(axis=1, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, axis=1, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=0, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(upper=4, axis=1, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).clip(lower=0, upper=1, inplace=inplace))
t.py:6: note: Revealed type is 'None'
t.py:7: note: Revealed type is 'None'
t.py:8: note: Revealed type is 'None'
t.py:9: note: Revealed type is 'None'
t.py:10: note: Revealed type is 'None'
t.py:11: note: Revealed type is 'None'
t.py:12: note: Revealed type is 'None'
t.py:13: note: Revealed type is 'None'
t.py:14: note: Revealed type is 'None'
t.py:17: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:18: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:19: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:20: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:21: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:22: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:23: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:24: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:25: note: Revealed type is 'pandas.core.frame.DataFrame*'
t.py:27: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:28: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:29: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:30: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:31: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:32: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:33: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:34: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'
t.py:35: note: Revealed type is 'Union[pandas.core.frame.DataFrame*, None]'

@DriesSchaumont
Copy link
Member Author

@MarcoGorelli thank you for the guidance and the explanations!

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one, thanks @DriesSchaumont ! Just left a small comment about how you've typed axis in some overloads but not in others, other than that this looks good to me (cc @simonjayhawkins in case you have further comments)

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

self: FrameOrSeries,
lower=...,
upper=...,
axis: Axis | None = ...,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

users of the pandas api should not pass None.

axis: int or str axis name, optional

for the default (should be listed in the docstring instead of optional - not for this PR though), the argument should not be passed.

@MarcoGorelli what are you thoughts about not including None in the overloads and also now that we are using PEP 604, removing no_implicit_optional = True from setup.cfg and excluding None actual function signatures too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @simonjayhawkins

+1 on removing no_implicit_optional

Not sure what you mean about not including None in overloads - if axis=None is passed internally, then won't it need to be in one of the overloads for type checking to work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is (in general) should None ever be passed internally or should kwargs always be (in general) validated at the top level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we have had issues reported where None is passed by users and when we change defaults (I think bool True/False in a specific case I remember recently) this is results in a perceived regression. We also sometime use lib.nodefault to detect if a kwarg has been specified. So I was thinking to that discouraging users passing None could be worthwhile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins I looked into the no_implicit_optional = True, but it seems it's only possible to omit if you only have two possible types.

e.g. the following works:

from __future__ import annotations
from typing import Tuple

def foo(
    a: str | int | None  = None,
    b: str = None,    
): ...

but the following doesn't:

from __future__ import annotations
from typing import Tuple

def foo(
    a: str | int  = None,
    b: str = None,    
): ...

, we get

t.py:5: error: Incompatible default for argument "a" (default has type "None", argument has type "Union[str, int]")  [assignment]
Found 1 error in 1 file (checked 1 source file)

So I think it's cleaerer to keep no_implicit_optional

@MarcoGorelli MarcoGorelli merged commit 9ab55b4 into pandas-dev:master Apr 11, 2021
@DriesSchaumont DriesSchaumont deleted the fix-40829 branch April 11, 2021 11:44
@MarcoGorelli MarcoGorelli mentioned this pull request Apr 12, 2021
3 tasks
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
* Improve clip() typing.

* Update typing.

* More clip overloads.

* Add missing axis types
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TYP return type of DataFrame.clip not correct
3 participants