Skip to content

TYP overload fillna #40737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
MarcoGorelli opened this issue Apr 1, 2021 · 11 comments · Fixed by #40887
Closed

TYP overload fillna #40737

MarcoGorelli opened this issue Apr 1, 2021 · 11 comments · Fixed by #40887
Assignees
Labels
good first issue Typing type annotations, mypy/pyright type checking

Comments

@MarcoGorelli
Copy link
Member

fillna has a different return type according to whether inplace is True or False. We could overload it to provide a better typing experience

See #40197 for a very similar issue

Task here is:

  1. read through TYP: use overload to refine return type of set_axis #40197, make sure you understand it
  2. try doing the same for fillna
  3. check it works. See the above PR for an example, but it should be something like the following (but for fillna instead of reset_index - some of the other arguments probably need changing too)
# t.py

import pandas as pd

inplace: bool

reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=False))
reveal_type(pd.DataFrame([1,2,3]).reset_index(drop=True, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, drop=False, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(drop=True, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, drop=False, inplace=inplace))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=inplace, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=inplace))

output:

$ mypy t.py 
t.py:5: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:6: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:7: note: Revealed type is 'None'
t.py:8: note: Revealed type is 'None'
t.py:9: note: Revealed type is 'None'
t.py:10: note: Revealed type is 'None'
t.py:11: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:12: note: Revealed type is 'None'
t.py:13: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:14: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:15: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:16: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:17: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'
t.py:18: note: Revealed type is 'Union[pandas.core.frame.DataFrame, None]'

I've marked this as 'good first issue' as it doesn't require any knowledge of the pandas core codebase, though it's probably not the easiest if you're not at least somewhat familiar with mypy / typing

@MarcoGorelli MarcoGorelli added good first issue Typing type annotations, mypy/pyright type checking labels Apr 1, 2021
@MarcoGorelli
Copy link
Member Author

This issue would unblock some of the typing difficulties in #38594

@LarWong
Copy link
Contributor

LarWong commented Apr 4, 2021

take

@LarWong
Copy link
Contributor

LarWong commented Apr 5, 2021

@MarcoGorelli Hi! I'm a first-time contributor, and this is my first attempt at an issue. So forgive me if this takes a few days to do while I get used to this codebase and to the rules associated with this project. Thanks!

@LarWong
Copy link
Contributor

LarWong commented Apr 6, 2021

Hi @MarcoGorelli. After reading #40197 and the pandas documentation, I am under the impression that I would need to add overloads to the following files: pandas/core/frame.py, pandas/core/series.py, and pandas/core/generic.py. Is this correct?

@MarcoGorelli
Copy link
Member Author

Hi @LarWong - yup, that's right! This one's a bit more complicated, as there's some extra arguments, but please do ping me if you get stuck

@LarWong
Copy link
Contributor

LarWong commented Apr 6, 2021

@MarcoGorelli Will do!

@LarWong
Copy link
Contributor

LarWong commented Apr 7, 2021

@AlPalll I am still working on this.

@LarWong
Copy link
Contributor

LarWong commented Apr 11, 2021

@MarcoGorelli Hi. I'm getting a redundant cast error after adding overloads to series.py.
This is the error I'm getting:

pandas/core/strings/object_array.py:374: error: Redundant cast to "Series"  [redundant-cast]
pandas/core/strings/object_array.py:376: error: Redundant cast to "Series"  [redundant-cast]

and this the part of the code that the warning refers to:

370: arr = Series(self).fillna("")
371: try:
372:     arr = sep + arr + sep
373: except TypeError:
374:     arr = cast(Series, arr)
375:     arr = sep + arr.astype(str) + sep
376:  arr = cast(Series, arr)

I am new to Python typing, so I am a bit confused on how to resolve this. Thanks

LarWong pushed a commit to LarWong/pandas that referenced this issue Apr 11, 2021
@AlPalll AlPalll removed their assignment Apr 11, 2021
@LarWong
Copy link
Contributor

LarWong commented Apr 12, 2021

@MarcoGorelli Hi. I'm getting a redundant cast error after adding overloads to series.py.
This is the error I'm getting:

pandas/core/strings/object_array.py:374: error: Redundant cast to "Series"  [redundant-cast]
pandas/core/strings/object_array.py:376: error: Redundant cast to "Series"  [redundant-cast]

and this the part of the code that the warning refers to:

370: arr = Series(self).fillna("")
371: try:
372:     arr = sep + arr + sep
373: except TypeError:
374:     arr = cast(Series, arr)
375:     arr = sep + arr.astype(str) + sep
376:  arr = cast(Series, arr)

I am new to Python typing, so I am a bit confused on how to resolve this. Thanks

It seems like the following overload caused the error above:

    @overload
    def fillna(
        self,
        value: Scalar | dict | Series | DataFrame | None = ...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: Literal[False] = ...,
        limit: int | None = ...,
        downcast: dict | str | None = ...,
    ) -> Series:
        ...

Adding None to the return type seems to have solved the issue above:

    @overload
    def fillna(
        self,
        value: Scalar | dict | Series | DataFrame | None = ...,
        method: str | None = ...,
        axis: Axis | None = ...,
        inplace: Literal[False] = ...,
        limit: int | None = ...,
        downcast: dict | str | None = ...,
    ) -> Series | None:
        ...

However, this defeats the purpose of the overload signature.

@MarcoGorelli
Copy link
Member Author

Hi @LarWong ,

Now that mypy can correctly infer the type of arr = Series(self).fillna("") (because of your overloads), then arr = cast(Series, arr) should no longer be necessary. I think it should be possible to just remove that line - could you try that please?

@LarWong
Copy link
Contributor

LarWong commented Apr 12, 2021

@MarcoGorelli Hi, I removed the cast lines, and the errors have been resolved. Thanks!

MarcoGorelli added a commit that referenced this issue Apr 15, 2021
* TYP: Added overloads for fillna() in frame.py and series.py

* TYP: Added overloads for fillna() in frame.py and series.py #40737

* TYP: Added fillna() overloads to generic.py #40727

* TYP: removed generic overloads #40737

* fixed redundant cast error

* reverting prior changes

* remove cast again

* removed unnecessary overloads in frame.py and series.py

* fixed overloads

* reverted value typing

* remove extra types (lets keep this to overloads)

Co-authored-by: Marco Gorelli <[email protected]>
yeshsurya pushed a commit to yeshsurya/pandas that referenced this issue Apr 21, 2021
* TYP: Added overloads for fillna() in frame.py and series.py

* TYP: Added overloads for fillna() in frame.py and series.py pandas-dev#40737

* TYP: Added fillna() overloads to generic.py pandas-dev#40727

* TYP: removed generic overloads pandas-dev#40737

* fixed redundant cast error

* reverting prior changes

* remove cast again

* removed unnecessary overloads in frame.py and series.py

* fixed overloads

* reverted value typing

* remove extra types (lets keep this to overloads)

Co-authored-by: Marco Gorelli <[email protected]>
yeshsurya pushed a commit to yeshsurya/pandas that referenced this issue May 6, 2021
* TYP: Added overloads for fillna() in frame.py and series.py

* TYP: Added overloads for fillna() in frame.py and series.py pandas-dev#40737

* TYP: Added fillna() overloads to generic.py pandas-dev#40727

* TYP: removed generic overloads pandas-dev#40737

* fixed redundant cast error

* reverting prior changes

* remove cast again

* removed unnecessary overloads in frame.py and series.py

* fixed overloads

* reverted value typing

* remove extra types (lets keep this to overloads)

Co-authored-by: Marco Gorelli <[email protected]>
JulianWgs pushed a commit to JulianWgs/pandas that referenced this issue Jul 3, 2021
* TYP: Added overloads for fillna() in frame.py and series.py

* TYP: Added overloads for fillna() in frame.py and series.py pandas-dev#40737

* TYP: Added fillna() overloads to generic.py pandas-dev#40727

* TYP: removed generic overloads pandas-dev#40737

* fixed redundant cast error

* reverting prior changes

* remove cast again

* removed unnecessary overloads in frame.py and series.py

* fixed overloads

* reverted value typing

* remove extra types (lets keep this to overloads)

Co-authored-by: Marco Gorelli <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants