Skip to content

TYP fixup overloads for reset_index #40200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 10, 2021

Conversation

MarcoGorelli
Copy link
Member

@MarcoGorelli MarcoGorelli commented Mar 3, 2021

  • Ensure all linting tests pass, see here for how to run them

Took me longer than I'm proud to admit, but I think I've finally understood the deal with overload and default types.

We need one overload for the default case (here, inplace=False), and then an extra overload of each combination of keyword arguments which precede inplace.

So here, we need:

  • one case for the defaults
  • one case for when there all all arguments and inplace=True
  • one case for when there's no level, and inplace=True
  • one case for when there's no drop, and inplace=True
  • one case for when there's no level, no drop, and inplace=True

Then, mypy will be able to correctly infer the return type for reset_index based on the value of inplace.

This kind of thing will help resolve the mypy issues seen in #38594, where the return type from fillna couldn't be determined. It also means not needing to add the ignore error from #37137


Some examples:

# t.py

import pandas as pd

reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=False))
reveal_type(pd.DataFrame([1,2,3]).reset_index(drop=True, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, drop=False, inplace=True))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, inplace=True, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(level=0, col_fill='a'))
reveal_type(pd.DataFrame([1,2,3]).reset_index(inplace=True))
$ mypy t.py 
t.py:3: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:4: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:5: note: Revealed type is 'None'
t.py:6: note: Revealed type is 'None'
t.py:7: note: Revealed type is 'None'
t.py:8: note: Revealed type is 'None'
t.py:9: note: Revealed type is 'pandas.core.frame.DataFrame'
t.py:10: note: Revealed type is 'None'

@simonjayhawkins
Copy link
Member

MarcoGorelli requested a review from simonjayhawkins 15 minutes ago

watch this space!

@simonjayhawkins
Copy link
Member

MarcoGorelli requested a review from simonjayhawkins 15 minutes ago

watch this space!

see #40202

@gfyoung gfyoung added the Typing type annotations, mypy/pyright type checking label Mar 4, 2021
...

def reset_index(
self, level=None, drop=False, inplace=False, col_level=0, col_fill=""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing the type annotations from the function means that we lose the type checking for the body of the function.

so if you reveal_type on say level you will get Revealed type is 'Any'

IIUC correctly, the type annotations on the function are used to check the function itself, while the overloads are used when checking other functions that call this one.

so both are needed.

col_level: Hashable = ...,
col_fill: Hashable = ...,
) -> None:
...
Copy link
Member

@simonjayhawkins simonjayhawkins Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the last overload needed now? won't it fall back to the function definition if no overloads match?

updated: comment should have been on the next one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure it's needed, see here for a toy example (uncomment the commented-out part and it'll pass)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting out the last overload leads to the same errors observed on master in the typing test from #40202 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, and Revealed type is 'Any'. I thought that it fell back to the non-overloaded function with no matching overloads. maybe something changed in mypy or only does so under different conditions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only does so under different conditions

If I've understood correctly, this is it - see #39501 (comment)

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MarcoGorelli it would be great to have some testing in place before merging this, not because there could be anything wrong with the additions, but because susequent changes could easily break things. but we are where we are, so merge when ready.

@MarcoGorelli
Copy link
Member Author

Thanks - OK, merging then as I'm not sure how long it'll take for #40202 to be ready, have opened #40349 to make sure that adding tests for this isn't forgotten

@MarcoGorelli MarcoGorelli added this to the 1.3 milestone Mar 10, 2021
@MarcoGorelli MarcoGorelli merged commit 0e6a3b8 into pandas-dev:master Mar 10, 2021
@MarcoGorelli MarcoGorelli deleted the fixup-reset-index branch March 10, 2021 11:26
@simonjayhawkins
Copy link
Member

have opened #40349 to make sure that adding tests for this isn't forgotten

great. we can add the reveals from your comments here to the tests in #40202

@simonjayhawkins
Copy link
Member

just one more thing!

these extra overloads are needed because of an issue with mypy. we have now lost that link to the issue.

although the mypy issue was initially dismissed, it appears that there is acknowledgement that the workaround can becomes impractical with many arguments. There therefore may be a fix on the mypy side in the future. We somehow need to track that issue.

jbrockmendel pushed a commit to jbrockmendel/pandas that referenced this pull request Mar 11, 2021
* fixup overloads for reset_index

* another overload

* add defaults for bool case

* add types back to function signature
@MarcoGorelli
Copy link
Member Author

The (*, level, inplace) overload should probably have been (level, *, inplace), but I'll check tomorrow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants