TYP: level parameter of DataFrame.groupby has incorrect type hint (should accept sequence) #47548

Closed
3 tasks done
sebwills opened this issue Jun 29, 2022 · 3 comments · Fixed by #47560
Labels
Bug; Needs Triage (Issue that has not been reviewed by a pandas team member); Typing (type annotations, mypy/pyright type checking)

Comments

@sebwills

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=['a', 'b', 'c', 'd']).set_index(['a', 'b', 'c'])

print(df.groupby(level=[0, 1]).sum())

Issue Description

The above example runs, demonstrating that the level argument to DataFrame.groupby can usefully take a sequence.

This is backed up by the documentation at https://github.com/pandas-dev/pandas/blob/main/pandas/core/shared_docs.py#L106 which states a sequence can be passed in.

But the type annotation on this argument does not accept a sequence:

        level: IndexLabel | None = None,

https://github.com/pandas-dev/pandas/blob/main/pandas/core/groupby/groupby.py#L900

And indeed this causes mypy to complain about the above repro snippet:

mypy report.py
report.py:4: error: Argument "level" to "groupby" of "DataFrame" has incompatible type "List[int]"; expected "Union[Hashable, int, None]"  [arg-type]

Expected Behavior

Type annotation should probably accept a Sequence[IndexLabel] too?
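
One way to picture the proposed widening is a small stand-alone sketch. The alias below copies the `IndexLabel` definition quoted later in this thread; `normalize_levels` is a hypothetical helper written only for illustration and is not part of pandas.

```python
from typing import Hashable, List, Optional, Sequence, Union

# Alias mirroring pandas' IndexLabel; redefined here so the sketch runs without pandas.
IndexLabel = Union[Hashable, Sequence[Hashable]]


def normalize_levels(level: Optional[IndexLabel] = None) -> List[Hashable]:
    """Hypothetical helper: return the requested levels as a list.

    Treating a list as "several levels" and anything else as a single
    level is an illustrative choice, not a description of pandas internals.
    """
    if level is None:
        return []
    if isinstance(level, list):
        return list(level)
    return [level]


print(normalize_levels([0, 1]))  # [0, 1]
print(normalize_levels("a"))     # ['a']
```

With an annotation of this shape, a call such as `normalize_levels([0, 1])` matches the `Sequence[Hashable]` branch of the union, which is the case the repro snippet needs.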

Installed Versions

INSTALLED VERSIONS

commit : e8093ba
python : 3.9.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United Kingdom.1252
pandas : 1.4.3
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 58.1.0
pip : 21.2.4
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.9.3
jinja2 : 3.1.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : 1.0.9
fastparquet : None
fsspec : None
gcsfs : None
markupsafe : 2.1.1
matplotlib : 3.5.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : 1.4.36
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None

@sebwills added the Bug and Needs Triage labels on Jun 29, 2022
@sebwills
Author

@twoertwein this may be of interest to you?

@twoertwein added the Typing label on Jun 29, 2022
@twoertwein
Member

The annotation on main already accepts any Sequence:

level: IndexLabel | None = None,

IndexLabel = Union[Hashable, Sequence[Hashable]]

but the issue is that the groupby method on DataFrame (and probably Series) uses a narrower type annotation:

level: Level | None = None,
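
The gap between the two aliases can be illustrated at runtime with a rough stand-in for what the type checker does. This assumes `Level` is `Union[Hashable, int]`, which matches the `Union[Hashable, int, None]` in the mypy error above; the two `fits_*` functions are hypothetical helpers, not pandas code.

```python
from collections.abc import Hashable, Sequence


def fits_level(value: object) -> bool:
    """Rough runtime analogue of ``Level | None``: None or one hashable label."""
    return value is None or isinstance(value, Hashable)


def fits_index_label(value: object) -> bool:
    """Rough runtime analogue of ``IndexLabel | None``.

    Accepts everything ``fits_level`` accepts, plus a sequence whose
    items are all hashable.
    """
    if fits_level(value):
        return True
    return isinstance(value, Sequence) and all(
        isinstance(item, Hashable) for item in value
    )


print(fits_level([0, 1]))        # False: a list is not hashable
print(fits_index_label([0, 1]))  # True: a sequence of hashable items
print(fits_level(0))             # True: a single label
```

A tuple such as `(0, 1)` is itself hashable, so it should satisfy even the narrower annotation; it is the list form from the documentation that gets rejected.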

@twoertwein
Member

@sebwills If pandas-stubs has the same/similar issue, feel free to open a PR/issue there :)
