ENH: Add dtype-support for pandas' type-hinting #34248
Comments
Thanks @erezinman for the report. Adding this would certainly be easier with stubs, but we have not yet reached a decision on stubs, xref #28142. To do this with type annotations in the code, we would need to use typing.Generic. Thanks for the link. I suspect we would follow NumPy conventions for the type parameters. https://github.com/numpy/numpy-stubs
In pandas we would also need to allow Series to be backed by pandas Extension Arrays for the values and index, so I suspect the type parameters would be NumPy/pandas extension dtypes rather than Python types, maybe something like pd.Series[pd.Int64Dtype(), pd.StringDtype()]. This would be cumbersome, so having pd.Series[int, str] represent the same thing would certainly be welcome from a user perspective. Further investigation and PRs welcome.
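To make the shorthand idea concrete, here is a minimal sketch of how Python types might be normalized to pandas extension dtypes. The function `normalize_type_params` and the mapping table are hypothetical and not part of pandas or pandas-stubs; the only pandas-specific facts assumed are the string aliases "Int64", "string", "Float64" and "boolean" for the nullable extension dtypes.

```python
from typing import Any, Dict, Tuple

# Hypothetical lookup table: Python type -> string alias of a pandas
# nullable extension dtype. The aliases are real pandas dtype names;
# the table itself is only an illustration.
PYTHON_TO_DTYPE_ALIAS: Dict[type, str] = {
    int: "Int64",
    str: "string",
    float: "Float64",
    bool: "boolean",
}


def normalize_type_params(*params: Any) -> Tuple[Any, ...]:
    """Replace known Python types with dtype aliases; pass anything else through."""
    return tuple(
        PYTHON_TO_DTYPE_ALIAS.get(p, p) if isinstance(p, type) else p
        for p in params
    )


# Under this sketch, Series[int, str] would be read as Series["Int64", "string"].
shorthand = normalize_type_params(int, str)
```

Anything that is already a dtype object or alias is passed through unchanged, so the verbose and shorthand spellings could coexist.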
fixed in pandas-dev/pandas-stubs@ba7aa5f
Hi, @simonjayhawkins
IIUC Series and Index are now generic wrt dtype in pandas-stubs (https://github.com/pandas-dev/pandas-stubs). @Dr-Irv, is it documented anywhere how to use these? For DataFrame (not explicitly mentioned in this issue), there is an open issue pandas-dev/pandas-stubs#295
Right now, only
What we have written up is here: https://github.com/pandas-dev/pandas-stubs/blob/main/docs/philosophy.md#use-of-generic-types Note that if you want to specify an annotation like It's worth mentioning that the pandera project supports the generic types at both a typing level and at runtime. See https://pandera.readthedocs.io/en/stable/schema_models.html . I haven't used this, but was made aware of it by the pandera authors when they reported some issues with |
It would be great if you could annotate a series, for example with -
to indicate that the function should accept a string-valued series with an integer index, and output an int-valued series with an integer index.
It could be even better if you would build on that such that the series members' types could be inferred using TypeVars (for example, pd.Series[int, str].iloc would accept an integer and return a string), but that's not necessary, just a bonus or a later milestone.

To do that, you could add a new module (maybe "pandas.typing"?) that would contain these type hints and would require minimal integration (if any) into the pandas infrastructure. There's a similar package for NumPy, external to it, called nptyping that could be used as a reference.
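As a rough illustration of the TypeVar idea, here is a minimal sketch built on typing.Generic. `TypedSeries` is a hypothetical stand-in class, not pandas and not pandas-stubs, and the parameter order (index type first, value type second) is an assumption read off the pd.Series[int, str] example.

```python
from typing import Dict, Generic, Iterator, Tuple, TypeVar

IndexT = TypeVar("IndexT")
ValueT = TypeVar("ValueT")


class TypedSeries(Generic[IndexT, ValueT]):
    """Hypothetical series-like container, generic over index and value types."""

    def __init__(self, data: Dict[IndexT, ValueT]) -> None:
        self._data: Dict[IndexT, ValueT] = dict(data)

    def loc(self, label: IndexT) -> ValueT:
        # Label-based lookup: takes the index type, returns the value type.
        return self._data[label]

    def iloc(self, position: int) -> ValueT:
        # Positional lookup: always takes an int, returns the value type.
        return list(self._data.values())[position]

    def items(self) -> Iterator[Tuple[IndexT, ValueT]]:
        return iter(self._data.items())


def string_lengths(series: TypedSeries[int, str]) -> TypedSeries[int, int]:
    # Accepts a string-valued series with an integer index and
    # returns an int-valued series with the same integer index.
    return TypedSeries({label: len(value) for label, value in series.items()})


names: TypedSeries[int, str] = TypedSeries({10: "ab", 20: "cde"})
lengths = string_lengths(names)
```

With annotations like these, a static type checker can infer that `names.iloc(0)` is a `str` and `lengths.loc(20)` is an `int`, which is the kind of inference the request describes.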