Skip to content

[ArrowStringDtype] Make it already a StringDtype subclass #41312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jorisvandenbossche
Copy link
Member

@simonjayhawkins this will be redundant after #39908, but in the meantime, ArrowStringDtype being a subclass would help for writing robust code related to StringDtype in downstream packages (for those testing against pandas master)

@jorisvandenbossche jorisvandenbossche added the Strings String extension data type and string data label May 4, 2021
@simonjayhawkins
Copy link
Member

Thanks @jorisvandenbossche lgtm

we could clean-up the various isinstance(..., StringDtype) -> isinstance(..., (StringDtype, ArrowStringDtype)) checks. do that here or as a follow-up?

@simonjayhawkins simonjayhawkins added this to the 1.3 milestone May 4, 2021
@jorisvandenbossche
Copy link
Member Author

we could clean-up the various isinstance(..., StringDtype) -> isinstance(..., (StringDtype, ArrowStringDtype)) checks. do that here or as a follow-up?

Did that here

@simonjayhawkins
Copy link
Member

pandas/core/arrays/string_arrow.py:113: error: Return type "Type[ArrowStringArray]" of "construct_array_type" incompatible with return type "Type[StringArray]" in supertype "StringDtype"  [override]
pandas/core/arrays/string_arrow.py:129: error: Return type "ArrowStringArray" of "__from_arrow__" incompatible with return type "StringArray" in supertype "StringDtype"  [override]

@jorisvandenbossche
Copy link
Member Author

jorisvandenbossche commented May 4, 2021

@simonjayhawkins is there an easy solution for that? (sorry, currently don't have time to dive into this, otherwise will add some ignores)

@simonjayhawkins
Copy link
Member

in #39908 we add ArrowStringArray to the return types and import it in a TYPE_CHECKING block

@jorisvandenbossche
Copy link
Member Author

Would you be fine with ignoring it for now, and then properly fixing it in #39908 ?
I could copy the approach from #39908, but that also seems a bit strange to do right now since the StringDtype itself doesn't yet return ArrowStringArray.

@simonjayhawkins
Copy link
Member

sure

@jorisvandenbossche jorisvandenbossche merged commit 0677491 into pandas-dev:master May 5, 2021
@jorisvandenbossche jorisvandenbossche deleted the string-dtype-subclass branch May 5, 2021 06:55
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants