Make ExtensionDtype.construct_array_type a regular method #36126

Closed
TomAugspurger opened this issue Sep 4, 2020 · 3 comments
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Refactor Internal refactoring of code Strings String extension data type and string data

Comments

@TomAugspurger
Contributor

Currently ExtensionDtype.construct_array_type is a classmethod. This makes it hard to use a single dtype with multiple array classes, where the array type depends on some parameter of the dtype.

I'd like to change construct_array_type to be a regular method rather than a classmethod. AFAICT, this is fine for pandas since everywhere we use construct_array_type we have an instance rather than a type. It should be fine for 3rd party arrays to continue to use classmethods since you can call a classmethod on an instance. The API breaking change would be for 3rd-party code relying on ExtensionDtype.construct_array_type() working (i.e. calling it on the class).
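To illustrate the compatibility argument above, here is a minimal sketch that does not use pandas at all. The class names (`ArrayA`, `ArrayB`, `ThirdPartyDtype`, `ParametrizedDtype`) are hypothetical stand-ins, not pandas APIs; the point is only how Python treats a classmethod versus a regular method when called on an instance or on the class.

```python
class ArrayA:
    """Hypothetical stand-in for one array class."""


class ArrayB:
    """Hypothetical stand-in for a second array class."""


class ThirdPartyDtype:
    """Hypothetical 3rd-party dtype that keeps the classmethod form."""

    @classmethod
    def construct_array_type(cls):
        return ArrayA


class ParametrizedDtype:
    """Hypothetical dtype whose array class depends on a parameter."""

    def __init__(self, storage="a"):
        self.storage = storage

    def construct_array_type(self):  # regular method, needs an instance
        return ArrayA if self.storage == "a" else ArrayB


# Calling on an instance works for both forms.
instance_result_classmethod = ThirdPartyDtype().construct_array_type()
instance_result_regular = ParametrizedDtype("b").construct_array_type()

# Calling on the class works only for the classmethod form.
class_result_classmethod = ThirdPartyDtype.construct_array_type()
try:
    ParametrizedDtype.construct_array_type()
    class_call_failed = False
except TypeError:
    # No instance was supplied for `self`.
    class_call_failed = True
```

Under these assumptions, code that always holds a dtype instance is unaffected by the change, while code that calls the method on the class breaks only for dtypes that switch to a regular method.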

My motivation is to allow a parametrized StringDtype that can be used for either the current StringArray or an Arrow-backed StringArray.

@TomAugspurger TomAugspurger added API Design Strings String extension data type and string data ExtensionArray Extending pandas with custom dtypes or arrays. labels Sep 4, 2020
@jbrockmendel
Member

+1

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Sep 5, 2020
This allows a single dtype to support multiple array classes.
For Arrow-backed strings, we'll likely want a separate array class
for ease of implementation and clarity. But we'll have a single
parametrized dtype.

```python
class StringDtype:
    def __init__(self, storage="python"):
        self.storage = storage

    def construct_array_type(self):  # regular method
        if self.storage == "python":
            return StringArray
        else:
            return ArrowStringArray
```

Closes pandas-dev#36126
@jreback jreback added this to the 1.2 milestone Sep 5, 2020
@jreback jreback modified the milestones: 1.2, 1.3 Nov 20, 2020
@simonjayhawkins
Member

removing milestone

@simonjayhawkins simonjayhawkins removed this from the 1.3 milestone Jun 11, 2021
@mroeschke mroeschke added Refactor Internal refactoring of code and removed API Design labels Aug 13, 2021
@jbrockmendel
Member

This has been done, not sure when. Closing.

5 participants