Handle ExtensionArrays in cut

Follow-up to https://github.com/pandas-dev/pandas/pull/31290. Currently `pd.cut` doesn't play nicely with all extension arrays. To support them, I think we'll need one addition to the interface.

We need an array of integers to pass to `searchsorted` in https://github.com/pandas-dev/pandas/blob/4edcc5541ff3f6470f5e3c083cb83136119e6f0c/pandas/core/reshape/tile.py#L394. I think the only requirement is that the integer-encoded values have the same ordering as the original values, i.e. the encoding is an order-preserving (monotonic) mapping.

It doesn't matter what value is used for missing values, as long as it's distinct from the codes used for non-missing values.
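To make the requirement concrete, here is a rough sketch of what such an encoding could look like. The helper name `order_preserving_codes` and the `na_code` parameter are hypothetical and not part of pandas or the extension array interface; the sketch also goes through an object-dtype ndarray, which a real interface method would presumably avoid.

```python
import numpy as np
import pandas as pd


def order_preserving_codes(values, na_code=-1):
    """Hypothetical helper: integer codes with the same ordering as `values`.

    Missing values all get `na_code`, which no non-missing value can receive
    because the codes for non-missing values start at 0.
    """
    # Assumption: the elements are mutually comparable once boxed as objects.
    values = np.asarray(values, dtype=object)
    mask = pd.isna(values)

    codes = np.full(len(values), na_code, dtype=np.intp)
    # np.unique returns sorted uniques; `inverse` is each value's position
    # in that sorted array, so a < b implies code(a) < code(b).
    _, inverse = np.unique(values[~mask], return_inverse=True)
    codes[~mask] = inverse
    return codes


arr = pd.array([30, None, 10, 20, 10], dtype="Int64")
codes = order_preserving_codes(arr)
print(codes)  # [ 2 -1  0  1  0]
```

Equal values share a code, smaller values get smaller codes, and the missing entry gets a code that nothing else uses.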

We can't quite use `factorize(arr)[0]`, since its codes follow order of first appearance rather than sorted order, so they don't satisfy the ordering requirement.
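A small illustration of the problem, using a nullable integer array as a stand-in for an arbitrary extension array (the example data is made up):

```python
import pandas as pd

arr = pd.array([30, 10, 20, 10], dtype="Int64")

# factorize numbers values in order of first appearance:
# 30 -> 0, 10 -> 1, 20 -> 2
factorize_codes, uniques = pd.factorize(arr)
print(factorize_codes)  # [0 1 2 1]

# 30 is the largest value but has the smallest code, so comparing codes
# does not agree with comparing the original values.
largest_value_pos = 0   # position of 30
smallest_value_pos = 1  # position of 10
print(factorize_codes[largest_value_pos] < factorize_codes[smallest_value_pos])  # True
```

Passing these codes to `searchsorted` would bin values by when they first appeared rather than by their magnitude.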


Handle ExtensionArrays in cut #31389
