When grouping a DataFrame and applying the rank function to a column with data type double[pyarrow], I get the following error: TypeError: rank is not supported for double[pyarrow] dtype
However, applying the rank function without groupby works. This leads me to believe that the error message is misleading and that in fact the rank function does support data type double[pyarrow].
Expected Behavior
The rank function works in combination with groupby for data type double[pyarrow].
Installed Versions
Replace this line with the output of pd.show_versions()
Looks like for rank specifically we prevent casting to the original data type, probably because rank should be numeric and we don't want to cast the type back to a non-numeric type.
# i.e. how in ["rank"], since other cast_blocklist methods don't go
# through cython_operation
return res_values
In the short term, maybe an exception should be made here if the original dtype isn't numeric? In the longer term, I think we'll integrate a way to dispatch to pyarrow's groupby aggregation methods.
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example