Skip to content

REF: Series.argsort should dispatch to EA.argsort #43840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
jbrockmendel opened this issue Oct 1, 2021 · 6 comments · Fixed by #58232
Closed
3 tasks done

REF: Series.argsort should dispatch to EA.argsort #43840

jbrockmendel opened this issue Oct 1, 2021 · 6 comments · Fixed by #58232
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Refactor Internal refactoring of code
Milestone

Comments

@jbrockmendel
Copy link
Member

jbrockmendel commented Oct 1, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

N/A

Issue Description

cc @jorisvandenbossche unless I'm missing something? ive misunderstood the design/behavior of EA.argsort in the past.

Expected Behavior

NA

Installed Versions

Replace this line with the output of pd.show_versions()

@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 1, 2021
@mroeschke mroeschke added ExtensionArray Extending pandas with custom dtypes or arrays. Refactor Internal refactoring of code and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 2, 2021
@attack68 attack68 mentioned this issue Oct 2, 2021
2 tasks
@mzeitlin11
Copy link
Member

somewhat xref #35178

@jbrockmendel
Copy link
Member Author

gentle ping @jorisvandenbossche

1 similar comment
@jbrockmendel
Copy link
Member Author

gentle ping @jorisvandenbossche

@jorisvandenbossche
Copy link
Member

Sorry for the slow reply here. Maybe the more general question is: why does Series.argsort not use nargsort? (which then will dispatch to EA.argsort)
Series.argsort seems like a more limited version reimplementing the missing value handling of nargsort?


I think in previous discussions related to EAs and argsort, it was mostly about the relationship between nargsort, EA.argsort and EA._values_for_argsort (which can call which?), and I don't directly remember discussing Series.argsort specifically.
The original PR adding EA.argsort (#19957) mentions that it enables Series.argsort (although the initial end goal was sorting), but it didn't actually do that (it didn't change Series.argsort to use it).

@jbrockmendel
Copy link
Member Author

Looking at this, Series.argsort looks really sketchy.

dti = pd.date_range("2016-01-01", periods=4)
ser = pd.Series(dti)
ser[1] = pd.NaT

>>> ser.argsort().values
array([ 0, -1,  1,  2])

>>> ser.values.argsort()
array([0, 2, 3, 1])

>>> pd.core.sorting.nargsort(ser.values)
array([0, 2, 3, 1])

What would the ser.argsort() be useful for? Passing it to something like take produces weird results (which vary depending on which take-like thing you use).

@jbrockmendel
Copy link
Member Author

Once the deprecation in #54219 is enforced, we will be able to refactor in such a way as to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants