REF (string): avoid copy in StringArray factorize #59551

Merged
mroeschke merged 6 commits into pandas-dev:main on Aug 22, 2024

Conversation

jbrockmendel
Member

@jbrockmendel jbrockmendel commented Aug 19, 2024

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

The initial motivation was to avoid the ugly overrides of _from_backing_data. It turns out that the override is needed for NumpyExtensionArray because of the masking currently done in NumpyEA._values_for_factorize. That masking is in turn necessary because of incorrect comparisons in the hashtable Cython code.

Update 2024-08-22: adding "(string)" to the title since this now fixes an infer-string xfail.
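
For readers unfamiliar with why masking before factorizing can be needed, here is a minimal pure-Python sketch of the general phenomenon. It is not the pandas Cython hashtable and does not reproduce the specific comparison bug mentioned above. The helper names (is_missing, naive_factorize, masked_factorize) and the choice of -1 as the sentinel code are assumptions made for illustration only. The sketch shows that a hash table relying on ordinary equality gives each distinct NaN object its own code, while masking missing values first maps them all to one sentinel.

```python
import math


def is_missing(value):
    # Assumption for this sketch: None and float NaN count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))


def naive_factorize(values):
    # Assign integer codes with a plain dict and no missing-value handling.
    table = {}
    codes = []
    for v in values:
        if v not in table:
            table[v] = len(table)
        codes.append(table[v])
    return codes, list(table)


def masked_factorize(values, sentinel=-1):
    # Mask missing values before they reach the table; they all get `sentinel`.
    table = {}
    codes = []
    for v in values:
        if is_missing(v):
            codes.append(sentinel)
            continue
        if v not in table:
            table[v] = len(table)
        codes.append(table[v])
    return codes, list(table)


# Two distinct NaN objects: NaN never compares equal to another NaN.
values = ["a", float("nan"), "b", float("nan"), "a"]

naive_codes, naive_uniques = naive_factorize(values)
masked_codes, masked_uniques = masked_factorize(values)

print(naive_codes)     # each NaN object received its own code
print(masked_codes)    # both NaNs share the sentinel code
print(masked_uniques)  # only the non-missing values remain
```

In this toy version the masking costs nothing extra, but in an array library the equivalent step may require building a modified copy of the data, which is the kind of overhead the PR title refers to.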

@jbrockmendel jbrockmendel requested a review from WillAyd as a code owner August 19, 2024 17:28
@mroeschke mroeschke added the Refactor (Internal refactoring of code) and Strings (String extension data type and string data) labels Aug 21, 2024
@jbrockmendel jbrockmendel changed the title REF: avoid copy in StringArray factorize REF (string): avoid copy in StringArray factorize Aug 22, 2024
@mroeschke mroeschke added this to the 2.3 milestone Aug 22, 2024
@mroeschke mroeschke merged commit 0c24b20 into pandas-dev:main Aug 22, 2024
47 checks passed
@mroeschke
Member

Thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the bug-factorize branch August 22, 2024 20:55
matiaslindgren pushed a commit to matiaslindgren/pandas that referenced this pull request Aug 25, 2024
* REF: avoid copy in StringArray factorize

* mypy fixup

* un-xfail
jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this pull request Oct 10, 2024
* REF: avoid copy in StringArray factorize

* mypy fixup

* un-xfail
jorisvandenbossche pushed a commit that referenced this pull request Oct 10, 2024
* REF: avoid copy in StringArray factorize

* mypy fixup

* un-xfail
Labels
backported, Refactor (Internal refactoring of code), Strings (String extension data type and string data)
3 participants