[ArrowStringArray] use pyarrow.compute.match_substring_regex if available #41217

Merged
merged 4 commits into from
May 2, 2021

Conversation

simonjayhawkins
Member

follow-up to #41025

[ 50.00%] ··· strings.Contains.time_contains                                                                                                              ok
[ 50.00%] ··· ============== ========== ==========
              --                     regex        
              -------------- ---------------------
                  dtype         True      False   
              ============== ========== ==========
                   str        22.0±0ms   15.0±0ms 
                  string      17.2±0ms   10.9±0ms 
               arrow_string   6.84±0ms   2.37±0ms 
              ============== ========== ==========
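To read the table as relative speedups, the timings can be divided through. This is only a small arithmetic sketch over the numbers reported above; the dictionary layout and the `speedup` helper are illustrative names, not part of the asv benchmark suite.

```python
# Timings in milliseconds, copied from the asv output above.
timings_ms = {
    "str": {"regex=True": 22.0, "regex=False": 15.0},
    "string": {"regex=True": 17.2, "regex=False": 10.9},
    "arrow_string": {"regex=True": 6.84, "regex=False": 2.37},
}


def speedup(baseline, candidate="arrow_string"):
    """Return how many times faster `candidate` is than `baseline`, per column."""
    return {
        column: round(timings_ms[baseline][column] / timings_ms[candidate][column], 2)
        for column in timings_ms[baseline]
    }


for dtype in ("str", "string"):
    print(dtype, speedup(dtype))
```

On these figures, arrow_string is roughly 3x faster than str with regex=True and roughly 6x faster with regex=False.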

@simonjayhawkins simonjayhawkins added Performance Memory or execution speed performance Strings String extension data type and string data labels Apr 29, 2021
@simonjayhawkins simonjayhawkins added this to the 1.3 milestone Apr 29, 2021
        return super()._str_contains(pat, case, flags, na, regex)

    if regex:
        if hasattr(pc, "match_substring_regex") and case:
Member

Maybe it would be good to add a comment like "added in pyarrow x.x", so we can more easily clean up those hasattr checks if we later raise the minimum version

Member Author

yes. I was thinking the same to address #41219 (comment)
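For illustration, the feature-detection pattern discussed in this thread might look roughly like the standalone sketch below. It is not the pandas implementation: `contains_regex` is a hypothetical helper, the pure-Python `re` fallback stands in for the `super()._str_contains(...)` call in the diff, and the "x.x" version in the comment is left unspecified just as in the review suggestion. Null handling is omitted, and pyarrow's regex engine may differ from Python's `re` for non-trivial patterns.

```python
import re

try:
    import pyarrow as pa
    import pyarrow.compute as pc
except ImportError:  # pyarrow is treated as optional in this sketch
    pa = pc = None


def contains_regex(values, pat, case=True):
    """Return a list of booleans: does each string contain a match for `pat`?"""
    # match_substring_regex: added in pyarrow x.x (version deliberately left
    # unspecified); this hasattr check can be removed once the minimum
    # supported pyarrow version includes the kernel.
    if pc is not None and hasattr(pc, "match_substring_regex") and case:
        arr = pa.array(values, type=pa.string())
        return pc.match_substring_regex(arr, pat).to_pylist()

    # Fallback path (assumption: stands in for the object-dtype implementation).
    flags = 0 if case else re.IGNORECASE
    compiled = re.compile(pat, flags)
    return [compiled.search(v) is not None for v in values]


print(contains_regex(["apple", "banana", "cherry"], "an+a"))
```

Keeping the version note next to the `hasattr` check makes such branches easy to find and delete when the minimum pyarrow version is raised.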

3 participants