-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: don't allow users to move from an interned string #14494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: don't allow users to move from an interned string #14494
Conversation
I read somewhere that PY3 only interns short strings. Is that not right? |
how did you find this issue (out of curiosity)? IOW, what broke? |
I am not sure what gets interned by default, but users may call |
Also, it looks like only one of the py3.4 builds failed: I am not sure how my change would have affected this, and only in the particular configuration. Does anyone have any ideas? |
194fb53
to
1dbb494
Compare
Current coverage is 85.28% (diff: 100%)@@ master #14494 diff @@
==========================================
Files 140 140
Lines 50693 50693
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
Hits 43234 43234
Misses 7459 7459
Partials 0 0
|
1dbb494
to
6a83871
Compare
this looks fine. wait till we merge #14631 then push again to force rebuilds. |
6a83871
to
ae8aa64
Compare
Tests are now passing after a rebase |
ping |
thanks! |
Thank you! |
…terned string (pandas-dev#14494) (cherry picked from commit 726efc7)
git diff upstream/master | flake8 --diff
I was thinking about code paths that give away borrowed references to strings and realized that interned strings can be accessed even when their refcount is 1. This is because the intern table holds a "weak ref" to the string. This allows the string to be garbage collected normally; however, if a user creates an interned string with exactly one reference and feeds it to move, you could later get a second reference to the same string object by calling intern with a copy of the old string. The data backing the old string may have been mutated so you will get very strange results. I am almost certain that no on has run into this since
move_into_mutable_buffer
was added but it doesn't hurt to add this quick check to prevent broken usage like this in the future.This only affects python 2 because bytes objects cannot be interned in python 3.