Commit dabaf6f

Keep rval refs alive in StringHashTable._unique
Address a Heisenbug in which `v = get_c_string(<str>repr(val))` could leave `v` pointing at a string that is no longer referenced by the time the next exception is raised. (Two exceptions are raised in succession in `test_unique_bad_unicode` in `pandas/tests/base/test_unique.py`.)

Signed-off-by: Michael Tiemann <[email protected]>
1 parent d98e6f0 commit dabaf6f

1 file changed: +4 -1 lines changed

pandas/_libs/hashtable_class_helper.pxi.in

@@ -1128,6 +1128,7 @@ cdef class StringHashTable(HashTable):
         use_na_value = na_value is not None
 
         # assign pointers and pre-filter out missing (if ignore_na)
+        keep_rval_refs = []
         vecs = <const char **>malloc(n * sizeof(char *))
         for i in range(n):
             val = values[i]
@@ -1144,7 +1145,9 @@ cdef class StringHashTable(HashTable):
                 try:
                     v = get_c_string(<str>val)
                 except UnicodeEncodeError:
-                    v = get_c_string(<str>repr(val))
+                    rval = <str>repr(val)
+                    keep_rval_refs.append(rval)
+                    v = get_c_string(rval)
                 vecs[i] = v
 
         # compute
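
The lifetime issue this change addresses can be illustrated with a rough pure-Python analogy. This is only a sketch, not the pandas code: `TrackedStr` and both helper functions are hypothetical names introduced here, and the behavior shown relies on CPython's reference counting. A weak reference stands in for the borrowed `char *`, which presumably points into memory owned by the string object and is therefore only valid while that object is alive.

```python
import weakref


class TrackedStr(str):
    """Hypothetical str subclass: plain str objects cannot be weakly
    referenced, but a Python-level subclass can, which lets us observe
    when the object is freed."""


def alive_without_keepalive(val):
    # Analogue of `get_c_string(<str>repr(val))`: the repr result is a
    # temporary that nothing else references once the call returns.
    borrowed = weakref.ref(TrackedStr(repr(val)))
    # Under CPython's reference counting the temporary is already gone.
    return borrowed() is not None


def alive_with_keepalive(val, keep_rval_refs):
    # Analogue of the fix: store the repr result in a list that outlives
    # the loop body, so the borrowed view stays valid.
    rval = TrackedStr(repr(val))
    keep_rval_refs.append(rval)
    borrowed = weakref.ref(rval)
    del rval  # drop the local name; the list still holds a reference
    return borrowed() is not None


keep_rval_refs = []
print(alive_without_keepalive("abc"))
print(alive_with_keepalive("abc", keep_rval_refs))
```

In the patch, `keep_rval_refs` plays the same role as the list here: it holds each `repr` result for as long as the entries in `vecs` may still be read.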
