ENH: Setup ASAN in CI #52990
For anyone who looks at the logs: I forgot to mention that I intentionally introduced a memory leak in the JSON code by commenting out the body of the implementation below:
void NpyArr_iterEnd(JSOBJ obj, JSONTypeContext *tc) {
  NpyArrContext *npyarr = GET_TC(tc)->npyarr;
  /*
  if (npyarr) {
    NpyArr_freeItemValue(obj, tc);
    PyObject_Free(npyarr);
  }
  */
}
So our memory leaks aren't as bad as the attached log would indicate.
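To illustrate the kind of leak that commenting out a teardown function creates, here is a minimal standalone sketch. It does not use the pandas or ujson sources: `DemoContext`, `demo_iter_begin`, `demo_iter_end`, `demo_iter_end_leaky`, and `demo_live_contexts` are hypothetical names invented for this example, and plain `malloc`/`free` stand in for `PyObject_Malloc`/`PyObject_Free`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an iterator context; not the real NpyArrContext. */
typedef struct {
    int index;
    int size;
} DemoContext;

/* Number of contexts allocated but not yet freed. */
static int live_contexts = 0;

/* Allocate a context at the start of iteration. */
DemoContext *demo_iter_begin(int size) {
    assert(size >= 0);
    DemoContext *ctx = malloc(sizeof *ctx);
    if (ctx == NULL) {
        return NULL;
    }
    ctx->index = 0;
    ctx->size = size;
    live_contexts++;
    return ctx;
}

/* Correct teardown: releases the context allocated in demo_iter_begin. */
void demo_iter_end(DemoContext *ctx) {
    if (ctx) {
        free(ctx);
        live_contexts--;
    }
}

/* Leaky teardown: the body is a no-op, like a commented-out free.
   Every context passed here is never released. */
void demo_iter_end_leaky(DemoContext *ctx) {
    (void)ctx;
}

/* Report how many contexts are still outstanding. */
int demo_live_contexts(void) {
    return live_contexts;
}
```

Calling `demo_iter_begin` followed by `demo_iter_end_leaky` leaves one allocation outstanding each time. In a program built with `-fsanitize=address`, an allocation like this that is no longer reachable at exit is the sort of thing the leak checker reports as a direct leak.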
The biggest leak ASAN found is below (truncated for output):
Direct leak of 1610180 byte(s) in 31 object(s) allocated from:
#0 0x7fc0846d2c7e in __interceptor_malloc build-llvm/tools/clang/stage2-bins/runtimes/runtimes-bins/compiler-rt/lib/asan/asan_malloc_linux.cpp:69:3
#1 0x56105bf4cabf in _PyMem_RawMalloc /usr/local/src/conda/python-3.11.3/Objects/obmalloc.c:101:12
#2 0x56105bf4cabf in PyMem_RawMalloc /usr/local/src/conda/python-3.11.3/Objects/obmalloc.c:586:12
#3 0x56105bf4cabf in _PyObject_Malloc /usr/local/src/conda/python-3.11.3/Objects/obmalloc.c:2003:11
#4 0x56105bf4cabf in _PyObject_Malloc /usr/local/src/conda/python-3.11.3/Objects/obmalloc.c:1996:1
#5 0x56105bf4cabf in PyObject_Malloc /usr/local/src/conda/python-3.11.3/Objects/obmalloc.c:712:12
#6 0x56105bf4cabf in PyUnicode_New /usr/local/src/conda/python-3.11.3/Objects/unicodeobject.c:1425:24
#7 0x56105bf4cabf in unicode_decode_utf8 /usr/local/src/conda/python-3.11.3/Objects/unicodeobject.c:5117:19
#8 0x7fc010c5ccba in objToJSON /home/willayd/clones/pandas/pandas/_libs/src/ujson/python/objToJSON.c:2130:14
#9 0x56105bf873c6 in cfunction_call /usr/local/src/conda/python-3.11.3/Objects/methodobject.c:542:18
That line in objToJSON points to:
PyObject *objToJSON(PyObject *Py_UNUSED(self), PyObject *args,
                    PyObject *kwargs) {
  ...
  newobj = PyUnicode_FromString(ret);
  if (ret != buffer) {
    encoder->free(ret);
  }
  return newobj;
}
I think
I'll take a look. Last time I tried ASAN on macOS, I ended up having some issues (it wasn't catching leaks properly).
Is the idea to enable this during pytest runs or as a separate job? My recollection of Valgrind is that it's more of a "leave running over the weekend" thing.
I think during pytest. Yeah, Valgrind can definitely slow your program down, but ASAN doesn't add nearly as much overhead. Arrow might have ASAN set up in their CI; I know they do in arrow-adbc and it's pretty cool, though that library has a very small test suite at the moment.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
We have no way of automatically detecting memory leaks in our library
Feature Description
ASAN can be used to detect memory leaks at a relatively low cost. The following code shows how to build pandas with ASAN and find leaks. Note that fast_unwind_on_malloc=0 severely slows down execution in exchange for a more informative traceback. This also reports leaks that likely cannot be controlled by pandas. This produces the following output, which contains items like:
So from running the JSON tests, ASAN thinks we leak 9878368 bytes, with 24 of those leaked by an interaction between properties.pyx -> index.pyx -> hashtable.pyx -> khash.
Many of the arguments pieced together to make the above work are taken from:
google/sanitizers/issues/918
https://stackoverflow.com/questions/48833176/get-location-of-libasan-from-gcc-clang
https://clang.llvm.org/docs/AddressSanitizer.html
leaks.txt
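As a rough illustration of the fast_unwind_on_malloc=0 setting mentioned above, the sketch below shows one way a standalone C program can supply that option itself, through the `__asan_default_options` hook that the ASAN runtime looks up at startup. This is an assumption-laden example and not the setup used in this issue, which presumably passes the option through the environment; `leak_demo` is a hypothetical helper, and whether the hook is honored when the runtime is preloaded into a Python process has not been checked here.

```c
#include <assert.h>
#include <stdlib.h>

/* Default runtime options picked up by the ASAN runtime when this
   translation unit is linked into an ASAN-instrumented executable.
   Without ASAN this is just an ordinary function. */
const char *__asan_default_options(void) {
    /* Use the slow unwinder for allocation stack traces. */
    return "fast_unwind_on_malloc=0";
}

/* Volatile sink so the compiler cannot optimize the allocation away. */
static void *volatile sink;

/* Hypothetical helper: allocate nbytes and deliberately drop the pointer. */
void leak_demo(size_t nbytes) {
    assert(nbytes > 0);
    char *buf = malloc(nbytes);
    if (buf != NULL) {
        buf[0] = '\0';
    }
    sink = buf;
    sink = NULL; /* the allocation is now unreachable and never freed */
}
```

Calling `leak_demo` from `main` in a program compiled with `-fsanitize=address -g` should, on platforms where leak detection is supported, produce a leak report at exit whose allocation stack is collected with the slower unwinder.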
Alternative Solutions
Valgrind is another great tool for detecting memory leaks (amongst other things), but it has larger overhead than the sanitizers, which may make it unsuitable for CI.
https://github.com/google/sanitizers/wiki/AddressSanitizerComparisonOfMemoryTools
Additional Context
cc @lithomas1