Skip to content

ENH: use constant f32 eps, not np.finfo() during import #15691

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

jzwinck
Copy link
Contributor

@jzwinck jzwinck commented Mar 15, 2017

NumPy docs for np.finfo() say not to call it during import (at module scope).
It's a relatively expensive call, and it modifies the GIL state.
Now we just hard-code it, because it is always the value anyway.
This avoids touching the GIL at import, which helps avoid deadlocks in practice.

NumPy docs for `np.finfo()` say not to call it during import (at module scope).
It's a relatively expensive call, and it modifies the GIL state.
Now we just hard-code it, because it is always the value anyway.
This avoids touching the GIL at import, which helps avoid deadlocks in practice.

Closes pandas-dev#14641.
@jreback
Copy link
Contributor

jreback commented Mar 15, 2017

@jzwinck can you confirm that this 'solves' the issue (not asking for a test in the codebase as might be tricky). just some verification.

can you add a whatsnew note.

@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Mar 15, 2017
@codecov
Copy link

codecov bot commented Mar 15, 2017

Codecov Report

Merging #15691 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15691      +/-   ##
==========================================
- Coverage   91.02%      91%   -0.03%     
==========================================
  Files         143      143              
  Lines       49409    49409              
==========================================
- Hits        44977    44967      -10     
- Misses       4432     4442      +10
Impacted Files Coverage Δ
pandas/core/indexing.py 94.08% <100%> (ø)
pandas/io/gbq.py 25% <0%> (-58.34%)
pandas/tools/merge.py 93.68% <0%> (-0.34%)
pandas/core/frame.py 97.87% <0%> (-0.1%)
pandas/core/window.py 96.18% <0%> (ø)
pandas/core/common.py 91.3% <0%> (+0.33%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2cad4dd...dadb97c. Read the comment docs.

@jreback jreback added this to the 0.20.0 milestone Mar 15, 2017
@jzwinck
Copy link
Contributor Author

jzwinck commented Mar 15, 2017

@jreback It "fixed" the hang on my system when I did import pandas in a new thread Python didn't know about and which did not hold the GIL. There could still be other ways to hang Python when the GIL is not used properly, but this patch sidesteps the one I encountered.

I have added a note in whatsnew 0.20.0 now.

As for testing, here is a C program which deadlocks before my patch and runs successfully after. Please note I believe it to be an erroneous program w.r.t. its multithreaded use of the Python C API--just to be clear this is not good code:

#include <Python.h>
#include <assert.h>
#include <pthread.h>

static void* import_pandas(void* arg)
{
    (void)arg;
    return PyImport_ImportModule("pandas");
}

int main(void)
{
    Py_Initialize();

    /* try to import pandas without the GIL
     * probably not a good idea, but... */
    pthread_t id;
    int rc = pthread_create(&id, 0, &import_pandas, 0);
    assert(rc == 0);

    /* if we time out, SIGALRM will kill us */
    alarm(10);

    void* obj;
    rc = pthread_join(id, &obj);
    assert(rc == 0);
    assert(obj);

    return 0;
}

If you want, you could add this as a unit test, but I'm not sure exactly how it fits in to the project given the existing tests all seem to be written in Python, and apart from using CFFI there's probably no way to implement the above in Python.

@jreback
Copy link
Contributor

jreback commented Mar 15, 2017

@jzwinck is there a way to do this directly using threading? IOW simulate importing inside the different threads.

@jzwinck
Copy link
Contributor Author

jzwinck commented Mar 15, 2017 via email

@jreback
Copy link
Contributor

jreback commented Mar 15, 2017

hmm, so how are that pure-python libraries that are simply doing imports failing? must be a race condition on the import itself in various threads.?

@jzwinck
Copy link
Contributor Author

jzwinck commented Mar 15, 2017 via email

@jreback
Copy link
Contributor

jreback commented Mar 15, 2017

@jzwinck I c. ok will merge this and cross-fingers. thanks!

@jreback jreback closed this in 6821291 Mar 15, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
NumPy docs for np.finfo() say not to call it during import (at
module scope).  It's a relatively expensive call, and it modifies the
GIL state.  Now we just hard-code it, because it is always the value
anyway.  This avoids touching the GIL at import, which helps avoid
deadlocks in practice.

closes pandas-dev#14641

Author: John Zwinck <[email protected]>

Closes pandas-dev#15691 from jzwinck/patch-1 and squashes the following commits:

dadb97c [John Zwinck] DOC: mention pandas-dev#14641 in 0.20.0 whatsnew
e565230 [John Zwinck] ENH: use constant f32 eps, not np.finfo() during import
mattip pushed a commit to mattip/pandas that referenced this pull request Apr 3, 2017
NumPy docs for np.finfo() say not to call it during import (at
module scope).  It's a relatively expensive call, and it modifies the
GIL state.  Now we just hard-code it, because it is always the value
anyway.  This avoids touching the GIL at import, which helps avoid
deadlocks in practice.

closes pandas-dev#14641

Author: John Zwinck <[email protected]>

Closes pandas-dev#15691 from jzwinck/patch-1 and squashes the following commits:

dadb97c [John Zwinck] DOC: mention pandas-dev#14641 in 0.20.0 whatsnew
e565230 [John Zwinck] ENH: use constant f32 eps, not np.finfo() during import
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

import pandas hanging Flask 0.11.1 / Apache 2.4.18
2 participants