import pandas hanging Flask 0.11.1 / Apache 2.4.18 #14641
Comments
pls show a code example |
from flask import Flask
import pandas as pd

app = Flask(__name__)

@app.route("/")
def hello():
    return 'Hello World!'

if __name__ == "__main__":
    app.run()

When import pandas as pd is commented out, the script runs fine and the website loads. Interestingly enough, when pandas is imported in the Python console it works flawlessly; this is purely an issue with pandas 0.19.1 and Apache. In the example here I'm using a virtualenv; I've also tested this outside of a virtualenv and still have the same issue. Here's my Apache configuration file:

<VirtualHost *:80>
# The ServerName directive sets the request scheme, hostname and port that
# the server uses to identify itself. This is used when creating
# redirection URLs. In the context of virtual hosts, the ServerName
# specifies what hostname must appear in the request's Host: header to
# match this virtual host. For the default virtual host (this file) this
# value is not decisive as it is used as a last resort host regardless.
# However, you must set it for any further virtual host explicitly.
#ServerName www.example.com
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
WSGIDaemonProcess flaskapp user=flask group=www threads=5
WSGIScriptAlias / /var/www/html/flaskapp/flaskapp.wsgi
<Directory flaskapp>
WSGIProcessGroup flaskapp
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</Directory>

Here's the .wsgi file:

import os
import sys
import site
# Add virtualenv site packages
site.addsitedir(os.path.join(os.path.dirname(__file__), 'env/local/lib64/python2.7/site-packages'))
# Path of execution
sys.path.append('/var/www/html/flaskapp')
# Activate the virtualenv before importing the application
activate_env = os.path.expanduser(os.path.join(os.path.dirname(__file__), 'env/bin/activate_this.py'))
execfile(activate_env, dict(__file__=activate_env))
from main import app as application
|
you are doing odd path manipulation; you are probably picking up different versions of pandas (and/or its deps) from different envs, so you need to fix that. closing as not a pandas issue |
Same issue here with pandas >= 0.19.0. However, in this case I don't have different versions of pandas (and/or its deps) from different envs, because I'm running this web server inside a docker container freshly built each time. Commenting out

from flask import Flask
import pandas

app = Flask(__name__)

@app.route("/")
def hello():
    return 'Hello World!'

if __name__ == "__main__":
    app.run()

The .wsgi file is:

import sys
sys.path.insert(0, '/app')
from myapplication import app as application
|
not really sure what you are actually running. But this works fine. If you can show a reproducible example, pls comment.
|
The above script has to be run by Apache, with mod_wsgi, exactly like it was first reported. I've included this piece

if __name__ == "__main__":
    app.run()

to show that the script runs fine outside of Apache but hangs when run under it. |
Just to clarify, I was using a brand new instance with only 0.19.1 freshly installed in my environment. There were no other versions of pandas or even environments installed system-wide. |
you can try running with |
It'll take someone with more technical knowledge than a salmon biologist to figure out how to make mod_wsgi run python in verbose. 😞 Since I first posted this I've tested this on multiple clean instances; the only resolution is to downgrade to 0.18.1. |
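For anyone else stuck on finding where the import hangs under mod_wsgi: one possible approach, assuming Python 3.3+ (where `faulthandler` is in the standard library; the reports above are on 2.7), is to arm a watchdog around the import so that all thread stacks get dumped to a file if it takes too long. This is only a sketch; `import_with_watchdog`, the timeout, and the log path are made-up names and values, not anything from pandas or mod_wsgi.

```python
import faulthandler
import importlib


def import_with_watchdog(module_name, log_file, timeout=30.0):
    """Import `module_name`; if the import takes longer than `timeout`
    seconds, faulthandler writes every thread's traceback to `log_file`.

    `log_file` must be a real file object (it needs a fileno()).
    """
    # Arm the watchdog before the import starts.
    faulthandler.dump_traceback_later(timeout, file=log_file)
    try:
        return importlib.import_module(module_name)
    finally:
        # Disarm it once the import has finished (or raised).
        faulthandler.cancel_dump_traceback_later()
```

In a .wsgi file this might be used as `pd = import_with_watchdog("pandas", open("/tmp/import_hang.log", "w"))` (hypothetical path); if the import deadlocks, the log should show which line each thread is stuck on.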
I have also had this issue. I found the solution here http://stackoverflow.com/questions/25782912/pandas-and-numpy-thread-safety but have only ever had issues for the latest version of pandas if that helps the salmon people figure it out. |
Hi all, in case it's useful. I had a similar issue: My solution was to move the imports from the top of my views.py file and into the functions that needed them and all was well. However, it may be worth noting that I think bokeh also had the same problem, and bokeh doesn't have a dependency on pandas any more. I will need to confirm this though if it's a useful avenue. |
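A minimal sketch of the deferred-import workaround described above, with the standard-library `statistics` module standing in for pandas so it runs anywhere. The function name `summarize` is hypothetical; in a real views.py the body would be `import pandas as pd` inside the view function that needs it.

```python
def summarize(values):
    # The import happens on the first call, not when the module is
    # loaded by the WSGI server. Stand-in for `import pandas as pd`.
    import statistics as st

    return st.mean(values)


if __name__ == "__main__":
    print(summarize([1, 2, 3]))
```

Python caches imported modules in `sys.modules`, so only the first call pays the import cost; later calls just look the module up.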
@jreback, you closed this issue on Nov 12 with the reason "closing as not a pandas issue" - i don't know where else I'd file this bug and look for progress/insight on it - suggestions welcome. |
@birdsarah I would try doing your import of pandas IN the function you need it (rather than at the top of the module). If you have |
I do not have numexpr installed but will try moving the imports when I get a chance. Any idea if the problem comes from numpy instead of pandas? |
@Qblack no idea. it sounds like an initialization problem. |
That's what I'm doing - and it works. But feels like a workaround as I've got commented pandas imports dotted all around my codebase - it's not ideal. Will have a look at |
I have encountered a similar hang, when doing
And indeed, if I just import numpy and do that myself instead of importing Pandas, it hangs as well, seemingly trying to manage the GIL. @Ty-WDFW if you have a chance, perhaps you can try replacing the above-mentioned line in indexing.py with this approximation:
And see if that fixes the hang. Or just try doing In my case, the problem seems to be that |
@jzwinck Thanks for looking into that! |
@jorisvandenbossche No, it is not a NumPy bug, though it is pretty strange/annoying that The only true bug that I believe exists here is in the outer application, which in my case and probably in all the cases here called If Pandas wants to make life easier on future folks who could get screwed up by this, it is probably possible to check if the GIL is held by the thread which imports pandas, by importing a small Cython module at the top of pandas.py with code as here: http://stackoverflow.com/questions/11366556/how-can-i-check-whether-a-thread-currently-holds-the-gil It could really save some people a lot of time (as evidenced by this issue; it took me perhaps two hours to debug my own occurrence of the same). And it isn't a lot of code...just needs a careful hand to get it right, and a willingness to add a bit more Cython, which I have previously been told is not desirable in Pandas generally. Then again, this diagnostic could equally be applied to Python's very own |
@jreback and @jorisvandenbossche: I just noticed that the NumPy docs explicitly say that |
@jzwinck yes, I think that is certainly OK (want to do a PR?) Would that actually solve this issue at the same time? Or would it just postpone the hanging until an indexing operation is done? |
@jzwinck we could just hard code this. it is barely used. |
NumPy docs for `np.finfo()` say not to call it during import (at module scope). It's a relatively expensive call, and it modifies the GIL state. Now we just hard-code it, because it is always the value anyway. This avoids touching the GIL at import, which helps avoid deadlocks in practice. Closes pandas-dev#14641.
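To illustrate why hard-coding is safe: float32 machine epsilon is fixed by the IEEE 754 format (2**-23, about 1.1920929e-07), so it can be written as a constant instead of being looked up at import time. The sketch below checks that using only the standard library; `F32_EPS` and `next_f32_after_one` are illustrative names, not the identifiers used in pandas.

```python
import struct

# float32 machine epsilon: the gap between 1.0 and the next
# representable float32. It is a property of the format, so it
# can be a plain constant.
F32_EPS = 2.0 ** -23


def next_f32_after_one():
    """Return the smallest float32 value greater than 1.0."""
    # Reinterpret 1.0 as its 32-bit pattern, bump the lowest
    # mantissa bit, and reinterpret the result as a float32.
    bits = struct.unpack("<I", struct.pack("<f", 1.0))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


if __name__ == "__main__":
    print(next_f32_after_one() - 1.0 == F32_EPS)
```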
NumPy docs for np.finfo() say not to call it during import (at module scope). It's a relatively expensive call, and it modifies the GIL state. Now we just hard-code it, because it is always the value anyway. This avoids touching the GIL at import, which helps avoid deadlocks in practice.
closes pandas-dev#14641
Author: John Zwinck
Closes pandas-dev#15691 from jzwinck/patch-1 and squashes the following commits:
dadb97c [John Zwinck] DOC: mention pandas-dev#14641 in 0.20.0 whatsnew
e565230 [John Zwinck] ENH: use constant f32 eps, not np.finfo() during import
pandas 0.19.1 hangs Apache on the import in the Python script, and the website times out. Downgrading to 0.18.1 resolves the issue. Tested on a fresh EC2 instance running Ubuntu 16.04.
Apache log:
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.1
nose: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.11.2
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None