ValueError: numpy.dtype has the wrong size, try recompiling #11383
your pandas is compiled against a different version of numpy than the one installed. Not really sure how you are installing. FYI it is MUCH easier to simply use conda.
closing, but you can still comment here.
What I'm trying to understand is why numpy and pandas are compiled with different versions. As detailed above, they are installed on a clean system, in a virtualenv, via one `pip install -r requirements.txt`.
you must have a system install of python somewhere on your path that has numpy.
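Jeff's suspicion (a stray system numpy on the path) is easy to check. A minimal sketch, assuming you run it with the virtualenv's own interpreter:

```python
import importlib.util

# Ask the import machinery whether (and from where) numpy would be
# imported, without actually importing it.
spec = importlib.util.find_spec("numpy")
print("numpy would load from:", spec.origin if spec else None)
```

If this prints a path outside the virtualenv, a system numpy is shadowing the pinned one.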
yes, but no
It also doesn't make sense to me why immediately reinstalling pandas into the same virtualenv should fix the problem (though it does)... surely if it linked against the wrong version of numpy (even though there is no other numpy) the first time, then when I reinstall it would make the same mistake?
as I said, this is an issue with your system. It is likely cached, or pip might be at fault. IOW, it tries to complete the install of pandas, then installs numpy; but numpy is not yet installed when pandas builds, so pandas can't find it. You can try to install numpy first, then with a new pip command install pandas.
as I detailed above though, it is not cached, if I trust pip's output.
oh, just noticed you are using pandas 0.15.2. pandas is not forward compatible like that. You will need pandas 0.16.2 (or numpy 1.8.1).
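To make this compatibility note concrete, here is a pure-Python sketch (the `version_tuple` helper is illustrative, and the pinned numpy version string is an assumption; the pandas/numpy cutoffs are the ones stated in this comment):

```python
def version_tuple(v):
    """Parse a dotted version string like '1.8.1' into a comparable tuple."""
    return tuple(int(part) for part in v.split(".")[:3])

# Cutoffs from this thread: pandas 0.15.2 is only known to work with
# numpy up to 1.8.1; a newer numpy needs pandas >= 0.16.2.
pandas_version = "0.15.2"
numpy_version = "1.10.1"   # hypothetical pinned value

if version_tuple(pandas_version) < version_tuple("0.16.2") and \
        version_tuple(numpy_version) > version_tuple("1.8.1"):
    print("incompatible: upgrade pandas to 0.16.2 or pin numpy to 1.8.1")
```

The tuple comparison matters: comparing the raw strings would wrongly order "1.10.1" before "1.8.1".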
hmm, but why would this only break now?
- we've been using these two specific pinned versions successfully so far in our app
- the app is deployed to Heroku (which does a straightforward `pip install -r requirements.txt`)
- the problem started occurring when I built the new Jenkins server... my best guess is because the Jenkins docker image is built on Debian, while Heroku Cedar is Ubuntu-based
- our previous CI server was a Mac Mini, and it worked fine just in a virtualenv on OSX too
no idea. this is why you should simply use conda. no compiling required. but i get that you have your way of doing things.
I'm able to reproduce this issue inside a docker Ubuntu instance, following much the same approach as the OP; I create a new virtualenv, and install numpy and pandas from a requirements file.
As an example of what I and the OP mean, here is a `Dockerfile`, and the corresponding `requirements.txt`. If you drop those two files into a directory and attempt to build the image with `docker build .`, the build fails with the error in this issue's title.
@thanatos you will need to provide more output (the full log of the sequence of commands and its output) to be able to say more. But as Jeff already pointed out above, this error means that somehow the installed pandas is compiled against a wrong version of numpy (or another version of numpy is installed afterwards).
What more output are you looking for, exactly, that isn't either directly in the above comment, or can be obtained by running the commands shown there? (The entire output is considerably long. I was trying to avoid spamming this issue with largely irrelevant output that can be obtained by simply running the above commands.)
While you might be technically correct that this error arises from pandas being compiled against the wrong numpy, Docker makes it easy to inspect the resultant machine, which would allow you to prove to yourself in whatever manner you please that there are no other versions of pandas installed. But it's not the output that matters; it's the above commands: my understanding is that pandas should install on a fresh system from a requirements file. I've modified the above slightly, so as to hopefully convince you further that the environment of a brand-new Ubuntu install is sane; there are no versions of numpy or pandas preinstalled:
```dockerfile
FROM ubuntu:xenial
SHELL ["/bin/bash", "-c"]
RUN apt-get update
RUN apt-get install --yes \
    gcc \
    g++ \
    git \
    libc-dev \
    make \
    python-dev \
    python-virtualenv
RUN [[ -d /srv ]] || mkdir /srv
COPY show_installed.py /show_installed.py
RUN /show_installed.py
RUN mkdir /srv/example
RUN virtualenv /srv/example/env
COPY requirements.txt /srv/example/requirements.txt
RUN /srv/example/env/bin/pip install -r /srv/example/requirements.txt
RUN /srv/example/env/bin/python -c 'from pandas import hashtable'
```
And the `show_installed.py` it copies in:

```python
#!/usr/bin/env python
try:
    import numpy
except ImportError:
    print("numpy is not installed.")

try:
    import pandas
except ImportError:
    print("pandas is not installed.")
```

The output of running `/show_installed.py` shows that neither numpy nor pandas is preinstalled.
Further, I've toyed around with this a bit more. The error only happens if you specify the pinned versions together. Also, it seems like you end up compiling the entirety of pandas twice: once here:
and again here:
both of those take a long time to complete, and `top` seems to indicate that a compile is happening. (My understanding is that installing from a pre-built wheel should not require compiling.)
@thanatos can you show this happening with 0.19.2? any other release is frozen so all of this is irrelevant. Note that there are wheels since 0.19.0 IIRC, so you won't be compiling in any event (except maybe on a particular flavor of linux; note that on ubuntu/debian the wheels work). This used to be a problem when people compiled; that's exactly the reason people use a sane package manager.
further, showing a dockerfile is not evidence. this needs to be reproducible on a stock system.
Just tried the same on my Ubuntu in a fresh conda environment with just python and pip, and then pip installing pinned pandas and numpy, and I get the same error. @thanatos if you want to see in more detail what is happening with pip, you have to put it in verbose mode (`pip install -v`). There you can see that pip builds the pandas wheel against a numpy other than the pinned one (instead of against the installed (or to-be-installed) numpy), so logically you get the "ValueError: numpy.dtype has the wrong size, try recompiling" error. I am no expert in pip/wheels, but AFAIK this is an issue with them, not pandas. If you want to solve this for the older pandas version, I think the easiest is to work in two steps and install numpy and pandas separately (instead of from one requirements file); this should ensure there is already a numpy present at the moment the pandas archive is downloaded and the pandas wheel is built locally (I think).
A Dockerfile is a stock system, and is designed to be extremely reproducible: it's a file that describes a machine, starting from the fresh install of some image. In this case, since we start with:
…then the following commands are run starting from a clean, never-before-used installation of Ubuntu Xenial. Not my personalized, I've-installed-some-stuff Ubuntu. Clean, pristine Ubuntu.
Well, this makes sense. You can consider my bug invalid until such a time as I can reproduce it on the latest pandas. My stuff is unfortunately stuck on 0.16.2. (Upgrading is just so far down the list 😞)
@thanatos docker is simply not acceptable as a cross-platform self-contained testing environment. You would need to boil down your errors to a reproducible example.
I am not sure why the discussion about the docker as reproduction is reopened, but as I said in my latest comment above (#11383 (comment)): I can perfectly reproduce this on my local ubuntu laptop (so no need for docker).
And also note that I suggested an easy workaround (if you have the ability to change the dockerfile): installing numpy and pandas consecutively instead of in one command.
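In Dockerfile form, this workaround is just two separate pip invocations (a sketch; the virtualenv path matches the reproduction above, and the version pins are assumptions taken from this thread):

```dockerfile
# Install numpy in its own pip invocation first, so that it is already
# importable when pandas' setup.py runs in the next step.
RUN /srv/example/env/bin/pip install numpy==1.8.1
RUN /srv/example/env/bin/pip install pandas==0.15.2
```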
I have been setting up a new CI server and I'm hitting a problem when running our tests on the new server - tests which use Pandas are giving this error: `ValueError: numpy.dtype has the wrong size, try recompiling`
I have read other threads about the cause of this problem but they don't seem to apply in my case.
We are running in a virtualenv, inside a Docker container (Debian Stretch).
There is no 'system' numpy, pandas or cython installed:
In our requirements.txt we have:
but...
Now, I have found a manual fix, which is to re-install pandas:
The need for `--no-cache-dir` suggests something about the problem, but I don't understand what exactly. I am installing in a 'virgin' docker container; there is no pip cache before I first install requirements.txt. After `pip install -r requirements.txt`:

The manual fix is not acceptable as I need to automate the build.
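One hedged option for automating the manual fix inside the docker build itself (`--no-cache-dir` and `--force-reinstall` are standard pip options, but whether this suffices in your image is an assumption, and the version pin is illustrative):

```dockerfile
# After the requirements install, force pandas to be rebuilt against the
# numpy that is now present, bypassing pip's wheel cache.
RUN pip install --no-cache-dir --force-reinstall pandas==0.15.2
```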
Appreciate if you can suggest other avenues to explore towards a solution.