Skip to content

BLD: include dll in package_data on Windows #21321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 8, 2018

Conversation

chris-b1
Copy link
Contributor

@chris-b1 chris-b1 commented Jun 5, 2018

Not sure how to test this, but I believe should remove the runtime dependency by static linking against the MSVC++ runtime. @cgohlke any thoughts?

@chris-b1 chris-b1 added Build Library building on various platforms Needs Backport labels Jun 5, 2018
@chris-b1 chris-b1 added this to the 0.23.1 milestone Jun 5, 2018
@codecov
Copy link

codecov bot commented Jun 5, 2018

Codecov Report

Merging #21321 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #21321   +/-   ##
=======================================
  Coverage   91.89%   91.89%           
=======================================
  Files         153      153           
  Lines       49596    49596           
=======================================
  Hits        45576    45576           
  Misses       4020     4020
Flag Coverage Δ
#multiple 90.29% <ø> (ø) ⬆️
#single 41.86% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/reshape/merge.py 94.25% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d79203a...7845eb2. Read the comment docs.

@cgohlke
Copy link
Contributor

cgohlke commented Jun 5, 2018

If I read this correctly, this will not only static link the C++ runtime to one C++ extension, but static link the C multi-threaded runtime library to all Pandas extension modules. That is not a good idea since Python can only load a total of ~127 of such DLLs.

It's probably better to allow distributors to include the C++ runtime DLL,, in the wheels, e.g.

diff --git a/setup.py b/setup.py
index 6febe674f..e161b1534 100755
--- a/setup.py
+++ b/setup.py
@@ -733,7 +733,7 @@ setup(name=DISTNAME,
       maintainer=AUTHOR,
       version=versioneer.get_version(),
       packages=find_packages(include=['pandas', 'pandas.*']),
-      package_data={'': ['data/*', 'templates/*'],
+      package_data={'': ['data/*', 'templates/*', '_libs/*.dll'],
                     'pandas.tests.io': ['data/legacy_hdf/*.h5',
                                         'data/legacy_pickle/*/*.pickle',
                                         'data/legacy_msgpack/*/*.msgpack',

Distributors could then copy (or not) the C++ runtime DLL to the _libs folder before building the wheel.

@jreback
Copy link
Contributor

jreback commented Jun 5, 2018

@chris-b1 is this the fix mpl did?

@chris-b1
Copy link
Contributor Author

chris-b1 commented Jun 5, 2018

No, mpl packages the whole DLL. I thought static linking might be better in terms of limiting ending binary size, but wasn't aware of issue @cgohlke linked.

@cgohlke - wasn't clear to me looking at mpl setup, do you just manually copy DLL over into that directory? I see the stuff adding it to package_data but couldn't figure out how the DLL gets there in the first place.

https://github.com/matplotlib/matplotlib/blob/6ec80eac66c4828f28540f76734cae811787d626/setupext.py#L1770

@cgohlke
Copy link
Contributor

cgohlke commented Jun 5, 2018

wasn't clear to me looking at mpl setup, do you just manually copy DLL over into that directory?

Yes, this way it is up to the distributor to copy any required DLL to the _libs folder before building the wheel. IMO it is not worth automating this in setup.py because there are many possible Visual Studio versions and use cases to consider. Canopy and Anaconda distributions won't need to package any C/C++ runtime DLL.

@chris-b1 chris-b1 changed the title BLD: static link c++ runtime on Windows BLD: include dll in package_data on Windows Jun 6, 2018
@chris-b1
Copy link
Contributor Author

chris-b1 commented Jun 6, 2018

@TomAugspurger - is it correct that the Windows wheels you upload to PyPI are from @cgohlke? If that's the case, @cgohlke does this work for you?

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Jun 6, 2018 via email

@cgohlke
Copy link
Contributor

cgohlke commented Jun 6, 2018

does this work for you?

I'm sure it does for Python 3. The build script will copy MSVCP140.DLL and CONCRT140.DLL from the Visual Studio redist directory to pandas\_libs\ before running setup.py, e.g.:

copy /Y /B "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\Microsoft.VC140.CRT\MSVCP140.DLL" pandas\_libs\
copy /Y /B "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\x64\Microsoft.VC140.CRT\CONCRT140.DLL" pandas\_libs\

It's probably better not to copy msvcp90.dll for Python 2.7. IIRC the Python installer already installs the C++ runtime unless it's a per user installation. The .pyd file may or may not be missing a manifest to load msvcp90.dll next to it. I do not have any system for testing without the redistributable package installed. I am not aware of any other package that distributes msvcp90.dll, not matplotlib and not pyzmq.

@jreback
Copy link
Contributor

jreback commented Jun 7, 2018

does the MANIFEST.in need updating here?

@jreback
Copy link
Contributor

jreback commented Jun 8, 2018

@chris-b1 can you rebase. ready to go?

@chris-b1
Copy link
Contributor Author

chris-b1 commented Jun 8, 2018

Just rebased, Should be good, I'm not familiar with MANIFEST.in, but seems to only involve source distributions?

https://docs.python.org/3/distutils/sourcedist.html#specifying-the-files-to-distribute

@jreback
Copy link
Contributor

jreback commented Jun 8, 2018

@chris-b1 ok that sounds good then.

@jreback jreback merged commit 324b324 into pandas-dev:master Jun 8, 2018
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Jun 12, 2018
TomAugspurger pushed a commit that referenced this pull request Jun 12, 2018
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pandas 0.23.0 gives ImportError: DLL load failed
4 participants