[Windows] Installing conda package is very slow #1175

Closed
fcollonval opened this issue Sep 12, 2018 · 4 comments

@fcollonval
Contributor

I really love the new plotly.py 3.x, but installing it from the conda package takes ages.

On 64-bit Windows 7 and 10 with an SSD, conda install plotly -c conda-forge in a fresh conda environment takes at least 15 min, whereas pip install plotly in the same fresh environment takes about 1 min.

As our workflow involves frequently creating conda environments, this is really annoying.

During the conda installation, the longest step is the last one (Executing transaction). The bottleneck is the byte-code generation for all the Python scripts in plotly. Subprocesses like the following run for most of the 15 min:

"...envs\plotly\python.exe" -Wi -m py_compile "...envs\plotly\Lib\site-packages\plotly\validators\scattermapbox\marker\colorbar\tickformatstop\_value.py"
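To get a feel for how many such subprocesses an installation implies, here is a minimal sketch that builds one command of the shape shown above per source file in a package directory. The function name `py_compile_commands` and the idea of pointing it at an installed package are illustrative assumptions; this is not how conda itself builds its command list.

```python
import sys
from pathlib import Path


def py_compile_commands(package_dir, python=sys.executable):
    """Return one `python -Wi -m py_compile <file>` command per *.py file.

    Hypothetical helper: it only mirrors the shape of the command observed
    in the report and does not run anything.
    """
    return [
        [python, "-Wi", "-m", "py_compile", str(path)]
        for path in sorted(Path(package_dir).rglob("*.py"))
    ]
```

Calling `len(py_compile_commands(some_package_dir))` on an installed package directory gives the number of interpreter launches that a one-process-per-file strategy would need.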

I have no idea whether this can be solved through the recipe. If it would be better to report it to conda, I'll happily open an issue there too.

I don't know if the option skip_compile_pyc in the recipe can help.

Reference:

  • conda version : 4.5.11
  • platform : win-64
  • os: Windows/10.0.17134 and Windows/7
  • Python version: 3.7 and 3.6
  • Plotly.py: 3.2.0
@jonmmease
Contributor

Hi @fcollonval, glad you're enjoying plotly.py and thanks for reaching out.

TLDR, installation is much faster using the new packages in the plotly anaconda channel (https://anaconda.org/plotly/plotly)

conda install -c plotly plotly

Can you try this out and verify that it's significantly faster?

Explanation:
As you noticed, the majority of the installation time for the packages in the main and conda-forge channels is spent byte compiling all of the Python files. One reason this takes a long time is that plotly.py contains thousands of *.py files that are produced using code generation (everything in the plotly.graph_objs package). Another reason this takes a while is that conda isn't particularly efficient at this, as a separate Python process is launched to compile each file individually.
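The cost of launching one interpreter per file can be measured with a rough sketch like the following. It compares a loop that spawns `python -m py_compile` for each file against a single in-process call to the standard library's `compileall.compile_dir`. The function names are hypothetical, and the loop only mimics the pattern described above rather than reproducing conda's actual code.

```python
import compileall
import subprocess
import sys
import time
from pathlib import Path


def compile_one_process_per_file(package_dir):
    """Byte compile each file in its own interpreter; return elapsed seconds."""
    # Collect the files first so the walk is not affected by new __pycache__ dirs.
    sources = sorted(Path(package_dir).rglob("*.py"))
    start = time.perf_counter()
    for path in sources:
        subprocess.run([sys.executable, "-m", "py_compile", str(path)], check=True)
    return time.perf_counter() - start


def compile_in_one_process(package_dir):
    """Byte compile the whole tree in the current interpreter; return elapsed seconds."""
    start = time.perf_counter()
    # force=True recompiles even if up-to-date *.pyc files already exist.
    ok = compileall.compile_dir(str(package_dir), quiet=1, force=True)
    if not ok:
        raise RuntimeError("some files failed to compile")
    return time.perf_counter() - start
```

Running both functions on the same directory and comparing the returned timings shows how much of the total is process start-up overhead; the exact numbers will depend on the machine and the file count.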

After digging through the conda source code a bit I came to the conclusion that the only way to get conda to skip the *.pyc compilation step during installation is to make sure that the *.pyc files are bundled with the package. But bundling the *.pyc files isn't possible in a noarch package, because *.pyc files are Python version specific.
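The version dependence of *.pyc files can be seen directly from the standard library. This small sketch (the function name `bytecode_info` is made up for illustration) reports the cache tag and magic number of the running interpreter, plus the cache path it would use for a given source file.

```python
import importlib.util
import sys


def bytecode_info(source_path):
    """Describe where and how the running interpreter caches bytecode."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        # Tag embedded in the *.pyc file name, e.g. "cpython-37" on CPython 3.7.
        "cache_tag": sys.implementation.cache_tag,
        # Marker written at the start of every *.pyc; it changes when the
        # bytecode format changes between Python versions.
        "magic_number": importlib.util.MAGIC_NUMBER.hex(),
        "pyc_path": importlib.util.cache_from_source(source_path),
    }
```

Running this under two different Python versions yields different cache tags, which is why a single set of bundled *.pyc files cannot serve every interpreter a noarch package may be installed into.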

Because of this, I opted to build a set of OS/architecture specific packages for the plotly anaconda channel as part of our CI build process (See #1154). This way each package includes all of the *.pyc files already. This is much more resource intensive on the build side (pre-byte compiling code for 20 packages vs. building 1 package without byte compiling), but it makes for a much better installation experience. Compared to conda noarch the initial installation is much faster. Compared to installing as a wheel from PyPI, the first import is a bit faster since the *.pyc files are included.
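To check whether an installed package already ships bytecode for the running interpreter, a sketch along these lines can help. The helper name `sources_missing_bytecode` is hypothetical, and it only checks that a cached file exists, not that it is up to date.

```python
import importlib.util
from pathlib import Path


def sources_missing_bytecode(package_dir):
    """List the *.py files under package_dir that have no cached *.pyc file."""
    missing = []
    for source in sorted(Path(package_dir).rglob("*.py")):
        pyc = Path(importlib.util.cache_from_source(str(source)))
        if not pyc.exists():
            missing.append(source)
    return missing
```

An empty result right after installation suggests the bytecode was bundled or compiled at install time; a long list means the first import will have to compile those files.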

Also, I did look into the skip_compile_pyc flag (as it does sound promising!), but it turns out that it applies only to non-noarch packages, where it allows package authors to exclude some *.pyc files from being included in the package in the first place. It has no effect on noarch packages since they already don't include any *.pyc files.

(This is probably more than you wanted to know, but I realized that I hadn't written down the full rationale yet.)

@fcollonval
Contributor Author

@jonmmease thanks for this quick and detailed answer.
I'll try installing through the plotly channel later today and let you know.

In the meantime, do you know whether any issues about this have been reported on the conda issue tracker? Some speedup work should be considered.

@fcollonval
Contributor Author

@jonmmease I just tested the installation with the plotly channel. It took less than 20 sec. Users should definitely use your channel.

Long live plotly 😉

@Quetzalcohuatl

Thanks for investigating Jon. The solution is: You are forced to use conda install for plotly. If you want to use pip install, be prepared to wait hours because it needs to individually compile multiple Python files into .pyc files. If there is a way to upload a wheel with precompiled files that we can pip install from, that would be great, because then we aren't forced into using conda.
