CI: Use all available cores in the CI #24028

Closed
datapythonista opened this issue Dec 1, 2018 · 9 comments
Labels: CI Continuous Integration


@datapythonista (Member) commented Dec 1, 2018

If I'm not wrong, both Travis and Azure provide two cores for the execution of jobs (in both cases the number doesn't necessarily have to be 2, and could be higher, or I guess lower, depending on their load).

So far we are only parallelizing the execution of the tests, where we are using 2 processes in most cases (except for the tests marked as single).

We should be able to speed up the CI by detecting the number of CPUs available for the job, and parallelizing at least the following (assuming 2 CPUs):

  • Building pandas (adding -j2 to python setup.py build_ext --inplace)
  • Building the docs (adding --num-jobs=2 to ./doc/make.py html)
  • Running the tests with the detected number of CPUs, where we're not already using multiple processes
  • Not sure if there is anything else that can run on more than one core (conda, linting, benchmarks...); see the sketch after this list
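
A minimal sketch of the idea (assuming a Linux CI image; the NCORES variable and the exact flags on each command are illustrative, not the current pandas setup):

```bash
#!/usr/bin/env bash
# Detect how many cores the CI machine actually provides; fall back to 2.
NCORES=$(nproc 2>/dev/null || echo 2)

# Build the C extensions in parallel.
python setup.py build_ext --inplace -j "$NCORES"

# Build the docs with one parallel sphinx job per core.
./doc/make.py html --num-jobs="$NCORES"

# Run the parallel-safe tests with one worker per core.
pytest -n "$NCORES" pandas
```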

Travis specs: https://docs.travis-ci.com/user/reference/overview/#virtualisation-environment-vs-operating-system

UPDATE:

  • Based on tests on my local machine, it doesn't seem like the -j argument to setup.py build_ext has any effect. I'd say it's ignored, and the build is always single core.

Time to run the tests (not slow and not network) in the CI:

1st run:

|        | -j 0     | -j auto  |
|--------|----------|----------|
| azure  | 1443 sec | 804 sec  |
| travis | 1362 sec | 959 sec  |

2nd run:

|        | -j 0     | -j auto  |
|--------|----------|----------|
| azure  | 1461 sec | 839 sec  |
| travis | 1421 sec | 1009 sec |
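
If the -j values above correspond to pytest-xdist's -n option (an assumption on my part; the actual invocation lives in the CI scripts), the two configurations compared would be roughly:

```bash
# Serial run, no xdist workers (the "-j 0" column above).
pytest -m "not slow and not network" pandas

# One worker per detected core (the "-j auto" column above).
pytest -n auto -m "not slow and not network" pandas
```
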
datapythonista added the CI Continuous Integration label on Dec 1, 2018
@bhavaniravi (Contributor):

This sounds interesting, I can take this up next. But I'm not sure where to start exactly.

@datapythonista (Member, Author):

Afaik we're running 3 things in the CI that are worth parallelizing (and I don't think they were being parallelized when I wrote this issue):

  • The docs build
  • Compiling pandas
  • Running the tests

Parallelizing the test runs is actually very tricky; I've got #26949 to experiment with. But let's set that aside for now, focus on the others, and take care of it later.

The docs are built in the last job in Azure, if I'm not wrong, and the entry point is doc/make.py, which has a parameter to control the number of parallel sphinx jobs. I'm not sure what we're calling at the moment; let me know if you can't find it.

For the compilation of pandas, every job calls ci/setup_env.sh, and this is where it is compiled. Compiling is done by a command like python setup.py build_ext --inplace, which receives a -j parameter for the number of jobs. If I remember correctly, it wasn't working for me locally but was working for some other people, and I'm not sure what was happening in the CI (I think I tried).

Probably the first step is to pick one of them, see how many jobs are expected to be used, change the value, and see whether the CI runs faster. I think you set up Azure for your fork; feel free to experiment there, or directly in pandas, whichever you prefer.
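
For instance, a hypothetical way to check locally whether the -j flag makes any difference before touching the CI (note that git clean removes all untracked files under pandas/, including the built extensions, so use with care):

```bash
# Wipe previously built extensions so each build starts from scratch.
git clean -xfd pandas/
time python setup.py build_ext --inplace -j 1

git clean -xfd pandas/
time python setup.py build_ext --inplace -j 4
```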

@bhavaniravi (Contributor):

@jreback and @tomascassidy: #18052 updated the setup instructions in the contributing.rst documentation to python setup.py build_ext --inplace -j 4.

I am unable to find where -j is implemented in the codebase. Can you help me with this? I need to understand how it is implemented in order to work on improving the CI performance.

@datapythonista (Member, Author):

build_ext is a standard thing from setuptools. I'm not sure if there is much documentation about it, but it should be in their docs (see the mention here: https://pythonhosted.org/an_example_pypi_project/setuptools.html#using-setup-py).

setup.py build_ext --help should provide some information, but -j should be the number of cores to use.
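
For example, the option should be listed in the command's help output (the exact wording depends on the Python/setuptools version in use):

```bash
# Show build_ext's options; the parallel jobs flag (-j / --parallel)
# should appear in the list.
python setup.py build_ext --help | grep -i parallel
```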

@bhavaniravi (Contributor):

@datapythonista what's your take on adding caching to the build process? Have we tried anything along those lines before? matplotlib/matplotlib#1507

@datapythonista (Member, Author):

There was a bit of discussion about it before, and the preference was to keep things simple in the CI.

Personally, I think it would make sense to build a docker image or something every night with the latest version of the source code and the conda dependencies, so the build for every PR just downloads what's needed on top of that. But I'm not an expert, and I may be missing something.

The best thing would be to create an issue, so everybody interested can give their opinion and we can share our ideas and experience. Feel free to open it (probably better with a specific proposal, so the discussion is more focused).

@bhavaniravi (Contributor):

> build_ext is a standard thing from setuptools. Not sure if there is much documentation about it, but should be in their docs (see the mention here: https://pythonhosted.org/an_example_pypi_project/setuptools.html#using-setup-py)
>
> setup.py build_ext --help should provide some information, but -j should be the number of cores to use.

@datapythonista it is not part of setuptools; the only other place I see it on the whole internet is in the numpy docs. Looking through their code, they have a custom implementation for parallel processing. We would need to do the same for -j to work:

https://github.com/numpy/numpy/blob/42ac121a0b547bc7c218c7ea318caf6c0a420034/numpy/distutils/command/build_ext.py#L34

@datapythonista (Member, Author):

build_ext is a command defined in distutils (part of the Python standard library) and extended in setuptools. I don't think there is much documentation, but there is some in the Python docs: https://docs.python.org/3/distutils/configfile.html

Here is where -j is defined: https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/distutils/command/build_ext.py#L87

But if I'm not wrong, setuptools wraps it, so you may need to check there if this is not the code that's actually used.
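
Assuming the distutils code linked above is what ends up in effect, the option should be accepted in both its short and long spelling (a sketch, not verified against every setuptools version):

```bash
# Short spelling, as used in contributing.rst.
python setup.py build_ext --inplace -j 4

# Equivalent long spelling declared by distutils ("parallel=", "j").
python setup.py build_ext --inplace --parallel=4
```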

jreback added this to the Contributions Welcome milestone on Oct 6, 2019
@datapythonista (Member, Author):

I think this was already implemented with the PYTEST_WORKERS variable: https://github.com/pandas-dev/pandas/blob/master/ci/run_tests.sh#L23
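
Roughly, the relevant part of that script does something like the following (a simplified sketch; the default value and the marker expression here are assumptions, not the exact pandas code):

```bash
# "auto" makes pytest-xdist start one worker per available core.
PYTEST_WORKERS="${PYTEST_WORKERS:-auto}"

pytest -n "$PYTEST_WORKERS" -m "not slow and not network" pandas
```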
