Execute all checkout steps inside docker #5673

Closed
stsewd opened this issue May 7, 2019 · 8 comments · Fixed by #6436
Labels: Accepted (Accepted issue on our roadmap), Improvement (Minor improvement to code)

Comments

stsewd (Member) commented May 7, 2019

Right now we execute the checkout steps (clone, pull) in the build itself, outside Docker. The rest of the commands (installing dependencies, building HTML, etc.) are executed inside Docker.

We can't execute all the commands inside Docker because we use gitpython to parse branches/tags and for other operations.

Possible solutions:

  1. Go back to parsing the git output ourselves and drop gitpython (we basically only use gitpython to list branches/tags/submodules).
  2. Use gitpython inside Docker and stream the results back. We could get the results back using pickle (which may break across different Python versions) or something else like JSON (we only need branches/tags/submodules, so it could be a single call that generates a JSON file, which we then read).
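Option 1 could be sketched as a small parser over `git for-each-ref` output. Everything below (the format string, the `parse_refs` name, the tuple shape) is an illustrative assumption, not the actual Read the Docs code:

```python
# Hypothetical sketch of option 1: parse the output of
#   git for-each-ref --format='%(objectname) %(refname)'
# ourselves instead of going through gitpython.

def parse_refs(output):
    """Split for-each-ref lines into (branches, tags) lists of (name, sha)."""
    branches, tags = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, _, refname = line.partition(" ")
        if refname.startswith("refs/heads/"):
            branches.append((refname[len("refs/heads/"):], sha))
        elif refname.startswith("refs/tags/"):
            tags.append((refname[len("refs/tags/"):], sha))
    return branches, tags


sample = "a1b2c3d refs/heads/master\ne4f5a6b refs/tags/v1.0\n"
branches, tags = parse_refs(sample)
```

Running the git command inside the container and parsing its stdout on the host would avoid needing gitpython (or any Python at all) inside the build image.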
@stsewd stsewd added Improvement Minor improvement to code Needed: design decision A core team decision is required labels May 7, 2019
stsewd added a commit to stsewd/readthedocs.org that referenced this issue May 7, 2019
The title isn't 100% true, because it depends on the settings (if Docker is enabled). There are still commands being executed outside Docker (readthedocs#5673).
ericholscher (Member) commented

I think we should definitely be trying to run everything in Docker, so option 2 sounds best to me. I think we can do it in stages, though: let's move the clone inside Docker first, then figure out the best approach for the other commands after that.

humitos (Member) commented Jun 10, 2019

What exactly is the idea behind option 2? If I understand correctly, we call gitpython from inside Docker, querying for all the info we need (branches, tags, commit hash, etc.), and save all of that into a JSON file. Then, from the host, we read that file. Am I right?

I wouldn't use pickle, as it's known to have security issues.

stsewd (Member, Author) commented Jun 10, 2019

@humitos yeah, something like mounting the script in rtd/bin/get_vcs_information.py
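A minimal sketch of that handoff, assuming a JSON file with just the three things mentioned in the issue (branches, tags, submodules). The function names, schema, and file location are hypothetical, used only to illustrate the idea:

```python
import json
import os
import tempfile


def dump_vcs_information(branches, tags, submodules, path):
    # Container side: serialize everything the host needs in one call,
    # using JSON rather than pickle (no arbitrary-code-execution risk,
    # and no dependency on matching Python versions across environments).
    with open(path, "w") as fh:
        json.dump(
            {"branches": branches, "tags": tags, "submodules": submodules},
            fh,
        )


def load_vcs_information(path):
    # Host side: read the file back after the container exits.
    with open(path) as fh:
        return json.load(fh)


# Round-trip example via a directory that would be mounted into the container.
path = os.path.join(tempfile.mkdtemp(), "vcs_info.json")
dump_vcs_information(["master", "stable"], ["v1.0"], [], path)
info = load_vcs_information(path)
```

The real container-side script would gather this data with gitpython before dumping it; only the JSON file crosses the container boundary.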

humitos (Member) commented Jun 10, 2019

I'm thinking that could be tricky. That script would need a Python environment with gitpython installed to be executed, right?

Another approach could be a mixture of the two: query the git repository data from the host (using gitpython), and use this data from the container to run the git commands inside it.

ericholscher (Member) commented

I don't think installing gitpython in the build env is too much; we already have lots of other Python dependencies there. I think the bigger question is around architecture, and whether we want more running in the build env. That seems like a reasonable approach, and we should at least put together a proof of concept; it shouldn't be too hard.

stsewd (Member, Author) commented Jul 12, 2019

So, it looks like it doesn't matter whether we'd install gitpython in the user env, because we can't: we have a chicken-and-egg problem. We do all the git steps (getting branches and tags), and only after that do we create the Python env.

stsewd (Member, Author) commented Jul 12, 2019

But I think we can use the global Python interpreter, or create an env just to execute gitpython.

stsewd (Member, Author) commented Jul 12, 2019

I have this silly POC: master...stsewd:stsewd/get-git-info-from-container

It works(1) in the local environment and in Docker. There are some things to decide:

  • Use a virtualenv to install gitpython (we can refactor our current Virtualenv class to work without version, config, etc.), or just install it in the current virtualenv (the local environment gets polluted with gitpython in that case)
  • How to copy the script. In the POC I'm mounting the script with Docker, but we could make the script an installable package
  • Define the error codes for the script. Right now it only exits with 1 for all errors, but I think we want better error messages to show to the user.

  (1) It only creates a JSON object from the output, but it would be easy to refactor the other methods to make use of it.
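For the error-code point, the script could reserve distinct exit codes that the host maps to user-facing messages. The specific codes, names, and messages below are made up for illustration; they are not part of the POC:

```python
from enum import IntEnum


class VCSExitCode(IntEnum):
    """Hypothetical exit codes for a get-vcs-information script."""
    SUCCESS = 0
    GENERIC_ERROR = 1      # what the POC currently returns for everything
    REPO_NOT_FOUND = 2     # repository missing or inaccessible
    INVALID_SUBMODULE = 3  # submodule configuration could not be parsed


# Host-side mapping from exit code to a message shown to the user.
USER_MESSAGES = {
    VCSExitCode.REPO_NOT_FOUND: "Could not access the repository.",
    VCSExitCode.INVALID_SUBMODULE: "Invalid submodule configuration.",
}
```

The host would check the container's exit status, look it up in a table like this, and fall back to a generic build error for unknown codes.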

stsewd added a commit to stsewd/readthedocs.org that referenced this issue Dec 4, 2019
Closes readthedocs#5673

There are a couple of things to note with this:

- Git commands using gitpython will use the git binary from the host.
- Git commands using git directly will use the git binary from the
  container.

This may bring some surprises, since the git versions will be different,
but I'm not really worried about it; it should work.

Another thing: to make a clone we will spin up a build image, which is
big. We could probably add an image that only has git in it.

This should be merged after we have integrated this with the ssh-agent
in .com.
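One way to keep an eye on the host/container git mismatch mentioned above would be to compare the two `git --version` strings at build time. This check is a sketch of an assumption, not something the referenced commit does:

```python
def parse_git_version(version_line):
    """Extract (major, minor) from output like 'git version 2.25.1'."""
    number = version_line.strip().split()[2]
    major, minor = number.split(".")[:2]
    return int(major), int(minor)


def versions_match(host_line, container_line):
    # Only compare major.minor; patch-level differences are unlikely to
    # change the behavior of clone/fetch/ls-remote.
    return parse_git_version(host_line) == parse_git_version(container_line)
```

If the versions diverge, the builder could log a warning rather than fail, since mismatches are expected to be mostly harmless.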
@stsewd stsewd added Accepted Accepted issue on our roadmap and removed Needed: design decision A core team decision is required labels Jan 8, 2020
3 participants