Skip to content

Restructure of Docker command execution and large build refactor #1524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
Aug 12, 2015

Conversation

agjohnson
Copy link
Contributor

Note: this is ongoing, review for design decisions for now

I was not happy with the pattern used to execute commands in containers. The original flow for executing tasks in Docker was:

  • Get build/project/version state from API
  • Serialize JSON
  • Execute django admin command in docker, pass JSON into forked docker command via STDIN, docker command pipes to django admin command
  • Django admin command deserialized JSON
  • Execute build task from admin command, passing in project/version data
  • Build task executes
  • Serialize response, output STDOUT from django admin
  • docker command returns STDOUT to original task

This was deficient is several ways:

  • It required our full stack in a container - this was a very large container that needed to be re-rolled every deploy
  • It is very fragile, requiring several layers of piping STDIN/STDOUT through commands.
  • Breakage was difficult to track down when it did happen

So, this aimed to tackle that and turned into a larger refactor, as we could be more intelligent about how we were executing commands. My first inclination was to use proper IPC to make this less fragile, but @ericholscher brought up the point that we would treat the container as a dumb container as well. The goals of this PR will be:

  • Wrap all command execution in stateful Environment objects, so all forked commands are tracked.
  • A LocalEnvironment provides local command execution
  • Remove the need for shell=True command execution
  • Remove passing around results object in task -- Environment object tracks whether a command failed and has state on output/etc
  • Make docker image a lightweight shell, with only python dependencies
  • Docker commands execute through exec mechanism instead of a single command
  • Handle all docker commands through docker-py
  • Docker commands have no stdin/stdout, so output will be combined. This seems like a good time to drop the weird pattern of splitting stderr/stdout. I have thoughts on how to finally get combined stdout/stderr with formatting on stderr, but it's a surprisingly hard problem to solve.

Side effects of this refactor so far include:

  • Reduce duplicated code and build complexity greatly by removing magical PDF builds. Refactoring the commands executed, we were duplicating efforts to create a secondary PDF build. This build is ofter confusing for users, and shouldn't exist.
  • Put all commands for PDF/ePub builds in main build, fail task but don't halt on these errors. This will take some handling of exceptions during the build process and rewrap them if they are critical
  • Build output page is now a list of commands and output, not broken up stdout/stderr per build stage
  • No more build "steps" -- I'm still contemplating whether this would be useful, or if it would be useful in a differently applied method than it is now.
  • Task update_docs is now a subclassed Task object, allowing us to track state for the task. This could also be a separate class, but seemed more applicable as a full Task object with build/version/project state.

Finally, some of the design decisions here:

  • Environments (LocalEnvironment/DockerEnvironment) are used to track build state, instead of passing around results and other states through method calls on project build
  • Commands no longer require running in a with context
  • Environments create these command objects, tracking and executing them
  • Environment wrap all calls with exception handling. If an unhandled exception bubbles up to the Environment it is reported.
  • A special exception is used to report errors to the user via builds page, BuildEnvironmentError. This will add error state to the command on the build page

Thoughts on where to take this next, and I think this work sets us up for an easy transition:

  • Per-command output tracking -- this might make sense to include here, as the output is changing significantly already
  • Additional environments for execution -- these will just be wrappers around executing commands, but could be anything (spin up ec2 instance, build, destroy/etc)

@agjohnson agjohnson added Needed: design decision A core team decision is required Needed: documentation Documentation is required Needed: tests Tests are required PR: work in progress Pull request is not ready for full review labels Jul 31, 2015
@agjohnson agjohnson self-assigned this Jul 31, 2015
@agjohnson agjohnson force-pushed the build-refactor-and-docker branch from 8a4751a to 75dd2df Compare July 31, 2015 20:15
@@ -20,4 +20,6 @@ def handle(self, *args, **options):
project_data = api.project(slug).get()
p = tasks.make_api_project(project_data)
log.info("Building %s" % p)
tasks.update_docs(pk=p.pk, docker=True)
# TODO just move this to update_api.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should decide what management commands are worth sticking around. Either way, build vs docker build should just be a parameter to the build comand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. We should figure out what makes sense to keep around, and document the useful ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, maybe we consider implementing them as admin functions instead. Cases like build docs for a version from the admin are more useful for rtd.com, but allowing non ops folks to trigger some of the management functionality from the admin might be a win

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, they should all be super light wrappers around util methods, so we could expose them both places.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made #1526 for this.

@agjohnson
Copy link
Contributor Author

Outstanding issues that I am finishing up on:

  • On docker command timeout, the container goes away as expected, but we should tell the user their build timed out on build output page
  • Ensure container is cleaned up on exceptions, or be smarter about creating containers. We don't have access to the --rm option for run through the docker api for some reason, so the container sticks around on faulty builds
  • Build output page is still awful, will address separately so this PR can continue
  • Where to put Dockerfile -- this is maybe a good deploy step, though it overlaps more with server provisioning. Dockerfile should be public

if len(args):
for slug in args:
project_data = api.project(slug).get()
p = tasks.make_api_project(project_data)
log.info("Building %s" % p)
tasks.update_docs(pk=p.pk)
update_docs = tasks.UpdateDocsTask()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't just put this in the tasks.py so it's importable like previously?

These objects will allow for multiple command execution patterns without
changing the actual commands we're executing. This allows us to use Docker as a
thin wrapper around container execution, instead of treating the Docker
containers as a black box. This works towards per-command tracking of commands
as well, giving us a far better UI around build state.
Fixes and adds more testing on build tests, including tests for pdf build
failure states.
This cleans up some testing of environments and adds tests
to handle extended character return gracefully.
This allows for delving into the passing behavior of a build, but the meat and
potatoes is the error output.
This catches errors that did not emanate from a command
This addresses exiting due to timeout and memory usage
This will remove stale containers and throw an exception that would back out of
the buildenv/task if there is a collision
Missing build data, and subsequent mocked out update_build method on envs
caused a done state to not be detected, and thus success (with done state
implied) was false.
@agjohnson agjohnson added PR: ready for review and removed Needed: design decision A core team decision is required Needed: documentation Documentation is required Needed: tests Tests are required PR: work in progress Pull request is not ready for full review labels Aug 8, 2015
@agjohnson
Copy link
Contributor Author

Okay, to wrap this up, I've added some documentation and cleaned up the last remaining issues that I had with the code. I've push up a new repo with the docker file required to start with this build env: https://github.com/rtfd/readthedocs-docker-images

It would be worth throwing some more projects at this system for integration testing. I've been through many iterations of testing on this all, but would be good to revisit initial setup issues/etc.

I will follow this up with another PR that makes the UX on output not suck. It currently is a first pass and is quite noisey.

==================

Read the Docs uses container virtualization to encapsulate documentation build
processes. Each build spins up a new virtual machinge using our base image,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type: "machinge" -> "machine"

@ericholscher
Copy link
Member

:shipit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants