Skip to content

Build: skip build based on commands' exit codes #9649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Nov 14, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions docs/user/build-customization.rst
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,7 @@ There are some caveats to knowing when using user-defined jobs:
* Environment variables are expanded in the commands (see :doc:`environment-variables`)
* Each command is executed in a new shell process, so modifications done to the shell environment do not persist between commands
* Any command returning non-zero exit code will cause the build to fail immediately
(note there is a special exit code to `cancel the build <cancel-build-based-on-a-condition>`_)
* ``build.os`` and ``build.tools`` are required when using ``build.jobs``


Expand Down Expand Up @@ -104,6 +105,70 @@ To avoid this, it's possible to unshallow the clone done by Read the Docs:
- git fetch --unshallow


Cancel build based on a condition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a command exits with code ``183``,
Read the Docs will cancel the build immediately.
You can use this approach to cancel builds that you don't want to complete based on some conditional logic.

.. note:: Why 183 was chosen for the exit code?

It's the word "skip" encoded in ASCII.
Then it's taken the 256 modulo of it because
`the Unix implementation does this automatically <https://tldp.org/LDP/abs/html/exitcodes.html>`_
for exit codes greater than 255.

.. code-block:: python

>>> sum(list('skip'.encode('ascii')))
439
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Belated comment: Do you still need this explanation now that you no longer use 439 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It explains where the number 183 comes from. Isn't it clear with this note? I'm happy to update it

>>> 439 % 256
183


Here is an example that cancels builds from pull requests when there are no changes to the ``docs/`` folder compared to the ``origin/main`` branch:

.. code-block:: yaml
:caption: .readthedocs.yaml

version: 2
build:
os: "ubuntu-22.04"
tools:
python: "3.11"
jobs:
post_checkout:
# Cancel building pull requests when there aren't changed in the docs directory.
# `--quiet` exits with a 1 when there **are** changes,
# so we invert the logic with a !
#
# If there are no changes (exit 0) we force the command to return with 183.
# This is a special exit code on Read the Docs that will cancel the build immediately.
- |
if [ $READTHEDOCS_VERSION_TYPE = "external" ];
then
! git diff --quiet origin/main -- docs/ && exit 183;
fi


This other example shows how to cancel a build if the commit message contains ``skip ci`` on it:

.. code-block:: yaml
:caption: .readthedocs.yaml

version: 2
build:
os: "ubuntu-22.04"
tools:
python: "3.11"
jobs:
post_checkout:
# Use `git log` to check if the latest commit contains "skip ci",
# in that case exit the command with 183 to cancel the build
- (git --no-pager log --pretty="tformat:%s -- %b" -1 | grep -viq "skip ci") || exit 183


Generate documentation from annotated sources with Doxygen
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Down
40 changes: 40 additions & 0 deletions docs/user/builds.rst
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,46 @@ The following are the pre-defined jobs executed by Read the Docs:
it's possible to run user-defined commands and :doc:`customize the build process <build-customization>`.


When to cancel builds
---------------------

There may be situations where you want to cancel a particular running build.
Cancelling running builds will allow your team to speed up review times and also help us reduce server costs and ultimately,
our environmental footprint.

Consider the following scenarios:

* the build has an external dependency that hasn't been updated
* there were no changes on the documentation files
* many other use cases that can be solved with custom logic

For these scenarios,
Read the Docs supports three different mechanisms to cancel a running build:

:Manually:

Once a build was triggered,
project administrators can go to the build detail page
and click the button "Cancel build".

:Automatically:

When Read the Docs detects a push to a branch that it's currently building the documentation,
it cancels the running build and start a new build using the latest commit from the new push.

:Programatically:

You can use user-defined commands on ``build.jobs`` or ``build.commands`` (see :doc:`build-customization`)
to check for a condition and exit it with the code ``183`` if you want to cancel the running build or ``0``, otherwise.

In this case, Read the Docs will communicate to your Git platform (GitHub/GitLab) that the build succeeded (green tick ✅)
so the pull request is in a mergeable state.

.. tip::

Take a look at :ref:`build-customization:cancel build based on a condition` section for some examples.


Build resources
---------------

Expand Down
12 changes: 9 additions & 3 deletions readthedocs/doc_builder/constants.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,11 @@
# -*- coding: utf-8 -*-

"""Doc build constants."""

import structlog
import re

import structlog
from django.conf import settings


log = structlog.get_logger(__name__)

PDF_RE = re.compile('Output written on (.*?)')
Expand All @@ -30,3 +28,11 @@
DOCKER_OOM_EXIT_CODE = 137

DOCKER_HOSTNAME_MAX_LEN = 64

# Why 183 exit code?
#
# >>> sum(list('skip'.encode('ascii')))
# 439
# >>> 439 % 256
# 183
RTD_SKIP_BUILD_EXIT_CODE = 183
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I brought up the point that perhaps we want support for more explicit codes here. If so, perhaps we should choose a less arbitrary value here and allocate a block of exit codes.

183 is in a safe range for posix standards at least though:
https://tldp.org/LDP/abs/html/exitcodes.html

But, do we want something like:

  • Exit code 180: skip build immediately
  • Exit code 181: fail build immediately
  • Exit code 182: retry build later?
  • etc

Not to say we're implementing everything now, but if we will have more than one exit code, I'd suggest starting from a value that doesn't seem so arbitrary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

183 is in a safe range for posix standards at least though

Yeah, I checked for this as well and I was happy that "skip" was in that range 😄

I brought up the point that perhaps we want support for more explicit codes here

This looks interesting, but right now I'm not seeing a clear usage for other useful exit codes. We can keep thinking about this and see if we find some other useful cases. In any case, we can reserve the range 180-185 and start with 183 for this one; there is no need to make them consecutive.

5 changes: 4 additions & 1 deletion readthedocs/doc_builder/environments.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,8 +30,9 @@
DOCKER_SOCKET,
DOCKER_TIMEOUT_EXIT_CODE,
DOCKER_VERSION,
RTD_SKIP_BUILD_EXIT_CODE,
)
from .exceptions import BuildAppError, BuildUserError
from .exceptions import BuildAppError, BuildUserError, BuildUserSkip

log = structlog.get_logger(__name__)

Expand Down Expand Up @@ -468,6 +469,8 @@ def run_command_class(
project_slug=self.project.slug if self.project else '',
version_slug=self.version.slug if self.version else '',
)
elif build_cmd.exit_code == RTD_SKIP_BUILD_EXIT_CODE:
raise BuildUserSkip()
else:
# TODO: for now, this still outputs a generic error message
# that is the same across all commands. We could improve this
Expand Down
5 changes: 5 additions & 0 deletions readthedocs/doc_builder/exceptions.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,11 @@ class BuildUserError(BuildBaseException):
)


class BuildUserSkip(BuildUserError):
message = gettext_noop("This build was manually skipped using a command exit code.")
state = BUILD_STATE_CANCELLED


class ProjectBuildsSkippedError(BuildUserError):
message = gettext_noop('Builds for this project are temporarily disabled')

Expand Down
19 changes: 16 additions & 3 deletions readthedocs/projects/tasks/builds.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@
BuildCancelled,
BuildMaxConcurrencyError,
BuildUserError,
BuildUserSkip,
MkDocsYAMLParseError,
ProjectBuildsSkippedError,
YAMLParseError,
Expand Down Expand Up @@ -280,6 +281,7 @@ class UpdateDocsTask(SyncRepositoryMixin, Task):
YAMLParseError,
BuildCancelled,
BuildUserError,
BuildUserSkip,
RepositoryError,
MkDocsYAMLParseError,
ProjectConfigurationError,
Expand All @@ -289,6 +291,7 @@ class UpdateDocsTask(SyncRepositoryMixin, Task):
exceptions_without_notifications = (
BuildCancelled,
BuildMaxConcurrencyError,
BuildUserSkip,
ProjectBuildsSkippedError,
)

Expand Down Expand Up @@ -450,8 +453,8 @@ def on_failure(self, exc, task_id, args, kwargs, einfo):
)
# Known errors in the user's project (e.g. invalid config file, invalid
# repository, command failed, etc). Report the error back to the user
# using the `message` attribute from the exception itself. Otherwise,
# use a generic message.
# using the `message` and `state` attributes from the exception itself.
# Otherwise, use a generic message and default state.
elif isinstance(exc, BuildUserError):
if hasattr(exc, 'message') and exc.message is not None:
self.data.build['error'] = exc.message
Expand Down Expand Up @@ -490,11 +493,21 @@ def on_failure(self, exc, task_id, args, kwargs, einfo):
version_type = None
if self.data.version:
version_type = self.data.version.type

# NOTE: autoflake gets confused here. We need the NOQA for now.
status = BUILD_STATUS_FAILURE
if isinstance(exc, BuildUserSkip):
# The build was skipped by returning the magic exit code,
# marked as CANCELLED, but communicated to GitHub as successful.
# This is because the PR has to be available for merging when the build
# was skipped on purpose.
status = BUILD_STATUS_SUCCESS

send_external_build_status(
version_type=version_type,
build_pk=self.data.build['id'],
commit=self.data.build_commit,
status=BUILD_STATUS_FAILURE,
status=status,
)

# Update build object
Expand Down