Design document for new Docker images structure #7566

Merged 7 commits, Apr 1, 2021

docs/development/design/build-images.rst (284 additions)
Build Images
============

This document describes how Read the Docs uses `Docker Images`_ and how they are named.
It also proposes a path forward: a new way to create and name our Docker build images
that shares as many image layers as possible
and supports installing OS-level packages as well as extra requirements.

.. _Docker Images: https://github.com/readthedocs/readthedocs-docker-images


Introduction
------------

We use Docker images to build users' documentation.
Each time a build is triggered, one of our VMs picks up the task
and goes through the following steps:

#. run some application code to spin up a Docker container from an image
#. execute ``git`` inside the container to clone the repository
#. analyze and parse files (``.readthedocs.yaml``) from the repository *outside* the container
#. spin up a new Docker container based on the config file
#. create the environment and install the docs' dependencies inside the container
#. execute the build commands inside the container
#. push the output generated by the build commands to storage

*All* those steps depend on specific command versions: ``git``, ``python``, ``virtualenv``, ``conda``, etc.
Currently, we pin only a few of them in our Docker images, which has caused issues
when re-deploying these images with bugfixes: **the images are not reproducible over time**.

.. note::

   The reproducibility of the images will be better once these PRs are merged,
   but OS packages still won't be 100% the exact same versions.

   * https://github.com/readthedocs/readthedocs-docker-images/pull/145
   * https://github.com/readthedocs/readthedocs-docker-images/pull/146

To allow users to pin the image, we ended up exposing three images: ``stable``, ``latest`` and ``testing``.
With that naming, we were able to fix bugs and add features
to each image without asking users to change the image selected in their config file.

Then, when a completely different image appeared and the ``testing`` image had been tested enough,
we discarded ``stable``, the old ``latest`` became the new ``stable``, and the old ``testing`` became the new ``latest``.
This caused problems for people pinning their images to any of these names because, after this change,
*we changed all the images for all the users* and many build issues arose!


Goals
-----

* release completely new Docker images without forcing users to change their pinned image
* allow users to stick with an image "forever" (~years)
* use a ``base`` image with the dependencies that don't change frequently (OS and base requirements)
* ``base`` image naming is tied to the OS version (e.g. Ubuntu LTS)
* allow us to add/update a Python version without affecting the ``base`` image
* reduce size on builder VM disks by sharing Docker image layers
* allow users to specify extra dependencies (apt packages, node, rust, etc)
* automatically build & push *all* images on commit
* deprecate ``stable``, ``latest`` and ``testing``
* new images won't contain deprecated OS versions (eg. Ubuntu 18.04) or Python versions (eg. 3.5, miniconda2)


Non goals
---------

* allow creation/usage of custom Docker images
* allow executing arbitrary commands via hooks (eg. ``pre_build``)


New build image structure
-------------------------

.. Taken from https://github.com/readthedocs/readthedocs-docker-images/blob/master/Dockerfile

* ``ubuntu20-base``

  * labels
  * environment variables
  * system dependencies
  * install requirements
  * LaTeX dependencies (for PDF generation)
  * other languages' version managers (``pyenv``, ``nodenv``, etc)
  * UID and GID

The following images are all based on ``ubuntu20-base``:

* ``ubuntu20-py*``

  * Python version installed via ``pyenv``
  * default Python packages (pinned versions)

    * pip
    * setuptools
    * virtualenv

  * labels

* ``ubuntu20-conda*``

  * same as the ``-py*`` versions
  * Conda version installed via ``pyenv``
  * ``mamba`` executable (installed via ``conda``)

Note that all these images only need to run ``pyenv install ${PYTHON_VERSION}``
to install a specific Python/Conda version.
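
As a sketch of that layering (this ``Dockerfile`` is hypothetical, not the actual file
from the readthedocs-docker-images repository), a Python image only adds a couple of
layers on top of ``base``:

.. code:: docker

   # Hypothetical Dockerfile.py39 -- a sketch of the layering, not the real file.
   FROM readthedocs/build:ubuntu20-base

   ARG PYTHON_VERSION=3.9.1

   # The only expensive layer: compile and install the interpreter with pyenv.
   RUN pyenv install ${PYTHON_VERSION} && pyenv global ${PYTHON_VERSION}

   # Default Python packages (pinned versions in the real images).
   RUN pyenv exec pip install --upgrade pip setuptools virtualenv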

.. Build all these images with Docker:

   docker build -t readthedocs/build:ubuntu20-base -f Dockerfile.base .
   docker build -t readthedocs/build:ubuntu20-py39 -f Dockerfile.py39 .
   docker build -t readthedocs/build:ubuntu20-conda47 -f Dockerfile.conda47 .

   Check the space shared between images:

   docker system df --verbose | grep -E 'SHARED SIZE|readthedocs'

   Initial ``Dockerfile.*`` examples for this are pushed in this PR:
   https://github.com/readthedocs/readthedocs-docker-images/pull/166


Specifying users' extra dependencies
------------------------------------

Different users have different requirements. We have already been asked to install
``swig``, ``imagemagick``, ``libmysqlclient-dev``, ``lmod``, ``rust``, ``poppler-utils``, etc.

People with specific dependencies will be able to install them as APT packages or as extras
using the ``.readthedocs.yaml`` config file. For example:

.. code:: yaml

   build:
     image: ubuntu20
     python: 3.9
     system_packages:
       - swig
       - imagemagick
     extras:
       - node==14
       - rust==1.46

Important highlights:

* users won't be able to use custom Ubuntu PPAs to install packages
* all APT packages installed will be from official Ubuntu repositories
* not specifying ``build.image`` will pick the latest OS image available
* not specifying ``build.python`` will pick the latest Python version available
* Ubuntu 18 will still be available via ``stable`` and ``latest`` images
* all ``node`` (major) pre-compiled versions on ``nodenv`` are available to select
* all ``rust`` (minor) pre-compiled versions on ``rustup`` are available to select
* knowing exactly which packages users are installing
  could allow us to prebuild extra images: ``ubuntu20-py37+node14``
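
To make the ``extras`` version matching concrete, here is a minimal sketch in Python
(the function name and matching rule are assumptions, not the actual builder code):
a spec like ``node==14`` is resolved to the newest pre-compiled version that
``nodenv``/``rustup`` exposes.

.. code:: python

   # Sketch only: resolve a user spec like "node==14" against the list of
   # pre-compiled versions available in the version manager (nodenv/rustup).
   def resolve_version(spec, available):
       """Return the newest available version matching the requested prefix."""
       tool, _, wanted = spec.partition("==")
       matches = [
           version
           for version in available
           if version == wanted or version.startswith(wanted + ".")
       ]
       if not matches:
           raise ValueError(f"no pre-compiled {tool} version matches {wanted!r}")
       # Compare numerically, not lexicographically ("14.9" < "14.10").
       return max(matches, key=lambda v: tuple(int(part) for part in v.split(".")))

For example, ``resolve_version("node==14", ["12.22.1", "14.17.0", "14.18.3", "16.0.0"])``
picks ``14.18.3``.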

.. admonition:: Implementation

   We talked about using a ``Dockerfile.custom`` and building it on every build.
   However, at this point that requires extra work to change our build pipeline.
   For now, we decided to install OS packages from the application itself,
   using the Docker API to call ``docker exec`` as the ``root`` user.

   This reduces the amount of work required and also allows us to add this feature
   to our existing images (they require a rebuild to add ``nodenv`` and ``rustup``).
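
A rough sketch of that approach in Python (the helper below is hypothetical and only
shows composing a safe command; the real builder would pass it to ``docker exec``
as the ``root`` user via the Docker API):

.. code:: python

   # Sketch only: compose the apt-get command the builder would execute as
   # root inside the container. Not the actual Read the Docs builder code.
   def build_apt_install_command(packages):
       """Return the argv for installing user-requested APT packages."""
       for package in packages:
           # Only accept plain package names from the official repositories;
           # reject anything that could smuggle shell metacharacters.
           cleaned = package.replace("-", "").replace(".", "").replace("+", "")
           if not cleaned.isalnum():
               raise ValueError(f"invalid APT package name: {package!r}")
       return ["apt-get", "install", "--yes", "--no-install-recommends", *packages]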


Updating versions over time
---------------------------

How do we add/upgrade a Python version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Python patch versions can be upgraded on the affected image.
As the ``base`` image won't change in this case, only the layers after it are modified.
All the OS package versions will remain the same.

In case we need to *add* a new Python version, we just need to build a new image based on ``base``:
``ubuntu20-py310``, which will contain Python 3.10; none of the other images are affected.
This also allows us to test new Python versions (eg. 3.11rc1) without breaking people's builds.


How do we upgrade system versions?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We usually don't upgrade these dependencies unless we upgrade the Ubuntu version.
So, they will only be upgraded when we go, for example, from Ubuntu 18.04 LTS to Ubuntu 20.04 LTS.

Examples of these versions are:

* doxygen
* git
* subversion
* pandoc
* swig
* latex

This case will introduce a new ``base`` image: for example, ``ubuntu22-base`` in 2022.
Note that these images will be completely isolated from the rest and won't require rebuilding them.
This also allows us to test new Ubuntu versions without breaking people's builds.

How do we add an extra requirement?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In case we need to add an extra requirement to the ``base`` image,
we will need to rebuild all of the images.
The new images *may have different package versions*, since there may be updates in the Ubuntu repositories.
This carries some small risk, but in general we shouldn't need to add packages to the base images.

Users with specific requirements could use ``build.system_packages`` and/or ``build.extras`` in the config file.

How do we remove an old Python version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At some point, an old version of Python (eg. 3.4) will be deprecated and removed.
To achieve this, we can just remove the affected Docker image (``ubuntu20-py34``)
once no users depend on it anymore.


We will know which projects are using these images because they pin them in the config file.
We could show a message on the build output page and also send them an email with the EOL date for the image.
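
As a sketch of that lookup (the field names are hypothetical; the real data lives in
our application's database):

.. code:: python

   # Sketch only: find projects still pinned to a deprecated image so we can
   # warn them on the build page and email them an EOL date.
   DEPRECATED_IMAGES = {"ubuntu20-py34"}

   def projects_to_notify(projects):
       """``projects`` is an iterable of (slug, parsed config file) pairs."""
       return sorted(
           slug
           for slug, config in projects
           if config.get("build", {}).get("image") in DEPRECATED_IMAGES
       )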

Deprecation plan
----------------

It seems we have ~50 GB free on the builders' disks.
The new images will be sized approximately as follows (built locally as a test):

* ``ubuntu20-base``: ~5 GB
* ``ubuntu20-py27``: ~150 MB
* ``ubuntu20-py36``: ~210 MB
* ``ubuntu20-py39``: ~20 MB
* ``ubuntu20-conda47``: ~713 MB

which is about 6 GB in total, so we still have plenty of space.

We could keep ``stable``, ``latest`` and ``testing`` for some time without worrying too much.
New projects shouldn't be able to select these images,
and they will default to ``ubuntu20`` if they don't specify one.

We may want to keep the two latest Ubuntu LTS releases available in production.
At the time of writing, they are:

* Ubuntu 18.04 LTS (our ``stable``, ``latest`` and ``testing`` images)
* Ubuntu 20.04 LTS (our new ``ubuntu20``)

Once Ubuntu 22.04 LTS is released, we should deprecate Ubuntu 18.04 LTS,
and give users 6 months to migrate to a newer image.


Work required
-------------

There is a lot of work to do here.
However, we want to prioritize it based on user impact.

#. allow users to install packages with APT

   * update the config file to support the ``build.system_packages`` config
   * modify the builder code to run ``apt-get install`` as the ``root`` user

#. allow users to install extras via the config file

   * update the config file to support the ``build.extras`` config
   * modify the builder code to run ``nodenv install`` / ``rustup install``
   * re-build our current images with pre-installed ``nodenv`` and ``rustup``
   * make sure that all the versions are the same ones we have in production
   * deploy builders with the newer images

#. pre-build commands (not covered in this document)

#. new structure

   * update the config file to support the new image names for ``build.image``
   * automate Docker image building
   * deploy builders with the newer images


Conclusion
----------

I don't think we need to differentiate the images by their state (stable, latest, testing)
but rather by their main base differences: OS and Python version.

The OS version changes many library versions,
LaTeX dependencies, basic required commands like ``git``, and more,
so it doesn't seem useful to have the same OS version in different states.

Allowing users to install system dependencies and extras will cover most of the support requests we have received in the past.
It will also let us learn more about how our users use the platform, so we can make future decisions based on that data.
Guiding users toward how we want them to use our platform will allow us to maintain it longer
than giving them total freedom over the Docker image.