|
| 1 | +Build Images |
| 2 | +============ |
| 3 | + |
| 4 | +This document describes how Read the Docs uses the Docker Build Images and how they are named. |
| 5 | +It also proposes a new way to create and name them so that they share |
| 6 | +as many image layers as possible, supporting more customization while keeping stability. |
| 7 | + |
| 8 | + |
| 9 | +Introduction |
| 10 | +------------ |
| 11 | + |
| 12 | +We use Docker images to build users' documentation. |
| 13 | +Each time a build is triggered, one of our VMs picks up the task |
| 14 | +and goes through the following steps: |
| 15 | + |
| 16 | +#. run some application code to spin up a Docker image into a container |
| 17 | +#. execute git inside the container to clone the repository |
| 18 | +#. analyze and parse files from the repository *outside* the container |
| 19 | +#. create the environment and install docs' dependencies inside the container |
| 20 | +#. execute build commands inside the container |
| 21 | +#. push the output generated by the build commands to the storage |
| 22 | + |
| 23 | + |
| 24 | +*All* those steps depend on specific command versions: ``git``, ``python``, ``virtualenv``, ``conda``, etc. |
| 25 | +Currently, we pin only a few of them in our Docker images, and that has caused issues |
| 26 | +when re-deploying these images with bugfixes: **the images are not reproducible over time**. |
| 27 | + |
| 28 | +.. note:: |
| 29 | + |
| 30 | + The reproducibility of the images will be fixed once |
| 31 | + https://github.com/readthedocs/readthedocs-docker-images/pull/145 and |
| 32 | + https://github.com/readthedocs/readthedocs-docker-images/pull/146 |
| 33 | + get merged. |
| 34 | + |
| 35 | +To allow users to pin the image, we ended up exposing three images: ``stable``, ``latest`` and ``testing``. |
| 36 | +With that naming, we were able to fix bugs and add features |
| 37 | +to each image without asking users to change the image selected in their config file. |
| 38 | + |
| 39 | +Then, when a completely different image appeared and the ``testing`` image had been tested enough, |
| 40 | +we discarded ``stable``, the old ``latest`` became the new ``stable`` and the old ``testing`` became the new ``latest``. |
| 41 | +This caused problems for people pinning their images to any of these names because, with this change, |
| 42 | +*we changed all the images for all the users* and many build issues arose! |
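The effect of that rotation can be sketched in a few lines of Python. This is only an illustration: the image identifiers (``image-A`` to ``image-D``) are hypothetical placeholders, and the assumption that the new image takes over the ``testing`` name is mine, not something stated above.

```python
def rotate(aliases, new_image):
    """Apply the rotation described above: drop ``stable`` and shift names down."""
    return {
        "stable": aliases["latest"],    # old latest becomes the new stable
        "latest": aliases["testing"],   # old testing becomes the new latest
        "testing": new_image,           # assumption: the new image takes this name
    }


# Hypothetical image identifiers, used only for illustration.
before = {"stable": "image-A", "latest": "image-B", "testing": "image-C"}
after = rotate(before, "image-D")

# Every name now resolves to a different image, so every pinned project changes.
changed = [name for name in before if before[name] != after[name]]
print(changed)
```

With names tied to the OS instead (for example ``ubuntu20``), a new image gets a new name and existing pins keep resolving to the same image.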
| 43 | + |
| 44 | + |
| 45 | +Goals |
| 46 | +----- |
| 47 | + |
| 48 | +* release a completely new Docker image without forcing users to change their pinned image |
| 49 | +* allow users to stick with an image "forever" (~years) |
| 50 | +* use a ``base`` image with the dependencies that don't change frequently (OS and base requirements) |
| 51 | +* reduce size on builder VM disks by sharing Docker image layers |
| 52 | +* deprecate ``stable``, ``latest`` and ``testing`` |
| 53 | +* allow custom images for particular users/customers that share most layers |
| 54 | +* create a small ``nopdf`` image version without LaTeX dependencies for local development |
| 55 | + |
| 56 | + |
| 57 | +New build image structure |
| 58 | +------------------------- |
| 59 | + |
| 60 | +.. Taken from https://github.com/readthedocs/readthedocs-docker-images/blob/master/Dockerfile |
| 61 | +
|
| 62 | +* ``ubuntu20-base`` |
| 63 | + * labels |
| 64 | + * environment variables |
| 65 | + * system dependencies |
| 66 | + * install requirements |
| 67 | + * user requirements |
| 68 | + * plantuml, imagemagick, rsvg-convert, swig |
| 69 | + * sphinx-js dependencies |
| 70 | + * rust |
| 71 | + * UID and GID |
| 72 | + |
| 73 | +* ``ubuntu20-pdf`` (from ``ubuntu20-base``) |
| 74 | + * PDF/LaTeX dependencies |
| 75 | + |
| 76 | +* ``ubuntu20`` (from ``ubuntu20-pdf``) |
| 77 | + * all Python versions (2, 3.6, 3.7, 3.8, 3.9) |
| 78 | + * conda |
| 79 | + * future extra user requirements |
| 80 | + * labels |
| 81 | + |
| 82 | +We will also build a ``nopdf`` version to allow quick testing in local development: |
| 83 | + |
| 84 | +* ``ubuntu20-nopdf`` (from ``ubuntu20-base``) |
| 85 | + * same as ``ubuntu20`` but based on ``ubuntu20-base`` instead |
| 86 | + |
| 87 | +.. note:: |
| 88 | + |
| 89 | + I don't think it's useful to have ``ubuntu20-py37`` exposed to users, |
| 90 | + since the Python version is selected with the config file's ``python.version`` keyword, |
| 91 | + and we only update patch versions and don't remove them (unless together with OS changes). |
| 92 | + |
| 93 | +.. Build all these images with Docker |
| 94 | + docker build -t readthedocs/build:ubuntu20-base -f Dockerfile.base . |
| 95 | + docker build -t readthedocs/build:ubuntu20-nopdf -f Dockerfile.nopdf . |
| 96 | + docker build -t readthedocs/build:ubuntu20-pdf -f Dockerfile.pdf . |
| 97 | + docker build -t readthedocs/build:ubuntu20 -f Dockerfile . |
| 98 | +
|
| 99 | + Check the shared space between images |
| 100 | + docker system df --verbose | grep -E 'SHARED SIZE|readthedocs' |
| 101 | +
|
| 102 | +
|
| 103 | +Custom images |
| 104 | +------------- |
| 105 | + |
| 106 | +There are some dependencies that are not easy to update and keep compatibility with all the users at the same time. |
| 107 | +Upgrading ``nodejs`` may cause many old projects that expect the older version to start failing all their builds. |
| 108 | +On the other hand, sticking with an old version prevents users who require a newer version from building their documentation. |
| 109 | +To handle this case and others, we have been thinking about supporting custom Docker images. |
| 110 | + |
| 111 | +It's not clear to me how this would be implemented, but I see different paths to discuss and explore: |
| 112 | + |
| 113 | +#. Allow a ``build.dockerfile`` config pointing to a ``Dockerfile`` |
| 114 | + * the ``Dockerfile`` must use ``FROM readthedocs/build:ubuntu20`` to be valid (to share layers) |
| 115 | + * the image is built each time a build is triggered, consuming build time |
| 116 | +#. Create a branch per custom image in ``readthedocs-docker-images`` repository |
| 117 | + * use ``ubuntu20`` as base image and add the custom extra requirements |
| 118 | + * build the image using our current process (Docker Hub) |
| 119 | + * add the custom image to our ``-ops`` repository |
| 120 | + * re-build builders to pull down the new custom image |
| 121 | + * set the project to use this custom image, eg. ``readthedocs/build:<project-slug>`` |
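For the first path, the ``FROM`` requirement could be checked before building. The following is a minimal sketch, not an existing Read the Docs feature: ``build.dockerfile`` is only a proposed key, and the helper names below are hypothetical. It only looks at the last ``FROM`` instruction and does not resolve ``ARG`` substitutions.

```python
REQUIRED_BASE = "readthedocs/build:ubuntu20"


def final_base_image(dockerfile_text):
    """Return the image named by the last FROM instruction, or None."""
    image = None
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            # Skip flags such as --platform=... that may precede the image name.
            args = [part for part in parts[1:] if not part.startswith("--")]
            if args:
                image = args[0]
    return image


def is_valid_custom_dockerfile(dockerfile_text):
    """Hypothetical check: the custom image must build on our shared base."""
    return final_base_image(dockerfile_text) == REQUIRED_BASE


good = "# custom nodejs\nFROM readthedocs/build:ubuntu20\nRUN echo 'extra deps'\n"
bad = "FROM ubuntu:20.04\nRUN echo 'extra deps'\n"
print(is_valid_custom_dockerfile(good), is_valid_custom_dockerfile(bad))
```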
| 122 | + |
| 123 | + |
| 124 | +Updating versions over time |
| 125 | +--------------------------- |
| 126 | + |
| 127 | +How do we add/upgrade a Python version? |
| 128 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 129 | + |
| 130 | +Python patch versions can be upgraded and backported to all the images without problems. |
| 131 | +We only need to rebuild ``ubuntu20``, and most of the layers will remain shared with ``-base`` and ``-pdf``. |
| 132 | + |
| 133 | +In case we need to *add* a new Python version, the situation is similar. |
| 134 | +We can add the new version by using ``pyenv`` and rebuilding the ``ubuntu20`` image. |
| 135 | + |
| 136 | + |
| 137 | +How do we upgrade system versions? |
| 138 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 139 | + |
| 140 | +We usually don't upgrade these dependencies unless we upgrade the Ubuntu version. |
| 141 | +So, they will only be upgraded when we go from Ubuntu 18.04 LTS to Ubuntu 20.04 LTS, for example. |
| 142 | + |
| 143 | +Examples of these dependencies are: |
| 144 | + |
| 145 | +* doxygen |
| 146 | +* git |
| 147 | +* subversion |
| 148 | +* pandoc |
| 149 | +* nodejs / npm |
| 150 | +* swig |
| 151 | +* rust |
| 152 | + |
| 153 | + |
| 154 | +How do we add an extra requirement? |
| 155 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 156 | + |
| 157 | +If a user asks for a new requirement (eg. azure CLI, ``az`` command) it should go into the |
| 158 | +"user requirements" section in the ``ubuntu20-base`` image. |
| 159 | +However, that will force us to rebuild all the images. |
| 160 | + |
| 161 | +We could use the section named "future extra user requirements" for this instead, |
| 162 | +which would force us to rebuild only the ``ubuntu20`` image. |
| 163 | + |
| 164 | +Both approaches will require rebuilding all the custom Docker images from our users/customers |
| 165 | +that are based on the ``ubuntu20`` image. |
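Which images need a rebuild follows directly from the ``FROM`` hierarchy listed in "New build image structure". A minimal sketch, assuming that hierarchy (the function name is hypothetical):

```python
# Parent of each image, as listed in "New build image structure".
PARENTS = {
    "ubuntu20-base": None,
    "ubuntu20-pdf": "ubuntu20-base",
    "ubuntu20": "ubuntu20-pdf",
    "ubuntu20-nopdf": "ubuntu20-base",
}


def images_to_rebuild(changed, parents=PARENTS):
    """Return the changed image plus every image built on top of it."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for image, parent in parents.items():
            if parent in affected and image not in affected:
                affected.add(image)
                grew = True
    return sorted(affected)


print(images_to_rebuild("ubuntu20-base"))  # all four images
print(images_to_rebuild("ubuntu20"))       # only ``ubuntu20``
```

A custom image could be added to ``PARENTS`` with ``ubuntu20`` as its parent; it would then show up in both results, matching the paragraph above.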
| 166 | + |
| 167 | + |
| 168 | +How do we remove an old Python version? |
| 169 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 170 | + |
| 171 | +At some point an old version of Python will be deprecated (eg. 3.4) and will be removed from our Docker images. |
| 172 | +These versions should only be removed when the OS in the ``base`` is upgraded (eg. from ``ubuntu20`` to ``ubuntu22``). |
| 173 | + |
| 174 | + |
| 175 | +Deprecation plan |
| 176 | +---------------- |
| 177 | + |
| 178 | +It seems we have ~50GB free on the builders' disks. |
| 179 | +Considering the approximate sizes of the new images (built locally as a test): |
| 180 | + |
| 181 | +* ``base``: ~2.5GB |
| 182 | +* ``nopdf``: ~5.5GB |
| 183 | +* ``pdf``: ~1.5GB |
| 184 | + |
| 185 | +which is ~10GB in total, we will still have space to support multiple custom images. |
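The arithmetic behind that estimate, as a small sketch. It assumes the three sizes add up (as the total above suggests), and the per-custom-image size of 1 GB of extra layers is a made-up figure for illustration only.

```python
FREE_DISK_GB = 50.0
IMAGE_SIZES_GB = {"base": 2.5, "nopdf": 5.5, "pdf": 1.5}

total_gb = sum(IMAGE_SIZES_GB.values())
remaining_gb = FREE_DISK_GB - total_gb


def custom_images_that_fit(extra_layers_gb, free_gb=remaining_gb):
    """How many custom images fit if each adds ``extra_layers_gb`` of unshared layers."""
    return int(free_gb // extra_layers_gb)


print(total_gb, remaining_gb)
# Hypothetical: each custom image adds about 1 GB on top of the shared layers.
print(custom_images_that_fit(1.0))
```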
| 186 | + |
| 187 | +We could keep ``stable``, ``latest`` and ``testing`` for some time without worrying too much. |
| 188 | +New projects shouldn't be able to select these images and they will be forced to use ``ubuntu20`` |
| 189 | +or any other custom image. |
| 190 | + |
| 191 | +We may want to keep the three latest Ubuntu LTS releases available in production. |
| 192 | +At the time of writing, they are: |
| 193 | + |
| 194 | +* Ubuntu 16.04 LTS (we are not using it anymore) |
| 195 | +* Ubuntu 18.04 LTS (our ``stable``, ``latest`` and ``testing`` images) |
| 196 | +* Ubuntu 20.04 LTS (our new ``ubuntu20``) |
| 197 | + |
| 198 | +Once Ubuntu 22.04 LTS is released, we should deprecate Ubuntu 16.04 LTS, |
| 199 | +and give users 6 months to migrate to a newer image. |
| 200 | +Users with custom images based on Ubuntu 16.04 LTS will be forced to migrate as well. |
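The "three latest LTS releases" rule can be expressed as a small sketch; the function names are hypothetical and the release list is just the one mentioned above plus 22.04.

```python
def parse_release(version):
    """Turn '20.04' into (20, 4) so releases sort numerically."""
    major, minor = version.split(".")
    return int(major), int(minor)


def split_supported(lts_releases, keep=3):
    """Return (supported, deprecated), keeping the ``keep`` newest releases."""
    ordered = sorted(lts_releases, key=parse_release)
    return ordered[-keep:], ordered[:-keep]


supported, deprecated = split_supported(["16.04", "18.04", "20.04", "22.04"])
print(supported)   # ['18.04', '20.04', '22.04']
print(deprecated)  # ['16.04']
```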
| 201 | + |
| 202 | + |
| 203 | +Conclusion |
| 204 | +---------- |
| 205 | + |
| 206 | +I don't think we need to differentiate the images by their state (stable, latest, testing) |
| 207 | +but by their main base difference: the OS. The version of the OS changes many library versions, |
| 208 | +LaTeX dependencies, basic required commands like git and more, |
| 209 | +so it doesn't seem useful to have the same OS version with different states. |
| 210 | + |
| 211 | +Also, splitting images by Python version sounds complicated to maintain. |
| 212 | +Each time we need to make a small change to one of the base layers, we will end up rebuilding many images. |
| 213 | +Besides, the ``python.version`` key won't make sense anymore and will cause confusion. |
| 214 | + |
| 215 | +Custom images still need more exploration, |
| 216 | +but both proposals seem doable in weeks as an initial proof of concept. |