readthedocs · humitos · May 24, 2022 · May 18, 2021 · Oct 20, 2021 · May 23, 2022
diff --git a/docs/development/design/future-builder.rst b/docs/development/design/future-builder.rst
@@ -0,0 +1,230 @@
+Future Builder
+==============
+
+.. contents::
+   :local:
+   :depth: 2
+
+This document is a continuation of Santos' work about "`Explicit Builders`_".
+It adds some extra features on top of that document and makes some decisions about the final goal,
+proposing a clear direction to move forward, with intermediate steps that keep backward and forward compatibility.
+
+.. _Explicit Builders: https://github.com/readthedocs/readthedocs.org/pull/8103/
+
+
+Goals
+-----
+
+* Keep the current builder working as-is
+* Keep backward and forward (with intermediate steps) compatibility
+* Define a clear support for newbie, intermediate and advanced users
+* Allow users to override a command, run pre/post hook commands or define all commands by themselves
+* Remove the Read the Docs requirement of having access to the build process
+* Translate our current magic at build time to a defined contract with the user
+* Provide a way to add a command argument without implementing it as a config file (e.g. ``fail_on_warning``)
+* Define a path forward towards supporting other tools
+* Re-write all ``readthedocs-sphinx-ext`` features as HTML post-processing features
+* Reduce complexity maintained by Read the Docs' core team
+* Make Read the Docs responsible for Sphinx support and delegate other tools to the community
+* Eventually support uploading pre-built docs
+* Allow us to add a feature with a defined contract without worrying about breaking old builds
+* Introduce ``build.builder: 2`` config (does not install pre-defined packages) for these new features
+* Motivate users to migrate to ``v2`` to finally deprecate this magic by educating users
+
+
+Steps run by the builder
+------------------------
+
+Read the Docs currently controls the entire build process.
+Users are only allowed to modify very limited behavior by using a ``.readthedocs.yaml`` file.
+This drove us to implement features like ``sphinx.fail_on_warning``, ``submodules``, among others,
+at a high implementation and maintenance cost to the core team.
+Besides, this hasn't been enough for more advanced users that require more control over these commands.
+
+This document proposes to clearly define the steps the builder runs and allow users to override them
+depending on their needs:
+
+- Newbie user / simple platform usage: Read the Docs controls all the commands (current builder)
+- Intermediate user: ability to override one or more commands plus running pre/post hooks
+- Advanced user: controls *all the commands* executed by the builder
+
+The steps identified so far are:
+
+#. Checkout
+#. Expose project data via environment variables (\*)
+#. Create environment (virtualenv / conda)
+#. Install dependencies
+#. Build documentation
+#. Generate defined contract (``metadata.yaml``)
+#. Post-process HTML (\*)
+#. Upload to storage (\*)
+
+Steps marked with *(\*)* are managed by Read the Docs and can't be overridden.
+
+
+Defined contract
+----------------
+
+Projects building on Read the Docs must provide a ``metadata.yaml`` file after running their last command.
+This file contains all the data required by Read the Docs to be able to add its integrations.
+If this file is missing or malformed, Read the Docs will fail the build, stop the process,
+and tell the user that there is a problem with ``metadata.yaml`` that they need to fix.
+
+.. note::
+
+   There is no restriction on how this file is generated
+   (e.g. generated with Python, Bash, statically uploaded to the repository, etc.).
+   Read the Docs does not control it and is only responsible for generating it when building with Sphinx.
+
+
+The following is an example of a ``metadata.yaml`` that is generated by Read the Docs when building Sphinx documentation:
+
+.. code:: yaml
+
+   # metadata.yaml
+   version: 1
+   tool:
+     name: sphinx
+     version: 3.5.1
+     builder: html
+   readthedocs:
+     html_output: ./_build/html/
+     pdf_output: ./_build/pdf/myproject.pdf
+     epub_output: ./_build/epub/myproject.epub
+     search:
+       enabled: true
+       css_identifier: '#search-form > input[name="q"]'
+     analytics: false
+     flyout: false
+     canonical: docs.myproject.com
+     language: en
+
+.. warning::
+
+   The ``metadata.yaml`` contract is not defined yet.
+   This is just an example of what we could expect from it to be able to add our integrations.
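+
+For a tool other than Sphinx, the project itself would have to produce a file of the same shape.
+The following is only a sketch that reuses the example keys shown above;
+the tool name, version and output path are assumptions made for illustration and are not part of any defined contract.
+
+.. code:: yaml
+
+   # metadata.yaml (hypothetical: written by the project, not by Read the Docs)
+   version: 1
+   tool:
+     name: mkdocs
+     version: 1.3.0
+   readthedocs:
+     # Assumed output directory; it must match wherever the build command writes the HTML
+     html_output: ./site/
+     search:
+       enabled: false
+     analytics: false
+     flyout: true
+     canonical: docs.myproject.com
+     language: en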
+
+
+Config file
+-----------
+
+As mentioned, we want all users to use the same config file and have a clear way to override commands as they need.
+This will be done by adding two new keys to the existing ``.readthedocs.yaml`` file:
+``build.jobs`` and ``build.commands``.
+
+If neither ``build.jobs`` nor ``build.commands`` is present in the config file,
+Read the Docs will execute the builder we currently support without modification,
+keeping compatibility with all projects already building successfully.
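+
+As an illustration, a config file like the following one, which uses neither of the new keys,
+would keep running on the current builder exactly as it does today.
+The paths are placeholders chosen for this example.
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   # No ``build.jobs`` or ``build.commands``: the current builder runs unchanged
+   version: 2
+   sphinx:
+     configuration: docs/conf.py
+   python:
+     install:
+       - requirements: docs/requirements.txt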
+
+When users use the ``jobs:`` or ``commands:`` keys, we are not responsible for those commands if they fail.
+In these cases, we only check for a ``metadata.yaml`` file and run our code to add the integrations.
+
+
+``build.jobs``
+~~~~~~~~~~~~~~
+
+It allows users to run one or more pre/post hooks and/or override one or more commands.
+These are some examples where this is useful:
+
+- User wants to pass an extra argument to ``sphinx-build``
+- Project needs to run a command *before* building
+- User has a personal/private PyPI URL
+- etc
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   build:
+     builder: 2
+     jobs:
+       pre_checkout:
+       checkout: git clone --branch main https://github.com/readthedocs/readthedocs.org
+       post_checkout:
+       pre_create_environment:
+       create_environment: python -m virtualenv venv
+       post_create_environment:
+       pre_install:
+       install: pip install -r requirements.txt
+       post_install:
+       pre_build:
+       build:
+         html: sphinx-build -T -j auto -E -b html -d _build/doctrees -D language=en . _build/html
+         pdf: latexmk -r latexmkrc -pdf -f -dvi- -ps- -jobname=test-builds -interaction=nonstopmode
+         epub: sphinx-build -T -j auto -b epub -d _build/doctrees -D language=en . _build/epub
+       post_build:
+       pre_metadata:
+       metadata: ./metadata_sphinx.py
+       post_metadata:
+
+
+.. note::
+
+   *All these commands* are executed passing all the exposed environment variables.
+
+If the user provides only a subset of these jobs, we run our default commands for the ones that are missing
+(see the list of steps run by the builder above).
+For example, the following YAML is enough when the project requires running Doxygen as a pre-build step:
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   build:
+     builder: 2
+     jobs:
+       # https://breathe.readthedocs.io/en/latest/readthedocs.html#generating-doxygen-xml-files
+       pre_build: cd ../doxygen; doxygen
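+
+Two of the use cases listed above could look as follows.
+This is a sketch under the proposed (not yet implemented) ``build.jobs`` schema:
+the index URL is a placeholder, and the HTML command is the example default from above with ``-W`` added,
+which makes ``sphinx-build`` treat warnings as errors.
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   build:
+     builder: 2
+     jobs:
+       # Placeholder private index URL; replace it with the project's own
+       install: pip install --index-url https://pypi.example.com/simple/ -r requirements.txt
+       build:
+         # Same HTML command as in the full example, plus -W to fail on warnings
+         html: sphinx-build -W -T -j auto -E -b html -d _build/doctrees -D language=en . _build/html
+
+All other jobs are left out, so Read the Docs would run its default commands for them.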
+
+
+``build.commands``
+~~~~~~~~~~~~~~~~~~
+
+It allows users to have full control over the commands executed in the build process.
+These are some examples where this is useful:
+
+- project with a custom build process that does not map to ours
+- specific requirements that we can't or don't want to cover as a general rule
+- build documentation with a different tool than Sphinx
+
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   build:
+     builder: 2
+     commands:
+       - git clone --branch main https://github.com/readthedocs/readthedocs.org
+       - pip install -r requirements.txt
+       - sphinx-build -T -j auto -E -b html -d _build/doctrees -D language=en . _build/html
+       - ./metadata.py
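+
+The same key could be used to build with a tool other than Sphinx.
+The sketch below assumes MkDocs; the repository URL is a placeholder
+and ``./metadata.py`` stands for a hypothetical project script that writes ``metadata.yaml``.
+
+.. code:: yaml
+
+   # .readthedocs.yaml
+   build:
+     builder: 2
+     commands:
+       # Placeholder repository URL
+       - git clone --branch main https://github.com/example/myproject
+       - pip install mkdocs
+       # Write the HTML to the directory that metadata.yaml will point to
+       - mkdocs build --site-dir _build/html
+       # Hypothetical script that generates metadata.yaml for Read the Docs
+       - ./metadata.py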
+
+
+Intermediate steps for rollout
+------------------------------
+
+#. Remove all the exposed data in the ``conf.py.tmpl`` file and move it to ``metadata.yaml``
+#. Define structure required for ``metadata.yaml`` as contract
+#. Define the environment variables required (e.g. some from ``html_context``) and execute all commands with them
+#. Build documentation using this contract
+#. Leave ``readthedocs-sphinx-ext`` as the only package installed and the only extension added in ``conf.py.tmpl``
+#. Add ``build.builder: 2`` config without any *magic*
+#. Build everything needed to support ``build.jobs`` and ``build.commands`` keys
+#. Write guides about how to use the new keys
+#. Re-write ``readthedocs-sphinx-ext`` features as HTML post-processing features
+
+
+Final notes
+-----------
+
+- The migration path from ``v1`` to ``v2`` will require users to explicitly specify their requirements
+  (we don't install pre-defined packages anymore)
+- We probably do not want to support ``build.jobs`` on ``v1`` to reduce core team's time maintaining that code
+  without the ability to update it due to projects randomly breaking.
+- We would be able to start building documentation using new tools without having to *integrate them*.
+- Building on Read the Docs with a new tool will require:
+
+  - the user to execute a different set of commands by overriding the defaults.
+  - the project/build/user to expose a ``metadata.yaml`` with the contract that Read the Docs expects.
+
+  None, some or all of the integrations will be added to the HTML output (these have to be implemented at Read the Docs core).
+- We are not responsible for extra formats (e.g. PDF, ePub, etc) on other tools.
+- Focus on supporting Sphinx with nice integrations made in a tool-agnostic way that can be re-used.
+- Removing the manipulation of ``conf.py.tmpl`` does not require us to implement the same manipulation
+  for projects using the new potential feature ``sphinx.yaml`` file.