Design docs for the yaml file #3878

Merged
merged 12 commits into from
May 3, 2018
158 changes: 158 additions & 0 deletions docs/design/yaml-file.rst
@@ -0,0 +1,158 @@
YAML Configuration File
=======================

Background
----------

The current YAML configuration file is in beta state.
There are many options and features that it doesn't support yet.
This document serves as a design document for discussing how to implement the missing features.

Scope
-----

- Finish the spec to include all the missing options
- Have consistency around the spec
- Proper documentation for the end user
- Allow specifying the spec version used in the YAML file
- Show the YAML file in the build process
Contributor

I think we should collect and show metadata on the yaml config and build configuration, not just display the yaml config.

- Show or suggest a YAML file at project creation if the project doesn't have one
- Have one source of truth for global configurations

RTD settings
------------

Not all RTD settings are applicable to the YAML file;
some apply to each build (or version),
and others to the project as a whole.

Not applicable settings
~~~~~~~~~~~~~~~~~~~~~~~

These settings can't be in the YAML file because they
may be needed for the initial project setup,
are planned to be removed,
or are excluded for security and privacy reasons.

- Project Name
- Repo URL
- Repo type
- Privacy level (this feature is planned to be removed [#privacy-level]_)
- Project description (this feature is planned to be removed [#project-description]_)
- Single version (*)
- Default branch
- Default version (*)
- Domains (*)
- Active versions (*)
- Translations
- Subprojects
- Integrations
- Notifications

.. note::

   The items marked with ``(*)`` can be considered global settings,
   but they aren't very relevant right now or aren't related to the builds.

Global settings
~~~~~~~~~~~~~~~

Those settings will be read from the YAML file on the ``default branch``.
Member

I think this is the part that I'm the most unsure of. It feels quite magical to have the "default branch" be the place to set global project settings -- but I don't have a better option except to just keep those settings in the database, that I can think of.

It's really tricky to think about the best way to define redirects, for example. I could see reasons for them to be per-branch, but also wanting them to expand across all existing branches.

Member Author

Also, there aren't too many global configurations (relevant ones). Analytics code and redirects are the most relevant here in my opinion.

Member Author

I was thinking this for redirects on the yaml file #2904 (comment)

What could be the case for per-version redirects? Just to keep them grouped? Or maybe for the redirect page option? I think that could be a nice feature then: allow specifying the lang and version on the redirect page option.

Member

I like the idea of reading the global settings from the default branch. This is the solution I was proposing in another discussion. I think we do need a place other than the DB to keep track of the history of these settings, and the best place I can think of is the default branch.

Also, I think we shouldn't put here any setting related to the build process. I'd like to be able to build very old versions of the documentation using the specific branch/tag without any failure. For this case, these global settings shouldn't affect the build process.

I'd say that global options that need to be overridden in a specific version for any reason (I don't have one, but Eric mentioned this case) could just have the same section as the default branch and override the particular setting: we can merge the global file with the version-specific one.

Member

Another question here is: are we planning to remove all the data from the DB, or do we want to use the YAML from the default branch to populate our DB each time we receive a webhook?

To build stable, we would need to:

  • check out default branch and pull
  • load the YAML in memory
  • check out stable
  • show YAML in build commands output
  • use that YAML to build

It looks like there is some back and forth here,

or,

  • populate our DB when we receive a webhook on default branch (previously done)
  • check out stable
  • build using DB (as we currently do)

Here we lose the ability to print the YAML.

Member Author

Also, didn't understand well, are you proposing to use two files? (one for the global options and other for the regular ones)

Member

> Also, didn't understand well, are you proposing to use two files?

No. Just one file with different sections, as we are doing now (maybe those sections can be overridden in the per-version yml in case it's needed).

Member (@ericholscher, Apr 19, 2018)

> What could be the case for per-version redirects?

Say you rename a file in v2:

  • intro -> getting-started

When a user goes to /en/v1/intro/ you don't want them to get redirected to /en/v1/getting-started/, because it will 404. But on v2, that's exactly what you want.

This is currently solved by us only doing redirects on a 404, but there are likely other examples of design decisions that you want "from now until the future, but not in the past". So perhaps redirects are version specific, but then we need to keep state for every version somewhere, probably a copy of the YAML file at the root of the build, so the web server can read it and take into account redirects?
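To make the v1/v2 example concrete, here is a minimal sketch of version-specific redirect lookup. The ``REDIRECTS`` table and the ``resolve`` helper are hypothetical names used only for illustration; they don't describe how RTD stores or serves redirects today.

```python
# Hypothetical per-version redirect tables, keyed by version slug.
# In this sketch only v2 knows about the rename from the example above.
REDIRECTS = {
    "v2": {"intro": "getting-started"},
}


def resolve(version, page, language="en"):
    """Return the URL path to serve, applying only that version's redirects."""
    target = REDIRECTS.get(version, {}).get(page, page)
    return f"/{language}/{version}/{target}/"


print(resolve("v1", "intro"))  # v1 has no redirect, so the path is unchanged
print(resolve("v2", "intro"))  # v2 redirects intro to getting-started
```

Keeping the table per version is what prevents the v1 request from being sent to a page that only exists in v2.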

Member

I think we have to keep track of the YAML files, putting them in the DB for each Version object isn't a bad idea, but either way we need to keep that state in a persistent way.

I think the bigger question is relying on the default_branch to affect other builds. It feels really confusing as a user, and there will be times when we get into a state where the default_branch syncing has failed at some point, and there isn't a good way to debug it. I think global state should probably continue to live in the DB itself, since that is 100% obvious and consistent.

Contributor

I'm going to echo @ericholscher's opinion here. Project-level state should always be in the database, and we aren't looking to get rid of database storage of this data. We should focus on per-version options only. YAML state should not populate database state, only supplement it.

I think we should store config metadata on the build, not just the YAML file. That is, we should store metadata on what we determine the build settings were through YAML and db setting merge at build.

I think we can move redirects to per-version state and remove changes to global state from this spec. Redirects should still exist as project-level options as well, but I think a YAML config spec can supplement more per-version redirects. We can talk more on this implementation in a later design decision, but having it in the spec would be good. Again, we wouldn't alter user-viewable state with per-version redirects, but we'd have to hide these from users and just use them in our redirect logic.


- Language
- Programming Language
- Project homepage
- Tags
- Analytics code
- Redirects

Local settings
~~~~~~~~~~~~~~

Those configurations will be read from the YAML file in the current version that is being built.

Several settings are already implemented and documented on
https://docs.readthedocs.io/en/latest/yaml-config.html.
So, they aren't covered in much detail here.
Member Author

Should I include it here anyway, or maybe just do this in the spec?

Contributor

I think this document is just the intended changes to the spec. It's not important to enumerate the entire spec here. We'll do this in a programmatic way next.


- Documentation type
- Project installation (virtual env, requirements file, Sphinx configuration file, etc.)
- Additional builds (pdf, epub)
- Python interpreter

Configuration file
------------------

Format
~~~~~~

The file format is based on the YAML spec 1.2 [#yaml-spec]_
(the latest version at the time of this writing).

The file must be in the root directory of the repository and must have one of these names:

- ``.readthedocs.yml``
- ``readthedocs.yml``
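
As a rough illustration of this lookup, the sketch below checks the repository root for the two accepted names. The function name ``find_config_file`` and the precedence (dotfile first) are assumptions made for the example; this document doesn't define which file wins when both exist.

```python
from pathlib import Path
from typing import Optional

# Accepted file names from the list above.
# The order of this tuple (and therefore the precedence) is an assumption.
CONFIG_FILENAMES = (".readthedocs.yml", "readthedocs.yml")


def find_config_file(repo_root: str) -> Optional[Path]:
    """Return the first accepted config file found in the repository root."""
    root = Path(repo_root)
    for name in CONFIG_FILENAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    # No config file: the caller decides whether to fall back to other settings.
    return None
```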

Conventions
~~~~~~~~~~~

The spec of the configuration file must use these conventions.

- Use ``[]`` to indicate an empty list
- Use ``null`` to indicate a null value
Contributor

Yay for using YAML spec operators properly! :)

- Use ``all`` (internal string keyword) to indicate that all options are included in a list with predetermined choices.
- Use ``true`` and ``false`` as the only options for boolean fields
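
A minimal sketch of how a validator might enforce these conventions on already-parsed values follows. The option behind ``FORMAT_CHOICES`` and both function names are hypothetical; the real field names are left to the spec.

```python
# Hypothetical list option with predetermined choices (illustrative only).
FORMAT_CHOICES = ["pdf", "epub"]


def normalize_formats(value):
    """Expand the ``all`` keyword and validate a list of choices."""
    if value == "all":
        return list(FORMAT_CHOICES)
    if not isinstance(value, list):
        raise ValueError("expected a list (possibly []) or the keyword 'all'")
    unknown = [item for item in value if item not in FORMAT_CHOICES]
    if unknown:
        raise ValueError(f"unknown choices: {unknown}")
    return value


def validate_bool(value):
    """Accept only real booleans, rejecting strings such as 'yes' and ints such as 1."""
    if not isinstance(value, bool):
        raise ValueError("expected true or false")
    return value
```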

Spec
~~~~

The current spec is documented on https://docs.readthedocs.io/en/latest/yaml-config.html
and https://github.com/rtfd/readthedocs-build/blob/master/docs/spec.rst
Member

I suppose the spec.rst file should be completely removed since it's obsolete and a duplication of the yaml-config.html (which is way better)

Member

Yep, we should remove the old spec.


Member Author (@stsewd, Mar 29, 2018)

Should I add the new spec here or maybe on another doc?

Member

I suppose you can write a general idea of each of the new fields here, with the default/available options.

Then we can discuss more specific details on each issue and update this document, probably.

But I think this document should reflect where we are going.

Contributor

The next step is to write a schema so we can lay out our ideas before making code changes. This will be easier to parse, test, and verify than writing documentation.

There is some Python tooling to help, e.g. https://github.com/23andMe/Yamale

Bonus points if we can find a way to automate generating documentation/examples from our schema or validation models.

Versioning the spec
-------------------

TODO
Member Author

I think the simplest solution is to maintain a separate library for this (like https://github.com/rtfd/readthedocs-build/, but that is going to be moved into the RTD core, so I'm not sure).


Adoption of the configuration file
----------------------------------

When a user creates a new project or visits the settings page,
we could suggest an example of a working configuration file with a minimal setup.
Contributor

We'll also want some onboarding steps that point to the settings page.


Main source for global configurations
-------------------------------------

Some global settings are needed for the build process (like language),
so these configurations must be read from one source of truth before each build.
This source can be the branch given by the ``default branch`` setting.
RTD will check out this branch and read these configurations before the real build process starts.

That solves one problem, but RTD still needs to know when to update the other global settings.
It would be a waste of resources to trigger a new build each time a global setting is updated just for it to take effect.
Member Author

I realized that maybe it's fine as it is: if the webhook is configured on the project, it will trigger a build anyway.

Currently, RTD keeps a dedicated local repository for each version, which is updated before a build.
RTD could have a central repository for these operations [#one-checkout]_.
Contributor

I think we need to remove global settings altogether. Project-level settings should remain database-level settings.


The build process
-----------------

- The repository is updated
- Check out the default branch and read the global settings
Contributor

Global settings step can be skipped with removal of this requirement

- Check out the current version and read the local settings
- Before the build process, the YAML file is shown (similar to the ``cat conf.py`` step).
Contributor

I dislike this pattern, and dislike cat conf.py as well. I'm -1 on this, but we should collect metadata on the actual config we use to build. We can selectively show users this config

Member

The cat conf.py looks more like debugging output than final user output.

I think it's easier to collect metadata from the YAML file, but the conf.py could be more error prone to do. Although, we may just need things like theme_options html_context or similar that we are interested in. That's another topic, though.

- Try to parse the YAML file (the build fails if there is an error)
- The version is built according to the settings
Contributor

There is an additional step of first bringing in settings from the database, and then merging the yaml config on top of this.
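
The merge step mentioned here could look roughly like the sketch below: database settings act as the base and the per-version YAML config is layered on top. The function name ``merge_settings`` and the example keys are hypothetical, and the recursive treatment of nested sections is an assumption rather than a decided behavior.

```python
def merge_settings(db_settings, yaml_config):
    """Return a new dict with yaml_config layered over db_settings.

    Nested dicts are merged key by key; any other value in the YAML
    config replaces the database value. Neither input is modified.
    """
    merged = dict(db_settings)
    for key, value in yaml_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical settings, for illustration only.
db = {"language": "en", "python": {"version": 2, "pip_install": False}}
yaml_config = {"python": {"version": 3}}
effective = merge_settings(db, yaml_config)
```

The resulting ``effective`` dict is also the kind of per-build config metadata that could be stored on the build, as suggested earlier in this review.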


Dependencies
------------

Current repository which contains the code related to the configuration file:
https://github.com/rtfd/readthedocs-build

Footnotes
Member Author

I had to add a section for the footnotes because Sphinx renders them in the place where they are declared.

Member Author

Nope, they are still rendered in the same place :(

Contributor

Perhaps it's a character issue? All of these have ``-``; the spec doesn't mention valid characters, though.

---------

.. [#privacy-level] https://github.com/rtfd/readthedocs.org/issues/2663
.. [#project-description] https://github.com/rtfd/readthedocs.org/issues/3689
.. [#yaml-spec] http://yaml.org/spec/1.2/spec.html
.. [#one-checkout] https://github.com/rtfd/readthedocs.org/issues/1375