Delay `builder.move` until after `post_build` job #9179

agjohnson · 2022-05-10T00:52:49Z

I hit this in testing and was raised again by a user in #9172.

The issue is that files are copied to the artifact path immediately after the build process completes. This negates any actions on the built files in post_build, as these changes won't be copied to the artifact path.

I believe the code in question is:
https://github.com/readthedocs/readthedocs.org/blob/main/readthedocs/doc_builder/director.py#L359-L360

We should only move/copy files to the artifact path after post_build job. Even with a static/configurable output path, it's a sufficiently confusing that changes to docs/_build/... don't update the hosted docs.

The text was updated successfully, but these errors were encountered:

humitos · 2022-05-10T07:40:07Z

Why we are moving the files to an intermediate location in the first place? Why we don't upload the docs/_build/ directory directly to S3?

agjohnson · 2022-05-10T18:53:15Z

This is maybe a better question. I don't know the history here actually.

agjohnson · 2022-11-22T01:27:34Z

The original issue that this solved was when we used to do multiple builds in the same path. Seems we can use a hardcoded, known path instead here now.

ericholscher · 2022-11-30T22:48:44Z

Adding this to an upcoming sprint, so we can prioritize it 👍

This feels like something we can fix before we do any other build process refactoring, so probably worth doing first, or as part of the migration to using _readthedocs/html as the output dir.

humitos · 2022-12-01T12:05:07Z

Yeah, I think at this point this has to be solved directly by outputting the files from the default build process into _readthedocs/html so move together to the standardization of this.

Always expose the final artifacts under `_readthedocs/<format>` directory, where `<format>` can be one of the following: - html - htmlzip - pdf - epub - json This allows all users of the platform to have the same behavior. No matter if they are using the default Sphinx/MkDocs builders or a custom one by using `build.commands`. Besides, this commit removes the "intermediary" directory used _before_ uploading the files to S3. This allows users to modify the output files directly from `_readthedocs/<format>` using `build.jobs.post_build`, for example. Some important notes: - `BaseBuilder.type` is removed since it was not required anymore. - There is no `old_artifact_path` anymore since there is no intermediary folder. - The `sphinx-build` command is executed from path where the repository is cloned, instead from the the directory where the `conf.py` lives. This is to simplify the code and be able to use relative path for the build output directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`). - A new `BaseSphinx._post_build` method was added to cleanup the output directory (leave only one .epub, .html and .pdf file before moving forward). - These changes should allow us to support _all formats_ for `build.commands`. We would need to check if these standard directories exist, set these formats in the version and upload their content. - Upload step now validate there is one and only one PDF, ePUB and ZIP file when they are built. Currently, we only support one file per version. In the future we could remove this restriction. It fails silently for now, which is not great. We need to refactor this code a little to fail the build properly and communicate this to the user. I think it's _the correct way_ of doing it. - This could also allow to support other ways to build PDF (SimplePDF or `tectonic`, see #9598) and also ZIP (e.g. zundler, see readthedocs/meta#77 (comment)). Closes #9179

Always expose the final artifacts under `_readthedocs/<format>` directory, where `<format>` can be one of the following: - html - htmlzip - pdf - epub - json This allows all users of the platform to have the same behavior. No matter if they are using the default Sphinx/MkDocs builders or a custom one by using `build.commands`. Besides, this commit removes the "intermediary" directory used _before_ uploading the files to S3. This allows users to modify the output files directly from `_readthedocs/<format>` using `build.jobs.post_build`, for example. - `BaseBuilder.type` is removed since it was not required anymore. - There is no `old_artifact_path` anymore since there is no intermediary folder. - The `sphinx-build` command is executed from path where the repository is cloned, instead from the the directory where the `conf.py` lives. This is to simplify the code and be able to use relative path for the build output directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`). - A new `BaseSphinx._post_build` method was added to cleanup the output directory (leave only one .epub, .html and .pdf file before moving forward). - These changes should allow us to support _all formats_ for `build.commands`. We would need to check if these standard directories exist, set these formats in the version and upload their content. - Upload step now validate there is one and only one PDF, ePUB and ZIP file when they are built. Currently, we only support one file per version. In the future we could remove this restriction. It fails silently for now, which is not great. We need to refactor this code a little to fail the build properly and communicate this to the user. I think it's _the correct way_ of doing it. - This could also allow to support other ways to build PDF (SimplePDF or `tectonic`, see #9598) and also ZIP (e.g. zundler, see readthedocs/meta#77 (comment)). Closes #9179

* Build: standardize output directory for artifacts Always expose the final artifacts under `_readthedocs/<format>` directory, where `<format>` can be one of the following: - html - htmlzip - pdf - epub - json This allows all users of the platform to have the same behavior. No matter if they are using the default Sphinx/MkDocs builders or a custom one by using `build.commands`. Besides, this commit removes the "intermediary" directory used _before_ uploading the files to S3. This allows users to modify the output files directly from `_readthedocs/<format>` using `build.jobs.post_build`, for example. - `BaseBuilder.type` is removed since it was not required anymore. - There is no `old_artifact_path` anymore since there is no intermediary folder. - The `sphinx-build` command is executed from path where the repository is cloned, instead from the the directory where the `conf.py` lives. This is to simplify the code and be able to use relative path for the build output directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`). - A new `BaseSphinx._post_build` method was added to cleanup the output directory (leave only one .epub, .html and .pdf file before moving forward). - These changes should allow us to support _all formats_ for `build.commands`. We would need to check if these standard directories exist, set these formats in the version and upload their content. - Upload step now validate there is one and only one PDF, ePUB and ZIP file when they are built. Currently, we only support one file per version. In the future we could remove this restriction. It fails silently for now, which is not great. We need to refactor this code a little to fail the build properly and communicate this to the user. I think it's _the correct way_ of doing it. - This could also allow to support other ways to build PDF (SimplePDF or `tectonic`, see #9598) and also ZIP (e.g. zundler, see readthedocs/meta#77 (comment)). Closes #9179 * Build: decideif there is an output type based on the path existence * Test: update build tests to match changes * Test: check if the file exist before continue * Build: simplify the code by running the commands inside the container * Test: add checks for more commands * Storage: use constants to make explicit artifact types * PDF builder: raise an error if the PDF file is not found * Build: use `relative_output_dir` and `absolute_output_dir` * Builder: execute `sphinx-build` from the same directory as `conf.py` I proposed to changed this CWD to be the one where the repository was cloned, but that could lead to some unexpected behavior. So, it's better to be conservative here and keep the CWD as it was. There are some small downsides for this decision, but it's better than breaking people's builds :) * HTMLZip build: use `zip` inside the Docker container to build it * Minor changes about docstring and final dot in a sentence :) * Test: adapt them to match thew path and arguments changes * pre-commit missing changes * Sphinx builder: better default * Update readthedocs/doc_builder/backends/sphinx.py Co-authored-by: Eric Holscher <[email protected]> * Update readthedocs/projects/tests/test_build_tasks.py Co-authored-by: Eric Holscher <[email protected]> * Minor changes to fix tests * Docstring: explain why there is an exception handling at this place Co-authored-by: Eric Holscher <[email protected]>

agjohnson added the Improvement Minor improvement to code label May 10, 2022

humitos added the Accepted Accepted issue on our roadmap label May 10, 2022

humitos added this to 📍Roadmap May 30, 2022

humitos moved this to Planned in 📍Roadmap May 30, 2022

This was referenced Aug 25, 2022

Zipped HTML includes unnecessary files #9068

Closed

Build: improve upload step #9448

Closed

Also use _readthedocs/html path for normal builds #9554

Closed

agjohnson mentioned this issue Nov 22, 2022

Why the option build.jobs.post_build in the .readthedocs.yaml doesn't effect the published page? #9172

Closed

humitos mentioned this issue Dec 12, 2022

Modifications to built HTML in a post_build job do not show up in the site preview #9797

Closed

choldgraf mentioned this issue Dec 12, 2022

Testing hidden folders with RTD 2i2c-org/team-compass#585

Closed

ericholscher assigned humitos Jan 4, 2023

humitos moved this from Planned to In progress in 📍Roadmap Jan 11, 2023

humitos mentioned this issue Jan 11, 2023

Build: standardize output directory for artifacts #9888

Merged

humitos moved this from In progress to Needs review in 📍Roadmap Jan 11, 2023

humitos closed this as completed in #9888 Jan 19, 2023

github-project-automation bot moved this from Needs review to Done in 📍Roadmap Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Delay `builder.move` until after `post_build` job #9179

Delay `builder.move` until after `post_build` job #9179

agjohnson commented May 10, 2022

humitos commented May 10, 2022

agjohnson commented May 10, 2022

agjohnson commented Nov 22, 2022

ericholscher commented Nov 30, 2022 •

edited

Loading

humitos commented Dec 1, 2022

Delay builder.move until after post_build job #9179

Delay builder.move until after post_build job #9179

Comments

agjohnson commented May 10, 2022

humitos commented May 10, 2022

agjohnson commented May 10, 2022

agjohnson commented Nov 22, 2022

ericholscher commented Nov 30, 2022 • edited Loading

humitos commented Dec 1, 2022

Delay `builder.move` until after `post_build` job #9179

Delay `builder.move` until after `post_build` job #9179

ericholscher commented Nov 30, 2022 •

edited

Loading