Skip to content

Delay builder.move until after post_build job #9179

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
agjohnson opened this issue May 10, 2022 · 5 comments · Fixed by #9888
Closed

Delay builder.move until after post_build job #9179

agjohnson opened this issue May 10, 2022 · 5 comments · Fixed by #9888
Assignees
Labels
Accepted Accepted issue on our roadmap Improvement Minor improvement to code

Comments

@agjohnson
Copy link
Contributor

I hit this in testing and was raised again by a user in #9172.

The issue is that files are copied to the artifact path immediately after the build process completes. This negates any actions on the built files in post_build, as these changes won't be copied to the artifact path.

I believe the code in question is:
https://github.com/readthedocs/readthedocs.org/blob/main/readthedocs/doc_builder/director.py#L359-L360

We should only move/copy files to the artifact path after post_build job. Even with a static/configurable output path, it's a sufficiently confusing that changes to docs/_build/... don't update the hosted docs.

@agjohnson agjohnson added the Improvement Minor improvement to code label May 10, 2022
@humitos
Copy link
Member

humitos commented May 10, 2022

Why we are moving the files to an intermediate location in the first place? Why we don't upload the docs/_build/ directory directly to S3?

@humitos humitos added the Accepted Accepted issue on our roadmap label May 10, 2022
@agjohnson
Copy link
Contributor Author

This is maybe a better question. I don't know the history here actually.

@agjohnson
Copy link
Contributor Author

The original issue that this solved was when we used to do multiple builds in the same path. Seems we can use a hardcoded, known path instead here now.

@ericholscher
Copy link
Member

ericholscher commented Nov 30, 2022

Adding this to an upcoming sprint, so we can prioritize it 👍

This feels like something we can fix before we do any other build process refactoring, so probably worth doing first, or as part of the migration to using _readthedocs/html as the output dir.

@humitos
Copy link
Member

humitos commented Dec 1, 2022

Yeah, I think at this point this has to be solved directly by outputting the files from the default build process into _readthedocs/html so move together to the standardization of this.

@humitos humitos moved this from Planned to In progress in 📍Roadmap Jan 11, 2023
humitos added a commit that referenced this issue Jan 11, 2023
Always expose the final artifacts under `_readthedocs/<format>` directory, where
`<format>` can be one of the following:

- html
- htmlzip
- pdf
- epub
- json

This allows all users of the platform to have the same behavior. No matter if
they are using the default Sphinx/MkDocs builders or a custom one by using
`build.commands`.

Besides, this commit removes the "intermediary" directory used _before_
uploading the files to S3. This allows users to modify the output files directly
from `_readthedocs/<format>` using `build.jobs.post_build`, for example.

Some important notes:

- `BaseBuilder.type` is removed since it was not required anymore.
- There is no `old_artifact_path` anymore since there is no intermediary folder.
- The `sphinx-build` command is executed from path where the repository is
  cloned, instead from the the directory where the `conf.py` lives. This is to
  simplify the code and be able to use relative path for the build output
  directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`).
- A new `BaseSphinx._post_build` method was added to cleanup the output
  directory (leave only one .epub, .html and .pdf file before moving forward).
- These changes should allow us to support _all formats_ for `build.commands`.
  We would need to check if these standard directories exist, set these formats
  in the version and upload their content.
- Upload step now validate there is one and only one PDF, ePUB and ZIP file when
  they are built. Currently, we only support one file per version. In the future
  we could remove this restriction. It fails silently for now, which is not
  great. We need to refactor this code a little to fail the build properly and
  communicate this to the user. I think it's _the correct way_ of doing it.
- This could also allow to support other ways to build PDF (SimplePDF or
 `tectonic`, see #9598) and
  also ZIP (e.g. zundler, see
  readthedocs/meta#77 (comment)).

Closes #9179
humitos added a commit that referenced this issue Jan 11, 2023
Always expose the final artifacts under `_readthedocs/<format>` directory, where
`<format>` can be one of the following:

- html
- htmlzip
- pdf
- epub
- json

This allows all users of the platform to have the same behavior. No matter if
they are using the default Sphinx/MkDocs builders or a custom one by using
`build.commands`.

Besides, this commit removes the "intermediary" directory used _before_
uploading the files to S3. This allows users to modify the output files directly
from `_readthedocs/<format>` using `build.jobs.post_build`, for example.

- `BaseBuilder.type` is removed since it was not required anymore.
- There is no `old_artifact_path` anymore since there is no intermediary folder.
- The `sphinx-build` command is executed from path where the repository is
  cloned, instead from the the directory where the `conf.py` lives. This is to
  simplify the code and be able to use relative path for the build output
  directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`).
- A new `BaseSphinx._post_build` method was added to cleanup the output
  directory (leave only one .epub, .html and .pdf file before moving forward).
- These changes should allow us to support _all formats_ for `build.commands`.
  We would need to check if these standard directories exist, set these formats
  in the version and upload their content.
- Upload step now validate there is one and only one PDF, ePUB and ZIP file when
  they are built. Currently, we only support one file per version. In the future
  we could remove this restriction. It fails silently for now, which is not
  great. We need to refactor this code a little to fail the build properly and
  communicate this to the user. I think it's _the correct way_ of doing it.
- This could also allow to support other ways to build PDF (SimplePDF or
 `tectonic`, see #9598) and
  also ZIP (e.g. zundler, see
  readthedocs/meta#77 (comment)).

Closes #9179
@humitos humitos moved this from In progress to Needs review in 📍Roadmap Jan 11, 2023
@github-project-automation github-project-automation bot moved this from Needs review to Done in 📍Roadmap Jan 19, 2023
humitos added a commit that referenced this issue Jan 19, 2023
* Build: standardize output directory for artifacts

Always expose the final artifacts under `_readthedocs/<format>` directory, where
`<format>` can be one of the following:

- html
- htmlzip
- pdf
- epub
- json

This allows all users of the platform to have the same behavior. No matter if
they are using the default Sphinx/MkDocs builders or a custom one by using
`build.commands`.

Besides, this commit removes the "intermediary" directory used _before_
uploading the files to S3. This allows users to modify the output files directly
from `_readthedocs/<format>` using `build.jobs.post_build`, for example.

- `BaseBuilder.type` is removed since it was not required anymore.
- There is no `old_artifact_path` anymore since there is no intermediary folder.
- The `sphinx-build` command is executed from path where the repository is
  cloned, instead from the the directory where the `conf.py` lives. This is to
  simplify the code and be able to use relative path for the build output
  directory (e.g. `sphinx-build -b html ... docs/ _readthedocs/html`).
- A new `BaseSphinx._post_build` method was added to cleanup the output
  directory (leave only one .epub, .html and .pdf file before moving forward).
- These changes should allow us to support _all formats_ for `build.commands`.
  We would need to check if these standard directories exist, set these formats
  in the version and upload their content.
- Upload step now validate there is one and only one PDF, ePUB and ZIP file when
  they are built. Currently, we only support one file per version. In the future
  we could remove this restriction. It fails silently for now, which is not
  great. We need to refactor this code a little to fail the build properly and
  communicate this to the user. I think it's _the correct way_ of doing it.
- This could also allow to support other ways to build PDF (SimplePDF or
 `tectonic`, see #9598) and
  also ZIP (e.g. zundler, see
  readthedocs/meta#77 (comment)).

Closes #9179

* Build: decideif there is an output type based on the path existence

* Test: update build tests to match changes

* Test: check if the file exist before continue

* Build: simplify the code by running the commands inside the container

* Test: add checks for more commands

* Storage: use constants to make explicit artifact types

* PDF builder: raise an error if the PDF file is not found

* Build: use `relative_output_dir` and `absolute_output_dir`

* Builder: execute `sphinx-build` from the same directory as `conf.py`

I proposed to changed this CWD to be the one where the repository was cloned,
but that could lead to some unexpected behavior. So, it's better to be
conservative here and keep the CWD as it was.

There are some small downsides for this decision, but it's better than breaking
people's builds :)

* HTMLZip build: use `zip` inside the Docker container to build it

* Minor changes about docstring and final dot in a sentence :)

* Test: adapt them to match thew path and arguments changes

* pre-commit missing changes

* Sphinx builder: better default

* Update readthedocs/doc_builder/backends/sphinx.py

Co-authored-by: Eric Holscher <[email protected]>

* Update readthedocs/projects/tests/test_build_tasks.py

Co-authored-by: Eric Holscher <[email protected]>

* Minor changes to fix tests

* Docstring: explain why there is an exception handling at this place

Co-authored-by: Eric Holscher <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Improvement Minor improvement to code
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants