Is it possible to control the "force" resp. "-E" option for Sphinx? (turns out it's actually the "-d" option) #4363
Comments
mmm, I always thought that option is for reuse with the same builder only (not with different builders). |
No, Sphinx parses all source files and stores the "document" (and the "environment") in a representation that's independent of the builder. Normally that doesn't make a big difference, since parsing the source files is quite fast, but in my case it's not. But creating the HTML output and the LaTeX output is very quick if the "environment" is unchanged. In my case that could easily save several minutes build time. |
I'd like to take a deeper look at this, but at first sight, I think it's at least, weird :). I already wanted to remove the Today, I found that the only place where So, I suppose that if you trigger a build manually I've seen some projects having problems with memory limits in the past weeks and maybe this could be one of the reasons (#4403) |
Even if we remove the "First time" build will happen regularly since RTD cleans its cache after a couple of hours if the project didn't trigger a build. Also, we currently have 4 build servers, so if the task is executed on server 1 and then on server 2, those two will be "first time" builds. That said, I'm not sure that removing this option will change the build time/memory used too much; it might just make RTD builds fail randomly instead. |
@humitos since the source files are preserved in the first builder (html), the further builders will reuse that (pdf, json, singlehtml, etc), that would save some time in the same build process each time. |
@stsewd that's true, but does the same apply for the |
TL;DR: I think I found the solution: all builders should use the same
Sure, I should have provided those in my original comment. It's https://splines.readthedocs.io/, an example build is https://readthedocs.org/projects/splines/builds/7518379/. I didn't notice this before, but it looks like the After having a closer look, I now think I know what's the problem: each Sphinx call on RTD seems to use a separate That means each builder uses their own "environment" and therefore it can never be re-used between builders! I think this is a big waste of resources, isn't it? Here is another example: https://readthedocs.org/projects/nbsphinx/builds/7532399/
If done correctly, this should cause a significant reduction in building time. I think it makes sense to re-build the "environment" for each new version of the project, because problems could arise with certain changes of the configuration. I guess the current usage of But it should be safe to re-use the "environment" for all builders of one given version. While the current usage of
They should re-use the "environment" of the first builder, but apparently they don't.
|
oh, I see, I think rtd have separate build dirs to be able to copy each generated resource in a clean way, but I'm not sure if that is really necessary, maybe we can reuse the same directory. |
@stsewd Please note that the "environment" directory (given with the I guess it makes sense to have separate "output" directories for the builders in order to easily copy files around, but I don't see why it would make sense to have separate "environment" directories. |
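To illustrate the layout described in the comment above, here is a minimal Python sketch that builds one `sphinx-build` command line per builder: each builder gets its own output directory, but all of them point at the same doctree ("environment") directory via `-d`. The helper name `build_commands` and the directory names (`_build/doctrees`, `_build/<builder>`) are assumptions for illustration, not what Read the Docs actually uses.

```python
from pathlib import Path


def build_commands(source_dir, build_root, builders):
    """Return one sphinx-build argv list per builder.

    Hypothetical helper: every builder gets its own output directory,
    but all of them share a single doctree directory (the -d option).
    """
    source = Path(source_dir)
    root = Path(build_root)
    doctrees = root / "doctrees"  # shared "environment" directory (assumed name)
    commands = []
    for builder in builders:
        output = root / builder  # separate output directory per builder
        commands.append([
            "sphinx-build",
            "-b", builder,
            "-d", str(doctrees),
            str(source),
            str(output),
        ])
    return commands


if __name__ == "__main__":
    for argv in build_commands("docs", "_build", ["html", "latex", "json"]):
        print(" ".join(argv))
```

Each list can be handed to `subprocess.run` as-is. Only the last argument (the output directory) differs per builder, while the value after `-d` stays the same.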
At first sight, it makes sense to me what you said regarding sharing the same
I'd like to have more context on why this was done like this (independent environment for each step) and to think a little more about whether it's something that can really be shared or whether it could cause some kind of conflict. It seems it was using a shared directory for doctrees some years ago and it was changed here: a5d9d45 There is not much info there, maybe @ericholscher or @agjohnson can help us here. To me, it seems like a reasonable thing to revert and use a shared path. From the docs http://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-d
|
I don't have any background on why this would have been changed. The change seems to imply we directly had some issues sharing the doctree environment though. If I had to guess, I'd say we're safe to reuse the doctree, though I think we don't want to reuse the existing doctree on the first build maybe? I haven't dug too deep here, but it looks like we force a new doctree env on the first builder (unless it's the only builder?), and we don't force on subsequent build steps, though we are using one-off doctree envs. It seems like what we want is to always wipe the doctree env and share that between builds instead? If there isn't a good reason to separate the doctree envs, I think this makes sense. |
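The "wipe once, then share" idea from the comment above could look roughly like the following sketch. It assumes the standard `sphinx-build` options `-b` (builder), `-d` (doctree directory) and `-E` (ignore the saved environment and re-read all source files). The function name `sphinx_args` and the default paths are hypothetical and not taken from the Read the Docs code base.

```python
def sphinx_args(builders, doctree_dir, source_dir="docs", output_root="_build"):
    """Yield sphinx-build argument lists, one per builder.

    Only the first builder gets -E, so the environment is rebuilt once
    per build; later builders reuse what the first one wrote to the
    shared doctree directory.
    """
    for index, builder in enumerate(builders):
        args = ["-b", builder, "-d", doctree_dir]
        if index == 0:
            args.append("-E")  # start from a fresh environment once per build
        args += [source_dir, f"{output_root}/{builder}"]
        yield args


if __name__ == "__main__":
    for args in sphinx_args(["html", "latex", "json"], "_build/doctrees"):
        print("sphinx-build " + " ".join(args))
```

With this arrangement the expensive parsing step happens only in the first call. Whether that is safe for every project is exactly what the thread suggests testing.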
Yea, I don't 100% remember why this was done. I'm guessing because of build caching of changed files. So:
I'd be curious to see benchmarks here, and some real testing to see if this is an issue while running. I'm guessing it will lead to issues. |
I don't think this will happen because the documentation says that it can be shared among all the builders. Still, it's definitely worth a test.
This is also something that needs some testing and could help us decide what to do here. I wouldn't take any action before having at least a couple of tests shared here for different projects. Otherwise, we will end up with strange behavior in production and we won't know why. |
This is perhaps a good candidate for a feature flag. We can test a small sample with this change. It is probably a slight speed up for builds, so might be a good addition. I'm putting this a few versions out though, it probably isn't a pressing priority right now. If we notice build bugs, we can re-evaluate priority. |
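As a rough sketch of how such a feature flag could gate the change, the function below picks the doctree directory for a builder depending on a boolean. The parameter `shared_doctrees_enabled` stands in for a per-project flag, and both directory naming schemes are assumptions for illustration; the thread does not show how the real flag or paths are defined.

```python
def doctree_dir(builder, shared_doctrees_enabled, build_root="_build"):
    """Pick the doctree directory for one builder.

    `shared_doctrees_enabled` is a stand-in for a per-project feature
    flag (hypothetical; the real mechanism is not shown in this thread).
    """
    if shared_doctrees_enabled:
        # Flag on: one doctree directory for every builder.
        return f"{build_root}/doctrees"
    # Flag off: a separate doctree directory per builder (assumed naming).
    return f"{build_root}/doctrees-{builder}"


if __name__ == "__main__":
    for flag in (False, True):
        print(flag, [doctree_dir(b, flag) for b in ("html", "latex")])
```

Keeping the decision in one place like this makes it easy to enable the shared directory for a small sample of projects first and fall back if build bugs show up.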
Just putting some tests here Separated doctrees
Shared doctrees
|
👍 we should go in this direction that is not harmful. |
I'm curious if the projects in prod are still active. We can probably just ping each 4 of them and ask them to configure formats, and be done with it. |
@ericholscher wrong issue I guess p: |
Ha yea, I was wondering where that comment went :D |
We are reducing build time with this, and also saving a little space in the builders. According to Sphinx, we are safe doing this:

> the doctrees can be shared between all builders

http://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-d

Closes readthedocs#4363
My builds take quite a bit of time and memory because I'm using a custom Sphinx source parser that executes Jupyter notebooks (http://nbsphinx.readthedocs.io/).
I've seen that the `-E` option is used when calling Sphinx, which means that for every output format (HTML, JSON, PDF, ...) the source files are parsed (and in my case the Jupyter notebooks are executed) again and again.
Parsing the source files multiple times doesn't (and shouldn't!) change anything, it just burns some of my (and your) precious build time and memory.
Is it possible to disable this on a per-project basis?
I guess there is some reason to do this in the first place, but if not, it should probably be deactivated globally?
I've looked through the source code and found that this is controlled by the `force` argument to the builder class and just about everywhere the default is `force=False`.
But it seems that the actual build process is somehow started with `force=True`.