detect removed/unavailable/404 repository and take generated output offline #8570
Comments
That would actually help us with some spam issues we're having 🙂 But detecting if a project is "gone" might be very resource-intensive, I assume? (Like, polling every repository of every project on RTD with some cadence.) A different case is checking when the project admin changes the URL. I think @humitos proposed something along these lines. But still, that doesn't solve your original request, I think.
RTD Stats 2020 says 240k projects... WEW, I did not expect that many. Once every 3 months for everything? That'd be ~3k to check every day. Or going by how long ago the last activity was noted? That should keep all the active ones out of the set to be checked. I'm guessing you usually get all of that triggered via webhooks or something, so the "last modified" information would at least take care of itself. I see there is a
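The back-of-the-envelope estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything Read the Docs actually runs: the function names `daily_check_budget` and `due_for_check` are hypothetical, and treating "3 months" as 90 days is an assumption.

```python
import math
from datetime import datetime, timedelta


def daily_check_budget(total_projects: int, period_days: int) -> int:
    """Repositories to check per day so that each one is visited once per period."""
    return math.ceil(total_projects / period_days)


def due_for_check(last_activity: datetime, now: datetime, quiet_days: int = 90) -> bool:
    """True when a project has shown no activity for at least `quiet_days` days.

    Recently active projects are skipped, which shrinks the set to be polled.
    """
    return now - last_activity >= timedelta(days=quiet_days)


# 240k projects, each checked once per assumed 90-day period
print(daily_check_budget(240_000, 90))  # 2667, i.e. roughly 3k per day
```

Filtering with something like `due_for_check` first would lower the daily number further, since projects with recent webhook-triggered builds never enter the queue.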
So, we can't just delete docs if the repo is not accessible; it could be a temporary problem or an error. Deleting docs is an irreversible operation that should be performed by the user, not automatically. Also, some people point to an invalid repo as a way to mark the project as abandoned/disabled. See also #8143.
And, if this is about taking over the project, we have a policy for that: https://docs.readthedocs.io/en/stable/abandoned-projects.html
No, this is about that: And I think this might be a general pattern, also for other forked projects, and abandoned projects with a live fork, so I'm explicitly not here for this one instance. There is an Abandoned Projects policy, so this could be a utility to detect those.
Still, we can't just delete those projects. I understand that this is a problem, but we don't dictate the content users should publish. If this is just to mark or identify those types of projects, we already have #3382 open. The abandoned project policy is applied on user request; it isn't something we would do automatically. There were some other ideas about having a "verified" status for projects, so they have more priority over forks/clones.
@crackwitz Hi! You can probably reduce this problem a lot by enabling the Pull Request builder (see https://blog.readthedocs.com/pull-request-builder-general-availability/) if you haven't already. That way, anyone forking the project doesn't need to import it under Read the Docs, because their PR will automatically build on RTD under the official (your) project.
We definitely can't delete people's documentation "automatically" or "semi-automatically" based on these rules. It's easy to do it wrong and too risky. Besides, even if the linked repository was deleted/moved/anything, the documentation may still be relevant/important. We can't make the assumption "since the repository was deleted, the documentation should be deleted as well".
Yea, I'm going to close this issue. We are working towards making a policy for removing unofficial, outdated docs, which is what I think this issue is mostly about. So this will be solved via human judgement, not automated systems 👍
Details
Expected Result
I think that if a source repo is removed, so should the generated documentation on RTD. If such an action is delayed for some "grace period", I think that's reasonable.
Actual Result
Generated output is still there (see URL above) even though the repo was taken offline by the owner because it's a stale fork of another repo that was upstreamed into the actual open source project.