Make repositories inaccessible to search engines? #7462

Closed
jd41 opened this issue Sep 6, 2020 · 6 comments

Comments

jd41 commented Sep 6, 2020

The use case I am interested in (which I guess is not unprecedented) is that I want to test new things in a fork of an OSS project's documentation, but do not want that fork's documentation to be findable online, lest people searching for the documentation be confused when they hit upon my unmaintained fork in three years rather than the original (another such fork already exists and is in the search index). So I am fine with my fork being public on GitHub, but not fine with the Read the Docs documentation pages being in the Google/Bing/... search index. I read that there used to be an option to set repositories to "private", so they wouldn't be found via Google anymore. I guess this option would have been exactly what I wanted, but it seems to be gone. Is there an alternative today?

jd41 (Author) commented Sep 6, 2020

(Ironically enough, I was confused for a bit when I searched for this "private" setting in repositories and couldn't find it - I had read about it in a PDF built on RTD from an old fork of the Read the Docs documentation, which was never updated but is still in the Google search index.)

humitos (Member) commented Sep 7, 2020

Read the Docs Community (readthedocs.org) does not support PRIVATE versions; Read the Docs for Business (readthedocs.com) does.

However, if you just want to keep search engines from indexing your documentation, you can use a robots.txt file. Take a look at https://docs.readthedocs.io/en/latest/hosting.html#custom-robots-txt-pages

I think that should be enough for your use case, right?
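
A minimal sketch of that setup, assuming a Sphinx project (the page linked above describes this same html_extra_path approach):

```python
# conf.py -- have Sphinx copy a robots.txt into the root of the built HTML,
# where Read the Docs will serve it for the project's default version.
html_extra_path = ["robots.txt"]

# The robots.txt file placed next to conf.py would then contain:
#
#   User-agent: *
#   Disallow: /
#
# which asks all crawlers to skip every page of the hosted docs.
```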

humitos added the "Needed: more information" (a reply from issue author is required) label Sep 7, 2020
jd41 (Author) commented Sep 7, 2020 via email

no-response bot removed the "Needed: more information" label Sep 7, 2020
humitos (Member) commented Sep 8, 2020

Yes. You can mark them all as hidden and double-check that the robots.txt file for your project is being generated properly by RTD.
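
One quick way to do that double-check from a script (a sketch; the project slug myproject is a placeholder for your own):

```python
# Fetch the robots.txt that Read the Docs actually serves and print it,
# so you can confirm the Disallow rules made it into production.
import urllib.request

url = "https://myproject.readthedocs.io/robots.txt"
with urllib.request.urlopen(url) as response:
    print(response.read().decode("utf-8"))
```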

jd41 (Author) commented Sep 8, 2020 via email

humitos (Member) commented Sep 8, 2020

I guess so. I'm not sure how that affects crawlers. If that's a problem, we should open another issue and track it there. Another user already reported this at #5391 (comment). Please subscribe there to stay updated. I'm going to close this one since the original question was answered. Thanks!

humitos closed this as completed Sep 8, 2020